Toy Store Search App with Cloud Databases, Serverless Runtimes and Open Source Integrations

1. Overview

Imagine stepping into a toy store virtually or in-person, where finding the perfect gift is effortless. You can describe what you're looking for, upload a picture of a toy, or even design your own creation, and the store instantly understands your needs and delivers a tailored experience. This isn't a futuristic fantasy; it's a reality powered by AI, cloud technology, and a vision for personalized e-commerce.

The Challenge: Simply finding that perfect product that matches your imagination can be difficult. Generic search terms, keywords and fuzzy searches often fall short, browsing endless pages can be tedious, and the disconnect between what you imagine and what's available can lead to frustration.

The Solution: The demo application tackles this challenge head-on, leveraging the power of AI to deliver a truly personalized and seamless experience with contextual search and custom generation of the product matching the search context.

What you'll build

As part of this lab, you will:

  1. Create an AlloyDB instance and load Toys Dataset
  2. Enable the pgvector and generative AI model extensions in AlloyDB
  3. Generate embeddings from the product descriptions and perform real-time cosine similarity search on the user's search text
  4. Invoke Gemini 2.0 Flash to describe the image uploaded by the user for contextual toy search
  5. Invoke Imagen 3 to custom-create a toy based on the user's interest
  6. Invoke a price prediction tool created using Gen AI Toolbox for Databases to get price details for the custom-created toy
  7. Deploy the solution in serverless Cloud Run Functions

Requirements

  • A browser, such as Chrome or Firefox
  • A Google Cloud project with billing enabled.

2. Architecture

Data Flow: Let's take a closer look at how data moves through our system:

  1. Contextual Search with AI-Powered RAG (Retrieval Augmented Generation)

Think of it like this: instead of just looking for "red car," the system understands a request like "a small vehicle suitable for a 3-year-old boy."

AlloyDB as the foundation: We use AlloyDB, Google Cloud's fully managed, PostgreSQL-compatible database, to store our toy data, including descriptions, image URLs, and other relevant attributes.

pgvector for Semantic Search: pgvector, a PostgreSQL extension, allows us to store vector embeddings of both toy descriptions and user search queries. This enables semantic search, meaning the system understands the meaning behind the words, not just the exact keywords.

Cosine Similarity for Relevance: We use cosine similarity to measure the semantic similarity between the user's search vector and the toy description vectors, surfacing the most relevant results.

ScaNN Index for Speed and Accuracy: To ensure rapid and accurate results, especially as our toy inventory grows, we integrate the ScaNN (Scalable Nearest Neighbors) index. This significantly improves the efficiency and recall of our vector search.

  2. Image-Based Search & Understanding with Gemini 2.0 Flash

Instead of typing the search context as text, users can upload a picture of a toy they like and search with it. We leverage Google's Gemini 2.0 Flash model, invoked using LangChain4j, to analyze the image and extract relevant context, such as the toy's color, material, type, and intended age group.

  3. Building Your Dream Toy Customized with Generative AI: Imagen 3

The real magic happens when users decide to create their own toy. Using Imagen 3, we allow them to describe their dream toy using simple text prompts. Imagine being able to say: "I want a plush dragon with purple wings and a friendly face" and seeing that dragon come to life on your screen! Imagen 3 then generates an image of the custom-designed toy, giving the user a clear visualization of their creation.

  4. Price Prediction Powered by Agents & Gen AI Toolbox for Databases

We've implemented a price prediction feature that estimates the cost of producing the custom-designed toy. This is powered by an agent that includes a sophisticated price calculation tool.

Gen AI Toolbox for Databases: This agent is seamlessly integrated with our database using Google's new open-source tool, Gen AI Toolbox for Databases. This allows the agent to access real-time data on material costs, manufacturing processes, and other relevant factors to provide an accurate price estimate. Read more about it here.

  5. Java Spring Boot, Gemini Code Assist and Cloud Run for Streamlined Development and Serverless Deployment

The entire application is built using Java Spring Boot, a robust and scalable framework. We leveraged Gemini Code Assist throughout the development process, particularly for front-end development, significantly accelerating the development cycle and improving code quality. We used Cloud Run for deploying the whole application and Cloud Run Functions for deploying the database and agentic functionalities as independent endpoints.

3. Before you begin

Create a project

  1. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
  2. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
  3. You'll use Cloud Shell, a command-line environment running in Google Cloud that comes preloaded with the gcloud command-line tool. Click Activate Cloud Shell at the top of the Google Cloud console.

Activate Cloud Shell button image

  4. Once connected to Cloud Shell, check that you're already authenticated and that the project is set to your project ID using the following command:
gcloud auth list
  5. Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
  6. If your project is not set, use the following command to set it:
gcloud config set project <YOUR_PROJECT_ID>
  7. Enable the required APIs by running the following commands one by one in your Cloud Shell Terminal:

These APIs can also be enabled with a single combined command, but trial account users might encounter quota issues when enabling them in bulk. That is why the commands are listed one per line.

gcloud services enable alloydb.googleapis.com
gcloud services enable compute.googleapis.com 
gcloud services enable cloudresourcemanager.googleapis.com 
gcloud services enable servicenetworking.googleapis.com 
gcloud services enable run.googleapis.com 
gcloud services enable cloudbuild.googleapis.com 
gcloud services enable cloudfunctions.googleapis.com 
gcloud services enable aiplatform.googleapis.com

Alternatively, you can enable each API through the console by searching for the product or using this link.

If any API is missed, you can always enable it during the course of the implementation.

Refer to the documentation for gcloud commands and usage.

4. Database setup

In this lab we'll use AlloyDB as the database to hold the toystore data. It uses clusters to hold all of the resources, such as databases and logs. Each cluster has a primary instance that provides an access point to the data. Tables will hold the actual data.

Let's create an AlloyDB cluster, instance and table where the ecommerce dataset will be loaded.

Create a cluster and instance

  1. Navigate to the AlloyDB page in the Cloud Console. An easy way to find most pages in the Cloud Console is to search for them using the console's search bar.
  2. Select CREATE CLUSTER from that page:

f76ff480c8c889aa.png

  3. You'll see a screen like the one below. Create a cluster and instance with the following values (make sure the values match if you are cloning the application code from the repo):
  • cluster id: "vector-cluster"
  • password: "alloydb"
  • PostgreSQL 15 compatible
  • Region: "us-central1"
  • Networking: "default"

538dba58908162fb.png

  4. When you select the default network, you'll see a screen like the one below.

Select SET UP CONNECTION.
7939bbb6802a91bf.png

  5. From there, select "Use an automatically allocated IP range" and Continue. After reviewing the information, select CREATE CONNECTION. 768ff5210e79676f.png
  6. Once your network is set up, you can continue to create your cluster. Click CREATE CLUSTER to complete setting up the cluster as shown below:

e06623e55195e16e.png

Make sure to change the instance id to "vector-instance".

Note that cluster creation will take around 10 minutes. Once it is successful, you should see a screen that shows the overview of the cluster you just created.

5. Data ingestion

Now it's time to add a table with the data about the store. Navigate to AlloyDB, select the primary cluster and then AlloyDB Studio:

847e35f1bf8a8bd8.png

You may need to wait for your instance to finish being created. Once it is, sign in to AlloyDB using the credentials you set when creating the cluster. Use the following data for authenticating to PostgreSQL:

  • Username : "postgres"
  • Database : "postgres"
  • Password : "alloydb"

Once you have successfully authenticated to AlloyDB Studio, you can enter SQL commands in the Editor. You can add multiple Editor windows using the plus icon to the right of the last window.

91a86d9469d499c4.png

You'll enter commands for AlloyDB in editor windows, using the Run, Format, and Clear options as necessary.

Enable Extensions

For building this app, we will use the extensions pgvector and google_ml_integration. The pgvector extension allows you to store and search vector embeddings. The google_ml_integration extension provides functions you use to access Vertex AI prediction endpoints to get predictions in SQL. Enable these extensions by running the following DDLs:

CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE;
CREATE EXTENSION IF NOT EXISTS vector;

If you would like to check the extensions that have been enabled on your database, run this SQL command:

select extname, extversion from pg_extension;

Create a table

Create a table using the DDL statement below:

CREATE TABLE toys ( id VARCHAR(25), name VARCHAR(25), description VARCHAR(20000), quantity INT, price FLOAT, image_url VARCHAR(200), text_embeddings vector(768)) ;

On successful execution of the above command, you should be able to view the table in the database.
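
If you'd like to double-check the schema, you can list the columns of the new table using the standard PostgreSQL catalog (an optional sanity check):

SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'toys';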

Ingest data

For this lab, we have test data of about 72 records in this SQL file. It contains values for the id, name, description, quantity, price, and image_url fields. The remaining field (text_embeddings) will be filled in later in the lab.

Copy the INSERT statements from that file, paste them into a blank Editor tab, and select RUN.
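
For reference, each statement in the file follows the table schema created above; a representative row (with made-up values) looks roughly like this:

INSERT INTO toys (id, name, description, quantity, price, image_url)
VALUES ('T001', 'Plush Bear', 'A soft white plush teddy bear with a floral pattern', 10, 24.99, 'https://example.com/toys/t001.png');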

To see the table contents, expand the Explorer section until you can see the table named toys. Select the three-dot menu (⋮) to see the option to Query the table. A SELECT statement will open in a new Editor tab.

cfaa52b717f9aaed.png

Grant Permission

Run the below statement to grant execute rights on the embedding function to the user postgres:

GRANT EXECUTE ON FUNCTION embedding TO postgres;

Grant Vertex AI User ROLE to the AlloyDB service account

Go to Cloud Shell terminal and give the following command:

PROJECT_ID=$(gcloud config get-value project)

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"

6. Create embeddings for the context

It's much easier for computers to process numbers than to process text. An embedding system converts text into a series of floating point numbers that should represent the text, no matter how it's worded, what language it uses, etc.

Consider describing a seaside location. It might be called "on the water", "beachfront", "walk from your room to the ocean", "sur la mer", "на берегу океана", etc. These terms all look different, but their semantic meaning, or in machine learning terminology their embeddings, should be very close to each other.
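
Once the grants from the previous section are in place, you can see this for yourself. As a rough illustration (a sketch, assuming the text-embedding-005 model), the cosine distance between two of the phrases above should be small:

SELECT embedding('text-embedding-005', 'beachfront')::vector
   <=> embedding('text-embedding-005', 'walk from your room to the ocean')::vector AS cosine_distance;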

Now that the data and context are ready, we will run the SQL to add the embeddings of the product descriptions to the table in the text_embeddings field. There are a variety of embedding models you can use. We're using text-embedding-005 from Vertex AI. Be sure to use the same embedding model throughout the project!

Note: If you are using an existing Google Cloud Project created a while ago, you might need to continue to use older versions of the text-embedding model like textembedding-gecko.

Return to the AlloyDB Studio tab and type the following DML:

UPDATE toys set text_embeddings = embedding( 'text-embedding-005', description);

Look at the toys table again to see some embeddings. Be sure to rerun the SELECT statement to see the changes.

SELECT id, name, description, price, quantity, image_url, text_embeddings FROM toys;

This should return the embeddings vector, which looks like an array of floats, for each toy description, as shown below:

7d32f7cd7204e1f3.png

Note: Newly created Google Cloud Projects under the free tier might face quota issues on the number of embedding requests allowed per second to the embedding models. We suggest filtering by ID and generating the embeddings for a few records (for example, 1-5) at a time, as shown in the sketch below.
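
For example (a sketch; adjust the id values to match your data), you can generate embeddings for a handful of rows at a time like this:

UPDATE toys SET text_embeddings = embedding('text-embedding-005', description)
WHERE text_embeddings IS NULL AND id IN ('1', '2', '3', '4', '5');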

7. Perform Vector search

Now that the table, data, and embeddings are all ready, let's perform the real time vector search for the user search text.

Suppose the user asks:

"I want a white plush teddy bear toy with a floral pattern."

You can find matches for this by running the query below:

select * from toys
ORDER BY text_embeddings <=> CAST(embedding('text-embedding-005', 'I want a white plush teddy bear toy with a floral pattern') as vector(768))
LIMIT 5;

Let's look at this query in detail. In this query:

  1. The user's search text is: "I want a white plush teddy bear toy with a floral pattern."
  2. We are converting it to embeddings in the embedding() method using the model: text-embedding-005. This step should look familiar after the last step, where we applied the embedding function to all of the items in the table.
  3. "<=>" represents the use of the COSINE SIMILARITY distance method. You can find all the similarity measures available in the documentation of pgvector.
  4. We are converting the embedding method's result to vector type to make it compatible with the vectors stored in the database.
  5. LIMIT 5 represents that we want to extract 5 nearest neighbors for the search text.

Result looks like this:

fa7f0fc3a4c68804.png

As you can observe in your results, the matches are pretty close to the search text. Try changing the text to see how the results change.
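
To see how close each match actually is, you can also select the cosine distance itself (smaller is closer). This is a variation of the same query, not a required step:

SELECT id, name, description,
       text_embeddings <=> CAST(embedding('text-embedding-005', 'I want a white plush teddy bear toy with a floral pattern') AS vector(768)) AS cosine_distance
FROM toys
ORDER BY cosine_distance
LIMIT 5;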

Important Note:

Now let's say we want to increase the performance (query time), efficiency and recall of this Vector Search result using ScaNN index. Please read the steps in this blog to compare the difference in result with and without the index.

Optional Step: Improve Efficiency and Recall with ScaNN Index

Just listing the index creation steps here for convenience:

  1. Since we already have the cluster, instance, context and embeddings created, we just have to install the ScaNN extension using the following statement:
CREATE EXTENSION IF NOT EXISTS alloydb_scann;
  2. Next we will create the index (ScaNN):
CREATE INDEX toysearch_index ON toys
USING scann (text_embeddings cosine)
WITH (num_leaves=9);

In the above DDL:

  • "toysearch_index" is the name of the index
  • "toys" is the table being indexed
  • "scann" is the index method
  • "text_embeddings" is the column in the table we want to index
  • "cosine" is the distance method we want to use with the index
  • "9" (num_leaves) is the number of partitions to apply to this index. You can set it to any value between 1 and 1048576. For more information about how to decide this value, see Tune a ScaNN index.

We used roughly the SQUARE ROOT of the number of data points (72 rows, so about 9), as recommended in the ScaNN repo ("When partitioning, num_leaves should be roughly the square root of the number of datapoints.").

  3. Check if the index is created using the query:
SELECT * FROM pg_stat_ann_indexes;
  4. Perform Vector Search using the same query we used without the index:
select * from toys
ORDER BY text_embeddings <=> CAST(embedding('text-embedding-005', 'I want a white plush teddy bear toy with a floral pattern') as vector(768))
LIMIT 5;

The above query is the same one we used earlier in this section; the only difference is that the embeddings column is now indexed.

  5. Test with a simple search query with and without the index (by dropping the index):

This use case has only 72 records so the index does not really take effect. For a test conducted in another use case, the results are as follows:

The same Vector Search query on the INDEXED embeddings data yields high-quality search results with much better efficiency: 10.37 ms execution time without ScaNN versus 0.87 ms with ScaNN. For more information on this topic, please refer to this blog.
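
One way to run that comparison yourself (a sketch) is to check the plan and timing with EXPLAIN ANALYZE, drop the index, and run the same statement again:

EXPLAIN ANALYZE
SELECT * FROM toys
ORDER BY text_embeddings <=> CAST(embedding('text-embedding-005', 'I want a white plush teddy bear toy with a floral pattern') AS vector(768))
LIMIT 5;

-- Drop the index, then rerun the EXPLAIN ANALYZE above to compare the timings without ScaNN.
DROP INDEX toysearch_index;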

8. Match Validation with the LLM

Before moving on and creating a service to return the best matches to an application, let's use a generative AI model to validate if these potential responses are truly relevant and safe to share with the user.

Ensuring the instance is set up for Gemini

First, check whether the Google ML Integration is already enabled for your cluster and instance. In AlloyDB Studio, run the following command:

show google_ml_integration.enable_model_support;

If the value is shown as "on", you can skip the next 2 steps and go directly to setting up the AlloyDB and Vertex AI Model integration.

  1. Go to your AlloyDB cluster's primary instance and click EDIT PRIMARY INSTANCE

cb76b934ba3735bd.png

  2. Navigate to the Flags section under the Advanced Configuration Options and ensure that the google_ml_integration.enable_model_support flag is set to "on" as shown below:

6a59351fcd2a9d35.png

If it is not set to "on", set it to "on" and then click the UPDATE INSTANCE button. This step will take a few minutes.

AlloyDB and Vertex AI Model integration

Now you can connect to AlloyDB Studio and run the following DML statement to set up Gemini model access from AlloyDB, using your project ID where indicated. You may be warned of a syntax error before running the command, but it should run fine.

First up, we create the Gemini 1.5 model connection as shown below. Remember to replace $PROJECT_ID in the command below with your Google Cloud Project Id.

CALL
 google_ml.create_model( model_id => 'gemini-1.5',
   model_request_url => 'https://us-central1-aiplatform.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.5-pro:streamGenerateContent',
   model_provider => 'google',
   model_auth_type => 'alloydb_service_agent_iam');

You can check on the models configured for access via the following command in AlloyDB Studio:

select model_id,model_type from google_ml.model_info_view;        

Finally, we need to grant permission for database users to execute the ml_predict_row function to run predictions via Google Vertex AI models. Run the following command:

GRANT EXECUTE ON FUNCTION ml_predict_row to postgres;

Note: If you are using an existing Google Cloud project and an existing AlloyDB cluster/instance created a while ago, you might need to drop the old references to the gemini-1.5 model, create the model again with the above CALL statement, and run GRANT EXECUTE ON FUNCTION ml_predict_row again if you face issues in the upcoming invocations of gemini-1.5.
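
Before running the larger validation query in the next section, you can optionally send a minimal request to confirm the model is reachable (a sketch, assuming the gemini-1.5 model registered above):

SELECT google_ml.predict_row(
  model_id => 'gemini-1.5',
  request_body => '{ "contents": [ { "role": "user", "parts": [ { "text": "Say hello in one short sentence." } ] } ] }'::json);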

Evaluating the responses

While we'll end up using one large query in the next section to ensure the responses from the query are reasonable, the query can be difficult to understand. We'll look at the pieces now and see how they come together in a few minutes.

  1. First we'll send a request to the database to get the 10 closest matches to a user query.
  2. To determine how valid the responses are, we'll use an outer query that explains how to evaluate them. The outer query uses the recommended_text field (the search text) and the content field (the toy description) from the inner table.
  3. Using that, we'll then review the "goodness" of responses returned.
  4. google_ml.predict_row returns its result in JSON format. The expression "-> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text'" extracts the actual text from that JSON. To see the full JSON that is returned, you can remove this expression.
  5. Finally, to get the LLM response, we extract it using REGEXP_REPLACE(gemini_validation, '[^a-zA-Z,: ]', '', 'g')
SELECT id,
       name,
       content,
       quantity,
       price,
       image_url,
       recommended_text,
       REGEXP_REPLACE(gemini_validation, '[^a-zA-Z,: ]', '', 'g') AS gemini_validation
  FROM (SELECT id,
               name,
               content,
               quantity,
               price,
               image_url,
               recommended_text,
               CAST(ARRAY_AGG(LLM_RESPONSE) AS TEXT) AS gemini_validation
          FROM (SELECT id,
                       name,
                       content,
                       quantity,
                       price,
                       image_url,
                       recommended_text,
                        json_array_elements(
                          google_ml.predict_row(
                            model_id => 'gemini-1.5',
                            request_body => CONCAT(
                              '{ "contents": [ { "role": "user", "parts": [ { "text": "User wants to buy a toy and this is the description of the toy they wish to buy: ',
                              recommended_text,
                              '. Check if the following product items from the inventory are close enough to really, contextually match the user description. Here are the items: ',
                              content,
                              '. Return a ONE-LINE response with 3 values: 1) MATCH: if the 2 contexts are reasonably matching in terms of any of the color or color family specified in the list, approximate style match with any of the styles mentioned in the user search text: This should be a simple YES or NO. Choose NO only if it is completely irrelevant to users search criteria. 2) PERCENTAGE: percentage of match, make sure that this percentage is accurate 3) DIFFERENCE: A clear one-line easy description of the difference between the 2 products. Remember if the user search text says that some attribute should not be there, and the record has it, it should be a NO match. " } ] } ] }')::JSON)) -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text' :: TEXT AS LLM_RESPONSE
                  FROM (SELECT id,
                               name,
                               description AS content,
                               quantity,
                               price,
                               image_url,
                               'Pink panther standing' AS recommended_text
                          FROM toys
                         ORDER BY text_embeddings <=> embedding('text-embedding-005',
                                                                'Pink panther standing')::VECTOR
                         LIMIT 10) AS xyz) AS X
         GROUP BY id,
                  name,
                  content,
                  quantity,
                  price,
                  image_url,
                  recommended_text) AS final_matches
 WHERE REGEXP_REPLACE(gemini_validation, '[^a-zA-Z,: ]', '', 'g') LIKE '%MATCH%:%YES%';

While that still might look daunting, hopefully you can make a bit more sense out of it. The results tell whether or not there's a match, what percentage the match is, and some explanation of the rating.

Notice that the Gemini model has streaming on by default, so the actual response is spread across multiple lines:

c2b006aeb3f3a2fc.png

9. Take the Toy Search to Cloud Serverlessly

Ready to take this app to the web? Follow the steps below to make this toy search engine serverless with Cloud Run Functions:

  1. Go to Cloud Run Functions in Google Cloud Console to CREATE a new Cloud Run Function or use the link: https://console.cloud.google.com/functions/add.
  2. Select the Environment as "Cloud Run function". Provide Function Name "get-toys-alloydb" and choose Region as "us-central1". Set Authentication to "Allow unauthenticated invocations" and click NEXT. Choose Java 17 as runtime and Inline Editor for the source code.
  3. By default it would set the Entry Point to "gcfv2.HelloHttpFunction". Replace the placeholder code in HelloHttpFunction.java and pom.xml of your Cloud Run Function with the code from HelloHttpFunction.java and the pom.xml respectively.
  4. Remember to change the <<YOUR_PROJECT>> placeholder and the AlloyDB connection credentials with your values in the Java file. The AlloyDB credentials are the ones that we had used at the start of this codelab. If you have used different values, please modify the same in the Java file.
  5. Click Deploy.

Once deployed, in order to allow the Cloud Function to access our AlloyDB database instance, we'll create the VPC connector.

IMPORTANT STEP:

Once the deployment has started, you should be able to see the function in the Google Cloud Run Functions console. Search for the newly created function (get-toys-alloydb), click on it, then click EDIT and change the following:

  1. Go to Runtime, build, connections and security settings
  2. Increase the timeout to 180 seconds
  3. Go to the CONNECTIONS tab:

4e83ec8a339cda08.png

  4. Under the Ingress settings, make sure "Allow all traffic" is selected.
  5. Under the Egress settings, click on the Network dropdown, select the "Add New VPC Connector" option, and follow the instructions in the dialog box that pops up:

8126ec78c343f199.png

  6. Provide a name for the VPC Connector and make sure the region is the same as your instance. Leave the Network value as default and set Subnet as Custom IP Range with an IP range of 10.8.0.0 or something similar that is available.
  7. Expand SHOW SCALING SETTINGS and make sure the configuration is set exactly as follows:

7baf980463a86a5c.png

  8. Click CREATE and this connector should now be listed in the egress settings.
  9. Select the newly created connector.
  10. Opt for all traffic to be routed through this VPC connector.
  11. Click NEXT and then DEPLOY.

10. Test the Cloud Run Function

Once the updated Cloud Function is deployed, you should see the endpoint in the following format:

https://us-central1-YOUR_PROJECT_ID.cloudfunctions.net/get-toys-alloydb

Alternatively, you can test the Cloud Run Function as follows:

PROJECT_ID=$(gcloud config get-value project)

curl -X POST https://us-central1-$PROJECT_ID.cloudfunctions.net/get-toys-alloydb \
  -H 'Content-Type: application/json' \
  -d '{"search":"I want a standing pink panther toy"}' \
  | jq .

And the result:

23861e9091565a64.png

That's it! It is that simple to perform Similarity Vector Search using the Embeddings model on AlloyDB data.

11. Building the Web Application Client!

In this part, we will build a web application for the user to interact with and find matching toys based on text, image and even create a new toy based on their needs. Since the application is already built, you can follow the steps below to copy that over to your IDE and get the app up and running.

  1. Since we use Gemini 2.0 Flash to describe the image that the user may upload to find matching toys, we need an API key for this application. To get one, go to https://aistudio.google.com/apikey, get your API key for the active Google Cloud project that you are implementing this application in, and save the key somewhere:

ae2db169e6a94e4a.png

  2. Navigate to the Cloud Shell Terminal
  3. Clone the repo with the following command:
git clone https://github.com/AbiramiSukumaran/toysearch

cd toysearch
  4. Once the repo is cloned, you should be able to access the project from your Cloud Shell Editor.
  5. Delete the folders "get-toys-alloydb" and "toolbox-toys" from the cloned project, because these are Cloud Run Functions code that can be referenced from the repo when you need them.
  6. Make sure all the necessary environment variables are set before you build and deploy the app. Navigate to the Cloud Shell Terminal and execute the following:
PROJECT_ID=$(gcloud config get-value project)

export PROJECT_ID=$PROJECT_ID

export GOOGLE_API_KEY=<YOUR API KEY that you saved>
  7. Build and run the app locally:

Making sure you are in the project directory, run the following commands:

mvn package

mvn spring-boot:run 
  8. Deploy on Cloud Run:
gcloud run deploy --source .

12. Understanding the Generative AI details

No action needed. Just for your understanding:

Now that you have the application deployed, take a moment to understand how we accomplished the search (text and image) and the generation.

  1. User text based Vector Search:

This is already addressed in the Cloud Run Function that we deployed in the "Take the Toy Search to Cloud Serverlessly" section.

  2. Image upload based Vector Search:

Instead of typing the search context as text, the user can upload a picture of a familiar toy they want to search with and get relevant features from it.

We leverage Google's Gemini 2.0 Flash model, invoked using LangChain4j, to analyze the image and extract relevant context, such as the toy's color, material, type, and intended age group.

In just 5 steps, we go from the user's multimodal input to matching results, invoking the large language model through an open source framework. Here's how:

package cloudcode.helloworld.web;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.data.message.TextContent;
import java.util.Base64;
import java.util.Optional;

public class GeminiCall {
  public String imageToBase64String(byte[] imageBytes) {
    String base64Img = Base64.getEncoder().encodeToString(imageBytes);
    return base64Img;
  }

  public String callGemini(String base64ImgWithPrefix) throws Exception {
    String searchText = "";

    // 1. Remove the prefix
    String base64Img = base64ImgWithPrefix.replace("data:image/jpeg;base64,", "");

    // 2. Decode base64 to bytes
    byte[] imageBytes = Base64.getDecoder().decode(base64Img);
    String image = imageToBase64String(imageBytes);

    // 3. Get API key from environment variable
    String apiKey = Optional.ofNullable(System.getenv("GOOGLE_API_KEY"))
        .orElseThrow(() -> new IllegalArgumentException("GOOGLE_API_KEY environment variable not set"));

    // 4. Invoke Gemini 2.0
    ChatLanguageModel gemini = GoogleAiGeminiChatModel.builder()
        .apiKey(apiKey)
        .modelName("gemini-2.0-flash-001")
        .build();

    Response<AiMessage> response = gemini.generate(
        UserMessage.from(
            ImageContent.from(image, "image/jpeg"),
            TextContent.from(
                "The picture has a toy in it. Describe the toy in the image in one line. Do not add any prefix or title to your description. Just describe that toy that you see in the image in one line, do not describe the surroundings and other objects around the toy in the image. If you do not see any toy in the image, send  response stating that no toy is found in the input image.")));
   
    // 5. Get the text from the response and send it back to the controller
    searchText = response.content().text().trim();
    System.out.println("searchText inside Geminicall: " + searchText);
    return searchText;
  }
}
  3. Understand how we used Imagen 3 to build a customized toy based on user request with Generative AI.

Imagen 3 then generates an image of the custom-designed toy, giving the user a clear visualization of their creation. This is how we did it in just 5 steps:

// Generate an image using a text prompt using an Imagen model
    public String generateImage(String projectId, String location, String prompt)
        throws ApiException, IOException {
      final String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
      PredictionServiceSettings predictionServiceSettings =
      PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();
     
      // 1. Set up the context and prompt
      String context = "Generate a photo-realistic image of a toy described in the following input text from the user. Make sure you adhere to all the little details and requirements mentioned in the prompt. Ensure that the user is only describing a toy. If it is anything unrelated to a toy, politely decline the request stating that the request is inappropriate for the current context. ";
      prompt = context + prompt;

      // 2. Initialize a client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      try (PredictionServiceClient predictionServiceClient =
          PredictionServiceClient.create(predictionServiceSettings)) {
 
      // 3. Invoke Imagen 3
        final EndpointName endpointName =
            EndpointName.ofProjectLocationPublisherModelName(
                projectId, location, "google", "imagen-3.0-generate-001"); //"imagegeneration@006"; imagen-3.0-generate-001
        Map<String, Object> instancesMap = new HashMap<>();
        instancesMap.put("prompt", prompt);
        Value instances = mapToValue(instancesMap);
        Map<String, Object> paramsMap = new HashMap<>();
        paramsMap.put("sampleCount", 1);
        paramsMap.put("aspectRatio", "1:1");
        paramsMap.put("safetyFilterLevel", "block_few");
        paramsMap.put("personGeneration", "allow_adult");
        paramsMap.put("guidanceScale", 21);
        paramsMap.put("imagenControlScale", 0.95); //Setting imagenControlScale
        Value parameters = mapToValue(paramsMap);
       
      // 4. Get prediction response image
        PredictResponse predictResponse =
            predictionServiceClient.predict(
                endpointName, Collections.singletonList(instances), parameters);

      // 5. Return the Base64-encoded image string to the controller
        String bytesBase64EncodedOutput = "";
        for (Value prediction : predictResponse.getPredictionsList()) {
          Map<String, Value> fieldsMap = prediction.getStructValue().getFieldsMap();
          if (fieldsMap.containsKey("bytesBase64Encoded")) {
            bytesBase64EncodedOutput = fieldsMap.get("bytesBase64Encoded").getStringValue();
          }
        }
        return bytesBase64EncodedOutput;
      }
    }

Price Prediction

In the previous section, we discussed how Imagen generates the image of a toy that the user designs on their own. For the user to be able to buy it, the application needs to set a price for it, and we have employed an intuitive logic to define a price for the custom made-to-order toy: use the average price of the top 5 toys whose descriptions most closely match the toy the user designs.
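
A minimal SQL sketch of that logic (assuming the description of the custom toy is passed in as the search text; the actual implementation lives in the price prediction tool):

SELECT AVG(price) AS predicted_price
FROM (SELECT price
        FROM toys
       ORDER BY text_embeddings <=> CAST(embedding('text-embedding-005', 'plush dragon with purple wings and a friendly face') AS vector(768))
       LIMIT 5) AS closest_matches;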

The price prediction for the generated toy is an important part of this application and we have used an agentic approach to generate this. Introducing Gen AI Toolbox for Databases.

13. Gen AI Toolbox for Databases

Gen AI Toolbox for Databases is an open source server from Google that makes it easier to build Gen AI tools for interacting with databases. It enables you to develop tools more easily, quickly, and securely by handling complexities such as connection pooling and authentication, and it helps you build Gen AI tools that let your agents access data in your database.

Here are the steps to follow to set this up, get your tool ready, and make the application agentic: Link to Toolbox Codelab

Your application can now use this deployed Cloud Run Function endpoint to populate the price, along with the generated Imagen result, for the custom made-to-order toy.

14. Test your Web Application

Now that all the components of your application are built and deployed, it is ready to be served on the cloud. Test your application for all scenarios. Here is a video link to what you might expect:

https://www.youtube.com/shorts/ZMqUAWsghYQ

This is what the landing page looks like:

241db19e7176e93e.png

15. Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this lab, follow these steps:

  1. In the Google Cloud console, go to the Manage resources page.
  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

16. Congratulations

Congratulations! You have successfully performed a Toystore Contextual Search and Generation using AlloyDB, pgvector, Imagen and Gemini 2.0 while leveraging open source libraries to build robust integrations. By combining the capabilities of AlloyDB, Vertex AI, and Vector Search, we've taken a giant leap forward in making contextual and vector searches accessible, efficient, and truly meaning-driven.