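Generative AI powered chat with users and docs in Java with PaLM and LangChain4J

1. Introduction

Last Updated: 2024-02-05

What is Generative AI

Generative AI, or generative artificial intelligence, refers to the use of AI to create new content, such as text, images, music, audio, and videos.

Generative AI is powered by foundation models (large AI models) that can perform tasks out of the box, including summarization, Q&A, classification, and more. With minimal training, foundation models can also be adapted to targeted use cases using very little example data.

How does Generative AI work?

Generative AI works by using an ML (Machine Learning) model to learn the patterns and relationships in a dataset of human-created content. It then uses the learned patterns to generate new content.

The most common way to train a generative AI model is supervised learning: the model is given a set of human-created content and corresponding labels. It then learns to generate content that is similar to the human-created content and carries the same labels.

What are common Generative AI applications?

Generative AI processes vast amounts of content, creating insights and answers via text, images, and user-friendly formats. Generative AI can be used to:

Improve customer interactions through enhanced chat and search experiences
Explore vast amounts of unstructured data through conversational interfaces and summarizations
Assist with repetitive tasks like replying to requests for proposals (RFPs), localizing marketing content in five languages, checking customer contracts for compliance, and more

What Generative AI offerings does Google Cloud have?

With Vertex AI, you can interact with, customize, and embed foundation models into your applications, with little to no ML expertise required. You can access foundation models on Model Garden, tune models via a simple UI on Generative AI Studio, or use models in a data science notebook.

Vertex AI Search and Conversation offers developers the fastest way to build generative AI powered search engines and chatbots.

Duet AI is an AI-powered collaborator, available across Google Cloud and IDEs, that helps you get more done, faster.

What is this codelab focusing on?

This codelab focuses on the PaLM 2 Large Language Model (LLM), hosted on Google Cloud Vertex AI, which encompasses all the machine learning products and services.

You will use Java to interact with the PaLM API, in conjunction with the LangChain4J LLM framework orchestrator. You'll go through several concrete examples that take advantage of the LLM for question answering, idea generation, entity and structured content extraction, and summarization.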

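Tell me more about the LangChain4J framework!

The LangChain4J framework is an open source library for integrating large language models into your Java applications. It orchestrates various components: the LLM itself, but also other tools such as vector databases (for semantic search), document loaders and splitters (to analyze documents and learn from them), output parsers, and more.

From the GitHub project page:

The goal of this project is to simplify the integration of AI/LLM capabilities into Java applications.

This is achieved thanks to:

A simple and coherent layer of abstractions, designed so that your code does not depend on concrete implementations such as LLM providers or embedding store providers. This makes it easy to swap components.
Numerous implementations of the above-mentioned abstractions, giving you a variety of LLMs and embedding stores to choose from.
A range of popular features built on top of LLMs, such as:
The capability to ingest your own data (documentation, codebase, etc.), allowing the LLM to act and respond based on your data.
Autonomous agents for delegating tasks (defined on the fly) to the LLM, which will strive to complete them.
Prompt templates that help you get the highest possible quality of LLM responses.
Memory that provides context to the LLM for your current and past conversations.
Structured outputs for receiving responses from the LLM with a desired structure, as Java POJOs.
"AI Services" for declaratively defining complex AI behavior behind a simple API.
Chains that reduce the need for extensive boilerplate code in common use cases.
Auto-moderation to ensure that all inputs to and outputs from the LLM are not harmful.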

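What you'll learn

How to set up a Java project to use PaLM and LangChain4J
How to extract useful information from unstructured content (entity or keyword extraction, output in JSON)
How to create a conversation with your users
How to use the chat model to ask questions about your own documentation

What you'll need

Knowledge of the Java programming language
A Google Cloud project
A browser, such as Chrome or Firefox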

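2. Setup and requirements

Self-paced environment setup

Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

The Project name is the display name for this project's participants. It is a character string not used by Google APIs, and you can update it at any time.
The Project ID is unique across all Google Cloud projects and is immutable (it cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference your Project ID (typically identified as PROJECT_ID). If you don't like the generated ID, you can generate another random one. Alternatively, you can try your own and see if it's available. It remains fixed for the duration of the project.
For your information, there is a third value, a Project Number, which some APIs use. Learn more about all three of these values in the documentation.

Next, you'll need to enable billing in the Cloud Console to use Cloud resources/APIs. Running through this codelab shouldn't cost much. To shut down resources and avoid incurring billing beyond this tutorial, you can delete the resources you created or delete the project. New Google Cloud users are eligible for the $300 USD Free Trial program.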

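Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will use Cloud Shell, a command line environment running in the Cloud.

Activate Cloud Shell

From the Cloud Console, click Activate Cloud Shell.

If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If you see that screen, click Continue.

It should only take a few moments to provision and connect to Cloud Shell.

This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, which greatly enhances network performance and authentication. Most, if not all, of your work in this codelab can be done with a browser.

Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.

Run the following command in Cloud Shell to confirm that you are authenticated: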

gcloud auth list

Command output

 Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:

gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If the project is not set, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

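3. Preparing your development environment

In this codelab, you're going to use the Cloud Shell terminal and code editor to develop your Java programs.

Enable the Vertex AI API

In the Google Cloud console, make sure that your project name is displayed at the top of the console. If it's not, click Select a project to open the Project Selector, and select your intended project.
If you aren't in the Vertex AI part of the Google Cloud console, do the following:
In Search, enter Vertex AI, then press Enter.
In the search results, click Vertex AI. The Vertex AI dashboard appears.
In the Vertex AI dashboard, click Enable All Recommended APIs.

This enables several APIs, but the most important one for the codelab is aiplatform.googleapis.com. You can also enable it from the command line, in the Cloud Shell terminal, by running the following command: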

$ gcloud services enable aiplatform.googleapis.com

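Create the project structure with Gradle

To build your Java code examples, you'll use the Gradle build tool and version 17 of Java. To set up your project with Gradle, in the Cloud Shell terminal, create a directory (here, palm-workshop) and run the gradle init command in that directory: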

$ mkdir palm-workshop
$ cd palm-workshop

$ gradle init

Select type of project to generate:
  1: basic
  2: application
  3: library
  4: Gradle plugin
Enter selection (default: basic) [1..4] 2

Select implementation language:
  1: C++
  2: Groovy
  3: Java
  4: Kotlin
  5: Scala
  6: Swift
Enter selection (default: Java) [1..6] 3

Split functionality across multiple subprojects?:
  1: no - only one application project
  2: yes - application and library projects
Enter selection (default: no - only one application project) [1..2] 1

Select build script DSL:
  1: Groovy
  2: Kotlin
Enter selection (default: Groovy) [1..2] 1

Generate build using new APIs and behavior (some features may change in the next minor release)? (default: no) [yes, no] 

Select test framework:
  1: JUnit 4
  2: TestNG
  3: Spock
  4: JUnit Jupiter
Enter selection (default: JUnit Jupiter) [1..4] 4

Project name (default: palm-workshop): 
Source package (default: palm.workshop): 

> Task :init
Get more help with your project: https://docs.gradle.org/7.4/samples/sample_building_java_applications.html

BUILD SUCCESSFUL in 51s
2 actionable tasks: 2 executed

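You will build an application (option 2), using the Java language (option 3), without subprojects (option 1), using the Groovy syntax for the build file (option 1), without the new build features (option no), and generating tests with JUnit Jupiter (option 4). For the project name you can use palm-workshop, and for the source package you can use palm.workshop.

The project structure will look as follows: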

├── gradle 
│   └── ...
├── gradlew 
├── gradlew.bat 
├── settings.gradle 
└── app
    ├── build.gradle 
    └── src
        ├── main
        │   └── java 
        │       └── palm
        │           └── workshop
        │               └── App.java
        └── test
            └── ...

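Let's update the app/build.gradle file to add the dependencies we need. You can remove the guava dependency if it is present, and replace it with the dependencies for the LangChain4J project and a logging library (to avoid nagging messages about a missing logger):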

dependencies {
    // Use JUnit Jupiter for testing.
    testImplementation 'org.junit.jupiter:junit-jupiter:5.8.1'

    // Logging library
    implementation 'org.slf4j:slf4j-jdk14:2.0.9'

    // This dependency is used by the application.
    implementation 'dev.langchain4j:langchain4j-vertex-ai:0.24.0'
    implementation 'dev.langchain4j:langchain4j:0.24.0'
}

There are two dependencies for LangChain4J:

one for the core project,
one for the dedicated Vertex AI module.

To use Java 17 for compiling and running your programs, add the following block below the plugins {} block:

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(17)
    }
}

One more change is needed: update the application block of app/build.gradle, so that users can override the main class to run from the command line when invoking the build tool:

application {
    mainClass = providers.systemProperty('javaMainClass')
                         .orElse('palm.workshop.App')
}

To check that your build file is ready to run your application, you can run the default main class, which prints a simple Hello World! message:

$ ./gradlew run -DjavaMainClass=palm.workshop.App

> Task :app:run
Hello World!

BUILD SUCCESSFUL in 3s
2 actionable tasks: 2 executed

You are now ready to program with the PaLM large language models, using the LangChain4J project!

For reference, here's what the full app/build.gradle build file should look like now:

plugins {
    // Apply the application plugin to add support for building a CLI application in Java.
    id 'application'
}

java {
    toolchain {
        // Ensure we compile and run on Java 17
        languageVersion = JavaLanguageVersion.of(17)
    }
}

repositories {
    // Use Maven Central for resolving dependencies.
    mavenCentral()
}

dependencies {
    // Use JUnit Jupiter for testing.
    testImplementation 'org.junit.jupiter:junit-jupiter:5.8.1'

    // This dependency is used by the application.
    implementation 'dev.langchain4j:langchain4j-vertex-ai:0.24.0'
    implementation 'dev.langchain4j:langchain4j:0.24.0'
    implementation 'org.slf4j:slf4j-jdk14:2.0.9'
}

application {
    mainClass = providers.systemProperty('javaMainClass').orElse('palm.workshop.App')
}

tasks.named('test') {
    // Use JUnit Platform for unit tests.
    useJUnitPlatform()
}

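4. Making your first call to PaLM's chat model

Now that the project is properly set up, it is time to call the PaLM API.

Create a new class called ChatPrompts.java in the app/src/main/java/palm/workshop directory (alongside the default App.java class), and type the following content: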

package palm.workshop;

import dev.langchain4j.model.vertexai.VertexAiChatModel;
import dev.langchain4j.chain.ConversationalChain;

public class ChatPrompts {
    public static void main(String[] args) {
        VertexAiChatModel model = VertexAiChatModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("YOUR_PROJECT_ID")
            .location("us-central1")
            .publisher("google")
            .modelName("chat-bison@001")
            .maxOutputTokens(400)
            .maxRetries(3)
            .build();

        ConversationalChain chain = ConversationalChain.builder()
            .chatLanguageModel(model)
            .build();

        String message = "What are large language models?";
        String answer = chain.execute(message);
        System.out.println(answer);

        System.out.println("---------------------------");

        message = "What can you do with them?";
        answer = chain.execute(message);
        System.out.println(answer);

        System.out.println("---------------------------");

        message = "Can you name some of them?";
        answer = chain.execute(message);
        System.out.println(answer);
    }
}

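In this first example, you need to import the VertexAiChatModel class and the LangChain4J ConversationalChain, which makes it easier to handle the multi-turn aspect of conversations.

Next, in the main method, you configure the chat language model, using the builder for the VertexAiChatModel, to specify:

the endpoint,
the project,
the region,
the publisher,
the name of the model (chat-bison@001).

Now that the language model is ready, you can prepare a ConversationalChain. This is a higher-level abstraction offered by LangChain4J to configure together the different components that handle a conversation: the chat language model itself, but potentially also components that handle the history of the chat conversation, or other tools such as retrievers that fetch information from vector databases. But don't worry, we'll come back to that later in this codelab.

Then, you have a multi-turn conversation with the chat model, asking several interrelated questions. First you wonder about LLMs, then you ask what you can do with them, and then what some examples of them are. Notice that you don't have to repeat yourself: the LLM knows that "them" means LLMs in the context of that conversation.

To conduct that multi-turn conversation, you just call the execute() method on the chain. It adds your message to the context of the conversation, the chat model generates a reply, and that reply is added to the chat history as well.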
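
The bookkeeping behind execute() can be pictured with a small, library-free sketch. ConversationSketch and its function-based "model" are hypothetical names introduced for illustration only; this is not how LangChain4J implements ConversationalChain, just a minimal model of "add the user message, call the model with the whole history, record the reply".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a conversational chain; not the LangChain4J implementation.
class ConversationSketch {
    private final List<String> history = new ArrayList<>();
    private final Function<List<String>, String> model;

    ConversationSketch(Function<List<String>, String> model) {
        this.model = model;
    }

    // Adds the user message to the history, asks the model for a reply
    // based on the whole history, then records the reply as well.
    String execute(String userMessage) {
        history.add("User: " + userMessage);
        String reply = model.apply(List.copyOf(history));
        history.add("AI: " + reply);
        return reply;
    }

    List<String> history() {
        return List.copyOf(history);
    }

    public static void main(String[] args) {
        // Fake model: reports how many messages it received as context.
        ConversationSketch chain =
            new ConversationSketch(messages -> "I can see " + messages.size() + " message(s)");
        System.out.println(chain.execute("What are large language models?"));
        System.out.println(chain.execute("What can you do with them?"));
    }
}
```

Because the second call sees the first question and its answer, a real model receiving that history has what it needs to resolve a pronoun such as "them".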

To run this class, run the following command in the Cloud Shell terminal:

./gradlew run -DjavaMainClass=palm.workshop.ChatPrompts

You should see an output similar to this one:

$ ./gradlew run -DjavaMainClass=palm.workshop.ChatPrompts
Starting a Gradle Daemon, 2 incompatible and 2 stopped Daemons could not be reused, use --status for details

> Task :app:run
Large language models (LLMs) are artificial neural networks that are trained on massive datasets of text and code. They are designed to understand and generate human language, and they can be used for a variety of tasks, such as machine translation, question answering, and text summarization.
---------------------------
LLMs can be used for a variety of tasks, such as:

* Machine translation: LLMs can be used to translate text from one language to another.
* Question answering: LLMs can be used to answer questions posed in natural language.
* Text summarization: LLMs can be used to summarize text into a shorter, more concise form.
* Code generation: LLMs can be used to generate code, such as Python or Java code.
* Creative writing: LLMs can be used to generate creative text, such as poems, stories, and scripts.

LLMs are still under development, but they have the potential to revolutionize a wide range of industries. For example, LLMs could be used to improve customer service, create more personalized marketing campaigns, and develop new products and services.
---------------------------
Some of the most well-known LLMs include:

* GPT-3: Developed by OpenAI, GPT-3 is a large language model that can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.
* LaMDA: Developed by Google, LaMDA is a large language model that can chat with you in an open-ended way, answering your questions, telling stories, and providing different kinds of creative content.
* PaLM 2: Developed by Google, PaLM 2 is a large language model that can perform a wide range of tasks, including machine translation, question answering, and text summarization.
* T5: Developed by Google, T5 is a large language model that can be used for a variety of tasks, including text summarization, question answering, and code generation.

These are just a few examples of the many LLMs that are currently being developed. As LLMs continue to improve, they are likely to play an increasingly important role in our lives.

BUILD SUCCESSFUL in 25s
2 actionable tasks: 2 executed

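PaLM replied to your 3 related questions!

The VertexAiChatModel builder lets you define optional parameters, which already have default values that you can override. Here are some examples:

.temperature(0.2): defines how creative you want the response to be (0 gives less creative and often more factual answers, while 1 gives more creative outputs).
.maxOutputTokens(50): caps the length of the generated answer; the example requested 400 tokens (as a rough rule of thumb for English text, 4 tokens correspond to about 3 words).
.topK(20): randomly selects the next word among the 20 most probable candidates for the text completion (the value can range from 1 to 40).
.topP(0.95): selects among the most probable words whose cumulative probability adds up to that floating point number (between 0 and 1).
.maxRetries(3): if you exceed the requests-per-time quota, the call to the model can be retried, here up to 3 times.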
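
To make topK and topP more concrete, here is a minimal sketch of the two filtering steps over a toy probability table. The class name SamplingSketch and the probabilities are invented for illustration, and the exact sampling procedure inside PaLM is not described in this codelab, so read this as the general idea rather than the model's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: filters candidate words the way top-k and top-p are usually described.
class SamplingSketch {
    // Keeps the k most probable candidates, most probable first.
    static List<String> topK(Map<String, Double> probabilities, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(probabilities.entrySet());
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            kept.add(entries.get(i).getKey());
        }
        return kept;
    }

    // Keeps the smallest set of most probable candidates whose cumulative probability reaches p.
    static List<String> topP(Map<String, Double> probabilities, double p) {
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (String word : topK(probabilities, probabilities.size())) {
            kept.add(word);
            cumulative += probabilities.get(word);
            if (cumulative >= p) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Toy distribution for the next word (made-up numbers).
        Map<String, Double> next = Map.of("the", 0.5, "cat", 0.3, "dog", 0.15, "zebra", 0.05);
        System.out.println(topK(next, 2));
        System.out.println(topP(next, 0.9));
    }
}
```

The next word is then drawn at random from the words that survive the filter, which is why lower values of topK and topP make the output more predictable.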

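5. A useful chatbot with a personality!

In the previous section, you started asking questions to the LLM chatbot right away, without giving it any particular context. But you can specialize such a chatbot to become an expert at a particular task, or on a particular topic.

How do you do that? By setting the stage: explain to the LLM the task at hand and the context, perhaps give a few examples of what it has to do, which persona it should have, in which format you'd like to get responses, and potentially a tone, if you want the chatbot to behave in a certain way.

The following article on crafting prompts illustrates this approach nicely with a graphic: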

https://medium.com/@eldatero/master-the-perfect-chatgpt-prompt-formula-c776adae8f19

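To illustrate this point, let's get some inspiration from the prompts.chat website, which lists many great and fun ideas for custom-tailored chatbots that act as, for example:

an emoji translator, to translate user messages into emojis,
a prompt enhancer, to create better prompts,
a journal reviewer, to help review research papers,
a personal stylist, to get clothing style suggestions.

There's one example that turns an LLM chatbot into a chess player! Let's implement that.

Update the ChatPrompts class as follows: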

package palm.workshop;

import dev.langchain4j.chain.ConversationalChain;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.vertexai.VertexAiChatModel;
import dev.langchain4j.store.memory.chat.InMemoryChatMemoryStore;

public class ChatPrompts {
    public static void main(String[] args) {
        VertexAiChatModel model = VertexAiChatModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("YOUR_PROJECT_ID")
            .location("us-central1")
            .publisher("google")
            .modelName("chat-bison@001")
            .maxOutputTokens(7)
            .maxRetries(3)
            .build();

        InMemoryChatMemoryStore chatMemoryStore = new InMemoryChatMemoryStore();

        MessageWindowChatMemory chatMemory = MessageWindowChatMemory.builder()
            .chatMemoryStore(chatMemoryStore)
            .maxMessages(200)
            .build();

        chatMemory.add(SystemMessage.from("""
            You're an expert chess player with a high ELO ranking.
            Use the PGN chess notation to reply with the best next possible move.
            """
        ));


        ConversationalChain chain = ConversationalChain.builder()
            .chatLanguageModel(model)
            .chatMemory(chatMemory)
            .build();

        String pgn = "";
        String[] whiteMoves = { "Nf3", "c4", "Nc3", "e3", "Dc2", "Cd5"};
        for (int i = 0; i < whiteMoves.length; i++) {
            pgn += " " + (i+1) + ". " + whiteMoves[i];
            System.out.println("Playing " + whiteMoves[i]);
            pgn = chain.execute(pgn);
            System.out.println(pgn);
        }
    }
}

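Let's break it down step by step:

Some new imports are needed to handle the memory of the chat.
You instantiate the chat model with a small maximum number of tokens (here, 7), as you only want to generate the next move, not a whole dissertation about chess!
Next, you create a chat memory store to save the chat conversations.
You create an actual windowed chat memory, to retain the last moves.
In the chat memory, you add a "system" message that instructs the chat model about who it is supposed to be (an expert chess player). The "system" message adds some context, whereas "user" and "AI" messages are the actual discussion.
You create a conversational chain that combines the memory and the chat model.
Then, you iterate over a list of moves for white. The chain is executed with the next white move each time, and the chat model replies with the next best move. Note that the last two white moves, Dc2 and Cd5, use French piece letters (D for the queen, C for the knight); in standard English PGN notation they would be written Qc2 and Nd5.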
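
The idea of a windowed chat memory can be sketched without the library. WindowMemorySketch is a hypothetical class written for this illustration; in particular, keeping the system message permanently while evicting the oldest other messages is a design choice of this sketch, not a statement about how MessageWindowChatMemory behaves internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sliding-window memory; not the LangChain4J MessageWindowChatMemory.
class WindowMemorySketch {
    private final int maxMessages;
    private String systemMessage; // kept outside the window in this sketch
    private final Deque<String> window = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    void setSystemMessage(String text) {
        systemMessage = text;
    }

    // Appends a message and evicts the oldest ones once the window is full.
    void add(String message) {
        window.addLast(message);
        int capacity = systemMessage == null ? maxMessages : maxMessages - 1;
        while (window.size() > capacity) {
            window.removeFirst();
        }
    }

    // Returns the system message (if any) followed by the retained messages, oldest first.
    List<String> messages() {
        List<String> all = new ArrayList<>();
        if (systemMessage != null) {
            all.add(systemMessage);
        }
        all.addAll(window);
        return all;
    }

    public static void main(String[] args) {
        WindowMemorySketch memory = new WindowMemorySketch(3);
        memory.setSystemMessage("You're an expert chess player.");
        for (String move : new String[] { "1. Nf3", "1... e5", "2. c4", "2... Nc6" }) {
            memory.add(move);
        }
        System.out.println(memory.messages());
    }
}
```

With a window of 200 messages, as in the codelab, a short game fits entirely; a small window like the one above shows why a model can lose track of earlier moves.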

When you run this class with these moves, you should see the following output:

$ ./gradlew run -DjavaMainClass=palm.workshop.ChatPrompts
Starting a Gradle Daemon (subsequent builds will be faster)

> Task :app:run
Playing Nf3
1... e5
Playing c4
2... Nc6
Playing Nc3
3... Nf6
Playing e3
4... Bb4
Playing Dc2
5... O-O
Playing Cd5
6... exd5

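Whoa! PaLM knows how to play chess? Well, not exactly, but during its training, the model may have seen some chess game commentaries, or even the PGN (Portable Game Notation) files of past games. This chatbot would likely not win against AlphaZero (the AI that defeats the best Go, Shogi, and Chess players), and the conversation may derail further down the road, because the model does not really remember the actual state of the game.

Chat models are very powerful: they can create rich interactions with your users and handle various contextual tasks. In the next section, we'll look at a useful task: extracting structured data from text.

6. Extracting information from unstructured text

In the previous section, you created conversations between a user and a chat language model. But with LangChain4J, you can also use a chat model to extract structured information from unstructured text.

Let's say you want to extract the name and age of a person, given a biography or description of that person. You can instruct the large language model to generate JSON data structures with a cleverly tweaked prompt (this is commonly called "prompt engineering").

Update the ChatPrompts class as follows: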

package palm.workshop;

import dev.langchain4j.model.vertexai.VertexAiChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;

public class ChatPrompts {

    static class Person {
        String name;
        int age;
    }

    interface PersonExtractor {
        @UserMessage("""
            Extract the name and age of the person described below.
            Return a JSON document with a "name" and an "age" property, \
            following this structure: {"name": "John Doe", "age": 34}
            Return only JSON, without any markdown markup surrounding it.
            Here is the document describing the person:
            ---
            {{it}}
            ---
            JSON: 
            """)
        Person extractPerson(String text);
    }

    public static void main(String[] args) {
        VertexAiChatModel model = VertexAiChatModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("YOUR_PROJECT_ID")
            .location("us-central1")
            .publisher("google")
            .modelName("chat-bison@001")
            .maxOutputTokens(300)
            .build();
        
        PersonExtractor extractor = AiServices.create(PersonExtractor.class, model);

        Person person = extractor.extractPerson("""
            Anna is a 23 year old artist based in Brooklyn, New York. She was born and 
            raised in the suburbs of Chicago, where she developed a love for art at a 
            young age. She attended the School of the Art Institute of Chicago, where 
            she studied painting and drawing. After graduating, she moved to New York 
            City to pursue her art career. Anna's work is inspired by her personal 
            experiences and observations of the world around her. She often uses bright 
            colors and bold lines to create vibrant and energetic paintings. Her work 
            has been exhibited in galleries and museums in New York City and Chicago.    
            """
        );

        System.out.println(person.name);
        System.out.println(person.age);
    }
}

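Let's have a look at the various steps in this file:

A Person class is defined to represent the details describing a person (their name and age).
The PersonExtractor interface is created with a method that, given an unstructured text string, returns a Person instance.
The extractPerson() method carries a @UserMessage annotation that associates a prompt with it. The model uses that prompt to extract the information and return the details as a JSON document, which is then parsed for you and unmarshalled into a Person instance.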
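
The last step, turning the model's JSON reply into a Person, can be illustrated with a deliberately naive sketch. PersonJsonSketch is a hypothetical helper that uses regular expressions on the two expected properties; it is an assumption for illustration only, and a real application (or the library itself) would rely on a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive illustration of unmarshalling {"name": ..., "age": ...}; not a general JSON parser.
class PersonJsonSketch {
    static class Person {
        String name;
        int age;
    }

    private static final Pattern NAME = Pattern.compile("\"name\"\\s*:\\s*\"([^\"]*)\"");
    private static final Pattern AGE = Pattern.compile("\"age\"\\s*:\\s*(\\d+)");

    // Extracts the "name" and "age" properties, or fails if either one is missing.
    static Person parse(String json) {
        Matcher name = NAME.matcher(json);
        Matcher age = AGE.matcher(json);
        if (!name.find() || !age.find()) {
            throw new IllegalArgumentException("Expected name and age properties in: " + json);
        }
        Person person = new Person();
        person.name = name.group(1);
        person.age = Integer.parseInt(age.group(1));
        return person;
    }

    public static void main(String[] args) {
        Person person = parse("{\"name\": \"Anna\", \"age\": 23}");
        System.out.println(person.name);
        System.out.println(person.age);
    }
}
```

The failure branch matters in practice: a model may wrap its answer in extra text or omit a property, which is why the prompt insists on returning only JSON.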

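Now let's look at the content of the main() method:

The chat model is instantiated.
A PersonExtractor object is created thanks to LangChain4J's AiServices class.
Then, you can simply call Person person = extractor.extractPerson(...) to extract the details of the person from the unstructured text, and get back a Person instance with the name and age.

Now, run this class with the following command: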

$ ./gradlew run -DjavaMainClass=palm.workshop.ChatPrompts

> Task :app:run
Anna
23

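Yes! This is Anna, and she is 23!

What is of particular interest with this AiServices approach is that you operate with strongly typed objects. You are not interacting directly with the chat LLM. Instead, you work with concrete types: the Person class represents the extracted personal information, and the PersonExtractor interface has an extractPerson() method that returns a Person instance. The notion of an LLM is abstracted away, and as a Java developer, you just manipulate normal classes and objects.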
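
How can a library hand you an object for an interface you only declared? One plausible mechanism, assumed here rather than taken from the codelab, is a JDK dynamic proxy that intercepts each method call and forwards it to some handler. The sketch below uses hypothetical names (TypedServiceSketch, NameExtractor) and a plain function in place of a model call.

```java
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Illustration of backing an interface at runtime with java.lang.reflect.Proxy.
class TypedServiceSketch {
    // Hypothetical service interface with a single String argument.
    interface NameExtractor {
        String extractName(String text);
    }

    // Creates an instance of the interface whose method calls are forwarded
    // to the handler with their first argument.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, Function<String, Object> handler) {
        return (T) Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] { type },
            (proxy, method, args) -> {
                if (method.getDeclaringClass() == Object.class) {
                    // toString(), equals() and hashCode() are delegated to the handler object.
                    return method.invoke(handler, args);
                }
                return handler.apply((String) args[0]);
            });
    }

    public static void main(String[] args) {
        // Fake "model": returns the first word of the text.
        NameExtractor extractor = create(NameExtractor.class, text -> text.split(" ")[0]);
        System.out.println(extractor.extractName("Anna is a 23 year old artist"));
    }
}
```

In a real AI service, the handler would render the prompt, call the chat model, and convert the reply to the method's return type; the caller still only sees an ordinary Java interface.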

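7. Retrieval Augmented Generation: chatting with your docs

Let's come back to conversations. This time, you will be able to ask questions about your documents. You will build a chatbot that retrieves relevant information from a database of extracts of your documents. The model uses that information to "ground" its answers, rather than generating responses purely from its training. This pattern is called RAG, or Retrieval Augmented Generation.

In a nutshell, Retrieval Augmented Generation has two phases:

Ingestion phase: documents are loaded and split into smaller chunks, and a vector representation of each chunk (a "vector embedding") is stored in a "vector database" that is capable of semantic search.

Query phase: users can now ask your chatbot questions about the documentation. The question is transformed into a vector as well, and compared with all the other vectors in the database. The most similar vectors are usually semantically related, and are returned by the vector database. Then, the LLM is given the context of the conversation and the snippets of text that correspond to the vectors returned by the database, and it is asked to ground its answer in those snippets.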
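
The comparison step of the query phase is commonly done with cosine similarity. The sketch below is a minimal, assumed illustration with tiny hand-written vectors (real embeddings have hundreds of dimensions), and SimilaritySketch is a hypothetical name; the codelab does not state which similarity measure the in-memory store uses.

```java
// Illustration of nearest-vector lookup with cosine similarity.
class SimilaritySketch {
    // Cosine of the angle between two vectors of equal length: 1 means same direction.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the index of the stored vector most similar to the query.
    static int mostSimilar(double[] query, double[][] stored) {
        int best = 0;
        for (int i = 1; i < stored.length; i++) {
            if (cosine(query, stored[i]) > cosine(query, stored[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up 2-dimensional "embeddings" of three text chunks.
        double[][] chunks = { { 1, 0 }, { 0, 1 }, { 1, 1 } };
        double[] question = { 0.9, 0.1 };
        System.out.println("Closest chunk index: " + mostSimilar(question, chunks));
    }
}
```

The text of the closest chunks, not the vectors themselves, is what ends up in the prompt sent to the LLM.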

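Preparing your documents

For this new demo, you will ask questions about the "transformer" neural network architecture, pioneered by Google, which is the basis of most modern large language models.

You can retrieve the research paper describing this architecture ("Attention is all you need") by using the wget command to download the PDF from the internet: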

wget -O attention-is-all-you-need.pdf \
    https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

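Implementing a conversational retrieval chain

Let's build the 2-phase approach piece by piece: first the document ingestion, and then the query time, when users ask questions about the document.

Document ingestion

The very first step of the document ingestion phase is to locate the PDF file that you downloaded, and prepare a PdfDocumentParser to read it: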

PdfDocumentParser pdfParser = new PdfDocumentParser();
Document document = pdfParser.parse(
    new FileInputStream(new File("/home/YOUR_USER_NAME/palm-workshop/attention-is-all-you-need.pdf")));

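Instead of creating the usual chat language model, you first create an instance of an "embedding" model. This is a particular model and endpoint whose role is to create vector representations of pieces of text (words, sentences, or even paragraphs).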

VertexAiEmbeddingModel embeddingModel = VertexAiEmbeddingModel.builder()
    .endpoint("us-central1-aiplatform.googleapis.com:443")
    .project("YOUR_PROJECT_ID")
    .location("us-central1")
    .publisher("google")
    .modelName("textembedding-gecko@001")
    .maxRetries(3)
    .build();

Next, you need a few classes that collaborate to:

load and split the PDF document into chunks,
create vector embeddings for all of these chunks.

InMemoryEmbeddingStore<TextSegment> embeddingStore = 
    new InMemoryEmbeddingStore<>();

EmbeddingStoreIngestor storeIngestor = EmbeddingStoreIngestor.builder()
    .documentSplitter(DocumentSplitters.recursive(500, 100))
    .embeddingModel(embeddingModel)
    .embeddingStore(embeddingStore)
    .build();
storeIngestor.ingest(document);

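An instance of InMemoryEmbeddingStore, an in-memory vector database, is created to store the vector embeddings.

The document is split into chunks thanks to the DocumentSplitters class. It splits the text of the PDF file into snippets of at most 500 characters, with an overlap of 100 characters between a chunk and the following one (to avoid cutting words or sentences into bits and pieces).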
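
To see what "500 characters with an overlap of 100" means mechanically, here is a simplified fixed-size splitter. ChunkingSketch is a hypothetical class; it cuts purely by character count, whereas the recursive splitter used in the codelab presumably tries to respect paragraph and sentence boundaries, so treat this only as an illustration of the overlap.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified character-based splitter with overlap; not the LangChain4J recursive splitter.
class ChunkingSketch {
    // Splits text into chunks of at most maxChars characters, where each chunk
    // starts (maxChars - overlap) characters after the previous one.
    static List<String> split(String text, int maxChars, int overlap) {
        if (overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("overlap must be in [0, maxChars)");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tiny sizes so the overlap is visible: each chunk repeats 2 characters of the previous one.
        System.out.println(split("abcdefghij", 4, 2));
    }
}
```

Running it prints [abcd, cdef, efgh, ghij]: every chunk shares its first two characters with the end of the previous chunk, so a phrase cut at a boundary still appears whole in one of them.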

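The store "ingestor" links the document splitter, the embedding model that calculates the vectors, and the in-memory vector database. The ingest() method then takes care of the ingestion.

The first phase is now over: the document has been transformed into text chunks with their associated vector embeddings, and stored in the vector database.

Asking questions

It's time to get ready to ask questions! The usual chat model can be created to start the conversation: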

VertexAiChatModel model = VertexAiChatModel.builder()
    .endpoint("us-central1-aiplatform.googleapis.com:443")
    .project("YOUR_PROJECT_ID")
    .location("us-central1")
    .publisher("google")
    .modelName("chat-bison@001")
    .maxOutputTokens(1000)
    .build();

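You also need a "retriever" class that links the vector database (in the embeddingStore variable) and the embedding model. Its job is to query the vector database by computing a vector embedding for the user's query, in order to find similar vectors in the database: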

EmbeddingStoreRetriever retriever = 
    EmbeddingStoreRetriever.from(embeddingStore, embeddingModel);

At this point, you can instantiate the ConversationalRetrievalChain class (this is just a different name for the Retrieval Augmented Generation pattern):

ConversationalRetrievalChain rag = ConversationalRetrievalChain.builder()
    .chatLanguageModel(model)
    .retriever(retriever)
    .promptTemplate(PromptTemplate.from("""
        Answer to the following query the best as you can: {{question}}
        Base your answer on the information provided below:
        {{information}}
        """
    ))
    .build();

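This "chain" binds together:

the chat language model that you configured earlier,
the retriever, which compares a vector embedding of the query to the vectors in the database,
a prompt template, which explicitly tells the chat model to base its answer on the provided information (that is, the relevant excerpts of the documentation whose vector embedding is similar to the vector of the user's question).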
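
The role of the prompt template can be shown with a small substitution sketch. TemplateSketch and its render() method are hypothetical and are not the LangChain4J PromptTemplate API; the sketch only illustrates how the {{question}} and {{information}} placeholders end up replaced by the user's query and the retrieved snippets.

```java
import java.util.Map;

// Minimal placeholder substitution; an illustration, not the LangChain4J PromptTemplate.
class TemplateSketch {
    // Replaces every {{name}} placeholder with the matching value from the map.
    // Caveat of this naive version: a value that itself contains a placeholder
    // may be substituted again by a later replacement.
    static String render(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> variable : variables.entrySet()) {
            result = result.replace("{{" + variable.getKey() + "}}", variable.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Answer the following query: {{question}}\n"
            + "Base your answer on the information provided below:\n"
            + "{{information}}";
        System.out.println(render(template, Map.of(
            "question", "What is attention in large language models?",
            "information", "(text of the most similar chunks goes here)")));
    }
}
```

The final prompt sent to the chat model therefore contains both the question and the retrieved excerpts, which is what lets the model ground its answer.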

And now you're finally ready to ask your questions!

String result = rag.execute("What neural network architecture can be used for language models?");
System.out.println(result);
System.out.println("------------");

result = rag.execute("What are the different components of a transformer neural network?");
System.out.println(result);
System.out.println("------------");

result = rag.execute("What is attention in large language models?");
System.out.println(result);
System.out.println("------------");

result = rag.execute("What is the name of the process that transforms text into vectors?");
System.out.println(result);

Run the program with:

$ ./gradlew run -DjavaMainClass=palm.workshop.ChatPrompts

In the output, you should see the answers to your questions:

The Transformer is a neural network architecture that can be used for 
language models. It is based solely on attention mechanisms, dispensing 
with recurrence and convolutions. The Transformer has been shown to 
outperform recurrent neural networks and convolutional neural networks on 
a variety of language modeling tasks.
------------
The Transformer is a neural network architecture that can be used for 
language models. It is based solely on attention mechanisms, dispensing 
with recurrence and convolutions. The Transformer has been shown to 
outperform recurrent neural networks and convolutional neural networks on a 
variety of language modeling tasks. The Transformer consists of an encoder 
and a decoder. The encoder is responsible for encoding the input sequence 
into a fixed-length vector representation. The decoder is responsible for 
decoding the output sequence from the input sequence. The decoder uses the 
attention mechanism to attend to different parts of the input sequence when 
generating the output sequence.
------------
Attention is a mechanism that allows a neural network to focus on specific 
parts of an input sequence. In the context of large language models, 
attention is used to allow the model to focus on specific words or phrases 
in a sentence when generating output. This allows the model to generate 
more relevant and informative output.
------------
The process of transforming text into vectors is called word embedding. 
Word embedding is a technique that represents words as vectors in a 
high-dimensional space. The vectors are typically learned from a large 
corpus of text, and they capture the semantic and syntactic relationships 
between words. Word embedding has been shown to be effective for a variety 
of natural language processing tasks, such as machine translation, question 
answering, and sentiment analysis.

The complete solution

To facilitate copying and pasting, here's the whole content of the ChatPrompts class:

package palm.workshop;

import dev.langchain4j.chain.ConversationalRetrievalChain;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.parser.PdfDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment; 
import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.model.vertexai.VertexAiChatModel;
import dev.langchain4j.model.vertexai.VertexAiEmbeddingModel;
import dev.langchain4j.retriever.EmbeddingStoreRetriever;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ChatPrompts {
    public static void main(String[] args) throws IOException {
        PdfDocumentParser pdfParser = new PdfDocumentParser();
        Document document = pdfParser.parse(new FileInputStream(new File("/ABSOLUTE_PATH/attention-is-all-you-need.pdf")));

        VertexAiEmbeddingModel embeddingModel = VertexAiEmbeddingModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("YOUR_PROJECT_ID")
            .location("us-central1")
            .publisher("google")
            .modelName("textembedding-gecko@001")
            .maxRetries(3)
            .build();

        InMemoryEmbeddingStore<TextSegment> embeddingStore = 
            new InMemoryEmbeddingStore<>();

        EmbeddingStoreIngestor storeIngestor = EmbeddingStoreIngestor.builder()
            .documentSplitter(DocumentSplitters.recursive(500, 100))
            .embeddingModel(embeddingModel)
            .embeddingStore(embeddingStore)
            .build();
        storeIngestor.ingest(document);

        EmbeddingStoreRetriever retriever = EmbeddingStoreRetriever.from(embeddingStore, embeddingModel);

        VertexAiChatModel model = VertexAiChatModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("YOUR_PROJECT_ID")
            .location("us-central1")
            .publisher("google")
            .modelName("chat-bison@001")
            .maxOutputTokens(1000)
            .build();

        ConversationalRetrievalChain rag = ConversationalRetrievalChain.builder()
            .chatLanguageModel(model)
            .retriever(retriever)
            .promptTemplate(PromptTemplate.from("""
                Answer to the following query the best as you can: {{question}}
                Base your answer on the information provided below:
                {{information}}
                """
            ))
            .build();

        String result = rag.execute("What neural network architecture can be used for language models?");
        System.out.println(result);
        System.out.println("------------");

        result = rag.execute("What are the different components of a transformer neural network?");
        System.out.println(result);
        System.out.println("------------");

        result = rag.execute("What is attention in large language models?");
        System.out.println(result);
        System.out.println("------------");

        result = rag.execute("What is the name of the process that transforms text into vectors?");
        System.out.println(result);
    }
}

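8. Congratulations

Congratulations, you've successfully built your first Generative AI chat application in Java using LangChain4J and the PaLM API! Along the way, you discovered that large language chat models are quite powerful and capable of handling various tasks, such as question answering (even on your own documentation), data extraction, and, to some extent, even playing chess!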
