Agent Chains

LangChain

  • Framework for developing applications powered by language models: it connects a language model to other sources of data and lets the model interact with its environment
  • Implements the paper ReAct: Synergizing Reasoning and Acting in Language Models, which demonstrates a prompting technique that lets the model “reason” (with a chain of thought) and “act” (by using a tool from a predefined set of tools, such as searching the internet). This combination is shown to drastically improve output text quality and give large language models the ability to correctly solve problems.
  • Components
    • Document loader: Facilitates loading data from various sources, including CSV files, SQL databases, and public datasets like Wikipedia.
    • Agent: Uses the language model as a reasoning engine to determine which actions to take and in which order. It iterates through a thought-action-observation cycle until the task is completed.
    • Chain: Unlike agents, chains are predetermined, hard-coded sequences of actions. They address complex but well-defined tasks by guiding multiple tools with high-level directions.
    • Memory: Currently in beta; supports accessing windows of past messages, which gives the application a conversational interface
  • In general, chains are what you get by sequentially connecting one or more large language models (LLMs) in a logical way.
  • Chains can be built using LLMs or “Agents”.
    • Agents provide the ability to answer questions that require recent or specialty information the LLM hasn’t been trained on.
      • e.g. “What will the weather be like tomorrow?”
      • An agent has access to an LLM and a suite of tools, for example Google Search, a Python REPL, a math calculator, weather APIs, etc. (list of supported agents)
      • Through the agent, the LLM uses tools to interact with its environment (list of tools)
        • Tool process
          • Uses an LLMChain to build the API URL from our input instructions, then makes the API call.
          • Upon receiving the response, it uses another LLMChain that summarizes the response to answer our original question. (A sketch of this pattern follows this list.)
      • ReAct (Reason + Act) is a popular agent that picks the most suitable tool (from a list of tools) based on the input query.
        • At each step the output is an observation, a thought, or an action. This is mainly due to the ReAct framework and the associated prompt that the agent is using (See example with ZERO_SHOT_REACT_DESCRIPTION)
      • serpapi is useful for answering questions about current events.
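      • A minimal sketch of the two-LLMChain tool pattern above, using LangChain’s APIChain and the Open-Meteo API docs that ship with the library (module paths assume the classic 0.0.x API):

from langchain.llms import OpenAI
from langchain.chains import APIChain
from langchain.chains.api import open_meteo_docs

llm = OpenAI(temperature=0)
# One internal LLMChain builds the request URL from the API docs;
# a second LLMChain summarizes the raw response into an answer
weather_chain = APIChain.from_llm_and_api_docs(llm, open_meteo_docs.OPEN_METEO_DOCS, verbose=True)
weather_chain.run("What is the current temperature in Berlin?")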
  • Chains can be simple (i.e. Generic) or specialized (i.e. Utility).
    • Generic — A single LLM is the simplest chain. It takes an input prompt and the name of the LLM and then uses the LLM for text generation (i.e. output for the prompt).
      • Generic chains are more often used as building blocks for Utility chains
    • Utility — These are specialized chains, composed of multiple LLM calls, that help solve a specific task. For example,
      • LangChain supports some end-to-end chains (such as AnalyzeDocumentChain for summarization, QnA, etc.) and some specific ones (such as GraphQAChain for creating, querying, and saving graphs). The Program-Aided Language Model (PAL) chain reads complex math problems (described in natural language) and generates programs (for solving the math problem) as the intermediate reasoning steps, but offloads the solution step to a runtime such as a Python interpreter.
      • 2-Chain Examples
        • Chain 1 is used to clean the prompt (remove extra whitespaces, shorten prompt, etc) and chain 2 is used to call an LLM with this clean prompt. (link)
        • Chain 1 is used to generate a synopsis for a play and chain 2 is used to write a review based on this synopsis. (link; a sketch follows below)
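        • A sketch of the second example with SimpleSequentialChain (prompt wording is illustrative):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = OpenAI(temperature=0.7)
synopsis_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["title"],
    template="Write a short synopsis for a play titled {title}."))
review_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["synopsis"],
    template="Write a review of a play given this synopsis:\n{synopsis}"))
# Chain 1's synopsis is piped in as chain 2's input
overall_chain = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)
overall_chain.run("Tragedy at Sunset on the Beach")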
  • Document Loaders (docs): various helper functions that take various formats and types of data and produce a document output
    • Formats like markdown, Word docs, text, PowerPoint, images, HTML, PDF, CSVs, AsciiDoc (adoc), etc.
    • Examples
      • GitLoader clones a repository and loads relevant files as documents
      • YoutubeLoader - gets subtitles from videos
      • DataFrameLoader - converts text columns in pandas DataFrames to documents (see the sketch after this list)
    • There are also loaders for Google Drive, databases like BigQuery and DuckDB, cloud storage like S3, and sources like Confluence, email, Discord, etc.
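    • A minimal DataFrameLoader sketch (the DataFrame and its columns are made up for illustration):

import pandas as pd
from langchain.document_loaders import DataFrameLoader

df = pd.DataFrame({"text": ["First article ...", "Second article ..."],
                   "author": ["alice", "bob"]})
loader = DataFrameLoader(df, page_content_column="text")  # other columns become metadata
docs = loader.load()
print(docs[0].metadata)  # {'author': 'alice'}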
  • Text Splitters (Docs) - After loading the documents, they’re usually fed to a text splitter to create chunks of text due to LLM context constraints. The chunks of text can then be transformed into embeddings.
    • From “Sales and Support Chatbot” article in example below
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import GitLoader

# Define text chunk strategy
splitter = CharacterTextSplitter(
    chunk_size=2000,
    chunk_overlap=50,
    separator=" "
)
# GDS guides
gds_loader = GitLoader(
    clone_url="https://github.com/neo4j/graph-data-science",
    repo_path="./repos/gds/",
    branch="master",
    file_filter=lambda file_path: file_path.endswith(".adoc")
    and "pages" in file_path,
)
gds_data = gds_loader.load()
# Split documents into chunks
gds_data_split = splitter.split_documents(gds_data)
print(len(gds_data_split)) #771
  • Embeddings
    • OpenAI’s text-embedding-ada-002 model is easy to work with, achieves the highest performance out of all of OpenAI’s embedding models (on the BEIR benchmark), and is also the cheapest ($0.0004/1K tokens).
    • HuggingFace’s sentence-transformers models reportedly perform better than OpenAI’s embeddings, but using them involves downloading the model and running it on your own server (see the sketch below).
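    • A minimal sketch of both options (the sentence-transformers model name is just an example):

from langchain.embeddings import OpenAIEmbeddings, HuggingFaceEmbeddings

# Hosted: defaults to OpenAI's text-embedding-ada-002
openai_emb = OpenAIEmbeddings()
# Local: downloads the model and runs it on your own hardware
hf_emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

vector = openai_emb.embed_query("What is a graph database?")  # list of floats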
  • Example: Generic
    • Build Prompt
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
print(prompt.format(product="podcast player"))
# OUTPUT
# What is a good name for a company that makes podcast player?
  • If using multiple variables, then you need, e.g., print(prompt.format(product="podcast player", audience="children")), to get the updated prompt (two-variable sketch below).
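  • A hypothetical two-variable version of the prompt above:

multi_prompt = PromptTemplate(
    input_variables=["product", "audience"],
    template="What is a good name for a company that makes {product} for {audience}?",
)
print(multi_prompt.format(product="podcast player", audience="children"))
# OUTPUT
# What is a good name for a company that makes podcast player for children?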

  • Create LLMChain instance and run

from langchain.llms import OpenAI
from langchain.chains import LLMChain
llm = OpenAI(
    model_name="text-davinci-003",  # default model
    temperature=0.9)  # temperature dictates how wacky the output should be
llmchain = LLMChain(llm=llm, prompt=prompt)
llmchain.run("podcast player")
  • If you have more than one input variable, you won’t be able to use run. Instead, you’ll have to pass all the variables as a dict.

    • e.g., llmchain({"product": "podcast player", "audience": "children"})
  • Using the less expensive chat models: chatopenai = ChatOpenAI(model_name="gpt-3.5-turbo")
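  • LLMChain accepts chat models as well; a minimal sketch reusing the prompt from above:

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

chatopenai = ChatOpenAI(model_name="gpt-3.5-turbo")
chat_chain = LLMChain(llm=chatopenai, prompt=prompt)
chat_chain.run("podcast player")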

  • Example: Multiple Chains and Multiple Input Variables

    • Goal: create an age-appropriate gift generator
    • Chain 1: Find age
# Chain1 - solve math problem, get the age
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.agents import load_tools

llm = OpenAI(temperature=0)
tools = load_tools(["pal-math"], llm=llm)
agent = initialize_agent(tools,
                        llm,
                        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                        verbose=True)
  • pal-math is a math-solving tool

  • The ReAct agent uses the tool to answer the age question

  • Chain 2: Recommend a gift

template = """You are a gift recommender. Given a person's age,\n
it is your job to suggest an appropriate gift for them. If age is under 10,\n
the gift should cost no more than [{budget}]{style='color: #990000'} otherwise it should cost atleast 10 times [{budget}]{style='color: #990000'}.
Person Age:
{output}
Suggest gift:"""
prompt_template = PromptTemplate(input_variables=["output", "budget"], template=template)
chain_two = LLMChain(llm=llm, prompt=prompt_template)
  • “{output}” is the name of the output from the 1st chain

    • Find the name of the output of a chain: print(agent.agent.llm_chain.output_keys)
  • The prompt includes a conditional on age that scales {budget} (more below)

  • LLMChain is used when there are multiple variables in the template

  • Combine Chains and Run

from langchain.chains import SequentialChain
from langchain.memory import SimpleMemory

overall_chain = SequentialChain(
    input_variables=["input"],
    memory=SimpleMemory(memories={"budget": "100 GBP"}),
    chains=[agent, chain_two],
    verbose=True)
overall_chain.run("If my age is half of my dad's age and he is going to be 60 next year, what is my current age?")
  • The prompt is only for the 1st chain, and its output, Age, will be the input for the second chain.

  • SimpleMemory is used to pass the variable for the second prompt, which adds some additional context to the second chain: the {budget} for the gift.

  • Output

#> Entering new SequentialChain chain...
#> Entering new AgentExecutor chain...
# I need to figure out my dad's current age and then divide it by two.
#Action: PAL-MATH
#Action Input: What is my dad's current age if he is going to be 60 next year?
#Observation: 59
#Thought: I now know my dad's current age, so I can divide it by two to get my age.
#Action: Divide 59 by 2
#Action Input: 59/2
#Observation: Divide 59 by 2 is not a valid tool, try another one.
#Thought: I can use PAL-MATH to divide 59 by 2.
#Action: PAL-MATH
#Action Input: Divide 59 by 2
#Observation: 29.5
#Thought: I now know the final answer.
#Final Answer: My current age is 29.5 years old.
#> Finished chain.
# For someone of your age, a good gift would be something that is both practical and meaningful. Consider something like a nice watch, a piece of jewelry, a nice leather bag, or a gift card to a favorite store or restaurant.\nIf you have a larger budget, you could consider something like a weekend getaway, a spa package, or a special experience.
#> Finished chain.
  • Example: Sales and Support Chatbot (article)
    • Create embeddings from text data sources and store in Chroma vector store
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Define embedding model
OPENAI_API_KEY = "OPENAI_API_KEY"
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
sales_data = medium_data_split + yt_data_split
sales_store = Chroma.from_documents(
    sales_data, embeddings, collection_name="sales"
)
support_data = kb_data + gds_data_split + so_data
support_store = Chroma.from_documents(
    support_data, embeddings, collection_name="support"
)
  • Sales data is from Medium articles and YouTube subtitles

  • Support data is from docs in a couple of GitHub repos and Stack Overflow

  • Instantiate the chat model

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0,
    openai_api_key=OPENAI_API_KEY,
    max_tokens=512,
)
  • Sales prompt template
sales_template = """As a Neo4j marketing bot, your goal is to provide accurate 
and helpful information about Neo4j, a powerful graph database used for 
building various applications. You should answer user inquiries based on the 
context provided and avoid making up answers. If you don't know the answer, 
simply state that you don't know. Remember to provide relevant information 
about Neo4j's features, benefits, and use cases to assist the user in 
understanding its value for application development.
{context}
Question: {question}"""
SALES_PROMPT = PromptTemplate(
    template=sales_template, input_variables=["context", "question"]
)
from langchain.chains import RetrievalQA

sales_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=sales_store.as_retriever(),
    chain_type_kwargs={"prompt": SALES_PROMPT},
)
  • “{context}” in the template is filled with documents retrieved from the vector store

  • The “retriever” arg points to a vector store (e.g. sales_store); the as_retriever method returns a retriever that embeds the query and fetches the most similar documents
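  • The retriever can be tuned, e.g. how many chunks to return (the query string is illustrative):

retriever = sales_store.as_retriever(search_kwargs={"k": 4})  # top 4 chunks
docs = retriever.get_relevant_documents("What is Neo4j used for?")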

  • Support prompt template

support_template = """
As a Neo4j Customer Support bot, you are here to assist with any issues 
a user might be facing with their graph database implementation and Cypher statements.
Please provide as much detail as possible about the problem, how to solve it, and steps a user should take to fix it.
If the provided context doesn't provide enough information, you are allowed to use your knowledge and experience to offer the best possible assistance.
{context}
Question: {question}"""
SUPPORT_PROMPT = PromptTemplate(
    template=support_template, input_variables=["context", "question"]
)
support_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=support_store.as_retriever(),
    chain_type_kwargs={"prompt": SUPPORT_PROMPT},
)
  • See Sales template

  • Add a ReAct agent to determine whether the LLM should use the Sales or Support template and context

from langchain.agents import Tool

tools = [
    Tool(
        name="sales",
        func=sales_qa.run,
        description="""useful for when a user is interested in various Neo4j information, 
                      use-cases, or applications. A user is not asking for any debugging, but is only
                      interested in general advice for integrating and using Neo4j.
                      Input should be a fully formed question.""",
    ),
    Tool(
        name="support",
        func=support_qa.run,
        description="""useful for when when a user asks to optimize or debug a Cypher statement or needs
                      specific instructions how to accomplish a specified task. 
                      Input should be a fully formed question.""",
    ),
]

agent = initialize_agent(
    tools, 
    llm, 
    agent="zero-shot-react-description", 
    verbose=True
)
agent.run("""What are some GPT-4 applications with Neo4j?""")
  • In this example, the tools used by the agent are RetrievalQA chains built on custom data sources and prompt templates
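  • A support-style question should get routed to the other tool (query text is illustrative):

agent.run("""How can I optimize this Cypher query: MATCH (n:Person)-[:KNOWS]->(m) RETURN m""")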