In this lesson, we will recap LangChain components, review its essential concepts, and discuss how to use the newly introduced LangSmith platform. Additionally, we will create a basic Large Language Model (LLM) application to better understand these capabilities.
LangChain Recap
LangChain is a specialized framework designed for building LLM-powered applications. It streamlines the development of intelligent, responsive applications by providing libraries for handling chains and agents with integrated components. It also offers Templates for deployable, task-specific architectures and LangSmith for debugging and testing. The key features of LangChain, like Models, Vector Stores, Chains, etc., were explained in detail in the previous lesson.
LangChain Hub
LangChain Hub is a centralized repository for community-sourced prompts tailored to various use cases like classification or summarization. The Hub supports both public contributions and private organizational use, fostering a collaborative development environment. The platform's version control system enables users to track prompt modifications and maintain consistency across applications.
The Hub offers features like Prompt Exploration, which is useful when starting fresh interactions with language models or searching for prompts that achieve a particular objective; it simplifies the process of finding and utilizing effective prompts for various models. Additionally, users can share, modify, and track prompt versions with Prompt Versioning. It allows for the easy management of different versions of prompts, a highly relevant feature in real-world projects where reverting to earlier versions may be necessary.
The user-friendly interface of the Hub allows for prompt testing, customization, and iteration in a playground environment.
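For instance, pulling a community prompt takes a single call. The following is a minimal sketch, assuming the langchainhub package is installed (rlm/rag-prompt is a public prompt handle also used later in this lesson):

from langchain import hub

# Pull a community-maintained prompt by its "<owner>/<repo>" handle
prompt = hub.pull("rlm/rag-prompt")
print(prompt)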
LangSmith
LangSmith provides an environment for evaluating and monitoring the quality of LLM outputs. An integral part of its functionality is monitoring metadata such as token usage and execution time, which is crucial for resource management.
This platform facilitates the refinement of new chains and tools, potentially enhancing their efficiency and performance. Users can create diverse testing environments tailored to specific needs, enabling thorough evaluation under various conditions. Additionally, the service provides visualization tools that can aid in identifying response patterns and trends, thereby supporting a deeper understanding and assessment of performance.
Lastly, the platform supports tracing the runs associated with an active instance, as well as testing and evaluating any prompts or answers generated.
LangSmith is designed with user-friendliness in mind. The platform offers a range of tutorials and documentation to help you get started.
The setup for LangChain requires installing the necessary libraries and configuring the required environment variables, which we will cover in the following section. For certain functionalities like tracing, you need to have a LangSmith account. Please follow the steps outlined below to set up a new account.
- Head over to the LangSmith website and sign up for an account. You can use various supported login methods.
- Once your account is set up, go to the settings page. Here, you'll find the option to create an API key.
- Click the 'Generate API Key' button to receive your API key.
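Once the key is created, you can sanity-check it with the langsmith Python client. This is a minimal sketch, assuming the langsmith package is installed and the key is exported as LANGCHAIN_API_KEY:

import os
from langsmith import Client

os.environ["LANGCHAIN_API_KEY"] = "<YOUR_LANGSMITH_API_KEY>"

client = Client()  # reads LANGCHAIN_API_KEY from the environment
for project in client.list_projects():
    print(project.name)  # listing your projects confirms the key works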
Versioning
You can commit a prompt after implementing and debugging your chain. Add this prompt under your handle's namespace to view it in the Hub.
from langchain import hub
from langchain.prompts.chat import ChatPromptTemplate

handle = "<YOUR_HANDLE>"  # your LangChain Hub handle

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
hub.push(f"{handle}/rag-prompt", prompt)
During evaluation, if you come up with a better idea after trying the prompt, you can push the updated prompt to the same key to "commit" a new version of the prompt. For instance, let's add a system message to the prompt.
# You may try making other changes and saving them in a new commit.
from langchain.schema import SystemMessage

prompt.messages.insert(
    0,
    SystemMessage(
        content="You are a precise, autoregressive question-answering system."
    ),
)
With the saved changes, we can analyze how the change affects model performance. The newest version of the prompt is saved as the latest version.
# Pushing to the same prompt "repo" will create a new commit
hub.push(f"{handle}/rag-prompt", prompt)
Tracing
LangSmith simplifies logging runs for your LLM applications, allowing users to review the inputs and outputs of each element in the chain. This feature is useful when debugging your application or understanding the behavior of specific components. The following section will explore the optional environment variables that enable the tracing feature. For more information, you can visit the documentation.
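Tracing can also be scoped to a specific block of code instead of the whole process, using the tracing_v2_enabled context manager. A minimal sketch, assuming a chain such as the qa_chain built later in this lesson:

from langchain.callbacks import tracing_v2_enabled

# Only calls made inside this block are traced to the given project
with tracing_v2_enabled(project_name="langsmith-intro"):
    qa_chain({"query": "What are the approaches to Task Decomposition?"})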
Serving (LangServe)
LangServe helps developers deploy LangChain-powered applications and chains as a REST API. It is integrated with FastAPI, which makes creating API endpoints easy and accessible. Applications can be deployed quickly using the langserve package. The deployment process is out of the scope of this course; however, you can learn more about it from the GitHub repository.
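To give a flavor of what a deployment looks like, here is a minimal sketch of exposing a chain as a REST endpoint with langserve, assuming the langserve and fastapi packages are installed and a chain such as the qa_chain built later in this lesson (the /rag path and server.py filename are arbitrary choices):

from fastapi import FastAPI
from langserve import add_routes

app = FastAPI(title="RAG Server")

# Registers /rag/invoke, /rag/batch, and /rag/stream endpoints for the chain
add_routes(app, qa_chain, path="/rag")

# Run with: uvicorn server:app --port 8000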
Question-Answering Chain & LangChain Hub
The next steps are loading data from a webpage, splitting it into smaller chunks, transforming them into embeddings, storing them on the Deep Lake vector store, and utilizing the prompt templates from the LangSmith Hub.
Before exploring the code, it is necessary to install the essential libraries using the Python package manager (pip).
pip install -q langchain==0.0.346 openai==1.3.7 tiktoken==0.5.2 cohere==4.37 deeplake==3.8.11 langchainhub==0.1.14
The next step is to set the API keys as environment variables: the OpenAI key, utilized in the embedding generation process, and the Activeloop token, required for storing data in the cloud.
import os
os.environ["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
os.environ["ACTIVELOOP_TOKEN"] = "<YOUR_ACTIVELOOP_API_KEY>"
You can optionally use the following environment variables to keep track of the runs in the LangSmith dashboard under the projects section.
os.environ["LANGCHAIN_TRACING_V2"]=True
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"]="<YOUR_LANGSMITH_API_KEY>"
os.environ["LANGCHAIN_PROJECT"]="langsmith-intro" # if not specified, defaults to "default"
Now, we can read the content of a webpage using the WebBaseLoader class. It returns a single instance of the Document class containing all the textual information from the specified address. Subsequently, the lengthy text is divided into smaller segments of 500 characters each, with no overlap, resulting in 130 chunks.
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Loading
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()
print(len(data))
# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
print(len(all_splits))
1
130
Chunks can be saved to the Deep Lake vector store through LangChain integration. The DeepLake class handles converting texts into embeddings via OpenAI's API and then stores these results in the cloud. The dataset can be loaded from the GenAI360 course organization, or you can use your organization name (which defaults to your username) to create the dataset. Note that this task incurs the associated costs of using OpenAI endpoints.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DeepLake
vectorstore = DeepLake.from_documents(
    all_splits,
    dataset_path="hub://genai360/langsmith_intro",
    embedding=OpenAIEmbeddings(),
    overwrite=False,
)
Your Deep Lake dataset has been successfully created!
Creating 130 embeddings in 1 batches of size 130:: 100%|██████████| 1/1 [00:05<00:00, 5.81s/it] dataset (path='hub://genai360/langsmith_intro', tensors=['text', 'metadata', 'embedding', 'id'])
tensor htype shape dtype compression
------- ------- ------- ------- -------
text text (130, 1) str None
metadata json (130, 1) str None
embedding embedding (130, 1536) float32 None
id text (130, 1) str None
Once the data is processed, we can retrieve a prompt from the LangChain Hub, which provides a ChatPromptTemplate instance. This eliminates the need for designing a prompt through trial and error, allowing us to build upon already tested implementations. The following code pins a specific prompt version (via its commit hash) so future changes will not impact the active deployment.
from langchain import hub
prompt = hub.pull("rlm/rag-prompt:50442af1")
print(prompt)
ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))])
Finally, we can employ the RetrievalQA chain to fetch related documents from the database and use the ChatOpenAI model to generate the final response from those documents.
# LLM
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)
question = "What are the approaches to Task Decomposition?"
result = qa_chain({"query": question})
result["result"]
The approaches to task decomposition include using LLM with simple prompting, task-specific instructions, and human inputs.
Prompt versioning supports ongoing experimentation and collaboration, effectively preventing the accidental deployment of chain components that haven't been sufficiently validated.
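In practice, this means a deployment can pin a specific commit while experimentation continues on the latest version:

from langchain import hub

# The latest version moves forward as new commits are pushed
latest_prompt = hub.pull("rlm/rag-prompt")

# A commit-pinned pull stays fixed, which is safer for production
pinned_prompt = hub.pull("rlm/rag-prompt:50442af1")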
Conclusion
In this lesson, we discussed how to use the LangChain Hub to store and share prompts for a retrieval QA chain. The Hub is a centralized location to manage, version, and share prompts. LangSmith excels in diagnosing errors, comparing prompt effectiveness, assessing output quality, and tracking key metadata like token usage and execution time for optimizing LLM applications.
The platform also provides a detailed analysis of how different prompts affected the LLM performance. The intuitive UI and the valuable insights the platform offers make the iterative process of refining LLMs more transparent and manageable. It's evident that the LangSmith platform, even in its beta phase, has the potential to be a significant tool for developers aiming to leverage the full potential of LLMs.
LangSmith also provides immediate functionality for sifting through your runs and presenting metrics. These metrics are essential for quickly assessing latency and the total token count throughout your application.
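These metrics can also be pulled programmatically with the langsmith client. The following is a minimal sketch; field names such as total_tokens come from the client's Run schema and may vary across versions:

from langsmith import Client

client = Client()
for run in client.list_runs(project_name="langsmith-intro"):
    # end_time may be None for runs still in progress
    latency = (run.end_time - run.start_time) if run.end_time else None
    print(run.name, run.total_tokens, latency)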
>> Notebook.
RESOURCES:
- hub-examples: LangSmith cookbook
- The Art of LangSmith article