- Module 1 - Basic RAG with LangChain and LlamaIndex
- Lesson 1 - LangChain: Basic Concepts Recap
- Lesson 2 - LlamaIndex Introduction: Precision and Simplicity in Information Retrieval
- Module 2 - Advanced RAG
- Fine-tuning vs RAG; Introduction to Activeloop’s Deep Memory
- Mastering Advanced RAG Techniques with LlamaIndex
- Production-Ready RAG Solutions with LlamaIndex
- Iterative Optimization of LlamaIndex RAG Pipeline: A Step-by-Step Approach
- Module 3 - RAG Agents
- LangChain Overview: Agents, Tools, and OpenGPT Introduction
- Utilizing AI Agents with the LlamaIndex Framework for Enhanced Decision-Making
- Crafting AI Assistants via OpenAI and Hugging Face API
- Project: Multimodal Financial Document Analysis and Recall
- Module 4 - RAG Evaluation and Observability
- RAG - Metrics & Evaluation
- LangSmith and LangChain Fundamentals for LLM Applications
Module 1 - Basic RAG with LangChain and LlamaIndex
This module covers the basic concepts of LangChain and LlamaIndex and prepares you to build a basic RAG application with both frameworks. It is a recap of the LangChain concepts covered in our earlier LangChain & Vector Databases in Production course, together with a brief introduction to the LlamaIndex framework, and it summarizes the different strengths and focus of each framework. Since this course is focused on advanced RAG topics, we recommend taking our earlier course and reading examples from the LangChain and LlamaIndex documentation to complement this module.
Lesson 1 - LangChain: Basic Concepts Recap
In this lesson, students will recap the functionalities and components of LangChain, a framework designed for building applications with generative AI and large language models (LLMs). This material was covered in depth, with code and project examples, in our earlier course LangChain & Vector Databases in Production. Students will be introduced to preprocessing techniques like document loading and chunking, the indexing of document segments, and embedding models, as well as the structure and functionality of chains, memory modules, and vector stores. Additionally, students will gain insight into working with chat models, LLMs, and embedding models, and into constructing sequential chains to automate complex interactions.
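The chunking step mentioned above can be sketched in a few lines. This is a toy illustration of fixed-size chunking with overlap, not LangChain's actual splitter API (its splitters, such as `RecursiveCharacterTextSplitter`, are considerably more sophisticated); the parameters here are invented for illustration.

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split `text` into windows of `chunk_size` characters that overlap
    by `overlap` characters, so context is not lost at chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 250-character document split into 100-character windows with 20 overlap
# yields windows starting at 0, 80, 160, and 240.
chunks = chunk_text("a" * 250, chunk_size=100, overlap=20)
print(len(chunks))  # 4
```

Overlap trades a little storage for robustness: a sentence cut at a chunk boundary still appears whole in the neighboring chunk.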
Lesson 2 - LlamaIndex Introduction: Precision and Simplicity in Information Retrieval
In this lesson, students will learn about the LlamaIndex framework, designed to enhance the capabilities of Large Language Models (LLMs) by integrating them with Retrieval-Augmented Generation (RAG) systems. The framework allows LLM-based applications to fetch accurate and relevant information using vector stores, connectors, nodes, and index types for better-informed responses. The lesson covers vector stores and their importance in semantic search, the role of data connectors and LlamaHub in data ingestion, the creation of Node objects from documents, and the indexing of data for quick retrieval. Students will also learn about the practical construction and usage of query engines, routers, and the distinction between saving indexes locally and on the cloud. Finally, it compares LlamaIndex with the LangChain framework, and concludes by discussing the practical application and effectiveness of LlamaIndex in LLM applications.
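The semantic-search idea behind a vector store can be made concrete with a minimal sketch: documents and queries are embedded as vectors, and retrieval returns the documents whose vectors are closest, here by cosine similarity. Real embeddings come from a model and have hundreds of dimensions; the 3-dimensional vectors and document names below are made up for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for three document chunks.
index = {
    "tesla_q3_revenue": [0.9, 0.1, 0.0],
    "battery_chemistry": [0.1, 0.9, 0.2],
    "solar_roof_faq":    [0.0, 0.2, 0.9],
}

def search(query_vec: list[float], top_k: int = 1) -> list[str]:
    """Return the ids of the top_k documents closest to the query vector."""
    ranked = sorted(index, key=lambda doc: cosine(query_vec, index[doc]), reverse=True)
    return ranked[:top_k]

print(search([0.8, 0.2, 0.1]))  # ['tesla_q3_revenue']
```

A production vector store adds persistence, metadata filtering, and approximate nearest-neighbor search, but the ranking principle is the same.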
Module 2 - Advanced RAG
The "Advanced Retrieval Augmented Generation" module offers an in-depth exploration into optimizing large language models (LLMs) with advanced Retrieval-Augmented Generation (RAG) techniques. Across four lessons, it encompasses a range of topics including query transformation, re-ranking, optimization techniques like fine-tuning, the implementation of Activeloop's Deep Memory, and other advanced strategies using LlamaIndex. Students will gain practical experience in enhancing RAG system performance, from query refinement to production deployment and iterative optimization. The module is designed to provide a comprehensive understanding of building, refining, and deploying efficient RAG systems, integrating hands-on examples with theoretical knowledge to prepare students for real-world applications.
Fine-tuning vs RAG; Introduction to Activeloop’s Deep Memory
In this lesson, students will explore various optimization techniques to maximize the performance of large language models (LLMs), such as prompt engineering, fine-tuning, and retrieval-augmented generation (RAG). The lesson begins with identifying the benefits and challenges of each method. It further examines the limitations of RAG systems and introduces Activeloop's Deep Memory as a solution to these challenges, particularly in improving retrieval precision for user queries. Students will see a step-by-step guide on how to implement Deep Memory in experimental workflows, including creating a synthetic training dataset and running inference with the trained Deep Memory model. A significant portion of the lesson is dedicated to hands-on examples using code to demonstrate the increased recall rates when Deep Memory is applied in a RAG system. The lesson concludes with a comparison of empirical data, highlighting the advantages of Deep Memory over traditional methods and emphasizing its role in advancing the capabilities of LLMs.
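The recall rates compared in this lesson are typically measured as recall@k: for each query with a known set of relevant documents, it is the fraction of those documents that appear in the top-k retrieved results, averaged over queries. A minimal sketch, with invented document ids:

```python
def recall_at_k(retrieved: list[list[str]], relevant: list[set[str]], k: int) -> float:
    """Average, over queries, of the fraction of relevant docs found in the top-k."""
    scores = []
    for ret, rel in zip(retrieved, relevant):
        hits = len(set(ret[:k]) & rel)
        scores.append(hits / len(rel))
    return sum(scores) / len(scores)

retrieved = [["d1", "d7", "d3"], ["d9", "d2", "d5"]]  # ranked results per query
relevant  = [{"d1", "d3"},       {"d2"}]              # ground-truth labels

print(recall_at_k(retrieved, relevant, k=2))  # (1/2 + 1/1) / 2 = 0.75
```

Comparing this number with and without a trained retriever such as Deep Memory is how the empirical gains discussed in the lesson are quantified.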
Mastering Advanced RAG Techniques with LlamaIndex
In this lesson, students will learn about the advanced techniques and strategies that enhance the performance of Retrieval-Augmented Generation (RAG) systems, using LlamaIndex as a framework. They will explore the concepts of query construction, query expansion, and query transformation to refine the information retrieval process. Students will also be introduced to advanced strategies like reranking with Cohere Reranker, recursive retrieval, and small-to-big retrieval to further improve the quality and relevance of search results. The lesson includes hands-on examples of setting up a query engine from indexing to querying, as well as creating custom retrievers and utilizing reranking. The conclusion underlines the importance of these techniques and strategies in developing more efficient RAG-based applications.
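The reranking strategy described above is a two-stage pipeline: a cheap first-pass retriever over-fetches candidates, then a more expensive scorer reorders them and keeps the best few. In practice the second stage would be a cross-encoder such as Cohere Rerank; both scoring functions below are simple stand-ins, written only to make the two-stage shape concrete.

```python
def first_pass(query: str, corpus: dict[str, str], k: int) -> list[str]:
    """Cheap lexical retrieval: score by how many query words a chunk contains."""
    words = query.lower().split()
    score = lambda text: sum(w in text.lower() for w in words)
    return sorted(corpus, key=lambda d: score(corpus[d]), reverse=True)[:k]

def rerank(query: str, candidates: list[str], corpus: dict[str, str], top_n: int) -> list[str]:
    """Stand-in for a cross-encoder: reward chunks where query words appear early."""
    def score(text: str) -> float:
        text = text.lower()
        positions = [text.find(w) for w in query.lower().split() if w in text]
        return -min(positions) if positions else float("-inf")
    return sorted(candidates, key=lambda d: score(corpus[d]), reverse=True)[:top_n]

corpus = {
    "a": "quarterly revenue grew; margins discussed later",
    "b": "later sections mention revenue once",
    "c": "weather report with no financial terms",
}
hits = first_pass("revenue", corpus, k=2)   # over-fetch candidates
print(rerank("revenue", hits, corpus, 1))   # ['a']
```

The point of the split is cost: the expensive scorer only ever sees the k candidates, not the whole corpus.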
Production-Ready RAG Solutions with LlamaIndex
In this lesson, students will learn about the challenges, optimization strategies, and best practices for Retrieval-Augmented Generation (RAG) systems in production. The discussion includes dealing with dynamic data management, diverse representation in latent space, regulatory compliance, and model selection for system efficiency. The lesson emphasizes the importance of fine-tuning both the embedding models and the Large Language Models (LLMs) to improve retrieval metrics and generate more accurate responses. Additionally, students will explore the role of Intel® technologies in optimizing neural network models on CPUs, and they will learn how to use generative feedback loops, hybrid search, and continuous evaluation of RAG system performance. Practical use cases, data management tools, and the integration of metadata in retrieval steps are also highlighted, with LlamaIndex presented as a comprehensive framework for building data-driven LLM applications.
Iterative Optimization of LlamaIndex RAG Pipeline: A Step-by-Step Approach
In this lesson, you will learn how to iteratively optimize a LlamaIndex Retrieval-Augmented Generation (RAG) pipeline to improve its performance in retrieving information and generating relevant answers. The lesson guides you through establishing a baseline pipeline, experimenting with retrieval values and embedding models like "text-embedding-ada-002" and "cohere/embed-english-v3.0," and incorporating techniques like reranking and Deep Memory to refine document selection. Additionally, you will learn about retrieval metrics, such as Hit Rate and Mean Reciprocal Rank (MRR), and evaluate the faithfulness and relevancy of answers using GPT-4 as a judge. The lesson provides hands-on code examples for each optimization step and concludes with the overall improvement observed in the RAG pipeline's accuracy.
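The two retrieval metrics named above are straightforward to compute. Hit Rate is the fraction of queries whose expected document appears anywhere in the returned results; MRR averages the reciprocal of the rank at which the expected document first appears. Document ids here are illustrative.

```python
def hit_rate(results: list[list[str]], expected: list[str]) -> float:
    """Fraction of queries whose expected document was retrieved at all."""
    return sum(exp in res for res, exp in zip(results, expected)) / len(expected)

def mrr(results: list[list[str]], expected: list[str]) -> float:
    """Mean reciprocal rank of the expected document; 0 if never retrieved."""
    total = 0.0
    for res, exp in zip(results, expected):
        if exp in res:
            total += 1.0 / (res.index(exp) + 1)  # ranks are 1-based
    return total / len(expected)

results  = [["d3", "d1", "d2"], ["d5", "d4", "d9"], ["d7", "d8", "d6"]]
expected = ["d1", "d4", "d0"]  # d0 was never retrieved

print(hit_rate(results, expected))        # 2 of 3 queries found their document
print(round(mrr(results, expected), 3))   # (1/2 + 1/2 + 0) / 3 ≈ 0.333
```

MRR is the stricter of the two: a document buried at rank 10 still counts as a hit, but contributes only 0.1 to MRR, which is why reranking tends to move MRR more than Hit Rate.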
Module 3 - RAG Agents
The "Retrieval Augmented Generation Agents" module offers a comprehensive exploration into the creation and enhancement of AI agents, with a focus on integration and application in various fields. It begins by introducing LangChain, where students learn about agents, tools, and the initiation of OpenGPTs, gaining practical insights into setting up and customizing AI assistants. Then, it delves into the LlamaIndex framework, teaching students how to build efficient RAG systems by integrating OpenAI agents with various data sources and creating custom functions for enhanced decision-making. The module also covers the use of the OpenAI Assistants API and Hugging Face Inference API.
LangChain Overview: Agents, Tools, and OpenGPT Introduction
In this lesson, students will learn about the fundamental concepts of LangChain, focusing on agents, tools, and getting started with OpenGPTs. They will examine how agents integrate chains, prompts, memory, and tools to execute tasks, and will understand the different types of agents, such as the Zero-shot ReAct agent and the Conversational Agent, designed for various scenarios. The lesson also covers the available tools and how to customize them for specific needs, with functionalities such as the Python, JSON, and CSV tools. Additionally, students will get practical insights into setting up LangChain OpenGPTs by cloning the repository and customizing prompts, providing a comprehensive understanding of how to configure AI assistants, similar to OpenAI GPTs, for tailored interactions.
Utilizing AI Agents with the LlamaIndex Framework for Enhanced Decision-Making
In this lesson, students will learn how to leverage agents within the LlamaIndex framework to build a more efficient and insightful RAG (Retrieval-Augmented Generation) system. They will gain insights into integrating OpenAI agents with various data sources and create custom functions to enhance the agent's capabilities in areas such as mathematical operations. The lesson provides guidance on installing necessary packages, configuring API keys, defining data sources, employing query engines, and setting up agents. Students will also explore an interactive chat interface with the agent and the creation of a dataset, using custom functions as tools that the agent can invoke as required. Finally, students will gain exposure to LlamaHub for further expanding the functionalities of their agents.
Crafting AI Assistants via OpenAI and Hugging Face API
In this lesson, students will explore the capabilities of the OpenAI Assistants API, including the Code Interpreter, Knowledge Retrieval, and Function Calling features. The lesson offers a step-by-step guide for creating and configuring AI assistants integrating OpenAI's tools, revisiting fundamental concepts such as Threads, Messages, and Tools for individual interactions. Additionally, the lesson introduces other advanced models by OpenAI like Whisper, Dall-E 3, and GPT-4 Vision that can be valuable integrations for comprehensive AI product development. We also cover how to use the Hugging Face Inference API to leverage a broad spectrum of machine learning models for tasks like text summarization, sentiment analysis, and text-to-image generation. By the conclusion of the lesson, students will possess the necessary understanding to harness these tools for their own sophisticated AI projects.
Project: Multimodal Financial Document Analysis and Recall
In this lesson, students will learn how to use tools such as GPT-4V to enhance Retrieval-Augmented Generation (RAG) for processing financial documents like Tesla's Q3 financial report PDF. This involves extracting text, tables, and graphs and transforming them into a queryable format stored in a vector database for efficient information retrieval by an AI chatbot. The lesson covers using Unstructured.io for text and table extraction, GPT-4V for extracting information from graphs, and Deep Lake and LlamaIndex for storing and recalling the processed data to answer user queries effectively. We also show how to use Deep Memory to enhance retrieval accuracy. The techniques detailed equip students to develop AI applications capable of analyzing and recalling complex multimodal data from financial documents.
Module 4 - RAG Evaluation and Observability
The module "Retrieval Augmented Generation Evaluation and Observability" provides a comprehensive exploration of advanced techniques and tools for enhancing chatbots and question-answering systems through Retrieval-Augmented Generation (RAG). It examines the critical aspects of evaluating these systems, emphasizing faithfulness, relevance, and the prevention of hallucinations in AI responses. The module introduces tools like the FaithfulnessEvaluator and RAGAS, along with the Golden Context Dataset, offering insights into effective evaluation methodologies, including indexing, embedding, and generation metrics. Additionally, the module covers the LangChain framework and the LangSmith platform, providing practical knowledge on building and testing LLM-powered applications. Students will learn about the components of LangChain, such as Models, Vector Stores, and Chains, and the functionalities of the LangChain Hub.
RAG - Metrics & Evaluation
In this lesson, you will learn about Retrieval-Augmented Generation (RAG) systems and their evaluation metrics, with a focus on improving chatbots and question-answering systems. The lesson introduces you to different approaches to analysis, the importance of faithfulness and answer relevancy, nuances of indexing and embedding metrics, and generation metrics aimed at preventing hallucinations in AI responses. It discusses the significance of the FaithfulnessEvaluator tool for checking the alignment of AI responses with retrieved context and introduces RAGAS and the Golden Context Dataset for system evaluation. Additionally, real-world setups for assessing and improving RAG systems are explored through examples of community-based tools, including comprehensive evaluation of retrieval metrics, holistic approach evaluations, and the custom RAG pipeline evaluation.
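The core idea behind a faithfulness check can be made concrete with a crude sketch: an answer counts as faithful only insofar as its claims are supported by the retrieved context. Real evaluators, such as the FaithfulnessEvaluator mentioned above or GPT-4 used as a judge, delegate that judgment to an LLM; here a simple token-overlap ratio stands in for it, purely to illustrate the metric. The example strings are invented.

```python
def support_ratio(answer: str, context: str) -> float:
    """Fraction of answer tokens that also occur in the retrieved context.
    A crude proxy for faithfulness: low values suggest unsupported claims."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "tesla reported q3 revenue of 23.4 billion dollars"
faithful = "tesla q3 revenue was 23.4 billion dollars"
hallucinated = "tesla q3 revenue doubled to 50 billion euros"

# The grounded answer scores higher than the hallucinated one.
print(support_ratio(faithful, context) > support_ratio(hallucinated, context))  # True
```

Token overlap fails on paraphrases and negations, which is exactly why production evaluators use an LLM judge instead; the sketch only shows what quantity is being judged.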
LangSmith and LangChain Fundamentals for LLM Applications
In this lesson, you will learn the fundamentals of the LangChain framework and the newly introduced LangSmith platform for building and testing LLM-powered applications. We will review LangChain components, such as Models, Vector Stores, and Chains, as well as the principles of the LangChain Hub, including prompt exploration and versioning for collaborative prompt development. The lesson will guide you through setting up the LangSmith environment, creating an API key, and the basics of prompt versioning and tracing. You will also learn how to use LangServe to deploy applications as a REST API, and you will walk through the process of reading and processing data from a webpage, storing it in a Deep Lake vector store, and using prompts from the LangChain Hub to build a question-answering chain application.