In recent years, the AI industry has grown significantly and continues to grow rapidly. It is slowly becoming an integral part of our daily lives, boosting our productivity while cutting down the time and effort required. Whether we are cooking and need a professional instructor or want to automate large machines in industry, your saviour AI is here.
Suppose you have an exam coming up in a week and most of the topics are still left to prepare, or you just need to check whether your answer is correct. Riffling through hundreds or thousands of pages will clearly cost you precious time, so why not just ask your book the question and have it spit out the answer? But that's impossible...
Well, no! Not when AI is around.
In a few minutes you will be able to do exactly that. We will create an AI tool that lets you feed in your document(s) and ask it questions.
In this post, we hope to give you:
1. A brief overview of Llama-2:
Llama 2 is an open source large language model (LLM) provided by Meta for research and commercial use.
Llama 2 comes in three variants: 7B, 13B, and 70B parameters.
2. A brief overview of LangChain:
LangChain is a software framework designed to simplify the creation of applications powered by large language models (LLMs).
3. A brief overview of HuggingFace:
HuggingFace is a platform where the machine learning community collaborates on models, datasets, and applications. We are using it to download the Llama-2 7B chat model.
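Later on, the chat model is needed as an LLM object for the QA chain. A minimal sketch of how it could be loaded through HuggingFace's transformers library and wrapped for LangChain is shown below; the model id `meta-llama/Llama-2-7b-chat-hf` is a gated model that requires accepting Meta's license on HuggingFace, and `max_new_tokens` is an illustrative choice, not a value from the original project.
# Hedged sketch: load the Llama-2 7B chat model and wrap it for LangChain
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated model; requires access approval on HuggingFace
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# wrap a text-generation pipeline so LangChain can use it as an LLM
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=512)
llm = HuggingFacePipeline(pipeline=pipe)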
4. How to ingest your local document:
The process of ingesting a document involves the following steps:
Split Text into chunks:
Chunking is the process of breaking large pieces of text into smaller segments.
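The splitter code below assumes the documents have already been read into `text_documents`. A minimal, hypothetical sketch of that loading step using LangChain's document loaders could look like this; the `SOURCE_DIRECTORY` folder name and the plain-text glob are assumptions, not part of the original setup.
# hypothetical loading step: read plain-text files from a local folder
from langchain.document_loaders import DirectoryLoader, TextLoader

SOURCE_DIRECTORY = "SOURCE_DOCUMENTS"  # assumed folder containing your files
loader = DirectoryLoader(SOURCE_DIRECTORY, glob="**/*.txt", loader_cls=TextLoader)
text_documents = loader.load()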
# splitters to create chunks from text documents and Python source files
from langchain.text_splitter import Language, RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=880, chunk_overlap=200
)

# create chunks from the loaded documents
texts = text_splitter.split_documents(text_documents)
Create embeddings:
An embedding is a relatively low-dimensional space into which high-dimensional vectors can be translated. Ideally, an embedding captures some of the input's semantics by placing semantically similar inputs close together in the embedding space. Embeddings can be learned and reused across models.
# create embeddings; EMBEDDING_MODEL_NAME and device_type (e.g. "cuda" or "cpu") are defined in the project's constants
from langchain.embeddings import HuggingFaceInstructEmbeddings

embeddings = HuggingFaceInstructEmbeddings(
    model_name=EMBEDDING_MODEL_NAME,
    model_kwargs={"device": device_type},
)
Store the embeddings:
We are using ChromaDB to store the embeddings.
# store the embeddings in a persistent Chroma vector store
from langchain.vectorstores import Chroma

# PERSIST_DIRECTORY and CHROMA_SETTINGS are defined in the project's constants
db = Chroma.from_documents(
    texts,
    embeddings,
    persist_directory=PERSIST_DIRECTORY,
    client_settings=CHROMA_SETTINGS,
)
5. Use Llama 2 to fetch answers from your document:
To perform QA over our document, we have to follow these steps:
Load the vectors:
Vectors are used to represent text or data in a numerical form that the model can understand and process. This representation is known as an embedding.
# load the persisted vector store and expose it as a retriever
db = Chroma(
    embedding_function=embeddings,
    persist_directory=PERSIST_DIRECTORY,
)
retriever = db.as_retriever()
Create Prompt Template:
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, \
just say that you don't know, don't try to make up an answer.
{context}
{history}
Question: {question}
Helpful Answer:"""
prompt = PromptTemplate(
    input_variables=["history", "context", "question"], template=template
)
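The `{history}` placeholder in the template expects a conversation memory to be attached to the chain in the next step. A minimal sketch of one way to set that up is shown below; the exact memory class used in the original project is an assumption.
# memory object that supplies the {history} placeholder in the prompt
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(input_key="question", memory_key="history")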
Perform QA:
# build the retrieval QA chain; llm is the Llama-2 chat model loaded earlier
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    # chain_type="refine",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt, "memory": memory},
)
To perform interactive QA, loop the process:
# Interactive questions and answers
while True:
    query = input("\nEnter a query: ")
    if query == "exit":
        break

    # Get the answer from the chain
    res = qa(query)
    answer, docs = res["result"], res["source_documents"]

    # Print the result
    print("\n\n> Question:")
    print(query)
    print("\n> Answer:")
    print(answer)
Output: