GPT model loses general knowledge when processing document content

I’m working on a basic LangChain application where I load PDF documents and query them with GPT. The setup works great for document-specific questions, but I noticed something weird. When I ask basic questions like simple math problems, the model acts like it doesn’t know anything.

import PyPDF2
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI

# Load and process document
pdf_file = PyPDF2.PdfReader('./research_paper.pdf')
content = ''
for page in pdf_file.pages:
    page_text = page.extract_text()
    if page_text:
        content += page_text

# Split text into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=150,
    separators=["\n", " "]
)
text_chunks = splitter.split_text(content)

# Create vector store
embedding_model = OpenAIEmbeddings(model='text-embedding-ada-002')  # must be an embedding model, not a chat model
vector_db = FAISS.from_texts(text_chunks, embedding_model)

# Setup QA chain
qa_chain = load_qa_chain(OpenAI(), chain_type="stuff")

# Query the system
user_query = "What are the main findings?"
relevant_docs = vector_db.similarity_search(user_query)
response = qa_chain.run(input_documents=relevant_docs, question=user_query)

The issue is that the model seems to only know about the PDF content and forgets its general training knowledge. Is there a way to maintain both the document-specific info and the model’s original capabilities? I want it to answer document questions AND general knowledge questions.

Yeah, this is a super common RAG problem. Your QA chain only sees the retrieved docs, so the model basically ignores everything else it knows.

I hit this same wall last year building a knowledge base. The fix is dead simple - just update your prompt to tell the model it can use both the docs AND its general knowledge.

This worked for me:

from langchain.prompts import PromptTemplate

custom_prompt = PromptTemplate(
    template="""Use the following context to answer the question. If the answer isn't in the context, you can use your general knowledge to help.

    Context: {context}
    
    Question: {question}
    
    Answer:""",
    input_variables=["context", "question"]
)

qa_chain = load_qa_chain(OpenAI(), chain_type="stuff", prompt=custom_prompt)

Or you could build a routing system - check if the question relates to your docs first. Low similarity scores? Skip the document context and go straight to the LLM.

Basically, when you stuff docs into the prompt, the model gets tunnel vision. You’ve got to explicitly tell it that going beyond those docs is fine.

Had the same problem with a chatbot for work docs. The fix is easy: add fallback logic. If vector search scores are low (under 0.6), skip the doc context and send the raw question straight to GPT. Takes like 5 lines of code and gets rid of that "I don't know basic math" nonsense.
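A minimal sketch of that fallback check as a pure routing helper. The 0.6 threshold is the one suggested above, and the code assumes similarity scores where higher means more similar; if your vector store returns raw distances (lower is more similar), invert the comparison:

```python
def route_query(doc_scores, threshold=0.6):
    """Pick 'rag' when retrieval looks relevant, 'llm_only' otherwise.

    Assumes similarity scores where higher = more similar (e.g. cosine
    similarity). For distance metrics (lower = more similar), flip the
    comparison.
    """
    if not doc_scores or max(doc_scores) < threshold:
        return "llm_only"
    return "rag"

# Hypothetical scores from similarity_search_with_score
path = route_query([0.82, 0.41])  # -> "rag"
```

Downstream, `"rag"` runs the QA chain with the retrieved docs and `"llm_only"` sends the raw question to the model.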

RAG tunnel vision drives me crazy. Hit the same wall building an internal tool that had to handle both company docs and general tech questions.

Best fix? Automate the routing decision entirely. Skip the manual prompt tweaking and similarity score babysitting.

I built a workflow that analyzes each question first - document-related stuff goes through RAG, general knowledge questions skip vector search and go straight to the LLM.

You can add conditional logic for edge cases too. When document similarity scores are iffy, it tries both paths and picks the better response.

Took me 30 minutes to set up with automation workflows. No more prompt engineering nightmares or manual routing headaches. System handles everything based on your rules.

Bonus: built-in logging shows you which path each question took, so you can optimize over time.

Your RAG setup constrains the model to only use retrieved document chunks. The model gets those text pieces as context and prioritizes them over what it learned during training.

I hit this same problem building a legal document assistant. Here's what worked: I implemented a hybrid approach that checks semantic similarity scores first. If similarity drops below 0.7, I skip the document context and let the model use its general knowledge instead.

I also modified the retrieval step. Rather than always feeding documents to the QA chain, I added a preliminary check to see if the query needs documents or general knowledge. You can use simple keyword matching or a quick classification prompt.

The key insight? RAG doesn't have to be all-or-nothing. Sometimes the model's training beats your documents.
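That preliminary check can be as simple as keyword matching. A sketch under the assumption that you maintain a keyword set drawn from your own corpus (the terms below are hypothetical placeholders):

```python
# Hypothetical terms pulled from the indexed documents
DOC_KEYWORDS = {"findings", "methodology", "paper", "study", "authors"}

def needs_documents(query, keywords=DOC_KEYWORDS):
    """Crude router: run the query through RAG only if it mentions
    document-related terms; otherwise go straight to the LLM."""
    words = set(query.lower().replace("?", "").split())
    return bool(words & keywords)

needs_documents("What are the main findings?")  # True -> use RAG
needs_documents("What is 2 + 2?")               # False -> plain LLM
```

A classification prompt ("Does this question require the uploaded documents? Answer yes or no.") is the heavier but more robust version of the same gate.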

Been there. Built a document analyzer for our legal team and hit this exact problem.

Your chain setup forces the model to treat document chunks as the only valid info source. Ask “what’s 2+2” but the retrieved docs are about patent law? GPT thinks it can only use patent law to answer.

I skipped the routing approaches others mentioned. Kept the RAG pipeline but changed how retrieval works. Instead of always pulling X documents, I set a dynamic threshold.

relevant_docs = vector_db.similarity_search_with_score(user_query, k=3)
filtered_docs = [doc for doc, score in relevant_docs if score < 0.5]  # FAISS returns L2 distance by default; lower means more similar

if not filtered_docs:
    # No relevant docs found, skip RAG and call the LLM directly
    response = OpenAI()(user_query)
else:
    response = qa_chain.run(input_documents=filtered_docs, question=user_query)

Math questions never see irrelevant document context. Document questions get full RAG treatment.

Worked way better than prompt tweaking. The model doesn’t get confused about what info to prioritize when there’s no relevant context.

I've encountered this issue frequently. The core problem lies in how context is presented to the model: it assumes the retrieved document chunks represent the entirety of available information.

To mitigate this, consider changing the chain type from "stuff" to "map_reduce" or "refine". These chain types feed the context to the model in stages rather than one packed prompt, which can loosen how strongly the prompt anchors it to the documents.

Additionally, I created a simple classifier using sentence transformers to determine whether a query requires document context. If the relevance score is low, I bypass the vector search and query the LLM directly.

Another useful tip is to prefix the document chunks with a note telling the model it can draw on its general knowledge when necessary. Essentially, overemphasizing the documents in the prompt leads the model to disregard its training.