Northwestern Campus Library

Northwestern AI in Education

Resources for faculty and students at CT State Northwestern on AI in education, including practical applications, ethical considerations, and more.

Retrieval-Augmented Generation

What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is a method used in artificial intelligence to improve how systems answer questions or generate content. Rather than relying only on what the model learned during training, RAG combines that knowledge with real-time access to external information, such as a database, a collection of articles, or a library catalog.

Think of it like this: instead of guessing the answer from memory, the AI first looks up relevant documents and then uses them to create a more accurate and up-to-date response.

This approach helps make the output more reliable, especially in areas that change often, such as research, science, or policy. Users can also upload their own documents and "chat" in real time with a chatbot that draws only on that private material. By using RAG, we can home in on the information we want to extract or share while reducing irrelevant or erroneous results.

Image credit: https://www.promptingguide.ai/_next/image?url=%2F_next%2Fstatic%2Fmedia%2Frag-framework.81dc2cdc.png&w=3840&q=75
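
The look-up-then-answer flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the three sample documents, the `retrieve` and `build_prompt` helpers, and the simple word-overlap scoring are all hypothetical, and the final call to a language model is left out.

```python
import re

# Hypothetical mini "library". In a real system this would be a database,
# a collection of articles, or a library catalog.
DOCUMENTS = {
    "water-2023": "Recent studies detect microplastics in bottled and tap drinking water.",
    "mla-guide": "To cite a government report in MLA, list the agency as the author.",
    "policy-2024": "The campus policy on AI tools was updated for the fall term.",
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return the ids of up to k documents that share the most words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(text)), doc_id) for doc_id, text in documents.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(question, documents, doc_ids):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"[{d}] {documents[d]}" for d in doc_ids)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


question = "What do studies say about microplastics in drinking water?"
top_ids = retrieve(question, DOCUMENTS, k=1)      # step 1: look up
prompt = build_prompt(question, DOCUMENTS, top_ids)  # step 2: ground the request
print(prompt)  # step 3 (not shown): send this prompt to a language model
```

Because the model only sees the retrieved passage, its answer can be checked against a named source rather than against whatever it remembers from training.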

RAG in Higher Education

In a college or university setting, RAG models can:

  • Support students in finding relevant sources for research papers

  • Help faculty design better prompts or assignments involving AI tools

  • Provide more accurate answers grounded in trusted academic resources

  • Bridge the gap between AI tools and library databases


RAG-based tools are especially helpful in higher education because they can:

  • Summarize and explain dense academic texts

  • Answer research questions with citations from trusted sources

  • Guide students to relevant articles and resources

  • Interpret course materials that faculty or students upload

  • Provide real-time, explainable support for studying and writing


Examples of RAG in Action

1. Academic Search Assistant
A student asks: “What are recent peer-reviewed findings on microplastics in drinking water?”
A RAG model retrieves up-to-date studies from the library’s science databases, then generates a summary grounded in those sources.

2. Custom Library Chatbot
A user types: “How do I cite a government report in MLA?”
The chatbot searches the library’s citation LibGuides or style handbooks and returns a correct example with explanations.

3. Faculty Teaching Tool
A professor asks: “Give me a quick summary of three recent papers I uploaded on neurodiversity in education.”
RAG reads the uploaded PDFs, pulls key themes from each, and generates a concise overview suitable for class prep or a slide deck.

4. Student Study Assistant (with uploaded content)
A student uploads a set of assigned readings and asks:
“What does the author say about social contract theory across these chapters?”
The RAG tool scans the uploaded documents, retrieves the most relevant sections, and generates a clear summary with direct quotes and citations.
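
The "retrieve the most relevant sections, then quote and cite them" step in the last example can be sketched as follows. This is a rough illustration only: the file names and their text are invented placeholders, the helper names are hypothetical, and real tools typically use more sophisticated chunking and ranking than the word counting shown here.

```python
import re


def split_into_chunks(text, max_words=12):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def index_uploads(uploads, max_words=12):
    """Turn {filename: text} into a list of (filename, section_number, chunk_text)."""
    index = []
    for name, text in uploads.items():
        for n, chunk in enumerate(split_into_chunks(text, max_words), start=1):
            index.append((name, n, chunk))
    return index


def word_set(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def top_sections(question, index, k=2):
    """Return up to k indexed sections that share at least one word with the question."""
    q = word_set(question)
    scored = [(len(q & word_set(entry[2])), entry) for entry in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:k] if score > 0]


def format_citation(entry):
    """Render a section as a direct quote followed by its source."""
    name, n, chunk = entry
    return f'"{chunk}" ({name}, section {n})'


# Invented placeholder readings, standing in for a student's uploads.
uploads = {
    "chapter1.txt": (
        "The social contract is an agreement among people to form a society. "
        "Citizens give up some freedoms in return for protection."
    ),
    "chapter2.txt": "Later chapters describe trade routes and farming methods in detail.",
}

index = index_uploads(uploads)
hits = top_sections("What does the author say about social contract theory?", index)
for hit in hits:
    print(format_citation(hit))
```

Keeping the file name and section number alongside each chunk is what lets the tool attach a citation to every quote it passes to the language model.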


Tools That Utilize RAG

The following tools and platforms use Retrieval-Augmented Generation (RAG) to provide more accurate and context-aware responses. Some are ready to use, while others are open source and can be customized by technically skilled users.

1. NotebookLM (by Google)

  • What it does: NotebookLM allows users to upload documents (PDFs, Google Docs, text files) and ask questions about them. It uses RAG to generate answers that are grounded in the uploaded material.

  • Best for: Students, researchers, and instructors who want to interact with course notes, articles, or research materials.

  • No coding required: This is a user-friendly, browser-based tool. Great for beginners!

2. ChatGPT with File Upload (Pro version)

  • What it does: Allows users to upload files and ask questions about the content. ChatGPT retrieves relevant excerpts from the files and uses them to generate responses.

  • Best for: Students and faculty who want quick answers based on their documents.

  • Note: RAG behavior depends on ChatGPT's file-handling capabilities, which vary by plan and model; they are available in the Pro version using GPT-4.

3. Elicit (originally developed by Ought)

  • What it does: A research assistant designed to help find and summarize academic papers. Elicit uses retrieval techniques to surface evidence from uploaded or indexed texts.

  • Best for: Literature reviews, finding relevant research, and exploring citations.

  • Free to use with no coding required.

4. Haystack (Open Source – Python)

  • What it does: A framework for building custom RAG pipelines using your own data and language models. It connects document sources (such as PDFs and databases) with question-answering systems.

  • Best for: Developers, data scientists, or library technologists.

  • Use cases: Build research assistants, search engines, or course-specific bots.

5. LangChain (Open Source – Python/JavaScript)

  • What it does: A toolkit for chaining together components like document loaders, vector databases, and large language models. RAG is a key feature of many LangChain apps.

  • Best for: Developers or institutions creating custom academic tools.

  • Often paired with: OpenAI, Hugging Face, Pinecone, or Chroma.

6. LlamaIndex (Open Source – Python; formerly GPT Index)

  • What it does: Helps connect LLMs with structured or unstructured data, including PDFs, websites, Notion pages, or SQL databases.

  • Best for: Creating searchable knowledge bases or course content assistants.

  • Integrates with: LangChain and other frameworks.
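
To give a feel for what the open-source frameworks above wire together (a document loader, a vector database, and a language model), here is a toy pipeline in plain Python. It does not use the Haystack, LangChain, or LlamaIndex APIs. Every name in it (`embed`, `TinyVectorStore`, `fake_llm`, `answer`) is a hypothetical stand-in, the sample texts are invented, and word counts take the place of the learned embeddings a real system would use.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (vector, original text)

    def add(self, text):
        self._items.append((embed(text), text))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        qv = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(qv, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def fake_llm(prompt):
    """Stand-in for a language model call; it only reports the prompt size."""
    return f"(model would answer here, given {len(prompt.split())} words of prompt)"


def answer(question, store, llm=fake_llm):
    """Chain the components: retrieve context, build a prompt, call the model."""
    context = store.search(question, k=1)
    prompt = "Context: " + " ".join(context) + "\nQuestion: " + question
    return llm(prompt), context


# "Loading" step: invented sample passages standing in for real documents.
store = TinyVectorStore()
store.add("Library hours are 8 am to 8 pm on weekdays.")
store.add("Interlibrary loan requests take about one week.")

reply, context = answer("When is the library open on weekdays?", store)
print(context)
print(reply)
```

In a real deployment each piece would be swapped for a production component, for example an embedding model in place of `embed` and a hosted vector database in place of `TinyVectorStore`, while the overall chain stays the same.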