A powerful Retrieval-Augmented Generation (RAG) system that allows you to upload PDF books and chat with them using AI. Each book is stored separately, and you can select which book to chat with.
- 📤 Upload PDF Books: Upload multiple PDF books to the system
- 📖 Book Selection: Select which book you want to chat with from a dropdown
- 💬 AI-Powered Chat: Ask questions about the book content using RAG
- 🗑️ Book Management: Delete books you no longer need
- 💾 Persistent Storage: Books are stored in a vector database for fast retrieval
- Python 3.12+
- Google Gemini API key (for embeddings and LLM)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd Book-Rag
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables by creating a `.env` file in the root directory:

  ```
  GOOGLE_API_KEY=your_google_api_key_here
  ```
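To confirm the key is picked up, you can run a quick check. This is a minimal sketch assuming the project loads the key with `python-dotenv` (common in LangChain projects); if that package isn't already pulled in by `requirements.txt`, install it first.

```python
# Optional sanity check: confirm GOOGLE_API_KEY is readable from .env.
# Assumes python-dotenv is installed (pip install python-dotenv if it isn't).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
if not os.getenv("GOOGLE_API_KEY"):
    raise SystemExit("GOOGLE_API_KEY not found - check your .env file")
print("GOOGLE_API_KEY loaded")
```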
Start the Gradio web interface:

```bash
python app.py
```

The application will start on http://localhost:7860 (or the URL shown in the terminal).
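If port 7860 is already taken, Gradio's `launch()` accepts an explicit port. The sketch below is illustrative only; the real interface is defined in `app.py`.

```python
# Illustrative sketch - app.py defines the actual upload/dropdown/chat UI.
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("Book RAG")  # placeholder component

demo.launch(server_name="127.0.0.1", server_port=7861)  # pick any free port
```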
- Upload a Book:
  - Click "Choose File" under "Upload Book"
  - Select a PDF file
  - Click "Upload Book"
  - Wait for the upload confirmation
- Select a Book:
  - Use the dropdown menu under "Select Book"
  - Choose the book you want to chat with
- Chat with the Book:
  - Type your question in the text box
  - Click "Send" or press Enter
  - The AI will answer based on the book's content
- Manage Books:
  - Click "Refresh Book List" to update the list
  - Click "Delete Selected Book" to remove a book
  - Click "Clear Chat" to clear the current conversation
- `vectorstore.py`: Contains the `BookRAGSystem` class (a usage sketch follows this list) that manages:
  - Multiple book vector stores (one per book)
  - PDF loading and chunking
  - RAG query processing
  - Book metadata management
- `app.py`: Gradio web interface with:
  - File upload functionality
  - Book selection dropdown
  - Chat interface
  - Book management features
- `gemini_llm.py`: Google Gemini LLM configuration
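For scripting outside the Gradio UI, here is a hypothetical sketch of driving `BookRAGSystem` directly. The method names (`add_book`, `list_books`, `query`, `delete_book`) are illustrative assumptions, not the verified API; check `vectorstore.py` for the real signatures.

```python
# Hypothetical usage sketch - method names are assumptions, see vectorstore.py for the real API.
from vectorstore import BookRAGSystem

rag = BookRAGSystem()                                   # persists to chroma_langchain_db/
book = rag.add_book("docs/sample.pdf")                  # load, chunk, embed, store
print(rag.list_books())                                 # one Chroma collection per book
print(rag.query(book, "What is the book's central argument?"))
rag.delete_book(book)                                   # drop the book's collection
```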
- Upload: When you upload a PDF, it's (see the first sketch below):
  - Loaded and split into chunks
  - Embedded using Google Gemini embeddings
  - Stored in a separate Chroma collection per book
- Query: When you ask a question (see the second sketch below):
  - The system retrieves relevant chunks from the selected book
  - The LLM generates an answer based on the retrieved context
  - The answer is displayed in the chat interface
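A minimal sketch of the upload path, assuming the LangChain components implied by the technical details below (a PDF loader, `RecursiveCharacterTextSplitter`, Gemini embeddings, Chroma); the file and collection names are illustrative:

```python
# Ingestion sketch - loader choice and names are assumptions, not the project's exact code.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_chroma import Chroma

docs = PyPDFLoader("docs/sample.pdf").load()                  # one Document per page
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200                        # matches the chunking settings below
).split_documents(docs)

embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")
store = Chroma(
    collection_name="sample_book",                            # one collection per book
    embedding_function=embeddings,
    persist_directory="chroma_langchain_db",
)
store.add_documents(chunks)
```

And a minimal sketch of the query path, continuing the example above; the prompt wording is illustrative, not the project's exact prompt:

```python
# Query sketch - retrieve top-k chunks, then let Gemini answer from that context.
from langchain_google_genai import ChatGoogleGenerativeAI

retriever = store.as_retriever(search_kwargs={"k": 4})        # top-k similarity search
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite")

question = "Who is the protagonist?"
context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))
answer = llm.invoke(f"Answer using only this context:\n\n{context}\n\nQuestion: {question}")
print(answer.content)
```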
- Vector Store: ChromaDB with persistent storage
- Embeddings: Google Gemini Embedding Model (`models/gemini-embedding-001`)
- LLM: Google Gemini (`gemini-2.5-flash-lite`)
- Chunking: RecursiveCharacterTextSplitter (1000 chars, 200 overlap)
- Retrieval: Top-k similarity search (k=4 by default)
```
Book-Rag/
├── app.py                  # Gradio web interface
├── vectorstore.py          # RAG system core logic
├── gemini_llm.py           # LLM configuration
├── main.py                 # Legacy script (can be removed)
├── requirements.txt        # Python dependencies
├── .env                    # Environment variables (create this)
├── chroma_langchain_db/    # Vector database storage
└── docs/                   # Sample PDF files
```
- If no books appear or the chat can't find your book:
  - Make sure you've selected a book from the dropdown
  - Try refreshing the book list
- If a book fails to upload:
  - Ensure the file is a valid PDF
  - Check that you have write permissions in the directory
- If you see API or connection errors:
  - Verify your Google API key is set correctly
  - Verify the key is valid and has quota
  - Check your internet connection
  - Check the console for error messages
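A quick way to isolate API problems is to embed a single string outside the app; this is a sketch assuming `python-dotenv` and `langchain-google-genai` from the stack above:

```python
# Embeds one short string to confirm the key, quota, and network path all work.
from dotenv import load_dotenv
from langchain_google_genai import GoogleGenerativeAIEmbeddings

load_dotenv()
emb = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")
print(len(emb.embed_query("hello")))  # prints the embedding dimension on success
```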
This project is open source and available for use.
Feel free to submit issues and enhancement requests!