Simple SQLite-Like Database in C

Overview

This project is a simple SQLite-inspired database implemented in C, designed to demonstrate core database concepts:

Persistent storage of rows in leaf nodes using a B+ tree.
Efficient search, insert, and traversal operations.
Basic command parsing for inserting and selecting rows.

The primary goal of this project is learning by doing: to deeply understand how databases internally manage data, optimize read/write operations, and scale efficiently.

Motivation

My journey into databases started with a simple video about indexing, which sparked my curiosity about how databases store data, manage queries, and optimize performance. Reading about database internals — pages on disk, I/O costs, and tree-based indexing

I wanted to go beyond theory and experience the challenges firsthand, so I decided to build a minimal database engine to better understand and visualize these concepts.

I detailed my learning about Databases basic data storage and why using B+trees in this Notion page Databases

Features

This simple database supports one table with the following schema:

id (integer)
username (string, max 32 characters)
email (string, max 255 characters)

The database supports the following commands:

insert <id> <username> <email>: Inserts a new row into the database.
select: Retrieves and displays all rows in the database.
select <id>: Searches for a row by its ID and displays it if found.
bulk_insert inserts rows from "sample_data.txt" file for testing.

Meta commands:

.exit: Exits the database program and frees allocated memory.
.btree: Displays the structure of the B+ tree.
.constants: Displays constants used in the database.
.help: Displays help information about available commands.
.cache: Displays LRU cache structure.

Building and Running

To build and run the database, follow these steps:

Clone the repository:

git clone https://github.com/AhmedTrb/Database.git
cd Database

Compile the code using gcc and run the executable:
```
gcc -o db db.c
./db database.db
```

Database File Structure

The database uses a simple file-based storage mechanism to persist data using a B+Tree. The file structure is as follows:

+------------------- Database File -------------------+
|                                                     |
|  [PAGE 0] (Root Internal Node)                      |
|  ┌───────────────────────────────────────────────┐  |
|  │ Node Type: INTERNAL                           │  |
|  │ Is Root: true                                 │  |
|  │ Num Keys: 1                                   │  |
|  │                                               │  |
|  │ Keys: [4]                                     │  |
|  │ Children: [1] [2]                             │  |
|  └───────────────────────────────────────────────┘  |
|                                                     |
|  [PAGE 1] (Leaf Node)                               |
|  ┌───────────────────────────────────────────────┐  |
|  │ Node Type: LEAF                               │  |
|  │ Num Cells: 3                                  │  |
|  │ Next Leaf: 2                                  │  |
|  │                                               │  |
|  │ Row #1 → id=1 username="ahmed" email="a@x.com"│  |
|  │ Row #2 → id=2 username="john"  email="j@x.com"│  |
|  │ Row #3 → id=3 username="ali"   email="al@x.com│  |
|  └───────────────────────────────────────────────┘  |
|                                                     |
|  [PAGE 2] (Leaf Node)                               |
|  ┌───────────────────────────────────────────────┐  |
|  │ Node Type: LEAF                               │  |
|  │ Num Cells: 2                                  │  |
|  │ Next Leaf: NULL                               │  |
|  │                                               │  |
|  │ Row #4 → id=4 username="lisa"  email="li@x.com│  |
|  │ Row #5 → id=5 username="omar"  email="o@x.com"│  |
|  └───────────────────────────────────────────────┘  |
|                                                     |
+-----------------------------------------------------+

LRU Cache

To optimize read operations, the database implements a simple LRU (Least Recently Used) cache. The cache stores recently accessed pages in memory, reducing the need for frequent disk I/O operations. The cache has a fixed capacity and uses a doubly linked list and a hash table to track the order of page usage. When a page is accessed, it is moved to the front of the list. If the cache is full and a new page needs to be added, the least recently used page (at the end of the list) is evicted.

+--------------------- LRU Cache ----------------------+
|                                                     |
|  Doubly Linked List (Order of Use):                 |
|                                                     |
|  HEAD (MRU)                                         |
|   ┌──────────┐    ┌──────────┐    ┌──────────┐      |
|   │ Page 12  │<-->| Page 8   │<-->| Page 5   │      |
|   └──────────┘    └──────────┘    └──────────┘      |
|                         ^                           |
|                         |                           |
|                      TAIL (LRU)                     |
|                                                     |
+-----------------------------------------------------+
|                                                     |
|  Hash Table (Direct Access by Page Number):         |
|                                                     |
|  Index: PageNum → Pointer                           |
|  -------------------------------------------------  |
|   [5]  → (ptr to node holding Page 5)               |
|   [8]  → (ptr to node holding Page 8)               |
|   [12] → (ptr to node holding Page 12)              |
|                                                     |
|  Empty slots are NULL for pages not cached.         |
+-----------------------------------------------------+

Acknowledgment

This project follows this excellent step-by-step tutorial DB Tutorial, which guided the initial implementation of:

B+ tree structure for table storage.
Row storage in leaf nodes.
Disk persistence through simple file-based storage.

I built upon this foundation by adding new utilities and experimenting with different aspects of the database to understand each component in depth. Things I added:

select id
LRU Cache to optimize read operations.
bulk_insert command to insert multiple rows from a file.
Debug and visualization utilities (.cache, .btree).

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.idea		.idea
README.md		README.md
db.c		db.c
db.exe		db.exe
test.db		test.db

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Simple SQLite-Like Database in C

Overview

Motivation

Features

Building and Running

Database File Structure

LRU Cache

Acknowledgment

About

Uh oh!

Languages

AhmedTrb/Database

Folders and files

Latest commit

History

Repository files navigation

Simple SQLite-Like Database in C

Overview

Motivation

Features

Building and Running

Database File Structure

LRU Cache

Acknowledgment

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages