amrgaberM/README.md

Hi, I'm Amr Hassan

Software Engineer specializing in production LLM systems and backend infrastructure.

Cairo, Egypt

LinkedIn Medium Email


About Me

I don't just call APIs; I build the systems that power them:

  • Trained a 124M-parameter GPT model from scratch using PyTorch (no Hugging Face shortcuts)
  • Engineered AST-aware RAG pipelines that understand code dependencies, not just keywords
  • Deployed autonomous agents using FastAPI + Docker that run security scans at CI/CD scale

Currently: Building tools that make LLMs understand codebases like senior engineers do.

Open to: Backend Engineer, ML Engineer, and Python Developer roles in Cairo or remote.


Featured Projects

CodeBase Intelligence (RAG)

GitHub Repo Live Demo

Problem it solves: Developers waste hours navigating unfamiliar codebases.
Solution: AST-aware RAG that retrieves relevant code across 2000+ files in seconds.
Impact: 0.81 F1 score on multi-file retrieval; 3x faster than grep/regex search.

Tech: Llama 3.3, LangChain, ChromaDB, AST Parser

Python LangChain FastAPI
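
To illustrate what "AST-aware" retrieval can mean in practice, here is a minimal sketch using only Python's standard `ast` module. It is not the project's actual pipeline: the function name `chunk_source`, the chunk fields, and the `SAMPLE` snippet are all hypothetical. The idea is to split a file at function and class boundaries rather than at fixed character counts, and to record which names each chunk calls as a rough proxy for its dependencies.

```python
import ast


def chunk_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class.

    Hypothetical helper for illustration; field names are assumptions.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Collect plain-name calls (e.g. load(...)) inside this node.
            # Attribute calls such as text.split() are skipped in this sketch.
            calls = sorted({
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            })
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "text": ast.get_source_segment(source, node),
                "calls": calls,
            })
    return chunks


# Illustrative input only.
SAMPLE = '''
def load(path):
    return open(path).read()

def count_words(path):
    text = load(path)
    return len(text.split())
'''

if __name__ == "__main__":
    for chunk in chunk_source(SAMPLE):
        print(chunk["name"], chunk["calls"])
```

A retriever built on chunks like these could, for example, follow the `calls` list from `count_words` to pull in `load` as well, which a keyword-only search would not do on its own.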

CodeSense AI (Security Agent)

GitHub Repo Live Demo

Problem it solves: Manual code review misses 40% of security vulnerabilities.
Solution: CI/CD-integrated agent that scans PRs for SQLi/XSS in real time.
Impact: Sub-second analysis using Groq LPU inference.

Tech: Llama 3.3, Groq API, FastAPI, Docker, Webhooks

FastAPI Docker Groq

FabulaGPT (124M LLM)

GitHub Repo Blog Post

Problem it solves: Understanding Transformer internals, not just using APIs.
Solution: GPT-2 architecture built from scratch in raw PyTorch.
Impact: 41% reduction in validation loss using gradient accumulation on consumer GPUs.

Tech: PyTorch, Transformers, CUDA, TinyStories Dataset

PyTorch Hugging Face
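
Gradient accumulation, mentioned above, can be sketched without any framework. The example below is an assumption-laden toy (a one-parameter linear model with hypothetical helpers `grad_mse`, `accumulated_step`, and `full_batch_step`), not FabulaGPT's training loop. It shows the core property: summing gradients over small micro-batches and applying a single update gives the same step as one large batch, while only a micro-batch needs to be processed at a time.

```python
def grad_mse(w, batch):
    """Gradient of sum((w * x - y) ** 2) over the batch with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch)


def accumulated_step(w, data, micro_batch_size, lr):
    """One optimizer step assembled from several micro-batches."""
    total = 0.0
    for i in range(0, len(data), micro_batch_size):
        micro = data[i:i + micro_batch_size]
        total += grad_mse(w, micro)  # accumulate; no weight update yet
    # Single update using the mean gradient over all samples.
    return w - lr * total / len(data)


def full_batch_step(w, data, lr):
    """Reference step computed from the whole batch at once."""
    return w - lr * grad_mse(w, data) / len(data)


# Toy data following y = 2x (illustrative only).
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

if __name__ == "__main__":
    print(accumulated_step(0.5, DATA, micro_batch_size=2, lr=0.01))
    print(full_batch_step(0.5, DATA, lr=0.01))
```

In PyTorch the same pattern is usually written by calling `loss.backward()` on each micro-batch and calling `optimizer.step()` followed by `optimizer.zero_grad()` only once every few micro-batches.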

VulnAI (Security ML)

GitHub Repo Blog Post

Problem it solves: Regex-based security tools generate too many false positives.
Solution: Ensemble model fine-tuned on 21K C functions from the Devign dataset.
Impact: 66% accuracy, outperforming regex baselines by 12%.

Tech: CodeBERT, Scikit-learn, Pandas, Devign Dataset

Scikit-learn Pandas


Current Focus

🔨 Building: Production-ready code review agent with AST validation
📚 Learning: System design patterns, Data Structures & Algorithms (HackerRank)
🎯 Next: Deploying multi-agent systems with LangGraph


Technical Stack

Core Engineering

Python C++ SQL Bash

AI & LLM Engineering

PyTorch Hugging Face LangChain LlamaIndex ChromaDB

Backend & DevOps

FastAPI Docker Git Linux


📫 Let's Connect

Open to opportunities in Backend Engineering, ML Engineering, and Python Development

LinkedIn Portfolio Email

Pinned

  1. GPT-Implementation (Public)

    Research code implementing the "Attention Is All You Need" architecture. Engineers a stable training loop for a 163M-parameter LLM using reduced-precision techniques on free-tier compute.

    Jupyter Notebook

  2. FabulaGPT (Public)

    A high-performance implementation of a GPT-2 architecture optimized for emergent storytelling. Trained on the TinyStories dataset, this project focuses on achieving linguistic coherence and narrati…

    Python

  3. injury-prediction-prevention-ml (Public)

    A machine learning system for predicting and preventing athlete injuries using advanced data analysis, risk assessment, and tailored recommendations.

    HTML

  4. codebase-intelligence (Public)

    Production-grade RAG system that understands entire codebases. AST-aware chunking, hybrid retrieval, dependency graphs, and multi-file reasoning for GitHub repositories.

    Python

  5. codesense-ai (Public)

    AI-powered code review tool with a CLI, REST API, and GitHub integration.

    Python

  6. vulnai (Public)

    Multi-model vulnerability detection for C code using CodeBERT, GraphCodeBERT, and CodeT5. Trained on Microsoft’s Devign dataset, VulnAI identifies both keyword-based and structural vulnerabilities …

    Python