CommitGuard is a PR-focused security scanner that automatically analyzes Pull Requests for leaked secrets and insecure changes. It inspects the PR’s commits/diff for hardcoded credentials (API keys, tokens, passwords, private keys) and risky configurations (e.g., disabled TLS verification), then posts a structured report back to the PR.
The analysis can be LLM-assisted via LangChain (OpenAI or other providers) to classify findings into risk levels (HIGH/MEDIUM/LOW) and generate concise explanations and evidence.
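For orientation, the classification step boils down to a single LangChain chat-model call per finding. The sketch below is illustrative only; the model name and prompt wording are assumptions, not CommitGuard's internal code:

```python
# Illustrative sketch of an LLM-assisted risk classification call (not CommitGuard's
# actual implementation). Assumes langchain-openai is installed and OPENAI_API_KEY is set.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model choice is an assumption

finding = 'password = "supersecret123"'
prompt = (
    "Classify the security risk of this changed line as HIGH, MEDIUM or LOW "
    "and give a one-sentence explanation:\n" + finding
)
print(llm.invoke(prompt).content)  # e.g. "HIGH: hardcoded password"
```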
Requirements:
- Python 3.12+
Install dependencies and the tool locally:
```bash
pip install -r requirements.txt
pip install -e .
```

In your repository:
Settings → Secrets and variables → Actions → New repository secret
Add the following secrets:
- `GH_PAT`: used to read commit data (and private repos if needed). Recommended permissions:
  - Contents: Read
  - Pull requests: Read
- `OPENAI_API_KEY`: used for LLM-based risk classification.
If you use another LLM provider (Ollama, Azure, etc.), configure the corresponding environment variables instead.
Create the file `.github/workflows/pr-automation.yml` with the following content:
```yaml
name: PR Automation

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]

permissions:
  issues: write
  pull-requests: write
  contents: read

jobs:
  bot:
    name: Run CommitGuard
    runs-on: ubuntu-latest
    steps:
      - name: Run CommitGuard on PR
        uses: SanyaKor/CommitGuard@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GH_PAT: ${{ secrets.GH_PAT }}
```

On every PR update CommitGuard will automatically:
- analyze PR commits and changes
- detect leaked secrets and insecure configs
- classify findings by severity
- post a comment with results in the PR
- upload full JSON report as workflow artifact
No manual steps are required.
Example:
Run via CLI:

```bash
commitguard --repo <GITHUB_REPO_URL> --n <NUMBER_OF_COMMITS> --out <OUTPUT_JSON_FILE_NAME>
```

- `--repo`: GitHub repository URL (HTTPS or SSH)
- `--n`: number of commits to fetch (1–100)
- `--out`: output JSON file name (default: `suspicious_commits.json`)
**Scan the last 5 commits of a repo via HTTPS**

```bash
commitguard --repo https://github.com/owner/repo.git --n 5 --out output.json
```

- Suspicious findings are reported with counts of HIGH, MEDIUM, and LOW threats.
- Results are saved to `suspicious_commits.json`, including:
  - code line
  - file + line location
  - commit metadata
  - risk level (LLM)
```json
[
  {
    "line": "password = \"supersecret123\"",
    "location": "app/config.py:42",
    "author": "octocat",
    "date": "2025-09-30T12:00:00Z",
    "commit_message": "fix db connection",
    "llm_response": "HIGH: hardcoded password",
    "commit_sha": "xxxxxxxxxxxxxxxxxxxxxx"
  }
]
```

This program requires a valid GitHub API key AND an LLM key (OpenAI); don't forget to set your own personal access token as an environment variable:
```bash
export GH_PAT="your_personal_access_token_here"
export OPENAI_API_KEY=sk-...
```

CommitGuard uses LangChain, so you can plug in any chat model that has a LangChain wrapper: OpenAI, Ollama, AnthropicLLM, etc.
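A minimal sketch of what the provider swap looks like on the LangChain side; the package and model names are assumptions for illustration, and this is not CommitGuard's configuration API:

```python
# Any LangChain chat model exposes the same .invoke() interface, so providers
# are interchangeable. Package/model names below are illustrative assumptions.
from langchain_openai import ChatOpenAI    # hosted OpenAI, needs OPENAI_API_KEY
from langchain_ollama import ChatOllama    # local Ollama server, no API key

llm = ChatOpenAI(model="gpt-4o-mini")
# llm = ChatOllama(model="llama3")         # drop-in local alternative

print(llm.invoke("Rate the risk of: verify=False in a requests call").content)
```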
Set API key via env:
```bash
export OPENAI_API_KEY=sk-...
```

CommitGuard includes a test suite to validate core functionality:
- GitHub API auth
- Commit fetching (sync/async)
- Leak parser (regex, test context, entropy; an illustrative entropy check is sketched after this list)
- TODO: LLM integration flow
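For reference, entropy-based detection usually flags long, high-entropy tokens as candidate secrets. The sketch below illustrates the general heuristic only; it is not CommitGuard's actual parser, and the length/entropy thresholds are assumptions:

```python
# Illustrative sketch of entropy-based secret detection (not CommitGuard's parser).
# High Shannon entropy over a long token is a common heuristic for random API keys.
import math
from collections import Counter

def shannon_entropy(token: str) -> float:
    """Bits of entropy per character of `token`."""
    counts = Counter(token)
    total = len(token)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_like_secret(token: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    # Thresholds are assumptions chosen for illustration only.
    return len(token) >= min_len and shannon_entropy(token) >= threshold

print(looks_like_secret("ghp_x7Kq9ZtPba2LmQ4RsVwY8NcJdE1fUhG3TiOx"))  # True
print(looks_like_secret("hello_world_config_value"))                   # False
```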
Run tests with:
```bash
pytest -v
```

Planned improvements and next steps for CommitGuard:
- Expand leak parser rules with more patterns (cloud provider keys, OAuth tokens, etc.)
- Refine entropy-based detection to reduce false positives
- Optimize LLM integration (batch processing, better scoring, MCP, data vectorisation + DB hosting with Chroma)
- Add more tests for leak parser and LLM workflow
