spydisec/PDFtoOFX

ANZ Plus to OFX Converter

Convert ANZ Plus bank statement PDFs to OFX format for seamless import into Actual Budget and other personal finance applications.

✨ Features

  • 🎨 Modern Web Interface - Drag-and-drop PDF upload with instant conversion
  • 📝 Multi-line Descriptions - Captures complete transaction details including merchant, location, and reference numbers
  • 🎯 Smart Truncation - Preserves merchant names in OFX NAME field by removing common prefixes
  • 💯 Accurate Detection - Uses balance change analysis for reliable credit/debit identification
  • 🔐 Collision-free IDs - Sequential FITID strategy ensures unique transaction IDs for duplicate prevention
  • 📊 Complete History - Converts all transactions without filtering to maintain accurate balances
  • ⚡ OFX v2.20 XML - Modern format compatible with Actual Budget and other finance apps
  • 📋 Comprehensive Logging - Production-ready logging with file rotation, JSON formatting, and monitoring support

🚀 Quick Start

🐳 Option 1: Docker (Recommended)

Official multi-architecture image - automatically built and published via GitHub Actions:

# Pull and run from Docker Hub (no installation required!)
docker pull spydisec/anzplus-ofx-converter:latest
docker run -d -p 8000:8000 spydisec/anzplus-ofx-converter:latest

# Open your browser to http://localhost:8000

Why Docker?

  • ✅ No Python installation required
  • ✅ Works on Windows, Mac, Linux
  • ✅ Multi-architecture support (amd64, arm64/Apple Silicon)
  • ✅ Production-ready configuration
  • ✅ Automatic health checks
  • ✅ Simple version pinning

Version Pinning (Production):

# Pin to specific version (recommended for stability)
docker run -d -p 8000:8000 spydisec/anzplus-ofx-converter:1.0.0

# Or always use latest
docker run -d -p 8000:8000 spydisec/anzplus-ofx-converter:latest

Custom Configuration:

docker run -d -p 8000:8000 \
  -e WORKERS=8 \
  -e LOG_LEVEL=debug \
  -e ENVIRONMENT=production \
  spydisec/anzplus-ofx-converter:latest

Docker Compose:

version: '3.8'
services:
  anzplus-ofx-converter:
    image: spydisec/anzplus-ofx-converter:latest
    ports:
      - "8000:8000"
    environment:
      - ENVIRONMENT=production
      - WORKERS=4
      - LOG_LEVEL=info
    restart: unless-stopped

Build Locally (Optional):

Most users should use the official image. For development:

# Windows
.\docker\build-local.ps1

# Linux/Mac
docker build -f docker/Dockerfile -t anzplus-ofx:local .

📚 Full Docker documentation: docker/README.md

🐍 Option 2: Python Installation

For local development or customization:

Installation

# Clone the repository
git clone https://github.com/spydisec/PDFtoOFX.git
cd PDFtoOFX

# Create and activate virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate  # On Linux/Mac
# or
.venv\Scripts\Activate.ps1  # On Windows

# Install dependencies
pip install -r requirements.txt

📚 For detailed installation instructions including Linux self-hosting, systemd service setup, and Nginx configuration, see INSTALLATION.md

Web Interface

Launch the elegant web app with drag-and-drop file upload:

python run_web.py

Then open http://localhost:8000 in your browser.

Web Interface Features:

  • 🎨 Minimalistic design with Tailwind CSS
  • πŸ“€ Drag-and-drop PDF upload
  • ⚑ Real-time conversion progress
  • πŸ“₯ One-click OFX download
  • πŸ”’ Secure (files processed in memory, auto-deleted after download)
  • πŸ“± Fully mobile responsive

Command Line Interface

python convert_pdf.py input.pdf output.ofx

Example:

python convert_pdf.py statement.pdf statement.ofx

Output:

Converting: statement.pdf
Output to: statement.ofx

Step 1: Extracting text from PDF...
  ✓ Extracted 5438 characters
Step 2: Parsing transactions...
  ✓ Found 26 transactions
  ✓ Date range: 2026-01-02 to 2026-01-22
  ✓ Opening balance: $3117.92
  ✓ Closing balance: $232.16
Step 3: Generating OFX file...
  ✓ Generated 8661 bytes of OFX data
Step 4: Writing OFX file...
  ✓ Saved to statement.ofx

✅ Conversion complete!
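
To convert a whole folder of statements, the documented CLI can be wrapped in a small script. The following is a minimal sketch, not part of the project: the helper names `build_commands` and `convert_all` and the folder names in the usage note are hypothetical, and it assumes it is run from the repository root so that `convert_pdf.py` is found.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_dir: Path, out_dir: Path) -> list[list[str]]:
    """Build one `convert_pdf.py input.pdf output.ofx` call per PDF in pdf_dir."""
    commands = []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        # Same file name as the statement, with an .ofx extension.
        ofx = out_dir / pdf.with_suffix(".ofx").name
        commands.append([sys.executable, "convert_pdf.py", str(pdf), str(ofx)])
    return commands


def convert_all(pdf_dir: Path, out_dir: Path) -> None:
    """Run the CLI once per statement and stop on the first failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for command in build_commands(pdf_dir, out_dir):
        subprocess.run(command, check=True)
```

Calling `convert_all(Path("statements"), Path("ofx"))` would then convert every PDF in a `statements/` folder into a matching file under `ofx/`.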

📥 Import to Actual Budget

  1. Open Actual Budget
  2. Navigate to your account
  3. Click Import → Select File
  4. Choose your .ofx file
  5. Review and approve transactions

Verify Duplicate Detection:
Re-import the same OFX file to confirm all transactions are detected as duplicates, validating the FITID strategy.
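
Duplicate detection works only if the same statement always yields the same IDs. Below is a minimal sketch of a sequential FITID scheme that produces IDs shaped like `ANZ_20260122_0001`. It is an illustration, not the project's `fitid_generator.py`: the function name `assign_fitids` is hypothetical, and restarting the counter for each posting date is an assumption (a statement-wide counter would fit the documented format equally well).

```python
from collections import defaultdict
from datetime import date


def assign_fitids(dates: list[date], prefix: str = "ANZ") -> list[str]:
    """Return one FITID per transaction, in statement order.

    Assumption: the counter restarts for every posting date.
    """
    counters: dict[date, int] = defaultdict(int)
    fitids = []
    for posted in dates:
        counters[posted] += 1
        fitids.append(f"{prefix}_{posted:%Y%m%d}_{counters[posted]:04d}")
    return fitids
```

Because the ID depends only on the date and the position within the statement, converting the same PDF twice gives identical FITIDs, which is what lets the importer flag the second import as duplicates.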

🏦 Supported Bank

This converter is specifically designed for ANZ Plus digital PDF statements.

βœ… Supported:

  • ANZ Plus digital PDF statements
  • Transaction format: Date, Description, Credit, Debit, Balance columns
  • Multi-line transaction details

❌ Not Supported:

  • Other ANZ products (ANZ Classic, ANZ Access, etc.) - different formats
  • Scanned/image PDFs - no OCR functionality
  • Other banks - requires bank-specific parsers
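
The supported layout (a dated first line with amount and balance, followed by optional continuation lines) can be parsed roughly as follows. This is a sketch under assumptions: the regular expression, the `RawTransaction` class, and `parse_lines` are hypothetical and are not taken from `anz_plus_parser.py`. A real parser would also need to skip page headers and footers, which this one would wrongly fold into the previous description.

```python
import re
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical pattern for the first line of a transaction:
# day, month abbreviation, description, amount, running balance.
FIRST_LINE = re.compile(
    r"^(?P<day>\d{1,2}) (?P<month>[A-Za-z]{3})\s+(?P<desc>.+?)\s+"
    r"\$(?P<amount>[\d,]+\.\d{2})\s+\$(?P<balance>[\d,]+\.\d{2})\s*$"
)


@dataclass
class RawTransaction:
    day: int
    month: str
    description: str
    amount: Decimal   # unsigned; the sign is decided later
    balance: Decimal  # running balance after this transaction


def parse_lines(lines: list[str]) -> list[RawTransaction]:
    transactions: list[RawTransaction] = []
    for line in lines:
        match = FIRST_LINE.match(line)
        if match:
            transactions.append(RawTransaction(
                day=int(match["day"]),
                month=match["month"],
                description=match["desc"].strip(),
                amount=Decimal(match["amount"].replace(",", "")),
                balance=Decimal(match["balance"].replace(",", "")),
            ))
        elif line.strip() and transactions:
            # Continuation line: append it to the previous description.
            transactions[-1].description += " " + line.strip()
    return transactions
```

Extracted text carries a single amount per row rather than separate Credit and Debit columns, so the amount stays unsigned here and the direction has to be worked out separately.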

📝 Transaction Example

ANZ Plus PDF:

22 Jan VISA DEBIT PURCHASE CARD 1633 MYKI        $25.00    $233.45
       PAYMENTS MELBOURNE

Generated OFX:

<STMTTRN>
  <TRNTYPE>DEBIT</TRNTYPE>
  <DTPOSTED>20260122000000.000[+0:UTC]</DTPOSTED>
  <TRNAMT>-25.00</TRNAMT>
  <FITID>ANZ_20260122_0001</FITID>
  <NAME>MYKI PAYMENTS MELBOURNE</NAME>
  <MEMO>VISA DEBIT PURCHASE CARD 1633 MYKI PAYMENTS MELBOURNE</MEMO>
</STMTTRN>

Improvements:

  • NAME: MYKI PAYMENTS MELBOURNE (merchant visible, smart truncation)
  • MEMO: Full description with all details
  • Type: Correctly identified as DEBIT using balance analysis
  • FITID: ANZ_20260122_0001 (unique, collision-free)
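
The NAME behaviour described above can be sketched as follows. The prefix list and the `make_name` helper are hypothetical, since this README does not show the project's actual list. The 32-character cap reflects the length the OFX specification allows for `<NAME>`.

```python
import re

# Hypothetical prefix list; the project's real list is not shown in this README.
COMMON_PREFIXES = [
    r"VISA DEBIT PURCHASE CARD \d{4}\s+",
    r"EFTPOS\s+",
    r"PAYMENT TO\s+",
]
PREFIX_RE = re.compile("^(?:" + "|".join(COMMON_PREFIXES) + ")", re.IGNORECASE)
NAME_MAX_LENGTH = 32  # OFX allows up to 32 characters in <NAME>


def make_name(description: str) -> str:
    """Strip one leading boilerplate prefix, then cut to the NAME limit."""
    stripped = PREFIX_RE.sub("", description.strip(), count=1)
    return stripped[:NAME_MAX_LENGTH].rstrip()
```

Removing the prefix before cutting matters: a plain 32-character cut of the full description would end partway through the merchant name and lose the location, while the full text still goes into MEMO.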

📂 Project Structure

PDFtoOFX/
├── README.md                    # This file
├── ARCHITECTURE.md              # Technical documentation
├── LICENSE                      # MIT license
├── pyproject.toml               # Package configuration
├── requirements.txt             # Dependencies
├── requirements-dev.txt         # Development dependencies
│
├── run_web.py                   # Web app launcher
├── convert_pdf.py               # CLI tool
│
├── app/                         # Application code
│   ├── models.py                # Pydantic data models
│   ├── services/                # Core business logic
│   │   ├── anz_plus_parser.py   # ANZ Plus PDF parser
│   │   ├── fitid_generator.py   # Unique ID generation
│   │   ├── ofx_generator.py     # OFX XML generation
│   │   └── pdf_extractor.py     # PDF text extraction
│   └── web/                     # Web application
│       ├── main.py              # FastAPI app
│       ├── routes.py            # API endpoints
│       └── templates/           # HTML templates
│
└── tests/                       # Test suite
    └── test_converter.py        # Unit & integration tests

🛠️ Requirements

  • Python: 3.11 or higher
  • Dependencies:
    • ofxtools>=0.9.6 - OFX v2.20 XML generation
    • pdfplumber>=0.11.9 - PDF text extraction
    • pydantic>=2.12.5 - Data validation
    • python-dateutil - Date parsing
    • fastapi>=0.109.0 - Web framework
    • uvicorn[standard]>=0.27.0 - ASGI server
    • jinja2>=3.1.3 - Template rendering
    • python-multipart>=0.0.6 - File uploads
    • aiofiles>=23.2.1 - Async file operations

🧪 Testing

Run the test suite:

# Run all tests
python -m pytest tests/ -v

# Run with coverage report
python -m pytest tests/ --cov=app --cov-report=html

Test Results:

✅ 6/6 tests passing
✅ 91% code coverage
✅ FITID collision detection verified
✅ Multi-line description capture validated
✅ Smart truncation tested
✅ End-to-end conversion verified

🔧 Development

Setup Development Environment

# Create virtual environment
python -m venv .venv

# Activate (Windows)
.venv\Scripts\activate

# Activate (Linux/Mac)
source .venv/bin/activate

# Install development dependencies
pip install -r requirements-dev.txt

Code Style

  • Follow PEP 8 guidelines
  • Use type hints for all functions
  • Write docstrings for public APIs
  • Maximum line length: 100 characters

Running the Web App in Development

# Auto-reload on code changes
python run_web.py

# Or use uvicorn directly
uvicorn app.web.main:app --reload --port 8000

Pre-commit Checks

# Format code
black app/ tests/

# Run linter
flake8 app/ tests/

# Type checking
mypy app/

# Run tests
pytest tests/ -v

🀝 Contributing

Contributions are welcome! Here's how:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (pytest tests/ -v)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Contribution Guidelines

  • Write tests for new features
  • Maintain or improve code coverage (currently 91%)
  • Follow existing code style and patterns
  • Update documentation as needed

📋 How It Works

See ARCHITECTURE.md for detailed technical documentation.

High-level flow:

  1. Extract text from PDF using pdfplumber
  2. Parse transactions with regex patterns (supports multi-line descriptions)
  3. Truncate smartly (remove common prefixes, preserve merchant names)
  4. Detect credit/debit using balance change analysis
  5. Generate unique FITIDs using sequential counter
  6. Create OFX v2.20 XML using ofxtools library
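
Step 4 can be illustrated with a short sketch. It assumes each parsed row carries an unsigned amount plus the running balance after the transaction, and that rows are in chronological order. The function name `signed_amounts` is hypothetical, and the project's own implementation may differ.

```python
from decimal import Decimal


def signed_amounts(
    opening: Decimal, rows: list[tuple[Decimal, Decimal]]
) -> list[Decimal]:
    """Sign each amount by comparing consecutive balances.

    rows holds (unsigned amount, balance after the transaction),
    assumed to be in chronological order.
    """
    signed = []
    previous = opening
    for amount, balance in rows:
        delta = balance - previous
        if delta == amount:
            signed.append(amount)    # balance went up: credit
        elif delta == -amount:
            signed.append(-amount)   # balance went down: debit
        else:
            raise ValueError(
                f"balance change {delta} does not match amount {amount}"
            )
        previous = balance
    return signed
```

Using `Decimal` keeps the comparison exact. With floats, a difference such as 258.45 - 25.00 may not compare equal to 233.45.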

⚠️ Known Limitations

  • Single Bank: Currently only supports ANZ Plus PDF format
  • Digital Only: Does not support scanned/image PDFs (no OCR)
  • Date Handling: Uses current year if not found in PDF
  • Manual Categorization: Transactions import uncategorized
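
The date-handling limitation is easiest to see in code. The sketch below parses a day and month such as `22 Jan` and falls back to the current year when the statement year is unknown. `parse_statement_date` and its parameters are hypothetical names, and `%b` assumes English month abbreviations in the active locale.

```python
from datetime import date, datetime
from typing import Optional


def parse_statement_date(
    day_month: str,
    statement_year: Optional[int] = None,
    today: Optional[date] = None,
) -> date:
    """Parse text like '22 Jan', using the current year when none is known."""
    if statement_year is None:
        # Fallback described under Known Limitations.
        statement_year = (today or date.today()).year
    return datetime.strptime(f"{day_month} {statement_year}", "%d %b %Y").date()
```

With this fallback, a December statement converted the following January would be stamped with the new year, so it is worth checking the date range in the CLI output when converting older statements.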

🚀 Deployment Options

🐳 Docker (Recommended for Production)

Pull from Docker Hub:

docker pull spydisec/anzplus-ofx-converter:latest
docker run -d -p 8000:8000 --restart unless-stopped spydisec/anzplus-ofx-converter:latest

Using Docker Compose:

version: '3.8'
services:
  anzplus-ofx:
    image: spydisec/anzplus-ofx-converter:latest
    ports:
      - "8000:8000"
    environment:
      - WORKERS=4
      - LOG_LEVEL=info
    restart: unless-stopped

Multi-Architecture Support:

  • Automatically pulls correct image for your platform
  • Supports: linux/amd64 (Intel/AMD), linux/arm64 (Apple Silicon, ARM)

📚 Complete Docker guide: docker/README.md

Local Development

python run_web.py

Cloud Platforms

Container-based (Docker):

  • Render.com: Free tier available, auto-deploy from GitHub
  • Railway.app: $5/month, easy Docker deployment
  • Fly.io: Pay-as-you-go, multi-region, global edge
  • AWS ECS/Fargate: Enterprise-grade container orchestration
  • Google Cloud Run: Serverless containers, auto-scaling

Traditional hosting:

🐛 Troubleshooting

Web app won't start

# Check if port 8000 is in use (Windows)
netstat -ano | findstr :8000

# Check if port 8000 is in use (Linux/Mac)
lsof -i :8000

# Use different port
uvicorn app.web.main:app --port 3000

PDF conversion fails

  • Ensure PDF is from ANZ Plus (not ANZ Classic/Access)
  • Verify PDF is not scanned/image-based
  • Check PDF is not password-protected

Balance mismatch in Actual Budget

  • All transactions are now preserved (including ROUND UP)
  • Verify opening and closing balances match your PDF
  • Check for any filtered transactions
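
One way to run the balance check outside Actual Budget is to sum the `<TRNAMT>` values in the generated file and compare the result with the balances printed by the CLI. This is a rough sketch with hypothetical helper names. It reads amounts with a regular expression instead of a full OFX parser, so it assumes each amount sits in a simple `<TRNAMT>…</TRNAMT>` element as in the example output.

```python
import re
from decimal import Decimal

TRNAMT_RE = re.compile(r"<TRNAMT>\s*([-+]?\d+(?:\.\d+)?)\s*</TRNAMT>")


def ofx_amounts(ofx_text: str) -> list[Decimal]:
    """Collect every signed transaction amount found in the OFX text."""
    return [Decimal(value) for value in TRNAMT_RE.findall(ofx_text)]


def balance_gap(opening: Decimal, closing: Decimal, amounts: list[Decimal]) -> Decimal:
    """Return closing - (opening + sum of amounts); zero means consistent."""
    return closing - (opening + sum(amounts, Decimal("0")))
```

A non-zero gap suggests that a transaction is missing or has the wrong sign. Its size often points to the specific row.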

📄 License

This project is licensed under the MIT License - see LICENSE file for details.

🙏 Acknowledgments

📞 Support


Note: This project is not affiliated with ANZ Bank. It is an independent tool created for personal finance management.

Made with ❤️ for the Actual Budget community
