LightDepth is a lightweight monocular depth estimation model built on a ResNet18 encoder-decoder architecture. It demonstrates the core ideas of depth estimation from single RGB images without the complexity of production-scale models, providing a clear, straightforward implementation that is ideal for learning about feature extraction, multi-scale processing, and depth prediction in computer vision. The project was benchmarked against Depth Anything V2 on the NYU Depth V2 test set and achieves competitive results with significantly fewer parameters and faster inference.
- Simple Architecture: ResNet18 encoder with U-Net style decoder. Uses pretrained ImageNet weights.
- Easy Training: Straightforward training script with minimal configuration.
- L1 Loss: Simple and effective loss function.
- Multiple Metrics: RMSE, MAE, AbsRel, and SqRel metrics.
- Fast Inference: 72% faster than Depth Anything V2.
- Compact Model: 42% fewer parameters than Depth Anything V2 small (14.3M vs 24.8M).
Download the NYU Depth v2 dataset from Kaggle.
Place the dataset in the `data/nyu/` directory with the following structure:
```
data/nyu/
├── nyu2_train.csv
├── nyu2_test.csv
├── nyu2_train/
│   ├── *.jpg
│   └── *.png
└── nyu2_test/
    ├── *.jpg
    └── *.png
```
Download the trained model checkpoint from Google Drive.
Training Infrastructure:
- GPU: RTX 5080 16GB
- CPU: AMD Ryzen 7 9800X3D
- RAM: 32GB
- OS: Windows 11
- Epochs: 50
- Dataset: NYU Depth v2
For detailed methodology, experiments, and analysis, see: Project Report
- Encoder: ResNet18 (pretrained on ImageNet)
- Decoder: 4-stage upsampling with skip connections
- Channels: [64, 64, 128, 256, 512] → [256, 128, 64, 32] → 1
- Loss: L1 (Mean Absolute Error)
- Metrics: RMSE, MAE, AbsRel, SqRel
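The full model lives in `src/lightdepth/models/`. The sketch below only illustrates how a ResNet18 encoder and a U-Net style decoder with these channel counts ([64, 64, 128, 256, 512] → [256, 128, 64, 32] → 1) can be wired together; it may differ in detail from the repository's code.

```python
# Illustrative encoder-decoder wiring; the actual modules live in
# src/lightdepth/models/ (encoder.py, decoder.py, lightdepth.py).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights


class UpBlock(nn.Module):
    """Upsample to the skip feature's size, concatenate, then fuse with a conv."""
    def __init__(self, c_in: int, c_skip: int, c_out: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(c_in + c_skip, c_out, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([x, skip], dim=1))


class TinyDepthNet(nn.Module):
    """ResNet18 encoder + U-Net style decoder producing a 1-channel depth map."""
    def __init__(self):
        super().__init__()
        r = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)  # pretrained ImageNet weights
        # Encoder stages with 64, 64, 128, 256, 512 output channels.
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)      # 1/2 res, 64 ch
        self.enc1 = nn.Sequential(r.maxpool, r.layer1)         # 1/4 res, 64 ch
        self.enc2, self.enc3, self.enc4 = r.layer2, r.layer3, r.layer4  # 1/8, 1/16, 1/32
        # Decoder stages: 256, 128, 64, 32 channels, then a 1-channel head.
        self.dec4 = UpBlock(512, 256, 256)
        self.dec3 = UpBlock(256, 128, 128)
        self.dec2 = UpBlock(128, 64, 64)
        self.dec1 = UpBlock(64, 64, 32)
        self.head = nn.Conv2d(32, 1, kernel_size=3, padding=1)

    def forward(self, x):
        s0 = self.stem(x)
        s1 = self.enc1(s0)
        s2 = self.enc2(s1)
        s3 = self.enc3(s2)
        s4 = self.enc4(s3)
        d = self.dec4(s4, s3)
        d = self.dec3(d, s2)
        d = self.dec2(d, s1)
        d = self.dec1(d, s0)
        depth = self.head(d)  # 1-channel depth at half the input resolution
        return F.interpolate(depth, size=x.shape[2:], mode="bilinear", align_corners=False)
```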
LightDepth vs Depth Anything V2 on NYU Depth v2 test set:
| Metric | LightDepth | Depth Anything V2 | Winner |
|---|---|---|---|
| Parameters | 14,330,369 | 24,785,089 | LightDepth (42% fewer) |
| RMSE ↓ | 2.9477 | 2.8074 | Depth Anything V2 (5.0%) |
| MAE ↓ | 2.6758 | 2.2360 | Depth Anything V2 (19.7%) |
| Absolute Relative Error ↓ | 0.9724 | 1.0272 | LightDepth (5.3%) |
| Squared Relative Error ↓ | 2.6063 | 3.7849 | LightDepth (31.1%) |
| Total Inference Time ↓ | 5 seconds | 18 seconds | LightDepth (72.2% faster) |
Key Findings:
- LightDepth achieves competitive performance with 42% fewer parameters
- Significantly faster inference time (72% improvement)
- Better performance on relative error metrics
- Slight trade-off in absolute error metrics (RMSE, MAE)
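The relative metrics (AbsRel, SqRel) normalize the error by the ground-truth depth, while RMSE and MAE measure absolute error. A sketch of how the four reported metrics can be computed is shown below; the project's own implementation is in `src/lightdepth/utils/metrics.py`.

```python
# Sketch of the reported metrics (RMSE, MAE, AbsRel, SqRel); the project's
# implementation lives in src/lightdepth/utils/metrics.py.
import torch


def depth_metrics(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> dict:
    """Compute depth error metrics between predicted and ground-truth depth maps."""
    valid = target > eps                   # ignore invalid / zero-depth pixels
    pred, target = pred[valid], target[valid]
    diff = pred - target
    return {
        "rmse":   torch.sqrt(torch.mean(diff ** 2)).item(),   # root mean squared error
        "mae":    torch.mean(torch.abs(diff)).item(),         # mean absolute error
        "absrel": torch.mean(torch.abs(diff) / target).item(),# absolute relative error
        "sqrel":  torch.mean(diff ** 2 / target).item(),      # squared relative error
    }
```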
Sample depth predictions on NYU Depth V2 test set:
Each comparison grid shows, from left to right, the input RGB image, the ground-truth depth map, the LightDepth prediction, and the Depth Anything V2 prediction for three sample scenes.
- Python 3.14
- PyTorch
- CUDA-capable GPU (recommended)
- Clone the repository:

```bash
git clone https://github.com/suxrobGM/lightdepth.git
cd lightdepth
```

- Install dependencies either via PDM or pip:

```bash
# Using PDM (Recommended)
pdm install

# Or using pip
pip install -r requirements.txt
```

> **Note:** You can install PDM via `pip install pdm` if you don't have it already.
You can train the model from scratch or use a pretrained model from the link above.
```bash
# Using PDM
pdm train

# Or directly with Python
python scripts/train.py --config config.yaml
```

Options:
- `--config`: Path to configuration file (default: `config.yaml`)
- `--resume`: Resume training from checkpoint file (optional)

The model saves the best checkpoint to `checkpoints/best_model.pth`.
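At its core, training optimizes the L1 objective over the NYU pairs and keeps whichever checkpoint has the lowest validation error. The sketch below is illustrative only: the optimizer choice and learning rate are placeholders, and `scripts/train.py` together with `config.yaml` defines the real settings.

```python
# Illustrative training loop; scripts/train.py is the full, configurable version,
# and the optimizer / learning rate below are placeholders (see config.yaml).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader


def train(model: torch.nn.Module, train_loader: DataLoader, val_loader: DataLoader,
          epochs: int = 50, lr: float = 1e-4, device: str = "cuda") -> None:
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val = float("inf")

    for _ in range(epochs):
        model.train()
        for rgb, depth in train_loader:
            rgb, depth = rgb.to(device), depth.to(device)
            loss = F.l1_loss(model(rgb), depth)  # L1 / mean absolute error objective
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Keep the checkpoint with the lowest validation L1 (checkpoints/best_model.pth).
        model.eval()
        with torch.no_grad():
            val = sum(F.l1_loss(model(r.to(device)), d.to(device)).item()
                      for r, d in val_loader) / max(len(val_loader), 1)
        if val < best_val:
            best_val = val
            torch.save(model.state_dict(), "checkpoints/best_model.pth")
```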
Evaluate a trained model on the test set:
```bash
# Using PDM
pdm eval --checkpoint checkpoints/best_model.pth

# Or directly with Python
python scripts/eval.py --checkpoint checkpoints/best_model.pth --config config.yaml
```

> **Note:** If you downloaded the pretrained model, use its checkpoint path. For example: `--checkpoint checkpoints/lightdepth.pth`

Options:
- `--checkpoint`: Path to model checkpoint (required)
- `--config`: Path to configuration file
Run inference on a single image:
```bash
# Using PDM
pdm infer --checkpoint checkpoints/best_model.pth --input image.png --output depth.png

# Or directly with Python
python scripts/infer.py --checkpoint checkpoints/best_model.pth --input image.png --output depth.png
```

> **Note:** If you downloaded the pretrained model, use its checkpoint path. For example: `--checkpoint checkpoints/lightdepth.pth`

Options:
- `--checkpoint`: Path to model checkpoint (required)
- `--input`: Path to input RGB image (required)
- `--output`: Path to save depth map (default: `output/depth.png`)
- `--colormap`: Colormap for visualization (`plasma`, `viridis`, `magma`, `inferno`, `gray`). Use `gray` for grayscale output.
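For programmatic use, inference reduces to loading the checkpoint, running a forward pass, and colorizing the result. The sketch below is only a rough equivalent of `scripts/infer.py`; the exact preprocessing (resizing, normalization) is defined by the project's transforms and may differ.

```python
# Rough programmatic equivalent of scripts/infer.py; the actual preprocessing
# is defined in src/lightdepth/data/transforms.py and the script itself.
import matplotlib.pyplot as plt
import torch
from PIL import Image
from torchvision import transforms


def predict_depth(model: torch.nn.Module, checkpoint: str, image_path: str,
                  output_path: str = "output/depth.png", colormap: str = "plasma") -> None:
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.load_state_dict(torch.load(checkpoint, map_location=device))
    model.to(device).eval()

    rgb = transforms.ToTensor()(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        depth = model(rgb).squeeze().cpu().numpy()

    plt.imsave(output_path, depth, cmap=colormap)  # mirrors the --colormap option
```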
Compare LightDepth with Depth Anything V2:
```bash
pdm compare --checkpoint checkpoints/best_model.pth
```

> **Note:** If you downloaded the pretrained model, use its checkpoint path. For example: `--checkpoint checkpoints/lightdepth.pth`

Options:
- `--config`: Path to configuration file
- `lightdepth-checkpoint`: Path to LightDepth model checkpoint (required)
- `--dav2-model`: Depth Anything V2 model. Applicable options: `depth-anything/Depth-Anything-V2-Small-hf`, `depth-anything/Depth-Anything-V2-Base`, `depth-anything/Depth-Anything-V2-Large` (default: `depth-anything/Depth-Anything-V2-Small-hf`)
- `--output`: Path to save comparison results JSON (default: `output/comparison_results.json`)
- `--visualize`: Number of samples to visualize (`0` to disable)
Edit `config.yaml` to change training settings; its parameters are well documented for easy customization.
```
lightdepth/
├── src/lightdepth/
│   ├── models/              # Model architectures
│   │   ├── encoder.py       # ResNet18 encoder
│   │   ├── decoder.py       # U-Net decoder
│   │   └── lightdepth.py    # Complete model
│   ├── data/                # Data loading
│   │   ├── dataset.py       # NYU dataset
│   │   ├── dataloader.py    # DataLoader
│   │   └── transforms.py    # Data augmentations
│   └── utils/               # Utilities
│       ├── config.py        # Configuration
│       ├── losses.py        # L1 loss
│       ├── metrics.py       # Evaluation metrics (RMSE, MAE, AbsRel, SqRel)
│       └── visualization.py # Visualization utilities
├── scripts/
│   ├── train.py             # Training script
│   ├── eval.py              # Evaluation script
│   ├── infer.py             # Inference script
│   └── compare_models.py    # Comparison script with Depth Anything V2
├── config.yaml              # Default training and inference configuration
├── requirements.txt         # Python dependencies for pip
└── pyproject.toml           # Python dependencies for PDM
```
This project is licensed under the MIT License - see the LICENSE file for details.
- Sukhrobbek Ilyosbekov
- CS 7180 Advanced Perception
If you use LightDepth in your research or project, please cite:
```bibtex
@misc{lightdepth2025,
  title={LightDepth: Lightweight Depth Estimation},
  author={Sukhrobbek Ilyosbekov},
  year={2025},
  url={https://github.com/suxrobGM/lightdepth}
}
```

This project was inspired by and compared against Depth Anything V2:

```bibtex
@article{depth_anything_v2,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv:2406.09414},
  year={2024},
  url={https://arxiv.org/abs/2406.09414}
}
```