Add lazy EMD solver with O(n) memory requirement #788
- Implement emd_c_lazy in C++ network simplex for memory-efficient OT
- Add lazy mode to emd2() accepting coordinates (X_a, X_b) instead of cost matrix
- Support sqeuclidean, euclidean, and cityblock metrics
- Add __restrict__ for SIMD optimization
- Remove debug output from network_simplex_simple.h
- Add tests for lazy solver and metric variants
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##           master     #788       +/-   ##
===========================================
- Coverage   97.07%   84.97%   -12.10%
===========================================
  Files         107      107
  Lines       22156    22249       +93
===========================================
- Hits        21507    18906     -2601
- Misses        649     3343     +2694
rflamary
left a comment
Nearly there. This is a wonderful PR that will allow solvers to run on very large data (at the cost of computing time).
ot/lp/_network_simplex.py
Outdated
    metric="sqeuclidean",
    numItermax=100000,
    log=False,
    return_matrix=False,
Since it is sparse, it could be True by default.
The solver here is only for dense problems? If so, I am not sure why it should be True by default.
Add lazy EMD solver with on-the-fly distance computation
Types of changes
Motivation and context / Related issue
Addresses memory limitations when computing OT with large point clouds. Instead of pre-computing and storing the full n×m cost matrix, the lazy solver computes distances on-the-fly during the network simplex algorithm. This reduces memory from O(nm) to O(n+m) while maintaining exact EMD solutions.
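The on-the-fly idea can be illustrated with a minimal NumPy sketch. This is not the PR's C++ `emd_c_lazy` code: `lazy_cost` is a hypothetical helper introduced only for illustration, and the array shapes are assumptions. It shows a single cost entry being computed from the coordinates when requested, for the three metrics listed in the PR, and checks it against the dense matrix that the lazy approach avoids storing.

```python
import numpy as np


def lazy_cost(X_a, X_b, i, j, metric="sqeuclidean"):
    """Cost between source point i and target point j, computed on demand.

    Hypothetical helper for illustration; not part of the PR or of POT.
    """
    diff = X_a[i] - X_b[j]
    if metric == "sqeuclidean":
        return float(diff @ diff)
    if metric == "euclidean":
        return float(np.sqrt(diff @ diff))
    if metric == "cityblock":
        return float(np.abs(diff).sum())
    raise ValueError(f"unsupported metric: {metric}")


rng = np.random.default_rng(0)
X_a = rng.normal(size=(5, 3))  # n = 5 source points in 3 dimensions
X_b = rng.normal(size=(4, 3))  # m = 4 target points in 3 dimensions

# Dense reference: the full n x m squared-Euclidean matrix, built here only
# to check the lazy entry. A lazy solver would never materialize it.
M_dense = ((X_a[:, None, :] - X_b[None, :, :]) ** 2).sum(axis=-1)

# A single entry is recomputed from coordinates whenever it is needed.
c_21 = lazy_cost(X_a, X_b, 2, 1)
print(np.isclose(c_21, M_dense[2, 1]))
```

The trade-off matches the review comment above: each cost lookup costs O(d) arithmetic instead of one memory read, so the solver uses less memory but more computing time.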
How has this been tested
test_solvers.py: test_solve_sample_lazy and test_solve_sample_lazy_emd
PR checklist