# Add comprehensive test suite infrastructure for numta #25
## Conversation
Co-authored-by: deepentropy <8287111+deepentropy@users.noreply.github.com>
## Pull request overview
This PR implements a comprehensive test suite infrastructure for the numta library to ensure reliability after recent feature additions (pandas support, visualization, pattern recognition, and streaming indicators). It adds 635 new tests across multiple testing dimensions.
Key Changes:
- Unit testing infrastructure with auto-discovered parametrized tests for 100+ numta functions
- Performance benchmarking framework comparing numta against TA-Lib and pandas-ta
- Accuracy testing framework with multiple data type generators and detailed metrics
- Comprehensive feature tests for pandas extensions and streaming indicators
- Enhanced CI/CD pipeline with coverage reporting and benchmark jobs
## Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 28 comments.
| File | Description |
|---|---|
| `tests/conftest.py` | Shared pytest fixtures for OHLCV data generation, edge cases, and conditional import handling with custom marker registration |
| `tests/unit/test_all_functions.py` | Auto-discovered parametrized tests for all numta functions with edge case handling and output validation |
| `tests/benchmark/benchmark_runner.py` | Performance testing framework with `BenchmarkRunner` class for comparing implementations and generating reports |
| `tests/accuracy/test_accuracy.py` | Accuracy testing framework with `AccuracyMetrics` dataclass and comparison against TA-Lib/pandas-ta |
| `tests/test_pandas_ext_comprehensive.py` | 36 comprehensive tests for the pandas `.ta` accessor, including edge cases and result validation |
| `tests/test_streaming_comprehensive.py` | 41 tests for streaming indicators covering batch equivalence, reset functionality, and circular buffer operations |
| `pyproject.toml` | Added pytest-cov dependency and configured pytest markers, coverage settings, and test exclusions |
| `.github/workflows/test.yml` | Enhanced CI workflow with separate unit test and benchmark jobs, coverage reporting, and workflow permissions |
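The shared fixtures described for `tests/conftest.py` might follow a pattern like this sketch; the function name, seed, and random-walk generator are assumptions for illustration, not the PR's actual code:

```python
import numpy as np
import pytest


def make_ohlcv(n: int = 252, seed: int = 42):
    """Generate a random-walk OHLCV dict of float64 arrays (illustrative)."""
    rng = np.random.default_rng(seed)
    close = 100.0 + np.cumsum(rng.normal(0, 1, n))
    high = close + rng.uniform(0.1, 1.0, n)   # always at or above close
    low = close - rng.uniform(0.1, 1.0, n)    # always at or below close
    open_ = low + (high - low) * rng.uniform(0, 1, n)
    volume = rng.uniform(1e4, 1e6, n)
    return {"open": open_, "high": high, "low": low,
            "close": close, "volume": volume}


@pytest.fixture
def ohlcv_data():
    """Shared fixture: one year of synthetic daily OHLCV bars."""
    return make_ohlcv()
```

Keeping the generator a plain function (with the fixture as a thin wrapper) lets benchmark and accuracy code reuse it outside pytest.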
```python
# Correlation coefficient
if np.std(valid_a) > 0 and np.std(valid_b) > 0:
    correlation = float(np.corrcoef(valid_a, valid_b)[0, 1])
```
**Copilot AI** · Dec 2, 2025
The correlation coefficient calculation can still produce an invalid result even with the guard in place. Although the code checks `np.std(valid_a) > 0 and np.std(valid_b) > 0` before calling `np.corrcoef`, floating-point edge cases can slip past the standard-deviation check and leave `corrcoef` with numerical issues (typically a `nan` entry plus a `RuntimeWarning`). Consider wrapping the correlation calculation in a try-except block, with a sensible fallback, for robustness.
**Suggested change:**

```python
try:
    correlation = float(np.corrcoef(valid_a, valid_b)[0, 1])
except Exception:
    correlation = 1.0 if mae < 1e-10 else 0.0
```
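The failure mode is easy to reproduce: for a constant input, `np.corrcoef` does not raise but returns `nan` (alongside a `RuntimeWarning`), so a try/except alone would not catch it and the `nan` result still needs explicit handling:

```python
import warnings

import numpy as np

a = np.ones(10)        # constant series: zero standard deviation
b = np.arange(10.0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    r = np.corrcoef(a, b)[0, 1]

print(np.isnan(r))  # True: nan is returned, no exception is raised
```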
```python
valid_values = {-100, 0, 100}
unique_values = set(np.unique(result[~np.isnan(result)]).astype(int))
assert unique_values.issubset(valid_values), \
    f"{func_name} returned unexpected values: {unique_values}"
```
The assertion checks that the unique values are a subset of {-100, 0, 100}, but it first truncates float results to integers with `astype(int)`. This can mask issues where the function returns non-integer values such as 100.4 or -100.3, which truncate onto a valid value and pass the check. Consider checking the dtype, or using a more precise validation that ensures the values are actually integers, or at least very close to the expected integer values.
**Suggested change:**

```python
valid_values = np.array([-100, 0, 100])
non_nan_values = np.unique(result[~np.isnan(result)])
# Check that each unique value is close to one of the valid values
for val in non_nan_values:
    assert np.any(np.isclose(val, valid_values, atol=1e-8)), \
        f"{func_name} returned unexpected value: {val}"
# Optionally, check that all non-NaN values are integer-valued
assert np.all(np.mod(non_nan_values, 1) == 0), \
    f"{func_name} returned non-integer values: {non_nan_values}"
```
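The masking comes from `astype(int)` truncating toward zero, so a near-miss like `100.4` silently collapses onto a valid signal value while an `isclose`-based check catches it:

```python
import numpy as np

result = np.array([100.4, -100.2, 0.0])  # not exact signal values

# truncation-based check: passes even though the values are wrong
unique_values = set(np.unique(result).astype(int))
print(unique_values.issubset({-100, 0, 100}))  # True

# isclose-based check: correctly rejects the drifted values
valid = np.array([-100, 0, 100])
ok = all(np.any(np.isclose(v, valid, atol=1e-8)) for v in np.unique(result))
print(ok)  # False
```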
```yaml
uses: codecov/codecov-action@v4
with:
  files: ./coverage.xml
  fail_ci_if_error: false
```
The workflow uses codecov/codecov-action@v4 but doesn't provide the required token parameter. While the action has fail_ci_if_error: false and continue-on-error: true to prevent CI failures, codecov uploads typically require a token for private repositories. Consider adding the token via secrets or documenting that this only works for public repositories.
**Suggested change:**

```yaml
fail_ci_if_error: false
token: ${{ secrets.CODECOV_TOKEN }}
```
```yaml
- name: Run tests with coverage
  if: matrix.python-version == '3.12'
  run: |
    pip install pytest-cov
```
The pytest-cov package is already added to the dev dependencies at line 38, so the `pip install pytest-cov` command in the workflow is redundant. The package should already be installed by the `pip install -e ".[dev]"` step on line 34.
**Suggested change:** remove the redundant `pip install pytest-cov` line.
```python
'STOCHF': {'inputs': ['high', 'low', 'close'], 'params': {
    'fastk_period': 5, 'fastd_period': 3, 'fastd_matype': 0
}},
'STOCHRSI': {'inputs': ['close'], 'params': {'timeperiod': 14}},
```
Both MULTI_OUTPUT_FUNCTIONS and the TA-Lib documentation agree that STOCHRSI returns two outputs (fastk and fastd). However, the FUNCTION_SIGNATURES entry on line 92 only passes close with timeperiod, omitting the fastk_period, fastd_period, and fastd_matype parameters that STOCHRSI also accepts. This could cause the test to fail or to compare against mismatched default parameters.
**Suggested change:**

```python
'STOCHRSI': {'inputs': ['close'], 'params': {'timeperiod': 14, 'fastk_period': 5, 'fastd_period': 3, 'fastd_matype': 0}},
```
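To see why the signature entry matters, here is a sketch of how a FUNCTION_SIGNATURES entry can drive a parametrized call; the dict shape mirrors the snippet above, but `fake_stochrsi` and `call_from_signature` are illustrative stand-ins, not numta's actual code:

```python
import numpy as np

# hypothetical stand-in for an indicator with STOCHRSI-like keyword params
def fake_stochrsi(close, timeperiod=14, fastk_period=5,
                  fastd_period=3, fastd_matype=0):
    fastk = np.full_like(close, 50.0)
    fastd = np.full_like(close, 50.0)
    return fastk, fastd

FUNCTION_SIGNATURES = {
    'STOCHRSI': {'inputs': ['close'],
                 'params': {'timeperiod': 14, 'fastk_period': 5,
                            'fastd_period': 3, 'fastd_matype': 0}},
}

def call_from_signature(func, name, data):
    """Build positional inputs and keyword params from the signature dict."""
    sig = FUNCTION_SIGNATURES[name]
    args = [data[k] for k in sig['inputs']]
    return func(*args, **sig['params'])

data = {'close': np.linspace(100, 110, 50)}
out = call_from_signature(fake_stochrsi, 'STOCHRSI', data)
print(len(out))  # 2 outputs: fastk, fastd
```

If the `params` dict omits keyword arguments, the call silently falls back to the callee's defaults, which may differ from what the reference implementation was run with.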
```python
    HAS_NUMBA = False

try:
    import talib
```
Import of 'talib' is not used.
```python
    HAS_TALIB = False

try:
    import pandas_ta
```
Import of 'pandas_ta' is not used.
```python
import pytest
import numpy as np
from dataclasses import dataclass
from typing import Dict, List, Callable, Optional, Tuple
```
Imports of 'Dict', 'List', 'Callable', and 'Optional' are not used.
**Suggested change:**

```python
from typing import Tuple
```
```python
from numta.streaming import (
    # Base classes
    StreamingIndicator,
    CircularBuffer,
    # Overlap
    StreamingSMA,
    StreamingEMA,
    StreamingBBANDS,
    StreamingDEMA,
    StreamingTEMA,
    StreamingWMA,
    # Momentum
    StreamingRSI,
    StreamingMACD,
    StreamingSTOCH,
    StreamingMOM,
    StreamingROC,
    # Volatility
    StreamingATR,
    StreamingTRANGE,
    # Volume
    StreamingOBV,
    StreamingAD,
)
```
Imports of 'StreamingIndicator', 'StreamingMACD', 'StreamingSTOCH', and 'StreamingAD' are not used.
```python
    SMA, EMA, BBANDS, DEMA, TEMA, WMA,
    RSI, MACD, STOCH, MOM, ROC,
    ATR, TRANGE,
    OBV, AD,
```
Imports of 'DEMA', 'TEMA', 'RSI', 'MACD', 'STOCH', 'MOM', 'ROC', 'ATR', 'TRANGE', 'OBV', and 'AD' are not used.
**Suggested change:**

```python
    SMA, EMA, BBANDS, WMA,
```
Implements test infrastructure to harden the library after recent feature additions (pandas support, visualization, pattern recognition, streaming).
## Changes

**Unit Testing Infrastructure**
- `tests/conftest.py`: Shared fixtures for OHLCV data (numpy/pandas), edge cases (empty, NaN, inf, constant), and conditional import handling (`HAS_PANDAS`, `HAS_NUMBA`, `HAS_TALIB`)
- `tests/unit/test_all_functions.py`: Auto-discovered parametrized tests for all 100+ numta functions with a `FUNCTION_SIGNATURES` mapping, edge case handling, and output type validation
**Performance Testing Framework**

- `tests/benchmark/benchmark_runner.py`: `BenchmarkRunner` class with `BenchmarkResult`/`ComparisonResult` dataclasses for comparing numta vs TA-Lib vs pandas-ta, with report generation
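A minimal version of the timing comparison such a runner performs can be sketched as follows; the function names and the two SMA variants are illustrative, not the PR's API:

```python
import time

import numpy as np


def benchmark(func, *args, repeat: int = 5) -> float:
    """Return the best wall-clock time over `repeat` runs, in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def sma_naive(x, n=20):
    # recompute each window mean from scratch: O(len * n)
    return np.array([x[i - n:i].mean() for i in range(n, len(x) + 1)])


def sma_vectorized(x, n=20):
    # cumulative-sum trick: O(len)
    c = np.cumsum(np.concatenate(([0.0], x)))
    return (c[n:] - c[:-n]) / n


x = np.random.default_rng(0).normal(size=100_000)
t_naive = benchmark(sma_naive, x)
t_vec = benchmark(sma_vectorized, x)
print(f"speedup: {t_naive / t_vec:.1f}x")
```

Taking the minimum over several runs (rather than the mean) is a common way to reduce noise from background load.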
**Accuracy Testing Framework**

- `tests/accuracy/test_accuracy.py`: `AccuracyMetrics` dataclass (MAE, RMSE, correlation) with data generators for random/trending/cyclical/volatile patterns
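An `AccuracyMetrics`-style comparison might look like this sketch; the field names match those listed above, but the `compare` helper and its NaN-masking behavior are assumptions:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class AccuracyMetrics:
    mae: float          # mean absolute error
    rmse: float         # root mean squared error
    correlation: float  # Pearson correlation of the valid region


def compare(a: np.ndarray, b: np.ndarray) -> AccuracyMetrics:
    """Compare two indicator outputs over their mutually non-NaN region."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    va, vb = a[mask], b[mask]
    diff = va - vb
    mae = float(np.mean(np.abs(diff)))
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    if np.std(va) > 0 and np.std(vb) > 0:
        correlation = float(np.corrcoef(va, vb)[0, 1])
    else:
        # constant series: treat exact agreement as perfect correlation
        correlation = 1.0 if mae < 1e-10 else 0.0
    return AccuracyMetrics(mae, rmse, correlation)
```

The constant-series fallback mirrors the one discussed in the review comments above: `np.corrcoef` yields `nan` when either input has zero variance.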
**New Feature Tests**

- `tests/test_pandas_ext_comprehensive.py`: 36 tests for the `.ta` accessor — registration, index preservation, edge cases, results matching
- `tests/test_streaming_comprehensive.py`: 41 tests for streaming indicators — batch equivalence, reset functionality, circular buffer ops
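The batch-equivalence property those streaming tests check can be sketched with a toy incremental SMA; this class is illustrative (a `deque` standing in for numta's `CircularBuffer`), not the library's actual implementation:

```python
from collections import deque

import numpy as np


class StreamingSMAExample:
    """Incremental SMA over a fixed window (illustrative stand-in)."""

    def __init__(self, period: int):
        self.period = period
        self.buffer = deque(maxlen=period)  # acts as a circular buffer

    def update(self, value: float) -> float:
        self.buffer.append(value)
        if len(self.buffer) < self.period:
            return float("nan")             # warm-up: not enough data yet
        return sum(self.buffer) / self.period

    def reset(self) -> None:
        self.buffer.clear()


# batch equivalence: feeding values one by one matches the batch SMA
prices = np.random.default_rng(1).normal(100, 2, 50)
stream = StreamingSMAExample(10)
streamed = np.array([stream.update(p) for p in prices])
batch = np.convolve(prices, np.ones(10) / 10, mode="valid")
print(np.allclose(streamed[9:], batch))  # True
```

Reset behavior is tested the same way: after `reset()`, the indicator must reproduce the warm-up NaNs before emitting values again.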
**CI/CD**

- `.github/workflows/test.yml`: Added benchmark job, coverage reporting, workflow permissions
- `pyproject.toml`: Added pytest-cov, custom markers, coverage configuration

**Usage**
+635 new tests (974 total, all passing)
Original prompt
This pull request was created as a result of the following prompt from Copilot chat.