andyzorigin/data_contamination
Usage:
python compute_contamination_metrics.py --input-data <input_data> --scenario-data <scenario_data> --output-stats <output_stats> --input-format <input_format>
For example, to run it against The Pile, set the arguments as follows:
input_data = 00.jsonl (download from https://pile.eleuther.ai/)
scenario_data = scenario data file (an example is included with the repo; you can also use HELM to generate one)
output_stats = any output file name, e.g. "output_stats"
input_format = the_pile
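
As a rough sketch, the example values above can be assembled into the full command from Python. The helper name build_command and the placeholder path/to/scenario_data are hypothetical and not part of the repo; substitute the path of the scenario data file you actually use.

```python
import shlex


def build_command(input_data, scenario_data, output_stats, input_format):
    """Assemble the argument list for compute_contamination_metrics.py.

    build_command is an illustrative helper, not something the repo provides.
    """
    return [
        "python", "compute_contamination_metrics.py",
        "--input-data", input_data,
        "--scenario-data", scenario_data,
        "--output-stats", output_stats,
        "--input-format", input_format,
    ]


# "path/to/scenario_data" is a placeholder; point it at your scenario data file.
cmd = build_command("00.jsonl", "path/to/scenario_data", "output_stats", "the_pile")

# Print the command as a single shell-safe line for copy and paste.
print(shlex.join(cmd))
```

To run the script rather than just print the command, pass cmd to subprocess.run(cmd, check=True) from the repository root.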