- Python >= 3.10
Install all required packages:
```shell
pip install -r requirements.txt
```

Key dependencies:
| Package | Version |
|---|---|
| numpy | 1.26.4 |
| tqdm | 4.66.4 |
| requests | 2.32.3 |
| openai | 1.40.6 |
| SPARQLWrapper | 2.0.0 |
| sentence-transformers | 3.0.1 |
| transformers | 4.44.2 |
| tokenizers | 0.19.1 |
| Service | Endpoint | Description |
|---|---|---|
| Virtuoso SPARQL | http://localhost:8890/sparql | Must be loaded with a Freebase-compatible RDF dump using the namespace `http://rdf.freebase.com/ns/` |
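As a quick sanity check that the endpoint is loaded correctly, a name lookup for a known MID should return bindings. The sketch below uses only the standard library; `build_name_query` and `query_names` are illustrative helpers, not part of this codebase (the project itself talks to the endpoint through `SPARQLWrapper` in `freebase_func.py`):

```python
import json
import urllib.parse
import urllib.request

# Assumed local Virtuoso endpoint and Freebase namespace from the table above.
SPARQL_ENDPOINT = "http://localhost:8890/sparql"
FB_NS = "http://rdf.freebase.com/ns/"

def build_name_query(mid):
    """Build a SPARQL query for the English name of a Freebase entity, given its MID."""
    return (
        f"PREFIX ns: <{FB_NS}>\n"
        "SELECT DISTINCT ?name WHERE {\n"
        f"  ns:{mid} ns:type.object.name ?name .\n"
        '  FILTER (lang(?name) = "en")\n'
        "}"
    )

def query_names(mid):
    """Run the query against the local endpoint and return the bound names."""
    params = urllib.parse.urlencode({
        "query": build_name_query(mid),
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(f"{SPARQL_ENDPOINT}?{params}") as resp:
        data = json.load(resp)
    return [b["name"]["value"] for b in data["results"]["bindings"]]
```

If the dump is loaded, `query_names("m.0jcx")` should return a non-empty list; an empty result usually means the wrong graph or namespace was imported.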
```
├── main.py              # End-to-end pipeline and evaluation
├── run.py               # Entry point script
├── llm_client.py        # LLM API interface with retry logic
├── freebase_func.py     # SPARQL utilities for Freebase (incl. CVT handling)
├── chains_oneshot.py    # One-shot schema-level path enumeration
├── schema_tri.py        # Schema trie construction and management
├── prompt_list.py       # Prompt templates for LLM calls
├── freebase_schema.csv  # Freebase schema definitions
├── described.jsonl      # Entity descriptions
├── data/                # Input datasets
├── output/              # Intermediate results
├── requirements.txt
└── README.md
```
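`llm_client.py` wraps LLM API calls with retry logic. A generic retry-with-exponential-backoff pattern looks like the following; `call_with_retries` is a hypothetical helper for illustration, not the project's actual implementation:

```python
import random
import time

def call_with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the last error
            # Wait base_delay * 2^attempt, with random jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

In practice, production clients retry only on transient errors (rate limits, timeouts) rather than on every exception.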
The input dataset is a JSON file containing a list of question objects:
```json
[
  {
    "question": "Who directed Titanic?",
    "topic_entity": {
      "m.0jcx": "Titanic"
    },
    "answer": "James Cameron"
  }
]
```

| Field | Type | Description |
|---|---|---|
| `question` | string | Natural language question |
| `topic_entity` | dict | Mapping from Freebase MIDs to entity names |
| `answer` | string | Gold answer |
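Datasets in this shape can be validated up front before a run. The loader below is a sketch (`load_dataset` and `REQUIRED_FIELDS` are illustrative names, not part of the codebase), checking exactly the fields described above:

```python
import json

REQUIRED_FIELDS = {"question", "topic_entity", "answer"}

def load_dataset(path):
    """Load the question list and verify each item has the documented fields."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    for i, item in enumerate(data):
        missing = REQUIRED_FIELDS - item.keys()
        if missing:
            raise ValueError(f"item {i} is missing fields: {sorted(missing)}")
        if not isinstance(item["topic_entity"], dict):
            raise ValueError(f"item {i}: topic_entity must map MIDs to entity names")
    return data
```

Failing fast here is cheaper than discovering a malformed item midway through an evaluation run.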
Set the following environment variables before running:
```shell
export LLM_API_KEY=<your_api_key>
export LLM_API_BASE=<your_api_base_url>
```

Then run the pipeline:

```shell
python main.py \
  --data_path data/cwq.json \
  --limit 100 \
  --workers 8 \
  --model gpt-4
```

| Argument | Description | Default |
|---|---|---|
| `--data_path` | Path to input dataset | - |
| `--limit` | Number of questions to evaluate | None (all) |
| `--workers` | Number of parallel workers | 1 |
| `--model` | LLM backend identifier | gpt-4 |
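The argument table above corresponds to a standard `argparse` setup; the following is a sketch mirroring those defaults (`build_parser` is an illustrative name, not necessarily how `main.py` is organized):

```python
import argparse

def build_parser():
    """Build a parser for the command-line arguments documented above."""
    p = argparse.ArgumentParser(description="Run the end-to-end pipeline and evaluation")
    p.add_argument("--data_path", required=True,
                   help="Path to input dataset")
    p.add_argument("--limit", type=int, default=None,
                   help="Number of questions to evaluate (default: all)")
    p.add_argument("--workers", type=int, default=1,
                   help="Number of parallel workers")
    p.add_argument("--model", default="gpt-4",
                   help="LLM backend identifier")
    return p
```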
Final predictions are saved in JSON format:
```json
{
  "idx": 0,
  "question": "...",
  "gold": ["James Cameron"],
  "pred": "James Cameron"
}
```

Per-question artifacts are saved under `output/` for analysis and debugging.
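Since `gold` is a list and `pred` a single string, one natural way to score these records is exact match of the prediction against any gold answer. A hypothetical scoring helper (the project's actual metric may differ):

```python
def exact_match(pred, gold):
    """Case- and whitespace-insensitive match of pred against any gold answer."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(pred) in {norm(g) for g in gold}

def accuracy(records):
    """Fraction of prediction records whose 'pred' matches one of its 'gold' answers."""
    if not records:
        return 0.0
    return sum(exact_match(r["pred"], r["gold"]) for r in records) / len(records)
```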
This project is released for research purposes.