XMUDM/HyQBench

HyQBench

HyQBench is a comprehensive benchmark for evaluating and advancing the capabilities of large language models (LLMs) and methods on hybrid queries. It uses a new data collection framework, together with optimization algorithms designed specifically for semantic operators, to construct a diverse dataset of high-quality hybrid query instances fully automatically. HyQBench also provides the first fully automated evaluation framework, composed of four flexible and modular components. Its evaluation metrics emphasize a model's ability to plan efficiently and account for factors that matter in real-world deployment, such as computational cost. The project aims to promote fair, accurate, and reproducible research on hybrid queries.


📑 Contents

This repository contains the following main components:

  • Dataset

    Contains the HyQBench dataset. The dataset is generated automatically, contains various query types, and covers multiple domains. Sample data and download links are provided.

  • Code

    Provides all code for data generation and evaluation. The code is modular and extensible: it supports generating individual parts of the data and integrates easily with different LLMs and methods.

  • Popular LLM Responses

    Contains responses from various popular LLMs on the HyQBench dataset. This section enables comparative analysis of model capabilities and highlights their strengths and weaknesses on hybrid queries.

  • Experimental Results

    Presents experimental results, including detailed performance metrics and visualizations. This section helps researchers understand the current state of the art and identify future research directions.

Click a component name to view its details.

If you find HyQBench useful, please star the project!
