[TKDE 26] DeepCGC: Unveiling the Deep Clustering Mechanism of Fast Graph Condensation

XYGaoG/DeepCGC

DeepCGC: Unveiling the Deep Clustering Mechanism of Fast Graph Condensation

Introduction

This repository contains the official implementation of our paper DeepCGC: Unveiling the Deep Clustering Mechanism of Fast Graph Condensation.

DeepCGC extends CGC from our WWW 2025 paper Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [code].

We generalize CGC's class-to-node matching principle into a broader latent-space formulation, revealing that graph condensation can be interpreted as a class-wise clustering problem in the latent space.
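
The class-wise clustering view can be illustrated with a small NumPy sketch. It is not the DeepCGC implementation: it assumes latent embeddings are already available, uses plain k-means as a stand-in clustering step, and the names `kmeans`, `condense_by_class`, and the proportional per-class budget rule are hypothetical.

```python
import numpy as np


def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the (k, d) centroid matrix."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def condense_by_class(latent, labels, ratio):
    """Cluster each class separately; every centroid becomes one condensed node."""
    cond_x, cond_y = [], []
    for c in np.unique(labels):
        x_c = latent[labels == c]
        # Hypothetical budget rule: proportional to class size, at least one node.
        k = min(len(x_c), max(1, int(round(ratio * len(x_c)))))
        cond_x.append(kmeans(x_c, k))
        cond_y.append(np.full(k, c))
    return np.concatenate(cond_x), np.concatenate(cond_y)


if __name__ == "__main__":
    # Toy data standing in for latent node embeddings of three classes.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(600, 8))
    labels = np.repeat([0, 1, 2], [100, 200, 300])
    cond_x, cond_y = condense_by_class(latent, labels, ratio=0.05)
    print(cond_x.shape, cond_y.shape)
```

Each class is clustered independently, so the condensed nodes inherit an unambiguous label from the class they were clustered in.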

The key improvements of DeepCGC include:

  • 🎯 Clustering-driven optimization objective
  • 🔄 Non-linear, invertible relay model
  • 💪 Enhanced representational capacity while maintaining efficiency
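
To make "non-linear, invertible" concrete, here is a minimal NumPy sketch of one well-known construction with both properties, an affine coupling layer. This README does not describe how the relay model is built, so the class `AffineCoupling`, its layer sizes, and its random weights are purely illustrative assumptions, not DeepCGC's architecture.

```python
import numpy as np


class AffineCoupling:
    """Non-linear map on R^dim that can be inverted exactly in closed form."""

    def __init__(self, dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d = dim // 2  # the first d features pass through unchanged
        self.w1 = rng.normal(scale=0.1, size=(self.d, hidden))
        self.w_s = rng.normal(scale=0.1, size=(hidden, dim - self.d))
        self.w_t = rng.normal(scale=0.1, size=(hidden, dim - self.d))

    def _scale_shift(self, x1):
        # Scale and shift depend non-linearly on the untouched half only.
        h = np.tanh(x1 @ self.w1)
        return np.tanh(h @ self.w_s), h @ self.w_t

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self._scale_shift(x1)
        return np.concatenate([x1, x2 * np.exp(s) + t], axis=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        s, t = self._scale_shift(y1)  # y1 equals x1, so s and t are recovered
        return np.concatenate([y1, (y2 - t) * np.exp(-s)], axis=1)


if __name__ == "__main__":
    layer = AffineCoupling(dim=6)
    x = np.random.default_rng(1).normal(size=(5, 6))
    z = layer.forward(x)
    print(np.allclose(layer.inverse(z), x))
```

Because the first half of the features is left untouched, the scale and shift can be recomputed from the output, which is what makes the inverse exact.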

For more work on graph condensation, please refer to our TKDE'25 survey paper, 🔥Graph Condensation: A Survey, and the paper list, Graph Condensation Papers.

Requirements

Required dependencies are provided in ./requirements.txt.

Dataset

Configure the dataset directory path via args.raw_data_dir and ensure all datasets are downloaded to this location.

  • Cora and Citeseer will be downloaded from PyG.
  • Ogbn-products will be downloaded from OGB.
  • For Ogbn-arxiv, Flickr and Reddit, we use the datasets provided by GraphSAINT. They are available via the Google Drive link (alternatively, the BaiduYun link, code: f1ao). Note that both links are provided by the GraphSAINT team.

Condensation

To condense the graph using DeepCGC and train GCN models:

$ python main.py --gpu 0 --dataset reddit --ratio 0.001 --generate_adj 1

For the more efficient graphless variant, DeepCGC-X:

$ python main.py --gpu 0 --dataset reddit --ratio 0.001 --generate_adj 0

Results will be recorded in ./results/ and condensed graphs will be saved in ./cond_graph/.

Additional Scripts

Comprehensive scripts for different condensation ratios are provided in ./script.sh.
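
The contents of ./script.sh are not reproduced here. As a rough sketch, a comparable sweep can be driven from Python using only the flags shown in the commands above; the helper `build_command` and the ratio values in `RATIOS` are illustrative assumptions, not the ratios used in the paper.

```python
import shlex


def build_command(dataset, ratio, generate_adj, gpu=0):
    """Assemble a main.py invocation from the flags shown in this README."""
    return [
        "python", "main.py",
        "--gpu", str(gpu),
        "--dataset", dataset,
        "--ratio", str(ratio),
        "--generate_adj", str(int(generate_adj)),
    ]


# Illustrative ratios only; consult ./script.sh for the pre-defined ones.
RATIOS = [0.0005, 0.001, 0.002]

if __name__ == "__main__":
    for ratio in RATIOS:
        for generate_adj in (1, 0):  # 1: DeepCGC, 0: graphless DeepCGC-X
            print(shlex.join(build_command("reddit", ratio, generate_adj)))
```

The sketch only prints the commands. To execute them, pass each list to `subprocess.run(cmd, check=True)` from the repository root.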

Hyperparameters

For pre-defined condensation ratios in ./script.sh, hyperparameters are automatically loaded from ./config/dataset_name.yaml.

For custom condensation ratios, we recommend running a hyperparameter search on the validation set for optimal performance.

Citation

@inproceedings{gao2025rethinking,
  title={Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition},
  author={Gao, Xinyi and Ye, Guanhua and Chen, Tong and Zhang, Wentao and Yu, Junliang and Yin, Hongzhi},
  booktitle={Proceedings of the ACM on Web Conference 2025},
  year={2025}
}

@article{gao2025graph,
  title={Graph condensation: A survey},
  author={Gao, Xinyi and Yu, Junliang and Chen, Tong and Ye, Guanhua and Zhang, Wentao and Yin, Hongzhi},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2025},
  publisher={IEEE}
}
