Understanding what makes a video memorable has important applications in advertising and education technology. Toward this goal, we investigate the spatio-temporal attention mechanisms underlying video memorability. Unlike previous works that fuse multiple features, we adopt a simple CNN+Transformer architecture that enables analysis of spatio-temporal attention while matching state-of-the-art (SoTA) performance on video memorability prediction. We compare model attention against human gaze fixations collected in a small-scale eye-tracking study in which participants perform the video memory task. We uncover the following insights: (i) quantitative saliency metrics show that our model, trained only to predict a memorability score, exhibits spatial attention patterns similar to human gaze, especially for more memorable videos; (ii) the model assigns greater importance to the initial frames of a video, mimicking human attention patterns; and (iii) panoptic segmentation reveals that both the model and humans assign a greater share of attention to *things* and less to *stuff*, relative to their occurrence probability.
For more details, please visit our project website or read our paper.
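The spatial comparison in (i) relies on standard saliency metrics. As a minimal sketch of what such a comparison looks like, the snippet below computes two common ones, Pearson correlation (CC) and normalized scanpath saliency (NSS), between a model attention map and a binary human fixation map; the array shapes and names are purely illustrative, not this repository's API.

```python
# Illustrative saliency metrics for comparing a model attention map with a
# human fixation map. Shapes and values below are made up for the example.
import numpy as np

def correlation_coefficient(saliency: np.ndarray, fixation_map: np.ndarray) -> float:
    """Pearson correlation (CC) between two spatial maps of equal shape."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    f = (fixation_map - fixation_map.mean()) / (fixation_map.std() + 1e-8)
    return float((s * f).mean())

def normalized_scanpath_saliency(saliency: np.ndarray, fixations: np.ndarray) -> float:
    """NSS: mean of the z-scored saliency map at binary fixation locations."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return float(s[fixations > 0].mean())

rng = np.random.default_rng(0)
attn = rng.random((27, 48))             # model attention map (e.g., a patch grid)
fix = rng.random((27, 48)) > 0.98       # binary human fixation map
print(correlation_coefficient(attn, fix.astype(float)))
print(normalized_scanpath_saliency(attn, fix))
```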
- Clone the repository:

  ```bash
  git clone [repository-url]
  cd [repository-name]
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

The Memento dataset can be downloaded from http://memento.csail.mit.edu/#Dataset.
Repository structure:

```
.
├── main.py              # Main training script
├── embed.py             # Video embedding generation
├── attention.py         # Attention matrix extraction
├── panoptic.py          # Panoptic segmentation
├── requirements.txt     # Python dependencies
├── eyetracking/         # Eye-tracking data and related processing
└── utils/
    ├── model.py         # Transformer model implementation
    └── dataset.py       # Dataset handling
```
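As a rough illustration of what `embed.py` produces, here is a minimal sketch of per-frame CNN feature extraction; the ResNet-50 backbone and 2048-d pooled features are assumptions, and the actual script may use a different backbone or feature layer.

```python
# Sketch: turn a stack of video frames into per-frame CNN embeddings.
# ResNet-50 and the 2048-d pooled feature are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.IMAGENET1K_V2
backbone = resnet50(weights=weights)
backbone.fc = nn.Identity()         # keep the 2048-d pooled features
backbone.eval()

preprocess = weights.transforms()   # resize / crop / normalize for this model

@torch.no_grad()
def embed_frames(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, 3, H, W) image tensors -> (T, 2048) frame embeddings."""
    batch = torch.stack([preprocess(f) for f in frames])
    return backbone(batch)
```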
Generate video embeddings:

```bash
python embed.py --path /path/to/videos
```

Train the model on the precomputed embeddings:

```bash
python main.py \
    --path /path/to/embeddings \
    --train_data_path /path/to/train.csv \
    --val_data_path /path/to/val.csv
```
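For orientation, the sketch below shows a generic Transformer-based memorability regressor over precomputed frame embeddings, in the spirit of `utils/model.py`; the dimensions, layer counts, and learned [CLS] token are assumptions, not the exact implementation.

```python
# Sketch of a CNN+Transformer memorability model: frame embeddings go through
# a Transformer encoder, and a [CLS] token is read out as the score.
# All hyperparameters here are illustrative.
import torch
import torch.nn as nn

class MemorabilityTransformer(nn.Module):
    def __init__(self, feat_dim=2048, d_model=512, nhead=8, num_layers=4, max_len=64):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos = nn.Parameter(torch.zeros(1, max_len + 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, frame_feats):                 # (B, T, feat_dim)
        x = self.proj(frame_feats)                  # (B, T, d_model)
        cls = self.cls.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)              # (B, T+1, d_model)
        x = x + self.pos[:, : x.size(1)]
        x = self.encoder(x)
        return self.head(x[:, 0]).squeeze(-1)       # one score per video

model = MemorabilityTransformer()
scores = model(torch.randn(2, 45, 2048))            # e.g., 45 frames per video
```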
Extract attention matrices to analyze the model's focus:

```bash
python attention.py \
    --model_path /path/to/trained/model.pt \
    --val_path /path/to/val.csv \
    --features_path /path/to/features
```
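Conceptually, the attention matrices come from the model's self-attention layers. Below is a minimal sketch, assuming a standard `nn.MultiheadAttention` layer and a [CLS]+frame token layout; the actual extraction in `attention.py` may differ.

```python
# Sketch: read attention weights out of a self-attention layer. The token
# layout ([CLS] followed by 45 frame tokens) is an assumption.
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
tokens = torch.randn(1, 46, 512)    # [CLS] + 45 frame tokens, for example

# need_weights=True returns the attention matrix; averaging over heads gives
# one (tokens x tokens) map per example.
_, attn = mha(tokens, tokens, tokens, need_weights=True, average_attn_weights=True)
print(attn.shape)                   # (1, 46, 46)

# Row 0 is the [CLS] query: how strongly the video-level representation
# attends to each frame token. Dropping the [CLS] column leaves per-frame
# importance, the temporal signal compared against human gaze.
frame_importance = attn[0, 0, 1:]   # (45,)
```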
Generate panoptic segmentation results:

```bash
python panoptic.py \
    --video_path /path/to/videos
```
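As a stand-in for whatever segmentation model `panoptic.py` wraps, the sketch below runs an off-the-shelf Mask2Former checkpoint from Hugging Face Transformers on a single extracted frame; the checkpoint name and frame path are assumptions.

```python
# Sketch: off-the-shelf panoptic segmentation of one video frame. The COCO
# label set distinguishes "things" from "stuff", which is what the attention
# share analysis needs. Checkpoint and file name are illustrative.
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

ckpt = "facebook/mask2former-swin-tiny-coco-panoptic"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt).eval()

image = Image.open("frame_0001.jpg")        # one extracted video frame
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Merge predicted masks into a single (H, W) segment-id map plus per-segment
# metadata with COCO label ids.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
segmentation = result["segmentation"]       # (H, W) tensor of segment ids
for seg in result["segments_info"]:
    print(seg["id"], model.config.id2label[seg["label_id"]])
```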
If you use this code in your research, please cite our paper:

```bibtex
@inproceedings{kumar2025eyetoai,
    title     = {{Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability}},
    author    = {Kumar, Prajneya and Khandelwal, Eshika and Tapaswi, Makarand and Sreekumar, Vishnu},
    year      = {2025},
    booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}
}
```