Emo-CLIM: Emotion-Aligned Contrastive Learning Between Images and Music [ICASSP 2024]

Emotion-Aligned Contrastive Learning Between Images and Music

Shanti Stewart1, Kleanthis Avramidis1 *, Tiantian Feng1 *, Shrikanth Narayanan1
1 Signal Analysis and Interpretation Lab, University of Southern California
* Equal contribution

This repository is the official implementation of Emotion-Aligned Contrastive Learning Between Images and Music (accepted at ICASSP 2024).

In this work, we introduce Emo-CLIM, a framework for Emotion-Aligned Contrastive Learning Between Images and Music. Our method learns an emotion-aligned joint embedding space between images and music. This embedding space is learned via emotion-supervised contrastive learning, using an adapted cross-modal version of SupCon. By evaluating the joint embeddings through downstream cross-modal retrieval and music tagging tasks, we show that our approach successfully aligns images and music.
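To make the objective concrete, here is a minimal NumPy sketch of an image-to-music supervised contrastive (SupCon-style) loss, where positives for each image anchor are the music clips sharing its emotion label. This is an illustrative simplification, not the repository's actual loss implementation; function and variable names are hypothetical.

```python
import numpy as np

def cross_modal_supcon(img_emb, mus_emb, img_labels, mus_labels, tau=0.1):
    """Illustrative image-to-music SupCon loss (names are hypothetical).

    For each L2-normalized image embedding (the anchor), positives are
    all music embeddings with the same emotion label; every music
    embedding contributes to the contrastive denominator.
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    mus = mus_emb / np.linalg.norm(mus_emb, axis=1, keepdims=True)
    sim = (img @ mus.T) / tau                        # (N_img, N_mus) scaled similarities
    log_denom = np.log(np.exp(sim).sum(axis=1))      # per-anchor log-denominator
    losses = []
    for i, label in enumerate(img_labels):
        pos = np.where(mus_labels == label)[0]       # same-emotion music clips
        if len(pos) == 0:
            continue                                 # skip anchors with no positives
        # average of -log p(positive | anchor) over all positives
        losses.append(-np.mean(sim[i, pos] - log_denom[i]))
    return float(np.mean(losses))
```

A full version would symmetrize this over all four modality pairings (image-to-image, music-to-music, and both cross-modal directions), as is typical for joint-embedding objectives.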

We provide code for contrastive pre-training and downstream cross-modal retrieval and music tagging evaluation tasks.
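As a rough illustration of the cross-modal retrieval setup (not the repository's evaluation code), retrieving music for an image query in the joint embedding space reduces to ranking music embeddings by cosine similarity; the function name and shapes below are assumptions for the sketch.

```python
import numpy as np

def retrieve_music(query_img_emb, music_bank, k=5):
    """Return indices of the top-k music embeddings for an image query,
    ranked by cosine similarity in the joint space (illustrative only)."""
    q = query_img_emb / np.linalg.norm(query_img_emb)
    bank = music_bank / np.linalg.norm(music_bank, axis=1, keepdims=True)
    scores = bank @ q                     # cosine similarity per music clip
    return np.argsort(-scores)[:k]        # indices sorted by descending score
```

Standard retrieval metrics (e.g. precision@k over emotion labels) can then be computed over the returned indices.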

Installation

We recommend using a conda environment with Python >= 3.10:

conda create -n emo-clim python=3.10
conda activate emo-clim

Clone the repository and install the dependencies:

git clone https://github.com/shantistewart/Emo-CLIM
cd Emo-CLIM && pip install -e .

You will also need to install OpenAI's CLIP package:

pip install git+https://github.com/openai/CLIP.git

Project Structure

Emo-CLIM/
├── climur/               # core directory for pretraining and downstream evaluation
│  ├── dataloaders/          # PyTorch Dataset classes
│  ├── losses/               # PyTorch loss functions
│  ├── models/               # PyTorch Module classes
│  ├── scripts/              # training and evaluation scripts
│  ├── trainers/             # PyTorch Lightning LightningModule classes
│  └── utils/                # utility functions
├── configs/              # configuration files for training and evaluation
├── data_prep/            # data preparation scripts
├── figures/              # Emo-CLIM figures
├── plots/                # t-SNE plots
├── results_test/         # cross-modal retrieval evaluation results on test set
├── results_val/          # cross-modal retrieval evaluation results on validation set
└── tests/                # test scripts

Citation

If this project helps your research, please cite our paper:

@inproceedings{Stewart-2024-EmoCLIM,
  title={Emotion-Aligned Contrastive Learning Between Images and Music}, 
  author={Stewart, Shanti and Avramidis*, Kleanthis and Feng*, Tiantian and Narayanan, Shrikanth},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP}, 
  year={2024}
}

Contact

If you have any questions, please get in touch: [email protected]
