YorùbáVoice

Landing page for data, code and publications for this project sponsored by an Imminent Research Grant.

In 2022, we launched the curation and recording of 40 hours of high-fidelity speech data for the Yorùbá language, the third most widely spoken language in Africa, with over 40 million L1 speakers. We partner with the YorubaName organization in Nigeria to encourage volunteers, both online and offline, to record their voices.

BibTeX entry and citation info

If you make use of our dataset, please cite our paper.

@misc{ogunremi2023iroyinspeech,
      title={\`{I}r\`{o}y\`{i}nSpeech: A multi-purpose Yor\`{u}b\'{a} Speech Corpus},
      author={Tolulope Ogunremi and Kola Tubosun and Anuoluwapo Aremu and Iroro Orife and David Ifeoluwa Adelani},
      year={2023},
      eprint={2307.16071},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}