Build large audio corpora in various languages → {Yorùbá, Urhobo, Ẹ̀dó, Èʋe, Ị̀gbò}

Niger-Volta-LTI/audio-corpora-builder

audio corpora builder

Build large audio corpora in various languages → {Yorùbá, Urhobo, Edo, Èʋe, Igbo}

Audio Corpora

Curate language-specific corpora from the wealth of good-quality audio available on YouTube. The process is as follows:

  • Locate a list of existing playlists, e.g. OrisunTV Iroyin
  • Alternatively, create a new playlist with a custom set of YouTube videos
  • Update yoruba_sources.yml with the reference to the playlist
  • Execute $ python download_youtube.py --output ./audio/
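
The steps above can be sketched roughly as follows. This is not the repository's download_youtube.py, whose implementation and the schema of yoruba_sources.yml are not shown here. It is a minimal illustration that assumes the yt-dlp command-line tool is installed, uses a hypothetical SOURCES mapping in place of the YAML file, and uses a placeholder playlist URL. The helper names build_command and download_all are likewise invented for this example.

```python
import subprocess
from pathlib import Path

# Hypothetical stand-in for the contents of yoruba_sources.yml.
# The real file's schema is not documented in this README.
SOURCES = {
    "orisun_iroyin": "https://www.youtube.com/playlist?list=PLAYLIST_ID",
}


def build_command(playlist_url, output_dir, name):
    """Return a yt-dlp argument list that extracts audio for one playlist."""
    # One sub-folder per source, one file per video id.
    template = str(Path(output_dir) / name / "%(id)s.%(ext)s")
    return [
        "yt-dlp",               # assumption: yt-dlp is on PATH
        "--yes-playlist",       # download the whole playlist
        "--extract-audio",      # keep only the audio track
        "--audio-format", "wav",
        "--output", template,
        playlist_url,
    ]


def download_all(sources, output_dir, dry_run=True):
    """Build one command per source; run them only when dry_run is False."""
    commands = [
        build_command(url, output_dir, name) for name, url in sources.items()
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of downloading anything.
    for cmd in download_all(SOURCES, "./audio/"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it possible to inspect what would be downloaded before any network traffic happens.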

Install dependencies

  • Python 3.7 or later
  • pip install -r requirements.txt
