0.2 - Towards the Base Camp

pzelasko released this 18 Nov 14:17
98493d6

New features:

  • K2SpeechRecognitionIterableDataset that supports more efficient batching #116
  • Support for torchaudio.sox_effects data augmentation alongside WavAugment #124

Breaking changes:

  • the data augmentation APIs in Lhotse now expect an augment_fn argument instead of augmenter; augment_fn is a callable with a signature like: def augment_fn(samples: np.ndarray, sampling_rate: int) -> np.ndarray #124
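
As an illustration of the new signature, here is a minimal sketch of a callable that could be passed as augment_fn. The fixed -6 dB gain and the synthetic test tone are assumptions made for this example only; they are not part of Lhotse, and the sketch does not show which Lhotse functions accept the argument.

```python
import numpy as np


def augment_fn(samples: np.ndarray, sampling_rate: int) -> np.ndarray:
    """Hypothetical augmentation: attenuate the waveform by 6 dB.

    sampling_rate is unused here, but it is part of the expected signature,
    so rate-dependent effects (e.g. resampling) could make use of it.
    """
    gain = 10 ** (-6.0 / 20.0)  # convert -6 dB to a linear factor
    return (samples * gain).astype(samples.dtype)


# One second of a 440 Hz tone, shaped (channels, num_samples); the shape
# convention is an assumption of this example.
sampling_rate = 16000
t = np.arange(sampling_rate) / sampling_rate
samples = np.sin(2 * np.pi * 440 * t).astype(np.float32).reshape(1, -1)

augmented = augment_fn(samples, sampling_rate)
```

The function returns an array with the same shape and dtype as its input, so it can be swapped for any other callable that follows the same signature.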

New corpora:

  • Mobvoi Hotwords #109

Enhancements:

  • progress bars for corpus downloads and feature extraction #131
  • re-using cached LibriSpeech manifests for faster data preparation #133
  • LilcomFilesWriter and NumpyFilesWriter use sub-directories for storage to reduce the filesystem load #134

Several bug fixes and improved testing.