v1.2.0 Multi-vocabulary tokenizers for CP Word, Octuple & MuMIDI
Changes
- 7fe9df6 becea47: CP Word, Octuple and MuMIDI tokenizers now have several Vocabulary objects within self.vocab, one for each token type (Pitch, Duration ...). This makes it easy to create several input / output layers of different sizes, each matching the vocabulary size of its token type. Example here.
- 05c1ab9: MIDITokenizer base class now has the __call__ (calls midi_to_tokens), __len__ (returns len(self.vocab)) and __getitem__ (returns self.vocab[item], converting a token to an event and vice versa) magic methods.
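The two changes above can be illustrated with a minimal, self-contained sketch. It does not use MidiTok itself: ToyVocabulary and ToyTokenizer are hypothetical stand-ins written for this example, and the note format (pitch, duration) is an assumption. It only shows the general idea of one vocabulary per token type, and of magic methods that forward to midi_to_tokens and self.vocab.

```python
from typing import Dict, List, Tuple, Union


class ToyVocabulary:
    """Hypothetical stand-in for a vocabulary: maps events to token ids and back."""

    def __init__(self, events: List[str]) -> None:
        self.event_to_token: Dict[str, int] = {e: i for i, e in enumerate(events)}
        self.token_to_event: Dict[int, str] = {i: e for i, e in enumerate(events)}

    def __len__(self) -> int:
        return len(self.event_to_token)

    def __getitem__(self, item: Union[int, str]) -> Union[int, str]:
        # An int is treated as a token id, a str as an event name.
        if isinstance(item, int):
            return self.token_to_event[item]
        return self.event_to_token[item]


class ToyTokenizer:
    """Hypothetical multi-vocabulary tokenizer: one vocabulary per token type."""

    def __init__(self) -> None:
        self.vocab: List[ToyVocabulary] = [
            ToyVocabulary([f"Pitch_{p}" for p in range(60, 72)]),    # 12 pitches
            ToyVocabulary([f"Duration_{d}" for d in (1, 2, 4, 8)]),  # 4 durations
        ]

    def midi_to_tokens(self, notes: List[Tuple[int, int]]) -> List[List[int]]:
        # Each (pitch, duration) note becomes one token id per token type.
        return [
            [self.vocab[0][f"Pitch_{p}"], self.vocab[1][f"Duration_{d}"]]
            for p, d in notes
        ]

    def __call__(self, notes: List[Tuple[int, int]]) -> List[List[int]]:
        # Calling the tokenizer forwards to midi_to_tokens.
        return self.midi_to_tokens(notes)

    def __len__(self) -> int:
        return len(self.vocab)

    def __getitem__(self, item: int) -> ToyVocabulary:
        return self.vocab[item]


tokenizer = ToyTokenizer()
tokens = tokenizer([(60, 4), (64, 1)])

# One size per token type, usable to dimension separate input / output layers.
layer_sizes = [len(v) for v in tokenizer.vocab]
```

In a model, each entry of layer_sizes would typically set the size of its own embedding or output layer, so a small vocabulary such as durations does not need a layer as large as the pitch one.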
Compatibility
CP Word, Octuple and MuMIDI tokenizations from versions before v1.2.0 are no longer compatible; datasets have to be retokenized.
Thanks
Special thanks to @envilk for his contribution!