The previous code is 30% slower than the baseline. #2

chenzhehuai · 2019-02-28T08:58:21Z

Improvements:

with the same config, the previous code will result in 10% more tokens in each frame compared with baseline. The reason is because GetCutoff function is over emitting tokens, while for baseline it is over emitting&non emiting tokens. (please check it use --verbose 6 and the number of all tokens is shown in line 895, but not in line 693, which only contains emitting) Hence we should use 10% less max_active config.
the baseline use kaldi version HashList, while the previous code use std::unordered_map. please still use HashList or its variants. The reason is because HashList is allocating memory in Pool/Block fashion, hence it is 10% faster than std::unordered_map. To show this problem, I temporally do a hack in line 247, plese check it

After these two improvements, the current code is with similar speed as the baseline. HOWEVER, please check the quality of 1-best & lattice, I have discuss about my worries here kaldi-asr#3061 (review)

Improvements: 1. with the same config, the previous code will result in 10% more tokens in each frame compared with baseline. The reason is because GetCutoff function is over emitting tokens, while for baseline it is over emitting&non emiting tokens. (please check it use --verbose 6 and the number of all tokens is shown in line 895, but not in line 693, which only contains emitting) Hence we should use 10% less max_active config. 2. the baseline use kaldi version HashList, while the previous code use std::unordered_map. please still use HashList or its variants. The reason is because HashList is allocating memory in Pool/Block fashion, hence it is 10% faster than std::unordered_map. To show this problem, I temporally do a hack in line 247, plese check it After these two improvements, the current code is with similar speed as the baseline. HOWEVER, please check the quality of 1-best & lattice, I have discuss about my worries here kaldi-asr#3061 (review)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The previous code is 30% slower than the baseline. #2

The previous code is 30% slower than the baseline. #2

chenzhehuai commented Feb 28, 2019

The previous code is 30% slower than the baseline. #2

Are you sure you want to change the base?

The previous code is 30% slower than the baseline. #2

Conversation

chenzhehuai commented Feb 28, 2019