This repository has been archived by the owner on Oct 13, 2022. It is now read-only.

[WIP] Add iterated loss #193

Open
wants to merge 1 commit into
base: master

@zhu-han zhu-han commented May 12, 2021

This PR implements the iterated loss proposed in #179 (comment).
Reference: https://arxiv.org/pdf/1910.10324.pdf

The results below can be reproduced with:

python mmi_att_transformer_train.py --world-size 2 --full-libri 0 --use-ali-model 0 --max-duration 250 --iterated-layers 5 --iterated-scale 0.3

Results for different iterated scales are shown in Table 1. So far, the iterated loss does not show a clear improvement over the baseline.

  • Table 1

| iterated scale | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.74 | 17.18 | 5.63 | 14.92 |
| - | 6.78 | 17.49 | 5.76 | 15.31 |
| 0.01 | 6.71 | 17.34 | 5.6 | 14.86 |
| 0.05 | 6.58 | 17.35 | 5.69 | 15.06 |
| 0.30 | 6.57 | 17.6 | 5.61 | 15.38 |
| 1.00 | 6.77 | 17.69 | 5.8 | 15.58 |
| 10.00 | 6.93 | 18.31 | 5.88 | 16.22 |

The first two rows are baseline results without iterated loss. I ran the baseline twice to gauge the run-to-run randomness of the results.
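
To make the "no clear improvement" reading concrete, here is a minimal sketch that checks each Table 1 row against the range spanned by the two baseline runs. The numbers are copied from Table 1. The helper name `compare_to_baseline` is hypothetical, lower is assumed to be better in every column, and two runs give only a crude estimate of the noise.

```python
COLUMNS = ("test-clean", "test-other", "test-clean (rescore)", "test-other (rescore)")

# The two baseline runs (no iterated loss), copied from Table 1.
BASELINES = [
    (6.74, 17.18, 5.63, 14.92),
    (6.78, 17.49, 5.76, 15.31),
]

# Results per iterated scale, copied from Table 1.
RESULTS = {
    0.01: (6.71, 17.34, 5.60, 14.86),
    0.05: (6.58, 17.35, 5.69, 15.06),
    0.30: (6.57, 17.60, 5.61, 15.38),
    1.00: (6.77, 17.69, 5.80, 15.58),
    10.00: (6.93, 18.31, 5.88, 16.22),
}


def compare_to_baseline(row, baselines=BASELINES):
    """Label each column 'better', 'worse', or 'within' the baseline range.

    Assumption: lower values are better in every column.
    """
    labels = []
    for i, value in enumerate(row):
        low = min(b[i] for b in baselines)
        high = max(b[i] for b in baselines)
        if value < low:
            labels.append("better")
        elif value > high:
            labels.append("worse")
        else:
            labels.append("within")
    return labels


if __name__ == "__main__":
    for scale, row in RESULTS.items():
        print(scale, dict(zip(COLUMNS, compare_to_baseline(row))))
```

Under these assumptions, small scales land inside or slightly below the baseline range on some columns, while scales of 1.00 and above are at or above it on every column.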

Details:

  • An extra MMI loss is added after the 6th conformer layer. I also tried adding extra losses after both the 4th and 8th layers; the results are similar and are shown in Table 2.
  • The weights of the bigram LM in the MMI loss are not updated by the extra MMI loss. Table 3 compares this with the alternative of also updating them with the extra loss.
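
As a rough illustration of how the losses might be combined, the sketch below adds the scaled intermediate-layer losses to the final-layer loss. The function name `iterated_loss` and the choice to sum (rather than average) the intermediate losses are assumptions for illustration, not taken from this PR's code.

```python
def iterated_loss(final_loss, intermediate_losses, iterated_scale):
    """Combine the final-layer loss with scaled intermediate-layer losses.

    total = final_loss + iterated_scale * sum(intermediate_losses)

    Assumption: intermediate losses are summed, then scaled once.
    """
    return final_loss + iterated_scale * sum(intermediate_losses)


if __name__ == "__main__":
    # One extra loss (e.g. after a single intermediate layer), scale 0.3.
    print(iterated_loss(2.0, [4.0], 0.3))
    # Two extra losses (e.g. after two intermediate layers), scale 1.0.
    print(iterated_loss(1.0, [2.0, 3.0], 1.0))
```

With an empty list of intermediate losses this reduces to the baseline objective. In a PyTorch training loop, one possible way to keep the bigram LM weights from being updated by the extra loss would be to detach them when building the extra loss; whether the PR does it this way is not stated here.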

Extra results:

  • Table 2 (Add extra mmi losses after 4th and 8th layers)

| iterated scale | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.74 | 17.18 | 5.63 | 14.92 |
| - | 6.78 | 17.49 | 5.76 | 15.31 |
| 0.30 | 6.61 | 17.65 | 5.65 | 15.52 |
| 1.00 | 6.66 | 18.13 | 5.68 | 15.87 |
| 10.00 | 6.75 | 18.43 | 5.74 | 16.25 |

  • Table 3 (Update bigram using extra mmi loss or not)

| model | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.57 | 17.6 | 5.61 | 15.38 |
| + update bigram with extra loss | 6.75 | 17.77 | 5.69 | 15.53 |
