[Feature request] Adjust output audio speed in YourTTS #3966

Open
Rakshith12-pixel opened this issue Aug 13, 2024 · 5 comments
Labels
feature request: feature requests for making TTS better.
wontfix: This will not be worked on but feel free to help.

Comments

@Rakshith12-pixel

Hello,

I have fine-tuned YourTTS on a number of new speakers, and the audio quality and pronunciation are good. However, the output speech is a bit too fast. I have tried post-processing such as resampling, but that changes the pitch.
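To illustrate why plain resampling changes the pitch, here is a minimal, standard-library-only sketch. It is not taken from YourTTS or any audio library; the helper names (`sine`, `resample_linear`, `estimate_freq`) and the 220 Hz test tone are made up for this example. Slowing a signal to 0.8x speed by naive resampling stretches every cycle, so the frequency drops by the same factor.

```python
import math


def sine(freq_hz, sample_rate, seconds):
    """Generate a pure tone as a list of float samples."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def resample_linear(samples, speed):
    """Naively change playback speed with linear interpolation.

    The output is meant to be played at the original sample rate,
    so speed < 1 makes the audio longer (slower).
    """
    out_len = int(len(samples) / speed)
    last = len(samples) - 1
    out = []
    for j in range(out_len):
        pos = j * speed
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out


def estimate_freq(samples, sample_rate):
    """Rough frequency estimate from the number of upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


sr = 16000
tone = sine(220.0, sr, 1.0)
slowed = resample_linear(tone, speed=0.8)

# The slowed tone is longer, but its pitch has dropped by the same factor.
print(len(tone), round(estimate_freq(tone, sr)))
print(len(slowed), round(estimate_freq(slowed, sr)))
```

The estimated frequency falls from roughly 220 Hz to roughly 176 Hz (220 x 0.8), which is the pitch change described above. Avoiding it requires either a pitch-preserving time-stretch algorithm or changing the durations inside the model.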

There is a speed feature available in XTTS v2. Could we have a similar one for YourTTS, or is there a workaround for this?

Some inputs would be highly appreciated.

Thanks

@Rakshith12-pixel Rakshith12-pixel added the feature request feature requests for making TTS better. label Aug 13, 2024
@JamesD-git

I've tried going through all the config files and changing the length scale, but that didn't seem to work. I think using the VITS backend rather than GlowTTS should improve results.

Not actually sure how to do this, but if you figure it out let me know!

@JamesD-git

Hey there @Rakshith12-pixel, if you are on a Mac, head to Users/{User}/Library/Application Support/tts/tts_models.......your_tts; there is a config file in there. On line 335 you'll find the length scale setting.

I'm sure there's something similar on Windows, but I don't have a machine to check or adjust it on.
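For other platforms, one way to locate the file is to search the likely data directories for a `config.json` inside a folder whose name contains `your_tts`. This is only a sketch: the macOS root comes from the path above, the Linux root `~/.local/share/tts` is an assumption that has not been confirmed in this thread, and `find_model_configs` is a hypothetical helper rather than part of the TTS package.

```python
from pathlib import Path


def find_model_configs(roots, name_fragment="your_tts"):
    """Return config.json files whose parent folder name contains name_fragment."""
    matches = []
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # skip roots that do not exist on this machine
        for cfg in root.rglob("config.json"):
            if name_fragment in cfg.parent.name:
                matches.append(cfg)
    return sorted(matches)


CANDIDATE_ROOTS = [
    "~/Library/Application Support/tts",  # macOS, per the comment above
    "~/.local/share/tts",                 # Linux: assumption, adjust as needed
]

for path in find_model_configs(CANDIDATE_ROOTS):
    print(path)
```

If nothing is printed, the models are probably stored somewhere else, and the list of roots needs to be adjusted by hand.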

@Rakshith12-pixel
Author

Thanks @JamesD-git. Just for clarification: isn't the backend for YourTTS already VITS?
I have tried to add a speed feature similar to the one in XTTS, but couldn't get it working.

Also, I am on Ubuntu 22.04.4 LTS.

Any ideas on how to proceed?

@JamesD-git

@Rakshith12-pixel Yes, the default is VITS; I got that wrong. Changing the length scale in the config is a valid workaround. It works better on long sentences than on short ones, and I've found that 2.4 is a good setting. I would definitely like to see this feature properly implemented in a pull request, though!
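The config edit can also be scripted. The sketch below assumes the model config is plain JSON and that the setting is stored under a key named `length_scale`; since the exact position of that key is not stated in this thread, the code searches for it recursively instead of hard-coding a location. `set_length_scale` and `patch_config_file` are hypothetical helpers, not part of the TTS package. In VITS-style models the length scale multiplies the predicted durations, so values above 1 should slow the speech down.

```python
import json
from pathlib import Path


def set_length_scale(config, value):
    """Set every 'length_scale' key in a nested config; return how many were changed."""
    changed = 0
    if isinstance(config, dict):
        for key, item in config.items():
            if key == "length_scale":
                config[key] = value
                changed += 1
            else:
                changed += set_length_scale(item, value)
    elif isinstance(config, list):
        for item in config:
            changed += set_length_scale(item, value)
    return changed


def patch_config_file(path, value):
    """Rewrite a JSON config with a new length scale, keeping a .bak copy of the original."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    config = json.loads(original)
    changed = set_length_scale(config, value)
    if changed:
        backup = path.with_name(path.name + ".bak")
        backup.write_text(original, encoding="utf-8")
        path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return changed


# Example call (the path is a placeholder, not a real location):
# patch_config_file("path/to/your_tts/config.json", 2.4)
```

A return value of 0 means no `length_scale` key was found, in which case the file is left untouched and the setting has to be located manually.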


stale bot commented Sep 18, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Sep 18, 2024

2 participants