[Feature request] Transformers 4.43+ Support #3974

Open
timwillhack opened this issue Aug 19, 2024 · 2 comments
Labels
feature request feature requests for making TTS better.

Comments

@timwillhack

I'm on Windows 10, running Python 3.9, and trying to get your fork of Coqui to work with transformers 4.43+. It seems to be stuck at 4.42.4 on the dev branch. There are a couple of other models I'd like to have in the same project, such as Parler, and Parler can't use transformers < 4.43. I'm at least thankful you managed to get it past 4.40 ;)

I'm seeing other people report different issues while having transformers 4.43+ installed, so I wasn't sure whether I just need a certain branch or what. Thanks

@timwillhack timwillhack added the feature request feature requests for making TTS better. label Aug 19, 2024
@eginhard
Contributor

Best to post directly in the fork, but there is an issue for that already: idiap#65

The problem is that the XTTS streaming code relies heavily on internals of the transformers library, which can change substantially between versions. Is there a specific reason you need both Parler and Coqui in the same environment?
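As a rough illustration of why a library that depends on another package's internals tends to pin an upper version bound, here is a minimal sketch of a version guard. The 4.42.4 ceiling is taken from this thread; the helper names (`parse_release`, `is_supported`, `check_transformers`) are hypothetical and are not part of Coqui TTS or transformers.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric components of a version string.

    '4.42.4' -> (4, 42, 4); '4.43.0.dev0' -> (4, 43, 0).
    Parsing stops at the first component that is not purely numeric.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_supported(installed: str, max_supported: str = "4.42.4") -> bool:
    """True if `installed` is not newer than `max_supported`.

    The default ceiling is the version mentioned in this thread, not a
    value read from any package's metadata.
    """
    return parse_release(installed) <= parse_release(max_supported)


def check_transformers() -> None:
    """Report whether the installed transformers is within the ceiling."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
        return
    if is_supported(installed):
        print(f"transformers {installed} is within the assumed ceiling")
    else:
        print(f"transformers {installed} is newer than the assumed ceiling")


if __name__ == "__main__":
    check_transformers()
```

A guard like this only detects the mismatch; it does not make the streaming code work on a newer transformers release.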

@timwillhack
Author

I'm building an app that uses a lot of different models, and I don't want to be restricted in what I can use, so I'm trying to stay up to date. The original repo here (where I guess I accidentally opened this issue) only goes up to 4.40, so at least your fork gets up to 4.42.4. I recently tried to use Llama 3.1 with 4.42.4 and it had issues because the model's rope scaling configuration is in a format that 4.43 understands but 4.42.4 doesn't.

I like that Parler lets me describe the speaker, which I can't do in Coqui, but Parler isn't real-time on my machine and Coqui is. I'm streaming results for most of what I'm doing, so I don't want the overhead of splitting things into more than one project, or of maintaining two environments just because Coqui is stuck on 4.42. Basically, I want to use Parler where I think it sounds more natural and real time doesn't matter, and Coqui where real time is important. (I have use cases for both.)

Sorry, my train of thought is all over the place. I've seen some people apparently get past their issues with transformers 4.43 mentioned in their posts, so I was hoping I was just missing a branch or patch that would let me install 4.43 and move on. Thank you for your work on this!


2 participants