Why does XTTS v2 inference use double (or even 3×) the RAM compared to GPU VRAM? #3976

Open
saiful9379 opened this issue Aug 20, 2024 · 0 comments
Labels: bug (Something isn't working)

saiful9379 commented Aug 20, 2024

[screenshot: xtts_issue]

Describe the bug

For example, when loading the model, close to 5 GB of RAM is required while only 2.1 GB of VRAM is used. How can I reduce the RAM usage when loading the model for inference? Basically, I am trying to figure out what is responsible for the high RAM usage. I found that initializing the GPT block alone consumes close to 5 GB of RAM, and this is system RAM, not GPU memory.
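
To narrow it down, here is a minimal sketch of how the per-step RSS growth can be measured during loading (the config and checkpoint paths are placeholders; the Xtts calls follow the standard Coqui loading path):

```python
import os

import psutil
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

def rss_mb() -> float:
    """Resident set size (system RAM) of this process in MB."""
    return psutil.Process(os.getpid()).memory_info().rss / 1024**2

print(f"baseline RSS: {rss_mb():.1f} MB")

config = XttsConfig()
config.load_json("/path/to/xtts_v2/config.json")  # placeholder path

# Building the model (including the GPT block) allocates the weights on the CPU.
model = Xtts.init_from_config(config)
print(f"after init_from_config: {rss_mb():.1f} MB")

model.load_checkpoint(config, checkpoint_dir="/path/to/xtts_v2/", eval=True)
print(f"after load_checkpoint: {rss_mb():.1f} MB")

# Moving to the GPU copies the weights into VRAM, but RSS may not drop.
model.cuda()
print(f"after .cuda(): {rss_mb():.1f} MB")
```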

To Reproduce

Inference used RAM: 4634.7890625 MB (≈4.5 GB)

Expected behavior

Low RAM usage during inference, i.e. system RAM should not be double (or more) the VRAM footprint.
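
One workaround I am not sure about (an assumption on my side, not a verified fix): after the weights have been copied to the GPU, an explicit garbage-collection pass can release CPU-side tensor copies that are only kept alive by stale references:

```python
import gc

import torch

# `model` is the Xtts instance from the loading sketch above.
model.cuda()               # weights are copied into VRAM
gc.collect()               # free CPU-side tensors that are no longer referenced
torch.cuda.empty_cache()   # trims the CUDA caching allocator (VRAM, not system RAM)
```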

Logs

No response

Environment

- python==3.10
- torch==2.2.1+cu121
- torchaudio==2.2.1+cu121
- deepspeed==0.10.3

Additional context

No response
