NPU support for LLM acceleration without GPU #2257
Closed
sebastienbo started this conversation in Ideas
Replies: 2 comments
-
Is it possible to support NPUs? They are far more specialized for LLM inference and don't require a GPU. NPU support would offload work from the CPU and make it possible to run models on laptops that ship with NPUs.
Here is some example code: https://intel.github.io/intel-npu-acceleration-library/llm.html
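For illustration, here is a minimal sketch of what inference through the linked Intel NPU Acceleration Library looks like, adapted from its documentation. The model ID is an arbitrary small example, and exact API names may vary between library versions.

```python
# Minimal sketch adapted from the intel-npu-acceleration-library docs linked
# above; assumes the library is installed (pip install intel-npu-acceleration-library)
# and the machine has a supported Intel NPU. The model ID is an arbitrary example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import intel_npu_acceleration_library

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# Load a standard Hugging Face causal LM on the CPU first.
model = AutoModelForCausalLM.from_pretrained(model_id).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Compile the model for the NPU; int8 quantization cuts memory traffic,
# which matters because integrated NPUs share system memory bandwidth.
model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

# Generate as usual; the heavy matrix multiplications are dispatched to the NPU.
inputs = tokenizer("What is an NPU?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The appeal is exactly the point raised in the reply below: an integrated NPU works out of system RAM, so model size is bounded by comparatively cheap DRAM rather than by dedicated VRAM.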
-
I concur with your perspective; acquiring a 64GB DDR5 RAM module is far more feasible at present than obtaining a 64GB GPU. Incorporating NPU support therefore promises significant advantages for model inference compared to relying solely on GPU support.
-
If we did add support for NPUs, the code would go in llama.cpp, which we use for LLM inference. See these upstream issues:
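While those upstream issues are open, it may help to see why the change belongs there: applications drive llama.cpp through its bindings, so an NPU backend added upstream would be picked up without application-level changes. Here is a minimal sketch using the llama-cpp-python bindings; the GGUF path is a hypothetical local file:

```python
# Minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python).
# llama.cpp currently offloads layers to whichever accelerator backend it was
# built with (CUDA, Metal, SYCL, ...); an upstream NPU backend would slot in
# the same way. The model path below is a hypothetical local file.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/tinyllama-1.1b-chat.Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers to the available accelerator backend
)

result = llm("What is an NPU?", max_tokens=64)
print(result["choices"][0]["text"])
```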