Inference speed on mobile is a bit slow #928

Question

On a mobile device that does not support WebGPU, how can we improve the model's inference speed? I have tried a Web Worker, but the results were not satisfactory (a sketch of that setup follows below).
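For context, here is a minimal sketch of the kind of Web Worker setup being described, assuming the @xenova/transformers package; the sentiment-analysis task and file names are illustrative. Note that a worker keeps the page responsive but does not by itself make inference faster:

```js
// worker.js — run inference off the main thread (sketch; names are illustrative)
import { pipeline } from '@xenova/transformers';

// Load the model once; later messages reuse the same pipeline.
const classifierPromise = pipeline('sentiment-analysis');

self.onmessage = async (event) => {
  const classifier = await classifierPromise;
  const result = await classifier(event.data); // event.data is the input text
  self.postMessage(result);
};
```

```js
// main.js — send text to the worker and log the result
const worker = new Worker(new URL('./worker.js', import.meta.url), { type: 'module' });
worker.onmessage = (e) => console.log(e.data);
worker.postMessage('I love transformers.js!');
```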
Comments
Did you enable WASM SIMD and multi-threading? If not, you could give it a try with env.backends.onnx.wasm.simd = true and env.backends.onnx.wasm.numThreads = xxx (a reasonable value based on your core count).
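In code, the suggestion above would look roughly like this (a sketch assuming the @xenova/transformers env API; the pipeline task and thread cap are illustrative):

```js
import { env, pipeline } from '@xenova/transformers';

// Configure the ONNX Runtime WASM backend before creating any pipeline.
env.backends.onnx.wasm.simd = true;
// More threads is not always faster; cap at a small number on mobile.
env.backends.onnx.wasm.numThreads = Math.min(4, navigator.hardwareConcurrency || 1);

const classifier = await pipeline('sentiment-analysis');
```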
I believe multi-threading is already handled automatically: #882. I can't imagine SIMD not being the same?
Both are turned on, but there has been no real change.
@xenova Does SIMD have to be enabled manually?
Usually we'd see a performance change when playing with env.backends.onnx.wasm.numThreads (better or worse; a larger number doesn't always mean better). But to make multi-threading work, your web server needs to be cross-origin isolated (https://web.dev/articles/coop-coep). You can open your console and check whether crossOriginIsolated is true or false.
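For reference, cross-origin isolation is enabled by sending two response headers. A minimal sketch using a hypothetical Express static server (any server that sets these headers works; 'dist' is an assumed build output folder):

```js
import express from 'express';

const app = express();
app.use((req, res, next) => {
  // These two headers make the page cross-origin isolated, which is
  // required for SharedArrayBuffer and hence multi-threaded WASM.
  res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
  res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
  next();
});
app.use(express.static('dist')); // assumed build output folder
app.listen(8080);
```

Once the page is served this way, crossOriginIsolated in the browser console should report true.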
Thanks for the reminder. I just checked, and crossOriginIsolated is false. I will try changing it to see if it has any effect.