Build issues on Mac #1
Comments
If this PR is reviewed and merged, I'm happy to write up detailed installation steps for Mac within the README.
@Dimitrije-V Thanks, this PR works on my end. For those wondering which dependencies you need, you'll need:
brew install cmake boost
In terms of performance, my 14" 2021 M1 MacBook Pro returned a response via the API within about 20-30 seconds. It took several minutes to complete a request when used with fauxpilot in an existing Python project. I didn't investigate why. I assume it's due to the increased token count.
This is awesome, thank you @Dimitrije-V. I have merged your PR and really appreciate your contribution.
@dabdine thank you for reporting your performance. I'd also be interested to know which model you were using (2b or 6b or something else?) and how the
I opened issue #3 separately to track the slow completion for long inputs. I will also add some notes to the
Thanks again!
Great! I'll give it another shot today. Do you have any specific scenarios in mind for performance testing (number of threads, prompt, etc.)?
Thanks very much @dabdine. What I might do is write some proper benchmark scripts and standard prompts that can be tested after compiling, but for now could you try a short Python prompt, maybe something like the below:
import os
import json
def main():
    """this is the main function that opens the file and loads the json data"""
And a longer Python prompt (perhaps you can load the convert ggml script from the repo and go to the bottom of the file).
With the 2021 M1 MBP I believe you have 6 "performance" cores? So maybe try
For reference, I'm able to get sub-10-second generation on my AMD Ryzen 5000 for the first prompt and the 2B model with
From M2 Pro:
When I run with -t 12, it takes over 20 sec.
The results are wildly inconsistent though.
Thanks for sharing these results @tectiv3, certainly food for thought.
When building on Mac, it is not possible to use
cmake -D CMAKE_EXE_LINKER_FLAGS="-static" ..
As it returns:
Instead, we need to use
cmake ..
Furthermore, I needed to amend the file ggml/examples/codegen/CMakeLists.txt to explicitly find the boost headers, with:
Finally, I had to change a few lines in ggml/examples/codegen/serve.cpp to stop further build errors:
Line 54:
crow::json::wvalue response = {{"token","1"}, {"expires_at", 2600000000}, {"refresh_in",900}};
had to change to:
crow::json::wvalue response = {{"token","1"}, {"expires_at", static_cast<std::uint64_t>(2600000000)}, {"refresh_in",900}};
Line 191:
{"logprobs", NULL}
had to change to:
{"logprobs", nullptr}
Line 198:
{"prompt_tokens", embd_inp.size()},
had to change to:
{"prompt_tokens", static_cast<std::uint64_t>(embd_inp.size())},
Line 199:
{"total_tokens", n_past + embd_inp.size()}
had to change to:
{"total_tokens", static_cast<std::uint64_t>(n_past + embd_inp.size())}
Line 206:
{"created", std::time(NULL)},
had to change to:
{"created", static_cast<std::int64_t>(std::time(nullptr))},
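For anyone curious why the casts help, here is a minimal sketch of one plausible cause, which has not been confirmed in this thread. On macOS, std::size_t and std::time_t are based on long, while std::uint64_t and std::int64_t are based on long long, so a value type with only fixed-width integer overloads can reject such arguments as ambiguous. The Value struct and the helper functions below are hypothetical stand-ins written for illustration; they are not Crow's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <string>
#include <vector>

// Hypothetical stand-in for a JSON value type. It is NOT crow::json::wvalue;
// it only mimics a constructor set with fixed-width integer overloads.
struct Value {
    std::string kind;
    Value(std::nullptr_t) : kind("null") {}
    Value(int) : kind("int") {}
    Value(std::int64_t) : kind("int64") {}
    Value(std::uint64_t) : kind("uint64") {}
    Value(double) : kind("double") {}
};

// Each helper mirrors one of the edits above and reports which overload ran.

// The explicit cast names the overload, so the literal's own type is irrelevant.
std::string expires_at_kind() {
    return Value(static_cast<std::uint64_t>(2600000000)).kind;
}

// nullptr has type std::nullptr_t and never converts to an integer.
std::string logprobs_kind() {
    return Value(nullptr).kind;
}

// size() returns std::size_t, which is cast to a type that has an overload.
std::string prompt_tokens_kind(const std::vector<int>& embd_inp) {
    return Value(static_cast<std::uint64_t>(embd_inp.size())).kind;
}

// std::time returns std::time_t, cast here to a signed fixed-width type.
std::string created_kind() {
    return Value(static_cast<std::int64_t>(std::time(nullptr))).kind;
}
```

Without the casts, a call such as Value(embd_inp.size()) may compile on one platform and be rejected as ambiguous on another, depending on how the fixed-width typedefs are defined there. With the casts, the same overload is selected everywhere.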
After all of these changes, I was finally able to get this tool working. I've opened a PR with the fixes I made:
ravenscroftj/ggml#1