This repository has been archived by the owner on Sep 30, 2023. It is now read-only.

Build issues on Mac #1

Closed
Dimitrije-V opened this issue Apr 11, 2023 · 7 comments

Comments

Dimitrije-V (Author) commented Apr 11, 2023

When building on macOS, it is not possible to use cmake -D CMAKE_EXE_LINKER_FLAGS="-static" .., as the build fails with:

ld: library not found for -lcrt0.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[3]: *** [bin/codegen] Error 1
make[2]: *** [examples/codegen/CMakeFiles/codegen.dir/all] Error 2
make[1]: *** [examples/codegen

Instead, we need to use a plain cmake .. (macOS does not support fully statically linked executables, which is why the linker cannot find crt0.o).

Furthermore, I needed to amend ggml/examples/codegen/CMakeLists.txt so that it explicitly finds the Boost headers, by adding:

find_package(Boost REQUIRED)
include_directories(${Boost_INCLUDE_DIRS})

Finally, I had to change a few lines in ggml/examples/codegen/serve.cpp to stop further build errors:

Line 54: crow::json::wvalue response = {{"token","1"}, {"expires_at", 2600000000}, {"refresh_in",900}}; had to change to:
crow::json::wvalue response = {{"token","1"}, {"expires_at", static_cast<std::uint64_t>(2600000000)}, {"refresh_in",900}};

Line 191: {"logprobs", NULL} had to change to: {"logprobs", nullptr}

Line 198: {"prompt_tokens", embd_inp.size()}, had to change to: {"prompt_tokens", static_cast<std::uint64_t>(embd_inp.size())},

Line 199: {"total_tokens", n_past + embd_inp.size()} had to change to: {"total_tokens", static_cast<std::uint64_t>(n_past + embd_inp.size())}

Line 206: {"created", std::time(NULL)}, had to change to: {"created", static_cast<std::int64_t>(std::time(nullptr))},
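
For reference, all of these changes boil down to giving crow::json::wvalue unambiguous types: the 2600000000 literal overflows a 32-bit int, and the size_t / time_t expressions (plus NULL) apparently leave Apple clang without a clean conversion to pick. A minimal standalone sketch of the corrected constructions - not the actual layout of serve.cpp, and with the Crow include path and the embd_inp / n_past stand-ins assumed - looks something like this:

// Sketch only: mirrors the casts above, not the real serve.cpp structure.
// Assumes Crow's JSON header is available; serve.cpp may include crow_all.h instead.
#include <crow/json.h>
#include <cstdint>
#include <ctime>
#include <vector>

int main() {
    std::vector<int> embd_inp(29); // stand-in for the tokenized prompt
    int n_past = 0;                // stand-in for the past-token counter

    // Line 54: 2600000000 overflows a 32-bit int, so the width is spelled out.
    crow::json::wvalue auth = {
        {"token", "1"},
        {"expires_at", static_cast<std::uint64_t>(2600000000)},
        {"refresh_in", 900}};

    // Lines 191/198/199/206: nullptr instead of NULL, and explicit 64-bit casts
    // for the size_t / time_t expressions (unsigned long and long on 64-bit macOS).
    crow::json::wvalue usage = {
        {"logprobs", nullptr},
        {"prompt_tokens", static_cast<std::uint64_t>(embd_inp.size())},
        {"total_tokens", static_cast<std::uint64_t>(n_past + embd_inp.size())},
        {"created", static_cast<std::int64_t>(std::time(nullptr))}};

    return 0;
}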

After all of these changes, I was finally able to get this tool working. I've opened a PR with the fixes I made:
ravenscroftj/ggml#1

Dimitrije-V (Author) commented Apr 11, 2023

If this PR is reviewed and merged, I'm happy to write up detailed installation steps for Mac within the README.

dabdine commented Apr 12, 2023

@Dimitrije-V Thanks, this PR works on my end.

For those wondering about dependencies: you'll need cmake and Boost, which you can install with Homebrew:

brew install cmake boost

In terms of performance, my 14" 2021 M1 MacBook Pro returned a response via the API within about 20-30 seconds. When used with fauxpilot in an existing Python project, it took several minutes to complete a request; I didn't investigate why, but I assume it's due to the increased token count.

ravenscroftj (Owner) commented Apr 12, 2023

This is awesome, thank you @Dimitrije-V - I have merged your PR and really appreciate the contribution.

@dabdine thank you for reporting your performance numbers. I'd also be interested to know which model you were using (2B, 6B, or something else?) and how the -t switch affects performance on a Mac. GGML is fully ARM NEON compatible and should make good use of Apple silicon.

I opened issue #3 separately to track the slow completion for long inputs.

I will also add some notes to the BUILD.md with both of your observations.

Thanks again!

dabdine commented Apr 12, 2023

Great! I'll give it another shot today. Do you have any specific scenarios in mind for performance testing (number of threads, prompt, etc.)?

ravenscroftj (Owner) commented:

Thanks very much @dabdine. What I might do is write some proper benchmark scripts and standard prompts that can be run after compiling, but for now could I get you to try a short Python prompt - maybe something like the one below:

import os
import json

def main():
   """this is the main function that opens the file and loads the json data"""

And a longer Python prompt (perhaps you can load the convert ggml script from the repo and go to the bottom of the file).

I believe the 2021 M1 MBP has 6 "performance" cores, so maybe try -t 6 and see how that goes?

For reference I'm able to get sub 10 second generation on my AMD Ryzen 5000 for the first prompt and the 2B model with -t 6.

tectiv3 commented Apr 14, 2023

From an M2 Pro:

./bin/codegen -t 10 -m ../../models/codegen-6B-multi-ggml-4bit-quant-001.bin -p 'import os
import json

def main():
   """this is the main function that opens the file and loads the json data"""'
(22s 939ms)
main: seed = 1681443411
gptj_model_load: loading model from '../../models/codegen-6B-multi-ggml-4bit-quant-001.bin' - please wait ...
gptj_model_load: n_vocab = 51200
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 33
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 5269.92 MB
gptj_model_load: memory_size =  1056.00 MB, n_mem = 67584
gptj_model_load: ......................................... done
gptj_model_load: model size =  4213.84 MB / num tensors = 335
main: number of tokens in prompt = 29

import os
import json

def main():
   """this is the main function that opens the file and loads the json data"""

  path = 'test_data/output_data.json'
  file = open(path, 'r')
  data = json.load(file)

  #print data

  return data


if __name__ == '__main__':
  main()<|endoftext|>

main: mem per token = 18109648 bytes
main:     load time =  1242.47 ms
main:   sample time =     9.70 ms
main:  predict time =  8380.64 ms / 89.16 ms per token
main:    total time = 10096.66 ms

When I run with -t 12, it takes over 20 seconds.

./bin/codegen -t 8 -m ../../models/codegen-6B-multi-ggml-4bit-quant-001.bin -p 'import os
import json

def main():
   """this is the main function that opens the file and loads the json data"""'
(12s 872ms)
main: seed = 1681443538
gptj_model_load: loading model from '../../models/codegen-6B-multi-ggml-4bit-quant-001.bin' - please wait ...
gptj_model_load: n_vocab = 51200
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 33
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 5269.92 MB
gptj_model_load: memory_size =  1056.00 MB, n_mem = 67584
gptj_model_load: ......................................... done
gptj_model_load: model size =  4213.84 MB / num tensors = 335
main: number of tokens in prompt = 29

import os
import json

def main():
   """this is the main function that opens the file and loads the json data"""

    # open json file
    with open("/home/pi/raspi-weather/json/weather.json") as json_data:
        data = json.load(json_data)
        # print the data out
        print data

if __name__ == "__main__":
    main()
<|endoftext|>

main: mem per token = 18109616 bytes
main:     load time =  1242.58 ms
main:   sample time =    10.98 ms
main:  predict time =  4640.35 ms / 44.62 ms per token
main:    total time =  6138.13 ms

The results are wildly inconsistent though.

ravenscroftj (Owner) commented Apr 15, 2023

Thanks for sharing these results @tectiv3 - certainly food for thought.
