
CLI Testing Tool for Parsing Results to Standard Output #363

Open
wants to merge 12 commits into base: main
Conversation

@adreichert (Contributor) commented Aug 24, 2024

Summary

This PR adds a simple CLI tool that wraps the LlamaParse constructor.

It is intended to make testing easier: quickly parsing files to see the results, visually comparing different models, or inspecting the JSON output to see the available fields. Hopefully, it will also help people get started with the tool more quickly. It is not intended to be complete; I've included the options I vary most often.

  • The output JSON is typically passed to jq -r
  • I've been using it for the past several months.
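The jq step can be exercised on a stand-in payload. Note the field path `.[0].text` here is hypothetical; the actual JSON schema comes from the parser's output:

```shell
# Stand-in for the tool's JSON output; '.[0].text' is a hypothetical path.
echo '[{"text": "first page"}]' | jq -r '.[0].text'   # prints: first page
```

In practice the same filter is applied to the tool's real output, e.g. `python -m llama_parse.tool parse foo.pdf --result-type='json' | jq -r '<path>'`.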

I think others would benefit as well. I have a more complicated version that can fetch the results or status of past jobs by job ID; I'll add that functionality to this file if this PR is approved.
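For reference, the option surface shown in the help message below can be sketched with the standard library's `argparse`. This is a dependency-free sketch, not the PR's implementation: the actual tool is built with click, and the parsing body here is stubbed out.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Mirrors the options in the tool's --help output; the real tool is
    # built with click, but argparse keeps this sketch dependency-free.
    parser = argparse.ArgumentParser(prog="python -m llama_parse.tool parse")
    parser.add_argument("file", help="File to parse")
    parser.add_argument(
        "--api-key",
        default=os.environ.get("LLAMA_CLOUD_API_KEY"),
        help="Defaults to $LLAMA_CLOUD_API_KEY",
    )
    parser.add_argument("--vendor-multimodal-model-name")
    parser.add_argument("--vendor-multimodal-api-key")
    parser.add_argument("--invalidate-cache", action="store_true")
    parser.add_argument(
        "--result-type",
        default="markdown",
        choices=["markdown", "text", "json"],
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # The real tool forwards these values to the LlamaParse constructor
    # and writes the parsed result to stdout.
    print(args)
```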

Example Usage

python -m llama_parse.tool parse foo.pdf

Testing

Help Message

python -m llama_parse.tool parse --help

Usage: python -m llama_parse.tool parse [OPTIONS] FILE

  Parse the given file and output the result to STDOUT

  All supported arguments match those of the LlamaParse constructor. Please
  refer to the official documentation for more information.

Options:
  --api-key <api-key>             Defaults to $LLAMA_CLOUD_API_KEY
  --vendor-multimodal-model-name <model>
  --vendor-multimodal-api-key <vendor-api-key>
  --invalidate-cache
  --result-type <result-type>
  --help                          Show this message and exit.

Example Script

Everything parsed

python -m llama_parse.tool parse \
    $FILE \
    --invalidate-cache > ~/Desktop/a.md

python -m llama_parse.tool parse \
    $FILE \
    --result-type='text' \
    --invalidate-cache > ~/Desktop/a.txt

python -m llama_parse.tool parse \
    $FILE \
    --result-type='json' \
    --invalidate-cache > ~/Desktop/a.json

python -m llama_parse.tool parse \
    $FILE \
    --vendor-multimodal-model-name='openai-gpt4o' \
    --vendor-multimodal-api-key=$VENDOR_KEY \
    --invalidate-cache > ~/Desktop/b.md


python -m llama_parse.tool parse \
    $FILE \
    --result-type='json' \
    --api-key=$LLAMA_CLOUD_API_KEY \
    --vendor-multimodal-model-name='openai-gpt4o' \
    --vendor-multimodal-api-key=$VENDOR_KEY \
    --invalidate-cache > ~/Desktop/b.json


@adreichert adreichert changed the title CLI Tool for Parsing Results to Standard Output CLI Testing Tool for Parsing Results to Standard Output Aug 24, 2024
@adreichert adreichert marked this pull request as ready for review August 24, 2024 04:48
@logan-markewich (Contributor)

This looks great! Thanks for the contribution

A few notes:

  1. It looks like click wasn't added to the toml dependencies, so anyone using this will run into an import error if it isn't installed
  2. It seems like not every option is included (which is totally fine, there are tons). Maybe @hexapode can comment on whether any should be added
  3. We can actually make this usable on the command line without needing python -m -- check out this example: https://stackoverflow.com/questions/59286983/how-to-run-a-script-using-pyproject-toml-settings-and-poetry
  4. Let's add a section in the README for CLI usage
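The Stack Overflow approach in point 3 boils down to a Poetry scripts entry in pyproject.toml. A sketch, assuming the module exposes a `main` callable at `llama_parse.tool` (the entry-point path is an assumption about the module layout; the command name matches the one mentioned later in the thread):

```toml
# Hypothetical scripts entry; the "llama_parse.tool:main" path assumes
# the CLI module defines a main() callable.
[tool.poetry.scripts]
lp-tool = "llama_parse.tool:main"
```

After `poetry install`, the tool would then be invocable as `lp-tool parse foo.pdf`, without the `python -m` prefix.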

@adreichert (Contributor, Author) commented Sep 10, 2024

  • Updated README
  • Added click as a dependency
  • Added CLI command lp-tool

@adreichert (Contributor, Author)

@logan-markewich Any feedback?
