
add a basic bfcl command-line interface #621

Open · wants to merge 6 commits into base: main

Conversation

@mattf commented Sep 4, 2024

Add a simple CLI wrapping `openfunctions_evaluation.py` (`bfcl run`) and `eval_runner.py` (`bfcl evaluate`).

➜ bfcl
                                                                                                                    
 Usage: bfcl [OPTIONS] COMMAND [ARGS]...                                                                            
                                                                                                                    
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion            Install completion for the current shell.                                        │
│ --show-completion               Show completion for the current shell, to copy it or customize the installation. │
│ --help                -h        Show this message and exit.                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ run        Run one or more models on a test-category (same as openfunctions_evaluation).                         │
│ evaluate   Evaluate results from run of one or more models on a test-category (same as eval_runner).             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯


➜ bfcl run -h
                                                                                                                    
 Usage: bfcl run [OPTIONS]                                                                                          
                                                                                                                    
 Run one or more models on a test-category (same as openfunctions_evaluation).                                      
                                                                                                                    
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --model                           TEXT     A list of model names to evaluate.                                    │
│                                            [default: gorilla-openfunctions-v2]                                   │
│ --test-category                   TEXT     A list of test categories to run the evaluation on. [default: all]    │
│ --api-sanity-check        -c               Perform the REST API status sanity check before running the           │
│                                            evaluation.                                                           │
│ --temperature                     FLOAT    The temperature parameter for the model. [default: 0.001]             │
│ --top-p                           FLOAT    The top-p parameter for the model. [default: 1.0]                     │
│ --max-tokens                      INTEGER  The maximum number of tokens for the model. [default: 1200]           │
│ --num-gpus                        INTEGER  The number of GPUs to use. [default: 1]                               │
│ --timeout                         INTEGER  The timeout for the model in seconds. [default: 60]                   │
│ --num-threads                     INTEGER  The number of threads to use. [default: 1]                            │
│ --gpu-memory-utilization          FLOAT    The GPU memory utilization. [default: 0.9]                            │
│ --help                    -h               Show this message and exit.                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯


➜ bfcl evaluate -h
                                                                                                                    
 Usage: bfcl evaluate [OPTIONS]                                                                                     
                                                                                                                    
 Evaluate results from run of one or more models on a test-category (same as eval_runner).                          
                                                                                                                    
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --model                     TEXT  A list of model names to evaluate. [default: None] [required]               │
│ *  --test-category             TEXT  A list of test categories to run the evaluation on. [default: None]         │
│                                      [required]                                                                  │
│    --api-sanity-check  -c            Perform the REST API status sanity check before running the evaluation.     │
│    --help              -h            Show this message and exit.                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
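
For context, the boxed help text above is the default rendering of a Typer app. Below is a minimal sketch of what such a wrapper could look like; it is not the PR's actual code, and the module layout and the way options get forwarded to the two existing scripts are assumptions for illustration only.

```python
from typing import List

import typer

# Enable "-h" as an alias for "--help", matching the help output above.
app = typer.Typer(context_settings={"help_option_names": ["-h", "--help"]})


@app.command()
def run(
    model: List[str] = typer.Option(
        ["gorilla-openfunctions-v2"], help="A list of model names to evaluate."
    ),
    test_category: List[str] = typer.Option(
        ["all"], help="A list of test categories to run the evaluation on."
    ),
    temperature: float = typer.Option(0.001, help="The temperature parameter for the model."),
    top_p: float = typer.Option(1.0, help="The top-p parameter for the model."),
    max_tokens: int = typer.Option(1200, help="The maximum number of tokens for the model."),
):
    """Run one or more models on a test-category (same as openfunctions_evaluation)."""
    # Placeholder: forward the parsed options to openfunctions_evaluation's
    # argument handling; the exact entry point is an implementation detail.
    ...


@app.command()
def evaluate(
    model: List[str] = typer.Option(..., help="A list of model names to evaluate."),
    test_category: List[str] = typer.Option(
        ..., help="A list of test categories to run the evaluation on."
    ),
):
    """Evaluate results from a run of one or more models on a test-category (same as eval_runner)."""
    # Placeholder: forward the parsed options to eval_runner's argument handling.
    ...


if __name__ == "__main__":
    app()
```

With a wrapper of this shape, an invocation such as `bfcl run --model gorilla-openfunctions-v2 --test-category all` maps one-to-one onto the arguments the existing scripts already accept, and Typer provides the `--install-completion` / `--show-completion` options shown above for free.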
