Data visualization based on evaluation CSV files #296

Open
4 of 7 tasks
ruiAzevedo19 opened this issue Jul 30, 2024 · 1 comment

Comments

ruiAzevedo19 (Collaborator) commented Jul 30, 2024

Goal: create an HTML report with graphs for data visualization.
Tool: the D3.js library for the data visualization graphs.

TODO

  • Create a table for the evaluation CSV file

  • Scatter plot

    • Create a CSV file that stores the models' meta information, such as pricing and human-readable names
      • Extend the model interface with a MetaInformation function that returns a model's meta information
      • Write the meta information to a CSV file
    • Create a scatter plot of model cost versus score
  • Error bars over multiple runs, to show variance
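
A rough sketch of the data behind the scatter plot and error-bar items above, written in Python for brevity. It is not the project's implementation: the `run` and `cost` columns, the inline sample rows, and the `scatter_points` helper are assumptions made for illustration. Each model becomes one point with cost on the x-axis, mean score on the y-axis, and the sample standard deviation over its runs as the error-bar length.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical per-run totals: one row per model and run (values are made up).
RUNS_CSV = """model_id,run,score
model-a,1,100
model-a,2,110
model-a,3,90
model-b,1,50
model-b,2,50
"""

# Hypothetical meta information CSV: human-readable name and a single cost value.
META_CSV = """model_id,model_name,cost
model-a,Model A,2.5
model-b,Model B,0.5
"""


def scatter_points(runs_csv: str, meta_csv: str) -> list[dict]:
    """Return one point per model: cost, mean score, and score spread."""
    # Assumes every model in the runs file also has a meta information row.
    meta = {row["model_id"]: row for row in csv.DictReader(io.StringIO(meta_csv))}

    # Collect the scores of all runs per model.
    scores = defaultdict(list)
    for row in csv.DictReader(io.StringIO(runs_csv)):
        scores[row["model_id"]].append(float(row["score"]))

    points = []
    for model_id, values in sorted(scores.items()):
        # A single run has no spread, so the error bar collapses to zero.
        deviation = statistics.stdev(values) if len(values) > 1 else 0.0
        points.append({
            "model_id": model_id,
            "model_name": meta[model_id]["model_name"],
            "cost": float(meta[model_id]["cost"]),
            "score_mean": statistics.mean(values),
            "score_stdev": deviation,
        })
    return points
```

The resulting list of records could be serialized to JSON and bound to the D3.js plot, with `score_mean ± score_stdev` drawn as the error bar.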

@ruiAzevedo19 ruiAzevedo19 added the enhancement New feature or request label Jul 30, 2024
@ruiAzevedo19 ruiAzevedo19 added this to the v0.6.0 milestone Jul 30, 2024
@ruiAzevedo19 ruiAzevedo19 self-assigned this Jul 30, 2024
ruiAzevedo19 added a commit that referenced this issue Jul 30, 2024
… a CSV file, so it can be used for data visualization

Part of #296
ruiAzevedo19 added a commit that referenced this issue Jul 31, 2024
…e JSON response, to avoid these values being converted later on

Part of #296
ruiAzevedo19 added a commit that referenced this issue Jul 31, 2024
…der to the model package, since it is model related

Part of #296
ruiAzevedo19 added a commit that referenced this issue Jul 31, 2024
…e it can error if the file already exists

Part of #296
bauersimon (Member) commented Aug 1, 2024

Leaving this here until we have the summing logic in the visualization.

# script.sh <evaluation-without-extension> <meta-without-extension>

pip install csvkit

sed -i '1s/-/_/g' $1.csv # SQL does not like hyphens in column names.
sed -i '1s/-/_/g' $2.csv # SQL does not like hyphens in column names.

csvsql --query "SELECT model_id, language, SUM(score) AS score, SUM(coverage) AS coverage, SUM(files_executed) AS files_executed, SUM(files_executed_maximum_reachable) AS files_executed_maximum_reachable, SUM(generate_tests_for_file_character_count) AS generate_tests_for_file_character_count, SUM(processing_time) AS processing_time, SUM(response_character_count) AS response_character_count, SUM(response_no_error) AS response_no_error, SUM(response_no_excess) AS response_no_excess, SUM(response_with_code) AS response_with_code, SUM(tests_passing) AS tests_passing FROM $1 WHERE task NOT LIKE '%-symflower-fix' GROUP BY model_id, language" $1.csv > $1-by-language.csv

csvsql --query "SELECT model_id, SUM(score) AS score, SUM(CASE WHEN language = 'golang' THEN score ELSE 0 END) AS golang_score, SUM(CASE WHEN language = 'java' THEN score ELSE 0 END) AS java_score, SUM(CASE WHEN language = 'ruby' THEN score ELSE 0 END) AS ruby_score FROM $1 WHERE task NOT LIKE '%-symflower-fix' GROUP BY model_id" $1.csv > $1-by-language-score.csv

csvsql --query "SELECT $1.model_id, model_name, (completion + prompt + request) AS cost, SUM(score) AS score, SUM(coverage) AS coverage, SUM(files_executed) AS files_executed, SUM(files_executed_maximum_reachable) AS files_executed_maximum_reachable, SUM(generate_tests_for_file_character_count) AS generate_tests_for_file_character_count, SUM(processing_time) AS processing_time, SUM(response_character_count) AS response_character_count, SUM(response_no_error) AS response_no_error, SUM(response_no_excess) AS response_no_excess, SUM(response_with_code) AS response_with_code, SUM(tests_passing) AS tests_passing FROM $1 LEFT JOIN $2 ON $1.model_id = $2.model_id WHERE task NOT LIKE '%-symflower-fix' GROUP BY $1.model_id" $1.csv $2.csv > $1-total.csv

csvsql --query "SELECT model_id, task, SUM(score) AS score, SUM(coverage) AS coverage, SUM(files_executed) AS files_executed, SUM(files_executed_maximum_reachable) AS files_executed_maximum_reachable, SUM(generate_tests_for_file_character_count) AS generate_tests_for_file_character_count, SUM(processing_time) AS processing_time, SUM(response_character_count) AS response_character_count, SUM(response_no_error) AS response_no_error, SUM(response_no_excess) AS response_no_excess, SUM(response_with_code) AS response_with_code, SUM(tests_passing) AS tests_passing FROM $1 WHERE task NOT LIKE '%-symflower-fix' GROUP BY model_id, task" $1.csv > $1-by-task.csv

csvsql --query "SELECT model_id, task, language, SUM(score) AS score, SUM(coverage) AS coverage, SUM(files_executed) AS files_executed, SUM(files_executed_maximum_reachable) AS files_executed_maximum_reachable, SUM(generate_tests_for_file_character_count) AS generate_tests_for_file_character_count, SUM(processing_time) AS processing_time, SUM(response_character_count) AS response_character_count, SUM(response_no_error) AS response_no_error, SUM(response_no_excess) AS response_no_excess, SUM(response_with_code) AS response_with_code, SUM(tests_passing) AS tests_passing FROM $1 WHERE task NOT LIKE '%-symflower-fix' GROUP BY model_id, task, language" $1.csv > $1-by-task-by-language.csv

csvsql --query "SELECT model_id, SUM(CASE WHEN task NOT LIKE '%-symflower-fix' THEN score ELSE 0 END) AS score, SUM(CASE WHEN task LIKE '%-symflower-fix' THEN score ELSE 0 END) AS score_fix, SUM(CASE WHEN task NOT LIKE '%-symflower-fix' THEN files_executed ELSE 0 END) AS files_executed, SUM(CASE WHEN task LIKE '%-symflower-fix' THEN files_executed ELSE 0 END) AS files_executed_fix FROM $1 WHERE (task LIKE 'transpile%' OR task LIKE 'write-tests%') AND language = 'golang' GROUP BY model_id " $1.csv > $1-by-symflower-fix.csv
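
For reference, a minimal sketch of what the same summing could look like outside of SQL, here in Python with only the standard library. The sample rows and the `sum_scores` helper are hypothetical; only the column names and the `-symflower-fix` filter are taken from the queries above, and only `score` is summed to keep the example short.

```python
import csv
import io
from collections import defaultdict

# Made-up evaluation rows; only the column names mirror the queries above.
EVALUATION_CSV = """model_id,language,task,score
model-a,golang,write-tests,10
model-a,golang,write-tests-symflower-fix,12
model-a,java,write-tests,7
model-a,java,transpile,3
model-b,golang,write-tests,4
"""


def sum_scores(evaluation_csv: str, group_by: list[str]) -> dict:
    """Sum the score per group, skipping the "-symflower-fix" tasks."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(evaluation_csv)):
        # Mirrors: WHERE task NOT LIKE '%-symflower-fix'
        # (case-sensitive here, whereas SQLite's LIKE ignores ASCII case).
        if row["task"].endswith("-symflower-fix"):
            continue
        # Mirrors: GROUP BY <group_by columns>
        key = tuple(row[column] for column in group_by)
        totals[key] += float(row["score"])
    return dict(totals)


by_language = sum_scores(EVALUATION_CSV, ["model_id", "language"])
by_model = sum_scores(EVALUATION_CSV, ["model_id"])
```

Passing a different `group_by` list covers the per-task and per-task-per-language variants without a separate query for each.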

@ruiAzevedo19 ruiAzevedo19 modified the milestones: v0.6.0, v0.7.0 Aug 1, 2024
ruiAzevedo19 added a commit that referenced this issue Aug 2, 2024
… generic name, since it can sort all kinds of CSV records

Part of #296
Munsio pushed a commit that referenced this issue Aug 28, 2024
…e JSON response, to avoid these values being converted later on

Part of #296