Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make BFCL User-Friendly and Easy to Extend #510

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

devanshamin
Copy link
Contributor

@devanshamin devanshamin commented Jul 7, 2024

Thank you for open-sourcing BFCL and for your efforts in maintaining it. As I explored the codebase, I noticed some areas for improvement, including duplicate functions, constants, and variables that are inaccessible outside their functions, as well as a lack of abstract classes/methods.

To address these issues, I've begun refactoring the codebase to make it more straightforward to follow, customize, install, and extend. Although this work is still in progress, I wanted to share it with the contributors to seek feedback and gauge interest in reviewing or contributing to the PR. I understand that it’s a significant refactor and may require considerable time and effort, so any assistance or feedback would be greatly appreciated.

To-Do:

Refactor

  • bfcl package structure
  • benchmark and model_handler
  • evaluate and eval_checker
  • Update README
  • Update CHANGELOG

Test

  • Dependencies
  • benchmark -> OSS and Proprietary model handlers
  • evaluate

Goal: To make BFCL similar to lm-evaluation-harness but for function calling.

Please let me know your thoughts and if you would be interested in reviewing or contributing to this PR.

- Move `model_handler` to `bfcl/model_handler`
- Separate `oss` and `proprietary` model
- Move java and javascript parsers to `bfcl/parser`
- Standardize model handlers and remove duplicate methods
- Test data compilation handled by `bfcl/types.py:LeaderboardCategories.load_data` method
@ShishirPatil
Copy link
Owner

Hey @devanshamin Thank you so much for flagging and suggesting improvements! We agree with everything you mentioned. If it helps, one design decision we have adopted is that when in doubt, defer to simplicity and ease of code readability, given this is an OSS. re: landing the PR: We are absolutely delighted you want to contribute to the Berkeley Function Calling Leaderboard (BFCL) project, and will be absolutely on-board to review and land this PR! Welcome aboard mate!

@devanshamin
Copy link
Contributor Author

Hey @devanshamin Thank you so much for flagging and suggesting improvements! We agree with everything you mentioned. If it helps, one design decision we have adopted is that when in doubt, defer to simplicity and ease of code readability, given this is an OSS. re: landing the PR: We are absolutely delighted you want to contribute to the Berkeley Function Calling Leaderboard (BFCL) project, and will be absolutely on-board to review and land this PR! Welcome aboard mate!

Awesome! I'm glad to hear you're on board. Keeping the theme of simplicity in mind, I was thinking of coming up with a detailed plan for the refactor and getting your feedback on it. @HuanzhiMao reached out to me, and we are planning on setting up a Zoom call. During the call, I can go over the changes that I have made and the plan, and hear your thoughts and feedback on how to move forward. After the meeting, I can write up a draft with next steps which we can track over here.

- poetry build system is no longer used
- To allow for separate dependencies for oss and proprietary model
- test category is already added to each example during loading the data
@devanshamin devanshamin force-pushed the refactor_bfcl branch 2 times, most recently from 4d8a86c to 58a0648 Compare July 9, 2024 01:05
- Use same test groups for benchmarking and evaluation
- Add a custom enum class with intuitive methods to dynamically create test groups
- Use custom enum to reduce manually creation of test groups
- Update benchmark cli args to accept test group argument
- Add pydantic validator to validate test group and test categories
@devanshamin
Copy link
Contributor Author

Here is an article outlining steps on merging this PR - #521

- Load original json test data files
- Add `id` and `test_category` keys to each example
- Save model responses for each test category in a separate file
- Single cli entrypoint with subcommands to run benchmark and evaluation
@HuanzhiMao HuanzhiMao changed the base branch from main to dev/huanzhi July 14, 2024 01:50
@HuanzhiMao HuanzhiMao changed the base branch from dev/huanzhi to main July 15, 2024 06:16
ShishirPatil pushed a commit that referenced this pull request Aug 27, 2024
This PR aims to improve the organization and distribution of the
codebase by packaging the BFCL codebase. This PR is part of a series of
changes that break down the tasks outlined in #510.

---------

Co-authored-by: Huanzhi Mao <[email protected]>
ShishirPatil pushed a commit that referenced this pull request Sep 11, 2024
This PR reorganizes the model handler by splitting it into two distinct
components: an Open Source (OSS) model handler and a Proprietary model
handler. This change is part of a series of updates that address the
tasks outlined in issue #510.

---------

Co-authored-by: Huanzhi Mao <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants