
[BFCL] Adding actionGemma model handler #610

Open · wants to merge 5 commits into main
Conversation

kishoreKunisetty

ActionGemma is a fine-tuned model for function calling based on Gemma 2 9B. This PR adds a model handler for it.

@HuanzhiMao HuanzhiMao added the BFCL-New Model Add New Model to BFCL label Aug 27, 2024
@HuanzhiMao
Collaborator

Thank you for the PR and welcome! We’re currently busy with a new dataset release. I’ll review your submission later this week and aim to provide feedback by next Monday. Apologies for the delay.


class ActionGemmaHandler(OSSHandler):
    def __init__(self, model_name, temperature=0.001, top_p=1, max_tokens=512, dtype="bfloat16") -> None:

According to the Hugging Face model card, the model is supposed to be run with float32 instead of bfloat16.
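A minimal sketch of the suggested change, keeping the constructor signature from the diff above. The `OSSHandler` base class here is a stand-in stub (BFCL's real base class isn't shown in this thread), so only the default `dtype` is the point:

```python
class OSSHandler:
    """Stand-in stub for BFCL's OSS model base class (not the real implementation)."""

    def __init__(self, model_name, temperature, top_p, max_tokens, dtype) -> None:
        self.model_name = model_name
        self.temperature = temperature
        self.top_p = top_p
        self.max_tokens = max_tokens
        self.dtype = dtype


class ActionGemmaHandler(OSSHandler):
    # Default to float32, per the Hugging Face model card, instead of bfloat16.
    def __init__(self, model_name, temperature=0.001, top_p=1, max_tokens=512, dtype="float32") -> None:
        super().__init__(model_name, temperature, top_p, max_tokens, dtype)
```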

Comment on lines +63 to +68
content = f"\n{TASK_INSTRUCTION}\n<end_of_turn>\n"
# content += f"{FORMAT_INSTRUCTION}\n<end_of_turn>\n\n"
content += "<unused0>\n" + json.dumps(tools) + "\n<unused1>\n\n"

content += f"<start_of_turn>user\n{query}<end_of_turn>\n\n"
return SYSTEM_PROMPT + f"\n{content}\n<start_of_turn>assistant"

From the model card:

sample response from applied chat template

<bos>
      <start_of_turn>system
You are an expert in composing functions. You are given a question and a set of possible functions. 
Based on the question, you will need to make one or more function/tool calls to achieve the purpose. 
If none of the functions can be used, point it out and refuse to answer. 
If the given question lacks the parameters required by the function, also point it out.<end_of_turn>

      <start_of_turn>user
अमेरिका के राष्ट्रपति कौन है? (Who is the president of the United States?)<end_of_turn>

        <unused0>
[{"name": "get_weather", "description": "Get the current weather for a location", "parameters": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, New York"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature to return"}}},  {"name": "search", "description": "Search for information on the internet", "parameters": {"query": {"type": "string", "description": "The search query, e.g. 'latest news on AI'"}}}]<unused1>

  <start_of_turn>assistant

This seems different from the one formatted here, especially the SYSTEM_PROMPT.
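For reference, a minimal sketch of a prompt builder that follows the ordering in the model-card excerpt above (system turn, user turn, tool list between `<unused0>`/`<unused1>`, then the assistant turn). The function name `format_prompt` and the `SYSTEM_INSTRUCTION` constant are illustrative, not the PR's actual identifiers; the system text is copied from the excerpt:

```python
import json

# System prompt text taken verbatim from the model-card excerpt above.
SYSTEM_INSTRUCTION = (
    "You are an expert in composing functions. You are given a question and a set "
    "of possible functions. Based on the question, you will need to make one or "
    "more function/tool calls to achieve the purpose. If none of the functions "
    "can be used, point it out and refuse to answer. If the given question lacks "
    "the parameters required by the function, also point it out."
)


def format_prompt(query: str, tools: list) -> str:
    """Build a prompt mirroring the model-card chat template: system turn,
    user turn, tool list between <unused0>/<unused1>, then the assistant turn."""
    prompt = "<bos>\n"
    prompt += f"<start_of_turn>system\n{SYSTEM_INSTRUCTION}<end_of_turn>\n\n"
    prompt += f"<start_of_turn>user\n{query}<end_of_turn>\n\n"
    prompt += f"<unused0>\n{json.dumps(tools)}<unused1>\n\n"
    prompt += "<start_of_turn>assistant\n"
    return prompt
```

Note the tool list comes after the user turn here, whereas the PR's version places it before; that ordering difference is part of the inconsistency being flagged.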


@HuanzhiMao (Collaborator) left a comment


Could you resolve the inconsistency with the Hugging Face model card?
