Added new Sequencing datasets: API BLEND EVAL #548

Open: sadhana01 wants to merge 4 commits into main


sadhana01 commented Jul 24, 2024

The dataset consists of SeqATIS, SeqSNIPS, SeqSGD, SeqMultiWOZ, and SeqTopV2.
Added two versions, sequencing and sequencing lite, each with its licenses.
Added the code changes pertaining to the two datasets.

The full SeqTopV2 dataset is missing because it is too big, and Git LFS on github.com does not currently support pushing LFS objects to public forks. (To be discussed.)

sadhana01 changed the title from "Added new Sequencing datasets: SeqATIS, SeqSNIPS, SeqSGD, SeqMultiWOZ,SeqTopV2" to "Added new Sequencing datasets: API BLEND EVAL" on Jul 24, 2024
Owner

@ShishirPatil ShishirPatil left a comment

Thanks for the PR @sadhana01

  1. There are several different license files in the PR :) This is one too many. Also, BFCL and the entire Gorilla project are already under a license that applies to the entire directory, so maybe we don't need these files?
  2. Maybe we can move the data files (JSONs) from each category into the data/ sub-directory? We have organized all the data files into a single directory. The good thing is that the files already have names like SeqATIS, so just moving them there should be fine?
  3. I'll review the rest one at a time, since there are quite a few of them. Done reviewing.

Author

sadhana01 commented Jul 25, 2024

@ShishirPatil The datasets come under different licenses (3 different ones). I had initially put all the data files under the data folder. However, I was advised by lawyers to include the license for each dataset, which is why I created subfolders for them. We can discuss the best way forward.
