Overarching Plan (to MVP / version 1) #11

Open
57 of 63 tasks
dcrescim opened this issue Oct 11, 2021 · 4 comments


@dcrescim
Collaborator

dcrescim commented Oct 11, 2021

Hey Folks!
I thought it might be a bit easier if we had one issue that tracks the current "state of the world".
It lists all the Estimators/Functions; each one has a person's name next to it if someone is working on it, and is checked off once it is complete and merged into dev.

Ping me in the comments below and I'll add you to whichever estimators you want to work on.

I went through the scikit-learn docs yesterday and broke out the Estimators that we would need for an MVP of scikit.js (let's call it version 1).

Version 1

The focus here is on simple models, plus all the preprocessing and metrics that you'd need to build high-quality models.

linear_model

  • LinearRegression
  • LassoRegression
  • RidgeRegression
  • ElasticNet
  • LogisticRegression
  • SGDClassifier
  • SGDRegressor

cluster

  • KMeans

neighbors

dummy

  • DummyClassifier
  • DummyRegressor

impute

  • SimpleImputer

preprocessing

  • StandardScaler
  • MinMaxScaler
  • MaxAbsScaler
  • Normalizer
  • RobustScaler
  • LabelEncoder
  • OneHotEncoder
  • OrdinalEncoder

pipeline

  • Pipeline

compose

  • ColumnTransformer

tree

metrics

  • accuracyScore
  • confusionMatrix
  • hingeLoss
  • logLoss
  • precisionScore
  • recallScore
  • rocAucScore
  • zeroOneLoss
  • meanAbsoluteError
  • meanSquaredError
  • meanSquaredLogError
  • r2Score

So pick whichever ya want, and ping me, and I'll update the issue and put your name next to the Estimator / Function.

Some great resources for contributors

Hello folks! Time flies when you're having fun :)
We are rounding the corner toward completion of the MVP / Version 1 list above. I thought it would be good to go through scikit-learn and make a list of the next most important things. That list is below, along with some general todos (docs, tutorials). Feel free to ping me or comment below and grab whatever interests you from the following list.

Onward and Upward!

linear_model

  • Exact solution for linear_regression
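
For anyone picking up the exact-solution item, here is a minimal sketch of what it could look like, assuming "exact solution" means solving the normal equations (XᵀX)w = Xᵀy directly instead of running gradient descent. The function names `solveLinearSystem` and `fitLinearRegressionExact` are hypothetical, not part of the scikit.js API, and plain arrays stand in for whatever tensor type the library actually uses.

```typescript
// Solve A x = b with Gaussian elimination and partial pivoting.
// Hypothetical helper for illustration only.
function solveLinearSystem(A: number[][], b: number[]): number[] {
  const n = b.length;
  // Work on an augmented copy [A | b] so the inputs are not mutated.
  const M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    // Pick the row with the largest absolute value in this column.
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("Matrix is singular or nearly singular");
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    // Eliminate the entries below the pivot.
    for (let r = col + 1; r < n; r++) {
      const factor = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= factor * M[col][c];
    }
  }
  // Back substitution on the upper-triangular system.
  const x = new Array<number>(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let sum = M[i][n];
    for (let j = i + 1; j < n; j++) sum -= M[i][j] * x[j];
    x[i] = sum / M[i][i];
  }
  return x;
}

// Ordinary least squares via the normal equations (X^T X) w = X^T y.
// A column of ones is prepended so the first weight is the intercept.
function fitLinearRegressionExact(
  X: number[][],
  y: number[]
): { coef: number[]; intercept: number } {
  const Xb = X.map((row) => [1, ...row]);
  const d = Xb[0].length;
  const XtX = Array.from({ length: d }, () => new Array<number>(d).fill(0));
  const Xty = new Array<number>(d).fill(0);
  for (let i = 0; i < Xb.length; i++) {
    for (let a = 0; a < d; a++) {
      Xty[a] += Xb[i][a] * y[i];
      for (let k = 0; k < d; k++) XtX[a][k] += Xb[i][a] * Xb[i][k];
    }
  }
  const w = solveLinearSystem(XtX, Xty);
  return { intercept: w[0], coef: w.slice(1) };
}
```

Fitting `X = [[0], [1], [2], [3]]` against `y = [1, 3, 5, 7]` should recover an intercept of roughly 1 and a single coefficient of roughly 2. The normal equations get numerically shaky when features are nearly collinear, so a QR or SVD based solver would probably be the safer choice for a real implementation.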

datasets

naive_bayes

svm

  • LinearSVC
  • LinearSVR
  • SVC
  • SVR

model_selection

decomposition

  • PCA

hyper_parameter

ensemble

  • VotingRegressor
  • VotingClassifier
  • RandomForestClassifier
  • RandomForestRegressor

docs

  • Make Basic Docs site
  • Push the Basic Docs site to scikit.org. Have scikit.js redirect to scikit.org
  • Make Basic Docs site show api for all functions / classes that we export
  • Make it build browser and node versions
  • Make the tests run against browser and node environments
@risenW
Member

risenW commented Oct 13, 2021

Thanks for creating this, @dcrescim. I'll add some more features we may need as well.

@DirkToewe
Collaborator

One thing we should also be working towards is showing off the strength of machine learning in the browser: interactivity. We should build some kind of playground, similar to the TensorFlow Playground.

@risenW
Member

risenW commented Jan 18, 2022

> One thing we should also be working towards is showing off the strength of machine learning in the browser: interactivity. We should build some kind of playground, similar to the TensorFlow Playground.

+1 on this from me. Another suggestion: we could have a bunch of drag-and-drop/select features where users can upload sample data, select an ML algorithm we support, and then run training and predictions on it.

cc @dcrescim @yawetse @Lewuathe @steveoni

@dcrescim
Collaborator Author

I totally agree with this. I wonder if there is a way that we can support this on our docs site.
Just have a page at scikitjs.org/playground that is all set up to mess around with data.
That makes it easier for us to build the playground as part of this git repo, so it is free/easy to deploy.

Couldn't agree more with the ideas above @DirkToewe @risenW

3 participants