Changelog

All notable changes to this project will be documented in this file. The format is based on Keep a Changelog.

[Unreleased]

Added

  • Added a benchmark script to compare PyTorch Frame with PyTorch Tabular (#398, #444)
  • Added is_floating_point method to MultiNestedTensor and MultiEmbeddingTensor (#445)
  • Added support for inferring stype.categorical from boolean columns in utils.infer_series_stype (#421)

Changed

  • Set weights_only=True in torch_frame.load when running on PyTorch 2.4 or later (#423)
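
For context on the weights_only change: in PyTorch, `torch.load(..., weights_only=True)` restricts unpickling to an allow-list of types instead of letting the pickle stream resolve arbitrary globals. The sketch below illustrates that general idea using only the standard library. It is not PyTorch's or torch_frame's implementation; `SAFE_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical names chosen for this example.

```python
import io
import pickle

# Hypothetical allow-list for illustration only; PyTorch's real list differs.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        # find_class is called whenever the stream references a global
        # (a class or function). Plain dicts, lists, ints and strings are
        # built from opcodes and never reach this hook.
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is not allowed")


def restricted_loads(data: bytes):
    """Deserialize `data`, rejecting globals outside SAFE_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

A payload that references an allow-listed type loads normally, while one that references any other callable raises `pickle.UnpicklingError` before that callable is ever resolved.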

Deprecated

Removed

Fixed

  • Fixed size mismatch RuntimeError in transforms.CatToNumTransform (#446)
  • Removed CUDA synchronizations from nn.LinearEmbeddingEncoder (#432)
  • Removed CUDA synchronizations from N/A imputation logic in nn.StypeEncoder (#433, #434)

[0.2.3] - 2024-07-08

Added

  • Added MovieLens 1M dataset (#397)
  • Added light-weight MLP (#372)
  • Added R^2 metric (#403)
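
As a reminder of what the R^2 entry refers to, the coefficient of determination is conventionally defined as 1 - SS_res / SS_tot. The following is a minimal pure-Python sketch of that textbook formula. The function name `r2_score` is an assumption made for this example, and the project's own metric may be computed differently (for instance, through a tensor library).

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    # Note: ss_tot is zero when all targets are identical, in which case
    # this sketch raises ZeroDivisionError rather than picking a convention.
    return 1.0 - ss_res / ss_tot
```

A perfect prediction yields 1.0, and always predicting the target mean yields 0.0, so the score reads as the improvement over a mean-only baseline.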

Changed

  • Updated ExcelFormer implementation and related scripts (#391)

[0.2.2] - 2024-03-04

Added

  • Avoided a for-loop in EmbeddingEncoder (#366)
  • Added image_embedded and one tabular image dataset (#344)
  • Added benchmarking suite for encoders (#360)
  • Added dataframe text benchmark script (#354, #367)
  • Added DataFrameTextBenchmark dataset (#349)
  • Added support for empty TensorFrame (#339)

Changed

  • Changed the workflow of Encoder's na_forward method, resulting in a performance boost (#364)
  • Removed ReLU applied in FCResidualBlock (#368)

Deprecated

Removed

Fixed

  • Fixed bug in empty MultiNestedTensor handling (#369)
  • Fixed the split of DataFrameTextBenchmark (#358)
  • Fixed empty MultiNestedTensor col indexing (#355)

[0.2.1] - 2024-01-16

Added

  • Support more stypes in LinearModelEncoder (#325)
  • Added stype_encoder_dict to some models (#319)
  • Added HuggingFaceDatasetDict (#287)

Changed

  • Supported decoder embedding model in examples/transformers_text.py (#333)
  • Removed implicit clones in StypeEncoder (#286)

Deprecated

Removed

Fixed

  • Fixed TimestampEncoder not applying CyclicEncoder to cyclic features (#311)
  • Fixed NaN masking in multicategorical stype (#307)
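
To illustrate why cyclic features such as hour-of-day or month benefit from a cyclic encoding, here is a small sketch of the usual sine/cosine mapping. It is a generic illustration under stated assumptions, not the library's CyclicEncoder: the helper `cyclic_encode` and its `period` parameter are hypothetical names for this example.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * (value / period)
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hours 23 and 0 are neighbours on the clock, and they stay close after
# encoding, whereas the raw values 23 and 0 are far apart.
midnight = cyclic_encode(0, 24)
late_evening = cyclic_encode(23, 24)
noon = cyclic_encode(12, 24)
```

With this mapping, the distance between `late_evening` and `midnight` is smaller than the distance between `noon` and `midnight`, which a raw integer feature would get wrong.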

[0.2.0] - 2023-12-15

Added

  • Added support for Boolean masks in index_select of _MultiTensor (#334)
  • Added more text documentation (#291)
  • Added col_to_model_cfg (#270)
  • Support saving/loading of GBDT models (#269)
  • Added documentation on handling different stypes (#271)
  • Added TimestampEncoder (#225)
  • Added LightGBM (#248)
  • Added time columns to the MultimodalTextBenchmark (#253)
  • Added CyclicEncoding (#251)
  • Added PositionalEncoding (#249)
  • Added optional col_names argument in StypeEncoder (#247)
  • Added col_to_text_embedder_cfg and use MultiEmbeddingTensor for text_embedded (#246)
  • Added col_encoder_dict in StypeWiseFeatureEncoder (#244)
  • Added LinearEmbeddingEncoder for embedding stype (#243)
  • Added support for torch_frame.text_embedded in GBDT (#239)
  • Support Metric in GBDT (#236)
  • Added auto-inference of stype (#221)
  • Enabled list input in multicategorical stype (#224)
  • Added Timestamp stype (#212)
  • Added multicategorical to MultimodalTextBenchmark (#208)
  • Added support for saving and loading of TensorFrame with complex stypes (#197)
  • Added stype.embedding (#194)
  • Added TensorFrame concatenation of complex stypes (#190)
  • Added text_tokenized example (#174)
  • Added Cohere embedding example (#186)
  • Added AmazonFineFoodReviews dataset and OpenAI embedding example (#182)
  • Added save and load logic for FittableBaseTransform (#178)
  • Added MultiEmbeddingTensor (#181, #193, #198, #199, #217)
  • Added to_dense() for MultiNestedTensor (#170)
  • Added example for multicategorical stype (#162)
  • Added sequence_numerical stype (#159)
  • Added MultiCategoricalEmbeddingEncoder (#155)
  • Added advanced indexing for MultiNestedTensor (#150, #161, #163, #165)
  • Added multicategorical stype (#128, #151)
  • Added MultiNestedTensor (#149)

Changed

  • Set stype.embedding as the parent of stype.text_embedded and unified stype.text_embedded with its parent in tensor_frame (#277)
  • Renamed torch_frame.stype module to torch_frame._stype (#275)
  • Renamed text_tokenized_cfg into col_to_text_tokenized_cfg (#257)
  • Made Trompt output 2-dim embeddings in forward
  • Renamed text_embedder_cfg into col_to_text_embedder_cfg

Removed

  • No manual passing of in_channels to LinearEmbeddingEncoder for stype.text_embedded (#222)

[0.1.0] - 2023-10-23

Added

  • Added basic text_tokenized (#157)
  • Added Mercari dataset (#123)
  • Added the model performance benchmark script (#114)
  • Added DataFrameBenchmark (#107)
  • Added concat and equal ops for TensorFrame (#100)
  • Use ROC-AUC for binary classification in GBDT (#98)
  • Infer task_type in dataset (#97)
  • Added text_embedded example (#95)
  • Added MultimodalTextBenchmark (#92, #117)
  • Renamed x_dict to feat_dict in TensorFrame (#86)
  • Added TabTransformer example (#82)
  • Added TabNet example (#85)
  • Added dataset tensorframe and col_stats caching (#84)
  • Added TabTransformer (#74)
  • Added TabNet (#35)
  • Added text embedded stype, mapper and encoder (#78)
  • Added ExcelFormer example (#46)
  • Added support for inductive DataFrame to TensorFrame transformation (#75)
  • Added CatBoost baseline and tuned CatBoost example (#73)
  • Added na_strategy as argument in StypeEncoder (#69)
  • Added NAStrategy class and impute NaN values in MutualInformationSort (#68)
  • Added XGBoost baseline and updated tuned XGBoost example (#57)
  • Added CategoricalCatBoostEncoder and MutualInformationSort transforms needed by ExcelFormer (#52)
  • Added tutorial example script (#54)
  • Added ResNet (#48)
  • Added ExcelFormerEncoder (#42)
  • Made FTTransformer take TensorFrame as input (#45)
  • Added Trompt example (#39)
  • Added post_module in StypeEncoder (#43)
  • Added FTTransformer (#40, #41)
  • Added ExcelFormer (#26)
  • Added Yandex collections (#37)
  • Added TabularBenchmark collections (#33)
  • Added the Bank Marketing dataset (#34)
  • Added the Mushroom, Forest Cover Type, and Poker Hand datasets (#32)
  • Added PeriodicEncoder (#31)
  • Added NaN handling in StypeEncoder (#28)
  • Added LinearBucketEncoder (#22)
  • Added Trompt (#25)
  • Added TromptDecoder (#24)
  • Added TromptConv (#23)
  • Added StypeWiseFeatureEncoder (#16)
  • Added indexing/shuffling and column select functionality in Dataset (#18, #19)
  • Added Adult Census Income dataset (#17)
  • Added column-level statistics and dataset materialization (#15)
  • Added FTTransformerConvs (#12)
  • Added DataLoader capabilities (#11)
  • Added TensorFrame.index_select (#10)
  • Added Dataset.to_tensor_frame (#9)
  • Added base classes TensorEncoder, FeatureEncoder, TableConv, Decoder (#5)
  • Added TensorFrame (#4)
  • Added Titanic dataset (#3)
  • Added Dataset base class (#3)