Releases: huggingface/optimum
v1.26.1: Patch release
Add back from_transformers for base model by @echarlaix in #2288
v1.26.0: ColPali, D-FINE, InternLM2
ONNX export
- D-FINE support by @xenova in #2249
- ColPali support by @Balladie in #2251
- InternLM2 support by @gmf14 in #2244
- Chinese CLIP support by @xenova in #1591
- Qwen3 support by @IlyasMoutawwakil in #2278
New features & enhancements
- Add onnxslim support by @inisis in #2258
- Introduce ORTSessionMixin and enable general io binding by @IlyasMoutawwakil in #2234
- Fix and uniformize hub kwargs by @IlyasMoutawwakil in #2276
- Add compatibility with transformers 4.52 by @echarlaix in #2270
- Distribute and complete onnxruntime tests (decoder models) by @IlyasMoutawwakil in #2278
- Add ONNX Runtime optimization support for ModernBERT by @amas0 in #2208
New Contributors
v1.25.3: Patch release
- Fix ORT pipelines by @echarlaix in #2274
Full Changelog: v1.25.2...v1.25.3
v1.25.2: Patch release
What's Changed
- Upgrade optimum-intel in setup extras by @echarlaix in #2271
- Match transformers behavior with return_dict by @IlyasMoutawwakil in #2269
Full Changelog: v1.25.1...v1.25.2
v1.25.1: Patch release
What's Changed
- Updated readme/pypi page by @IlyasMoutawwakil in #2268
- Fix bug ORTModelForFeatureExtraction by @Abdennacer-Badaoui in #2267
- Fix doc TPU section by @echarlaix in #2265
Full Changelog: v1.25.0...v1.25.1
v1.25.0: ViTPose, RT-DETR, EfficientNet, Moonshine ONNX
🚀 New Features & Enhancements
- Add ONNX export support for ViTPose, RT-DETR, EfficientNet, Moonshine
- Infer if the model needs to be exported to ONNX during loading

```diff
  from optimum.onnxruntime import ORTModelForCausalLM

  model_id = "meta-llama/Llama-3.2-1B"
- model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
+ model = ORTModelForCausalLM.from_pretrained(model_id)
```
- Transformers v4.49, v4.50 and v4.51 compatibility
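The export inference above can be sketched with a minimal, stdlib-only helper. This is a hypothetical illustration, not Optimum's actual implementation (the real logic also handles Hub repos, subfolders, and file-name patterns): loading checks whether ONNX weights already exist and only triggers an export when they don't.

```python
from pathlib import Path


def needs_onnx_export(model_dir: str) -> bool:
    """Return True when the checkpoint contains no ONNX weights yet.

    Hypothetical sketch of the check Optimum now performs at load time:
    if a .onnx file is already present, loading can skip the export step.
    """
    return not any(Path(model_dir).rglob("*.onnx"))
```

In a sketch like this, `from_pretrained` would call the ONNX exporter only when `needs_onnx_export(...)` returns true, which is why passing `export=True` by hand is no longer required.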
👥 New Contributors
A huge thank you to our first-time contributors:
- @ruidazeng
- @ariG23498
- @janak2
- @qubvel
- @zhxchen17
- @xieofxie
- @EFord36
- @Thas-Tayapongsak
- @hans00
- @Abdennacer-Badaoui
What's Changed
- Update ort training installation instructions by @echarlaix in #2173
- Dev version by @echarlaix in #2175
- Fixed All Typos in docs by @ruidazeng in #2185
- Remove deprecated ORTModel class by @echarlaix in #2187
- Avoid library_name guessing if it is known in parameter standardization by @eaidova in #2179
- Infer whether a model needs to be exported to ONNX or not by @echarlaix in #2181
- Update optimum neuron extra by @dacorvo in #2190
- Add support for Moonshine ONNX export (& seq2seq models with non-legacy cache & `Tensor.repeat_interleave`) by @xenova in #2162
- ViTPose by @ariG23498 in #2183
- ViTPose export fix by @echarlaix in #2192
- Remove ORTTrainer code snippet from README by @echarlaix in #2194
- Remove README code snippets by @echarlaix in #2195
- Add transformers v4.49 support by @echarlaix in #2191
- Fix test benchmark suite by @echarlaix in #2199
- fix the onnx export custom model example; fix repo name; fix opset version; remove deprecated arg; by @janak2 in #2203
- Limit transformers version for bettertransformer support by @echarlaix in #2198
- Add ONNX config for RT-DETR (and RT-DETRv2) by @qubvel in #2201
- Remove deprecated notebook by @echarlaix in #2205
- Update CI runner to ubuntu 22.04 by @echarlaix in #2206
- Add executorch documentation section by @echarlaix in #2193
- Fix typo in exporters/onnx/utils.py by @zhxchen17 in #2210
- Link Optimum-ExecuTorch to parent Optimum on Hub by @guangy10 in #2222
- Fix CI and update Transformers (4.51.1) by @IlyasMoutawwakil in #2225
- Remove FP16_Optimizer patch for DeepSpeed by @Rohan138 in #2213
- Fix diffusers by @IlyasMoutawwakil in #2229
- Remove diffusers extra by @echarlaix in #2207
- TRT engine docs by @IlyasMoutawwakil in #1396
- Always use a default user agent by @IlyasMoutawwakil in #2230
- dedup _get_model_external_data_paths by @xieofxie in #2217
- Clean up workflows by @IlyasMoutawwakil in #2231
- Reduce area of patch_everywhere to avoid unexpected replacements by @eaidova in #2237
- add dinov2 onnx optimizer support by @EFord36 in #2227
- Fix code quality test by @echarlaix in #2239
- Add onnx export for efficientnet by @Thas-Tayapongsak in #2214
- add loading image processor by @eaidova in #2254
- Fix `CLIPSdpaAttention`, dropped since v4.48, by @hans00 in #2245
- Increase clip opset by @echarlaix in #2256
- Add feature extraction support for image models by @Abdennacer-Badaoui in #2255
- adding token classification task for qwen2 by @Abdennacer-Badaoui in #2261
- upgrade min transformers version for phi3 by @echarlaix in #2263
v1.24.0: SD3 & Flux, DinoV2, Modernbert, GPTQModel, Transformers v4.48...
Release Notes: Optimum v1.24.0
We’re excited to announce the release of Optimum v1.24.0. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.
🚀 New Features & Enhancements
- `ORTQuantizer` now supports models with ONNX subfolders.
- ONNX Runtime IO Binding support for all supported Transformers models (no models left behind).
- SD3 and Flux model support added to `ORTDiffusionPipeline`, enabling the latest diffusion-based models.
- Transformers v4.47 and v4.48 compatibility, ensuring seamless integration with the latest advancements in Hugging Face's ecosystem.
- ONNX export support extended to various models, including Decision Transformer, ModernBERT, Megatron-BERT, Dinov2, OLMo, and many more (see details).
🔧 Key Fixes & Optimizations
- Dropped support for Python 3.8
- Bug fixes in `ModelPatcher`, SDXL refiner export, and device checks for improved reliability.
👥 New Contributors
A huge thank you to our first-time contributors:
Your contributions make Optimum better! 🎉
For a detailed list of all changes, please check out the full changelog.
🚀 Happy optimizing!
What's Changed
- Onnx granite by @gabe-l-hart in #2043
- Drop python 3.8 by @echarlaix in #2086
- Update Dockerfile base image by @echarlaix in #2089
- add transformers 4.36 tests by @echarlaix in #2085
- [fix] Allow ORTQuantizer over models with subfolder ONNX files by @tomaarsen in #2094
- SD3 and Flux support by @IlyasMoutawwakil in #2073
- Remove datasets as required dependency by @echarlaix in #2087
- Add ONNX Support for Decision Transformer Model by @ra9hur in #2038
- Generate guidance for flux by @IlyasMoutawwakil in #2104
- Unbundle inputs generated by `DummyTimestepInputGenerator` by @JingyaHuang in #2107
- Pass the revision to SentenceTransformer models by @bndos in #2105
- Rembert onnx support by @mlynatom in #2108
- Fix bug where `ModelPatcher` returns empty outputs by @LoSealL in #2109
- Fix workflow to mark issues as stale by @echarlaix in #2110
- Remove doc-build by @echarlaix in #2111
- Downgrade stale bot to v8 and fix permissions by @echarlaix in #2112
- Update documentation color from google tpu section by @echarlaix in #2113
- Fix workflow to mark PRs as stale by @echarlaix in #2116
- Enable transformers v4.47 support by @echarlaix in #2119
- Add ONNX export support for MGP-STR by @xenova in #2099
- Add ONNX export support for OLMo and OLMo2 by @xenova in #2121
- Pass on `model_kwargs` when exporting a SentenceTransformers model by @sjrl in #2126
- Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by @xenova in #2001
- move check_dummy_inputs_allowed to common export utils by @eaidova in #2114
- Remove CI macos runners by @echarlaix in #2129
- Enable GPTQModel by @jiqing-feng in #2064
- Skip private model loading for external contributors by @echarlaix in #2130
- fix sdxl refiner export by @eaidova in #2133
- Export to ExecuTorch: Initial Integration by @guangy10 in #2090
- Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by @LRL-ModelCloud in #2146
- Update docker files by @echarlaix in #2102
- Limit diffusers version by @IlyasMoutawwakil in #2150
- Add ONNX export support for ModernBERT by @xenova in #2131
- Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by @LRL-ModelCloud in #2138
- fix device check by @jiqing-feng in #2136
- Replace check_if_xxx_greater with is_xxx_version by @echarlaix in #2152
- Add tf available and version by @echarlaix in #2154
- Add ONNX export support for `PatchTST` by @xenova in #2101
- Fix infer task from model_name if model from sentence transformer by @eaidova in #2151
- Unpin diffusers and pass onnx exporters tests by @IlyasMoutawwakil in #2153
- Uncomment modernbert config by @IlyasMoutawwakil in #2155
- Skip optimum-benchmark when loading namespace modules by @IlyasMoutawwakil in #2159
- Fix PR doc upload by @regisss in #2161
- Move executorch to optimum-executorch by @echarlaix in #2165
- Adding Onnx Support For Megatron-Bert by @pragyandev in #2169
- Transformers 4.48 by @IlyasMoutawwakil in #2158
- Update ort CIs (slow, gpu, train) by @IlyasMoutawwakil in #2024
v1.23.3: Patch release
- Add sentence-transformers and timm documentation example by @echarlaix in #2072
- Create token type ids when not provided by @echarlaix in #2081
- Add transformers v4.46 support by @echarlaix in #2078
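The token-type-id fix above can be illustrated with a small stdlib-only sketch. The helper below is hypothetical (not the actual Optimum code): when the tokenizer output lacks `token_type_ids`, a zero-filled sequence matching the shape of `input_ids` is a safe default for single-segment inputs.

```python
def add_default_token_type_ids(encoding: dict) -> dict:
    """Hypothetical sketch: fill in zero token_type_ids when missing,
    mirroring what single-segment models expect as input."""
    if "token_type_ids" not in encoding:
        # One zero per input token, per sequence in the batch.
        encoding["token_type_ids"] = [
            [0] * len(seq) for seq in encoding["input_ids"]
        ]
    return encoding
```

Existing `token_type_ids` are left untouched, so tokenizers that do produce them keep their values.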
v1.23.2: Patch release
- Fix compatibility with diffusers < 0.25.0 by @echarlaix in #2063
- Update the habana extra by @regisss in #2077
Full Changelog: v1.23.1...v1.23.2
v1.23.1: Patch release
- Fix doc build by @regisss in #2050
- Don't hardcode the logger level to INFO; let users set TRANSFORMERS_VERBOSITY by @tomaarsen in #2047
- Add workflow to mark issues as stale by @regisss in #2051
- Fix onnx export when transformers >= v4.45 (impacting sentence-transformers and timm models) by @echarlaix in #2053 and #2054