Version 1.7

For a short description of the main highlights of the release, please refer to Release Highlights for scikit-learn 1.7.

Legend for changelogs

Major Feature something big that you couldn’t do before.
Feature something that you couldn’t do before.
Efficiency an existing feature now may not require as much computation or memory.
Enhancement a miscellaneous minor improvement.
Fix something that previously didn’t work as documented – or according to reasonable expectations – should now work.
API Change you will need to change your code to have the same effect in the future; or a feature will be removed in the future.

Version 1.7.0

June 2025

Changed models

Fix The ConvergenceWarning message of estimators that rely on the "lbfgs" optimizer internally is now more informative and no longer suggests increasing the maximum number of iterations when it is not user-settable or when the convergence problem happens before reaching it. By Olivier Grisel. #31316

Changes impacting many modules

Sparse update: As part of the SciPy change from spmatrix to sparray, all internal use of sparse now supports both sparray and spmatrix. All manipulations of sparse objects should work for either spmatrix or sparray. This is pass 1 of a migration toward sparray (see SciPy migration to sparray). By Dan Schult #30858
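As a rough illustration of the entry above, the following sketch passes the same data to an estimator once as a csr_matrix and once as a csr_array. MaxAbsScaler is only an arbitrary choice of estimator that accepts sparse input; it is not singled out by this changelog entry.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

dense = np.array([[0.0, 2.0], [4.0, 0.0], [0.0, -8.0]])

# The same data in the legacy spmatrix container and in the newer sparray one.
X_matrix = sparse.csr_matrix(dense)
X_array = sparse.csr_array(dense)

out_matrix = MaxAbsScaler().fit_transform(X_matrix)
out_array = MaxAbsScaler().fit_transform(X_array)

# Both containers are expected to give the same scaled values.
same = np.allclose(out_matrix.toarray(), out_array.toarray())
print(same)
```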

Support for Array API

Additional estimators and functions have been updated to include support for all Array API compliant inputs.

See Array API support (experimental) for more details.

Feature sklearn.utils.check_consistent_length now supports Array API compatible inputs. By Stefanie Senger #29519
Feature sklearn.metrics.explained_variance_score and sklearn.metrics.mean_pinball_loss now support Array API compatible inputs. By Virgil Chan #29978
Feature sklearn.metrics.fbeta_score, sklearn.metrics.precision_score and sklearn.metrics.recall_score now support Array API compatible inputs. By Omar Salman #30395
Feature sklearn.utils.extmath.randomized_svd now supports Array API compatible inputs. By Connor Lane and Jérémie du Boisberranger. #30819
Feature sklearn.metrics.hamming_loss now supports Array API compatible inputs. By Thomas Li #30838
Feature preprocessing.Binarizer now supports Array API compatible inputs. By Yaroslav Korobko, Olivier Grisel, and Thomas Li. #31190
Feature sklearn.metrics.jaccard_score now supports Array API compatible inputs. By Omar Salman #31204
array-api-compat and array-api-extra are now vendored within the scikit-learn source. Users of the experimental array API standard support no longer need to install array-api-compat in their environment. By Lucas Colley #30340

Metadata routing

Refer to the Metadata Routing User Guide for more details.

Feature ensemble.BaggingClassifier and ensemble.BaggingRegressor now support metadata routing through their predict, predict_proba, predict_log_proba and decision_function methods and pass **params to the underlying estimators. By Stefanie Senger. #30833

`sklearn.base`

Enhancement base.BaseEstimator now has a parameter table added to the estimator's HTML representation that can be visualized with Jupyter. By Guillaume Lemaitre and Dea María Léon #30763

`sklearn.calibration`

Fix CalibratedClassifierCV now raises FutureWarning instead of UserWarning when passing cv="prefit". By Olivier Grisel
CalibratedClassifierCV with method="sigmoid" no longer crashes when passing float64-dtyped sample_weight along with a base estimator that outputs float32-dtyped predictions. By Olivier Grisel #30873

`sklearn.compose`

API Change The force_int_remainder_cols parameter of compose.ColumnTransformer and compose.make_column_transformer is deprecated and will be removed in 1.9. It has no effect. By Jérémie du Boisberranger #31167

`sklearn.covariance`

Fix Support for n_samples == n_features in sklearn.covariance.MinCovDet has been restored. By Antony Lee. #30483

`sklearn.datasets`

Enhancement New parameter return_X_y added to datasets.make_classification. The default value of the parameter does not change how the function behaves. By Success Moses and Adam Cooper #30196

`sklearn.decomposition`

Feature DictionaryLearning, SparseCoder and MiniBatchDictionaryLearning now have an inverse_transform method. By Rémi Flamary #30443

`sklearn.ensemble`

Feature ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor allow for more control over the validation set used for early stopping. You can now pass data to be used for validation directly to fit via the arguments X_val, y_val and sample_weight_val. By Christian Lorentzen. #27124
Fix ensemble.VotingClassifier and ensemble.VotingRegressor now validate estimators to make sure it is a list of tuples. By Thomas Fan. #30649

`sklearn.feature_selection`

Enhancement feature_selection.RFECV now gives access to the ranking and support in each iteration and cv step of feature selection. By Marie S. #30179
Fix feature_selection.SelectFromModel now correctly works when the estimator is an instance of linear_model.ElasticNetCV with its l1_ratio parameter being an array-like. By Vasco Pereira. #31107

`sklearn.gaussian_process`

Enhancement gaussian_process.GaussianProcessClassifier now includes a latent_mean_and_variance method that exposes the mean and the variance of the latent function, \(f\), used in the Laplace approximation. By Miguel González Duque #22227

`sklearn.inspection`

Enhancement Add custom_values parameter in inspection.partial_dependence. It enables users to pass their own grid of values at which the partial dependence should be calculated. By Freddy A. Boulton and Stephen Pardy #26202
Enhancement inspection.DecisionBoundaryDisplay now supports plotting all classes for multi-class problems when response_method is ‘decision_function’, ‘predict_proba’ or ‘auto’. By Lucy Liu #29797
Fix inspection.partial_dependence now raises an informative error when passing an empty list as the categorical_features parameter. None should be used instead to indicate that no categorical features are present. By Pedro Lopes. #31146
API Change inspection.partial_dependence no longer accepts integer dtype for numerical feature columns. Explicit conversion to floating point values is now required before calling this tool (and preferably even before fitting the model to inspect). By Olivier Grisel #30409

`sklearn.linear_model`

Enhancement linear_model.SGDClassifier and linear_model.SGDRegressor now accept l1_ratio=None when penalty is not "elasticnet". By Marc Bresson. #30730
Enhancement Fitting linear_model.Lasso and linear_model.ElasticNet with fit_intercept=True is faster for sparse input X because an unnecessary re-computation of the sum of residuals is avoided. By Christian Lorentzen #31387
Fix linear_model.LogisticRegression and linear_model.LogisticRegressionCV now properly pass sample weights to utils.class_weight.compute_class_weight when fit with class_weight="balanced". By Shruti Nath and Olivier Grisel #30057
Fix Added a new parameter tol to linear_model.LinearRegression that determines the precision of the solution coef_ when fitting on sparse data. By Success Moses #30521
Fix The update and initialization of the hyperparameters now properly handle sample weights in linear_model.BayesianRidge. By Antoine Baker. #30644
Fix linear_model.BayesianRidge now uses the full SVD to correctly estimate the posterior covariance matrix sigma_ when n_samples < n_features. By Antoine Baker #31094
API Change The parameter n_alphas has been deprecated in linear_model.ElasticNetCV, linear_model.LassoCV, linear_model.MultiTaskElasticNetCV and linear_model.MultiTaskLassoCV, and will be removed in 1.9. The parameter alphas now supports both integers and array-likes, removing the need for n_alphas. From now on, only alphas should be set, either to an int giving the number of alphas to generate automatically or to an array-like listing the alphas to test along the regularization path. By Siddharth Bansal. #30616
API Change Using the "liblinear" solver for multiclass classification with a one-versus-rest scheme in linear_model.LogisticRegression and linear_model.LogisticRegressionCV is deprecated and will raise an error in version 1.8. Either use a solver which supports the multinomial loss or wrap the estimator in a sklearn.multiclass.OneVsRestClassifier to keep applying a one-versus-rest scheme. By Jérémie du Boisberranger. #31241

`sklearn.manifold`

Enhancement manifold.MDS will switch to use n_init=1 by default, starting from version 1.9. By Dmitry Kobak #31117
Fix manifold.MDS now correctly handles non-metric MDS. Furthermore, the returned stress value now corresponds to the returned embedding and normalized stress is now allowed for metric MDS. By Dmitry Kobak #30514
Fix manifold.MDS now uses eps=1e-6 by default and the convergence criterion was adjusted to make sense for both metric and non-metric MDS and to follow the reference R implementation. The formula for normalized stress was adjusted to follow the original definition by Kruskal. By Dmitry Kobak #31117

`sklearn.metrics`

Feature metrics.brier_score_loss implements the Brier score for multiclass classification problems and adds a scale_by_half argument. This metric is notably useful to assess both sharpness and calibration of probabilistic classifiers. See the docstrings for more details. By Varun Aggarwal, Olivier Grisel and Antoine Baker. #22046
Feature Add class method from_cv_results to metrics.RocCurveDisplay, which allows easy plotting of multiple ROC curves from model_selection.cross_validate results. By Lucy Liu #30399
Enhancement metrics.det_curve, metrics.DetCurveDisplay.from_estimator, and metrics.DetCurveDisplay.from_predictions now accept a drop_intermediate option to drop thresholds where true positives (tp) do not change from the previous or subsequent thresholds. All points with the same tp value have the same fnr and thus same y coordinate in a DET curve. By Arturo Amor #29151
Enhancement class_likelihood_ratios now has a replace_undefined_by parameter. When there is a division by zero, the metric is undefined and the set values are returned for LR+ and LR-. By Stefanie Senger #29288
Fix metrics.log_loss now raises a ValueError if values of y_true are missing in labels. By Varun Aggarwal, Olivier Grisel and Antoine Baker. #22046
Fix metrics.det_curve and metrics.DetCurveDisplay now return an extra threshold at infinity where the classifier always predicts the negative class i.e. tps = fps = 0. By Arturo Amor #29151
Fix class_likelihood_ratios now raises UndefinedMetricWarning instead of UserWarning when a division by zero occurs. By Stefanie Senger #29288
Fix metrics.RocCurveDisplay will no longer set a legend when label is None in both the line_kwargs and the chance_level_kw. By Arturo Amor #29727
Fix Additional sample_weight checking has been added to metrics.mean_absolute_error, metrics.mean_pinball_loss, metrics.mean_absolute_percentage_error, metrics.mean_squared_error, metrics.root_mean_squared_error, metrics.mean_squared_log_error, metrics.root_mean_squared_log_error, metrics.explained_variance_score, metrics.r2_score, metrics.mean_tweedie_deviance, metrics.mean_poisson_deviance, metrics.mean_gamma_deviance and metrics.d2_tweedie_score. sample_weight can only be 1D, consistent with y_true and y_pred in length, or a scalar. By Lucy Liu. #30886
Fix d2_log_loss_score now properly handles the case when labels is passed and not all of the labels are present in y_true. By Vassilis Margonis #30903
Fix Fixed a numerical issue in metrics.adjusted_mutual_info_score when the number of classes and samples is low. By Hleb Levitski #31065
API Change The sparse parameter of metrics.fowlkes_mallows_score is deprecated and will be removed in 1.9. It has no effect. By Luc Rocher. #28981
API Change The raise_warning parameter of metrics.class_likelihood_ratios is deprecated and will be removed in 1.9. An UndefinedMetricWarning will always be raised in case of a division by zero. By Stefanie Senger. #29288
API Change In sklearn.metrics.RocCurveDisplay.from_predictions, the argument y_pred has been renamed to y_score to better reflect its purpose. y_pred will be removed in 1.9. By Bagus Tris Atmaja #29865
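To make the multiclass Brier score entry above more concrete, here is a NumPy sketch of the usual definition: the mean over samples of the squared distance between the one-hot encoded label and the predicted probability vector. This is not the library's implementation. The function name multiclass_brier is hypothetical, labels are assumed to be integers 0..n_classes-1, and the scale_by_half flag only mimics the idea of halving the score so that it lies in [0, 1] rather than [0, 2]. The docstring of metrics.brier_score_loss remains the reference for the actual defaults.

```python
import numpy as np


def multiclass_brier(y_true, y_proba, n_classes, scale_by_half=False):
    """Mean squared distance between one-hot labels and probabilities."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba, dtype=float)
    # Assumes integer labels 0..n_classes-1 (illustration only).
    one_hot = np.eye(n_classes)[y_true]
    score = np.mean(np.sum((one_hot - y_proba) ** 2, axis=1))
    return score / 2 if scale_by_half else score


y_true = [0, 1, 2]
y_proba = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.1, 0.8]]

full = multiclass_brier(y_true, y_proba, n_classes=3)
half = multiclass_brier(y_true, y_proba, n_classes=3, scale_by_half=True)
print(full, half)
```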

`sklearn.mixture`

Feature Added an attribute lower_bounds_ in the mixture.BaseMixture class to save the list of lower bounds for each iteration thereby providing insights into the convergence behavior of mixture models like mixture.GaussianMixture. By Manideep Yenugula #28559
Efficiency Simplified redundant computation when estimating covariances in GaussianMixture with a covariance_type="spherical" or covariance_type="diag". By Leonce Mekinda and Olivier Grisel #30414
Efficiency GaussianMixture now consistently operates at float32 precision when fitted with float32 data to improve training speed and memory efficiency. Previously, part of the computation would be implicitly cast to float64. By Olivier Grisel and Omar Salman. #30415

`sklearn.model_selection`

Fix Hyper-parameter optimizers such as model_selection.GridSearchCV now forward sample_weight to the scorer even when metadata routing is not enabled. By Antoine Baker #30743

`sklearn.multiclass`

Fix The predict_proba method of sklearn.multiclass.OneVsRestClassifier now returns zero for all classes when all inner estimators never predict their positive class. By Luis M. B. Varona, Marc Bresson, and Jérémie du Boisberranger. #31228

`sklearn.multioutput`

Enhancement The parameter base_estimator has been deprecated in favour of estimator for multioutput.RegressorChain and multioutput.ClassifierChain. By Success Moses and dikraMasrour #30152

`sklearn.neural_network`

Feature Added support for sample_weight in neural_network.MLPClassifier and neural_network.MLPRegressor. By Zach Shu and Christian Lorentzen #30155
Feature Added parameter for loss in neural_network.MLPRegressor with options "squared_error" (default) and "poisson" (new). By Christian Lorentzen #30712
Fix neural_network.MLPRegressor now raises an informative error when early_stopping is set and the computed validation set is too small. By David Shumway. #24788

`sklearn.pipeline`

Enhancement Expose the verbose_feature_names_out argument in the pipeline.make_union function, allowing users to control feature name uniqueness in the pipeline.FeatureUnion. By Abhijeetsingh Meena #30406

`sklearn.preprocessing`

Enhancement preprocessing.KBinsDiscretizer with strategy="uniform" now accepts sample_weight. Additionally with strategy="quantile" the quantile_method can now be specified (in the future quantile_method="averaged_inverted_cdf" will become the default). By Shruti Nath and Olivier Grisel #29907
Fix preprocessing.KBinsDiscretizer now uses weighted resampling when sample weights are given and subsampling is used. This may change results even when not using sample weights, although only in absolute terms and not in terms of statistical properties. By Shruti Nath and Jérémie du Boisberranger #29907
Fix Now using scipy.stats.yeojohnson instead of our own implementation of the Yeo-Johnson transform. Fixed numerical stability (mostly overflows) of the Yeo-Johnson transform with PowerTransformer(method="yeo-johnson") when scipy version is >= 1.12. Initial PR by Xuefeng Xu completed by Mohamed Yaich, Oussama Er-rabie, Mohammed Yaslam Dlimi, Hamza Zaroual, Amine Hannoun and Sylvain Marié. #31227

`sklearn.svm`

Fix svm.LinearSVC now properly passes sample weights to utils.class_weight.compute_class_weight when fit with class_weight="balanced". By Shruti Nath #30057

`sklearn.utils`

Enhancement utils.multiclass.type_of_target raises a warning when the number of unique classes is greater than 50% of the number of samples. This warning is raised only if y has more than 20 samples. By Rahil Parikh. #26335
Enhancement utils.resample now handles sample weights which allows weighted resampling. By Shruti Nath and Olivier Grisel #29907
Enhancement utils.class_weight.compute_class_weight now properly accounts for sample weights when using strategy “balanced” to calculate class weights. By Shruti Nath #30057
Enhancement Warning filters from the main process are propagated to joblib workers. By Thomas Fan #30380
Enhancement The private helper function utils._safe_indexing now officially supports pyarrow data. For instance, passing a pyarrow Table as X in a compose.ColumnTransformer is now possible. By Christian Lorentzen #31040
Fix In utils.estimator_checks, a binary y is now enforced for binary classifiers by taking the minimum as the negative class instead of the first element, which makes the checks robust to y shuffling. This prevents two checks from wrongly failing on binary classifiers. By Antoine Baker. #30775
Fix utils.extmath.randomized_svd and utils.extmath.randomized_range_finder now validate their input array to fail early with an informative error message on invalid input. By Connor Lane. #30819
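To give an intuition for how sample weights can enter the "balanced" class weights mentioned in the utils.class_weight.compute_class_weight entry above, here is a NumPy sketch. It assumes that the usual heuristic n_samples / (n_classes * count) is generalized by replacing per-class counts with per-class sums of sample weights. The helper name balanced_class_weights is hypothetical and this is not necessarily the library's exact code.

```python
import numpy as np


def balanced_class_weights(y, sample_weight=None):
    """Balanced class weights, with counts replaced by summed sample weights."""
    y = np.asarray(y)
    if sample_weight is None:
        sample_weight = np.ones(len(y))
    classes, y_idx = np.unique(y, return_inverse=True)
    # Total sample weight per class (plain counts when all weights are 1).
    weighted_counts = np.bincount(y_idx, weights=sample_weight)
    weights = weighted_counts.sum() / (len(classes) * weighted_counts)
    return dict(zip(classes.tolist(), weights.tolist()))


y = [0, 0, 0, 1]
unweighted = balanced_class_weights(y)
weighted = balanced_class_weights(y, sample_weight=[1, 1, 1, 3])
print(unweighted, weighted)
```

In the weighted call, the single sample of class 1 carries as much weight as the three samples of class 0, so under this assumption both classes end up with the same class weight.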

Code and documentation contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.6, including:

4hm3d, Aaron Schumacher, Abhijeetsingh Meena, Acciaro Gennaro Daniele, Achraf Tasfaout, Adrien Linares, Adrin Jalali, Agriya Khetarpal, Aiden Frank, Aitsaid Azzedine Idir, ajay-sentry, Akanksha Mhadolkar, Alfredo Saucedo, Anderson Chaves, Andres Guzman-Ballen, Aniruddha Saha, antoinebaker, Antony Lee, Arjun S, ArthurDbrn, Arturo, Arturo Amor, ash, Ashton Powell, ayoub.agouzoul, Bagus Tris Atmaja, Benjamin Danek, Boney Patel, Camille Troillard, Chems Ben, Christian Lorentzen, Christian Veenhuis, Christine P. Chai, claudio, Code_Blooded, Colas, Colin Coe, Connor Lane, Corey Farwell, Daniel Agyapong, Dan Schult, Dea María Léon, Deepak Saldanha, dependabot[bot], Dimitri Papadopoulos Orfanos, Dmitry Kobak, Domenico, Elham Babaei, emelia-hdz, EmilyXinyi, Emma Carballal, Eric Larson, fabianhenning, Gael Varoquaux, Gil Ramot, Gordon Grey, Goutam, G Sreeja, Guillaume Lemaitre, Haesun Park, Hanjun Kim, Helder Geovane Gomes de Lima, Henri Bonamy, Hleb Levitski, Hugo Boulenger, IlyaSolomatin, Irene, Jérémie du Boisberranger, Jérôme Dockès, JoaoRodriguesIST, Joel Nothman, Josh, Kevin Klein, Loic Esteve, Lucas Colley, Luc Rocher, Lucy Liu, Luis M. B. Varona, lunovian, Mamduh Zabidi, Marc Bresson, Marco Edward Gorelli, Marco Maggi, Maren Westermann, Marie Sacksick, Martin Jurča, Miguel González Duque, Mihir Waknis, Mohamed Ali SRIR, Mohamed DHIFALLAH, mohammed benyamna, Mohit Singh Thakur, Mounir Lbath, myenugula, Natalia Mokeeva, Olivier Grisel, omahs, Omar Salman, Pedro Lopes, Pedro Olivares, Preyas Shah, Radovenchyk, Rahil Parikh, Rémi Flamary, Reshama Shaikh, Rishab Saini, rolandrmgservices, SanchitD, Santiago Castro, Santiago Víquez, scikit-learn-bot, Scott Huberty, Shruti Nath, Siddharth Bansal, Simarjot Sidhu, Sortofamudkip, sotagg, Sourabh Kumar, Stefan, Stefanie Senger, Stefano Gaspari, Stephen Pardy, Success Moses, Sylvain Combettes, Tahar Allouche, Thomas J. Fan, Thomas Li, ThorbenMaa, Tim Head, Umberto Fasci, UV, Vasco Pereira, Vassilis Margonis, Velislav Babatchev, Victoria Shevchenko, viktor765, Vipsa Kamani, Virgil Chan, vpz, Xiao Yuan, Yaich Mohamed, Yair Shimony, Yao Xiao, Yaroslav Halchenko, Yulia Vilensky, Yuvi Panda