Version 1.7#
For a short description of the main highlights of the release, please refer to Release Highlights for scikit-learn 1.7.
Legend for changelogs
Major Feature something big that you couldn’t do before.
Feature something that you couldn’t do before.
Efficiency an existing feature now may not require as much computation or memory.
Enhancement a miscellaneous minor improvement.
Fix something that previously didn’t work as documented – or according to reasonable expectations – should now work.
API Change you will need to change your code to have the same effect in the future; or a feature will be removed in the future.
Version 1.7.0#
June 2025
Changed models#
Fix Change the ConvergenceWarning message of estimators that rely on the "lbfgs" optimizer internally to be more informative and to avoid suggesting an increase of the maximum number of iterations when it is not user-settable or when the convergence problem happens before reaching it. By Olivier Grisel. #31316
Changes impacting many modules#
Sparse update: As part of the SciPy change from spmatrix to sparray, all internal use of sparse now supports both sparray and spmatrix. All manipulations of sparse objects should work for either spmatrix or sparray. This is pass 1 of a migration toward sparray (see SciPy migration to sparray). By Dan Schult #30858
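As an illustration of the sparse update above, the following minimal sketch fits the same transformer on a legacy csr_matrix and on a csr_array and compares the results. It assumes a SciPy version that provides csr_array; the choice of MaxAbsScaler and the toy data are arbitrary and not taken from the release notes.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

dense = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0], [4.0, 0.0, 0.0]])

X_matrix = sparse.csr_matrix(dense)  # legacy spmatrix container
X_array = sparse.csr_array(dense)    # newer sparray container

# Both containers are passed to the same kind of estimator.
scaled_matrix = MaxAbsScaler().fit_transform(X_matrix)
scaled_array = MaxAbsScaler().fit_transform(X_array)

# The numerical results should agree regardless of the input container.
same_values = np.allclose(scaled_matrix.toarray(), scaled_array.toarray())
```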
Support for Array API#
Additional estimators and functions have been updated to include support for all Array API compliant inputs.
See Array API support (experimental) for more details.
Feature sklearn.utils.check_consistent_length now supports Array API compatible inputs. By Stefanie Senger #29519
Feature sklearn.metrics.explained_variance_score and sklearn.metrics.mean_pinball_loss now support Array API compatible inputs. By Virgil Chan #29978
Feature sklearn.metrics.fbeta_score, sklearn.metrics.precision_score and sklearn.metrics.recall_score now support Array API compatible inputs. By Omar Salman #30395
Feature sklearn.utils.extmath.randomized_svd now supports Array API compatible inputs. By Connor Lane and Jérémie du Boisberranger. #30819
Feature sklearn.metrics.hamming_loss now supports Array API compatible inputs. By Thomas Li #30838
Feature preprocessing.Binarizer now supports Array API compatible inputs. By Yaroslav Korobko, Olivier Grisel, and Thomas Li. #31190
Feature sklearn.metrics.jaccard_score now supports Array API compatible inputs. By Omar Salman #31204
array-api-compat and array-api-extra are now vendored within the scikit-learn source. Users of the experimental array API standard support no longer need to install array-api-compat in their environment. By Lucas Colley #30340
Metadata routing#
Refer to the Metadata Routing User Guide for more details.
Feature ensemble.BaggingClassifier and ensemble.BaggingRegressor now support metadata routing through their predict, predict_proba, predict_log_proba and decision_function methods and pass **params to the underlying estimators. By Stefanie Senger. #30833
sklearn.base#
Enhancement base.BaseEstimator now has a parameter table added to the estimator's HTML representation that can be visualized with Jupyter. By Guillaume Lemaitre and Dea María Léon #30763
sklearn.calibration#
Fix CalibratedClassifierCV now raises FutureWarning instead of UserWarning when passing cv="prefit". By Olivier Grisel
Fix CalibratedClassifierCV with method="sigmoid" no longer crashes when passing float64-dtyped sample_weight along with a base estimator that outputs float32-dtyped predictions. By Olivier Grisel #30873
sklearn.compose#
API Change The force_int_remainder_cols parameter of compose.ColumnTransformer and compose.make_column_transformer is deprecated and will be removed in 1.9. It has no effect. By Jérémie du Boisberranger #31167
sklearn.covariance#
Fix Support for n_samples == n_features in sklearn.covariance.MinCovDet has been restored. By Antony Lee. #30483
sklearn.datasets#
Enhancement New parameter return_X_y added to datasets.make_classification. The default value of the parameter does not change how the function behaves. By Success Moses and Adam Cooper #30196
sklearn.decomposition#
Feature DictionaryLearning, SparseCoder and MiniBatchDictionaryLearning now have an inverse_transform method. By Rémi Flamary #30443
sklearn.ensemble#
Feature ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor allow for more control over the validation set used for early stopping. You can now pass data to be used for validation directly to fit via the arguments X_val, y_val and sample_weight_val. By Christian Lorentzen. #27124
Fix ensemble.VotingClassifier and ensemble.VotingRegressor validate estimators to make sure it is a list of tuples. By Thomas Fan. #30649
sklearn.feature_selection#
Enhancement feature_selection.RFECV now gives access to the ranking and support in each iteration and cv step of feature selection. By Marie S. #30179
Fix feature_selection.SelectFromModel now correctly works when the estimator is an instance of linear_model.ElasticNetCV with its l1_ratio parameter being an array-like. By Vasco Pereira. #31107
sklearn.gaussian_process#
Enhancement gaussian_process.GaussianProcessClassifier now includes a latent_mean_and_variance method that exposes the mean and the variance of the latent function, \(f\), used in the Laplace approximation. By Miguel González Duque #22227
sklearn.inspection#
Enhancement Add custom_values parameter in inspection.partial_dependence. It enables users to pass their own grid of values at which the partial dependence should be calculated. By Freddy A. Boulton and Stephen Pardy #26202
Enhancement inspection.DecisionBoundaryDisplay now supports plotting all classes for multi-class problems when response_method is ‘decision_function’, ‘predict_proba’ or ‘auto’. By Lucy Liu #29797
Fix inspection.partial_dependence now raises an informative error when passing an empty list as the categorical_features parameter. None should be used instead to indicate that no categorical features are present. By Pedro Lopes. #31146
API Change inspection.partial_dependence no longer accepts integer dtype for numerical feature columns. Explicit conversion to floating point values is now required before calling this tool (and preferably even before fitting the model to inspect). By Olivier Grisel #30409
sklearn.linear_model#
Enhancement linear_model.SGDClassifier and linear_model.SGDRegressor now accept l1_ratio=None when penalty is not "elasticnet". By Marc Bresson. #30730
Enhancement Fitting linear_model.Lasso and linear_model.ElasticNet with fit_intercept=True is faster for sparse input X because an unnecessary re-computation of the sum of residuals is avoided. By Christian Lorentzen #31387
Fix linear_model.LogisticRegression and linear_model.LogisticRegressionCV now properly pass sample weights to utils.class_weight.compute_class_weight when fit with class_weight="balanced". By Shruti Nath and Olivier Grisel #30057
Fix Added a new parameter tol to linear_model.LinearRegression that determines the precision of the solution coef_ when fitting on sparse data. By Success Moses #30521
Fix The update and initialization of the hyperparameters now properly handle sample weights in linear_model.BayesianRidge. By Antoine Baker. #30644
Fix linear_model.BayesianRidge now uses the full SVD to correctly estimate the posterior covariance matrix sigma_ when n_samples < n_features. By Antoine Baker #31094
API Change The parameter n_alphas has been deprecated in the following classes: linear_model.ElasticNetCV, linear_model.LassoCV, linear_model.MultiTaskElasticNetCV and linear_model.MultiTaskLassoCV, and will be removed in 1.9. The parameter alphas now supports both integers and array-likes, removing the need for n_alphas. From now on, only alphas should be set, either to indicate the number of alphas to automatically generate (int) or to provide a list of alphas (array-like) to test along the regularization path. By Siddharth Bansal. #30616
API Change Using the "liblinear" solver for multiclass classification with a one-versus-rest scheme in linear_model.LogisticRegression and linear_model.LogisticRegressionCV is deprecated and will raise an error in version 1.8. Either use a solver which supports the multinomial loss or wrap the estimator in a sklearn.multiclass.OneVsRestClassifier to keep applying a one-versus-rest scheme. By Jérémie du Boisberranger. #31241
sklearn.manifold#
Enhancement manifold.MDS will switch to use n_init=1 by default, starting from version 1.9. By Dmitry Kobak #31117
Fix manifold.MDS now correctly handles non-metric MDS. Furthermore, the returned stress value now corresponds to the returned embedding and normalized stress is now allowed for metric MDS. By Dmitry Kobak #30514
Fix manifold.MDS now uses eps=1e-6 by default and the convergence criterion was adjusted to make sense for both metric and non-metric MDS and to follow the reference R implementation. The formula for normalized stress was adjusted to follow the original definition by Kruskal. By Dmitry Kobak #31117
sklearn.metrics#
Feature metrics.brier_score_loss implements the Brier score for multiclass classification problems and adds a scale_by_half argument. This metric is notably useful to assess both sharpness and calibration of probabilistic classifiers. See the docstrings for more details. By Varun Aggarwal, Olivier Grisel and Antoine Baker. #22046
Feature Add class method from_cv_results to metrics.RocCurveDisplay, which allows easy plotting of multiple ROC curves from model_selection.cross_validate results. By Lucy Liu #30399
Enhancement metrics.det_curve, metrics.DetCurveDisplay.from_estimator, and metrics.DetCurveDisplay.from_predictions now accept a drop_intermediate option to drop thresholds where true positives (tp) do not change from the previous or subsequent thresholds. All points with the same tp value have the same fnr and thus the same y coordinate in a DET curve. By Arturo Amor #29151
Enhancement class_likelihood_ratios now has a replace_undefined_by param. When there is a division by zero, the metric is undefined and the set values are returned for LR+ and LR-. By Stefanie Senger #29288
Fix metrics.log_loss now raises a ValueError if values of y_true are missing in labels. By Varun Aggarwal, Olivier Grisel and Antoine Baker. #22046
Fix metrics.det_curve and metrics.DetCurveDisplay now return an extra threshold at infinity where the classifier always predicts the negative class, i.e. tps = fps = 0. By Arturo Amor #29151
Fix class_likelihood_ratios now raises UndefinedMetricWarning instead of UserWarning when a division by zero occurs. By Stefanie Senger #29288
Fix metrics.RocCurveDisplay will no longer set a legend when label is None in both the line_kwargs and the chance_level_kw. By Arturo Amor #29727
Fix Additional sample_weight checking has been added to metrics.mean_absolute_error, metrics.mean_pinball_loss, metrics.mean_absolute_percentage_error, metrics.mean_squared_error, metrics.root_mean_squared_error, metrics.mean_squared_log_error, metrics.root_mean_squared_log_error, metrics.explained_variance_score, metrics.r2_score, metrics.mean_tweedie_deviance, metrics.mean_poisson_deviance, metrics.mean_gamma_deviance and metrics.d2_tweedie_score. sample_weight can only be 1D, consistent with y_true and y_pred in length, or a scalar. By Lucy Liu. #30886
Fix d2_log_loss_score now properly handles the case when labels is passed and not all of the labels are present in y_true. By Vassilis Margonis #30903
Fix Fix metrics.adjusted_mutual_info_score numerical issue when number of classes and samples is low. By Hleb Levitski #31065
API Change The sparse parameter of metrics.fowlkes_mallows_score is deprecated and will be removed in 1.9. It has no effect. By Luc Rocher. #28981
API Change The raise_warning parameter of metrics.class_likelihood_ratios is deprecated and will be removed in 1.9. An UndefinedMetricWarning will always be raised in case of a division by zero. By Stefanie Senger. #29288
API Change In sklearn.metrics.RocCurveDisplay.from_predictions, the argument y_pred has been renamed to y_score to better reflect its purpose. y_pred will be removed in 1.9. By Bagus Tris Atmaja #29865
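To make the multiclass Brier score and the role of a half-scaling factor more concrete, here is a hand-written, standard-library-only sketch. The helper multiclass_brier is hypothetical and is not scikit-learn's implementation; it assumes the usual definition, the mean over samples of the squared differences between predicted probabilities and one-hot targets summed over classes.

```python
def multiclass_brier(y_true, y_proba, labels, scale_by_half=False):
    """Mean over samples of sum_k (p_ik - onehot_ik)^2 (illustrative helper)."""
    total = 0.0
    for label, probs in zip(y_true, y_proba):
        for k, p in zip(labels, probs):
            target = 1.0 if k == label else 0.0
            total += (p - target) ** 2
    score = total / len(y_true)
    return score / 2 if scale_by_half else score


y_true = [0, 1, 1]
y_proba = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]

full = multiclass_brier(y_true, y_proba, labels=[0, 1])
half = multiclass_brier(y_true, y_proba, labels=[0, 1], scale_by_half=True)

# Classic binary Brier score computed on the positive-class probability only.
classic_binary = sum((p[1] - y) ** 2 for y, p in zip(y_true, y_proba)) / len(y_true)
```

In the two-class case the summed form counts each error twice, so halving it recovers the classic binary score; this is the kind of reconciliation a half-scaling option allows.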
sklearn.mixture#
Feature Added an attribute lower_bounds_ in the mixture.BaseMixture class to save the list of lower bounds for each iteration, thereby providing insights into the convergence behavior of mixture models like mixture.GaussianMixture. By Manideep Yenugula #28559
Efficiency Simplified redundant computation when estimating covariances in GaussianMixture with covariance_type="spherical" or covariance_type="diag". By Leonce Mekinda and Olivier Grisel #30414
Efficiency GaussianMixture now consistently operates at float32 precision when fitted with float32 data to improve training speed and memory efficiency. Previously, part of the computation would be implicitly cast to float64. By Olivier Grisel and Omar Salman. #30415
sklearn.model_selection#
Fix Hyper-parameter optimizers such as model_selection.GridSearchCV now forward sample_weight to the scorer even when metadata routing is not enabled. By Antoine Baker #30743
sklearn.multiclass#
Fix The predict_proba method of sklearn.multiclass.OneVsRestClassifier now returns zero for all classes when all inner estimators never predict their positive class. By Luis M. B. Varona, Marc Bresson, and Jérémie du Boisberranger. #31228
sklearn.multioutput#
Enhancement The parameter base_estimator has been deprecated in favour of estimator for multioutput.RegressorChain and multioutput.ClassifierChain. By Success Moses and dikraMasrour #30152
sklearn.neural_network#
Feature Added support for sample_weight in neural_network.MLPClassifier and neural_network.MLPRegressor. By Zach Shu and Christian Lorentzen #30155
Feature Added parameter for loss in neural_network.MLPRegressor with options "squared_error" (default) and "poisson" (new). By Christian Lorentzen #30712
Fix neural_network.MLPRegressor now raises an informative error when early_stopping is set and the computed validation set is too small. By David Shumway. #24788
sklearn.pipeline#
Enhancement Expose the verbose_feature_names_out argument in the pipeline.make_union function, allowing users to control feature name uniqueness in the pipeline.FeatureUnion. By Abhijeetsingh Meena #30406
sklearn.preprocessing#
Enhancement preprocessing.KBinsDiscretizer with strategy="uniform" now accepts sample_weight. Additionally, with strategy="quantile" the quantile_method can now be specified (in the future quantile_method="averaged_inverted_cdf" will become the default). By Shruti Nath and Olivier Grisel #29907
Fix preprocessing.KBinsDiscretizer now uses weighted resampling when sample weights are given and subsampling is used. This may change results even when not using sample weights, although only in absolute terms and not in terms of statistical properties. By Shruti Nath and Jérémie du Boisberranger #29907
Fix Now using scipy.stats.yeojohnson instead of our own implementation of the Yeo-Johnson transform. Fixed numerical stability (mostly overflows) of the Yeo-Johnson transform with PowerTransformer(method="yeo-johnson") when scipy version is >= 1.12. Initial PR by Xuefeng Xu completed by Mohamed Yaich, Oussama Er-rabie, Mohammed Yaslam Dlimi, Hamza Zaroual, Amine Hannoun and Sylvain Marié. #31227
sklearn.svm#
Fix svm.LinearSVC now properly passes sample weights to utils.class_weight.compute_class_weight when fit with class_weight="balanced". By Shruti Nath #30057
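To illustrate why sample weights matter for class_weight="balanced", the sketch below uses a hypothetical standard-library helper, balanced_class_weights. It is not scikit-learn's code. It assumes the usual balanced heuristic n_samples / (n_classes * class_count), with counts replaced by summed sample weights when weights are given.

```python
from collections import defaultdict


def balanced_class_weights(y, sample_weight=None):
    """Balanced class weights from (optionally weighted) class masses."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    mass = defaultdict(float)
    for label, w in zip(y, sample_weight):
        mass[label] += w                  # weighted count per class
    total = sum(mass.values())
    n_classes = len(mass)
    return {label: total / (n_classes * m) for label, m in mass.items()}


y = [0, 0, 0, 1]
unweighted = balanced_class_weights(y)
# Up-weighting the single minority sample by 3 balances the class masses.
weighted = balanced_class_weights(y, sample_weight=[1.0, 1.0, 1.0, 3.0])
```

Under this assumption, ignoring the sample weights would over-correct the minority class (weight 2.0) even though the weighted data is already balanced (both weights 1.0).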
sklearn.utils#
Enhancement utils.multiclass.type_of_target raises a warning when the number of unique classes is greater than 50% of the number of samples. This warning is raised only if y has more than 20 samples. By Rahil Parikh. #26335
Enhancement utils.resample now handles sample weights, which allows weighted resampling. By Shruti Nath and Olivier Grisel #29907
Enhancement utils.class_weight.compute_class_weight now properly accounts for sample weights when using strategy “balanced” to calculate class weights. By Shruti Nath #30057
Enhancement Warning filters from the main process are propagated to joblib workers. By Thomas Fan #30380
Enhancement The private helper function utils._safe_indexing now officially supports pyarrow data. For instance, passing a pyarrow Table as X in a compose.ColumnTransformer is now possible. By Christian Lorentzen #31040
Fix In utils.estimator_checks we now enforce a binary y for binary classifiers by taking the minimum as the negative class instead of the first element, which makes it robust to y shuffling. It prevents two checks from wrongly failing on binary classifiers. By Antoine Baker. #30775
Fix utils.extmath.randomized_svd and utils.extmath.randomized_range_finder now validate their input array to fail early with an informative error message on invalid input. By Connor Lane. #30819
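The idea behind weighted resampling, mentioned in the resample entry above, can be sketched with the standard library alone. The helper weighted_resample is hypothetical and does not reproduce scikit-learn's implementation; it assumes weights are interpreted as relative sampling probabilities when drawing with replacement.

```python
import random


def weighted_resample(items, weights, n_samples, seed=0):
    """Draw n_samples items with replacement, proportionally to weights."""
    rng = random.Random(seed)
    return rng.choices(items, weights=weights, k=n_samples)


items = ["a", "b", "c"]
weights = [0.0, 1.0, 3.0]   # "a" can never be drawn, "c" is 3x as likely as "b"

sample = weighted_resample(items, weights, n_samples=1000)
counts = {item: sample.count(item) for item in items}
```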
Code and documentation contributors
Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.6, including:
4hm3d, Aaron Schumacher, Abhijeetsingh Meena, Acciaro Gennaro Daniele, Achraf Tasfaout, Adrien Linares, Adrin Jalali, Agriya Khetarpal, Aiden Frank, Aitsaid Azzedine Idir, ajay-sentry, Akanksha Mhadolkar, Alfredo Saucedo, Anderson Chaves, Andres Guzman-Ballen, Aniruddha Saha, antoinebaker, Antony Lee, Arjun S, ArthurDbrn, Arturo, Arturo Amor, ash, Ashton Powell, ayoub.agouzoul, Bagus Tris Atmaja, Benjamin Danek, Boney Patel, Camille Troillard, Chems Ben, Christian Lorentzen, Christian Veenhuis, Christine P. Chai, claudio, Code_Blooded, Colas, Colin Coe, Connor Lane, Corey Farwell, Daniel Agyapong, Dan Schult, Dea María Léon, Deepak Saldanha, dependabot[bot], Dimitri Papadopoulos Orfanos, Dmitry Kobak, Domenico, Elham Babaei, emelia-hdz, EmilyXinyi, Emma Carballal, Eric Larson, fabianhenning, Gael Varoquaux, Gil Ramot, Gordon Grey, Goutam, G Sreeja, Guillaume Lemaitre, Haesun Park, Hanjun Kim, Helder Geovane Gomes de Lima, Henri Bonamy, Hleb Levitski, Hugo Boulenger, IlyaSolomatin, Irene, Jérémie du Boisberranger, Jérôme Dockès, JoaoRodriguesIST, Joel Nothman, Josh, Kevin Klein, Loic Esteve, Lucas Colley, Luc Rocher, Lucy Liu, Luis M. B. Varona, lunovian, Mamduh Zabidi, Marc Bresson, Marco Edward Gorelli, Marco Maggi, Maren Westermann, Marie Sacksick, Martin Jurča, Miguel González Duque, Mihir Waknis, Mohamed Ali SRIR, Mohamed DHIFALLAH, mohammed benyamna, Mohit Singh Thakur, Mounir Lbath, myenugula, Natalia Mokeeva, Olivier Grisel, omahs, Omar Salman, Pedro Lopes, Pedro Olivares, Preyas Shah, Radovenchyk, Rahil Parikh, Rémi Flamary, Reshama Shaikh, Rishab Saini, rolandrmgservices, SanchitD, Santiago Castro, Santiago Víquez, scikit-learn-bot, Scott Huberty, Shruti Nath, Siddharth Bansal, Simarjot Sidhu, Sortofamudkip, sotagg, Sourabh Kumar, Stefan, Stefanie Senger, Stefano Gaspari, Stephen Pardy, Success Moses, Sylvain Combettes, Tahar Allouche, Thomas J. Fan, Thomas Li, ThorbenMaa, Tim Head, Umberto Fasci, UV, Vasco Pereira, Vassilis Margonis, Velislav Babatchev, Victoria Shevchenko, viktor765, Vipsa Kamani, Virgil Chan, vpz, Xiao Yuan, Yaich Mohamed, Yair Shimony, Yao Xiao, Yaroslav Halchenko, Yulia Vilensky, Yuvi Panda