
Scikit-Learn Curve Fit

The JupyterLab notebook for this post can be found here.

As discussed in the first, second, and third posts of this optimization series, scikit-learn offers many classes and methods for regression and classification.

As before, we mainly focus on curve fitting for optimization applications.

A complex one-dimensional function plus noise is used as the training data. Several methods are investigated and compared, both by the mean squared error (MSE) metric and by visual inspection of the plots.
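The exact function from the notebook is not reproduced in this post, so the sketch below assumes a hypothetical target whose output stays roughly within [-1, +1], matching the description further down:

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical stand-in for the post's complex 1-D function;
    # its output stays roughly within [-1, +1].
    return np.sin(3 * x) * np.exp(-0.1 * x**2) + 0.3 * np.cos(7 * x)

x = rng.uniform(-3.0, 3.0, size=200)
y_train = target(x) + rng.normal(scale=0.1, size=x.shape)  # noisy observations
X_train = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix
```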

The first method is a simple linear regressor fitted by steepest (gradient) descent. It tries to find the best linear approximation of the training data; its default metric is the mean squared error. Before fitting, the input data is standardized. Since the output of the original function lies roughly between -1 and +1, standardization is not strictly necessary, but it is included to show scikit-learn's elegant pipeline mechanism for chaining preprocessing steps. From a programmer's point of view, scikit-learn's regressors and classifiers are very pleasant to work with.
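A minimal sketch of this baseline, assuming the regressor is scikit-learn's SGDRegressor (whose default loss is the squared error) wrapped in a standardization pipeline:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Standardize the input, then fit a linear model by stochastic gradient descent.
linear_model = make_pipeline(StandardScaler(), SGDRegressor(max_iter=1000, tol=1e-3))
linear_model.fit(X_train, y_train)
print("linear MSE:", mean_squared_error(y_train, linear_model.predict(X_train)))
```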

The second method is like the first, but the feature vector is expanded with nonlinear (polynomial) functions. When the order of the expansion is high, the result is not bad.
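A sketch of this expanded version, placing PolynomialFeatures in front of the same pipeline; the degree here is an illustrative choice, not the one from the notebook:

```python
from sklearn.preprocessing import PolynomialFeatures

# Polynomial expansion of the single input feature, then the same linear fit.
poly_model = make_pipeline(
    PolynomialFeatures(degree=15),
    StandardScaler(),
    SGDRegressor(max_iter=5000, tol=1e-4),
)
poly_model.fit(X_train, y_train)
print("polynomial MSE:", mean_squared_error(y_train, poly_model.predict(X_train)))
```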

Third, piecewise functions are used. Piecewise functions are not tractable in a high number of dimensions, but for a small number of dimensions they usually work fine.
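The post does not show how the piecewise fit is constructed; one common way in scikit-learn, sketched here as an assumption, is to bin the input with KBinsDiscretizer and fit one constant per bin through a linear model on the one-hot encoding:

```python
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.linear_model import LinearRegression

# One-hot bin indicators turn a linear model into a piecewise-constant fit.
piecewise_model = make_pipeline(
    KBinsDiscretizer(n_bins=30, encode="onehot-dense", strategy="uniform"),
    LinearRegression(),
)
piecewise_model.fit(X_train, y_train)
print("piecewise MSE:", mean_squared_error(y_train, piecewise_model.predict(X_train)))
```

Since the number of bins multiplies across dimensions, this construction grows exponentially with the input dimension, which is exactly why the approach breaks down beyond a few dimensions.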

The support vector regressor (SVR) is based on kernel methods. This regressor also gives good results, provided that the order of the nonlinear functions expanding the features is sufficiently high.
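A sketch using a polynomial kernel, where the kernel degree plays the role of the expansion order; the hyperparameter values are illustrative:

```python
from sklearn.svm import SVR

# The polynomial kernel implicitly expands the features up to the given degree.
svr_model = make_pipeline(StandardScaler(), SVR(kernel="poly", degree=7, C=10.0))
svr_model.fit(X_train, y_train)
print("SVR (poly kernel) MSE:", mean_squared_error(y_train, svr_model.predict(X_train)))
```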

Haar-like functions can be viewed as piecewise binary functions. We combine these functions with a support vector regressor. The MSE of the estimate is good, but the model tries to fit the noise as well, so the output itself behaves noisily. Moreover, a good estimate requires a high order of Haar-like functions.
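The exact Haar-like construction is not shown in the post; the sketch below assumes indicator features over dyadic intervals, fed into a linear-kernel SVR:

```python
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVR

def haar_features(X, max_level=6, lo=-3.0, hi=3.0):
    # Piecewise-binary features: one indicator per dyadic sub-interval of [lo, hi].
    cols = []
    for level in range(1, max_level + 1):
        edges = np.linspace(lo, hi, 2**level + 1)
        for a, b in zip(edges[:-1], edges[1:]):
            cols.append(((X[:, 0] >= a) & (X[:, 0] < b)).astype(float))
    return np.column_stack(cols)

haar_model = make_pipeline(FunctionTransformer(haar_features), SVR(kernel="linear", C=10.0))
haar_model.fit(X_train, y_train)
print("Haar-like + SVR MSE:", mean_squared_error(y_train, haar_model.predict(X_train)))
```

Raising max_level adds ever finer indicators, which lowers the training MSE but also lets the model chase the noise, consistent with the behavior described above.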

The support vector regressor is also used with geometrical functions for the feature expansion, and the results are better than with the polynomial expansions.
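Reading "geometrical functions" as trigonometric sine/cosine features (an assumption), a Fourier-style expansion in front of the SVR might look like this:

```python
def trig_features(X, n_freq=10):
    # Sine and cosine terms at increasing integer frequencies.
    terms = [np.sin(k * X[:, 0]) for k in range(1, n_freq + 1)]
    terms += [np.cos(k * X[:, 0]) for k in range(1, n_freq + 1)]
    return np.column_stack(terms)

trig_model = make_pipeline(FunctionTransformer(trig_features), SVR(kernel="linear", C=10.0))
trig_model.fit(X_train, y_train)
print("trigonometric + SVR MSE:", mean_squared_error(y_train, trig_model.predict(X_train)))
```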

Finally, a Gaussian process (GP) regressor is used. Its results are remarkable: by tuning just a few of its parameters, very good fits are reached. A well-known resource for learning how Gaussian processes work is the book "Gaussian Processes for Machine Learning" by C. Rasmussen and C. Williams.
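A minimal GP sketch; the few parameters worth tuning are the RBF length scale and the WhiteKernel noise level, and both are refined automatically when the model is fitted:

```python
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# RBF models the smooth signal; WhiteKernel absorbs the observation noise.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp_model = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp_model.fit(X_train, y_train)
print("GP MSE:", mean_squared_error(y_train, gp_model.predict(X_train)))
```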
