Machine Learning: Linear Models
Published On 2022/06/09 Thursday, Singapore
Linear models provide simple and fast baselines for more complicated models. When the number of features is large, more complex models may struggle to beat linear models.
Regression
The target variable is modeled as a linear combination of the features, i.e. the relationship between each feature and the target is linear. For regression problems, widely used linear models include Linear Regression and linear SVM.
Linear Regression
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train,y_train)
model.score(X_test,y_test)
More detailed notes on Linear regression can be found here.
Linear Kernel SVM
from sklearn.svm import SVR
model = SVR(kernel="linear")
model.fit(X_train,y_train)
Classification
Linear models suit data that is almost linearly separable. The decision boundaries are hyperplanes (straight lines in two dimensions).
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train,y_train)
More detailed notes on Logistic regression can be found here.
Underfitting
Linear models can underfit when the number of features is much smaller than the number of samples, or when the model is too simple to capture the complex (non-linear) structure of the data. In this case, we should consider
- Feature engineering to add more new features.
- Handling non-linearity between the features and the target.
With regard to handling non-linearity, there are three methods.
- Engineer a richer set of features by including expert knowledge which can be directly used by a simple linear model
- Choose a model that can natively deal with non-linearity, such as decision tree models.
- Use a “kernel” to have a locally-based decision function instead of a global linear decision function.
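The first method can be sketched as follows. This is a minimal example on synthetic data (the quadratic target and all variable names are illustrative, not from the original post): expanding the feature set with sklearn's PolynomialFeatures lets an ordinary linear model fit a non-linear relationship.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic non-linear data: y depends quadratically on x
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# A plain linear model cannot capture the quadratic relationship
linear = LinearRegression().fit(X, y)

# Adding polynomial features makes the relationship linear in the
# expanded feature space, so the same simple model fits well
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

linear_r2 = linear.score(X, y)  # near zero on symmetric quadratic data
poly_r2 = poly.score(X, y)      # close to 1
```

The same pipeline pattern applies when the extra features come from expert knowledge rather than an automatic polynomial expansion.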
Overfitting
Linear models can also overfit when the number of features is much larger than the number of samples, or when there are many uninformative features. In this case, we should consider feature engineering to select informative features, or regularization.
Regularization for Regression
Ridge Regression is regularized linear regression with parameter alpha. The larger alpha is, the stronger the regularization. To search for the best hyperparameter alpha:
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
param_grid = {"alpha": [0.001, 0.1, 1, 10, 1000]}
model = GridSearchCV(Ridge(), param_grid)
model.fit(X_train, y_train)
Alternatively, sklearn provides a faster implementation, RidgeCV, to search the hyperparameter alpha.
from sklearn.linear_model import RidgeCV
model = RidgeCV(alphas=[0.001, 0.1, 1, 10, 1000])
model.fit(X_train, y_train)
Regularization for Classification
Logistic Regression in sklearn is regularized by default, with parameter C equal to 1. A higher C value means weaker regularization.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
param_grid = {"C": [0.01, 0.1, 1, 10]}
model = GridSearchCV(LogisticRegression(), param_grid)
model.fit(X_train, y_train)
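Mirroring RidgeCV above, sklearn also ships a built-in cross-validated variant, LogisticRegressionCV, which searches over C directly. A minimal sketch on the iris dataset (the dataset choice and max_iter value are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegressionCV

X, y = load_iris(return_X_y=True)

# Cs lists the candidate C values; cv is the number of CV folds
model = LogisticRegressionCV(Cs=[0.01, 0.1, 1, 10], cv=5, max_iter=1000)
model.fit(X, y)

model.C_  # best C value(s) found by cross-validation
```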
Reference & Resources
- Linear Models, Scikit-learn MOOC
- Modeling Non-Linear Feature-Target Relationships, Scikit-learn MOOC