Machine Learning: Linear Models
Published On 2022/06/09 Thursday, Singapore
Linear models provide simple and fast baselines for more complicated models. When the number of features is large, more complex models may struggle to beat linear models.
Regression
The target variable is modeled as a linear combination of the features, i.e. the relationship between each feature and the target is linear. For regression problems, widely used linear models include Linear Regression and linear SVM.
Linear Regression
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train,y_train)
model.score(X_test,y_test)
More detailed notes on Linear regression can be found here.
Linear Kernel SVM
from sklearn.svm import SVR
model = SVR(kernel="linear")
model.fit(X_train,y_train)
Classification
Linear models suit data that is almost linearly separable. The decision boundaries are hyperplanes (straight lines in two dimensions).
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train,y_train)
More detailed notes on Logistic regression can be found here.
Underfitting
Linear models can underfit when the number of features is much smaller than the number of samples, or when the model is too simple to capture the complex (non-linear) structure of the data. In this case, we should consider
- Feature engineering to add more new features.
- Handling non-linearity between the features and the target.
With regard to handling non-linearity, there are three methods.
- Engineer a richer set of features by including expert knowledge which can be directly used by a simple linear model
- Choose a model that can natively deal with non-linearity, such as decision tree models.
- Use a “kernel” to have a locally-based decision function instead of a global linear decision function.
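The first method can be sketched as follows. This is a minimal example on synthetic data (the quadratic target and all variable names are illustrative, not from the original post): expanding the feature set with sklearn's PolynomialFeatures lets an ordinary linear model fit a non-linear relationship.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic non-linear data: y depends quadratically on x
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# A plain linear model cannot capture the quadratic relationship
linear = LinearRegression().fit(X, y)

# Adding polynomial features makes the relationship linear in the
# expanded feature space, so the same simple model fits well
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

linear_r2 = linear.score(X, y)  # near zero on symmetric quadratic data
poly_r2 = poly.score(X, y)      # close to 1
```

The same pipeline pattern applies when the extra features come from expert knowledge rather than an automatic polynomial expansion.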
Overfitting
Linear models can also overfit when the number of features is much larger than the number of samples, or when there are many uninformative features. In this case, we should consider feature engineering to select informative features, or regularization.
Regularization for Regression
Ridge Regression is regularized linear regression with parameter alpha. The larger alpha is, the stronger the regularization. To search for the best hyperparameter alpha:
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
param_grid = {"alpha": [0.001, 0.1, 1, 10, 1000]}
model = GridSearchCV(Ridge(), param_grid)
model.fit(X_train, y_train)
Alternatively, sklearn provides a faster implementation, RidgeCV, to search the hyperparameter alpha.
from sklearn.linear_model import RidgeCV
model = RidgeCV(alphas=[0.001, 0.1, 1, 10, 1000])
model.fit(X_train, y_train)
Regularization for Classification
Logistic Regression in sklearn is regularized by default, with parameter C equal to 1. A higher C value means weaker regularization.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
param_grid = {"C": [0.01, 0.1, 1, 10]}
model = GridSearchCV(LogisticRegression(), param_grid)
model.fit(X_train, y_train)
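Mirroring RidgeCV above, sklearn also ships a built-in cross-validated variant, LogisticRegressionCV, which searches over C directly. A minimal sketch on the iris dataset (the dataset choice and max_iter value are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegressionCV

X, y = load_iris(return_X_y=True)

# Cs lists the candidate C values; cv is the number of CV folds
model = LogisticRegressionCV(Cs=[0.01, 0.1, 1, 10], cv=5, max_iter=1000)
model.fit(X, y)

model.C_  # best C value(s) found by cross-validation
```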
Reference & Resources
- Linear Models, Scikit-learn MOOC
- Modeling Non-Linear Feature-Target Relationships, Scikit-learn MOOC