Machine Learning: Ensemble Models - Bagging
Published On 2024/06/16 Thursday, Singapore
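Bagging is a general strategy that can work with any base model, for example linear models or decision trees.
Bagging consists of two steps: bootstrapping (resampling the training set with replacement) and aggregation (combining the predictions of the models fit on each resample). With decision trees as the base model: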
- Fit a decision tree model for each bootstrapped sample.
- Aggregate the decision tree models by averaging their predictions.
Even if the individual decision trees overfit, the ensemble is much less prone to overfitting because the aggregation step averages out the errors of the individual trees.
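The two steps can be sketched by hand as follows. This is a minimal illustration rather than how scikit-learn implements bagging internally; the synthetic sine data, the variable names, and the choice of 20 trees are assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (an assumption, for illustration only).
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

n_estimators = 20
trees = []
for _ in range(n_estimators):
    # Bootstrapping: draw as many row indices as there are rows, with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # Fit one fully grown (and therefore likely overfit) tree per bootstrap sample.
    trees.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))

# Aggregation: average the per-tree predictions at each query point.
X_new = np.linspace(-3, 3, 50).reshape(-1, 1)
all_preds = np.stack([tree.predict(X_new) for tree in trees])  # shape (20, 50)
bagged_pred = all_preds.mean(axis=0)                           # shape (50,)
```

Each row of `all_preds` is the jagged prediction of a single tree, while `bagged_pred` is their average and is typically smoother.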
%%time
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_validate
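
# `data` and `target` are assumed to hold the feature matrix and target loaded earlier.
base_estimator = DecisionTreeRegressor(random_state=0)
bagging_regressor = BaggingRegressor(
    # scikit-learn 1.2+ names this parameter `estimator` (formerly `base_estimator`).
    estimator=base_estimator, n_estimators=20, random_state=0)
cv_results = cross_validate(bagging_regressor, data, target, n_jobs=2)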
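To make the snippet above reproducible end to end, here is a hedged, self-contained variant that compares a single tree with the bagged ensemble. The dataset from `make_friedman1` is a synthetic stand-in for `data` and `target`, not the dataset used in the post.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for `data` and `target` (an assumption for this example).
data, target = make_friedman1(n_samples=500, noise=1.0, random_state=0)

tree = DecisionTreeRegressor(random_state=0)
# Passing the base model positionally works whether the keyword is called
# `estimator` (scikit-learn 1.2+) or `base_estimator` (older releases).
bagging = BaggingRegressor(tree, n_estimators=20, random_state=0)

# cross_validate uses 5-fold CV by default and returns a dict of arrays.
cv_tree = cross_validate(tree, data, target)
cv_bagging = cross_validate(bagging, data, target)

# For regressors the default score is R^2 (higher is better).
print(f"single tree R2: {cv_tree['test_score'].mean():.3f}")
print(f"bagging R2:     {cv_bagging['test_score'].mean():.3f}")
```

The returned dictionaries contain `fit_time`, `score_time`, and `test_score`, with one entry per fold.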
Random Forest
Random Forests are bagged randomized decision trees.
- At each split: only a random subset of the features is considered as split candidates.
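In scikit-learn the size of that per-split feature subset is controlled by `max_features`. The sketch below uses synthetic data and an arbitrary value of 3 out of 10 features; both are assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with 10 features (an assumption for this example).
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

forest = RandomForestRegressor(
    n_estimators=50,
    max_features=3,   # at each split, consider 3 randomly chosen features
    bootstrap=True,   # each tree is fit on a bootstrap sample (the bagging part)
    random_state=0,
)
forest.fit(X, y)

# The fitted trees are available individually; each one uses the same
# per-split feature budget.
first_tree = forest.estimators_[0]
print(len(forest.estimators_), first_tree.max_features_)
```

Smaller values of `max_features` make the trees more different from one another, which usually helps the averaging step, at the cost of weaker individual trees.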
Regression
from sklearn.ensemble import BaggingRegressor
from sklearn.ensemble import RandomForestRegressor
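A hedged usage sketch for the regression case, comparing bagged trees with a random forest under cross-validation. The synthetic dataset and the hyperparameters are assumptions, and the scores it prints say nothing about how the two models rank on real data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem (an assumption for this example).
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5, noise=10.0, random_state=0
)

regressors = {
    # Bagged trees: every split may look at all 20 features.
    "bagging": BaggingRegressor(
        DecisionTreeRegressor(random_state=0), n_estimators=50, random_state=0
    ),
    # Random forest: bagging plus a random feature subset at each split.
    "random_forest": RandomForestRegressor(
        n_estimators=50, max_features=0.5, random_state=0
    ),
}

# Mean R^2 over 5 folds for each ensemble.
r2 = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in regressors.items()
}
for name, score in r2.items():
    print(f"{name}: {score:.3f}")
```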
Classification
from sklearn.ensemble import BaggingClassifier
from sklearn.ensemble import RandomForestClassifier
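For classification the aggregation step combines class predictions instead of averaging numeric targets; scikit-learn's forests do this by averaging the class probabilities predicted by the individual trees. The sketch below uses a synthetic dataset and arbitrary settings, all of which are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem (an assumption for this example).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

classifiers = {
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=50, random_state=0
    ),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean accuracy over 5 folds for each ensemble.
accuracy = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in classifiers.items()
}

# Aggregated class probabilities for a few samples: one row per sample,
# one column per class, each row summing to 1.
forest = classifiers["random_forest"].fit(X, y)
proba = forest.predict_proba(X[:5])
print(accuracy)
print(np.round(proba, 2))
```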
Reference & Resources
- Ensemble Models: Bagging, Scikit-learn MOOC