An R package to tune hyperparameters for machine learning algorithms (Support Vector Machine, Random Forest, and XGBoost) using Bayesian optimization with Gaussian processes.
```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
This is an R package to tune hyperparameters for machine learning algorithms using Bayesian optimization based on Gaussian processes. The algorithms currently supported are Support Vector Machines, Random Forest, and XGBoost.

Writing a Bayesian optimization function is very easy with this package, and you can also customise your model easily: you only have to specify the data and the name of the label column to classify.
In the XGBoost functions, your data frame is automatically converted into an `xgb.DMatrix` object. The label column can be of any class (character, integer, or factor); it is converted automatically as well.
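For example, converting the label column to character beforehand should make no difference (a minimal sketch based on the SVM example below; `iris_chr` is just an illustrative copy of `iris`):

```r
library(MlBayesOpt)

# Species is a factor in iris, but a character (or integer) label
# works the same way -- the package converts it internally.
iris_chr <- iris
iris_chr$Species <- as.character(iris_chr$Species)

set.seed(71)
res_chr <- svm_cv_opt(data = iris_chr,
                      label = Species,
                      n_folds = 3,
                      init_points = 10,
                      n_iter = 1)
```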
You can install MlBayesOpt from CRAN with:

```r
install.packages("MlBayesOpt")
```
You can also install MlBayesOpt from GitHub with:

```r
# install.packages("githubinstall")
githubinstall::githubinstall("MlBayesOpt")

# install.packages("devtools")
devtools::install_github("ymattu/MlBayesOpt")
```
`fashion_train` and `fashion_test` are data reproduced from Fashion-MNIST. Each has 1,000 rows, 784 feature columns, and 1 label column named `y`.

`fashion` is a data frame made by `dplyr::bind_rows(fashion_train, fashion_test)`.
`iris_train` and `iris_test` are also included in this package. `iris_train` is the odd-numbered rows of the `iris` data, and `iris_test` is the even-numbered rows.
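A quick sanity check of the bundled data (the expected dimensions follow from the descriptions above: 784 features plus one label column, and `iris` split into its 75 odd and 75 even rows):

```r
library(MlBayesOpt)

dim(fashion_train)        # 1000 rows, 785 columns (784 features + y)
dim(iris_train)           # 75 rows, 5 columns
table(iris_train$Species) # class balance of the training split
```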
An example of tuning an SVM on the `iris` data:

```r
library(MlBayesOpt)

set.seed(71)
res0 <- svm_cv_opt(data = iris,
                   label = Species,
                   n_folds = 3,
                   init_points = 10,
                   n_iter = 1)
```
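Assuming the result has the list structure returned by `rBayesianOptimization` (with `Best_Par`, `Best_Value`, and `History` elements -- check the package documentation for the exact return value), the tuned hyperparameters can be inspected like this:

```r
# Best hyperparameters found (named numeric vector)
res0$Best_Par

# Best cross-validated score achieved
res0$Best_Value

# Full record of evaluated parameter settings and scores
res0$History
```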
An example of tuning XGBoost on the `iris` data:

```r
res0 <- xgb_cv_opt(data = iris,
                   label = Species,
                   objectfun = "multi:softmax",
                   evalmetric = "mlogloss",
                   n_folds = 3,
                   classes = 3,
                   init_points = 10,
                   n_iter = 1)
```
For more details, see the vignette.