cvFolds.Rd
Create cross-validation folds from the supplied training data.
cvFolds(
data = train,
foldNumber = 5,
stratifyOnVar = FALSE,
whatVarToStratifyOn = "var"
)
data: The training data set.
foldNumber: The number of folds for cross-validation. Default is 5.
stratifyOnVar: Logical. Should the folds be stratified on a variable (typically the response)? If so, set to TRUE. Default is FALSE.
whatVarToStratifyOn: Character. The name of the variable to stratify on.
An rsample::vfold_cv() object.
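library(easytidymodels)
library(dplyr)
utils::data(penguins, package = "modeldata")
# Name of the response variable
resp <- "sex"
# Split into training and test sets, stratified on the response
split <- trainTestSplit(penguins, stratifyOnResponse = TRUE, responseVar = resp)
# Build the model formula sex ~ . and prepare a recipe on the training data
formula <- stats::as.formula(paste(resp, ".", sep="~"))
rec <- recipes::recipe(formula, data = split$train) %>% recipes::prep()
train_df <- recipes::bake(rec, split$train)
# Create 5 unstratified folds (the defaults) from the baked training data
folds <- cvFolds(train_df)
folds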
#> # 5-fold cross-validation
#> # A tibble: 5 x 2
#> splits id
#> <list> <chr>
#> 1 <split [219/55]> Fold1
#> 2 <split [219/55]> Fold2
#> 3 <split [219/55]> Fold3
#> 4 <split [219/55]> Fold4
#> 5 <split [220/54]> Fold5
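The example above uses the defaults, so the stratification arguments are not exercised. As a rough sketch of what stratified folds look like, the block below calls rsample::vfold_cv() directly with its strata argument. How cvFolds() forwards stratifyOnVar and whatVarToStratifyOn internally is an assumption here, not something this page confirms. The names penguins_complete, strat_folds and female_share are illustrative only.

```r
library(rsample)

utils::data(penguins, package = "modeldata")

# Drop rows with a missing response so the strata are well defined
penguins_complete <- penguins[!is.na(penguins$sex), ]

# Five folds, stratified on sex (direct rsample call; assumed to be
# similar in spirit to cvFolds(..., stratifyOnVar = TRUE))
set.seed(123)
strat_folds <- rsample::vfold_cv(penguins_complete, v = 5, strata = sex)

# Share of "female" rows in each assessment set; with stratification
# these shares should be close to one another across folds
female_share <- sapply(
  strat_folds$splits,
  function(s) mean(rsample::assessment(s)$sex == "female")
)
female_share
```

With the package's own interface, the corresponding call would presumably be cvFolds(train_df, foldNumber = 5, stratifyOnVar = TRUE, whatVarToStratifyOn = "sex"), using only the arguments listed in the usage section.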