Creates cross-validation folds from the supplied training data.

cvFolds(
  data = train,
  foldNumber = 5,
  stratifyOnVar = FALSE,
  whatVarToStratifyOn = "var"
)

Arguments

data

The training data set

foldNumber

The number of folds to use for cross-validation. Default is 5.

stratifyOnVar

Logical. Set to TRUE to stratify the folds on the variable named in whatVarToStratifyOn (typically the response). Default is FALSE.

whatVarToStratifyOn

Character. The name of the variable to stratify on.
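
To illustrate what stratification does, here is a minimal sketch that calls rsample::vfold_cv() directly on the built-in iris data. That cvFolds() forwards its arguments to vfold_cv() in this way is an assumption, not something this page states; the data set and the names strat_folds and counts are chosen only for illustration.

```r
library(rsample)

set.seed(42)  # fold assignment is random, so fix the seed for reproducibility

# Five folds, stratified so each fold keeps roughly the class mix of Species
strat_folds <- vfold_cv(iris, v = 5, strata = Species)

# Count each class in the held-out (assessment) part of every fold;
# the result is a matrix with one row per class and one column per fold
counts <- sapply(strat_folds$splits, function(s) table(assessment(s)$Species))
counts
```

With stratification every class should appear in every held-out fold, which matters most when the response is imbalanced.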

Value

An rsample::vfold_cv() object.
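
As a rough sketch of how a returned object of this kind can be inspected, the example below builds folds with rsample::vfold_cv() on the built-in mtcars data and pulls out the two parts of the first split. The data set and variable names are placeholders; an object returned by cvFolds() should be usable the same way, assuming it is a standard vfold_cv result as stated above.

```r
library(rsample)

set.seed(123)  # fold assignment is random
folds <- vfold_cv(mtcars, v = 5)

# Each row of `folds` holds one split; take the first
first_split <- folds$splits[[1]]

# analysis() returns the rows used for fitting,
# assessment() returns the held-out rows for that fold
train_part <- analysis(first_split)
held_out   <- assessment(first_split)

# Together the two parts cover the original data exactly once
nrow(train_part) + nrow(held_out) == nrow(mtcars)
```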

Examples

library(easytidymodels)
library(dplyr)
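utils::data(penguins, package = "modeldata")
# Name of the response variable
resp <- "sex"
# Split into training and test sets, stratified on the response
split <- trainTestSplit(penguins, stratifyOnResponse = TRUE, responseVar = resp)
# Build the model formula: sex ~ .
formula <- stats::as.formula(paste(resp, ".", sep="~"))
# Prepare the recipe and apply it to the training data
rec <- recipes::recipe(formula, data = split$train) %>% recipes::prep()
train_df <- recipes::bake(rec, split$train)
# Create the folds with the default settings (5 folds, no stratification)
folds <- cvFolds(train_df)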
folds
#> #  5-fold cross-validation 
#> # A tibble: 5 x 2
#>   splits           id   
#>   <list>           <chr>
#> 1 <split [219/55]> Fold1
#> 2 <split [219/55]> Fold2
#> 3 <split [219/55]> Fold3
#> 4 <split [219/55]> Fold4
#> 5 <split [220/54]> Fold5