Skip to contents
stopifnot("mlbench not installed" = requireNamespace("mlbench", quietly = TRUE))
stopifnot("rsample not installed" = requireNamespace("rsample", quietly = TRUE))
library(bnns)
library(mlbench)
library(rsample)
set.seed(123)

Introduction

This article demonstrates the use of the bnns package on three datasets from the mlbench package:

  • Regression: BostonHousing dataset
  • Binary Classification: PimaIndiansDiabetes dataset
  • Multi-class Classification: Glass dataset

For each dataset, we: 1. Prepare the data for training and testing. 2. Build a Bayesian Neural Network using the bnns package. 3. Evaluate the model’s predictive performance.


Regression: BostonHousing Dataset

Dataset Description

The BostonHousing dataset contains information on housing prices in Boston, with features like crime rate, average number of rooms, and more.

data(BostonHousing)
BH_data <- BostonHousing
# Splitting data into training and testing sets
BH_split <- initial_split(BH_data, prop = 0.8)
BH_train <- training(BH_split)
BH_test <- testing(BH_split)

Model Training

model_reg <- bnns(
  medv ~ -1 + .,
  data = BH_train, L = 2, out_act_fn = 1,
  iter = 1e3, warmup = 2e2, chains = 2, cores = 2
)
#> Trying to compile a simple C file
#> Running /opt/R/4.4.2/lib/R/bin/R CMD SHLIB foo.c
#> using C compiler: ‘gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’
#> gcc -I"/opt/R/4.4.2/lib/R/include" -DNDEBUG   -I"/home/runner/work/_temp/Library/Rcpp/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/unsupported"  -I"/home/runner/work/_temp/Library/BH/include" -I"/home/runner/work/_temp/Library/StanHeaders/include/src/"  -I"/home/runner/work/_temp/Library/StanHeaders/include/"  -I"/home/runner/work/_temp/Library/RcppParallel/include/"  -I"/home/runner/work/_temp/Library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include '/home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include    -fpic  -g -O2  -c foo.c -o foo.o
#> In file included from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Core:19,
#>                  from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Dense:1,
#>                  from /home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
#>                  from <command-line>:
#> /home/runner/work/_temp/Library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
#>   679 | #include <cmath>
#>       |          ^~~~~~~
#> compilation terminated.
#> make: *** [/opt/R/4.4.2/lib/R/etc/Makeconf:195: foo.o] Error 1
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.000619 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 6.19 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.0005 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 5 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
#> Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 2: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 2: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 2: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 2: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 17.015 seconds (Warm-up)
#> Chain 2:                52.596 seconds (Sampling)
#> Chain 2:                69.611 seconds (Total)
#> Chain 2: 
#> Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 19.97 seconds (Warm-up)
#> Chain 1:                86.915 seconds (Sampling)
#> Chain 1:                106.885 seconds (Total)
#> Chain 1:

Model Evaluation

BH_pred <- predict(model_reg, newdata = BH_test)
measure_cont(BH_test$medv, BH_pred)
#> $rmse
#> [1] 8.297719
#> 
#> $mae
#> [1] 5.822318

Binary Classification: PimaIndiansDiabetes Dataset

Dataset Description

The PimaIndiansDiabetes dataset contains features related to health status for predicting the presence of diabetes.

data(PimaIndiansDiabetes)
PID_data <- PimaIndiansDiabetes |>
  transform(diabetes = ifelse(diabetes == "pos", 1, 0))
# Splitting data into training and testing sets
PID_split <- initial_split(PID_data, prop = 0.8, strata = "diabetes")
PID_train <- training(PID_split)
PID_test <- testing(PID_split)

Model Training

model_bin <- bnns(
  diabetes ~ -1 + .,
  data = PID_train, L = 2,
  out_act_fn = 2, iter = 1e3, warmup = 2e2, chains = 2, cores = 2
)
#> Trying to compile a simple C file
#> Running /opt/R/4.4.2/lib/R/bin/R CMD SHLIB foo.c
#> using C compiler: ‘gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’
#> gcc -I"/opt/R/4.4.2/lib/R/include" -DNDEBUG   -I"/home/runner/work/_temp/Library/Rcpp/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/unsupported"  -I"/home/runner/work/_temp/Library/BH/include" -I"/home/runner/work/_temp/Library/StanHeaders/include/src/"  -I"/home/runner/work/_temp/Library/StanHeaders/include/"  -I"/home/runner/work/_temp/Library/RcppParallel/include/"  -I"/home/runner/work/_temp/Library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include '/home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include    -fpic  -g -O2  -c foo.c -o foo.o
#> In file included from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Core:19,
#>                  from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Dense:1,
#>                  from /home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
#>                  from <command-line>:
#> /home/runner/work/_temp/Library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
#>   679 | #include <cmath>
#>       |          ^~~~~~~
#> compilation terminated.
#> make: *** [/opt/R/4.4.2/lib/R/etc/Makeconf:195: foo.o] Error 1
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.00078 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 7.8 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.000561 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 5.61 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
#> Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 2: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 2: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 2: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 2: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 27.03 seconds (Warm-up)
#> Chain 2:                70.365 seconds (Sampling)
#> Chain 2:                97.395 seconds (Total)
#> Chain 2: 
#> Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 24.864 seconds (Warm-up)
#> Chain 1:                91.483 seconds (Sampling)
#> Chain 1:                116.347 seconds (Total)
#> Chain 1:

Model Evaluation

PID_pred <- predict(model_bin, newdata = PID_test)
PID_measure <- measure_bin(PID_test$diabetes, PID_pred)
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
PID_measure
#> $conf_mat
#>    pred_label
#> obs  0  1
#>   0 82 18
#>   1 29 25
#> 
#> $accuracy
#> [1] 0.6948052
#> 
#> $ROC
#> 
#> Call:
#> roc.default(response = obs, predictor = pred)
#> 
#> Data: pred in 100 controls (obs 0) < 54 cases (obs 1).
#> Area under the curve: 0.7296
#> 
#> $AUC
#> [1] 0.7296296
plot(PID_measure$ROC)


Multi-class Classification: Glass Dataset

Dataset Description

The Glass dataset contains features to classify glass types.

data(Glass)
Glass_data <- Glass

# Splitting data into training and testing sets
Glass_split <- initial_split(Glass_data, prop = 0.8, strata = "Type")
Glass_train <- training(Glass_split)
Glass_test <- testing(Glass_split)

Model Training

model_multi <- bnns(
  Type ~ -1 + .,
  data = Glass_train, L = 2,
  out_act_fn = 3, iter = 1e3, warmup = 2e2, chains = 2, cores = 2
)
#> Trying to compile a simple C file
#> Running /opt/R/4.4.2/lib/R/bin/R CMD SHLIB foo.c
#> using C compiler: ‘gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’
#> gcc -I"/opt/R/4.4.2/lib/R/include" -DNDEBUG   -I"/home/runner/work/_temp/Library/Rcpp/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/"  -I"/home/runner/work/_temp/Library/RcppEigen/include/unsupported"  -I"/home/runner/work/_temp/Library/BH/include" -I"/home/runner/work/_temp/Library/StanHeaders/include/src/"  -I"/home/runner/work/_temp/Library/StanHeaders/include/"  -I"/home/runner/work/_temp/Library/RcppParallel/include/"  -I"/home/runner/work/_temp/Library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include '/home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include    -fpic  -g -O2  -c foo.c -o foo.o
#> In file included from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Core:19,
#>                  from /home/runner/work/_temp/Library/RcppEigen/include/Eigen/Dense:1,
#>                  from /home/runner/work/_temp/Library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
#>                  from <command-line>:
#> /home/runner/work/_temp/Library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
#>   679 | #include <cmath>
#>       |          ^~~~~~~
#> compilation terminated.
#> make: *** [/opt/R/4.4.2/lib/R/etc/Makeconf:195: foo.o] Error 1
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.000518 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 5.18 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)
#> 
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.000569 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 5.69 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:   1 / 1000 [  0%]  (Warmup)
#> Chain 1: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 2: Iteration: 100 / 1000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 2: Iteration: 200 / 1000 [ 20%]  (Warmup)
#> Chain 2: Iteration: 201 / 1000 [ 20%]  (Sampling)
#> Chain 1: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 2: Iteration: 300 / 1000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 400 / 1000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 500 / 1000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 600 / 1000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 18.44 seconds (Warm-up)
#> Chain 1:                57.758 seconds (Sampling)
#> Chain 1:                76.198 seconds (Total)
#> Chain 1: 
#> Chain 2: Iteration: 700 / 1000 [ 70%]  (Sampling)
#> Chain 2: Iteration: 800 / 1000 [ 80%]  (Sampling)
#> Chain 2: Iteration: 900 / 1000 [ 90%]  (Sampling)
#> Chain 2: Iteration: 1000 / 1000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 20.425 seconds (Warm-up)
#> Chain 2:                92.069 seconds (Sampling)
#> Chain 2:                112.494 seconds (Total)
#> Chain 2:

Model Evaluation

Glass_pred <- predict(model_multi, newdata = Glass_test)
measure_cat(Glass_test$Type, Glass_pred)
#> $log_loss
#> [1] 1.353789
#> 
#> $ROC
#> 
#> Call:
#> multiclass.roc.default(response = obs, predictor = `colnames<-`(data.frame(pred),     levels(obs)))
#> 
#> Data: multivariate predictor `colnames<-`(data.frame(pred), levels(obs)) with 6 levels of obs: 1, 2, 3, 5, 6, 7.
#> Multi-class area under the curve: 0.7095
#> 
#> $AUC
#> [1] 0.7094742

Summary

The performance of the bnns package demonstrates its flexibility across various machine learning tasks. It provides posterior distributions of predictions, which can be used for uncertainty quantification and probabilistic decision-making.