Exam PA topic · 20–30% of exam

Decision Trees and Ensemble Methods

Fitting and tuning decision trees, random forests, and gradient boosting models in R.

Per-objective worked-example outlines

For each learning objective on Decision Trees and Ensemble Methods, here is the approach an exam item would test — the setup, the ordering of your reasoning, and the formula or identity you need to bring to the page. Approaches, not full solutions, by design. Verify against the current soa.org syllabus before your sitting.

Fit regression and classification trees using rpart in R

Setup

A dataset is given and you must fit a decision tree using rpart() and then prune it.

Approach

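Call rpart(y ~ ., data = df, method = "anova") for a regression tree or method = "class" for a classification tree, with the control parameters cp (complexity parameter) and minsplit set via rpart.control(). Examine the cp table with printcp() and select the cp at the minimum xerror, or the largest cp whose xerror is within one standard error of that minimum (the one-SE rule). Prune with prune(tree, cp = chosen_cp). Visualize with rpart.plot() and interpret each split path.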

Key identity

rpart(formula, method, control); printcp() then prune(tree, cp).
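
The fit, inspect, prune sequence above can be sketched as follows. This is a minimal illustration, not an exam solution: it assumes the built-in mtcars data with mpg as the target, and the control values (cp = 0.001, minsplit = 5) are arbitrary choices made only so the small dataset grows a tree worth pruning.

```r
library(rpart)

# Illustrative data only: built-in mtcars, numeric target mpg.
set.seed(42)  # rpart's internal cross-validation folds are random
fit <- rpart(
  mpg ~ ., data = mtcars, method = "anova",
  control = rpart.control(cp = 0.001, minsplit = 5, xval = 10)
)

printcp(fit)  # columns: CP, nsplit, rel error, xerror, xstd

cp_tab <- fit$cptable
best   <- which.min(cp_tab[, "xerror"])
cp_min <- cp_tab[best, "CP"]  # cp at the minimum cross-validated error

# One-SE rule: the simplest tree (first row, since rows are ordered from
# fewest to most splits) whose xerror is within one xstd of the minimum.
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]
cp_1se    <- cp_tab[which(cp_tab[, "xerror"] <= threshold)[1], "CP"]

pruned <- prune(fit, cp = cp_1se)
print(pruned)

# Optional plot; rpart.plot is a separate package and may not be installed.
if (requireNamespace("rpart.plot", quietly = TRUE)) {
  rpart.plot::rpart.plot(pruned)
}
```

Swapping cp_1se for cp_min in the prune() call gives the minimum-xerror tree instead; the one-SE tree is never larger than that one.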

Tune random forests and gradient boosted trees for performance

Setup

You must fit a random forest and a gradient boosting model and tune key hyperparameters.

Approach

Random forest with randomForest(): tune mtry (the number of variables tried at each split) and ntree, using OOB error to compare settings. GBM with gbm(): tune n.trees, interaction.depth, and shrinkage. Smaller shrinkage with more trees usually generalizes better but takes longer to fit. Select the GBM hyperparameters on a validation set or by CV, and report the final OOB or CV error.

Key identity

RF: tune mtry, ntree. GBM: tune n.trees, depth, shrinkage. Validate.
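
A rough sketch of both tuning loops follows. The data frame df, its columns, and the grid values are all hypothetical and simulated here only so the example runs on its own. The GBM part uses gbm's train.fraction argument, which holds out the last 20% of rows as a validation set; that is one of several reasonable ways to validate, and it is only a fair split here because the simulated rows are already in random order.

```r
library(randomForest)
library(gbm)

# Simulated, hypothetical data: four predictors and a nonlinear target.
set.seed(1)
n  <- 500
df <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))
df$y <- 10 * sin(pi * df$x1 * df$x2) + 5 * df$x3 + rnorm(n)

# Random forest: compare OOB MSE across an illustrative mtry grid.
mtry_grid <- 1:4
oob_mse <- sapply(mtry_grid, function(m) {
  rf <- randomForest(y ~ ., data = df, mtry = m, ntree = 300)
  tail(rf$mse, 1)  # OOB MSE after the final tree (regression forests)
})
best_mtry <- mtry_grid[which.min(oob_mse)]

# GBM: small grid over shrinkage and depth, scored on the held-out rows.
grid <- expand.grid(shrinkage = c(0.1, 0.01), depth = c(1, 3))
grid$best_trees <- NA_integer_
grid$valid_mse  <- NA_real_
for (i in seq_len(nrow(grid))) {
  g <- gbm(
    y ~ ., data = df, distribution = "gaussian",
    n.trees = 1000, interaction.depth = grid$depth[i],
    shrinkage = grid$shrinkage[i], train.fraction = 0.8,
    bag.fraction = 0.5, verbose = FALSE
  )
  # Number of trees that minimizes validation error for this setting.
  grid$best_trees[i] <- as.integer(gbm.perf(g, method = "test", plot.it = FALSE))
  grid$valid_mse[i]  <- min(g$valid.error)
}

print(data.frame(mtry = mtry_grid, oob_mse = oob_mse))
print(grid[order(grid$valid_mse), ])
```

Note that n.trees is tuned inside each fit rather than as a separate grid dimension: one long run per setting gives the whole validation-error path.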

Compare tree-based methods to GLMs for an actuarial problem

Setup

You have fitted both a GLM and a tree-ensemble model and must choose one for production.

Approach

Compare on holdout metrics (RMSE, AUC, log loss) using the same validation set. Consider interpretability — GLMs are easier to explain to underwriters and regulators. Consider data size and feature interactions — trees handle non-linearities and high-order interactions automatically. State the tradeoff explicitly in the recommendation.

Key identity

Choose based on (predictive performance, interpretability, regulatory acceptance, complexity).
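
To make the "same validation set" point concrete, here is a minimal sketch that scores a logistic GLM and a classification tree on one shared holdout. Everything about the data is an assumption: the columns age, prior, and claim are simulated stand-ins, a single rpart tree is used in place of an ensemble to keep the example dependency-free, and the auc() and log_loss() helpers are hand-written for illustration rather than taken from a package.

```r
library(rpart)

# Simulated, hypothetical claim data.
set.seed(7)
n  <- 2000
df <- data.frame(age = runif(n, 18, 80), prior = rpois(n, 0.5))
df$claim   <- rbinom(n, 1, plogis(-3 + 0.03 * df$age + 0.6 * df$prior))
df$claim_f <- factor(df$claim)

# One split, reused for both models.
idx   <- sample(n, round(0.7 * n))
train <- df[idx, ]
valid <- df[-idx, ]

glm_fit  <- glm(claim ~ age + prior, data = train, family = binomial)
tree_fit <- rpart(claim_f ~ age + prior, data = train, method = "class")

p_glm  <- predict(glm_fit, newdata = valid, type = "response")
p_tree <- predict(tree_fit, newdata = valid, type = "prob")[, "1"]

# AUC via the rank (Mann-Whitney) formula; average ranks handle ties.
auc <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Log loss with clipping so leaf probabilities of 0 or 1 stay finite.
log_loss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

results <- data.frame(
  model    = c("GLM", "Tree"),
  auc      = c(auc(valid$claim, p_glm), auc(valid$claim, p_tree)),
  log_loss = c(log_loss(valid$claim, p_glm), log_loss(valid$claim, p_tree))
)
print(results)
```

A table like results covers only the predictive-performance leg of the recommendation; interpretability and regulatory acceptance still have to be argued in words.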

Common exam traps on Decision Trees and Ensemble Methods

Recurring patterns where candidates lose points on items in this topic. Each entry pairs the trap with the fix.

Trap

Using the default cp value in rpart without checking the cp table.

Fix

Always inspect printcp() and choose the cp that minimizes cross-validated error (xerror), or the cp given by the one-SE rule.

Trap

Tuning RF and GBM on training error.

Fix

Use OOB error for RF and a validation set or CV for GBM.

Trap

Recommending a GBM in a regulatory context without discussing interpretability.

Fix

For regulated lines, discuss interpretability and possibly recommend a GLM with comparable performance.

Trap

Ignoring feature importance from RF/GBM in the report.

Fix

Always include variable importance plots and discuss the top features in business terms.
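
As a sketch of how importance output might be pulled for the report, the example below fits both model types to simulated data. The data frame and its columns (x1, x2, and a pure-noise predictor) are hypothetical, and the hyperparameter values are arbitrary.

```r
library(randomForest)
library(gbm)

# Simulated, hypothetical data: x1 matters most, noise does not matter.
set.seed(3)
n  <- 500
df <- data.frame(x1 = runif(n), x2 = runif(n), noise = runif(n))
df$y <- 4 * df$x1 + 2 * df$x2^2 + rnorm(n, sd = 0.5)

# Random forest: importance = TRUE adds the permutation-based measure.
rf     <- randomForest(y ~ ., data = df, ntree = 300, importance = TRUE)
rf_imp <- importance(rf)  # regression columns: %IncMSE, IncNodePurity
print(rf_imp)
varImpPlot(rf)            # dot chart of both measures

# GBM: relative influence, normalized to sum to 100.
g <- gbm(
  y ~ ., data = df, distribution = "gaussian",
  n.trees = 300, interaction.depth = 2, shrinkage = 0.05, verbose = FALSE
)
gbm_imp <- summary(g, n.trees = 300, plotit = FALSE)  # columns: var, rel.inf
print(gbm_imp)
```

Importance ranks predictors but says nothing about the direction of an effect, so the business discussion usually needs a second view of the top features as well.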

Where to find Decision Trees and Ensemble Methods in popular manuals

Pointers to where each major vendor covers this topic, so you can grab the right chapter without combing the full manual. We do not reproduce vendor content — just the location. Chapter and lesson numbers shift between editions; use these as a guide, not as a citation.

ACTEX

Trees and ensembles chapter in the PA manual with rpart, randomForest, gbm examples

Coaching Actuaries

Learn modules on Decision Trees and Ensembles in R; Adapt category "Trees / PA"

The Infinite Actuary

PA trees video block including rpart, RF, and GBM

7-day Decision Trees and Ensemble Methods micro plan

A focused 7-day sub-schedule for Decision Trees and Ensemble Methods specifically, at roughly 1.5–2.5 hours per day. Drop it inside your full Exam PA plan as a single coverage module.

Day 1

Practice rpart on a sample dataset; explore the cp table and prune to the one-SE point.

Day 2

Practice randomForest tuning — grid search over mtry and ntree, examine OOB error.

Day 3

Practice gbm tuning — grid search over n.trees, depth, and shrinkage on a validation set.

Day 4

Compare GLM and tree ensemble on the same dataset; write a one-page recommendation.

Day 5

Variable importance and partial dependence plots — practice generating and interpreting them.

Day 6

End-to-end tree section of a mock project against a 60-minute timer.

Day 7

Re-do flagged areas and refine your model-comparison report template.
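
For the Day 5 practice, a partial dependence curve can be generated along these lines. This is a small sketch on simulated, hypothetical data (columns x1 and x2), using partialPlot() from the randomForest package; other packages offer similar plots.

```r
library(randomForest)

# Simulated, hypothetical data: y rises linearly in x1 and oscillates in x2.
set.seed(11)
n  <- 500
df <- data.frame(x1 = runif(n), x2 = runif(n))
df$y <- 4 * df$x1 + sin(2 * pi * df$x2) + rnorm(n, sd = 0.5)

rf <- randomForest(y ~ ., data = df, ntree = 300)

# Average prediction as x1 varies, with the other predictors left as observed.
pd <- partialPlot(rf, pred.data = df, x.var = "x1", plot = FALSE)
plot(pd$x, pd$y, type = "l", xlab = "x1", ylab = "Average predicted y")
```

Reading the curve means describing its direction and shape in business terms, for example whether the predicted value rises steadily with the feature or flattens past some point.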

How exclam.ai helps you master Decision Trees and Ensemble Methods

Flashcards from your manual

Upload your ACTEX Exam PA digital edition, scanned ASM pages, TIA handouts, or your own notes. exclam.ai extracts the Decision Trees and Ensemble Methods sections and generates flashcards automatically, tuned to the exam traps above.

Worked-example drilling

Each per-objective approach above maps to a quiz template. exclam.ai re-surfaces missed items until you can recall both the setup and the key identity from cold.

FSRS spaced repetition

Because Decision Trees and Ensemble Methods is 20–30% of your exam, losing it during review costs you. FSRS brings it back at the optimal moment.

Decision Trees and Ensemble Methods in the Exam PA context

SOA Exam PA has 4 topic areas. Decision Trees and Ensemble Methods is weighted at approximately 20–30% of the exam. Here is where it sits relative to the other topics.

Topic area | Weight
Problem Framing and Data Preparation | 15–25%
Generalized Linear Models | 30–40%
→ Decision Trees and Ensemble Methods | 20–30%
Model Validation and Business Communication | 15–25%

Start practicing Decision Trees and Ensemble Methods today

Upload your ACTEX Exam PA digital edition, scanned ASM pages, TIA handouts, or your own notes. exclam.ai generates a fully guided study plan with adaptive flashcards and quizzes for this topic.
