Per-objective worked-example outlines

For each learning objective on Decision Trees, here is the approach an exam item would test — the setup, the ordering of your reasoning, and the formula or identity you need to bring to the page. Approaches, not full solutions, by design. Verify against the current soa.org syllabus before your sitting.

Construct and interpret classification and regression trees

Setup

A small dataset is given and you must build a tree by recursive binary splitting, using Gini or entropy for classification, or RSS for regression.

Approach

At each node, search every predictor and every candidate split point for the split that minimizes the criterion over the two child nodes: size-weighted Gini or entropy for classification, total RSS for regression. Recurse on each child until a stopping rule fires (minimum node size, maximum depth, or minimum improvement). Each leaf's prediction is the majority class (classification) or the mean response (regression) of the training observations that reach it.

Key identity

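Choose the split that minimizes Σ_k p_k (1 - p_k) (Gini), -Σ_k p_k log p_k (entropy), or Σ_i (y_i - ȳ)^2 (RSS), where p_k is the proportion of class k in a node and ȳ is the node mean. Each criterion is evaluated in both child nodes and combined: size-weighted for Gini and entropy, summed for RSS.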
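
The split search can be sketched in a few lines. This is a minimal illustration for one numeric predictor and the Gini criterion, not a full tree builder; the function names and the toy data are hypothetical, and midpoints between adjacent distinct values are assumed as the candidate split points.

```python
from collections import Counter

def gini(labels):
    """Gini impurity, sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def best_split(x, y):
    """Return (threshold, weighted_gini) for the best split x < t on one predictor."""
    n = len(y)
    best = (None, gini(y))  # start from the unsplit node
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # assumed candidate: midpoint of adjacent distinct values
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data, not taken from any exam item
x = [1, 2, 3, 4, 5, 6]
y = ["A", "A", "A", "B", "B", "B"]
threshold, score = best_split(x, y)
```

On this toy data the classes separate perfectly at 3.5, so the weighted Gini of the best split is zero. A full tree repeats the same search over every predictor and then recurses on each child.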

Apply cost-complexity pruning to simplify decision trees

Setup

A fully grown tree overfits the training data and you must prune using cost-complexity to find the optimal subtree.

Approach

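Define the cost complexity Cα(T) = error(T) + α |T|, where error(T) is the misclassification error (or RSS) and |T| is the number of terminal nodes. Iteratively collapse the weakest link, meaning the internal node whose removal gives the smallest increase in error per terminal node removed, to generate a nested sequence of subtrees. Choose α by cross-validation and keep the subtree in the sequence that corresponds to it. The resulting tree balances fit and complexity.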

Key identity

Cα(T) = error(T) + α |T|; cross-validate α.
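
To see how α trades fit against size, the sketch below evaluates Cα(T) over a nested sequence of subtrees. It assumes the sequence has already been produced by weakest-link pruning; the (terminal nodes, error) pairs and the function names are hypothetical.

```python
def cost_complexity(error, n_leaves, alpha):
    """C_alpha(T) = error(T) + alpha * |T|."""
    return error + alpha * n_leaves

def best_subtree(subtrees, alpha):
    """Pick the (n_leaves, error) pair with the smallest cost complexity."""
    return min(subtrees, key=lambda t: cost_complexity(t[1], t[0], alpha))

def link_alphas(subtrees):
    """Increase in error per terminal node removed between consecutive subtrees."""
    return [(e2 - e1) / (n1 - n2)
            for (n1, e1), (n2, e2) in zip(subtrees, subtrees[1:])]

# Hypothetical nested sequence, largest tree first: (terminal nodes, training RSS)
subtrees = [(8, 10.0), (5, 13.0), (3, 18.0), (1, 40.0)]

for alpha in (0, 2, 5, 20):
    print(alpha, best_subtree(subtrees, alpha))
```

With these numbers the per-node error increases are 1.0, 2.5, and 11.0, so the 8-leaf tree is optimal for α below 1.0, the 5-leaf tree between 1.0 and 2.5, and so on. In practice α itself is picked by cross-validation rather than by inspection.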

Explain and apply bagging, random forests, and boosted trees

Setup

A single tree is compared against bagging, random forest, and gradient boosting on a regression or classification task.

Approach

Bagging: fit B trees on bootstrap samples and average their predictions (majority vote for classification). Random forest: same as bagging, but at each split consider only a random subset of m predictors, which decorrelates the trees and lowers the variance of the average. Boosting: fit trees sequentially to residuals (or to reweighted observations), so that each tree corrects the errors of the ensemble so far. Random forests are robust with little tuning; boosting is often more accurate but needs tuning of the number of trees, the shrinkage, and the tree depth.

Key identity

Bagging: parallel bootstrap. Random forest: bagging + random feature subsets. Boosting: sequential weighted fits.
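
The sequential nature of boosting is easiest to see in code. Below is a minimal sketch of boosting for regression with one-split trees (stumps) on a single predictor, assuming squared-error loss and a fixed shrinkage; the function names, the toy data, and the parameter defaults are hypothetical, and real implementations differ in many details.

```python
def fit_stump(x, r):
    """Least-squares stump on one predictor; returns (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between adjacent distinct values
        left = [ri for xi, ri in zip(x, r) if xi < t]
        right = [ri for xi, ri in zip(x, r) if xi >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        score = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
        if best is None or score < best[0]:
            best = (score, t, lm, rm)
    return best[1:]

def boost(x, y, n_trees=50, shrinkage=0.1):
    """Fit each stump to the current residuals and add a shrunken copy to the fit."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, resid)
        stumps.append((t, lm, rm))
        pred = [pi + shrinkage * (lm if xi < t else rm) for xi, pi in zip(x, pred)]
    return stumps, pred

def training_rss(y, pred):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred))

# Hypothetical toy data
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
stumps_few, pred_few = boost(x, y, n_trees=5)
stumps_many, pred_many = boost(x, y, n_trees=100)
```

Training RSS keeps falling as trees are added, which is exactly why the number of trees has to be chosen on held-out data. Bagging and random forests differ structurally: their trees are fit independently of one another, so they could be built in any order.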

Common exam traps on Decision Trees

Recurring patterns where candidates lose points on Decision Trees-style items. Each entry pairs the trap with the fix.

Trap

Confusing Gini index with entropy in classification tree splits.

Fix

Both measure impurity; they differ slightly numerically but the chosen split is usually the same.
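
A quick numeric check makes the point concrete. The sketch below scores two hypothetical candidate splits of a 10-observation, two-class node under both measures; the class counts and function names are made up for illustration, and natural logs are assumed for entropy.

```python
import math

def gini(counts):
    """Gini impurity from class counts in one node."""
    n = sum(counts)
    return sum((c / n) * (1 - c / n) for c in counts)

def entropy(counts):
    """Entropy (natural log) from class counts; empty classes contribute zero."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def weighted(impurity, children):
    """Size-weighted impurity of a split, given class counts per child."""
    total = sum(sum(c) for c in children)
    return sum(sum(c) / total * impurity(c) for c in children)

# Hypothetical class counts (class A, class B) in the left and right child
split_1 = [(4, 1), (1, 4)]
split_2 = [(3, 2), (2, 3)]

for name, f in (("gini", gini), ("entropy", entropy)):
    print(name, round(weighted(f, split_1), 4), round(weighted(f, split_2), 4))
```

The two measures give different numbers for each split, but both rank split_1 ahead of split_2.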

Trap

Forgetting that a random forest considers only a random subset of the predictors at each split, not the full set.

Fix

At each split in a random forest, only m of the p predictors are considered (typical defaults: m ≈ √p for classification, m ≈ p/3 for regression). Setting m = p recovers bagging.
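
As a rough sketch of what "only m of p" means at a single split, the helper below draws the candidate predictors using the rule-of-thumb values of m. The function name, the predictor names, and the rounding choices are assumptions for illustration only.

```python
import math
import random

def candidate_predictors(predictors, task, rng):
    """Draw the random subset of predictors considered at one split."""
    p = len(predictors)
    if task == "classification":
        m = max(1, round(math.sqrt(p)))  # m close to sqrt(p)
    else:
        m = max(1, p // 3)               # m close to p / 3
    return rng.sample(predictors, m)

rng = random.Random(0)  # seeded only so the sketch is repeatable
predictors = [f"x{i}" for i in range(1, 13)]  # hypothetical p = 12 predictors
print(candidate_predictors(predictors, "classification", rng))
print(candidate_predictors(predictors, "regression", rng))
```

A fresh subset is drawn at every split of every tree, so a dominant predictor is absent from many splits and the trees end up less alike.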

Trap

Increasing the number of boosting iterations without checking the validation loss.

Fix

Boosting can overfit if not stopped; tune the number of trees via CV or early stopping.
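
One simple form of early stopping can be sketched as follows: track the validation loss after each added tree and stop once it has failed to improve for a set number of rounds. The function name, the patience rule, and the loss values are hypothetical.

```python
def early_stopping_round(val_losses, patience=3):
    """Return the 1-based number of trees with the lowest validation loss,
    scanning until `patience` consecutive rounds fail to improve on it."""
    best_round, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_round, best_loss = i, loss
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds
    return best_round

# Hypothetical validation losses after 1, 2, 3, ... trees
losses = [0.90, 0.70, 0.58, 0.52, 0.50, 0.51, 0.53, 0.56, 0.60]
n_trees = early_stopping_round(losses)
```

Here the validation loss bottoms out at 5 trees and then rises, so the sketch keeps 5 trees even though training loss would continue to fall.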

Trap

Using deep individual trees in a random forest and expecting low variance.

Fix

Each deep tree is a low-bias, high-variance learner on its own. A random forest gets its variance reduction from averaging many decorrelated trees, not from limiting the depth of each tree.
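
The effect of decorrelation can be quantified with a standard result that is not stated in the text above: if B trees each have variance σ² and every pair has correlation ρ, their average has variance ρσ² + (1 - ρ)σ²/B. The sketch below simply evaluates that expression; the function name and the input values are illustrative assumptions.

```python
def ensemble_variance(sigma2, rho, n_trees):
    """Variance of the average of n_trees identically distributed predictions
    with common variance sigma2 and common pairwise correlation rho."""
    return rho * sigma2 + (1 - rho) * sigma2 / n_trees

# Hypothetical values: per-tree variance 1.0, 100 trees
for rho in (0.0, 0.5, 0.9):
    print(rho, ensemble_variance(1.0, rho, 100))
```

Adding trees shrinks only the second term. The first term, ρσ², is a floor that falls only when the trees become less correlated, which is what the random feature subsets are for.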

Where to find Decision Trees in popular manuals

Pointers to where each major vendor covers this topic, so you can grab the right chapter without combing the full manual. We do not reproduce vendor content — just the location. Chapter and lesson numbers shift between editions; use these as a guide, not as a citation.

ACTEX

Decision trees and ensembles chapter (CART, random forest, GBM)

Coaching Actuaries

Learn modules on Decision Trees; Adapt category "Decision Trees / Ensembles"

The Infinite Actuary

Tree and ensemble video block

7-day Decision Trees micro plan

A focused 7-day sub-schedule for Decision Trees specifically, at roughly 1.5–2.5 hours per day. Drop it inside your full Exam SRM plan as a single coverage module.

Day 1

Read the tree-building chapter; manually grow a small regression tree on a 10-row dataset.

Day 2

Practice manual splits using Gini, entropy, and RSS on small classification examples.

Day 3

Pruning — work 6 problems applying cost-complexity to a fully grown tree.

Day 4

Bagging and random forest — 8 conceptual and applied problems.

Day 5

Gradient boosting — 6 problems including tuning iterations and shrinkage.

Day 6

Mixed 18-problem timed drill spanning trees and ensembles.

Day 7

Re-do flagged problems and create a one-page chart comparing tree methods.

How exclam.ai helps you master Decision Trees

Flashcards from your manual

Upload your ACTEX Exam SRM digital edition, scanned ASM pages, TIA handouts, or your own notes. exclam.ai extracts the Decision Trees sections and generates flashcards automatically, tuned to the exam traps above.

Worked-example drilling

Each per-objective approach above maps to a quiz template. exclam.ai re-surfaces missed items until you can recall both the setup and the key identity from cold.

FSRS spaced repetition

Because Decision Trees is roughly 20–25% of your exam, forgetting it between study sessions is costly. FSRS schedules each card for review shortly before you are likely to forget it.

Decision Trees in the Exam SRM context

SOA Exam SRM has 5 topic areas. Decision Trees is weighted at approximately 20–25% of the exam. Here is where it sits relative to the other topics.

Topic area	Weight
Basics of Statistical Learning	7–13%
Linear Models	40–50%
→ Decision Trees	20–25%
Principal Components and Cluster Analysis	5–10%
Time Series Models	5–10%

