Exam PA topic · 15–25% of exam

Problem Framing and Data Preparation

Translating a business problem into a predictive modeling question, exploratory data analysis, and feature engineering.

Per-objective worked-example outlines

For each learning objective on Problem Framing and Data Preparation, here is the approach an exam item would test — the setup, the ordering of your reasoning, and the formula or identity you need to bring to the page. Approaches, not full solutions, by design. Verify against the current soa.org syllabus before your sitting.

Identify the business problem and frame it as a supervised learning task

Setup

A business stakeholder describes a goal in plain English (e.g., reduce policy lapse) and you must translate it into a modeling problem.

Approach

Identify the response (lapse/no lapse, claim amount, etc.) and decide whether it is regression or classification. Define the unit of analysis (policy, claim, customer) and the prediction horizon. State assumptions about data availability at scoring time — do not use post-event variables as predictors.

Key identity

Translate "what business problem" → "what response, unit, and prediction horizon".
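As a rough illustration of the framing step, the sketch below turns "reduce policy lapse" into a binary response at the policy level with a one-year horizon. The data frame, its column names, and the 365-day horizon are all hypothetical, chosen only to make the idea concrete.

```r
# Hypothetical policy-level data: one row per policy (the unit of analysis).
policies <- data.frame(
  policy_id    = 1:6,
  issue_date   = as.Date(c("2022-01-15", "2022-02-01", "2022-03-10",
                           "2022-04-05", "2022-05-20", "2022-06-30")),
  lapse_date   = as.Date(c("2022-09-01", NA, "2023-08-15",
                           NA, "2022-11-30", NA)),
  issue_age    = c(34, 51, 45, 29, 62, 38),
  # Recorded only once a lapse has happened, so it is a post-event variable.
  lapse_reason = c("price", NA, "moved", NA, "price", NA),
  stringsAsFactors = FALSE
)

# Assumed prediction horizon: lapse within one year of issue.
horizon_days <- 365

# Response: 1 if the policy lapsed within the horizon, 0 otherwise.
days_to_lapse <- as.numeric(policies$lapse_date - policies$issue_date)
policies$lapsed_1yr <- as.integer(!is.na(days_to_lapse) &
                                    days_to_lapse <= horizon_days)

# Keep only the identifier, the response, and predictors known at issue.
model_data <- policies[, c("policy_id", "issue_age", "lapsed_1yr")]
print(model_data)
```

A binary response like `lapsed_1yr` makes this a classification problem; had the stakeholder asked about claim amounts, the same steps would lead to a numeric response and a regression problem.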

Perform exploratory data analysis to identify patterns and data quality issues

Setup

A raw dataset is loaded into R and you must perform an EDA before any modeling.

Approach

Inspect each variable: type, distribution, missingness, outliers. Plot the response against key candidate predictors. Check for class imbalance in classification problems. Identify potential leakage variables (those known only after the prediction time). Note categorical variables with rare levels that may need consolidation.

Key identity

EDA = data audit: types, distributions, missingness, leakage, response relationships.
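The audit described above can be sketched in base R as follows. The `claims` data frame is a made-up stand-in for an exam dataset, and the rare-level cutoff of 2 is an arbitrary choice for illustration.

```r
# Hypothetical dataset standing in for a raw exam file.
claims <- data.frame(
  age     = c(23, 35, NA, 52, 41, 67, 30, 45, NA, 58),
  region  = factor(c("N", "N", "S", "N", "S", "N", "W", "N", "S", "N")),
  premium = c(500, 620, 580, 710, 9000, 640, 530, 600, 615, 690),
  lapsed  = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)

str(claims)      # variable types
summary(claims)  # distributions and NA counts per variable

# Missingness: number of NA values in each column.
missing_counts <- colSums(is.na(claims))

# Outliers: values beyond 1.5 * IQR from the hinges (the boxplot rule).
outliers <- boxplot.stats(claims$premium)$out

# Class balance of the response.
class_share <- prop.table(table(claims$lapsed))

# Categorical levels with few observations (cutoff of 2 is illustrative).
level_counts <- table(claims$region)
rare_levels <- names(level_counts[level_counts < 2])

# Response rate by a candidate predictor.
lapse_by_region <- tapply(claims$lapsed, claims$region, mean)

print(missing_counts)
print(outliers)
print(class_share)
print(rare_levels)
print(lapse_by_region)
```

Each printed object maps to one line of the EDA writeup: which variables have gaps, which values look extreme, how unbalanced the response is, and which levels are candidates for consolidation.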

Engineer features including transformations, binning, and interaction terms

Setup

EDA reveals a skewed variable, a categorical with many rare levels, and a predictor whose relationship with the response differs across levels of another predictor.

Approach

Apply log or power transforms to skewed continuous variables. Collapse categorical levels that have few observations into broader groups, and bin a continuous variable when grouped ranges describe its relationship with the response better than a straight line. Add interaction terms when the relationship between a predictor and the response varies across groups. Document each transformation with a business rationale so it can be defended in the report.

Key identity

Common transforms: log for skew, collapsed levels for sparse categories, bins for stepwise effects, interactions for differential effects.
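A minimal base R sketch of these transforms is shown below. The data, the level-count cutoff, and the age breakpoints are all hypothetical; on the exam each choice would need its own business rationale.

```r
# Hypothetical data for illustration only.
policies <- data.frame(
  income     = c(28000, 35000, 41000, 52000, 60000, 75000, 90000, 450000),
  occupation = c("clerk", "clerk", "nurse", "clerk",
                 "nurse", "pilot", "nurse", "diver"),
  smoker     = factor(c("no", "yes", "no", "no", "yes", "no", "yes", "no")),
  age        = c(25, 40, 33, 58, 47, 36, 62, 29),
  stringsAsFactors = FALSE
)

# 1. Log transform for a right-skewed, strictly positive variable.
policies$log_income <- log(policies$income)

# 2. Collapse levels with fewer than 2 observations into "other".
occ_counts <- table(policies$occupation)
rare <- names(occ_counts[occ_counts < 2])
policies$occupation_grp <- factor(
  ifelse(policies$occupation %in% rare, "other", policies$occupation)
)

# 3. Bin a continuous variable into bands: [0, 35), [35, 50), [50, Inf).
policies$age_band <- cut(
  policies$age,
  breaks = c(0, 35, 50, Inf),
  labels = c("under_35", "35_to_49", "50_plus"),
  right  = FALSE
)

# 4. Interaction: age * smoker expands to age + smoker + age:smoker.
design <- model.matrix(~ age * smoker, data = policies)

print(table(policies$occupation_grp))
print(table(policies$age_band))
print(colnames(design))
```

The `age:smokeryes` column in the design matrix lets the age slope differ between smokers and non-smokers, which is what "differential effects" means in practice.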

Common exam traps on Problem Framing and Data Preparation

Recurring patterns where candidates lose points on items covering this topic. Each entry pairs the trap with the fix.

Trap

Including post-event variables (data leakage) as predictors.

Fix

Check the timing: every predictor must be known at the prediction time, not after.
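One way to make the timing check mechanical is to keep a small data dictionary that records when each variable becomes known, then filter candidate predictors against it. The dictionary and variable names below are hypothetical; exam datasets do not necessarily ship with one, so you may need to build it from the variable descriptions.

```r
# Hypothetical data dictionary: is each variable known before or after
# the prediction time?
dictionary <- data.frame(
  variable = c("issue_age", "premium", "payment_history",
               "lapse_reason", "refund_amount"),
  known    = c("before", "before", "before", "after", "after"),
  stringsAsFactors = FALSE
)

candidate_predictors <- c("issue_age", "premium",
                          "lapse_reason", "refund_amount")

known_before <- dictionary$variable[dictionary$known == "before"]

# Predictors that pass the timing check.
safe_predictors <- intersect(candidate_predictors, known_before)

# Everything else is flagged, including any variable missing from the
# dictionary, which is treated as suspect until its timing is confirmed.
leaky_predictors <- setdiff(candidate_predictors, known_before)

print(safe_predictors)
print(leaky_predictors)
```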

Trap

Ignoring class imbalance in classification problems.

Fix

Discuss the imbalance in the EDA section; consider weighted models or threshold tuning later.
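To see why the cutoff matters under imbalance, the sketch below compares sensitivity and specificity at a few thresholds. The outcomes and predicted probabilities are invented for illustration and do not come from a fitted model.

```r
# Hypothetical observed outcomes (20% positive) and predicted probabilities.
actual <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1)
predicted_prob <- c(0.05, 0.10, 0.08, 0.20, 0.15,
                    0.30, 0.12, 0.25, 0.35, 0.45)

# Report the imbalance first, as you would in the EDA section.
class_share <- prop.table(table(actual))
print(class_share)

# Sensitivity and specificity when predicting 1 at or above a cutoff.
rates_at <- function(cutoff) {
  predicted <- as.integer(predicted_prob >= cutoff)
  c(cutoff      = cutoff,
    sensitivity = sum(predicted == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(predicted == 0 & actual == 0) / sum(actual == 0))
}

# One row per cutoff.
results <- t(sapply(c(0.5, 0.3, 0.2), rates_at))
print(results)
```

With these made-up numbers the default 0.5 cutoff catches no positives at all, while lowering it trades specificity for sensitivity; which trade is acceptable depends on the business cost of each error type.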

Trap

Dropping every row with a missing value.

Fix

Listwise deletion shrinks the sample and can bias results when missingness is related to the response or to other variables; consider imputation or treating missingness as informative.
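A small base R sketch of the alternatives follows, using hypothetical data. Median imputation with a missingness flag is only one of several reasonable choices, shown here because it is simple to explain in a report.

```r
# Hypothetical data with gaps in a numeric and a categorical variable.
applicants <- data.frame(
  income = c(42000, NA, 58000, 61000, NA, 39000),
  region = c("N", "S", NA, "N", "S", NA),
  stringsAsFactors = FALSE
)

# How many rows would survive if every incomplete row were dropped?
complete_rows <- sum(complete.cases(applicants))

# Numeric: record the missingness as its own feature, then impute the median.
applicants$income_missing <- as.integer(is.na(applicants$income))
income_median <- median(applicants$income, na.rm = TRUE)
applicants$income[is.na(applicants$income)] <- income_median

# Categorical: treat missing as its own level.
applicants$region[is.na(applicants$region)] <- "unknown"
applicants$region <- factor(applicants$region)

print(complete_rows)
print(applicants)
```

Here dropping incomplete rows would keep only 2 of 6 records, while the flag-and-impute approach keeps all 6 and lets a model learn whether missingness itself is predictive.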

Trap

Engineering features without a business rationale.

Fix

Document why each transformation is plausible; arbitrary features will be flagged in the report.

Where to find Problem Framing and Data Preparation in popular manuals

Pointers to where each major vendor covers this topic, so you can grab the right chapter without combing the full manual. We do not reproduce vendor content — just the location. Chapter and lesson numbers shift between editions; use these as a guide, not as a citation.

ACTEX

Problem framing and EDA chapter in the PA manual

Coaching Actuaries

Learn modules on Problem Framing and Data Preparation; Adapt category "Data Prep"

The Infinite Actuary

PA framing and EDA video block

7-day Problem Framing and Data Preparation micro plan

A focused 7-day sub-schedule for Problem Framing and Data Preparation specifically, at roughly 1.5–2.5 hours per day. Drop it inside your full Exam PA plan as a single coverage module.

Day 1

Read the problem framing chapter; review a past SOA sample project to see how problems are framed.

Day 2

EDA practice in R — load a sample dataset and walk through summary(), str(), and ggplot2 distributions.

Day 3

Practice writing the framing and EDA sections for one full mock project.

Day 4

Feature engineering — practice transformations (log, bin, interaction) on a dataset, and document each in writing.

Day 5

Drill data leakage detection on 5–6 example datasets with planted leakage variables.

Day 6

End-to-end EDA writeup against a 60-minute timer; treat it like the exam.

Day 7

Re-do flagged sections and refine your EDA report template.

How exclam.ai helps you master Problem Framing and Data Preparation

Flashcards from your manual

Upload your ACTEX Exam PA digital edition, scanned ASM pages, TIA handouts, or your own notes. exclam.ai extracts the Problem Framing and Data Preparation sections and generates flashcards automatically, tuned to the exam traps above.

Worked-example drilling

Each per-objective approach above maps to a quiz template. exclam.ai re-surfaces missed items until you can recall both the setup and the key identity from cold.

FSRS spaced repetition

Because Problem Framing and Data Preparation is 15–25% of your exam, losing it during review costs you. FSRS brings it back at the optimal moment.

Problem Framing and Data Preparation in the Exam PA context

SOA Exam PA has four topic areas. Problem Framing and Data Preparation is weighted at approximately 15–25% of the exam. Here is where it sits relative to the other topics.

Topic area | Weight
→ Problem Framing and Data Preparation | 15–25%
Generalized Linear Models | 30–40%
Decision Trees and Ensemble Methods | 20–30%
Model Validation and Business Communication | 15–25%
