Genetics and Risk Prediction

We have published a paper on predicting the risk of preeclampsia that incorporates the genetic risk of both the pregnant individual and her partner. We have been developing risk prediction models for preeclampsia for about five years (across three papers), and this study is our first step into a new domain, genetic information, making it one of the first of its kind globally.

In preventive medicine, an individual’s genetic predisposition is a factor that cannot be ignored. We hope that this approach will be applied across a wide range of fields in the future.

Read the paper

The Starting Point: Why Do We Predict?

This paper marks a milestone in my research, yet one question has consistently challenged me:

“Why do we need to predict disease risk in the first place?”

When working with data, there is an almost instinctive drive to predict. My research began with that instinct. However, as a medical student at the time, I did not fully understand why prediction models for preeclampsia were considered so important in obstetrics.

Still, the fact that numerous prediction models had already been reported in the literature suggested that they must hold some meaningful value.

The Mainstream in Epidemiology: Causal Inference

The mainstream of epidemiological research is grounded in causal inference.

Studies centered on prediction models tend to be more aligned with data science than with clinical research, and in clinical papers they are often treated as secondary to causal inference.

At its core, causal inference asks:

“What would happen if this exposure or intervention did not occur (or did occur)?”

This question is directly relevant to clinical decision-making and health policy.

For example:

A medication lowers blood pressure by 5 mmHg
Even light exercise reduces cardiovascular risk by 3%

These findings are intuitively actionable in clinical practice.

Risk Prediction vs. Causal Inference

What about prediction models? Does knowing an individual’s risk change clinical practice?

In many cases, the clinical application of prediction models remains challenging. A common misunderstanding is:

Interpreting prediction models as if they estimate intervention effects

Example

Consider a lung cancer risk prediction model that includes:

Baseline characteristics
Smoking status
Serum cytokine levels

Suppose a smoker is estimated to have a 10-year risk of 5%. If we change the smoking status in the model to “non-smoker” while leaving every other input unchanged, the predicted risk decreases to 4%.

However, this difference of 1 percentage point does not mean that quitting smoking reduces risk by 1 percentage point.

This is because:

Changes in cytokine levels following smoking cessation are not captured
The model describes associations among people observed as smokers or non-smokers, not what happens when one person actually quits

With some exceptions (such as g-methods), prediction-based approaches do not satisfy the assumptions that causal inference requires.

If the goal is to estimate intervention effects, a causal inference framework should be used from the outset.
Regression coefficients or feature contributions from prediction models should not be interpreted as causal effects.
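
The lung cancer example can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: every number in it (smoking prevalence, cytokine probabilities, risk contributions) is invented, the cytokine level is simplified to a yes/no "elevated" flag, there is no confounding, and the prediction model is a plain linear probability model fitted by least squares. It is not the model from our paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Hypothetical data-generating process (all numbers invented for illustration).
smoker = rng.random(n) < 0.3
# Smoking raises the chance of elevated cytokines, which act as a mediator.
p_cytokine = np.where(smoker, 0.8, 0.2)
cytokine = rng.random(n) < p_cytokine
# Risk depends on smoking directly and, separately, on cytokines.
p_cancer = 0.02 + 0.01 * smoker + 0.02 * cytokine
cancer = rng.random(n) < p_cancer

# Prediction model: intercept + smoking + cytokine, fitted by least squares.
X = np.column_stack([np.ones(n), smoker, cytokine]).astype(float)
coef, *_ = np.linalg.lstsq(X, cancer.astype(float), rcond=None)

# Switching "smoker" to "non-smoker" in the model while the cytokine
# input stays fixed changes the prediction by the smoking coefficient.
toggle_difference = coef[1]

# Changing smoking in this simulated world also shifts cytokines, so the
# effect is the direct part plus the part that travels through the mediator.
true_effect = 0.01 + 0.02 * (0.8 - 0.2)

print(f"difference from toggling the predictor: {toggle_difference:.3f}")
print(f"effect of the simulated intervention:   {true_effect:.3f}")
```

In this setup the toggled difference should land near the direct contribution of about 1 percentage point, while the effect built into the simulation is 2.2 percentage points. The prediction model is not wrong as a predictor; it simply answers a different question from the one an intervention study asks.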

When Does Risk Prediction Matter?

In my view, the value of prediction models can be summarized in three situations:

When presenting risk can lead to behavioral change in individuals
When clinical interventions vary according to risk level
When risk estimates influence health policy decisions

In other words: