medmodels.treatment_effect_estimation.continuous_estimators
average_treatment_effect
def average_treatment_effect(treated_set: pd.DataFrame,
control_set: pd.DataFrame,
outcome_variable: str) -> float
Calculates the Average Treatment Effect (ATE) as the difference between the outcome means of the treated and control sets. A positive ATE indicates that the treatment increased the outcome, while a negative ATE suggests a decrease.
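The ATE is computed as follows when the numbers of observations in treated and control sets are N and M, respectively:
$$\text{ATE} = \frac{1}{N} \sum_{i=1}^{N} y_1(i) - \frac{1}{M} \sum_{j=1}^{M} y_0(j),$$
where $y_1(i)$ and $y_0(j)$ represent outcome values for individual treated and control observations. In the case of matched sets with equal sizes (N = M), the formula simplifies to:
$$\text{ATE} = \frac{1}{N} \sum_{i=1}^{N} \bigl(y_1(i) - y_0(i)\bigr).$$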
Arguments:
treated_set (pd.DataFrame): DataFrame of the treated group.
control_set (pd.DataFrame): DataFrame of the control group.
outcome_variable (str): Name of the outcome variable.
Returns:
float: The average treatment effect.
This function estimates the impact of a treatment by comparing average outcomes between the treated and control groups.
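As a minimal sketch of the difference of outcome means described above, the following computes an ATE for two small samples. It is not the library's implementation: plain Python lists stand in for the outcome column of each DataFrame, and `mean_difference` and the sample values are hypothetical.

```python
from statistics import fmean


def mean_difference(treated_outcomes, control_outcomes):
    """Difference between the treated and control outcome means (illustrative helper)."""
    return fmean(treated_outcomes) - fmean(control_outcomes)


# Hypothetical outcome values, e.g. a measurement taken after follow-up.
treated_outcomes = [120.0, 130.0, 125.0]  # mean 125.0
control_outcomes = [135.0, 140.0, 130.0]  # mean 135.0

ate = mean_difference(treated_outcomes, control_outcomes)
print(ate)  # -10.0
```

The negative value here means the outcome is lower on average in the treated sample than in the control sample.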
cohen_d
def cohen_d(treated_set: pd.DataFrame,
control_set: pd.DataFrame,
outcome_variable: str,
add_correction: bool = False) -> float
Calculates Cohen's D, the standardized mean difference between two sets, measuring the effect size of the difference between two outcome means. It's applicable for any two sets but is recommended for sets of the same size. Cohen's D indicates how many standard deviations the two groups differ by, with 1 standard deviation equal to 1 z-score.
A rule of thumb for interpreting Cohen's D:
- Small effect = 0.2
- Medium effect = 0.5
- Large effect = 0.8
Arguments:
treated_set (pd.DataFrame): DataFrame containing the treated group data.
control_set (pd.DataFrame): DataFrame containing the control group data.
outcome_variable (str): The name of the outcome variable to analyze.
add_correction (bool, optional): Whether to apply a correction factor for small sample sizes. Defaults to False.
Returns:
float: The Cohen's D coefficient, representing the effect size.
This metric provides a dimensionless measure of effect size, facilitating comparison across different studies and contexts.
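The sketch below illustrates one common definition of Cohen's d for two equally sized sets: the difference in means divided by the root mean square of the two sample standard deviations. Whether the library pools the variances in exactly this way, and which small-sample correction `add_correction` applies, is not stated in this reference, so the helper `cohen_d_sketch` is an assumption and omits the correction.

```python
from math import sqrt
from statistics import fmean, variance


def cohen_d_sketch(treated_outcomes, control_outcomes):
    """Standardized mean difference for two equally sized samples (illustrative)."""
    # Assumption: pool the two sample variances by simple averaging.
    pooled_sd = sqrt((variance(treated_outcomes) + variance(control_outcomes)) / 2)
    return (fmean(treated_outcomes) - fmean(control_outcomes)) / pooled_sd


# Hypothetical outcome values; both samples have a sample variance of 4.
treated_outcomes = [2.0, 4.0, 6.0]  # mean 4.0
control_outcomes = [1.0, 3.0, 5.0]  # mean 3.0

d = cohen_d_sketch(treated_outcomes, control_outcomes)
print(d)  # 0.5
```

By the rule of thumb above, a value of 0.5 would be read as a medium effect.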
medmodels.treatment_effect_estimation.treatment_effect
This module provides a class for analyzing treatment effects in medical records.
The TreatmentEffect class facilitates the analysis of treatment effects over time or across different patient groups. It allows users to identify patients who underwent treatment and experienced outcomes, and to find a control group that meets similar criteria but did not undergo the treatment. The class supports customizable criteria filtering, time constraints between treatment and outcome, and optional matching of control groups to treatment groups using a specified matching class.
TreatmentEffect Objects
class TreatmentEffect()
This class facilitates the analysis of treatment effects over time and across different patient groups.
__init__
def __init__(treatment: Group, outcome: Group) -> None
Initializes a treatment effect analysis from two groups of the MedRecord: the group that contains the treatment node IDs and the group that contains the outcome node IDs.
Arguments:
treatment (Group): The group of treatments to analyze.
outcome (Group): The group of outcomes to analyze.
builder
@classmethod
def builder(cls) -> TreatmentEffectBuilder
Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
estimate
@property
def estimate() -> Estimate
Creates an Estimate object for the TreatmentEffect instance.
Returns:
Estimate: An Estimate object for the current TreatmentEffect instance.
report
@property
def report() -> Report
Creates a Report object for the TreatmentEffect instance.
Returns:
Report: A Report object for the current TreatmentEffect instance.
medmodels.treatment_effect_estimation.tests.test_treatment_effect
Tests for the TreatmentEffect class in the treatment_effect module.
create_patients
def create_patients(patient_list: List[NodeIndex]) -> pd.DataFrame
Create a patients dataframe.
Returns:
pd.DataFrame: A patients dataframe.
create_diagnoses
def create_diagnoses() -> pd.DataFrame
Create a diagnoses dataframe.
Returns:
pd.DataFrame: A diagnoses dataframe.
create_prescriptions
def create_prescriptions() -> pd.DataFrame
Create a prescriptions dataframe.
Returns:
pd.DataFrame: A prescriptions dataframe.
create_edges
def create_edges(patient_list: List[NodeIndex]) -> pd.DataFrame
Create an edges dataframe.
Returns:
pd.DataFrame: An edges dataframe.
create_medrecord
def create_medrecord(
patient_list: List[NodeIndex] = [
"P1",
"P2",
"P3",
"P4",
"P5",
"P6",
"P7",
"P8",
"P9",
]
) -> MedRecord
Create a MedRecord object.
Returns:
MedRecord: A MedRecord object.
TestTreatmentEffect Objects
class TestTreatmentEffect(unittest.TestCase)
Class to test the TreatmentEffect class in the treatment_effect module.
test_metrics
def test_metrics()
Test the metrics of the TreatmentEffect class.
test_full_report
def test_full_report()
Test the reporting of the TreatmentEffect class.
medmodels.treatment_effect_estimation.builder
TreatmentEffectBuilder Objects
class TreatmentEffectBuilder()
with_treatment
def with_treatment(treatment: Group) -> TreatmentEffectBuilder
Sets the treatment group for the treatment effect estimation.
Arguments:
treatment (Group): The treatment group.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder.
with_outcome
def with_outcome(outcome: Group) -> TreatmentEffectBuilder
Sets the outcome group for the treatment effect estimation.
Arguments:
outcome (Group): The group to be used as the outcome.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated outcome group.
with_patients_group
def with_patients_group(group: Group) -> TreatmentEffectBuilder
Sets the group of patients to be used in the treatment effect estimation.
Arguments:
group (Group): The group of patients.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated patients group.
with_time_attribute
def with_time_attribute(
attribute: MedRecordAttribute) -> TreatmentEffectBuilder
Sets the time attribute to be used in the treatment effect estimation.
Arguments:
attribute (MedRecordAttribute): The time attribute.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated time attribute.
with_washout_period
def with_washout_period(
days: Optional[Dict[str, int]] = None,
reference: Optional[Literal["first", "last"]] = None
) -> TreatmentEffectBuilder
Sets the washout period for the treatment effect estimation. The washout period is the period of time before the treatment that is not considered in the estimation.
Arguments:
days (Optional[Dict[str, int]], optional): The duration of the washout period in days. If None, the duration is left as it was. Defaults to None.
reference (Optional[Literal['first', 'last']], optional): The reference point for the washout period. Must be either 'first' or 'last'. Defaults to None.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated washout period.
with_grace_period
def with_grace_period(
days: Optional[int] = None,
reference: Optional[Literal["first", "last"]] = None
) -> TreatmentEffectBuilder
Sets the grace period for the treatment effect estimation. The grace period is the period of time after the treatment that is not considered in the estimation.
Arguments:
days (Optional[int], optional): The duration of the grace period in days. If None, the duration is left as it was (0 days unless previously changed). Defaults to None.
reference (Optional[Literal['first', 'last']], optional): The reference point for the grace period. Must be either 'first' or 'last'. Defaults to None.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated grace period.
with_follow_up_period
def with_follow_up_period(
days: Optional[int] = None,
reference: Optional[Literal["first", "last"]] = None
) -> TreatmentEffectBuilder
Sets the follow-up period for the treatment effect estimation.
Arguments:
days (Optional[int], optional): The duration of the follow-up period in days. If None, the duration is left as it was (365 days unless previously changed). Defaults to None.
reference (Optional[Literal['first', 'last']], optional): The reference point for the follow-up period. Must be either 'first' or 'last'. Defaults to None.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated follow-up period.
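To illustrate how a grace period and a follow-up period could jointly bound the time window in which an outcome is counted, here is a small sketch. It assumes both periods are measured in days from the reference treatment time and that the window is inclusive at both ends; the library's exact semantics are not spelled out in this reference, and `outcome_window` and `counts_as_outcome` are hypothetical helpers.

```python
from datetime import datetime, timedelta


def outcome_window(treatment_time, grace_days=0, follow_up_days=365):
    """Return the (start, end) of the window in which outcomes are considered."""
    start = treatment_time + timedelta(days=grace_days)
    end = treatment_time + timedelta(days=follow_up_days)
    return start, end


def counts_as_outcome(outcome_time, treatment_time, grace_days=0, follow_up_days=365):
    """True if the outcome falls after the grace period and within follow-up."""
    start, end = outcome_window(treatment_time, grace_days, follow_up_days)
    return start <= outcome_time <= end


treatment_time = datetime(2020, 1, 1)

# Outcome 4 days after treatment, inside a 10-day grace period: not counted.
too_early = counts_as_outcome(datetime(2020, 1, 5), treatment_time, grace_days=10)
# Outcome about 5 months after treatment: counted.
in_window = counts_as_outcome(datetime(2020, 6, 1), treatment_time, grace_days=10)
# Outcome more than 365 days after treatment: not counted.
too_late = counts_as_outcome(datetime(2021, 3, 1), treatment_time, grace_days=10)

print(too_early, in_window, too_late)  # False True False
```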
with_outcome_before_treatment_exclusion
def with_outcome_before_treatment_exclusion(
days: int) -> TreatmentEffectBuilder
Defines whether the outcome is allowed to exist before the treatment. The days parameter sets the number of days before the treatment during which the outcome must not exist. If this method is not called, the outcome is allowed to exist before the treatment.
Arguments:
days (int): The number of days before the treatment that the outcome should not exist.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with the updated exclusion setting.
filter_controls
def filter_controls(operation: NodeOperation) -> TreatmentEffectBuilder
Filter the control group based on the provided operation.
Arguments:
operation (NodeOperation): The operation to be applied to the control group.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with the updated control group filter.
with_propensity_matching
def with_propensity_matching(
essential_covariates: MedRecordAttributeInputList = ["gender", "age"],
one_hot_covariates: MedRecordAttributeInputList = ["gender"],
model: Model = "logit",
distance_metric: Metric = "mahalanobis",
number_of_neighbors: int = 1,
hyperparam: Optional[Dict[str, Any]] = None) -> TreatmentEffectBuilder
Adjust the treatment effect estimate using propensity score matching.
Arguments:
essential_covariates (MedRecordAttributeInputList, optional): Covariates that are essential for matching. Defaults to ["gender", "age"].
one_hot_covariates (MedRecordAttributeInputList, optional): Covariates that are one-hot encoded for matching. Defaults to ["gender"].
model (Model, optional): Model to choose for the matching. Defaults to "logit".
distance_metric (Metric, optional): Metric to use for the distance calculation. Defaults to "mahalanobis".
number_of_neighbors (int, optional): Number of neighbors to consider for the matching. Defaults to 1.
hyperparam (Optional[Dict[str, Any]], optional): Hyperparameters for the matching model. Defaults to None.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated matching configurations.
with_nearest_neighbors_matching
def with_nearest_neighbors_matching(
essential_covariates: MedRecordAttributeInputList = ["gender", "age"],
one_hot_covariates: MedRecordAttributeInputList = ["gender"],
distance_metric: Metric = "mahalanobis",
number_of_neighbors: int = 1) -> TreatmentEffectBuilder
Adjust the treatment effect estimate using nearest neighbors matching.
Arguments:
essential_covariates (MedRecordAttributeInputList, optional): Covariates that are essential for matching. Defaults to ["gender", "age"].
one_hot_covariates (MedRecordAttributeInputList, optional): Covariates that are one-hot encoded for matching. Defaults to ["gender"].
distance_metric (Metric, optional): Metric to use for the distance calculation. Defaults to "mahalanobis".
number_of_neighbors (int, optional): Number of neighbors to consider for the matching. Defaults to 1.
Returns:
TreatmentEffectBuilder: The current instance of the TreatmentEffectBuilder with updated matching configurations.
build
def build() -> tee.TreatmentEffect
Builds the treatment effect with all the provided configurations.
Returns:
tee.TreatmentEffect: The treatment effect object.
medmodels.treatment_effect_estimation.estimate
Estimate Objects
class Estimate()
subjects_contigency_table
def subjects_contigency_table(
medrecord: MedRecord) -> Dict[str, Set[NodeIndex]]
Overview of which subjects are in the treatment and control groups and whether they have the outcome or not.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
Dict[str, Set[NodeIndex]]: Dictionary mapping a description of each subject group to the set of subject IDs belonging to that group.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
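A contingency table of this kind can be sketched with plain set operations. The example below is an illustration only: it assumes the control group is simply every patient who is not treated (the library may select controls differently, for example through filtering or matching), and the dictionary keys are hypothetical names, not the keys returned by `subjects_contigency_table`.

```python
# Hypothetical subject IDs.
patients = {"P1", "P2", "P3", "P4", "P5", "P6"}
treated = {"P1", "P2", "P3"}
with_outcome = {"P1", "P4"}

# Assumption: controls are all untreated patients.
control = patients - treated

contingency = {
    "treated_outcome_true": treated & with_outcome,
    "treated_outcome_false": treated - with_outcome,
    "control_outcome_true": control & with_outcome,
    "control_outcome_false": control - with_outcome,
}

# Counts per cell, analogous in spirit to what subject_counts reports.
counts = {name: len(subjects) for name, subjects in contingency.items()}
print(counts)
```

The four cells partition the patient set, so their counts sum to the number of patients.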
subject_counts
def subject_counts(medrecord: MedRecord) -> Dict[str, int]
Returns the subject counts for the treatment and control groups in a Dictionary.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
Dict[str, int]: Dictionary with description of the subject group and their respective counts.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
relative_risk
def relative_risk(medrecord: MedRecord) -> float
Calculates the relative risk (RR) of an event occurring in the treatment group compared to the control group. RR is a key measure in epidemiological studies for estimating the likelihood of an event in one group relative to another.
The interpretation of RR is as follows:
- RR = 1 indicates no difference in risk between the two groups.
- RR > 1 indicates a higher risk in the treatment group.
- RR < 1 indicates a lower risk in the treatment group.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated relative risk between the treatment and control groups.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
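The standard definition of relative risk can be sketched from the four contingency counts: the risk in the treated group divided by the risk in the control group. This is a hedged illustration rather than the library's code; `relative_risk_sketch`, its parameter names, and the counts are hypothetical.

```python
def relative_risk_sketch(treated_true, treated_false, control_true, control_false):
    """Risk in the treated group divided by risk in the control group."""
    treated_total = treated_true + treated_false
    control_total = control_true + control_false
    if treated_total == 0 or control_total == 0 or control_true == 0:
        raise ValueError("Empty group or zero control risk; RR is undefined.")
    treated_risk = treated_true / treated_total
    control_risk = control_true / control_total
    return treated_risk / control_risk


# Hypothetical counts: 20 of 100 treated and 40 of 100 controls have the outcome.
rr = relative_risk_sketch(20, 80, 40, 60)
print(rr)  # 0.5
```

An RR of 0.5 would be read, per the interpretation above, as a lower risk in the treatment group.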
odds_ratio
def odds_ratio(medrecord: MedRecord) -> float
Calculates the odds ratio (OR) to quantify the association between exposure to a treatment and the occurrence of an outcome. OR compares the odds of an event occurring in the treatment group to the odds in the control group, providing insight into the strength of the association between the treatment and the outcome.
Interpretation of the odds ratio:
- OR = 1 indicates no difference in odds between the two groups.
- OR > 1 suggests the event is more likely in the treatment group.
- OR < 1 suggests the event is less likely in the treatment group.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated odds ratio between the treatment and control groups.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
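The textbook odds ratio compares the odds of the outcome in the two groups, which reduces to a cross-product of the four contingency counts. The sketch below assumes that definition; `odds_ratio_sketch`, its parameter names, and the counts are hypothetical and not taken from the library.

```python
def odds_ratio_sketch(treated_true, treated_false, control_true, control_false):
    """Odds of the outcome in the treated group divided by odds in the control group."""
    if treated_false == 0 or control_true == 0 or control_false == 0:
        raise ValueError("An empty cell makes the odds ratio undefined.")
    # (treated_true / treated_false) / (control_true / control_false),
    # written as a cross-product to use a single division.
    return (treated_true * control_false) / (treated_false * control_true)


# Hypothetical counts: 20 of 100 treated and 40 of 100 controls have the outcome.
odds_ratio_value = odds_ratio_sketch(20, 80, 40, 60)
print(odds_ratio_value)  # 0.375
```

Note that the zero checks mirror the three cells named in the Raises section above.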
confounding_bias
def confounding_bias(medrecord: MedRecord) -> float
Calculates the confounding bias (CB) to assess the impact of potential confounders on the observed association between treatment and outcome. A confounder is a variable that influences both the dependent (outcome) and independent (treatment) variables, potentially biasing the study results.
Interpretation of CB:
- CB = 1 indicates no confounding bias.
- CB != 1 suggests the presence of confounding bias, indicating potential confounders.
The method relies on the relative risk (RR) as an intermediary measure and adjusts the observed association for potential confounding effects. This adjustment helps in identifying whether the observed association might be influenced by factors other than the treatment.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated confounding bias.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
absolute_risk
def absolute_risk(medrecord: MedRecord) -> float
Calculates the absolute risk (AR) of an event occurring in the treatment group compared to the control group. AR is a measure of the incidence of an event in each group.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated absolute risk difference between the treatment and control groups.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
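Since the return value is described as an absolute risk difference, a minimal sketch is the incidence in the treated group minus the incidence in the control group. The sign convention (treated minus control) is an assumption, and `absolute_risk_difference_sketch`, its parameter names, and the counts are hypothetical.

```python
def absolute_risk_difference_sketch(treated_true, treated_false, control_true, control_false):
    """Incidence in the treated group minus incidence in the control group."""
    treated_total = treated_true + treated_false
    control_total = control_true + control_false
    if treated_total == 0 or control_total == 0:
        raise ValueError("Both groups need at least one subject.")
    return treated_true / treated_total - control_true / control_total


# Hypothetical counts: 20 of 100 treated and 40 of 100 controls have the outcome.
ard = absolute_risk_difference_sketch(20, 80, 40, 60)
print(ard)  # -0.2
```

Under this sign convention, a negative value means the outcome is less frequent in the treated group.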
number_needed_to_treat
def number_needed_to_treat(medrecord: MedRecord) -> float
Calculates the number needed to treat (NNT) to prevent one additional bad outcome. NNT is derived from the absolute risk reduction.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated number needed to treat between the treatment and control groups.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
ValueError: If the absolute risk is zero, cannot calculate NNT.
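The usual relationship between NNT and absolute risk reduction (ARR) is NNT = 1 / ARR, where ARR is the control risk minus the treated risk. The sketch below assumes that convention; the library's sign handling is not documented here, and `nnt_sketch`, its parameter names, and the counts are hypothetical.

```python
def nnt_sketch(treated_true, treated_false, control_true, control_false):
    """Number needed to treat as the reciprocal of the absolute risk reduction."""
    treated_risk = treated_true / (treated_true + treated_false)
    control_risk = control_true / (control_true + control_false)
    # Assumption: risk reduction is control risk minus treated risk.
    risk_reduction = control_risk - treated_risk
    if risk_reduction == 0:
        raise ValueError("Absolute risk is zero; cannot calculate NNT.")
    return 1 / risk_reduction


# Hypothetical counts: 20 of 100 treated and 40 of 100 controls have the outcome.
nnt = nnt_sketch(20, 80, 40, 60)
print(nnt)  # 5.0
```

With these counts, treating five patients would be expected to prevent one additional bad outcome.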
hazard_ratio
def hazard_ratio(medrecord: MedRecord) -> float
Calculates the hazard ratio (HR) for the treatment group compared to the control group. HR is used to compare the hazard rates of two groups in survival analysis.
Arguments:
medrecord (MedRecord): The MedRecord object containing the data.
Returns:
float: The calculated hazard ratio between the treatment and control groups.
Raises:
ValueError: If the required groups (patients, treatments, outcomes) are not present in the MedRecord.
ValueError: If there are no subjects in the treatment false, control true or control false groups in the contingency table. This would result in division by zero errors.
ValueError: If the control hazard rate is zero, cannot calculate HR.
medmodels.treatment_effect_estimation.report
Report Objects
class Report()
full_report
def full_report(medrecord: MedRecord) -> Dict[str, Any]
Generates a full report of the treatment effect estimation.
Returns:
Dict[str, float]: A dictionary containing the results of all estimation methods: relative risk, odds ratio, confounding bias, absolute risk, number needed to treat, and hazard ratio.