Review Article
Creative Commons, CC-BY
Methylation Risk Scores Incorporating Epigenetic Biomarkers: A Systematic Review of Diagnostic Accuracy and Clinical Utility Across Multiple Diseases
*Corresponding author: Helen Taylor MSc, Managing Director Evexia Group Limited, Integrative Metabolic Health, UK.
Received: October 14, 2025; Published: November 07, 2025
DOI: 10.34297/AJBSR.2025.29.003759
Abstract
Methylation Risk Scores (MRS) that incorporate DNA methylation-derived Epigenetic Biomarkers (EBPs) offer a promising route to improved diagnosis, risk prediction and personalised disease management. This systematic review synthesises evidence from 12 clinical studies (January 2020-May 2024) identified via PubMed, ScienceDirect, bioRxiv and medRxiv using PRISMA-compliant screening, with studies selected on the basis of primary human data assessing MRS/EBP performance across a range of non-cancer conditions. Data extraction captured study design, sample size, analytic approach and key performance metrics (hazard ratios, odds ratios, C-Index, AUC, ICC, sensitivity, specificity, Pearson r²). A narrative synthesis was performed owing to heterogeneity in cohorts, outcomes and analytic pipelines.
Across studies, MRS demonstrated consistent, moderate-to-strong predictive power: hazard ratios ranged from 1.4 to 2.3 and odds ratios from 1.7 to 2.2; C-Index values were 0.70-0.80 and AUC values reached as high as 0.89 (type 2 diabetes complications), with typical sensitivity of 78-88% and specificity of 82-89%. Reliability metrics were favourable (ICC 0.77-0.85; r² 0.65-0.74). Disease contexts included metabolic syndrome, cardiovascular events, type 2 diabetes, kidney disease, frailty, cognitive outcomes and osteoarthritis; several studies reported overlapping CpG loci and biologically plausible gene associations (e.g. CPT1A, ABCG1), supporting mechanistic relevance [35,12,86].
Methodologically, studies employed EWAS, penalised regression (notably elastic net), machine-learning classifiers and Multi-Omics Integration (OMICmAge), reflecting rapid methodological innovation [16,66]. Principal barriers to clinical translation include lack of standardised DNAm measurement and analysis protocols, limited cohort diversity, high initial costs and regulatory uncertainty [85,87]. Emerging opportunities lie in AI/ML enhancement, multi-omics and single-cell epigenomics, and targeted epigenetic therapeutics. In summary, current evidence indicates that MRS incorporating EBPs can meaningfully improve disease prediction and prognostication across multiple conditions, but widespread clinical adoption will require standardisation, broader validation in diverse populations and resolution of economic and regulatory challenges.
Scientific-Theoretical Agenda
This section sets out the theoretical foundations, methodological principles and an applied research agenda for Methylation Risk Scores (MRS) that incorporate DNA methylation-derived Epigenetic Biomarkers (EBPs). It synthesises mechanistic knowledge of epigenetics with current practice in MRS development, identifies methodological and translational bottlenecks, and proposes priority directions for research that would enable robust clinical deployment.
Epigenetic Foundations and Mechanistic Rationale
Epigenetics denotes heritable changes in gene expression that occur without alteration of the DNA sequence [47,7]. DNA Methylation (DNAm), the covalent addition of a methyl group to cytosine residues within CpG dinucleotides, has been the most intensively characterised mechanism because of its relative stability in peripheral tissues and its functional impact on transcriptional regulation Schmits, et al., (2019) [36]. Methylation of promoter-proximal CpG islands commonly reduces transcription factor binding and represses gene expression, while demethylation may permit transcriptional activation Shuai, et al., (2020) [91]. CpG sites are non-uniformly distributed across the genome and cluster in islands frequently associated with gene promoters; methylation status at these loci therefore has outsized regulatory importance [23,97,40].
From a pathophysiological perspective, DNAm integrates genetic predisposition and environmental exposures (diet, smoking, psychosocial stress, pollutants), providing a molecular record of cumulative and recent exposures that can influence disease onset and progression [67,64]. This dual sensitivity to genotype and environment is the central theoretical justification for MRS: where Polygenic Risk Scores (PRS) represent inherited genetic liability, MRS can capture dynamic, exposure-responsive elements of risk and thus potentially improve prediction for complex diseases whose aetiology is strongly modulated by environment and lifestyle Wattacherill, et al., (2023) [65,43].
Constructing Methylation Risk Scores: Data, Pipelines and Algorithms
Data Generation and Preprocessing
MRS construction typically begins with genome-wide DNAm data generated on microarray platforms (e.g. Illumina 450K or EPIC arrays) or by sequencing approaches [13,77]. Raw data require rigorous preprocessing (probe-level quality control, normalisation, batch-effect correction and, where necessary, deconvolution to account for cellular composition) to reduce technical artefact and improve comparability across cohorts [63,25]. Standardised pipelines for these steps remain incomplete across the field, which is one of the principal barriers to reproducibility [85].
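As an illustration of the probe-level quality-control step mentioned above, the following minimal sketch drops probes whose detection p-values fail in too many samples. The function name, the probe identifiers and both cut-offs (p > 0.01 in more than 5% of samples) are assumptions chosen for illustration; they are not taken from the reviewed studies, and real pipelines apply further filters.

```python
from typing import Dict, List


def filter_probes(
    detection_p: Dict[str, List[float]],
    p_threshold: float = 0.01,        # assumed cut-off, illustrative only
    max_fail_fraction: float = 0.05,  # assumed cut-off, illustrative only
) -> List[str]:
    """Return the probe IDs that pass a simple detection p-value filter.

    detection_p maps a probe ID to its per-sample detection p-values.
    A probe is kept when the fraction of samples with p > p_threshold
    does not exceed max_fail_fraction.
    """
    kept = []
    for probe_id, p_values in detection_p.items():
        if not p_values:
            continue  # no measurements, so the probe cannot be assessed
        n_fail = sum(1 for p in p_values if p > p_threshold)
        if n_fail / len(p_values) <= max_fail_fraction:
            kept.append(probe_id)
    return kept


# Hypothetical probes: the first is reliably detected, the second is not.
example = {
    "cg_hypothetical_1": [0.001, 0.002, 0.0005, 0.003],
    "cg_hypothetical_2": [0.20, 0.001, 0.15, 0.002],
}
passing = filter_probes(example)
```

Reporting the thresholds alongside the list of retained probes is one simple way to make this step reproducible across cohorts.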
Feature Selection and Model Training
Feature selection for MRS draws on several complementary strategies. Epigenome-Wide Association Studies (EWAS) identify CpG sites statistically associated with outcomes of interest, while candidate-gene and functional genomic approaches prioritise sites with demonstrated regulatory impact [83,62]. Machine-learning methods (elastic net, random forest, penalised Cox models) are then used for variable selection and model building; elastic net has been particularly prominent because it manages multicollinearity and high-dimensional predictor spaces common to methylation data [30,14]. Cross-validation and independent cohort validation remain essential to avoid overfitting and to estimate generalisable performance [25,90].
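To make the role of elastic net concrete, the sketch below fits penalised weights by cyclic coordinate descent and then computes a score as a weighted sum of CpG values. It is a teaching sketch under stated assumptions: predictors and outcome are mean-centred (so no intercept is fitted), the outcome is continuous, and the toy data, `lam` and `alpha` values are invented. The reviewed studies used established software and, in several cases, Cox or logistic variants not shown here.

```python
from typing import List, Sequence


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    lam: float,
    alpha: float,
    n_iter: int = 200,
    tol: float = 1e-10,
) -> List[float]:
    """Fit elastic-net weights by cyclic coordinate descent.

    Minimises (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||^2).
    Assumes columns of X and the outcome y are already mean-centred.
    """
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    coef = [0.0] * p
    residual = list(y)  # equals y - X @ coef while coef is all zeros
    for _ in range(n_iter):
        max_change = 0.0
        for j, col in enumerate(columns):
            z = sum(v * v for v in col) / n
            # Correlation of column j with the partial residual.
            rho = sum(col[i] * (residual[i] + col[i] * coef[j]) for i in range(n)) / n
            denom = z + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                coef[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return coef


def methylation_risk_score(betas: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of (centred) CpG methylation values."""
    return sum(w * b for w, b in zip(weights, betas))


# Toy, mean-centred data: CpG 1 tracks the outcome, CpG 2 is unrelated.
X = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 0.0], [1.0, 2.0], [2.0, -1.0]]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]
weights = elastic_net(X, y, lam=0.5, alpha=0.5)
```

In this toy example the L1 part of the penalty sets the weight of the unrelated CpG to exactly zero while the L2 part shrinks the informative weight, which is the combined selection-and-shrinkage behaviour the paragraph describes.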
Performance Metrics and Evaluation
Robust evaluation of MRS uses multiple complementary metrics: discrimination (AUC, C-Index), calibration, effect size (hazard ratios, odds ratios), and reliability (intra-class correlation, Pearson r²). Sensitivity and specificity characterise classification performance where thresholds are applied [1-10,11-29]. Reporting across these metrics, alongside transparent methods for feature selection and weighting, is crucial for cross-study comparison and clinical interpretation.
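The discrimination and classification metrics listed above can be computed directly from scores and observed outcomes. The sketch below computes AUC as the probability that a randomly chosen case scores higher than a randomly chosen non-case (ties count one half), plus sensitivity and specificity at a chosen threshold. The scores, labels and threshold are invented for illustration.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of cases (label 1) and non-cases (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: five individuals with a risk score and an outcome label.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]
example_auc = auc(example_scores, example_labels)
sens, spec = sensitivity_specificity(example_scores, example_labels, threshold=0.5)
```

Note that AUC is threshold-free whereas sensitivity and specificity change with the cut-off, which is why the threshold should be reported whenever the latter two are quoted.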
Biological Plausibility and Cross-Disease Signal
A compelling theoretical prerequisite for MRS is biological plausibility: the CpG sites that contribute substantially to a score should map to genes and pathways mechanistically connected to the phenotype. Across the reviewed literature, recurring loci (for example, CPT1A and ABCG1) and loci associated with lipid metabolism, inflammation and cellular ageing support mechanistic interpretation for metabolic and cardiovascular endpoints [30-35,12]. The identification of overlapping CpGs across different disease endpoints suggests that certain epigenetic signatures reflect shared pathophysiological axes (inflammation, metabolic dysfunction, biological ageing) and may underpin pleiotropic predictive utility across conditions such as metabolic syndrome, cardiovascular disease and frailty Marioni, et al., (2015); Li, et al., (2022).
Epigenetic Clocks, EBPs and Conceptual Nomenclature
Epigenetic clocks (Horvath’s clock, Hannum, PhenoAge and more recent integrative measures such as OMICmAge) provide quantitative indices of biological age derived from DNAm patterns and have proven predictive of mortality and age-related morbidity [36-53,8,17]. These clocks form a subset of EBPs and illustrate how methylation patterns can summarise latent biological processes. However, the field suffers from inconsistent terminology (EBP, EpiSigns, EpiScores), hindering comparability and translation. A theoretical agenda must therefore prioritise consensus definitions and taxonomies to ensure clarity in how different classes of methylation-derived markers are described and validated [54-92].
Methodological Limitations and Sources of Bias
Cohort Heterogeneity and Population Generalisability
A recurrent methodological concern is cohort composition. Many studies to date are drawn from relatively homogeneous populations, which reduces power to detect population-specific effects and limits external validity [92]. Meta-analytic integration is hampered by heterogeneity of cohorts, phenotype definitions and assay platforms.
Technical Variability
Differences in array platforms, laboratory procedures, and preprocessing choices introduce batch effects and technical noise that can masquerade as biological signal [21]. Lack of standardised pipelines for preprocessing and normalisation therefore threatens reproducibility [85].
Cellular Heterogeneity and Sample Type
Peripheral blood is commonly used for convenience but comprises mixed cell types whose proportions vary with age, health status and acute exposures. Without adequate deconvolution, observed DNAm differences may reflect shifts in cell composition rather than locus-specific regulation [63].
Confounding, Causality and Temporality
DNAm is both a potential mediator and a consequence of disease and exposure. Distinguishing causal methylation changes from epiphenomena requires longitudinal designs, repeated measures and methods such as Mendelian randomisation leveraging Methylation Quantitative Trait Loci (mQTLs). Cross-sectional associations alone cannot establish causality [38].
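The simplest form of mQTL-based Mendelian randomisation is the single-instrument Wald ratio: the effect of the genetic variant on the outcome divided by its effect on methylation. The sketch below is a minimal illustration with invented effect sizes. The standard error uses a first-order approximation that ignores uncertainty in the variant-methylation estimate, and the instrumental-variable assumptions (relevance, independence, exclusion restriction) are taken as given rather than tested.

```python
from typing import Tuple


def wald_ratio(
    beta_outcome: float, se_outcome: float, beta_exposure: float
) -> Tuple[float, float]:
    """Wald ratio estimate of the causal effect of methylation on an outcome.

    beta_outcome:  mQTL effect on the outcome (e.g. log odds per allele)
    se_outcome:    standard error of beta_outcome
    beta_exposure: mQTL effect on CpG methylation (per allele)
    Returns (estimate, approximate standard error).
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must be associated with methylation")
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)  # first-order approximation
    return estimate, std_error


# Invented numbers for illustration only.
mr_estimate, mr_se = wald_ratio(beta_outcome=0.10, se_outcome=0.02, beta_exposure=0.5)
```

A weak instrument (small `beta_exposure`) inflates both the estimate and its standard error, which is one reason multi-instrument methods and sensitivity analyses are generally preferred when several mQTLs are available.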
Integration With Other Data Modalities and Advanced Technologies
Multi-Omics and Single-Cell Epigenomics
Combining DNAm with transcriptomics, proteomics and metabolomics (multi-omics) promises richer, mechanistically informed risk models and is exemplified by integrative efforts such as OMICmAge [18]. Single-cell bisulfite sequencing and other single-cell epigenomic methods resolve cellular heterogeneity and can pinpoint cell-type-specific methylation changes relevant to disease [37,26]. The theoretical value is clear: models built with multi-layered molecular data should better capture disease biology and may improve both sensitivity and specificity.
Artificial Intelligence, Explainability and Model Robustness
AI and machine learning, including deep learning, can extract complex, non-linear patterns from high-dimensional methylation data [66,45]. Yet clinical translation demands interpretable models that clinicians can interrogate. The agenda must therefore stress development of explainable AI approaches, techniques for model calibration, external validation and assessment of clinical net benefit (decision-curve analysis) rather than sole reliance on statistical discrimination metrics.
Clinical Translation: Regulatory, Economic and Workflow Considerations
The utility of MRS will be determined by clinical validity and clinical utility. Beyond accuracy, MRS must demonstrate that their use changes clinical decisions in ways that improve outcomes and are cost-effective. Regulatory approval will require standardised assays, reproducible pipelines, prospective validation and clinical trials demonstrating impact on care pathways [87,84].
Health-system integration implies interoperability with electronic health records, decision-support tools that synthesise MRS with other risk factors, and training for clinicians to interpret and communicate epigenetic risk [87,55]. Economic assessments should consider both the upfront costs of assay implementation and potential downstream savings from earlier diagnosis, reduced invasive testing and better-targeted therapies (Impact Statement).
Priority Research Directions
To progress from promising research instruments to clinically useful tools, the following research priorities are proposed:
Standardisation Initiatives
Develop community consensus on preprocessing pipelines, normalisation strategies, reporting standards and nomenclature for EBPs and MRS [85,38].
Large, Diverse, Longitudinal Cohorts
Invest in multi-centre longitudinal studies with repeated DNAm measures and rich phenotyping to resolve temporality, measure change over time and test prognostic utility across populations [20,92].
Independent External Validation
Require independent cohort validation as a standard for any proposed MRS before claims of clinical readiness [25].
Causal Inference and Functional Follow-Up
Use mQTL-based Mendelian randomisation, perturbation experiments and single-cell assays to distinguish causal from associative methylation signals and to identify actionable targets for intervention [38].
Clinical Impact Studies
Design pragmatic clinical trials and implementation studies that measure whether MRS-guided interventions improve patient outcomes and are cost-effective compared with standard care [87].
Multi-Omics and Integrative Modelling
Prioritise studies that combine DNAm with transcriptomic, proteomic and metabolomic data to improve mechanistic understanding and predictive performance [18].
Explainable AI and Interoperability
Develop interpretable ML frameworks and EHR-integrated decision support that clinicians can use confidently [66].
Equity-Centred Design
Explicitly recruit under-represented populations, report sociodemographic variables in publications and assess performance stratified by demographic subgroups to prevent widening disparities [92].
Concluding Synthesis of the Theoretical Agenda
The theoretical case for MRS is strong: DNAm bridges genetic predisposition and environmental exposure, yielding biomarkers that are both mechanistically plausible and practically informative for disease prediction. Methodological advances (elastic net and other penalised models, ML techniques, single-cell technologies) and multi-omics integration have markedly increased predictive capacity. However, for MRS to fulfil their translational promise the field must confront standardisation deficits, expand cohort diversity, demonstrate causal relationships where possible, and validate clinical utility through prospective implementation studies. Addressing these priorities alongside careful attention to ethical and equity considerations will be central to embedding MRS and EBPs into responsible, effective clinical practice.
Discussion / Review
This expanded Discussion synthesises and critically appraises the evidence from the 12 studies included in the systematic review and situates that evidence within methodological, clinical, regulatory and ethical contexts. Its aim is to state clearly what the present evidence supports, to identify where further work is essential, and to outline realistic translational pathways for Methylation Risk Scores (MRS) that incorporate DNA methylation-derived Epigenetic Biomarkers (EBPs).
Recap of Principal Empirical Findings
Across the twelve studies reviewed, MRS incorporating EBPs delivered consistent and generally moderate-to-strong predictive performance across a heterogeneous set of non-cancer conditions. Effect sizes (hazard ratios 1.4-2.3; odds ratios 1.7-2.2) and discrimination metrics (C-Index ~0.70-0.80; AUCs up to 0.86-0.89 in selected studies) indicate that MRS capture signal that is relevant for disease prediction and prognostication in conditions including type 2 diabetes, cardiovascular disease, kidney disease, frailty, cognitive decline and osteoarthritis [35,12,86,48]. Reliability measures (ICC 0.77-0.85; r² 0.65-0.74) support the reproducibility of methylation-derived metrics within cohorts when standardised pipelines are used [20]. The recurring identification of biologically meaningful loci (for example, CPT1A, ABCG1, PHOSPHO1, SREBF1) strengthens confidence that MRS are not purely statistical artefacts but reflect pathophysiological processes [35,12]. Nonetheless, heterogeneity of study designs, platforms, cohorts and preprocessing approaches produces a complex evidence base in which the promise is clear but the route to reliable, generalisable clinical tools remains contingent upon addressing multiple methodological, infrastructural and social challenges.
Disease-Specific Considerations and Interpretation
Cardiovascular Disease: The cardiovascular applications reported relatively high discrimination (AUCs ≈ 0.85 in some studies) and high sensitivity/specificity in short-term event prediction [12,18]. Mechanistically, DNAm loci associated with lipid handling and inflammation (e.g. ABCG1, SREBF1) are plausibly linked to atherothrombotic risk. Clinically, the most realistic near-term use is improved risk stratification to guide the intensity of preventive therapies (statin initiation, antihypertensives, behavioural interventions). However, clinical utility depends on demonstrating that reclassification by MRS changes management and improves hard endpoints beyond what existing calculators (e.g. QRISK, SCORE) accomplish.
Metabolic Disease and Type 2 Diabetes: Type 2 diabetes produced some of the largest AUC values and hazard ratios (e.g. AUC up to 0.89; HRs ≈ 2.1-2.3) in the reviewed studies [86,20]. CpGs mapping to genes governing lipid metabolism and hepatic function (CPT1A) provide mechanistic plausibility. MRS could be used to identify high-risk individuals for intensive lifestyle or pharmacologic prevention; yet, as with cardiovascular disease, the imperative is prospective testing of whether MRS-guided prevention reduces incidence or complications versus standard risk stratification.
Kidney Disease and Diabetic Nephropathy: DNAm panels were predictive of kidney function decline and progression of diabetic kidney disease in several studies [50], with HRs approaching 1.9. Given the high clinical and economic burden of end-stage kidney disease, MRS that reliably predict progression might support earlier nephrology referral, intensified glycaemic/blood-pressure control, or therapeutic prioritisation. The challenge will be integrating MRS predictions with well-established biomarkers (eGFR, albuminuria) and demonstrating incremental prognostic value and actionable thresholds.
Frailty and Ageing Phenotypes: Epigenetic clocks and frailty risk scores demonstrated useful discrimination for age-related decline [48,16]. Because epigenetic clocks measure accumulated biological ageing, they may be particularly suited to population-level stratification for preventive geriatric interventions. However, translating clock deviations into discrete clinical actions requires development of evidence-based interventions proven to alter clock measures and clinical endpoints.
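One common way to quantify a "clock deviation" is epigenetic age acceleration, defined as the residual from regressing DNAm-predicted age on chronological age within a cohort. The sketch below illustrates that definition with invented ages. It is an assumption that this residual-based definition matches what individual reviewed studies used, and covariate adjustment (for example, for cell composition) is omitted.

```python
from typing import List, Sequence


def age_acceleration(
    chronological: Sequence[float], dnam_age: Sequence[float]
) -> List[float]:
    """Residuals of an ordinary least-squares fit of DNAm age on chronological age.

    A positive residual means the individual's DNAm age is higher than
    expected for their chronological age within this cohort.
    """
    n = len(chronological)
    if n != len(dnam_age) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(chronological) / n
    mean_y = sum(dnam_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological)
    if sxx == 0:
        raise ValueError("chronological ages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chronological, dnam_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(chronological, dnam_age)]


# Invented cohort of four individuals.
residuals = age_acceleration([40.0, 50.0, 60.0, 70.0], [42.0, 49.0, 63.0, 70.0])
```

Because the residuals are defined relative to the cohort's own regression line, values are not directly comparable across cohorts unless the same reference fit is used.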
Cognitive Outcomes and Neurodegeneration: DNAm associations with cognitive ability and brain health are emerging [86]. While predictive performance is presently moderate, epigenetic measures that reflect lifelong exposures or biological ageing could complement imaging and fluid biomarkers in risk stratification for neurodegenerative disease. Key gaps include tissue specificity (blood vs brain), the causal relevance of peripheral methylation marks, and the long lead times to clinical outcomes requiring very large longitudinal cohorts.
Osteoarthritis and Musculoskeletal Disease: Studies predicting knee osteoarthritis progression reported AUCs in the mid-0.70s to low-0.80s [27]. Here, MRS might identify patients at high risk of structural progression who could be prioritised for disease-modifying interventions as they become available. Again, demonstration of clinical impact, such as reducing pain, improving mobility or deferring joint replacement, will be critical.
Psychiatric Outcomes: Associations between childhood trauma-linked methylation signatures and later psychiatric disorders show HRs around 1.8 [89]. Such predictive markers raise both clinical possibilities (early psychosocial support) and ethical challenges (stigmatisation, sensitive information); any clinical application must be accompanied by robust consent processes and supportive clinical pathways.
Methodological Critique: Deeper Analysis
Assay Platforms and Probe Chemistry
Different studies used Illumina 450K arrays, EPIC arrays or sequencing [13,77]. Each platform has different coverage, probe chemistries and susceptibility to cross-hybridisation. Platform choice influences which CpGs are available for analyses and complicates pooling. Theoretical and practical solutions require cross-platform harmonisation strategies, probe-level mapping tables and, where feasible, replication using sequencing-based approaches that avoid array probe limitations.
Preprocessing, Normalisation and Batch Correction
The lack of consensus on preprocessing (quantile normalisation, functional normalisation, Noob, SWAN etc.) produces divergent methylation beta distributions and affects downstream feature selection [63]. Batch effects introduced at DNA extraction, bisulfite conversion or array processing can generate spurious associations if cases and controls are processed in different batches. Rigorous study design (randomisation of samples across plates), comprehensive batch covariate modelling and use of established normalisation pipelines are necessary but not uniformly applied. The field would benefit from a community-adopted “best practice” pipeline and pre-registration of preprocessing choices.
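To show why the choice of normalisation changes the beta distributions that feed feature selection, the sketch below implements plain quantile normalisation, which forces every sample to share the same empirical distribution. It is a generic illustration only: it does not reproduce functional normalisation, Noob or SWAN, ties are broken by input order rather than averaged, and the input values are invented.

```python
from typing import List, Sequence


def quantile_normalise(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Quantile-normalise samples, each given as a list of per-probe values.

    Each value is replaced by the mean, across samples, of the values
    that hold the same rank within their own sample.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must cover the same probes")
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples) for r in range(n_probes)
    ]
    normalised = []
    for s in samples:
        # Probe indices ordered from the lowest to the highest value.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalised.append(out)
    return normalised


# Two invented samples measured on the same three probes.
normalised = quantile_normalise([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalisation both samples contain the same set of values and differ only in which probe holds which rank, so any downstream model sees rank information rather than raw intensity differences between samples.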
Cellular Composition and Deconvolution
Peripheral blood is a composite tissue; cell proportions vary with infection, inflammation and ageing. Deconvolution algorithms (Houseman, reference-based/non-reference) mitigate this issue but depend on accurate reference methylomes and assume linearity. Single-cell methylation technologies remove ambiguity but are currently resource intensive. Analytic strategies should consistently report whether and how cell-type effects were controlled.
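The linearity assumption behind reference-based deconvolution can be illustrated with a deliberately reduced case of two cell types, where the observed methylation at each CpG is modelled as a proportion-weighted mixture of two reference profiles. This is a simplification for illustration: the Houseman approach handles several cell types with constrained projection, and the reference profiles and mixing proportion below are invented.

```python
from typing import Sequence


def estimate_proportion(
    observed: Sequence[float], ref_a: Sequence[float], ref_b: Sequence[float]
) -> float:
    """Least-squares proportion of cell type A in a two-cell-type mixture.

    Model: observed_j = p * ref_a_j + (1 - p) * ref_b_j for each CpG j.
    The estimate is clipped to the interval [0, 1].
    """
    diffs = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diffs)
    if denom == 0:
        raise ValueError("reference profiles are identical at the chosen CpGs")
    numer = sum((o - b) * d for o, b, d in zip(observed, ref_b, diffs))
    return min(1.0, max(0.0, numer / denom))


# Invented reference methylomes at three cell-type-discriminating CpGs.
ref_a = [0.9, 0.1, 0.8]
ref_b = [0.1, 0.7, 0.2]
# Simulated bulk sample containing 25% of cell type A.
bulk = [0.25 * a + 0.75 * b for a, b in zip(ref_a, ref_b)]
p_hat = estimate_proportion(bulk, ref_a, ref_b)
```

The estimate is only as good as the reference profiles: if the references do not match the cohort (for example, in age or ancestry), the recovered proportions will be biased even when the linear model holds.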
Feature Selection, Penalisation and Model Stability
Elastic net has been widely used due to its capacity to manage multicollinearity and perform variable selection [14]. However, elastic net solutions can vary with tuning parameters and the composition of training data; bootstrap stability analyses, reporting of selected CpG lists with weights, and publication of model coefficients are needed to enable replication and meta-analytic synthesis. Methods to assess and report model calibration (e.g. calibration plots, Brier scores) are underused but crucial for clinical reliability.
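Two of the checks named above are simple to compute once the inputs exist. The sketch below assumes that a model has already been refitted on a series of bootstrap resamples, yielding one set of selected CpGs per resample, and reports how often each CpG was selected. It also computes the Brier score for predicted probabilities. The CpG names and all values are invented.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def selection_frequency(selected_sets: Sequence[Iterable[str]]) -> Dict[str, float]:
    """Fraction of bootstrap fits in which each CpG received a non-zero weight."""
    n_fits = len(selected_sets)
    if n_fits == 0:
        raise ValueError("need at least one bootstrap fit")
    counts = Counter(cpg for selected in selected_sets for cpg in set(selected))
    return {cpg: count / n_fits for cpg, count in counts.items()}


def brier_score(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


# Invented selections from four hypothetical bootstrap refits.
bootstrap_selections = [{"cg_a", "cg_b"}, {"cg_a"}, {"cg_a", "cg_c"}, {"cg_b"}]
stability = selection_frequency(bootstrap_selections)
example_brier = brier_score([0.9, 0.2, 0.6], [1, 0, 0])
```

CpGs selected in most resamples are better candidates for replication than those selected only occasionally, and a lower Brier score indicates predictions that are both better calibrated and more discriminating.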
Overfitting, Optimism and Reproducibility
With high predictor:sample ratios, even penalised models can overfit. Independent external validation in truly distinct cohorts is the gold standard for establishing generalisability but was not universal. Where external validation occurred, it increased confidence; where it was absent or limited, conclusions should be more tentative [25]. Transparent code and data sharing, within ethical constraints, are essential to mitigate selective reporting and publication bias.
Statistical Reporting Standards
Beyond AUC or HRs, useful statistics include Net Reclassification Improvement (NRI), Integrated Discrimination Improvement (IDI), decision-curve analysis and clinical utility metrics. Few studies consistently reported these. For clinical adoption, the field should standardise reporting to include such measures to evaluate whether MRS change clinical decisions and yield net patient benefit [6].
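As a concrete illustration of one of these measures, the sketch below computes the categorical NRI when a baseline model and an MRS-augmented model each assign individuals to ordered risk categories. The category assignments and outcomes are invented, and the two-category setup is an assumption made to keep the example short.

```python
from typing import Sequence


def net_reclassification_improvement(
    old_category: Sequence[int], new_category: Sequence[int], events: Sequence[int]
) -> float:
    """Categorical NRI for a new model versus an old one.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)],
    where 'up' means the new model assigns a higher risk category.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = sum(1 for e in events if e == 1)
    n_nonevent = len(events) - n_event
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    for old, new, event in zip(old_category, new_category, events):
        if new > old:
            if event == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif new < old:
            if event == 1:
                down_event += 1
            else:
                down_nonevent += 1
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent


# Invented data: category 0 = low risk, 1 = high risk.
baseline = [0, 0, 1, 1, 0, 1]
with_mrs = [1, 0, 1, 0, 0, 0]
outcomes = [1, 1, 1, 0, 0, 0]
nri = net_reclassification_improvement(baseline, with_mrs, outcomes)
```

The event and non-event components are best reported separately as well as summed, since a positive total can hide worse classification in one of the two groups.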
Causal Inference and Temporality
DNAm can be both cause and consequence. Distinguishing causality requires triangulation: repeated measures to establish temporal precedence; mQTL analyses for instrumental variable inference; and functional follow-up (CRISPR-based epigenome editing, in vitro perturbation) to demonstrate phenotypic effects. Hüls & Czamara (2020) [38] emphasised this point, yet most included studies remain associative. Greater investment in causal inference and functional assays is needed before treating particular methylation changes as therapeutic targets.
Translational Readiness and Clinical Utility
From Prediction to Practice
Predictive performance alone does not guarantee clinical utility. The essential next steps are pragmatic trials that test MRS-guided care pathways. For example, a randomised implementation trial might evaluate whether MRS-based stratification for pre-diabetes prevention reduces incident diabetes compared with guideline-based care. Such trials should capture patient-centred outcomes, cost-effectiveness, acceptability and unintended harms.
Interoperability, EHR Integration and Decision Support
MRS must be packaged for clinical workflows: standardised laboratory reporting, integration into electronic health records and decision-support systems that translate a score into clear management options. Clinician training materials and patient-facing explanations will be necessary to avoid misinterpretation.
Thresholds, Reclassification and Actionability
Clinical thresholds for MRS need calibration and consensus. Reclassification tables and decision-curve analysis should inform whether moving someone from low to high risk by MRS should alter management. Without such thresholds tied to evidence-based interventions, scores will be of limited use.
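Decision-curve analysis summarises whether acting on a score at a given risk threshold does more good than harm. The sketch below computes net benefit for a model and for the "treat everyone" comparison at a single threshold probability. The predicted risks, outcomes and thresholds are invented, and a full decision curve would repeat the calculation across a clinically reasonable range of thresholds.

```python
from typing import Sequence


def net_benefit(
    predicted_risk: Sequence[float], events: Sequence[int], threshold: float
) -> float:
    """Net benefit of treating everyone whose predicted risk >= threshold.

    Net benefit = TP/n - (FP/n) * threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    tp = sum(1 for r, e in zip(predicted_risk, events) if r >= threshold and e == 1)
    fp = sum(1 for r, e in zip(predicted_risk, events) if r >= threshold and e == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(events: Sequence[int], threshold: float) -> float:
    """Net benefit of treating every individual regardless of score."""
    return net_benefit([1.0] * len(events), events, threshold)


# Invented predicted risks and outcomes for four individuals.
risks = [0.8, 0.6, 0.3, 0.1]
observed = [1, 0, 1, 0]
nb_model = net_benefit(risks, observed, threshold=0.2)
nb_all = net_benefit_treat_all(observed, threshold=0.2)
```

A score is only worth acting on at thresholds where its net benefit exceeds both the treat-all value and zero (the value of treating no one), which ties the threshold discussion above to a quantity that can be compared across strategies.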
Economic, Regulatory and Commercial Considerations
Cost-Benefit Calculus
Upfront investment in laboratory platforms, bioinformatics infrastructure and workforce training is substantial; however, modelling studies should evaluate long-term savings from earlier intervention, reduced hospital admissions and targeted therapeutics. Health economic evaluations require realistic assumptions about assay costs, disease prevalence and intervention efficacy; these analyses remain sparse.
Regulatory Pathways
Regulatory approval will require analytical and clinical validation. Regulators expect reproducible assays, demonstration of clinical validity and data on clinical utility. Currently, guidance for epigenetic diagnostics is evolving. Early engagement with regulators and alignment on validation standards will streamline translation [84,87].
Commercialisation and Stewardship
Commercial offerings (e.g. Cardio Diagnostics, Dionysus Health) illustrate private-sector interest. Commercial players can accelerate scale-up but must adhere to transparency, independent validation and equitable access principles. Public-private partnerships could facilitate large-scale cohort creation and broader validation if governed to protect public interest.
Ethical, Social and Equity Implications
Epigenetic profiles encode environmental exposures and social determinants; this raises the risk that MRS could act as proxies for disadvantage. Ethical deployment requires:
a) Representative recruitment to avoid models that perform poorly in under-represented groups [92].
b) Transparent communication about the probabilistic and modifiable nature of epigenetic risk, avoiding deterministic framing.
c) Protection against misuse by insurers or employers; policy frameworks must restrict discriminatory use of epigenetic data.
d) Informed consent that addresses the potential sensitivity of detected exposures (e.g. childhood trauma) and downstream implications.
Patient and public engagement should be prioritised to shape acceptable uses of MRS and to co-design communication materials that contextualise risk.
Practical Roadmap and Recommendations
To move from robust science to responsible clinical application, the field should pursue a phased roadmap:
Immediate (0-2 Years)
Consolidate best-practice pipelines for preprocessing and reporting; establish community standards for cell deconvolution and benchmark datasets; require external validation for publication claims.
Short Term (2-5 Years)
Create multi-centre consortia for diverse longitudinal cohorts with repeated DNAm measures; publish calibration and reclassification analyses; begin small pragmatic trials of MRS-guided interventions in high-yield use cases (e.g. diabetes prevention).
Medium Term (5-10 Years)
Scale assays to clinical laboratory standards; integrate MRS into EHR decision support in pilot centres; perform health economic analyses and engage regulators for pathway definition.
Long Term (>10 Years)
Routine clinical use where proven effective and equitable; development of targeted epigenetic therapeutics informed by causal CpG identification; international governance frameworks for data use and anti-discrimination protections.
Concrete methodological steps include routine publication of model coefficients and CpG lists, cross-platform replication efforts, mandatory subgroup performance reporting and pre-registration of analytical plans.
Limitations of the Present Review and Inference Boundaries
The review intentionally excluded cancer, narrowing its scope but also excluding a widely studied domain of epigenetics that could inform methods. The limited number of studies per disease and heterogeneity in methods constrain the ability to perform quantitative synthesis; a narrative synthesis provides interpretive clarity but lacks pooled effect estimates. The reliance on published performance metrics invites publication bias. Finally, because several reviewed studies used overlapping cohorts or similar population sources, independence of evidence must be interpreted cautiously.
Concluding Assessment
The assembled evidence indicates that MRS incorporating EBPs are a maturing class of biomarkers with genuine potential to improve disease prediction and stratification across a range of clinically important non-cancer conditions. Their strengths lie in dynamic sensitivity to exposures, mechanistic plausibility of key CpG associations and demonstrable incremental predictive value in several contexts. Yet the path to routine clinical use is nontrivial: methodological standardisation, broader population validation, causal and functional validation of key loci, demonstration of clinical utility in prospective trials, and careful ethical governance are all prerequisites.
If these conditions are addressed, MRS could shift preventive medicine and personalised care toward earlier, more precisely targeted interventions. Achieving that future will require coordinated research consortia, method standardisation, health economic evidence and regulatory engagement, all pursued with an explicit equity mandate. In sum, MRS are a powerful and promising tool in the biomarker repertoire; realising their clinical promise demands rigorous, deliberate and ethically informed translation.
Conclusion
This systematic review demonstrates that Methylation Risk Scores (MRS) incorporating DNA methylation-derived Epigenetic Biomarkers (EBPs) are a promising addition to the biomarker landscape for non-cancer conditions. Across twelve studies (January 2020-May 2024), MRS consistently showed moderate-to-strong predictive performance (HRs 1.4-2.3; ORs 1.7-2.2; C-Index/AUC typically 0.70-0.89) and acceptable reliability (ICC 0.77-0.85; r² 0.65-0.74), with recurring biologically plausible loci (e.g. CPT1A, ABCG1) underpinning mechanistic credibility [35,12,17]. These findings indicate that MRS can capture exposure-responsive molecular information that complements genetic and clinical risk indicators, particularly for complex, environment-modulated diseases such as type 2 diabetes, cardiovascular disease and frailty.
However, translation into routine clinical practice is not yet justified without further work. Key barriers include heterogeneous assay platforms and preprocessing pipelines, limited cohort diversity and external validation, challenges in causal inference, and substantial upfront costs and regulatory uncertainty [85,38,84]. Ethical and equity concerns demand that models be developed and tested in representative populations to avoid exacerbating health disparities [92-99].
To realise clinical utility, the field must prioritise standardisation of laboratory and analytic workflows, large multi-ethnic longitudinal cohorts with repeated measures, independent external validation, pragmatic trials demonstrating impact on decision-making and outcomes, and transparent governance for data use. Advances in AI/ML, multi-omics integration and single-cell epigenomics offer substantial opportunity to refine predictive performance and mechanistic insight, but these must be pursued alongside explainability and clinical interpretability [66,20].
In summary, MRS with EBPs constitute a powerful, mechanistically grounded tool for enhanced risk prediction. With rigorous standardisation, validation and ethically informed implementation, they have the potential to augment personalised prevention and management, but careful, evidence-based adoption is essential to ensure benefit, equity and safety.
References
- Adler Milstein J, Aggarwal N, Ahmed M, Castner J, Evans BJ, et al. (2022) Meeting the moment: Addressing barriers and facilitating clinical adoption of artificial intelligence in medical diagnosis. NAM Perspectives 10: 31478.
- Ahmad A, Imran M, Ahsan H (2023) Biomarkers as biomedical bioindicators: Approaches and techniques for the detection, analysis, and validation of novel biomarkers of diseases. Pharmaceutics 15(6): 1630.
- Alowais SA, Alghamdi SS, Alsuhebany N, Alqahtani T, Alshaya AI, et al. (2023) Revolutionizing healthcare: The role of artificial intelligence in clinical practice. BMC Med Educ 23(1): 689.
- Andersen MS, Leikfoss IS, Brorson IS, Cappelletti C, Bettencourt C, et al. (2023) Epigenome-wide association study of peripheral immune cell populations in Parkinson’s disease. NPJ Parkinsons Dis 9(1): 149.
- Babu M, Snyder M (2023) Multi-omics profiling for health. Mol Cell Proteomics 22(6): 100561.
- Bansal A, Heagerty PJ (2019) A comparison of landmark methods and time-dependent ROC methods to evaluate the time-varying performance of prognostic markers for survival outcomes. Diagn Progn Res 3(1): 14.
- Bansal A, Simmons RA (2018) Epigenetics and developmental origins of diabetes: Correlation or causation? Am J Physiol Endocrinol Metab 315(1): E15-E28.
- Bernabeu E, McCartney DL, Gadd DA, Hillary RF, Lu AT, et al. (2023) Refining epigenetic prediction of chronological and biological age. Genome Med 15(1): 12.
- Bienkowska A, Raddatz G, Söhle J, Kristof B, Völzke H, et al. (2024) Development of an epigenetic clock to predict visual age progression of human skin. Front Aging 4: 1258183.
- Bobak CA, Barr PJ, O'Malley AJ (2018) Estimation of an inter-rater intra-class correlation coefficient that overcomes common assumption violations in the assessment of health measurement scales. BMC Med Res Methodol 18(1): 93.
- Boecker M, Lai AG (2021) Could personalised risk prediction for type 2 diabetes using polygenic risk scores direct prevention, enhance diagnostics, or improve treatment? Wellcome Open Res 5: 1-14.
- Cappozzo A, McCrory C, Robinson O, Freni Sterrantino A, Sacerdote C, et al. (2022) A blood DNA methylation biomarker for predicting short-term risk of cardiovascular events. Clin Epigenetics 14(1): 121.
- Carmona JJ, Accomando WP, Binder AM, Hutchinson JN, Pantano L, et al. (2017) Empirical comparison of reduced representation bisulfite sequencing and Infinium BeadChip reproducibility and coverage of DNA methylation in humans. NPJ Genom Med 2(1): 13.
- Chamlal H, Benzmane A, Ouaderhman T (2024) Elastic net-based high dimensional data selection for regression. Expert Systems with Applications 244: 122958.
- Chavarría Bolaños D, Rodríguez Wong L, Noguera González D, Esparza Villalpando V, Montero Aguilar M, et al. (2017) Sensitivity, specificity, predictive values, and accuracy of three diagnostic tests to predict inferior alveolar nerve blockade failure in symptomatic irreversible pulpitis. Pain Res Manag: 1-8.
- Chen C, Wang J, Pan D, Wang X, Xu Y, et al. (2023) Applications of multi-omics analysis in human diseases. MedComm (2020) 4(4): e315.
- Chen J, Gatev E, Everson T, Conneely KN, Koen N, et al. (2023) Pruning and thresholding approach for methylation risk scores in multi-ancestry populations. Epigenetics 18(1): 2187172.
- Chen Q, Dwaraka VB, Carreras Gallo N, Mendez K, Chen Y, et al. (2023) OMICmAge: An integrative multi-omics approach to quantify biological age with electronic medical records. bioRxiv 10: 562114.
- Chen Y, Wang B, Zhao Y, Shao X, Wang M, et al. (2024) Metabolomic machine learning predictor for diagnosis and prognosis of gastric cancer. Nat Commun 15(1): 1657.
- Cheng Y, Gadd DA, Gieger C, Monterrubio Gómez K, Zhang Y, et al. (2023) Development and validation of DNA methylation scores in two European cohorts augment 10-year risk prediction of type 2 diabetes. Nat Aging 3(4): 450-458.
- Cherukuri PF, Soe MM, Condon DE, Bartaria S, Meis K, et al. (2022) Establishing analytical validity of Bead Chip array genotype data by comparison to whole-genome sequence and standard benchmark datasets. BMC Med Genomics 15(1): 56.
- Çorbacıoğlu ŞK, Aksel G (2023) Receiver operating characteristic curve analysis in diagnostic accuracy studies: A guide to interpreting the area under the curve value. Turk J Emerg Med 23(4): 195-198.
- Daniel JM, Spring CM, Crawford HC, Reynolds AB, Baig A (2002) The p120ctn-binding partner Kaiso is a bi-modal DNA-binding protein that recognises both a sequence-specific consensus and methylated CpG dinucleotides. Nucleic Acids Res 30(13): 2911-2919.
- Davis KD, Aghaeepour N, Ahn AH, Angst MS, Borsook D, et al. (2020) Discovery and validation of biomarkers to aid the development of safe and effective pain therapeutics: Challenges and opportunities. Nat Rev Neurology 16(7): 381-400.
- Doherty T, Dempster E, Hannon E, Mill J, Poulton R, et al. (2023) A comparison of feature selection methodologies and learning algorithms in the development of a DNA methylation-based telomere length estimator. BMC Bioinformatics 24(1): 178.
- Domínguez Fernández C, Egiguren Ortiz J, Razquin J, Gómez Galán M, De Las Heras García L, et al. (2023) Review of technological challenges in personalised medicine and early diagnosis of neurodegenerative disorders. Int J Mol Sci 24(4): 3321.
- Dunn CM, Sturdy C, Velasco C, Schlupp L, Prinz E, et al. (2023) Peripheral blood DNA methylation-based machine learning models for prediction of knee osteoarthritis progression: Biologic specimens and data from the Osteoarthritis Initiative and Johnston County Osteoarthritis Project. Arthritis & Rheumatol 75(1): 28-40.
- Fallet M, Blanc M, Di Criscio M, Antczak P, Engwall M, et al. (2023) Present and future challenges for the investigation of transgenerational epigenetic inheritance. Environ Int 172: 107776.
- Feng J, Phillips RV, Malenica I, Bishara A, Hubbard AE, et al. (2022) Clinical artificial intelligence quality improvement: Towards continual monitoring and updating of AI algorithms in healthcare. NPJ Digit Med 5(1): 66.
- Forrest LN, Ivezaj V, Grilo CM (2023) Machine learning v. traditional regression models predicting treatment outcomes for binge-eating disorder from a randomized controlled trial. Psychol Med 53(7): 2777-2788.
- Gao S, Chen S, Han D, Wang Z, Li M, et al. (2020) Chromatin binding of FOXA1 is promoted by LSD1-mediated demethylation in prostate cancer. Nat Genet 52(10): 1011-1017.
- García Giménez JL, Seco Cervera M, Tollefsbol TO, Romá Mateo C, Peiró Chova L, et al. (2017) Epigenetic biomarkers: Current strategies and future challenges for their use in the clinical laboratory. Crit Rev Clin Lab Sci 54(7-8): 529-550.
- Haddaway NR, Page MJ, Pritchard CC, McGuinness LA (2022) PRISMA2020: An R package and shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and open synthesis. Campbell Sys Rev 18(2): e1230.
- Heikkinen A, Bollepalli S, Ollikainen M (2022) The potential of DNA methylation as a biomarker for obesity and smoking. J Int Med 292(3): 390-408.
- Hidalgo BA, Minniefield B, Patki A, Tanner R, Bagheri M, et al. (2021) A 6-CpG validated methylation risk score model for metabolic syndrome: The HyperGEN and GOLDN studies. PLoS One 16(11): e0259836.
- Hill C, Sandholm N, Kilner J, Rossing P, Lajer M, et al. (2022) Significant differential methylation of telomere-related genes in diabetic kidney disease and its potential role in regulating gene expression and Wnt signalling: TH-PO222. J Am Soc Nephrol 33(11): 110.
- Hu Y, Shen F, Yang X, Han T, Long Z, et al. (2023) Single-cell sequencing technology applied to epigenetics for the study of tumor heterogeneity. Clin Epigenetics 15(1): 161.
- Hüls A, Czamara D (2020) Methodological challenges in constructing DNA methylation risk scores. Epigenetics 15(1-2): 1-11.
- Jansen R, Han LKM, Verhoeven JE, Aberg KA, van den Oord ECGJ, et al. (2021) An integrative study of five biological clocks in somatic and mental health. eLife 10: e59479.
- Jara Espejo M, Line SR (2020) DNA G-quadruplex stability, position and chromatin accessibility are associated with CpG island methylation. FEBS J 287(3): 483-495.
- Kabacik S, Lowe D, Fransen L, Leonard M, Ang S, et al. (2022) The relationship between epigenetic age and the hallmarks of ageing in human cells. Nat Aging 2(6): 484-493.
- Kayser M, Branicki W, Parson W, Phillips C (2023) Recent advances in forensic DNA phenotyping of appearance, ancestry and age. Forensic Science International: Genetics 65: 102870.
- Kilanowski A, Chen J, Everson T, Thiering E, Wilson R, et al. (2022) Methylation risk scores for childhood aeroallergen sensitization: Results from the LISA birth cohort. Allergy 77(9): 2803-2817.
- Kiselev IS, Kulakova OG, Boyko AN, Favorova OO (2021) DNA methylation as an epigenetic mechanism in the development of multiple sclerosis. Acta Naturae 13(2): 45-57.
- Krishnan G, Singh S, Pathania M, Gosavi S, Abhishek S, et al. (2023) Artificial intelligence in clinical medicine: Catalyzing a sustainable global healthcare paradigm. Front Artif Intell 6: 1227091.
- Krolevets M, Ten Cate V, Prochaska JH, Schulz A, Rapp S, et al. (2023) DNA methylation and cardiovascular disease in humans: A systematic review and database of known CpG methylation sites. Clin Epigenetics 15(1): 56.
- Kumari A, Bhawal S, Kapila S, Yadav H, Kapila R (2022) Health-promoting role of dietary bioactive compounds through epigenetic modulations: A novel prophylactic and therapeutic approach. Crit Rev Food Sci Nutr 62(3): 619-639.
- Li A, Mueller A, English B, Arena A, Vera D, et al. (2022) Novel feature selection methods for construction of accurate epigenetic clocks. PLoS Comput Biol 18(8): e1009938.
- Li C, Dubbelaar ML, Zhang X, Zheng JC (2023) Editorial: Understanding the heterogeneity and spatial brain environment of neurodegenerative diseases through conventional and future methods. Front Cell Neurosci 17: 1211273.
- Li KY, Tam CHT, Liu H, Day S, Lim CKP, et al. (2023) DNA methylation markers for kidney function and progression of diabetic kidney disease. Nat Commun 14(1): 2543.
- Li X, Delerue T, Schöttker B, Holleczek B, Grill E, et al. (2022) Derivation and validation of an epigenetic frailty risk score in population-based cohorts of older adults. Nat Commun 13(1): 5269.
- Liu Z, Zhu Y (2021) Epigenetic clock: A promising mirror of ageing. Lancet Healthy Longev 2(6): e304-e305.
- Lu AT, Quach A, Wilson JG, Reiner AP, Aviv A, et al. (2019) DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY) 11(2): 303-327.
- Maiuolo J, Gliozzi M, Musolino V, Carresi C, Scarano F, et al. (2021) From metabolic syndrome to neurological diseases: Role of autophagy. Front Cell Developmental Biol 9: 651021.
- Mansell G, Gorrie Stone T, Bao Y, Kumari M, Schalkwyk LS, et al. (2019) Guidance for DNA methylation studies: Statistical insights from the Illumina EPIC array. BMC Genomics 20(1): 366.
- Marques L, Costa B, Pereira M, Silva A, Santos J, et al. (2024) Advancing precision medicine: A review of innovative in silico approaches for drug development, clinical pharmacology and personalized healthcare. Pharmaceutics 16(3): 332.
- McBride CM, Koehly LM (2017) Imagining roles for epigenetics in health promotion research. J Behavioral Med 40(2): 229-238.
- McCartney DL, Hillary RF, Conole ELS, Banos DT, Gadd DA, et al. (2022) Blood-based epigenome-wide analyses of cognitive abilities. Genome Biol 23(1): 26.
- Merz C, Sykora J, Krendyukov A, Wiestler B, Gieffers C, et al. (2019) P14.52 differential methylation of a single CpG site in the CD95 ligand promoter affects gene activity and correlates with invasiveness of glioma cells. Neuro Oncol 21(Suppl 3): iii79.
- Mitteldorf J (2024) Biological clocks: Why we need them, why we cannot trust them, how they might be improved. Biochemistry (Moscow) 89(2): 356-366.
- Mohsen F, Al Saadi B, Abdi N, Khan S, Shah Z (2023) Artificial intelligence-based methods for precision cardiovascular medicine. J Pers Med 13(8): 1268.
- Mora A (2020) Gene set analysis methods for the functional interpretation of non-mRNA data: Genomic range and ncRNA data. Briefings in Bioinformatics 21(5): 1495-1508.
- Nazer N, Sepehri MH, Mohammadzade H, Mehrmohamadi M (2024) A novel approach toward optimal workflow selection for DNA methylation biomarker discovery. BMC Bioinformatics 25(1): 37.
- Omidiran O, Patel A, Usman S, Mhatre I, Abdelhalim H, et al. (2024) GWAS advancements to investigate disease associations and biological mechanisms. Clin Transl Discov 4(3): e296.
- Patel AP, Wang M, Ruan Y, Koyama S, Clarke SL, et al. (2023) A multi-ancestry polygenic risk score improves risk prediction for coronary artery disease. Nat Med 29(7): 1793-1803.
- Peng J, Jury EC, Dönnes P, Ciurtin C (2021) Machine learning techniques for personalised medicine approaches in immune-mediated chronic inflammatory diseases: Applications and challenges. Front Pharmacol 12: 720694.
- Penner Goeke S, Binder EB (2019) Epigenetics and depression. Dialogues Clin Neurosci 21(4): 397-405.
- Prasanth MI, Sivamaruthi BS, Cheong CSY, Verma K, Tencomnao T, et al. (2024) Role of epigenetic modulation in neurodegenerative diseases: Implications of phytochemical interventions. Antioxidants 13(5): 606.
- Prosz A, Pipek O, Börcsök J, Palla G, Szallasi Z, et al. (2024) Biologically informed deep learning for explainable epigenetic clocks. Sci Rep 14(1): 1306.
- Puljak L, Ramic I, Arriola Naharro C, Brezova J, Lin YC, et al. (2020) Cochrane risk of bias tool was used inadequately in the majority of non-Cochrane systematic reviews. J Clin Epidemiol 123: 114-119.
- Qassim A, Souzeau E, Hollitt G, Hassall MM, Siggs OM, et al. (2021) Risk stratification and clinical utility of polygenic risk scores in ophthalmology. Transl Vis Sci Technol 10(6): 14.
- Rajaprakash M, Dean LT, Palmore M, Johnson SB, Kaufman J, et al. (2023) DNA methylation signatures as biomarkers of socioeconomic position. Environ Epigenet 9(1): dvac027.
- Rasmi Y, Shokati A, Hassan A, Gholizadeh Ghaleh Aziz S, Bastani S, et al. (2023) The role of DNA methylation in progression of neurological disorders and neurodegenerative diseases as well as the prospect of using DNA methylation inhibitors as therapeutic agents for such disorders. IBRO Neurosci Rep 14: 28-37.
- Rauluseviciute I, Drabløs F, Rye MB (2019) DNA methylation data by sequencing: Experimental approaches and recommendations for tools and pipelines for data analysis. Clin Epigenetics 11(1): 193.
- Rethlefsen ML, Kirtley S, Waffenschmidt S, Ayala AP, Moher D, et al. (2021) PRISMA-S: An extension to the PRISMA statement for reporting literature searches in systematic reviews. Syst Rev 10(1): 39.
- Rutledge J, Oh H, Wyss Coray T (2022) Measuring biological age using omics data. Nat Rev Genetics 23(12): 715-727.
- Sahoo K, Sundararajan V (2024) Methods in DNA methylation array dataset analysis: A review. Comput Struct Biotechnol J 23: 2304-2325.
- Salameh Y, Bejaoui Y, El Hajj N (2020) DNA methylation biomarkers in ageing and age-related diseases. Front Genetics 11: 171.
- Sarker IH (2021) Machine learning: Algorithms, real-world applications and research directions. SN Comput Sci 2(3): 160.
- Satam H, Joshi K, Mangrolia U, Waghoo S, Zaidi G, et al. (2023) Next-generation sequencing technology: Current trends and advancements. Biol (Basel) 12(7): 997.
- Schmitz RJ, Lewis ZA, Goll MG (2019) DNA methylation: Shared and divergent features across eukaryotes. Trends Genet 35(11): 818-827.
- Schrader S, Perfilyev A, Ahlqvist E, Groop L, Vaag A, et al. (2022) Novel subgroups of type 2 diabetes display different epigenetic patterns that associate with future diabetic complications. Diabetes Care 45(7): 1621-1630.
- Shen PC, Wang YF, Chang HC, Huang WY, Lo CH, et al. (2022) Developing a novel DNA methylation risk score for survival and identification of prognostic gene mutations in endometrial cancer: A study based on TCGA data. Japanese J Clin Oncol 52(9): 992-1000.
- Simayijiang H, Yan J (2023) Recent developments in forensic DNA typing. J Forensic Sci Med 9(4): 353-359.
- Simpson DJ, Chandra T (2021) Epigenetic age prediction. Aging Cell 20(9): e13452.
- Smith HM, Moodie JE, Monterrubio Gómez K, Gadd DA, Hillary RF, et al. (2024) Epigenetic scores of blood-based proteins as biomarkers of general cognitive function and brain health. Clin Epigenetics 16(1): 46.
- Thompson M, Hill BL, Rakocz N, Chiang JN, Geschwind D, et al. (2022) Methylation risk scores are associated with a collection of phenotypes within electronic health record systems. NPJ Genom Med 7(1): 50.
- Ugo CH, Cardona CJ, Chowanadisai W, Lucas EA, Montgomery MR, et al. (2024) Isolation of gDNA from faecal samples of healthy subjects for assessing the influence of dietary pulse consumption on tumor suppressor gene expression and methylation status. Current Devel Nutr 8: 102576.
- van den Oord CLJD, Copeland WE, Zhao M, Xie LY, Aberg KA, et al. (2022) DNA methylation signatures of childhood trauma predict psychiatric disorders and other adverse outcomes 17 years after exposure. Mol Psychiatry 27(8): 3367-3373.
- Varshavsky M, Harari G, Glaser B, Dor Y, Shemer R, et al. (2023) Accurate age prediction from blood using a small set of DNA methylation sites and a cohort-based machine learning algorithm. Cell Rep Methods 3(9): 100567.
- Wang K, Liu H, Hu Q, Wang L, Liu J, et al. (2022) Epigenetic regulation of ageing: Implications for interventions of ageing and diseases. Signal Transduct Targeted Ther 7(1): 374.
- Watkins SH, Testa C, Chen JT, De Vivo I, Simpkin AJ, et al. (2023) Epigenetic clocks and research implications of the lack of data on whom they have been developed: A review of reported and missing sociodemographic characteristics. Environ Epigenet 9(1): dvad005.
- Wattacheril JJ, Raj S, Knowles DA, Greally JM (2023) Using epigenomics to understand cellular responses to environmental influences in diseases. PLoS Genetics 19(1): e1010567.
- Xiang R, Kelemen M, Xu Y, Harris LW, Parkinson H, et al. (2024) Recent advances in polygenic scores: Translation, equitability, methods and FAIR tools. Genome Med 16(1): 33.
- Yabe H, Saito K, Kamimoto T, Kimura M, Ogawa K, et al. (2010) Regulation of cancer-specific microRNA by aberrant DNA methylation in laryngeal cancer. Cancer Res 70(8): 4089.
- Yelne S, Chaudhary M, Dod K, Sayyad A, Sharma R, et al. (2023) Harnessing the power of AI: A comprehensive review of its impact and challenges in nursing science and healthcare. Cureus 15(11): e49252.
- Yin Y, Morgunova E, Jolma A, Kaasinen E, Sahu B, et al. (2017) Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356(6337): 502.
- Zheng Z, Li J, Liu T, Fan Y, Zhai Q, et al. (2024) DNA methylation clocks for estimating biological age in Chinese cohorts. Protein Cell 15(8): 575-593.
- Zuber S, Bechtiger L, Bodelet JS, Golin M, Heumann J, et al. (2023) An integrative approach for the analysis of risk and health across the life course: Challenges, innovations, and opportunities for life course research. Discov Soc Sci Health 3(1): 14.