Dataset Handle: 20.500.14123/1735

Supplementary Material for the Paper "Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions"

Archiving without access
No downloads available

Chronological data

Date of availability in catalog 2025-01-22
Available from / since 2025-01-22

Language of the resource

English

Related external resources

Supplement to DOI: 10.1038/s41746-024-01360-w
Zantvoort, K., Nacke, B., Görlich, D., Hornstein, S., Jacobi, C., Funk, B. (2024). Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions. npj Digital Medicine, 7(1), Article 361.

Related PubData resources

Abstract

To provide insights on minimal necessary data set sizes, the researchers explore domain-specific learning curves for digital intervention dropout predictions based on 3654 users from a single study. Prediction performance is analyzed based on dataset size (N = 100–3654), feature groups (F = 2–129), and algorithm choice (from Naive Bayes to Neural Networks). The results substantiate the concern that small datasets (N ≤ 300) overestimate predictive power. For uninformative feature groups, in-sample prediction performance was negatively correlated with dataset size. Sophisticated models overfitted in small datasets but maximized holdout test results in larger datasets. While N = 500 mitigated overfitting, performance did not converge until N = 750–1500. Consequently, the researchers propose minimum dataset sizes of N = 500–1000.

Resource type

Dataset

Kinds of Data

Statistical Evaluations / Tables
Context Materials / Supporting information
Survey Instruments / Measuring Instruments
Programs and Applications

Methods

Summary
Description
Aggregation

Thematic classification

Data Science

Keywords

Machine Learning; Data Science; Prediction; Algorithm; Health Data; Digital Health; Mental Health; Psychiatric Disorder; Intervention; Therapeutics

Notes

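The supplementary material is not hosted in this catalog; it is available for download from the publisher. Please visit the article listed under "Related external resources" (DOI: 10.1038/s41746-024-01360-w) to gain access. You will find the file in the section "Supplementary information".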