Please use this identifier to cite or link to this resource: https://doi.org/10.48548/pubdata-1422
Resource type: Journal article
Title: Dataset size versus homogeneity: A machine learning study on pooling intervention data in e-mental health dropout predictions
DOI: 10.48548/pubdata-1422
Handle: 20.500.14123/1491
Authors: Zantvoort, Kirsten  0000-0001-9876-054X
Hentati Isacsson, Nils  0000-0002-5749-5310
Funk, Burkhardt  0000-0001-5855-2666
Kaldo, Viktor  0000-0002-6443-5279
Abstract:
Objective: This study proposes a way of increasing dataset sizes for machine learning tasks in Internet-based Cognitive Behavioral Therapy through pooling interventions. To this end, it (1) examines similarities in user behavior and symptom data among online interventions for patients with depression, social anxiety, and panic disorder and (2) explores whether these similarities suffice to allow for pooling the data together, resulting in more training data when predicting intervention dropout.
Methods: A total of 6418 routine care patients from the Internet Psychiatry in Stockholm are analyzed using (1) clustering and (2) dropout prediction models. For the latter, prediction models trained on each individual intervention's data are compared to those trained on all three interventions pooled into one dataset. To investigate whether results vary with dataset size, the prediction is repeated using small and medium dataset sizes.
Results: The clustering analysis identified three distinct groups that are almost equally spread across interventions and are instead characterized by different activity levels. In eight out of nine settings investigated, pooling the data improves prediction results compared to models trained on a single intervention dataset. It is further confirmed that models trained on small datasets are more likely to overestimate prediction results.
Conclusion: The study reveals similar patterns of patients with depression, social anxiety, and panic disorder regarding online activity and intervention dropout. As such, this work offers pooling different interventions’ data as a possible approach to counter the problem of small dataset sizes in psychological research.
Language: English
Keywords: Mental Health; Digital Health; Machine Learning
Year of publication in PubData: 2024
Type of publication: Secondary publication
Publication version: Published version
Date of first publication: 2024-05-15
Context of creation: Research
Notes: This publication was funded by the German Research Foundation (DFG).
Published by: Medien- und Informationszentrum, Leuphana Universität Lüneburg
Related resources (relations of this publication)
  Information on the first publication
Resource type: Journal
Journal title: Digital Health
Identifier: DOI: 10.1177/20552076241248920
Publication year: 2024
Volume: 10
Publisher / Provider: SAGE
Files for this resource:
File: Zantvoort_dataset_size_versus_homogeneity.pdf
MD5: 0033e682c954f7d6089f51e2f3424a52
License: open-access
Size: 817.65 kB
Format: Adobe PDF

All resources in this repository are protected by copyright unless otherwise indicated.
