Accurate and efficient data-driven psychiatric assessment using machine learning

Abstract

Background

Accurate assessment of mental disorders and learning disabilities is essential for timely intervention. Machine learning and feature selection techniques have the potential to improve the accuracy and efficiency of mental health assessments. However, little research has explored the use of large transdiagnostic datasets or the application of these techniques to developing brief, question-based assessments. This study applies machine learning and feature selection techniques to a large transdiagnostic dataset featuring a high number of assessment items, with the aim of creating a tool for constructing streamlined, efficient, and effective assessments from existing data.

Methods

Using the Healthy Brain Network dataset (n = 4,136 at the time this study was conducted), which contains over 1,000 questionnaire items, a two-stage feature selection approach based on Elastic Net models was used to identify optimal, parsimonious item subsets for assessing various disorders and symptoms, as well as custom test-based outcome measures for learning disabilities. Model performance was then compared with that of existing assessments through rigorous cross-validation.
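
For illustration only, a two-stage selection of this general kind might be sketched as follows. The abstract does not specify what the two stages are, so the staging here is an assumption: stage 1 ranks items by the absolute coefficient of an elastic net logistic regression, and stage 2 picks the smallest candidate subset whose cross-validated AUC is within a tolerance of the best. The data are synthetic, and the function names, candidate sizes, and hyperparameters are hypothetical rather than taken from the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def elastic_net_classifier():
    # Logistic regression with an elastic net penalty; "saga" is the
    # scikit-learn solver that supports this penalty. Hyperparameters are
    # illustrative assumptions, not values from the study.
    return make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, C=0.1, max_iter=5000),
    )


def rank_items(X, y):
    # Stage 1 (assumed): order items by absolute elastic net coefficient.
    model = elastic_net_classifier().fit(X, y)
    coefs = model.named_steps["logisticregression"].coef_.ravel()
    return np.argsort(-np.abs(coefs))


def select_subset(X, y, sizes=(5, 10, 20), tolerance=0.01, cv=5):
    # Stage 2 (assumed): score each candidate subset size by mean
    # cross-validated AUC, then keep the smallest size whose score is
    # within `tolerance` of the best one.
    order = rank_items(X, y)
    scores = {}
    for k in sizes:
        cols = order[:k]
        scores[k] = cross_val_score(elastic_net_classifier(), X[:, cols], y,
                                    cv=cv, scoring="roc_auc").mean()
    best = max(scores.values())
    k_chosen = min(k for k, s in scores.items() if s >= best - tolerance)
    return order[:k_chosen], scores


# Synthetic stand-in for questionnaire items (columns) and a binary diagnosis.
X, y = make_classification(n_samples=400, n_features=40, n_informative=6,
                           random_state=0)
items, scores = select_subset(X, y)
print("selected item indices:", items)
print("mean AUC per subset size:", scores)
```

In this sketch the items are ranked on the full dataset before cross-validation, which makes the reported AUC optimistic. A more careful version would repeat the ranking inside each training fold or evaluate the chosen subset on a held-out test set.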

Results

Machine learning models using parsimonious item subsets significantly outperformed traditional assessments (p = 0.004). Models for specific learning disorders achieved AUC values up to 0.855. Importantly, restricting analysis to non-proprietary assessment items did not significantly reduce performance.

Discussion

This study demonstrates the feasibility of using existing datasets to create efficient, effective assessment tools for mental disorders and learning disabilities. Our open-source, modular software architecture facilitates adaptation to diverse datasets, though external validation remains necessary before clinical implementation. The ability to achieve strong performance using only non-proprietary items supports the development of accessible assessment tools.