BlaBla: Linguistic Feature Extraction for Clinical Analysis in Multiple Languages


Abstract in English

We introduce BlaBla, an open-source Python library for extracting linguistic features with proven clinical relevance to neurological and psychiatric diseases across many languages. BlaBla is a unifying framework for accelerating and simplifying clinical linguistic research. The library is built on state-of-the-art NLP frameworks and supports multithreaded/GPU-enabled feature extraction via both native Python calls and a command line interface. We describe BlaBlas architecture and clinical validation of its features across 12 diseases. We further demonstrate the application of BlaBla to a task visualizing and classifying language disorders in three languages on real clinical data from the AphasiaBank dataset. We make the codebase freely available to researchers with the hope of providing a consistent, well-validated foundation for the next generation of clinical linguistic research.

Download