=Paper=
{{Paper
|id=Vol-3651/DARLI-AP_paper16
|storemode=property
|title=TLIMB - A Transfer Learning Framework for IMage Analysis of the Brain
|pdfUrl=https://ceur-ws.org/Vol-3651/DARLI-AP-16.pdf
|volume=Vol-3651
|authors=Marc-Andre Schulz,Jan Philipp Albrecht,Alpay Yilmaz,Alexander Koch,Dagmar Kainmüller,Ulf Leser,Kerstin Ritter
|dblpUrl=https://dblp.org/rec/conf/edbt/SchulzAY0KLR24
}}
==TLIMB - A Transfer Learning Framework for IMage Analysis of the Brain==
Marc-André Schulz1,2,†, Jan Philipp Albrecht3,4,†, Alpay Yilmaz3, Alexander Koch1, Dagmar Kainmüller3,4, Ulf Leser3,† and Kerstin Ritter1,2,∗,†

1 Department of Psychiatry and Neurosciences, Charité – Universitätsmedizin Berlin, Berlin, Germany
2 Bernstein Center for Computational Neuroscience, Berlin, Germany
3 Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
4 Max-Delbrueck-Center for Molecular Medicine, Berlin, Germany
Abstract
Biomedical image analysis plays a pivotal role in advancing our understanding of the human body's functioning across different scales, and today usually relies on deep learning-based methods. However, deep learning methods are notoriously data-hungry, which poses a problem in fields where data is difficult to obtain, such as neuroscience. Transfer learning (TL) has become a popular and successful approach to cope with this issue, but it is difficult to apply in practice due to the many parameters that must be set properly. Here, we present TLIMB, a novel Python-based framework for the easy development of optimized and scalable TL-based image analysis pipelines in the neurosciences. TLIMB allows for an intuitive configuration of source/target data sets, the specific TL approach and deep learning architecture, and the hyperparameter optimization method for a given data analysis pipeline, and compiles these into a Nextflow workflow for seamless execution over different infrastructures, ranging from multicore servers to large compute clusters. Our evaluation using a pipeline for analysing 10,000 MRI images of the human brain from the UK Biobank shows that TLIMB is easy to use, incurs negligible overhead, and can scale across different cluster sizes.
Keywords
framework, transfer learning, biomedical image analysis, nextflow
Introduction

Biomedical imaging, especially in neuroscience, is crucial for understanding the complexities of the central nervous system [15]. It allows for the non-invasive examination of brain structure and function, enabling clinical applications like diagnosing and monitoring neurological and psychiatric diseases [2]. Deep learning techniques such as convolutional neural networks (CNNs) and transformer-based architectures show great promise in this domain [8]. Their effectiveness in tasks such as lesion segmentation and disease classification has been demonstrated [18, 20, 8]. However, the success of these advanced architectures often hinges on the availability of large and homogeneous datasets, a challenge in biomedical settings due to their scarcity.

Transfer learning (TL) has recently become popular for overcoming the constraints of small and heterogeneous datasets. In a nutshell, it allows leveraging a model trained on a given source dataset to improve model performance on a different target dataset [21]. However, applying TL in neuroimaging practice has proven difficult, as it requires the careful selection of a multitude of different yet closely interacting parameters, including the base image analysis architecture (e.g., ResNet, different flavors of CNNs, or transformers), the concrete TL method (e.g., fine-tuning, multitask learning), the concrete objective function, and the source dataset to be used. Determining these parameters manually in a framework like PyTorch is time-consuming and error-prone, as it requires source code manipulation and extensive experimentation to find optimal configurations. These experiments can be computationally extremely time-consuming unless adequate parallel and/or distributed infrastructures are available, which, however, makes programming the analysis pipeline even more involved.

In this work, we present TLIMB, a Transfer Learning Framework for IMage Analysis of the Brain. TLIMB is programmed in the widely used general-purpose language Python and is based on PyTorch Lightning1. With TLIMB, users specify their TL pipeline in the form of simple and intuitive configuration files, which are then compiled into a concrete image analysis workflow in Nextflow [6], a popular and powerful workflow engine that can execute an analysis over a wide range of infrastructures, ranging from single servers to large compute clusters. With TLIMB, researchers are thus able to easily assess the effectiveness of different TL setups across diverse datasets and environments.

We specifically designed TLIMB as a framework and not as a proper domain-specific language (i.e., a programming language tailored to a particular problem; DSL) because of the advantages of this approach in terms of flexibility, ease of creation, extensibility, and seamless integration with existing tools [13, 1]. A Python-based framework, in particular, provides a familiar environment for data scientists, capitalizing on the language's popularity and compatibility with established machine and deep learning frameworks (like PyTorch).

Through a series of experiments following the "brain-age" paradigm [4], a widely used method for assessing brain health through neuroimaging data, we validated the platform's capability to create a diverse landscape of TL-based pipelines and to execute them seamlessly over any infrastructure supported by Nextflow.

Published in the Proceedings of the Workshops of the EDBT/ICDT 2024 Joint Conference (March 25-28, 2024), Paestum, Italy
∗ Corresponding author.
† These authors contributed equally.
Email: marc-andre.schulz@charite.de (M. Schulz); jan-philipp.albrecht@mdc-berlin.de (J. P. Albrecht); alpay.yilmaz@student.hu-berlin.de (A. Yilmaz); alexander.koch@charite.de (A. Koch); Dagmar.Kainmueller@mdc-berlin.de (D. Kainmüller); leser@informatik.hu-berlin (U. Leser); kerstin.ritter@charite.de (K. Ritter)
© 2024 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1 https://pypi.org/project/pytorch-lightning/
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
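The configuration-to-workflow idea described above can be illustrated with a small sketch. The keys and component names below are hypothetical stand-ins, not TLIMB's actual configuration schema; the point is how a single declarative configuration with multiple candidate values expands into the set of concrete pipeline variants that a workflow engine can then run in parallel:

```python
from itertools import product

# Hypothetical TL-pipeline configuration (illustrative keys, not
# TLIMB's real schema). List values denote alternatives to explore.
config = {
    "architecture": ["resnet18_3d", "vision_transformer"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "objective": ["cross_entropy"],
    "elastic_penalty": [False, True],
}

def expand_variants(config):
    """Expand a configuration with alternative values into the list of
    concrete pipeline variants (one dict per combination)."""
    keys = list(config)
    return [dict(zip(keys, values)) for values in product(*config.values())]

variants = expand_variants(config)
print(len(variants))  # 2 * 3 * 1 * 2 = 12 variants
```

Each such variant would correspond to one independently schedulable task in a generated workflow, which is what makes the parallel execution over clusters worthwhile.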
Marc-André Schulz et al. CEUR Workshop Proceedings 1–6
Figure 1: Simplified diagram of TLIMB’s structure, showcasing all necessary components.
Related Work

There have been a number of efforts to develop DSLs as well as frameworks for machine learning-based (image) data analysis [9]. OptiML, a DSL tailored for machine learning tasks, seeks to provide an implicitly parallel, expressive, and high-performance alternative to MATLAB and C++ [17]. However, it does not address TL and is agnostic to the data types analysed, and thus requires some effort to be used for image analysis. Extending it with TL abilities would be non-trivial due to its design as a DSL. P-Hydra employs transfer learning and multitask learning for image analysis in cancer detection, aiming to validate its algorithmic effectiveness and establish a baseline for other methods [11]. In contrast to TLIMB, the method is implemented in a single pipeline and not designed as a configurable framework. Furthermore, our approach supports multiple heads per model, enabling a broader spectrum of TL methods. Ilastik, designed as an interactive tool for machine-learning-based (bio)image analysis, addresses challenges associated with manual image analysis by providing pre-defined workflows for segmentation, object classification, counting, and tracking, with a user-friendly interface emphasizing accessibility for non-programmers [3]. In contrast to ilastik, our framework concentrates on training neural networks for TL-based analysis. Finally, SimpleITK is a software package designed for image analysis that provides a simplified interface for flexible and reproducible computational workflows [24], aligning closely with the goals of our framework. While SimpleITK streamlines image analysis through Jupyter Notebooks and introduces various abstractions, our framework adopts a different approach, allowing users to initiate analysis by starting with our framework components and building upon them as needed.

Methods

Architecture of TLIMB

The core of the framework is constructed around three primary components: the DataModule, Architecture, and ObjectiveFunction. These are orchestrated within a Scenario to create a comprehensive TL pipeline. Users can execute different configurations of these Scenarios, such as for hyperparameter tuning or model comparisons, by automatically generating a Nextflow workflow from their definitions. An overview of TLIMB's architecture is shown in Figure 1.

The Scenario component is responsible for the training logic: it orchestrates training, validation, and testing by sourcing data from the DataModule, processing it through the specified deep learning Architecture, calculating losses using the ObjectiveFunction, and executing the optimization step. This abstraction level not only facilitates the conventional sequential pretrain-finetune TL workflows, but also enables the implementation of workflows that require simultaneous processing of both pretraining and fine-tuning data, such as semi-supervised learning algorithms [19]. Scenarios are designed to be data-operation agnostic, i.e., independent of the specific deep learning architecture and objective function, thereby enhancing the modularity of the design.

The Architecture component relates to the adaptable configuration of network layers and nodes, providing users with the versatility to select from predefined architectures or to incorporate their own custom designs by referencing them in the configuration file. To facilitate efficient TL, architectures are decomposed into two main elements: the encoder, which is often repurposed from the source task, and the head, which is specific to and replaceable for the target task. This modular structure supports a variety of TL strategies, ensuring adaptability to methodologies such as the core train-fine-tune paradigm and multitask learning.

Objective Functions embody the core logic of TL, composed of a primary objective (such as cross-entropy for classification) and an auxiliary objective (such as an elastic penalty on weights during fine-tuning or reconstruction losses in semi-supervised training). During the training phase, this class takes the batches and the architecture from the Scenario class to compute the loss and performance metrics. In pursuit of greater modularity, Objective Functions have been designed to remain decoupled from other framework components. For instance, employing an Objective Function designed for multitask learning does not require predefined knowledge of the number of heads within the configuration. This design choice facilitates transitions between objective functions, enhancing the user experience and adaptability within the TL workflow.

Our DataModule defines the handling of diverse data types, ranging from 3D brain MRI data to 1D fMRI time series. It encompasses data-specific loading, preprocessing, and data augmentation routines. It ensures that batch preparation conforms to a defined structure and assembles DataLoaders. In contrast to standard PyTorch Lightning (see below), our method imposes constraints on DataLoader instantiation, mandates a uniform Dataset structure, and centralizes all data-related augmentations and transformations within the DataModule itself. Such a separation of concerns supports simple substitutability of Datasets and DataModules. This module also inherits several features from the PyTorch Lightning DataModule, such as the on_after_batch_transfer and on_before_batch_transfer hooks. These hooks grant users the capability to refine batches post-retrieval but prior to their delivery to the Scenario, enabling, for instance, the offloading of resource-intensive data augmentation strategies to a GPU. This design promotes user-driven adaptability in our framework, ensuring the flexibility to customize components while preserving the integrity of essential operations.

Datasets, pivotal elements within the DataModule, are tasked with providing the necessary data and associated labels. The DataModule delineates the procedures for processing a certain category of data, whereas the Dataset is explicit about the specific input files to utilize, their locations within the file system, and the particular labels to retrieve (for instance, selecting the participant's sex for a pretraining task, and later using the same dataset to return the participant's age, thus facilitating straightforward label specification). A DataModule can include multiple Datasets, accommodating various TL strategies that incorporate data from diverse sources. Each Dataset implements a custom getitem method to ensure the standardized conveyance of images and labels to the DataModule. This getitem method invariably produces a tuple, which includes an image paired with a list of labels, thereby adapting to the diverse labeling demands posed by different Objective Functions. Varied learning paradigms such as multitask learning, pretrain-finetune, and unsupervised domain adaptation require unique label arrangements.

The Configuration component serves as an important tool for managing configuration within our framework, offering users the ability to customize every aspect of their workflow. Unlike PyTorch Lightning, which primarily focuses on non-structural hyperparameters, our Configuration empowers users to tailor Scenarios, Architectures, Objective Functions, Datasets, DataModules, trainers, and optimizer parameters, ensuring high configurability and modularity. This user-centric approach minimizes coding efforts, allowing users to predominantly interact with the Configuration instead. The framework seamlessly integrates with PyTorch Lightning components, enabling the utilization of features like early stopping and automatic optimizers, effortlessly configurable through the provided configuration file. To ensure reproducibility, each workflow is associated with a defined configuration, facilitating run reproduction. The configuration provides essential details such as splits, which define the distribution of images across training, validation, and testing sets. It also includes adjustable seeds to guarantee consistent runs, except when randomness is introduced by the user. Users are not confined to predefined components; instead, our framework provides interfaces for Objective Functions, Architectures, DataModules, Datasets, and Scenarios, making it easy to implement specialized versions of these components, such as a new Objective Function.

Integrated models and implementation

Our framework, implemented in Python, provides a seamless integration of PyTorch and incorporates PyTorch Lightning components. This integration offers multiple benefits, such as support for distributed training, compatibility with multi-GPU setups, and optimized performance for various machine learning tasks. The framework's alignment with Python and PyTorch's popularity in the research community simplifies the learning curve, making it a user-friendly and accessible option for TL projects. However, it also offers significant additional functionality compared to PyTorch Lightning. For instance, our TL Command-Line Interface (CLI) distinguishes itself from the PyTorch Lightning CLI by facilitating the passing of parameters from the DataModule to the Scenario during initialization. This enables users to customize various aspects, such as output size and input size.

The framework is used mainly via configuration files. Users specify key components such as a particular 'DataModule' for input data specifications, a 'Dataset' for data file and label paths, 'Architecture' for the neural network structure, 'Objective' for the transfer learning strategy, and 'Scenario' for training details. The framework supports class path parsing, allowing users to define parameters via class references, which can be particularly useful for complex configurations. To facilitate hyperparameter tuning, multiple variants of these parameters can be provided. The framework's workflow generator leverages this information to create Nextflow workflows, which orchestrate the execution of tasks across the computational infrastructure. This streamlined approach enables systematic exploration and efficient optimization of model parameters.

TLIMB comes with a number of readily available models and configurations for its different components. Regarding architectures, it currently offers 3D ResNets of different depths [22], the 3D Simple Fully Convolutional Network (SFCN) [14], as well as vision and swin transformers, three highly popular imaging architectures. ResNet utilizes shortcut connections to enhance training performance, while SFCN is a lightweight 3D convolutional neural network specifically tailored for 3D neuroimaging data. Transformers are deep encoder-decoder stacks built around self-attention. Several customization options, such as filter and kernel sizes and the addition of dropout layers, are available for each. The framework also integrates pre-processing, neuroimaging domain-specific data augmentation, and data transformation techniques.

Regarding TL algorithms, our framework encompasses five methods: pretrain-finetune, multitask learning, self-supervised semi-supervised learning, elastic penalty, and unsupervised domain adaptation. Pretrain-finetune involves using a pre-trained network for a target task, while the elastic penalty introduces an L² penalty to preserve learned features during fine-tuning. Multitask learning optimizes models by sharing representations between related tasks. Self-supervised semi-supervised learning leverages both labeled and unlabeled data. Unsupervised domain adaptation allows training deep models using labeled data from a source domain and unlabeled data from a target domain. An overview of these techniques can be found in [10].

TLIMB's objective functions mirror PyTorch Lightning training/validation/testing steps. Hyperparameter optimization is facilitated through grid search and random search. TLIMB comes with three directly usable DataMod-
Table 1
Reduction in lines of source code for a simple pretrain-finetune scenario when moving from native PyTorch Lightning to our framework.
Description Manual Implementation Framework Implementation
Total Source Lines 286 34
Data Module Definition 81 -
Dataset Definition 34 34
Lightning Module 51 -
Rest (Losses, Architecture, Import) 125 -
Table 2
Comparison of Execution Times (on AMD 3970X 32-Core / Nvidia GeForce 3090). We report average values of three runs, together
with their standard deviation (in brackets). Reduction in execution time in our framework is mostly due to parallelisation of
training and testing steps.
Execution Workflow Manual Implementation (s) Framework Implementation (s)
CPU only 109.66 (+- 0.47) 79.33 (+- 1.88)
Single GPU 72.66 (+- 0.47) 46.00 (+- 0.00)
ules, namely BaseDataModule, CropCenterDataModule, and BioImageDataModule. Additionally, several pre-defined Datasets from the Human Connectome Project are readily available, but researchers can effortlessly incorporate any image analysis dataset of their choice by utilizing the provided interface.

Nextflow as workflow manager

Nextflow is a mature and popular scientific workflow engine [7]. Workflows in Nextflow are written in a proper workflow language based on Groovy and are executed by a workflow engine which controls data dependencies, maximises parallelism in task executions, and supports reproducibility through a sophisticated logging mechanism. Workflows can either be executed locally (non-distributed) by the system itself, or passed on to popular resource managers, such as Slurm or Kubernetes [25], for scheduling on arbitrarily large clusters. In TLIMB, we utilize Nextflow to assemble TL workflows from user-provided configurations into a workflow script. This script can then be executed in parallel and distributed across all supported infrastructures, significantly accelerating processing.

Experiments

For the evaluation of the TLIMB framework, we used T1-weighted brain images from the UK Biobank [12]. To streamline the evaluation process, we processed images by applying linear registration and extracting the central axial slices. This reduced the dimensionality of the data, allowing us to expedite the training process. We created three subsets of randomly sampled images: 10,000 for pre-training, 500 for fine-tuning, and 1,000 for the test set. Models were pre-trained on age regression and fine-tuned on sex classification.

To assess usability improvements, we specified a simplified search space comprising two different neural network architectures (ResNet-18 and a Vision Transformer), three different learning rates (10⁻⁴, 10⁻³, 10⁻²), and an optional elastic penalty loss [23] as an advanced fine-tuning technique. Both models were pre-trained for 10 epochs and fine-tuned for one epoch. Such limited training time would be insufficient for real-world applications, but our aim here is to investigate the framework's usability and computational overhead rather than achieving state-of-the-art accuracy. Each variant was implemented in two ways: manually using PyTorch with PyTorch Lightning, and through our TLIMB framework compiled into a Nextflow workflow. These implementations were then run in three different scenarios: manually without the framework, with the framework sequentially, and with the framework in parallel. The primary metrics for evaluation were the execution times and the lines of code required for each scenario. Execution times are documented in Table 2, illustrating the comparison between running the processes with and without the framework, both sequentially and in parallel.

Additionally, we conducted a minimal set of experiments illustrating how TLIMB may be used in practice. On the same data set and using the ResNet-18 architecture, we compared fine-tuning effectiveness for different numbers of frozen layers in the pre-trained model. Freezing lower layers of a pre-trained model reduces the number of trainable parameters and thus reduces the risk of overfitting during the fine-tuning process. Metrics for pre-training and fine-tuning performance are shown in Tables 3 and 4, respectively. TLIMB achieved expected levels of accuracy, in line with other studies [5, 16].

Execution Time: The execution times indicated minimal to no computational overhead when using the TLIMB framework. The parallel execution with Nextflow significantly reduced the time compared to the sequential runs, showcasing the framework's scalability (see Table 2).

Lines of Code: A notable reduction in lines of code was observed when using TLIMB, emphasizing the ease of use and time savings in coding. The framework abstracted many of the repetitive tasks, such as setting up data loaders, model configurations, and hyperparameter tuning, which contributed to a more streamlined development process.

Prediction Performance: Although no exact replication of literature results was attempted at the time of writing, our preliminary results are compatible with literature expectations.
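The elastic penalty used as the advanced fine-tuning technique above can be sketched in a few lines. The following is a simplified, framework-independent illustration of the idea in [23], an L² penalty pulling fine-tuned weights back towards their pre-trained starting point; the weight values and the penalty strength are made-up toy numbers, not TLIMB code:

```python
# Toy illustration (not TLIMB's implementation): an elastic, L2-SP-style
# penalty adds strength * sum((w - w0)^2) to the task loss, discouraging
# the fine-tuned weights w from drifting far from the pre-trained w0.
def elastic_penalty(weights, pretrained_weights, strength):
    return strength * sum((w - w0) ** 2 for w, w0 in zip(weights, pretrained_weights))

def total_loss(task_loss, weights, pretrained_weights, strength=0.1):
    return task_loss + elastic_penalty(weights, pretrained_weights, strength)

# Weights identical to the pre-trained ones incur no penalty ...
print(total_loss(0.5, [1.0, -2.0], [1.0, -2.0]))  # 0.5
# ... while drifting weights are penalised quadratically:
# 0.5 + 0.1 * (2.0 - 1.0)**2
print(total_loss(0.5, [2.0, -2.0], [1.0, -2.0]))
```

In a real pipeline the penalty term would of course be computed over tensors and enter backpropagation; the sketch only shows why the regularizer preserves learned features during fine-tuning.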
Table 3
Accuracy of a ResNet18 predicting brain age from 2D images of human brain from the UK Biobank. Models were trained from
scratch. Reported results are the average and standard deviation over three training runs.
Learn Rate MSE MAE
0.01 31.3 (+- 2.63) 4.45 (+- 0.2)
0.001 26.4 (+- 0.17) 4.1 (+- 0.03)
0.0001 32.99 (+- 1.14) 4.6 (+- 0.08)
0.00001 59.99 (+- 1.63) 6.32 (+- 0.09)
Table 4
Accuracy of a ResNet18 predicting sex from 2D images of the human brain from the UK Biobank. Models were pre-trained on
age prediction (see Table 3). Learning rate was fixed at 0.001. We show 4 variations in which increasing numbers of layers are
kept frozen during fine-tuning. "None" refers to no frozen parameters.
Freeze up to Layer CE Accuracy Trainable Parameters
None 1.29 (+- 0.3) 0.76 (+- 0.04) 11.2 M
layer3.1.conv1 0.71 (+- 0.1) 0.78 (+- 0.01) 9.6 M
layer4.0.conv1 0.67 (+- 0.3) 0.79 (+- 0.01) 8.4 M
layer4.1.conv1 0.56 (+- 0.0) 0.72 (+- 0.0) 4.7 M
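The relationship between the freezing boundary and the "Trainable Parameters" column of Table 4 can be made concrete with a small sketch. The parameter names echo the ResNet-style layer names used in the table, but the sizes are hypothetical toy values and the function is not TLIMB's actual freezing API; it only demonstrates the mechanism: every parameter up to and including the named boundary layer is frozen, and only the remaining ones stay trainable.

```python
# Hypothetical (name, parameter-count) list in model order,
# mimicking ResNet-style layer naming; sizes are made up.
params = [
    ("conv1.weight", 9_408),
    ("layer3.1.conv1.weight", 589_824),
    ("layer4.0.conv1.weight", 1_179_648),
    ("fc.weight", 1_024),
]

def trainable_parameters(params, freeze_up_to=None):
    """Count parameters that remain trainable when every parameter up to
    and including the layer `freeze_up_to` is frozen (None = freeze nothing)."""
    if freeze_up_to is None:
        return sum(size for _, size in params)
    frozen = True
    total = 0
    for name, size in params:
        if not frozen:
            total += size  # parameters after the boundary stay trainable
        if frozen and name.startswith(freeze_up_to):
            frozen = False  # boundary reached; subsequent layers are trainable
    return total

print(trainable_parameters(params))                    # -> 1779904 (nothing frozen)
print(trainable_parameters(params, "layer3.1.conv1"))  # -> 1180672 remain trainable
```

Fewer trainable parameters mean less capacity to overfit the small fine-tuning set, which is the trade-off explored across the rows of Table 4.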
Conclusion and outlook

In this study, we introduce our solution: a tailored implementation and evaluation platform for TL techniques in biomedical imaging applications. Guided by specific requirements, we opted for a comprehensive framework over a DSL. Our framework comprises two key components: firstly, a Python framework built upon PyTorch Lightning, facilitating diverse user-defined TL tasks; secondly, a workflow generator and executor ensuring scalability. We provide in-depth descriptions of both components, highlighting their functionalities and capabilities. To ascertain the effectiveness and utility of our framework, we applied it to the "brain-age" paradigm. In this context, the assessment of brain-age deviations from chronological age serves as a metric for evaluating brain health. Our framework demonstrates minimal or no computational overhead, while significantly reducing the number of lines of code required. In the pursuit of refining our framework, we propose several avenues for future development. Firstly, we recommend the establishment of a standardized template to streamline the evaluation of TL methods. This template would simplify result and methodology comparisons among researchers, fostering a more cohesive and efficient research environment. Moreover, to enhance the efficiency of model tuning, we advocate for the implementation of additional hyperparameter optimization methods within our framework. Specifically, techniques like Bayesian optimization can be incorporated to further optimize model performance. Furthermore, to minimize manual intervention and improve user experience, we suggest enhancing the workflow manager. This enhancement includes the addition of automatic ranking capabilities, which will facilitate a more efficient comparison and selection of the best-performing models, guided by predefined evaluation metrics.

References

[1] Alexander Alexandrov et al. "Implicit parallelism through deep language embedding". In: SIGMOD. 2015, pp. 47–61.
[2] Rohit Bakshi et al. "MRI in multiple sclerosis: current status and future prospects". In: The Lancet Neurology 7.7 (2008), pp. 615–625.
[3] Stuart Berg et al. "Ilastik: interactive machine learning for (bio) image analysis". In: Nature Methods 16.12 (2019), pp. 1226–1232.
[4] James H Cole et al. "Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker". In: NeuroImage 163 (2017), pp. 115–124.
[5] James H. Cole. "Multimodality neuroimaging brain-age in UK biobank: relationship to biomedical, lifestyle, and cognitive factors". In: Neurobiology of Aging 92 (Aug. 2020), pp. 34–42. issn: 0197-4580. doi: 10.1016/j.neurobiolaging.2020.03.014. url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7280786/ (visited on 03/04/2024).
[6] P. Di Tommaso et al. "Nextflow enables reproducible computational workflows". In: Nat Biotechnol 35.4 (2017), pp. 316–319.
[7] Paolo Di Tommaso et al. "Nextflow enables reproducible computational workflows". In: Nature Biotechnology 35.4 (2017), pp. 316–319.
[8] Fabian Eitel et al. "Promises and pitfalls of deep neural networks in neuroimaging-based psychiatric research". In: Experimental Neurology 339 (2021), p. 113608.
[9] Joan Giner-Miguelez, Abel Gómez, and Jordi Cabot. "A domain-specific language for describing machine learning datasets". In: Journal of Computer Languages 76 (2023), p. 101209.
[10] Padmavathi Kora et al. "Transfer learning techniques for medical image analysis: A review". In: Biocybernetics and Biomedical Engineering 42.1 (2022), pp. 79–107.
[11] Jiyoung Lee. "P-Hydra: Bridging Transfer Learning And Multitask Learning". In: Master Thesis, University of Friburg (2020).
[12] Thomas J Littlejohns et al. "The UK Biobank imaging enhancement of 100,000 participants: rationale, data collection, management and future directions". In: Nature Communications 11.1 (2020), p. 2624.
[13] Marjan Mernik, Jan Heering, and Anthony M. Sloane. "When and How to Develop Domain-Specific Languages". In: ACM Comput. Surv. 37.4 (Dec. 2005), pp. 316–344.
[14] Han Peng et al. "Accurate brain age prediction with lightweight deep neural networks". In: Medical Image Analysis 68 (2021), p. 101871.
[15] Rangaraj M Rangayyan. Biomedical image analysis. CRC Press, 2004.
[16] Marc-Andre Schulz et al. "Performance reserves in brain-imaging-based phenotype prediction". In: Cell Reports 43.1 (2024).
[17] Arvind Sujeeth et al. "OptiML: an implicitly parallel domain-specific language for machine learning". In: ICML. 2011, pp. 609–616.
[18] Sergi Valverde et al. "Improving automated multiple sclerosis lesion segmentation with a cascaded 3D convolutional neural network approach". In: NeuroImage 155 (2017).
[19] Jesper E Van Engelen and Holger H Hoos. "A survey on semi-supervised learning". In: Machine Learning 109.2 (2020), pp. 373–440.
[20] Sandra Vieira, Walter HL Pinaya, and Andrea Mechelli. "Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications". In: Neuroscience & Biobehavioral Reviews 74 (2017), pp. 58–75.
[21] Karl Weiss, Taghi M Khoshgoftaar, and DingDing Wang. "A survey of transfer learning". In: Journal of Big Data 3.1 (2016), pp. 1–40.
[22] Wanni Xu, You-Lei Fu, and Dongmei Zhu. "ResNet and Its Application to Medical Image Processing: Research Progress and Challenges". In: Computer Methods and Programs in Biomedicine (2023), p. 107660.
[23] LI Xuhong, Yves Grandvalet, and Franck Davoine. "Explicit inductive bias for transfer learning with convolutional networks". In: International Conference on Machine Learning. PMLR. 2018, pp. 2825–2834.
[24] Ziv Yaniv et al. "SimpleITK image-analysis notebooks: a collaborative environment for education and reproducible research". In: Journal of Digital Imaging 31.3 (2018), pp. 290–303.
[25] Naweiluo Zhou, Huan Zhou, and Dennis Hoppe. "Containerization for High Performance Computing Systems: Survey and Prospects". In: IEEE Transactions on Software Engineering 49.4 (2022), pp. 2722–2740.
Acknowledgements and Funding
This work was funded by FONDA (DFG; SFB 1404; Project
ID: 414984028).