=Paper= {{Paper |id=Vol-3651/DARLI-AP_paper16 |storemode=property |title=TLIMB - A Transfer Learning Framework for IMage Analysis of the Brain |pdfUrl=https://ceur-ws.org/Vol-3651/DARLI-AP-16.pdf |volume=Vol-3651 |authors=Marc-Andre Schulz,Jan Philipp Albrecht,Alpay Yilmaz,Alexander Koch,Dagmar Kainmüller,Ulf Leser,Kerstin Ritter |dblpUrl=https://dblp.org/rec/conf/edbt/SchulzAY0KLR24 }} ==TLIMB - A Transfer Learning Framework for IMage Analysis of the Brain== https://ceur-ws.org/Vol-3651/DARLI-AP-16.pdf
TLIMB - A Transfer Learning Framework for IMage Analysis of the Brain

Marc-André Schulz1,2,†, Jan Philipp Albrecht3,4,†, Alpay Yilmaz3, Alexander Koch1, Dagmar Kainmüller3,4, Ulf Leser3,† and Kerstin Ritter1,2,∗,†

1 Department of Psychiatry and Neurosciences, Charité – Universitätsmedizin Berlin, Berlin, Germany
2 Bernstein Center for Computational Neuroscience, Berlin, Germany
3 Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
4 Max-Delbrueck-Center for Molecular Medicine, Berlin, Germany


Abstract
Biomedical image analysis plays a pivotal role in advancing our understanding of the human body's functioning across different scales, and is nowadays usually based on deep learning methods. However, deep learning methods are notoriously data-hungry, which poses a problem in fields where data is difficult to obtain, such as neuroscience. Transfer learning (TL) has become a popular and successful approach to cope with this issue, but is difficult to apply in practice due to the many parameters that must be set properly. Here, we present TLIMB, a novel Python-based framework for the easy development of optimized and scalable TL-based image analysis pipelines in the neurosciences. TLIMB allows for an intuitive configuration of source/target datasets, the specific TL approach and deep learning architecture, and the hyperparameter optimization method for a given data analysis pipeline, and compiles these into a Nextflow workflow for seamless execution over different infrastructures, ranging from multicore servers to large compute clusters. Our evaluation using a pipeline for analysing 10,000 MRI images of the human brain from the UK Biobank shows that TLIMB is easy to use, incurs negligible overhead, and can scale across different cluster sizes.

Keywords
framework, transfer learning, biomedical image analysis, nextflow



Published in the Proceedings of the Workshops of the EDBT/ICDT 2024 Joint Conference (March 25-28, 2024), Paestum, Italy
∗ Corresponding author.
† These authors contributed equally.
Email: marc-andre.schulz@charite.de (M. Schulz); jan-philipp.albrecht@mdc-berlin.de (J. P. Albrecht); alpay.yilmaz@student.hu-berlin.de (A. Yilmaz); alexander.koch@charite.de (A. Koch); Dagmar.Kainmueller@mdc-berlin.de (D. Kainmüller); leser@informatik.hu-berlin.de (U. Leser); kerstin.ritter@charite.de (K. Ritter)
© 2024 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


Introduction

Biomedical imaging, especially in neuroscience, is crucial for understanding the complexities of the central nervous system [15]. It allows for non-invasive examination of brain structure and function, enabling clinical applications like diagnosing and monitoring neurological and psychiatric diseases [2]. Deep learning techniques such as convolutional neural networks (CNNs) and transformer-based architectures show great promise in this domain [8]. Their effectiveness in tasks such as lesion segmentation and disease classification has been demonstrated [18, 20, 8]. However, the success of these advanced architectures often hinges on the availability of large and homogeneous datasets, which are scarce in biomedical settings.

Transfer learning (TL) has recently become popular for overcoming the constraints of small and heterogeneous datasets. In a nutshell, it allows leveraging a model trained on a given source dataset to improve model performance on a different target dataset [21]. However, applying TL in neuroimaging practice has proven difficult, as it requires the careful selection of a multitude of different yet closely interacting parameters, including the base image analysis architecture (e.g., ResNet, different flavors of CNNs, or transformers), the concrete TL method (e.g., fine-tuning, multitask learning), the concrete objective function, and the source dataset to be used. Determining these parameters manually in a framework like PyTorch is time-consuming and error-prone, as it requires source code manipulation and extensive experimentation to find optimal configurations. These experiments can be computationally extremely time-consuming unless adequate parallel and/or distributed infrastructures are available, which, however, makes programming the analysis pipeline even more involved.

In this work, we present TLIMB, a Transfer-Learning Framework for IMage Analysis of the Brain. TLIMB is programmed in the widely-used general-purpose language Python and based on PyTorch Lightning (https://pypi.org/project/pytorch-lightning/). With TLIMB, users specify their TL pipeline in the form of simple and intuitive configuration files, which are then compiled into a concrete image analysis workflow in Nextflow [6], a popular and powerful workflow engine that can execute an analysis over a wide range of infrastructures, ranging from single servers to large compute clusters. With TLIMB, researchers are thus able to easily assess the effectiveness of different TL setups across diverse datasets and environments.

We specifically designed TLIMB as a framework and not as a proper domain-specific language (i.e., a programming language tailored to a particular problem; DSL) because of the advantages of this approach in terms of flexibility, ease of creation, extensibility, and seamless integration with existing tools [13, 1]. A Python-based framework, in particular, provides a familiar environment for data scientists, capitalizing on the language's popularity and compatibility with established machine and deep learning frameworks (like PyTorch).

Through a series of experiments following the "brain-age" paradigm [4], a widely-used method for assessing brain health through neuroimaging data, we validated the platform's capability to create a diverse landscape of TL-based pipelines and to execute them seamlessly over any infrastructure supported by Nextflow.






Figure 1: Simplified diagram of TLIMB’s structure, showcasing all necessary components.



Related Work

There have been a number of efforts to develop DSLs as well as frameworks for machine learning-based (image) data analysis [9]. OptiML, a DSL tailored for machine learning tasks, seeks to provide an implicitly parallel, expressive, and high-performance alternative to MATLAB and C++ [17]. However, it does not address TL and is agnostic to the data types analysed, and thus requires some effort to be used for image analysis. Extending it with TL abilities would be non-trivial due to its design as a DSL. P-Hydra employs transfer learning and multitask learning for image analysis in cancer detection, aiming to validate its algorithmic effectiveness and establish a baseline for other methods [11]. In contrast to TLIMB, the method is implemented as a single pipeline and not designed as a configurable framework. Furthermore, our approach supports multiple heads per model, enabling a broader spectrum of TL methods. Ilastik, designed as an interactive tool for machine-learning-based (bio)image analysis, addresses challenges associated with manual image analysis by providing pre-defined workflows for segmentation, object classification, counting, and tracking, with a user-friendly interface emphasizing accessibility for non-programmers [3]. In contrast to ilastik, our framework concentrates on training neural networks for TL-based analysis.
Finally, SimpleITK is a software package designed for image analysis that provides a simplified interface for flexible and reproducible computational workflows [24], aligning closely with the goals of our framework. While SimpleITK streamlines image analysis through Jupyter Notebooks and introduces various abstractions, our framework adopts a different approach, allowing users to initiate analysis by starting with our framework components and building upon them as needed.

Methods

Architecture of TLIMB

The core of the framework is constructed around three primary components: the DataModule, Architecture, and ObjectiveFunction. These are orchestrated within a Scenario to create a comprehensive TL pipeline. Users can execute different configurations of these Scenarios, such as for hyperparameter tuning or model comparisons, by automatically generating a Nextflow workflow from their definitions. An overview of TLIMB's architecture is shown in Figure 1.
The Scenario component is responsible for the training logic: it orchestrates training, validation, and testing by sourcing data from the DataModule, processing it through the specified deep learning Architecture, calculating losses using the ObjectiveFunction, and executing the optimization step. This abstraction level not only facilitates conventional sequential pretrain-finetune TL workflows, but also enables the implementation of workflows that require simultaneous processing of both pretraining and fine-tuning data, such as semi-supervised learning algorithms [19]. Scenarios are designed to be data-operation agnostic, i.e., independent of the specific deep learning architecture and objective function, thereby enhancing the modularity of the design.

The Architecture component covers the adaptable configuration of network layers and nodes, providing users with the versatility to select from predefined architectures or to incorporate their own custom designs by referencing them in the configuration file. To facilitate efficient TL, architectures are decomposed into two main elements: the encoder, which is often repurposed from the source task, and the head, which is specific to and replaceable for the target task. This modular structure supports a variety of TL strategies, ensuring adaptability to methodologies such as the core train-fine-tune paradigm and multitask learning.

Objective Functions embody the core logic of TL, composed of a primary objective (such as cross-entropy for classification) and an auxiliary objective (such as an elastic penalty on weights during fine-tuning or reconstruction losses in semi-supervised training). During the training phase, this class takes batches and the architecture from the Scenario class to compute the loss and performance metrics. In pursuit of greater modularity, Objective Functions have been architected to remain decoupled from other framework components. For instance, employing an Objective Function designed for multitask learning does not require predefined knowledge of the number of heads within the configuration. This design choice facilitates transitions between objective functions, enhancing the user experience and adaptability within the TL workflow.
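To make this division of labor concrete, the following sketch shows how an encoder/head Architecture and a composite Objective Function could fit together. It is an illustrative sketch only: the class names and signatures are our assumptions for exposition, not TLIMB's actual API.

```python
import torch
import torch.nn as nn

class Architecture(nn.Module):
    """Encoder/head split: the encoder can be reused from the source task,
    while the head is swapped out for the target task."""
    def __init__(self, encoder: nn.Module, head: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.head = head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

class ObjectiveFunction:
    """Primary loss (e.g., cross-entropy) plus an optional auxiliary term
    (e.g., an elastic penalty during fine-tuning)."""
    def __init__(self, primary, auxiliary=None, aux_weight: float = 1.0):
        self.primary = primary
        self.auxiliary = auxiliary
        self.aux_weight = aux_weight

    def __call__(self, model: Architecture, batch) -> torch.Tensor:
        images, labels = batch          # labels is a list, one entry per head/task
        loss = self.primary(model(images), labels[0])
        if self.auxiliary is not None:  # auxiliary terms operate on the model itself
            loss = loss + self.aux_weight * self.auxiliary(model)
        return loss
```

A Scenario would then drive the loop around these pieces: fetch a batch from the DataModule, evaluate the Objective Function against the current Architecture, and take an optimizer step.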







Our DataModule defines the handling of diverse data types, ranging from 3D brain MRI data to 1D fMRI time series. It encompasses data-specific loading, preprocessing, and data augmentation routines. It ensures that batch preparation conforms to a defined structure and assembles DataLoaders. In contrast to standard PyTorch Lightning (see below), our method imposes constraints on DataLoader instantiation, mandates a uniform Dataset structure, and centralizes all data-related augmentations and transformations within the DataModule itself. Such a separation of concerns supports simple substitutability of Datasets and DataModules. This module also inherits several features from the PyTorch Lightning DataModule, such as the on_after_batch_transfer and on_before_batch_transfer hooks. These hooks grant users the capability to refine batches post-retrieval but prior to their delivery to the Scenario, enabling, for instance, the offloading of resource-intensive data augmentation strategies to a GPU. This design promotes user-driven adaptability in our framework, ensuring the flexibility to customize components while preserving the integrity of essential operations.
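As an illustration of these hooks, a user-defined DataModule might offload augmentation to the GPU roughly as follows; the augmentation itself is a placeholder, while on_after_batch_transfer is a standard PyTorch Lightning hook:

```python
import torch
from pytorch_lightning import LightningDataModule

class AugmentingDataModule(LightningDataModule):
    """Sketch: augment batches after device transfer but before delivery to
    the Scenario; dataset wiring and stage checks are omitted."""

    def on_after_batch_transfer(self, batch, dataloader_idx: int):
        images, labels = batch
        # the batch already lives on the target device, so this runs on the GPU
        images = images + torch.randn_like(images) * 0.01  # placeholder augmentation
        return images, labels
```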
Datasets, pivotal elements within the DataModule, are tasked with providing the necessary data and associated labels. The DataModule delineates the procedures for processing a certain category of data, whereas the Dataset is explicit about the specific input files to utilize, their locations within the file system, and the particular labels to retrieve (for instance, selecting the participant's sex for a pretraining task, and later using the same dataset to return the participant's age, thus facilitating straightforward label specification). A DataModule can include multiple Datasets, accommodating various TL strategies that incorporate data from diverse sources. Each Dataset implements a custom getitem method to ensure the standardized conveyance of images and labels to the DataModule. This getitem method invariably produces a tuple, which includes an image paired with a list of labels, thereby adapting to the diverse labeling demands posed by different Objective Functions. Varied learning paradigms such as multitask learning, pretrain-fine-tune, and unsupervised domain adaptation require unique label arrangements.
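As an illustration of the (image, list-of-labels) convention, a hypothetical Dataset could look as follows; the class name, file handling, and label layout are placeholders rather than TLIMB internals:

```python
from pathlib import Path
from typing import List, Tuple
import torch
from torch.utils.data import Dataset

class BrainMRIDataset(Dataset):
    """Hypothetical Dataset returning a tuple of an image and a list of labels."""
    def __init__(self, image_paths: List[Path], labels: List[List[float]]):
        self.image_paths = image_paths
        self.labels = labels  # one list per image, e.g. [age] or [age, sex]

    def __len__(self) -> int:
        return len(self.image_paths)

    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, List[torch.Tensor]]:
        image = torch.load(self.image_paths[idx])  # assumes preprocessed tensors on disk
        labels = [torch.as_tensor(l) for l in self.labels[idx]]
        return image, labels  # invariably a tuple: image plus a list of labels
```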
The Configuration component serves as an important tool for managing configuration within our framework, offering users the ability to customize every aspect of their workflow. Unlike PyTorch Lightning, which primarily focuses on non-structural hyperparameters, our Configuration empowers users to tailor Scenarios, Architectures, Objective Functions, Datasets, DataModules, trainers, and optimizer parameters, ensuring high configurability and modularity. This user-centric approach minimizes coding effort, allowing users to predominantly interact with the Configuration instead. The framework seamlessly integrates with PyTorch Lightning components, enabling the utilization of features like early stopping and automatic optimizers, effortlessly configurable through the provided configuration file. To ensure reproducibility, each workflow is associated with a defined configuration, facilitating run reproduction. The configuration provides essential details such as splits, which define the distribution of images across training, validation, and testing sets. It also includes adjustable seeds to guarantee consistent runs, except when randomness is introduced by the user. Users are not confined to predefined components; instead, our framework provides interfaces for Objective Functions, Architectures, DataModules, Datasets, and Scenarios, making it easy to implement specialized versions of these components, such as a new Objective Function.
Integrated models and implementation

Our framework, implemented in Python, provides a seamless integration of PyTorch and incorporates PyTorch Lightning components. This integration offers multiple benefits, such as support for distributed training, compatibility with multi-GPU setups, and optimized performance for various machine learning tasks. The framework's alignment with Python and PyTorch's popularity in the research community eases the learning curve, making it a user-friendly and accessible option for TL projects. However, it also offers significant additional functionality compared to PyTorch Lightning. For instance, our TL Command-Line Interface (CLI) distinguishes itself from the PyTorch Lightning CLI by facilitating the passage of parameters from the DataModule to the Scenario during initialization. This enables users to customize various aspects, such as output size and input size.

The framework is used mainly via configuration files. Users specify key components such as a particular 'DataModule' for input data specifications, a 'Dataset' for data file and label paths, an 'Architecture' for the neural network structure, an 'Objective' for the transfer learning strategy, and a 'Scenario' for training details. The framework supports class path parsing, allowing users to define parameters via class references, which can be particularly useful for complex configurations. To facilitate hyperparameter tuning, multiple variants of these parameters can be provided. The framework's workflow generator leverages this information to create Nextflow workflows, which orchestrate the execution of tasks across the computational infrastructure. This streamlined approach enables systematic exploration and efficient optimization of model parameters.
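The paper does not prescribe a concrete file syntax, so the sketch below renders such a configuration as an equivalent Python dictionary; all keys, class paths, and values are hypothetical stand-ins chosen for illustration:

```python
# Hypothetical TLIMB-style configuration; actual key names and file format may differ.
config = {
    "datamodule": {"class_path": "tlimb.data.CropCenterDataModule"},      # assumed path
    "dataset": {
        "class_path": "tlimb.data.BrainMRIDataset",                       # assumed path
        "init_args": {"image_dir": "/data/ukb/t1", "label_columns": ["age"]},
    },
    "architecture": {"class_path": "tlimb.models.ResNet3d18"},            # assumed path
    "objective": {"class_path": "tlimb.objectives.PretrainFinetune"},     # assumed path
    "scenario": {
        "max_epochs": 10,
        "learning_rate": [1e-4, 1e-3, 1e-2],  # multiple variants span a search space
        "seed": 42,
        "splits": {"train": 0.8, "val": 0.1, "test": 0.1},
    },
}
```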
TLIMB comes with a number of readily available models and configurations for its different components. Regarding architectures, it currently offers 3D ResNets of different depths [22], the 3D Simple Fully-Convolutional Network (SFCN) [14], as well as vision and Swin transformers, three highly popular families of imaging architectures. ResNet utilizes shortcut connections to enhance training performance, while SFCN is a lightweight 3D convolutional neural network specifically tailored to 3D neuroimaging data. Transformers are deep encoder-decoder stacks built from fully connected layers and self-attention. Several customization options, such as filter and kernel sizes and the addition of dropout layers, are available for each. The framework also integrates pre-processing, neuroimaging-domain-specific data augmentation, and data transformation techniques.

Regarding TL algorithms, our framework encompasses five methods: pretrain-fine-tune, multitask learning, self-supervised semi-supervised learning, elastic penalty, and unsupervised domain adaptation. Pretrain-fine-tune involves reusing a pre-trained network for a target task, while elastic penalty introduces an L² penalty to preserve learned features during fine-tuning. Multitask learning optimizes models by sharing representations between related tasks. Self-supervised semi-supervised learning leverages both labeled and unlabeled data. Unsupervised domain adaptation allows training deep models using labeled data from a source domain and unlabeled data from a target domain. An overview of these techniques can be found in [10].
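The elastic penalty can be pictured as an auxiliary objective in the spirit of [23]; the following is our own minimal formulation, not TLIMB's implementation:

```python
import torch
import torch.nn as nn

class ElasticPenalty:
    """L2-style penalty pulling fine-tuned weights back towards their
    pre-trained values, preserving source-task features (cf. [23])."""
    def __init__(self, pretrained: nn.Module, strength: float = 0.01):
        # snapshot of the pre-trained weights, detached from autograd
        self.reference = {name: p.detach().clone()
                          for name, p in pretrained.named_parameters()}
        self.strength = strength

    def __call__(self, model: nn.Module) -> torch.Tensor:
        penalty = 0.0
        for name, param in model.named_parameters():
            if name in self.reference:
                ref = self.reference[name].to(param.device)
                penalty = penalty + ((param - ref) ** 2).sum()
        return self.strength * penalty
```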
TLIMB's Objective Functions mirror the PyTorch Lightning training/validation/testing steps. Hyperparameter optimization is facilitated through grid search and random search. TLIMB comes with three directly usable DataModules, namely BaseDataModule, CropCenterDataModule, and BioImageDataModule. Additionally, several pre-defined Datasets from the Human Connectome Project are readily available, but researchers can effortlessly incorporate any image analysis dataset of their choice by utilizing the provided interface.







Table 1
Reduction in lines of source code for a simple pretrain-fine-tune scenario when moving from native PyTorch Lightning to our framework.

             Description                             Manual Implementation            Framework Implementation
             Total Source Lines                                   286                                34
             Data Module Definition                                81                                 -
             Dataset Definition                                    34                                34
             Lightning Module                                      51                                 -
             Rest (Losses, Architecture, Imports)                 125                                 -


Table 2
Comparison of execution times (on AMD 3970X 32-core / Nvidia GeForce 3090). We report average values of three runs, together with their standard deviation (in brackets). The reduction in execution time with our framework is mostly due to parallelisation of training and testing steps.

               Execution Workflow         Manual Implementation (s)             Framework Implementation (s)
               CPU only                            109.66 (± 0.47)                       79.33 (± 1.88)
               Single GPU                           72.66 (± 0.47)                       46.00 (± 0.00)



Nextflow as workflow manager

Nextflow is a mature and popular scientific workflow engine [7]. Workflows in Nextflow are written in a proper workflow language based on Groovy and are executed by an engine which controls data dependencies, maximises parallelism in task execution, and supports reproducibility through a sophisticated logging mechanism. Workflows can either be executed locally (non-distributed) by the system itself, or passed on to popular resource managers, such as Slurm or Kubernetes [25], for scheduling on arbitrarily large clusters. In TLIMB, we utilize Nextflow to assemble TL workflows from user-provided configurations into a workflow script. This script can then be executed in parallel and distributed across all supported infrastructures, significantly accelerating processing.
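The hand-off from TLIMB to Nextflow is conceptually a single call; the script name and profile below are placeholders, while `nextflow run` and `-profile` are standard Nextflow CLI usage:

```python
import subprocess

# Launch the generated workflow script; "slurm" stands for whichever executor
# profile the user has configured in Nextflow (local, Slurm, Kubernetes, ...).
subprocess.run(
    ["nextflow", "run", "generated_workflow.nf", "-profile", "slurm"],
    check=True,  # raise if the workflow engine reports a failure
)
```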
Experiments

For the evaluation of the TLIMB framework, we used T1-weighted brain images from the UK Biobank [12]. To streamline the evaluation process, we processed images by applying linear registration and extracting the central axial slices. This reduced the dimensionality of the data, allowing us to expedite the training process. We created three subsets of randomly sampled images: 10,000 for pre-training, 500 for fine-tuning, and 1,000 for the test set. Models were pre-trained on age regression, and fine-tuned on sex classification.
To assess usability improvements, we specified a simplified search space comprising two different neural network architectures (ResNet-18 and a Vision Transformer), three different learning rates (10⁻⁴, 10⁻³, 10⁻²), and an optional elastic penalty loss [23] as an advanced fine-tuning technique. Both models were pre-trained for 10 epochs and fine-tuned for one epoch. Such limited training time would be insufficient for real-world applications, but our aim here is to investigate the framework's usability and computational overhead rather than to achieve state-of-the-art accuracy.
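In TLIMB this search space is declared in the configuration file; the Python grid below merely illustrates its size and shape:

```python
from itertools import product

architectures = ["resnet18", "vision_transformer"]  # illustrative identifiers
learning_rates = [1e-4, 1e-3, 1e-2]
elastic_penalty = [False, True]                     # optional fine-tuning penalty

# 2 x 3 x 2 = 12 pipeline variants, each compiled into its own Nextflow task
variants = list(product(architectures, learning_rates, elastic_penalty))
print(len(variants))  # 12
```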
Each variant was implemented in two ways: manually using PyTorch with PyTorch Lightning, and through our TLIMB framework compiled into a Nextflow workflow. These implementations were then run in three different setups: manually without the framework, with the framework sequentially, and with the framework in parallel. The primary metrics for evaluation were the execution times and the lines of code required for each setup. Execution times are documented in Table 2, illustrating the comparison between running the processes with and without the framework, both sequentially and in parallel.
Additionally, we conducted a minimal set of experiments illustrating how TLIMB may be used in practice. On the same dataset and using the ResNet-18 architecture, we compared fine-tuning effectiveness for different numbers of frozen layers in the pre-trained model. Freezing lower layers of a pre-trained model reduces the number of trainable parameters and thus reduces the risk of overfitting during the fine-tuning process. Metrics for pre-training and fine-tuning performance are shown in Tables 3 and 4, respectively. TLIMB achieved the expected levels of accuracy, in line with other studies [5, 16].
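Layer freezing itself is plain PyTorch; a minimal sketch of freezing all parameters before a named boundary layer (Table 4 uses torchvision-style ResNet parameter names) could read:

```python
from typing import Optional
import torch.nn as nn

def freeze_up_to(model: nn.Module, boundary: Optional[str]) -> None:
    """Disable gradients for all parameters preceding `boundary` in module order;
    boundary=None freezes nothing (the "None" row in Table 4)."""
    if boundary is None:
        return
    for name, param in model.named_parameters():
        if name.startswith(boundary):
            break  # boundary reached: remaining parameters stay trainable
        param.requires_grad = False

# e.g., freeze_up_to(model, "layer4.0.conv1") on a ResNet-18 backbone
```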
Execution Time: The execution times indicate minimal to no computational overhead when using the TLIMB framework. The parallel execution with Nextflow significantly reduced the time compared to the sequential runs, showcasing the framework's scalability (see Table 2).

Lines of Code: A notable reduction in lines of code was observed when using TLIMB (see Table 1), emphasizing the ease of use and time savings in coding. The framework abstracted many of the repetitive tasks, such as setting up data loaders, model configurations, and hyperparameter tuning, which contributed to a more streamlined development process.

Prediction Performance: Although no exact replication of literature results was attempted at the time of writing, our preliminary results are compatible with expectations from the literature.






Table 3
Accuracy of a ResNet-18 predicting brain age from 2D images of the human brain from the UK Biobank. Models were trained from scratch. Reported results are the average and standard deviation over three training runs.

                                        Learning Rate            MSE               MAE
                                        0.01              31.3 (± 2.63)      4.45 (± 0.2)
                                        0.001             26.4 (± 0.17)       4.1 (± 0.03)
                                        0.0001           32.99 (± 1.14)       4.6 (± 0.08)
                                        0.00001          59.99 (± 1.63)      6.32 (± 0.09)


Table 4
Accuracy of a ResNet-18 predicting sex from 2D images of the human brain from the UK Biobank. Models were pre-trained on age prediction (see Table 3). The learning rate was fixed at 0.001. We show four variations in which increasing numbers of layers are kept frozen during fine-tuning. "None" refers to no frozen parameters.

                       Freeze up to Layer           CE               Accuracy      Trainable Parameters
                       None                    1.29 (± 0.3)     0.76 (± 0.04)               11.2 M
                       layer3.1.conv1          0.71 (± 0.1)     0.78 (± 0.01)                9.6 M
                       layer4.0.conv1          0.67 (± 0.3)     0.79 (± 0.01)                8.4 M
                       layer4.1.conv1          0.56 (± 0.0)     0.72 (± 0.0)                 4.7 M



Conclusion and outlook

In this study, we introduced TLIMB, a tailored implementation and evaluation platform for TL techniques in biomedical imaging applications. Guided by specific requirements, we opted for a comprehensive framework over a DSL. Our framework comprises two key components: firstly, a Python framework built upon PyTorch Lightning, facilitating diverse user-defined TL tasks; secondly, a workflow generator and executor ensuring scalability. We provide in-depth descriptions of both components, highlighting their functionalities and capabilities. To ascertain the effectiveness and utility of our framework, we applied it to the "brain-age" paradigm, in which the deviation of predicted brain age from chronological age serves as a metric for evaluating brain health. Our framework demonstrates minimal or no computational overhead, while significantly reducing the number of lines of code required. In the pursuit of refining our framework, we propose several avenues for future development. Firstly, we recommend the establishment of a standardized template to streamline the evaluation of TL methods. This template would simplify comparisons of results and methodologies among researchers, fostering a more cohesive and efficient research environment. Moreover, to enhance the efficiency of model tuning, we advocate the implementation of additional hyperparameter optimization methods within our framework. Specifically, techniques like Bayesian optimization can be incorporated to further improve model performance. Furthermore, to minimize manual intervention and improve the user experience, we suggest enhancing the workflow manager. This enhancement includes the addition of automatic ranking capabilities, which will facilitate a more efficient comparison and selection of the best-performing models, guided by predefined evaluation metrics.
                                                                                    for medical image analysis: A review”. In: Biocyber-
                                                                                    netics and Biomedical Engineering 42.1 (2022), pp. 79–
References

[1]  Alexander Alexandrov et al. "Implicit parallelism through deep language embedding". In: SIGMOD. 2015, pp. 47–61.
[2]  Rohit Bakshi et al. "MRI in multiple sclerosis: current status and future prospects". In: The Lancet Neurology 7.7 (2008), pp. 615–625.
[3]  Stuart Berg et al. "Ilastik: interactive machine learning for (bio)image analysis". In: Nature Methods 16.12 (2019), pp. 1226–1232.
[4]  James H Cole et al. "Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker". In: NeuroImage 163 (2017), pp. 115–124.
[5]  James H. Cole. "Multimodality neuroimaging brain-age in UK biobank: relationship to biomedical, lifestyle, and cognitive factors". In: Neurobiology of Aging 92 (Aug. 2020), pp. 34–42. issn: 0197-4580. doi: 10.1016/j.neurobiolaging.2020.03.014. url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7280786/ (visited on 03/04/2024).
[6]  P. Di Tommaso et al. "Nextflow enables reproducible computational workflows". In: Nat Biotechnol 35.4 (2017), pp. 316–319.
[7]  Paolo Di Tommaso et al. "Nextflow enables reproducible computational workflows". In: Nature Biotechnology 35.4 (2017), pp. 316–319.
[8]  Fabian Eitel et al. "Promises and pitfalls of deep neural networks in neuroimaging-based psychiatric research". In: Experimental Neurology 339 (2021), p. 113608.
[9]  Joan Giner-Miguelez, Abel Gómez, and Jordi Cabot. "A domain-specific language for describing machine learning datasets". In: Journal of Computer Languages 76 (2023), p. 101209.
[10] Padmavathi Kora et al. "Transfer learning techniques for medical image analysis: A review". In: Biocybernetics and Biomedical Engineering 42.1 (2022), pp. 79–107.
[11] Jiyoung Lee. "P-Hydra: Bridging Transfer Learning And Multitask Learning". In: Master Thesis, University of Friburg (2020).







[12]   Thomas J Littlejohns et al. “The UK Biobank imaging
       enhancement of 100,000 participants: rationale, data
       collection, management and future directions”. In:
       Nature communications 11.1 (2020), p. 2624.
[13]   Marjan Mernik, Jan Heering, and Anthony M. Sloane.
       “When and How to Develop Domain-Specific Lan-
       guages”. In: ACM Comput. Surv. 37.4 (Dec. 2005),
       pp. 316–344.
[14]   Han Peng et al. “Accurate brain age prediction with
       lightweight deep neural networks”. In: Medical image
       analysis 68 (2021), p. 101871.
[15]   Rangaraj M Rangayyan. Biomedical image analysis.
       CRC press, 2004.
[16]   Marc-Andre Schulz et al. “Performance reserves in
       brain-imaging-based phenotype prediction”. In: Cell
       Reports 43.1 (2024).
[17]   Arvind Sujeeth et al. “OptiML: an implicitly parallel
       domain-specific language for machine learning”. In:
       ICML. 2011, pp. 609–616.
[18]   Sergi Valverde et al. “Improving automated multiple
       sclerosis lesion segmentation with a cascaded 3D con-
       volutional neural network approach”. In: NeuroImage
       155 (2017).
[19]   Jesper E Van Engelen and Holger H Hoos. “A survey
       on semi-supervised learning”. In: Machine learning
       109.2 (2020), pp. 373–440.
[20]   Sandra Vieira, Walter HL Pinaya, and Andrea
       Mechelli. “Using deep learning to investigate the
       neuroimaging correlates of psychiatric and neurolog-
       ical disorders: Methods and applications”. In: Neu-
       roscience & Biobehavioral Reviews 74 (2017), pp. 58–
       75.
[21]   Karl Weiss, Taghi M Khoshgoftaar, and DingDing
       Wang. “A survey of transfer learning”. In: Journal of
       Big data 3.1 (2016), pp. 1–40.
[22]   Wanni Xu, You-Lei Fu, and Dongmei Zhu. “ResNet
       and Its Application to Medical Image Processing: Re-
       search Progress and Challenges”. In: Computer Meth-
       ods and Programs in Biomedicine (2023), p. 107660.
[23]   LI Xuhong, Yves Grandvalet, and Franck Davoine.
       “Explicit inductive bias for transfer learning with con-
       volutional networks”. In: International Conference on
       Machine Learning. PMLR. 2018, pp. 2825–2834.
[24]   Ziv Yaniv et al. “SimpleITK image-analysis note-
       books: a collaborative environment for education and
       reproducible research”. In: Journal of digital imaging
       31.3 (2018), pp. 290–303.
[25]   Naweiluo Zhou, Huan Zhou, and Dennis Hoppe.
       “Containerization for High Performance Computing
       Systems: Survey and Prospects”. In: IEEE Transac-
       tions on Software Engineering 49.4 (2022), pp. 2722–
       2740.


Acknowledgements and Funding
This work was funded by FONDA (DFG; SFB 1404; Project
ID: 414984028).



