<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Automatic Medical Concept Detection on Images: Dividing the Task into Smaller Ones Notebook for the ImageCLEFmedical Caption 2024. Contributions of the UACH-VisionLab Team</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Axel</forename><surname>Moncloa-Muro</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Ingeniería</orgName>
								<orgName type="institution">Universidad Autónoma de Chihuahua</orgName>
								<address>
									<addrLine>Circuito Universitario Campus II</addrLine>
									<postCode>31125</postCode>
									<settlement>Chihuahua</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Graciela</forename><surname>Ramirez-Alonso</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Ingeniería</orgName>
								<orgName type="institution">Universidad Autónoma de Chihuahua</orgName>
								<address>
									<addrLine>Circuito Universitario Campus II</addrLine>
									<postCode>31125</postCode>
									<settlement>Chihuahua</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Fernando</forename><surname>Martinez-Reyes</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Ingeniería</orgName>
								<orgName type="institution">Universidad Autónoma de Chihuahua</orgName>
								<address>
									<addrLine>Circuito Universitario Campus II</addrLine>
									<postCode>31125</postCode>
									<settlement>Chihuahua</settlement>
									<country key="MX">Mexico</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Automatic Medical Concept Detection on Images: Dividing the Task into Smaller Ones Notebook for the ImageCLEFmedical Caption 2024. Contributions of the UACH-VisionLab Team</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4C138C13D5F27F5E791A9F784BCECFD0</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:57+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Multi-label classification</term>
					<term>imbalanced data</term>
					<term>EfficientNet</term>
					<term>ImageCLEFmedical</term>
					<term>ensemble</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes the approach proposed by the UACH-VisionLab team for the ImageCLEFmedical Concept Detection subtask 2024. The objective of this subtask is to automatically assign medical concepts to images. In particular, 1,945 distinct Concept Unique Identifiers (CUIs) must be associated with medical images, which represents a multi-label classification (MLC) problem. In this context, the ImageCLEFmedical Concept Detection subtask provides a multi-label dataset in which a medical image may carry multiple descriptive labels. Class imbalance in MLC poses a challenge: the samples and their corresponding labels are not uniformly distributed over the dataset. To address this challenge, our approach employs an ensemble of five EfficientNet B0 (ENB0) neural architectures. An initial ENB0 network classifies each image over all possible labels. Based on its classification results, we create subgroups of multi-label datasets around specific CUIs, such as ultrasonography, bone structure of cranium, angiogram, and lower extremity. A separate ENB0 architecture is trained for each of these subgroups. Finally, the outputs of these five neural architectures are combined to generate the final prediction. Our proposal ranks 5th in the ImageCLEFmedical Concept Detection subtask, achieving an F1-score of 0.59. The code implementing our proposal is available at https://github.com/axelm11/CLEF-ImageCLEF-2024.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>ImageCLEF is an ongoing evaluation event launched in 2003 as part of the Cross Language Evaluation Forum (CLEF) <ref type="bibr" target="#b0">[1]</ref>. In 2024, the ImageCLEFmedical Lab presents the 8th edition of the automatic image captioning task, which consists of two subtasks: concept detection and caption prediction <ref type="bibr" target="#b1">[2]</ref>. The objective of the concept detection subtask is to identify the Unified Medical Language System (UMLS) concepts of each image. These concepts are unique identifiers assigned to different medical-related terms. The training, validation, and test datasets for this subtask comprise 70,108, 9,972, and 17,237 images, respectively. This subtask is considered a multi-label classification problem, where 1,945 different concepts must be detected and a single medical image can be associated with multiple labels. The dataset is highly imbalanced: the four most prevalent concepts occur 24,227, 19,363, 11,296, and 9,870 times in the training set, in contrast to 306 classes that have ten or fewer images. For these reasons, this dataset is particularly challenging and complex, providing an ideal setting for developing new deep learning (DL) approaches: robust solutions must be capable of identifying the different concepts in each medical image.</p><p>In this work, we present the approach submitted by the UACH-VisionLab team for the ImageCLEFmedical Concept Detection subtask. This proposal consists of an ensemble of five deep learning models based on the EfficientNet B0 (ENB0) architecture <ref type="bibr" target="#b2">[3]</ref>. An initial ENB0 associates each image with 1,945 possible medical concepts. 
Given the high imbalance of the dataset, four additional ENB0 models are trained to identify specific concepts and improve the performance of our proposal.</p><p>The rest of this paper is organized as follows: Section 2 presents a general description of the ImageCLEFmedical dataset, Section 3 introduces our approach, and Sections 4 and 5 provide results and conclusions.</p></div>
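As a minimal sketch of the base step described above — turning a 1,945-dimensional multi-label output into a set of predicted CUIs — a sigmoid-plus-threshold decision rule could look like this (the 0.5 threshold and the toy four-concept vocabulary and logits are illustrative assumptions, not values from the paper):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_cuis(logits, cui_vocab, threshold=0.5):
    """Turn raw multi-label logits into a set of predicted CUIs.

    Each output unit is squashed independently with a sigmoid;
    every unit whose probability reaches the threshold contributes
    its CUI to the prediction (multi-label, not softmax).
    """
    return {cui for logit, cui in zip(logits, cui_vocab)
            if sigmoid(logit) >= threshold}

# Toy example with a 4-concept vocabulary (hypothetical logits).
vocab = ["C0041618", "C0040405", "C1306645", "C0024485"]
print(predict_cuis([2.1, -1.3, 0.4, -3.0], vocab))
```

In the full task the vocabulary would have 1,945 entries and the logits would come from the ENB0 backbone's final layer.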
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Dataset</head><p>The multimodal data utilized in the ImageCLEFmedical Lab is derived from the Radiology Object in Context version 2 (ROCOv2) dataset <ref type="bibr" target="#b3">[4]</ref>. This dataset consists of radiological images accompanied by their respective medical concepts and captions. It comprises three distinct subsets: the training set, the validation set, and the test set. The training and validation sets are accompanied by comma-separated value (CSV) files, which contain the medical image identifiers and the corresponding Concept Unique Identifiers (CUIs). The objective of the concept detection task is to automatically assign the corresponding CUIs to the different images of the dataset. Figure <ref type="figure" target="#fig_0">1</ref> shows a visual representation of the medical concepts associated with the different CUIs, where the size of each word is proportional to its frequency. Among the most frequently occurring concepts are X-Ray Computed Tomography, Plain x-ray, Ultrasonography, Magnetic Resonance Imaging, and Chest, to name a few.</p><p>The task of assigning the 1,945 possible medical concepts to each image in the ImageCLEFmedical dataset is highly challenging. For instance, images obtained from the same imaging modality may describe different conditions affecting different parts of the body. This is exemplified in Figure <ref type="figure" target="#fig_1">2</ref>, where images corresponding to the same modality, X-Ray Computed Tomography, show different parts of the body emphasizing different medical concepts.</p><p>Another case is presented in Figure <ref type="figure" target="#fig_2">3</ref>, where different image modalities present the same medical CUI. In this case, an angiogram, a plain x-ray, and magnetic resonance imaging are all associated with the CUI heart. 
Therefore, a single CUI can be present in different image modalities.</p><p>Figure <ref type="figure" target="#fig_3">4</ref> shows an additional challenging scenario, where images that appear to be highly similar may, in fact, have different CUIs.</p></div>
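A minimal loader for the kind of CSV files described above, pairing image identifiers with their CUIs, might look like this (the column names and the ';' separator between CUIs are assumptions about the file layout, not taken from the paper):

```python
import csv
import io

def load_multi_hot(csv_text, cui_vocab):
    """Parse 'ID,CUIs' rows into {image_id: multi-hot label vector}.

    The CUIs of one image are assumed to be separated by ';'.
    CUIs outside the vocabulary are silently ignored.
    """
    index = {cui: i for i, cui in enumerate(cui_vocab)}
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        vec = [0] * len(cui_vocab)
        for cui in row["CUIs"].split(";"):
            if cui in index:
                vec[index[cui]] = 1
        labels[row["ID"]] = vec
    return labels

# Toy two-image file with a 3-concept vocabulary (hypothetical values).
text = "ID,CUIs\nimg1,C0041618;C0018787\nimg2,C0040405\n"
vocab = ["C0041618", "C0040405", "C0018787"]
print(load_multi_hot(text, vocab))
```

The multi-hot vectors produced this way are the targets a multi-label classifier such as ENB0 is trained against.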
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methods</head><p>Our proposal is based on the baseline model provided by the ImageCLEFmedical 2024 organizers, an EfficientNet B0 (ENB0) neural architecture. Our team evaluated several neural architectures, such as ResNet <ref type="bibr" target="#b4">[5]</ref>, DenseNet <ref type="bibr" target="#b5">[6]</ref>, the Vision Transformer (ViT) <ref type="bibr" target="#b6">[7]</ref>, and the Convolutional vision Transformer (CvT) <ref type="bibr" target="#b7">[8]</ref>; however, the one proposed by the organizers yielded the best F1-scores on the validation set. The results of the ENB0 model indicate that certain CUIs achieve highly accurate F1 performance while others achieve zero. This discrepancy is primarily attributed to the multi-label class imbalance issue inherent in real-world application datasets <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13]</ref>. Table <ref type="table" target="#tab_0">1</ref> presents the eight best F1-score performances. Based on these results, we select specific CUIs to create four multi-label subgroups to train and validate separate ENB0 models. The number of support samples and the visual similarity of the images were considered when selecting these CUIs. For example, the categories bone structure of cranium, lower extremity, and angiogram exhibit a comparable number of samples. In contrast, ultrasonography is a particularly interesting image modality, given the homogeneity of the images within this subgroup.</p><p>Figure <ref type="figure" target="#fig_4">5</ref> shows a block diagram of the proposed approach. First, an initial ENB0 model is trained to classify all the images of the training dataset over all the possible CUIs of the challenge. The output of this model is a vector of dimensionality 1,945. 
Then, four subgroups are defined based on the classification results for the ultrasonography, bone structure of cranium, lower extremity, and angiogram CUIs. If an image is classified with any of these four concepts, it is considered part of the corresponding subgroup. Once the subgroups have been defined, a separate ENB0 model is trained on each of them to identify the medical concepts it contains. During training, we eliminate CUIs with very high or very low frequency to avoid severe class imbalance issues. For example, plain x-ray is a very common concept and is therefore eliminated from all the subgroups. For low-frequency concepts, we keep only CUIs with a support set of at least 50 samples, and we limit each model to a maximum of 20 concepts to predict.</p><p>The proposed methodology is then as follows. If the initial ENB0 identifies that the input medical image contains a CUI associated with ultrasonography, bone structure of cranium, lower extremity, or angiogram, the ENB0 model trained on the corresponding subgroup also analyzes the input image and produces an output prediction. All predictions identified by the second ENB0 are included in the initial prediction. In other words, four ENB0 neural architectures are employed to enhance the outcome of the initial model. To ensure a precise final prediction, care must be taken in locating each CUI, as the output dimensionalities of these models differ. Figure <ref type="figure" target="#fig_5">6</ref> illustrates this procedure. In this example, the angiogram concept is identified, and the prediction of the model trained on this specific subgroup is used to generate the final prediction result. 
In this case, the second ENB0 model detects four new concepts that are included in the final prediction.</p><p>Once we define the four subgroups, we analyze the relationships between the different CUIs they contain. Figure <ref type="figure" target="#fig_6">7</ref> shows the chord diagram of the angiogram subgroup, illustrating the relationships between its CUIs. The nodes represent the different concepts, and the width of each edge is proportional to the strength of the relationship between the two nodes it connects. Table <ref type="table" target="#tab_1">2</ref> provides a more detailed overview of the concepts within this subgroup and the support set of each. The most frequent concepts are anterior descending branch of left coronary artery, stent device, right coronary artery structure, and stenosis. As can be observed in Figure <ref type="figure" target="#fig_6">7</ref>, the anterior descending branch of left coronary artery has a strong relationship with stenosis, pulmonary artery structure, and structure of circumflex branch of left coronary artery. Furthermore, it is noteworthy that right coronary artery structure is a frequent medical concept in this subgroup that exhibits a consistent relationship with the majority of the other concepts, with the exception of pseudoaneurysm.</p><p>Figure <ref type="figure" target="#fig_7">8</ref> shows the chord diagram of the medical concept bone structure of cranium. Table <ref type="table" target="#tab_2">3</ref> lists the canonical names of this subgroup and their support sets. As can be observed, mandible is the most common medical concept. It has a strong relationship with permanent premolar tooth and maxilla, and the concepts tooth structure, tooth root structure, and structure of wisdom tooth are also related to it. In contrast, X-Ray Computed Tomography is only slightly related to maxilla and to the CUI C1266909 (a CUI with no canonical name associated with it). 
Figure <ref type="figure" target="#fig_8">9</ref> and Table <ref type="table" target="#tab_3">4</ref> show the chord diagram and the CUIs, canonical names, and support sets of the lower extremity subgroup. Femur is the most frequent concept, with a strong relationship with cerebral cortex, axis vertebra, and head of femur. We would like to point out that we are not sure whether cerebral cortex is the correct canonical name for C0007776. Furthermore, it can be observed that the medical concepts bone plates and screw are closely related.</p><p>Ultrasonography is our last subgroup. Figure <ref type="figure" target="#fig_9">10</ref> shows its chord diagram, and Table <ref type="table" target="#tab_4">5</ref> presents the canonical names and support sets of this subgroup. Left ventricular structure and right ventricular structure are the most common concepts and are strongly related to each other. Right atrial structure is another common concept, associated with left ventricular structure, right ventricular structure, and left atrial structure.</p></div>
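The merging step described above — where a subgroup model's output vector has a different dimensionality than the base model's 1,945-dimensional vector — can be sketched as a CUI-indexed union (the vocabularies and binary outputs below are invented for illustration):

```python
def merge_predictions(base_pred, global_vocab, sub_pred, sub_vocab):
    """Union a subgroup model's positive detections into the base prediction.

    base_pred : binary vector over the full concept vocabulary.
    sub_pred  : binary vector over the subgroup's smaller vocabulary;
                its indices differ from the global ones, so each positive
                detection is mapped back to its global position by CUI.
    """
    position = {cui: i for i, cui in enumerate(global_vocab)}
    merged = list(base_pred)
    for flag, cui in zip(sub_pred, sub_vocab):
        if flag:
            merged[position[cui]] = 1  # add subgroup detection to base output
    return merged

# Toy vocabularies (hypothetical): full set of 5 CUIs, a 2-CUI angiogram subgroup.
global_vocab = ["C0002978", "C0226032", "C1261316", "C1261287", "C0018787"]
sub_vocab = ["C0226032", "C1261287"]
base = [1, 0, 0, 0, 0]  # base model detected only the angiogram concept
sub = [1, 1]            # subgroup model adds two coronary concepts
print(merge_predictions(base, global_vocab, sub, sub_vocab))
```

Because the merge only ever adds positives, it can raise recall while lowering precision, which matches the validation behavior reported in the conclusions.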
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>This working note paper presents the approach and results of the UACH-VisionLab team on the ImageCLEFmedical 2024 Concept Detection subtask. An analysis of the results yielded by the baseline code provided by the organizers reveals a significant imbalance issue in the context of multi-label classification. Therefore, we consider it appropriate to define subgroups with the aim of reducing this class imbalance problem. The medical concepts ultrasonography, bone structure of cranium, lower extremity, and angiogram are identified as appropriate for constructing these subgroups. Each subgroup is trained separately, and its results are merged with those produced by an initial ENB0 neural model. Upon examination of the validation results obtained across the various experiments, we observe an increase in the recall metric. This indicates that our approach reduces the number of false negative detections, which is the behavior we are looking for in class-imbalanced datasets. However, it also results in an increase in the number of false positives, decreasing the precision metric. The only subgroup that does not produce an improvement in the metric results is bone structure of cranium. Further investigation is required to understand this behavior.</p><p>A chord diagram of the formed subgroups provides a more comprehensive understanding of the diverse concepts within them and their interconnections. Unfortunately, due to time constraints, we were unable to incorporate this crucial knowledge into the training of the models. 
However, we consider it to be of paramount importance, and we intend to incorporate this information into future approaches.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Word cloud of the medical concepts present on the ImageCLEFmedical dataset.</figDesc><graphic coords="2,94.57,469.26,406.16,217.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Images obtained with the same imaging modality yet showing different anatomical regions of the body emphasizing different medical concepts. CC BY-NC [Nghiem et al. (2014)], CC BY-NC [Unterstell et al. (2013)], CC BY [Muacevic et al. (2021)].</figDesc><graphic coords="3,83.28,65.61,428.71,101.83" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The CUI associated with the heart concept is present in different image modalities. CC BY [Lacalzada-Almeida et al. (2018)], CC BY-NC [Biharas Monfared et al. (2015)], CC BY [Bourfiss et al. (2017)].</figDesc><graphic coords="3,83.28,225.31,428.71,140.49" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Similar images with different CUIs. CC BY [Yuasa et al. (2015)].</figDesc><graphic coords="3,83.28,423.66,428.73,129.17" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Block diagram of our proposal. An initial ENB0 detects all possible labels. If one of these labels corresponds to the concepts ultrasonography, bone structure of cranium, lower extremity or angiogram, the initial prediction will be improved with the output of the corresponding ENB0 model. CC BY-NC [Yoon et al. (2018)], CC BY [Alwi et al. (2008)], CC BY-NC [Bagewadi et al. (2015)], CC BY [Awad et al. (2021)].</figDesc><graphic coords="4,94.57,416.26,406.14,273.94" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6:Example prediction of our proposal. The initial ENB0 model generates an output vector with all possible predictions. In this example, the angiogram concept is detected, then the output of a second ENB0 model is incorporated into the initial prediction. Special care must be taken with regard to the dimensions of the output vector of each model.</figDesc><graphic coords="5,105.84,356.50,383.59,352.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: CUIs and canonical names relationship in the Angiogram subgroup.</figDesc><graphic coords="6,94.57,65.61,406.14,293.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: CUIs and canonical names relationship in the Bone Structure of Cranium subgroup.</figDesc><graphic coords="7,105.84,65.61,383.59,321.89" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: CUIs and canonical names relationship in the Lower Extremity subgroup.</figDesc><graphic coords="8,105.84,65.60,383.59,317.82" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 10 :</head><label>10</label><figDesc>Figure 10: CUIs and canonical names relationship in the Ultrasonography subgroup.</figDesc><graphic coords="9,94.57,65.60,406.14,312.04" type="bitmap" /></figure>
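The relationships visualized in the chord diagrams above correspond to pairwise co-occurrence counts of CUIs on the same image; such counts can be computed with a short routine (a sketch assuming per-image CUI lists are available — the sample images are invented):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(image_labels):
    """Count, over all images, how often each unordered CUI pair co-occurs.

    image_labels: iterable of per-image CUI lists. In a chord diagram,
    edge widths are proportional to these pair counts.
    """
    pairs = Counter()
    for cuis in image_labels:
        # sorted() makes each unordered pair a canonical (a, b) key
        pairs.update(combinations(sorted(set(cuis)), 2))
    return pairs

# Three hypothetical images from the angiogram subgroup.
images = [
    ["C0226032", "C1261287"],               # LAD + stenosis
    ["C0226032", "C1261287", "C0034052"],   # LAD + stenosis + pulmonary artery
    ["C1261316"],                           # right coronary artery alone
]
counts = cooccurrence_counts(images)
print(counts[("C0226032", "C1261287")])
```

Counts of this kind could also serve as the label-dependency prior that the conclusions propose incorporating into future training.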
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Canonical name, CUI, F1-score, and support set of the top eight best classification results obtained with an ENB0 neural architecture.</figDesc><table><row><cell>Canonical name</cell><cell>CUI</cell><cell cols="2">F1-score Support</cell></row><row><cell>Ultrasonography</cell><cell>C0041618</cell><cell>0.9943</cell><cell>1,606</cell></row><row><cell cols="2">X-Ray Computed Tomography C0040405</cell><cell>0.9737</cell><cell>3,625</cell></row><row><cell>Plain x-ray</cell><cell>C1306645</cell><cell>0.9551</cell><cell>2,741</cell></row><row><cell cols="2">Magnetic Resonance Imaging C0024485</cell><cell>0.9535</cell><cell>1,437</cell></row><row><cell>Bone structure of cranium</cell><cell>C0037303</cell><cell>0.9296</cell><cell>393</cell></row><row><cell>Lower Extremity</cell><cell>C0023216</cell><cell>0.8411</cell><cell>463</cell></row><row><cell>Angiogram</cell><cell>C0002978</cell><cell>0.8366</cell><cell>421</cell></row><row><cell>Upper Extremity</cell><cell>C1140618</cell><cell>0.8060</cell><cell>178</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>CUIs, canonical names, and support set of the Angiogram subgroup.</figDesc><table><row><cell>CUI</cell><cell>Canonical Name</cell><cell>Support</cell></row><row><cell cols="2">C0226032 Anterior descending branch of left coronary artery</cell><cell>448</cell></row><row><cell cols="2">C0038257 Stent, device</cell><cell>355</cell></row><row><cell cols="2">C1261316 Right coronary artery structure</cell><cell>302</cell></row><row><cell cols="2">C1261287 Stenosis</cell><cell>300</cell></row><row><cell cols="2">C0034052 Pulmonary artery structure</cell><cell>258</cell></row><row><cell cols="2">C0085590 Catheter device</cell><cell>231</cell></row><row><cell cols="2">C1947917 Occluded</cell><cell>229</cell></row><row><cell cols="2">C0001168 Complete obstruction</cell><cell>200</cell></row><row><cell cols="2">C0002940 Aneurysm</cell><cell>194</cell></row><row><cell cols="2">C1510412 Pseudoaneurysm</cell><cell>185</cell></row><row><cell cols="2">C0226037 Structure of circumflex branch of left coronary artery</cell><cell>156</cell></row><row><cell cols="2">C0018787 Heart</cell><cell>145</cell></row><row><cell cols="2">C0042591 Vessel Positions</cell><cell>134</cell></row><row><cell cols="2">C1261082 Left coronary artery structure</cell><cell>129</cell></row><row><cell cols="2">C0016169 Pathologic fistula</cell><cell>126</cell></row><row><cell cols="2">C0205097 Caudal</cell><cell>111</cell></row><row><cell cols="2">C1275670 Collateral branch of vessel</cell><cell>104</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>CUIs, canonical names, and support set of the Bone Structure of Cranium subgroup.</figDesc><table><row><cell>CUI</cell><cell>Canonical Name</cell><cell>Support</cell></row><row><cell cols="2">C0024687 Mandible</cell><cell>472</cell></row><row><cell cols="2">C0040426 Tooth structure</cell><cell>273</cell></row><row><cell cols="2">C0024947 Maxilla</cell><cell>265</cell></row><row><cell cols="2">C1266909 -</cell><cell>174</cell></row><row><cell cols="2">C0040452 Tooth root structure</cell><cell>172</cell></row><row><cell cols="2">C0021102 Implants</cell><cell>171</cell></row><row><cell cols="2">C1704302 Permanent premolar tooth</cell><cell>140</cell></row><row><cell cols="2">C0026369 Structure of wisdom tooth</cell><cell>81</cell></row><row><cell cols="2">C1947917 Occluded</cell><cell>67</cell></row><row><cell cols="2">C0447274 Entire maxillary right lateral incisor tooth</cell><cell>61</cell></row><row><cell cols="2">C0040405 X-Ray Computed Tomography</cell><cell>61</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>CUIs, canonical names, and support set of the Lower Extremity subgroup.</figDesc><table><row><cell>CUI</cell><cell>Canonical Name</cell><cell>Support</cell></row><row><cell cols="2">C0015811 Femur</cell><cell>318</cell></row><row><cell cols="2">C0301559 Screw</cell><cell>119</cell></row><row><cell cols="2">C0030797 Pelvis</cell><cell>116</cell></row><row><cell cols="2">C0206207 Joint Capsule</cell><cell>103</cell></row><row><cell cols="2">C1266909 -</cell><cell>102</cell></row><row><cell cols="2">C0015813 Head of femur</cell><cell>93</cell></row><row><cell cols="2">C4281598 Structure of right knee region</cell><cell>91</cell></row><row><cell cols="2">C0524470 Right hip region structure</cell><cell>83</cell></row><row><cell cols="2">C0007776 Cerebral cortex</cell><cell>78</cell></row><row><cell cols="2">C1261192 Ankle region</cell><cell>77</cell></row><row><cell cols="2">C0005971 Bone plates</cell><cell>75</cell></row><row><cell cols="2">C0524471 Structure of left hip</cell><cell>74</cell></row><row><cell cols="2">C0004457 Axis vertebra</cell><cell>72</cell></row><row><cell cols="2">C0021102 Implants</cell><cell>69</cell></row><row><cell cols="2">C4281599 Structure of left knee region</cell><cell>64</cell></row><row><cell cols="2">C0025584 Metatarsal bone structure</cell><cell>50</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>CUIs, canonical names, and support set of the Ultrasonography subgroup.</figDesc><table><row><cell>CUI</cell><cell>Canonical Name</cell><cell>Support</cell></row><row><cell cols="2">C0225897 Left ventricular structure</cell><cell>671</cell></row><row><cell cols="2">C0225883 Right ventricular structure</cell><cell>538</cell></row><row><cell cols="2">C0225860 Left atrial structure</cell><cell>380</cell></row><row><cell cols="2">C0205207 Cystic</cell><cell>340</cell></row><row><cell cols="2">C0018827 Heart Ventricle</cell><cell>332</cell></row><row><cell cols="2">C0225844 Right atrial structure</cell><cell>319</cell></row><row><cell cols="2">C0003483 Aorta</cell><cell>294</cell></row><row><cell cols="2">C0018792 Heart Atrium</cell><cell>278</cell></row><row><cell cols="2">C0031039 Pericardial effusion</cell><cell>253</cell></row><row><cell cols="2">C0026264 Mitral Valve</cell><cell>247</cell></row><row><cell cols="2">C0444611 Fluid behavior</cell><cell>241</cell></row><row><cell cols="2">C0023884 Liver</cell><cell>237</cell></row><row><cell cols="2">C0087086 Thrombus</cell><cell>235</cell></row><row><cell cols="2">C1269894 Entire left atrium</cell><cell>233</cell></row><row><cell cols="2">C0018787 Heart</cell><cell>214</cell></row><row><cell cols="2">C0003501 Aortic valve structure</cell><cell>207</cell></row><row><cell cols="2">C0016976 Gallbladder</cell><cell>206</cell></row><row><cell cols="2">C0027551 Needle device</cell><cell>193</cell></row><row><cell cols="2">C0042149 Uterus</cell><cell>190</cell></row><row><cell cols="2">C0028259 Nodule</cell><cell>190</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 6</head><label>6</label><figDesc>Test results of the Concept Detection subtask on the ImageCLEFmedical Concept Lab 2024. Two runs were submitted by our team. The first run uses a drop path rate of 0.2 while the second uses a drop path rate of 0.3, with a weight decay factor of 1e-5.</figDesc><table><row><cell>Team</cell><cell cols="2">F1-score Secondary F1-score</cell></row><row><cell>1st run -UACH-VisionLab</cell><cell>0.59876</cell><cell>0.93631</cell></row><row><cell>2nd run -UACH-VisionLab</cell><cell>0.52921</cell><cell>0.84224</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 7</head><label>7</label><figDesc>Comparison of precision, recall, and F1-score across the different approaches with the validation dataset. The Base model corresponds to employing only one ENB0 model, Base+LE incorporates the training of the lower extremity (LE) subgroup, Base+LE+Angio includes the training of the angiogram subgroup, Base+LE+Angio+Ultrasono combines the LE and angiogram subgroups with the ultrasonography, and Base+LE+Angio+Ultrasono+Cranium integrates the bone structure of cranium subgroup. A green highlighting is related to a metric improvement, whereas a yellow highlight indicates a metric decrease.</figDesc><table><row><cell></cell><cell>Base</cell><cell></cell><cell></cell><cell>Base+LE</cell><cell></cell><cell cols="3">Base+LE+Angio</cell><cell cols="6">Base+LE+Angio+Ultrasono Base+LE+Angio+Ultrasono+Cranium</cell><cell></cell></row><row><cell cols="14">precision recall f1-score precision recall f1-score precision recall f1-score precision recall f1-score precision recall</cell><cell>f1-score</cell><cell>CUI</cell><cell>Canonical Name</cell></row><row><cell>0.4583</cell><cell>0.0873</cell><cell>0.1467</cell><cell>0.4583</cell><cell>0.0873</cell><cell>0.1467</cell><cell>0.4583</cell><cell>0.0873</cell><cell>0.1467</cell><cell>0.2268</cell><cell>0.1746</cell><cell>0.1973</cell><cell>0.2268</cell><cell>0.1746</cell><cell>0.1973</cell><cell cols="2">C0023884 Liver</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.1591</cell><cell>0.1148</cell><cell>0.1333</cell><cell>0.1167</cell><cell>0.1148</cell><cell>0.1157</cell><cell cols="2">C0018792 Heart 
Atrium</cell></row><row><cell>1.0000</cell><cell>0.0130</cell><cell>0.0256</cell><cell>1.0000</cell><cell>0.0130</cell><cell>0.0256</cell><cell>1.0000</cell><cell>0.0130</cell><cell>0.0256</cell><cell>0.2857</cell><cell>0.1299</cell><cell>0.1786</cell><cell>0.2857</cell><cell>0.1299</cell><cell>0.1786</cell><cell cols="2">C0225844 Right atrial structure</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0909</cell><cell>0.0435</cell><cell>0.0588</cell><cell>0.0909</cell><cell>0.0435</cell><cell>0.0588</cell><cell>0.0909</cell><cell>0.0435</cell><cell>0.0588</cell><cell>0.0909</cell><cell>0.0435</cell><cell>0.0588</cell><cell cols="2">C0524471 Structure of left hip</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.1264</cell><cell>0.1392</cell><cell>0.1325</cell><cell>0.1264</cell><cell>0.1392</cell><cell>0.1325</cell><cell>0.1264</cell><cell>0.1392</cell><cell>0.1325</cell><cell>0.1264</cell><cell>0.1392</cell><cell>0.1325</cell><cell cols="2">C0015811 Femur</cell></row><row><cell>0.2174</cell><cell>0.0500</cell><cell>0.0813</cell><cell>0.2174</cell><cell>0.0500</cell><cell>0.0813</cell><cell>0.2174</cell><cell>0.0500</cell><cell>0.0813</cell><cell>0.1739</cell><cell>0.1200</cell><cell>0.1420</cell><cell>0.1739</cell><cell>0.1200</cell><cell>0.1420</cell><cell cols="2">C0003483 Aorta</cell></row><row><cell>0.4286</cell><cell>0.0361</cell><cell>0.0667</cell><cell>0.4286</cell><cell>0.0361</cell><cell>0.0667</cell><cell>0.2400</cell><cell>0.1446</cell><cell>0.1805</cell><cell>0.2400</cell><cell>0.1446</cell><cell>0.1805</cell><cell>0.2400</cell><cell>0.1446</cell><cell>0.1805</cell><cell cols="2">C0038257 Stent, 
device</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.3571</cell><cell>0.1087</cell><cell>0.1667</cell><cell>0.3571</cell><cell>0.1087</cell><cell>0.1667</cell><cell>0.3571</cell><cell>0.1087</cell><cell>0.1667</cell><cell cols="2">C0205097 Caudal</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0455</cell><cell>0.0179</cell><cell>0.0256</cell><cell>0.0455</cell><cell>0.0179</cell><cell>0.0256</cell><cell>0.0455</cell><cell>0.0179</cell><cell>0.0256</cell><cell>0.0455</cell><cell>0.0179</cell><cell>0.0256</cell><cell cols="2">C0206207 Joint Capsule</cell></row><row><cell>0.5000</cell><cell>0.0154</cell><cell>0.0299</cell><cell>0.1667</cell><cell>0.0462</cell><cell>0.0723</cell><cell>0.1667</cell><cell>0.0462</cell><cell>0.0723</cell><cell>0.1667</cell><cell>0.0462</cell><cell>0.0723</cell><cell>0.1667</cell><cell>0.0462</cell><cell>0.0723</cell><cell cols="2">C0301559 Screw</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.2500</cell><cell>0.1351</cell><cell>0.1754</cell><cell>0.2500</cell><cell>0.1351</cell><cell>0.1754</cell><cell cols="2">C0026264 Mitral Valve</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.6000</cell><cell>0.1111</cell><cell>0.1875</cell><cell>0.6000</cell><cell>0.1111</cell><cell>0.1875</cell><cell>0.6000</cell><cell>0.1111</cell><cell>0.1875</cell><cell cols="2">C0226037 Structure of circumflex</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell>branch of left coronary 
artery</cell></row><row><cell>1.0000</cell><cell>0.0500</cell><cell>0.0952</cell><cell>1.0000</cell><cell>0.0500</cell><cell>0.0952</cell><cell>0.3333</cell><cell>0.1000</cell><cell>0.1538</cell><cell>0.3333</cell><cell>0.1000</cell><cell>0.1538</cell><cell>0.3333</cell><cell>0.1000</cell><cell>0.1538</cell><cell cols="2">C1275670 Collateral branch of vessel</cell></row><row><cell>0.5769</cell><cell>0.1282</cell><cell>0.2098</cell><cell>0.5769</cell><cell>0.1282</cell><cell>0.2098</cell><cell>0.5769</cell><cell>0.1282</cell><cell>0.2098</cell><cell>0.3095</cell><cell>0.3333</cell><cell>0.3210</cell><cell>0.3047</cell><cell>0.3333</cell><cell>0.3184</cell><cell cols="2">C0225883 Right ventricular structure</cell></row><row><cell>0.2500</cell><cell>0.0156</cell><cell>0.0294</cell><cell>0.2500</cell><cell>0.0156</cell><cell>0.0294</cell><cell>0.2500</cell><cell>0.0156</cell><cell>0.0294</cell><cell>0.1648</cell><cell>0.2344</cell><cell>0.1935</cell><cell>0.1042</cell><cell>0.2344</cell><cell>0.1442</cell><cell cols="2">C0042149 Uterus</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.1609</cell><cell>0.1359</cell><cell>0.1474</cell><cell>0.1609</cell><cell>0.1359</cell><cell>0.1474</cell><cell cols="2">C0018827 Heart Ventricle</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.1364</cell><cell>0.1034</cell><cell>0.1176</cell><cell>0.1364</cell><cell>0.1034</cell><cell>0.1176</cell><cell>0.1364</cell><cell>0.1034</cell><cell>0.1176</cell><cell cols="2">C1510412 
Pseudoaneurysm</cell></row><row><cell>0.2308</cell><cell>0.1429</cell><cell>0.1765</cell><cell>0.2000</cell><cell>0.1667</cell><cell>0.1818</cell><cell>0.2000</cell><cell>0.1667</cell><cell>0.1818</cell><cell>0.2000</cell><cell>0.1667</cell><cell>0.1818</cell><cell>0.2000</cell><cell>0.1667</cell><cell>0.1818</cell><cell cols="2">C0015813 Head of femur</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0667</cell><cell>0.0288</cell><cell>0.0403</cell><cell>0.0694</cell><cell>0.0481</cell><cell>0.0568</cell><cell cols="2">C0087086 Thrombus</cell></row><row><cell>0.7143</cell><cell>0.2353</cell><cell>0.3540</cell><cell>0.7143</cell><cell>0.2353</cell><cell>0.3540</cell><cell>0.7143</cell><cell>0.2353</cell><cell>0.3540</cell><cell>0.4359</cell><cell>0.4000</cell><cell>0.4172</cell><cell>0.1545</cell><cell>0.4235</cell><cell>0.2264</cell><cell cols="2">C0031039 Pericardial effusion</cell></row><row><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.0000</cell><cell>0.1000</cell><cell>0.0208</cell><cell>0.0345</cell><cell>0.1000</cell><cell>0.0208</cell><cell>0.0345</cell><cell>0.1000</cell><cell>0.0208</cell><cell>0.0345</cell><cell cols="2">C0042591 Vessel Positions</cell></row></table></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head><p>All the neural models were trained on an NVIDIA GeForce RTX 3080 Ti 12GB GPU using the PyTorch framework and the Adam optimizer, with an initial learning rate of 1e-3 and a batch size of 64.</p><p>Table <ref type="table">6</ref> shows the results of our team, UACH-VisionLab, on the test partition of the dataset. These results were provided by the ImageCLEFmedical Lab 2024 organizers. The F1-score is the harmonic mean of precision and recall. A secondary F1-score was calculated over a manually curated subset of concepts. Our team submitted two runs: the first used a drop path rate of 0.2, while the second used a drop path rate of 0.3 together with a weight decay factor of 1e-5.</p><p>The results presented in Table <ref type="table">6</ref> show that the first run achieves superior performance. Increasing the drop path rate and applying the L2 regularization method degraded the model's performance, reducing its ability to generalize to the test data.</p><p>To better understand how incorporating the four ENB0 models enhances our approach, Table <ref type="table">7</ref> reports the precision, recall, and F1-score metrics on randomly selected CUIs. The first three columns show the results obtained when only one ENB0 model is employed, defined as the "Base" model. The approach was then enhanced by incorporating the training of the lower extremity (LE) subgroup, defining the "Base+LE" approach. The "Base+LE+Angio" approach was created by additionally including the angiogram subgroup. The "Base+LE+Angio+Ultrasono" approach was constructed by combining the LE and angiogram subgroups with ultrasonography. 
Finally, the "Base+LE+Angio+Ultrasono+Cranium" approach integrates the bone structure of cranium subgroup.</p><p>A green highlight in Table <ref type="table">7</ref> indicates a metric improvement, whereas a yellow highlight indicates a metric decrease. Note that the improvements in the F1-score are mainly driven by increases in recall. Recall measures the proportion of positive images that are correctly identified, whereas precision measures the proportion of positive predictions that are correct. Consequently, if the model detects only one true positive sample of a specific CUI, precision will be high while recall will be low (as observed, for example, in the third row of Table <ref type="table">7</ref>, where many false negatives occur). With fewer false negatives but more false positives, precision decreases (highlighted in yellow) while recall increases, resulting in an improved F1-score (highlighted in green).</p><p>The concepts whose F1-score improves after incorporating the lower extremity subgroup (Base+LE approach) are structure of left hip, femur, joint capsule, screw, and head of femur. All of these medical concepts are considered in the training of this subgroup.</p><p>The concept-detection improvements resulting from the incorporation of the angiogram subgroup (Base+LE+Angio approach) include stent device, caudal, structure of circumflex branch of left coronary artery, collateral branch of vessel, pseudoaneurysm, and vessel positions. All the improvements reported for the previous approach (Base+LE) are maintained in this one, but only the new ones are highlighted in these three columns. 
The same reporting strategy is used for the remaining approaches.</p><p>Training and incorporating the ultrasonography subgroup yields the Base+LE+Angio+Ultrasono approach. The concepts that show an improved F1-score are liver, heart atrium, right atrial structure, aorta, mitral valve, right ventricular structure, uterus, heart ventricle, thrombus, and pericardial effusion. The metrics for the medical concept heart atrium also changed slightly after the training and incorporation of the bone structure of cranium subgroup; however, this was the only concept affected, and no additional improvements could be identified with the Base+LE+Angio+Ultrasono+Cranium approach.</p></div>			</div>
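The precision/recall trade-off described above can be made concrete with a small computation. The following sketch is illustrative (it is not the authors' evaluation code, and the label vectors are hypothetical), but it reproduces the pattern in Table 7 where a model that flags only one image for a concept scores perfect precision yet very low recall:

```python
def per_concept_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for one concept (CUI)
    from parallel binary vectors over the validation images."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# One correct positive prediction out of 10 actual positives:
# perfect precision, recall of only 0.1, and a low F1.
y_true = [1] * 10 + [0] * 10   # 10 positive images, 10 negative
y_pred = [1] + [0] * 19        # a single (correct) positive prediction
p, r, f1 = per_concept_metrics(y_true, y_pred)
print(p, r, round(f1, 4))      # 1.0 0.1 0.1818
```

This is why trading a few false positives for many recovered false negatives lowers precision but raises recall enough to improve the harmonic mean.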
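The incremental Base → Base+LE → … comparison amounts to overriding the base model's predictions with each specialist subgroup model's predictions, but only on the concepts that subgroup was trained on. This is a plausible sketch of that merging step, not the authors' implementation; the dictionaries and CUI sets below are hypothetical examples:

```python
def merge_subgroup(base_preds, subgroup_preds, subgroup_cuis):
    """For each image, keep the base model's predictions outside the
    subgroup's concept set, and take the specialist model's predictions
    for the concepts inside it (e.g. the lower extremity CUIs)."""
    merged = {}
    for image_id, cuis in base_preds.items():
        kept = {c for c in cuis if c not in subgroup_cuis}
        specialist = set(subgroup_preds.get(image_id, set())) & subgroup_cuis
        merged[image_id] = kept | specialist
    return merged

base = {"img1": {"C0023884", "C0015811"}}       # liver, femur (base model)
le = {"img1": {"C0015811", "C0524471"}}         # femur, left hip (LE model)
le_cuis = {"C0015811", "C0524471", "C0015813"}  # LE subgroup concepts
print(sorted(merge_subgroup(base, le, le_cuis)["img1"]))
# ['C0015811', 'C0023884', 'C0524471']
```

Under this scheme each newly added subgroup can only change the metrics of its own concepts, which matches the column-by-column behaviour reported in Table 7.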
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of ImageCLEF 2024: Multimedia retrieval in medical applications</title>
		<author>
			<persName><forename type="first">B</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Drăgulinescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Rückert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ben Abacha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>García Seco De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brüngel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Idrissi-Yaghir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M G</forename><surname>Pakull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Damm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bracke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Friedrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Andrei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Prokopchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Karpenka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radzhabov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kovalev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Macaire</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schwab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lecouteux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Esperança-Rodier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yetisgen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hicks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Riegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Thambawita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Storås</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Halvorsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Heinrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kiesel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the 15th International Conference of the CLEF Association (CLEF 2024)</title>
		<title level="s">Springer Lecture Notes in Computer Science LNCS</title>
		<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of ImageCLEFmedical 2024 - Caption Prediction and Concept Detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Rückert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ben Abacha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Seco De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brüngel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Idrissi-Yaghir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bracke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Damm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M G</forename><surname>Pakull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Friedrich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF2024 Working Notes, CEUR Workshop Proceedings</title>
				<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 36th International Conference on Machine Learning</title>
				<meeting>the 36th International Conference on Machine Learning<address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="6105" to="6114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Rückert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brüngel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Idrissi-Yaghir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">S</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Koitka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Pelka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B</forename><surname>Abacha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>De Herrera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">A</forename><surname>Horn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Nensa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Friedrich</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41597-024-03496-6</idno>
		<ptr target="https://arxiv.org/abs/2405.10004v1" />
		<title level="m">ROCOv2: Radiology Objects in COntext version 2, an updated multimodal image dataset</title>
				<imprint>
			<publisher>Scientific Data</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Deep Residual Learning for Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2016.90</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="770" to="778" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Densely Connected Convolutional Networks</title>
		<author>
			<persName><forename type="first">G</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Van Der Maaten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Q</forename><surname>Weinberger</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2017.243</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="2261" to="2269" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale</title>
		<author>
			<persName><forename type="first">A</forename><surname>Dosovitskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Beyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kolesnikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Weissenborn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Unterthiner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dehghani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Minderer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Heigold</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gelly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Houlsby</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">9th International Conference on Learning Representations, ICLR 2021, Virtual Event</title>
				<meeting><address><addrLine>Austria</addrLine></address></meeting>
		<imprint>
			<publisher>OpenReview</publisher>
			<date type="published" when="2021">May 3-7, 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">CvT: Introducing Convolutions to Vision Transformers</title>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Codella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCV48922.2021.00009</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)</title>
				<meeting>the IEEE/CVF International Conference on Computer Vision (ICCV)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="22" to="31" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jiang</surname></persName>
		</author>
		<idno type="DOI">10.1007/s12204-023-2688-6</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Shanghai Jiaotong University (Science)</title>
		<imprint>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Deep learning for understanding multilabel imbalanced Chest X-ray datasets</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huertas-Tato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sánchez-Montañés</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Del Ser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Camacho</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.future.2023.03.005</idno>
	</analytic>
	<monogr>
		<title level="j">Future Generation Computer Systems</title>
		<imprint>
			<biblScope unit="volume">144</biblScope>
			<biblScope unit="page" from="291" to="306" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Enhancement of DNN-based multilabel classification by grouping labels based on data imbalance and label correlation</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.patcog.2022.108964</idno>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">132</biblScope>
			<biblScope unit="page">108964</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A partition-based problem transformation algorithm for classifying imbalanced multi-label data</title>
		<author>
			<persName><forename type="first">J</forename><surname>Duan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.engappai.2023.107506</idno>
	</analytic>
	<monogr>
		<title level="j">Engineering Applications of Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">128</biblScope>
			<biblScope unit="page">107506</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Label correlation guided borderline oversampling for imbalanced multi-label data learning</title>
		<author>
			<persName><forename type="first">K</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">R</forename><surname>Zaiane</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.knosys.2023.110938</idno>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">279</biblScope>
			<biblScope unit="page">110938</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
