<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">CN-Unet: A Robust Network Based on Deep Convolution for Medical Image Segmentation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Wei</forename><surname>Liu</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Hubei University of Technology</orgName>
								<address>
									<settlement>Wuhan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Junwei</forename><surname>Li</surname></persName>
							<email>lijunwei7800@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Hubei University of Technology</orgName>
								<address>
									<settlement>Wuhan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhiwei</forename><surname>Ye</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Hubei University of Technology</orgName>
								<address>
									<settlement>Wuhan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Orest</forename><surname>Kochan</surname></persName>
							<email>orest.v.kochan@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Hubei University of Technology</orgName>
								<address>
									<settlement>Wuhan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 S. Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">CN-Unet: A Robust Network Based on Deep Convolution for Medical Image Segmentation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">371DA87F455A00643EC0A716451EF92F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-06-19T15:03+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Medical Image Segmentation</term>
					<term>Deep learning</term>
					<term>Adjacent Information</term>
					<term>Data Augmentation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>With the rapid development of deep learning, traditional medical image segmentation methods are gradually being replaced. Mainstream medical image segmentation tasks include tumor segmentation, multi-organ segmentation, cardiac segmentation, and retinal segmentation. The multi-category segmentation problem common in medical imaging is difficult because individual shapes and textures differ greatly across categories. We propose an efficient and powerful network architecture for medical segmentation, CN-Unet. CN-Unet is a U-shaped symmetric network based on deep convolution, and its basic unit is the CN Block from ConvNeXt. To cope with small-object segmentation in medical images, we design a multiple data augmentation module, in which the slice fusion branch subtly captures the adjacent information of medical slices. Experiments on two public datasets (Synapse and ACDC) show that the segmentation ability of CN-Unet outperforms other state-of-the-art methods.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>As technology evolves, many fields of our lives have benefited to varying degrees, and deep learning has begun to penetrate more specialized areas, such as the military, medicine, education, and transportation <ref type="bibr" target="#b0">[1]</ref>. As an essential task in the medical field, medical image processing has always received significant attention from medicine and interdisciplinary experts <ref type="bibr" target="#b1">[2]</ref>. Combining deep learning and medical image processing to solve various subtasks on medical images has therefore become a hot topic in recent years <ref type="bibr" target="#b2">[3]</ref>. With the rapid development of medical imaging equipment, modalities such as magnetic resonance imaging (MRI), computed tomography (CT), and X-ray imaging have gradually become essential in medical image analysis. As the primary method of medical image analysis, medical image segmentation can assist doctors in obtaining information about organs or lesions, which is of great significance for medical analysis tasks such as disease observation, treatment planning, and anatomical structure modeling. Medical images, however, present a series of problems, such as complex image formats, difficulty in acquiring datasets, and difficulty in extracting features. Medical image segmentation therefore remains a challenging task.</p><p>When AlexNet won the ImageNet ILSVRC-2012 competition, its accuracy far exceeded that of the second-place entry. Since then, interest in convolutional neural networks has risen to a new height, and they have become the dominant approach in the image field. In 2015, Ronneberger et al. 
<ref type="bibr" target="#b3">[4]</ref> proposed a U-shaped network structure (U-Net) for medical image segmentation; its excellent segmentation performance and ingenious network structure attracted the attention of scholars. Since then, the application of the U-shaped structure to medical images has been extensively recognized <ref type="bibr" target="#b4">[5]</ref>. As the most classic neural network in medical image segmentation, U-Net is characterized by a symmetric U-shaped structure and skip connections that combine features. The U-shaped structure consists of an encoder and a decoder. The encoder extracts features from the input image layer by layer, from shallow to deep; the decoder restores the extracted features to the original size layer by layer through upsampling operations. The skip connections fuse features from different levels, reducing the loss of information during recovery to achieve a better segmentation result.</p><p>Currently, deep learning-based medical image segmentation methods are divided into 2D and 3D segmentation <ref type="bibr" target="#b5">[6]</ref>. In the 2D approach, a 3D medical volume is generally decomposed into many 2D slices, and each slice is fed into the segmentation model, which outputs a segmentation result. 2D segmentation has strong generalization ability, the 2D segmentation model has better transferability, and training involves fewer parameters and is faster. On the other hand, 2D segmentation may lose contextual information during data processing and feature extraction. In the 3D approach, the data is a stack of multiple slices; compared with 2D data, there is an additional z-axis, so all three directions (x, y, z) are encoded during convolution. 
This may allow the model to obtain richer feature information, but 3D segmentation consumes more memory and has several times the number of parameters of 2D segmentation. Because the amount of data must match the number of model parameters, a 3D segmentation model needs more data to train; otherwise, it may overfit <ref type="bibr" target="#b6">[7]</ref>.</p><p>In this paper, we propose a deep convolution-based 2D medical image segmentation architecture, CN-Unet. CN-Unet is a U-shaped symmetric architecture with the CN Block as its basic unit. To take full advantage of the powerful feature extraction capability of the CN Block, we divide the encoder and decoder of CN-Unet into four stages and set the CN Block ratio of the stages to (3, 3, 9, 3). We add skip connections between each pair of symmetric stages to recover the contextual information of the feature maps. To improve the segmentation ability of CN-Unet for small and medium organs in Synapse, we propose a multiple data augmentation module, in which the slice fusion branch is designed according to the characteristics of medical images. We use two different datasets, Synapse and ACDC, to evaluate the segmentation performance of CN-Unet. We achieve the best results on the Synapse dataset and break through 80% DSC on the gallbladder semantic class. To verify the feasibility of our structure, we conduct in-depth comparative experiments on the Synapse dataset, and our base CN-Unet architecture outperforms ConvNeXt <ref type="bibr" target="#b7">[8]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related works 2.1. Medical image segmentation method based on deep learning</head><p>With the advent of convolutional neural networks, deep learning-based segmentation models have been widely used in medical imaging. In 2015, Ronneberger et al. pioneered the encoder-decoder network U-Net <ref type="bibr" target="#b3">[4]</ref>. U-Net was the first deep learning network applied to medical image segmentation; its unique structure and excellent segmentation performance later made it the benchmark for medical image segmentation networks. To make U-Net's skip connections work more fully, a series of U-Net variants were proposed (U-Net+, U-Net++, U-Net+++). As 3D methods became popular in the image field, medical imaging researchers also began to apply 3D network models to various tasks in medical image analysis, such as 3D U-Net and V-Net. With the advent of vision transformers (ViTs), researchers began to combine Transformers when designing medical segmentation models. TransUNet, proposed by Chen et al. <ref type="bibr" target="#b14">[15]</ref>, is the first medical segmentation model that combines a CNN <ref type="bibr" target="#b15">[16]</ref> and a Transformer. The authors used it for multi-organ and cardiac segmentation, and its performance was better than the state-of-the-art networks. In the past two years, Transformer-based medical image segmentation structures have begun to emerge, such as Swin-Unet <ref type="bibr" target="#b8">[9]</ref>, MISSFormer <ref type="bibr" target="#b10">[11]</ref>, UNETR <ref type="bibr" target="#b11">[12]</ref>, and nnFormer <ref type="bibr" target="#b9">[10]</ref>. These advanced structures have all been evaluated on segmentation tasks such as multi-organ datasets and have achieved positive DSC results. With the emergence of ConvNeXt, Liu et al. 
<ref type="bibr" target="#b7">[8]</ref> designed a convolutional neural network following the design ideas of the Swin Transformer. Through extensive experiments, they proved that convolutional neural networks are no worse than Transformer-based networks, and are even better than Transformer structures in some respects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Data Augmentation Methods in Medical Image Segmentation</head><p>Access to medical data has always been challenging due to the specialized expertise required to label it. In this context, the augmentation of medical data is significant. Methods used for medical image augmentation typically apply transformations such as rotation, random cropping, elastic deformation, and flipping, which generate training images that resemble a specific training example. With the rapid development of deep learning, the effect of these commonly used data augmentation methods on model performance appears to be gradually diminishing. In 2019, Wang et al. <ref type="bibr" target="#b20">[21]</ref> proposed a theoretical formulation of test-time augmentation for deep learning and applied it to medical image tasks. However, many augmentation methods, such as DAGAN augmentation <ref type="bibr" target="#b18">[19]</ref>, have not been popularized in downstream medical image tasks. DAGAN generates semantic maps from labels and adds semantic maps and labels to the training set in pairs, which is an advanced and effective data augmentation method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Method</head><p>CN-Unet is an encoder-decoder structure, and its overall architecture is shown in Figure <ref type="figure">1</ref>. It has the same U-shaped structure as the classic U-Net, mainly composed of an encoder, a bottleneck, a decoder, and skip connections. The basic unit of CN-Unet is the CN Block. Specifically, the encoder consists of one embedding layer, three CN modules, and two downsampling layers. Symmetrically, the decoder branch contains three upsampling layers and three CN modules. Furthermore, the bottleneck consists of a downsampling layer and two CN modules to provide a large receptive field to support the decoder. Inspired by U-Net, we design a symmetric encoder-decoder structure and add skip connections between the feature pyramids of each corresponding CN module. The fusion of multi-scale features helps recover fine-grained details in predictions, compensating for the loss of spatial information caused by downsampling. </p></div>
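The stage layout described above can be sketched as a small bookkeeping helper. This is an illustration only, not the authors' code: the 96-channel embedding dimension is taken from Section 2.4, and the assumption that channels double while resolution halves at each downsampling follows the ConvNeXt convention.

```python
# Hypothetical sketch of the CN-Unet encoder stage schedule.
# Assumptions (not from the paper's code): embedding dim 96, channels
# double and spatial size halves at each 2x downsampling layer.

def encoder_schedule(img_size=512, embed_dim=96, depths=(3, 3, 9, 3)):
    """Return (channels, spatial_size, num_blocks) for each encoder stage.

    The embedding layer performs a 4x downsampling (4x4 conv, stride 4);
    each subsequent stage is preceded by a 2x downsampling layer.
    """
    stages = []
    size = img_size // 4          # after the 4x4 stride-4 embedding
    ch = embed_dim
    for i, depth in enumerate(depths):
        if i > 0:                 # 2x2 stride-2 conv downsampling
            size //= 2
            ch *= 2
        stages.append((ch, size, depth))
    return stages

stages = encoder_schedule()       # [(96, 128, 3), (192, 64, 3), ...]
```

With a 512×512 input this yields four stages at resolutions 128, 64, 32, and 16, mirroring the (3, 3, 9, 3) block ratio.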
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Encoder</head><p>The input of CN-Unet is a 2D slice 𝑥 ∈ 𝑅 𝐻×𝑊 after slice fusion (𝑥 is obtained by scaling, cropping, and rotating the original image), where H and W represent the height and width of each input scan. Slices of size 512×512 are sent to the Embedding Layer, which contains a downsampling layer and a normalization layer. Unlike the usual average pooling and max pooling, our downsampling layer is a convolution with a kernel size of 4×4 and a stride of 4. Like the CN Block, the Embedding Layer uses layer normalization. After the Embedding Layer, the number of channels increases to 96 and normalization is complete. The data then passes through the first CN Block layer. On both sides of the symmetric structure, we set the stage ratio of CN Blocks to (3, 3, 9, 3), so the first CN Block layer contains three independent CN Blocks. We call the matrix entering the CN Block 𝑥 𝑖𝑛 . Before 𝑥 𝑖𝑛 passes through the inverted bottleneck, we perform a 7×7 depthwise separable convolution on it, followed by an LN layer. The process is formulated as follows:</p><formula xml:id="formula_0">𝑦 1 = 𝐿𝑁(𝑆𝑒𝑝𝐶𝑜𝑛𝑣 7×7 (𝑥 𝑖𝑛 )),<label>(1)</label></formula><p>Then 𝑦 1 is permuted to rearrange its dimensions and enters the bottleneck layer, which contains two fully connected layers and a GELU activation function, in the order FC, GELU, FC. Afterwards, the dimensions are permuted back, and the result passes through a DropPath layer, which randomly drops the residual branch. 
Finally, adding 𝑥 𝑖𝑛 to the matrix obtained here completes one CN Block operation, formulated as follows:</p><formula xml:id="formula_1">𝑥 𝑜𝑢𝑡 = 𝑥 𝑖𝑛 + 𝐷𝑟𝑜𝑝𝑃𝑎𝑡ℎ (𝐹𝐶 (𝐺𝐸𝐿𝑈(𝐹𝐶(𝑦 1 )))),<label>(2)</label></formula><p>The DropPath rate is set between 0 and 0.4. Since the stage ratio of CN Blocks is (3, 3, 9, 3), we divide the range from 0 to 0.4 evenly into 18 steps: the DropPath rate of the first CN Block is about 0.022, that of the second CN Block is about 0.044, and so on, until the DropPath rate of the last CN Block of the encoder is 0.4.</p><p>In the U-shaped structure, the encoder usually performs downsampling to increase the receptive field and highlight image features, the better to extract shallow features. The encoder of CN-Unet contains three downsampling layers, one between every two CN Block layers. To obtain better fused information and achieve both feature extraction and feature dimensionality reduction, we use 2×2 convolutional downsampling with a stride of 2, which obtains a sufficient receptive field and avoids over-sampling. Since both the encoder and the decoder have four stages, the feature size of each encoder stage increases by a factor of four, so each downsampling layer performs 2× downsampling.</p></div>
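The linearly increasing DropPath schedule described above (≈0.022 for the first block up to 0.4 for the eighteenth) can be written in a few lines. This is a sketch of the stated schedule, not the training code:

```python
# Linearly increasing DropPath (stochastic depth) rates for the 18 CN Blocks
# implied by the (3, 3, 9, 3) stage ratio, as described in the text:
# rate_k = k * 0.4 / 18 for k = 1..18.

def droppath_rates(max_rate=0.4, depths=(3, 3, 9, 3)):
    n = sum(depths)  # 18 CN Blocks in total
    return [max_rate * (k + 1) / n for k in range(n)]

rates = droppath_rates()
# rates[0] ~ 0.022, rates[1] ~ 0.044, ..., rates[-1] = 0.4
```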
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Decoder</head><p>The decoder of CN-Unet also contains four stages, maintaining symmetry with the encoder, and the (3, 3, 9, 3) ratio is likewise maintained in the layout of CN Blocks. After the shallow feature information is obtained, the feature map first passes through a bottleneck containing three CN Blocks, then through the first upsampling layer. Traditional upsampling generally uses a preset interpolation method, which cannot bring better learning ability to the network. In CN-Unet, our upsampling layer uses transposed convolution to extract deep features, followed by a normalization layer to make the gradient more stable. Before the next CN Block layer, we use skip connections to concatenate the feature maps from the upsampling layer with the feature maps from the CN Block at the same stage of the encoder. Assuming that the feature map before the first upsampling layer in the decoder is 𝑥 1 , the input of the next CN Block is 𝑥 2 , and the output of the corresponding CN Block in the encoder is 𝑦 1 , the operation of the upsampling layers and skip connections can be expressed as follows:</p><formula xml:id="formula_2">𝑥 2 = 𝐶𝑜𝑛𝑐𝑎𝑡(𝐺𝐸𝐿𝑈(𝐷𝑒𝑐𝑜𝑛𝑣 4×4 (𝑥 1 )), 𝑦 1 ),<label>(3)</label></formula><p>In the decoder, this procedure is performed three times to restore the feature maps of each stage to the same size as in the encoder. At each CN Block layer stage of the decoder, the input first passes through the upsampling layer and the skip connection from the encoder, after which the number of channels of the feature map changes from C to 2C. To restore the number of channels to C, we add a convolutional layer inside each CN Block layer of the decoder, with a kernel size of 3×3, a stride of 1, and padding of 1. This completes the design of the entire decoder. 
Finally, we send the output of each CN Block layer of the decoder to the Upper Head to complete the medical image segmentation.</p></div>
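The channel bookkeeping of the skip connection in Eq. (3) can be checked with a toy shape sketch. The function name and the NCHW layout are illustrative assumptions, not the paper's code; the transposed convolution and GELU are omitted since only the concatenation affects the channel count:

```python
import numpy as np

# Toy illustration of the skip connection in Eq. (3): concatenating the
# upsampled decoder features with the encoder features at the same stage
# doubles the channels from C to 2C (a later 3x3 conv restores C).

def skip_concat(upsampled, encoder_feat):
    """Concatenate decoder and encoder feature maps along channels (NCHW)."""
    assert upsampled.shape[2:] == encoder_feat.shape[2:], "spatial sizes must match"
    return np.concatenate([upsampled, encoder_feat], axis=1)

x1_up = np.zeros((1, 96, 128, 128), dtype=np.float32)  # decoder features after upsampling
y1 = np.zeros((1, 96, 128, 128), dtype=np.float32)     # encoder features at the same stage
x2 = skip_concat(x1_up, y1)                            # channels: 96 -> 192 (C -> 2C)
```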
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.6.">Multiple Data Augmentation Modules</head><p>When solving downstream tasks in the image field, data augmentation is usually used to increase the number of samples and enrich their diversity; it can also improve the model's generalization ability. Due to the medical expertise and privacy protections required for medical image labeling, medical datasets are tough to obtain, which is one of the main reasons why problems in medical image processing are difficult to solve <ref type="bibr" target="#b17">[18]</ref>. To keep the lack of data from hurting the model's efficiency, we propose a multiple data augmentation (MDA) module specifically for medical data. The MDA module contains the following three branches:</p><p>DAGAN data augmentation: Before inputting the images to CN-Unet, we randomly selected the labels of 1280 samples and fed them into the DAGAN <ref type="bibr" target="#b18">[19]</ref> network for semantic generation. Through the DAGAN network, we obtain an additional 1280 generated images, then add these generated images and their corresponding labels to the dataset. The specific steps are shown in Figure <ref type="figure" target="#fig_1">2</ref>.</p><p>Slice fusion augmentation: From cutting medical CT images, we have gained some understanding of the characteristics of medical data. After completing the cutting of a single case, we quickly flip through the slices to roughly check whether the cutting result is satisfactory. Interestingly, this group of slices is like a video with a timeline, and we can see the internal changes in the case. This means that we can regard the case as a whole: after cutting it into several slices, each pair of adjacent slices has a specific relationship. If we feed the slices into the network one by one, we lose this association, the so-called contextual information. 
Our approach is to merge three adjacent single-channel grayscale slices into a single 3-channel RGB image, effectively preserving the contextual information of the data. Of course, we could also directly convert a single-channel slice into a 3-channel image, but this simply copies the information of a single slice, which not only fails to capture the information between adjacent slices but also causes data redundancy.</p><p>Test-time augmentation: at inference time, each test image is augmented into several variants; the model makes predictions for each augmented picture and returns a set of those predictions, which are then mapped back in the de-augmentation step. Finally, a merge operation completes the test-time augmentation. The effect of TTA is discussed later, with the results in Table <ref type="table">4</ref>.</p></div>
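The slice fusion branch can be sketched as a few lines of numpy. The function name and the edge-clamping behavior at the first and last slices are illustrative assumptions, not details from the paper:

```python
import numpy as np

# Minimal sketch of slice fusion: stack three adjacent grayscale slices
# of a (D, H, W) CT volume into the channels of one RGB-like image so
# that adjacent-slice context is preserved.

def fuse_slices(volume, i):
    """Return slice i as an (H, W, 3) image whose channels are the
    previous, current, and next slices (indices clamped at the edges)."""
    d = volume.shape[0]
    idx = [max(i - 1, 0), i, min(i + 1, d - 1)]
    return np.stack([volume[j] for j in idx], axis=-1)

volume = np.random.rand(10, 512, 512).astype(np.float32)
fused = fuse_slices(volume, 5)   # shape (512, 512, 3)
```

Simply repeating one slice three times would instead give three identical channels, which is the redundancy the text warns about.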
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.7.">Loss function</head><p>This paper uses a weighted cross-entropy (WCE) loss in place of the commonly used cross-entropy loss. The WCE loss is a variant of the CE loss in which all positive samples are multiplied by a weighting coefficient; this loss function is widely used for class-imbalanced problems. The formula is as follows:</p><formula xml:id="formula_3">𝑙𝑜𝑠𝑠 𝑊𝐶𝐸 = − ∑ [𝑤 * 𝑦 𝑖 * log(𝑙𝑜𝑔𝑖𝑡𝑠 𝑖 ) + (1 − 𝑦 𝑖 ) * log(1 − 𝑙𝑜𝑔𝑖𝑡𝑠 𝑖 )] ,<label>(4)</label></formula><p>The Synapse training set, including the background class, has nine categories. The number of samples in category 𝑖 is 𝑛 𝑖 , with 𝑖 ranging from 1 to 9. The weight w can be calculated with the median balance method.</p></div>
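A hedged sketch of Eq. (4) and the median balance weighting follows. It treats the predictions as per-class probabilities and the weights as median-frequency ratios; this is one common reading of "median balance", offered as an illustration rather than the authors' implementation:

```python
import numpy as np

# Sketch of the weighted cross-entropy of Eq. (4) with median-frequency
# balancing: w_i = median(freq) / freq_i, computed from class counts n_i.
# Illustrative only; not the paper's training code.

def median_balance_weights(counts):
    """Per-class weights from pixel counts via median-frequency balancing."""
    freq = np.asarray(counts, dtype=np.float64) / np.sum(counts)
    return np.median(freq) / freq

def wce_loss(probs, targets, w, eps=1e-7):
    """Weighted CE of Eq. (4); `probs` are per-class probabilities in (0, 1)."""
    p = np.clip(probs, eps, 1 - eps)
    return -np.sum(w * targets * np.log(p) + (1 - targets) * np.log(1 - p))
```

Rare classes receive weights above 1 and common classes weights below 1, which is what makes the loss suitable for class-imbalanced segmentation.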
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head><p>In this section, to verify the effectiveness of CN-Unet, we conduct experiments on two commonly used datasets: Synapse Multi-Organ Segmentation and Automatic Cardiac Diagnosis Challenge (ACDC). We chose to compare with current state-of-the-art ConvNets and transformer-based architectures, using the reported results to explore the research space in the field of medical image segmentation and demonstrate the superiority of CN-Unet.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Datasets</head><p>Synapse for multi-organ CT segmentation: The dataset consists of abdominal clinical CT scans of 30 patients, containing 3779 axial abdominal images. Using the split in <ref type="bibr" target="#b14">[15]</ref>, 18 sample cases were used as the training set, and the remaining 12 sample cases formed the test set. We assessed CN-Unet performance using the mean Dice Similarity Coefficient (DSC) and mean Hausdorff distance (HD) <ref type="bibr" target="#b27">[28]</ref> as evaluation metrics.</p><p>ACDC for automated cardiac diagnosis: The ACDC challenge collected MRI examinations of 100 patients. Each MR image is a series of short-axis slices covering the heart from the left atrium to the apex, with a slice thickness of 5 to 8 mm. Each scan was manually labeled for the left ventricle (LV), right ventricle (RV), and myocardium (MYO). The entire dataset was split into 70 training samples (1930 axial slices), 10 validation samples, and 20 test samples. Following <ref type="bibr" target="#b8">[9]</ref>, we evaluate our method using the average DSC.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Evaluation metrics</head><p>We use the Dice score and the 95% Hausdorff distance (HD95) <ref type="bibr" target="#b27">[28]</ref> to evaluate the accuracy of segmentation results. The Dice score is calculated from the precision and sensitivity of the test samples and balances the two criteria. The Hausdorff distance is often used as a segmentation indicator, mainly to measure the segmentation accuracy of the boundary. For a given semantic class, we denote the ground truth and predicted value of each pixel as 𝑋 𝑖 and 𝑌 𝑖 , respectively, and X′ and Y′ represent the surface point sets of the ground truth and the prediction. The Dice score and HD are computed as follows:</p><formula xml:id="formula_4">Dice = 2 ∑_{i=1}^{I} X_i Y_i / (∑_{i=1}^{I} X_i + ∑_{i=1}^{I} Y_i), (5) HD = max{d_{XY}, d_{YX}} = max{ max_{x∈X′} min_{y∈Y′} d(x, y), max_{y∈Y′} min_{x∈X′} d(x, y) },<label>(6)</label></formula><p>95% HD is similar to the maximum HD, but it is based on the 95th percentile of the distances between boundary points in X′ and Y′. The purpose of this measure is to remove the effect of a small subset of outliers.</p></div>
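The two metrics of Eqs. (5)-(6) can be sketched directly in numpy. This is an illustrative brute-force version (real evaluation pipelines typically use optimized implementations such as those in medpy or MONAI):

```python
import numpy as np

# Sketch of the evaluation metrics: Dice over binary masks, and a
# 95th-percentile symmetric Hausdorff distance over surface point sets.

def dice_score(x, y):
    """Dice = 2*sum(X_i * Y_i) / (sum X_i + sum Y_i) for binary arrays."""
    x, y = np.asarray(x, bool), np.asarray(y, bool)
    denom = x.sum() + y.sum()
    return 2.0 * np.logical_and(x, y).sum() / denom if denom else 1.0

def hd95(pts_x, pts_y):
    """95th-percentile symmetric Hausdorff distance between (N, k) point sets."""
    d = np.linalg.norm(pts_x[:, None, :] - pts_y[None, :, :], axis=-1)
    d_xy = np.percentile(d.min(axis=1), 95)   # directed X' -> Y'
    d_yx = np.percentile(d.min(axis=0), 95)   # directed Y' -> X'
    return max(d_xy, d_yx)
```

Taking the 95th percentile instead of the maximum over the directed distances is exactly what removes the influence of a few outlier boundary points.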
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Experimental details</head><p>CN-Unet is implemented with Python 3.6 and PyTorch 1.8.1 on Ubuntu 20.04. For data preprocessing on Synapse, we adopt the method introduced in Swin-Unet. For all samples in the training set, data augmentation methods such as random cropping, scaling, and rotation are used to increase sample diversity. In the training phase, we use a batch size of 2, and the input image size is 256×256×3. We train the model with the AdamW optimizer, with an initial learning rate of 0.0001 and a weight decay of 0.05 to prevent overfitting during gradient descent; the exponential decay rates of the first and second moment estimates are set to 0.9 and 0.999, respectively. All training is performed on an NVIDIA 3080 Ti GPU with 12 GB of memory. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Results</head><p>To evaluate CN-Unet, we compare it with current state-of-the-art medical segmentation methods, including TransUNet, CoTr <ref type="bibr" target="#b23">[24]</ref>, Swin-Unet, UNETR, MISSFormer, nnFormer, and nnUNet <ref type="bibr" target="#b24">[25]</ref>. The segmentation results of all models are shown in Table <ref type="table" target="#tab_0">1</ref>; our CN-Unet achieves the best scores of 90.23 average Dice and 10.53 average HD95. The Dice score of 90.23 shows that CN-Unet's segmentation accuracy is the best among all models, while the HD95 score of 10.53 reflects CN-Unet's superior performance on organ edge segmentation. Excluding ConvNeXt, the second-best model, nnUNet, obtains an average Dice score of 86.79 and an average HD95 of 13.69; our CN-Unet is better by 3.44 and 3.16, respectively, a remarkable improvement on Synapse. Regarding semantic categories, our CN-Unet achieves the best Dice scores on the three categories of aorta, spleen, and stomach, outperforming the second-place Dice scores by 9.73, 2.77, and 4.87, respectively. The gallbladder breaks the 80% level on Synapse for the first time with a Dice score of 81.23, which shows that CN-Unet has significantly improved the segmentation of small organs. From Figure <ref type="figure" target="#fig_2">3</ref>, we can see that CN-Unet outperforms other methods in segmenting small objects and is comparable to nnFormer in edge detection. The boxplots in Figure <ref type="figure">5</ref> show that CN-Unet has higher upper and lower quartiles than the other methods, which validates the average superiority of CN-Unet across Synapse's categories. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Ablation experiment</head><p>As shown in Table <ref type="table" target="#tab_0">1</ref>, the U-shaped symmetric structure based on the CN Block scores 0.9 higher than the tiny version of ConvNeXt. Across the semantic categories, our base structure outperforms ConvNeXt in five of them. The line graph in Figure <ref type="figure" target="#fig_3">4</ref> shows that CN-Unet reaches the optimal value range faster than ConvNeXt. Overall, the experimental results demonstrate the feasibility of our CN-Unet infrastructure.</p><p>After establishing the feasibility of CN-Unet's infrastructure, we conducted in-depth research on the MDA module. The results of each branch on the ACDC dataset are shown in Table <ref type="table" target="#tab_1">2</ref>. Each branch improves the performance of CN-Unet, and slice fusion improves the average DSC by 1.13, the largest contribution of any branch in MDA. Overall, the gain from the MDA module on CN-Unet is considerable, and each branch of MDA contributes to the module to varying degrees.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>In this paper, we propose a 2D medical segmentation network, CN-Unet. CN-Unet is a robust deep convolutional network designed as a U-shaped symmetric structure, with the same basic design as Swin-Unet. The CN Block is an advanced convolution block proposed in ConvNeXt; it absorbs the advantages of the Swin Transformer block and the ResNet block simultaneously, and can not only accurately encode spatial information but also build standard hierarchical representations. To give full play to the feature learning ability of the CN Block, we divide the encoder and decoder into four layers and use skip connections between corresponding layers to recover contextual information. To improve the small-object segmentation ability of CN-Unet, we propose the MDA module. Experiments on the Synapse and ACDC datasets show that CN-Unet achieves promising results on 2D medical segmentation, demonstrating good segmentation performance and robustness. Especially on Synapse, the HD95 and DSC scores obtained by CN-Unet exceed the current state-of-the-art methods, and the DSC of the gallbladder semantic class breaks through 80% for the first time, reflecting CN-Unet's performance in small-object segmentation. We also conduct comparative experiments with the original ConvNeXt on the Synapse dataset, demonstrating the effectiveness of our encoder-decoder structure.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The architecture of CN-Unet</figDesc><graphic coords="3,86.16,465.72,438.00,206.64" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The process of DAGAN data enhancement</figDesc><graphic coords="5,81.96,451.56,313.20,239.64" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Visual comparison with current state-of-the-art methods on the Synapse dataset.</figDesc><graphic coords="8,86.16,71.40,347.04,271.32" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: DSC line graphs per 1,000 iterations for CN-Unet and ConvNeXt.</figDesc><graphic coords="8,83.04,556.92,418.56,189.48" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1: Comparison with other advanced methods</head><label>1</label><figDesc></figDesc><table><row><cell>Method</cell><cell>HD95</cell><cell>DSC</cell><cell>Aor</cell><cell>Gall</cell><cell>LKid</cell><cell>RKid</cell><cell>Liv</cell><cell>Pan</cell><cell>Spl</cell><cell>Sto</cell></row><row><cell>TransUNet</cell><cell>31.69</cell><cell>77.48</cell><cell>87.23</cell><cell>63.13</cell><cell>81.87</cell><cell>77.02</cell><cell>94.08</cell><cell>55.86</cell><cell>85.08</cell><cell>75.62</cell></row><row><cell>CoTr</cell><cell>27.38</cell><cell>78.08</cell><cell>85.87</cell><cell>61.38</cell><cell>84.83</cell><cell>79.36</cell><cell>94.28</cell><cell>57.65</cell><cell>87.74</cell><cell>73.55</cell></row><row><cell>Swin-Unet</cell><cell>21.55</cell><cell>79.13</cell><cell>85.47</cell><cell>66.53</cell><cell>83.28</cell><cell>79.61</cell><cell>94.29</cell><cell>56.58</cell><cell>90.66</cell><cell>76.60</cell></row><row><cell>UNETR</cell><cell>22.97</cell><cell>79.56</cell><cell>89.99</cell><cell>60.56</cell><cell>85.66</cell><cell>84.80</cell><cell>94.46</cell><cell>59.25</cell><cell>87.81</cell><cell>73.99</cell></row><row><cell>MISSFormer</cell><cell>18.20</cell><cell>81.96</cell><cell>86.99</cell><cell>68.65</cell><cell>85.21</cell><cell>82.00</cell><cell>94.41</cell><cell>65.67</cell><cell>91.92</cell><cell>80.81</cell></row><row><cell>nnFormer</cell><cell>15.80</cell><cell>86.56</cell><cell>92.13</cell><cell>70.54</cell><cell>86.50</cell><cell>86.21</cell><cell>96.88</cell><cell>83.32</cell><cell>90.10</cell><cell>86.83</cell></row><row><cell>nnUNet</cell><cell>13.69</cell><cell>86.79</cell><cell>93.20</cell><cell>71.50</cell><cell>84.39</cell><cell>88.36</cell><cell>97.31</cell><cell>82.89</cell><cell>91.22</cell><cell>85.47</cell></row><row><cell>ConvNeXt</cell><cell>12.43</cell><cell>87.14</cell><cell>89.86</cell><cell>68.73</cell><cell>94.65</cell><cell>93.32</cell><cell>96.48</cell><cell>73.03</cell><cell>93.62</cell><cell>87.43</cell></row><row><cell>CN-Unet</cell><cell>12.53</cell><cell>90.23</cell><cell>91.74</cell><cell>81.23</cell><cell>94.64</cell><cell>93.13</cell><cell>96.63</cell><cell>78.08</cell><cell>94.69</cell><cell>91.70</cell></row><row><cell>P-values</cell><cell cols="10">&lt;1e-2 (HD95), &lt;1e-2 (DSC)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2: Ablation experiments</head><label>2</label><figDesc></figDesc><table><row><cell>Method</cell><cell>Average</cell><cell>RV</cell><cell>Myo</cell><cell>LV</cell></row><row><cell>CN-Unet*</cell><cell>87.37</cell><cell>87.26</cell><cell>83.35</cell><cell>91.52</cell></row><row><cell>CN-Unet*+SF</cell><cell>88.50</cell><cell>88.92</cell><cell>84.45</cell><cell>92.13</cell></row><row><cell>CN-Unet*+SF+DAGAN</cell><cell>89.35</cell><cell>89.75</cell><cell>84.62</cell><cell>93.68</cell></row><row><cell>CN-Unet*+MDA</cell><cell>90.43</cell><cell>90.89</cell><cell>86.74</cell><cell>93.66</cell></row><row><cell>P-value</cell><cell cols="4">&lt;1e-2 (DSC)</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Acknowledgements</head><p>This work was supported by the National Natural Science Foundation of China under Grant No. 41901296, the Fujian Provincial Key Laboratory of Data Intensive Computing and Key Laboratory of Intelligent Computing and Information Processing under Grant No. BD201801, and the Wuhan Science and Technology Bureau Knowledge Innovation Dawning Plan Project: Detection and Optimization Method of GNSS Hybrid Attacks for Connected and Autonomous Vehicles under Grant No. 2022010801020270.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Deep neural networks for medical image segmentation</title>
		<author>
			<persName><forename type="first">P</forename><surname>Malhotra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Koundal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zaguia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Enbeyle</surname></persName>
		</author>
		<idno type="DOI">10.1155/2022/9580991</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Healthcare Engineering</title>
		<imprint>
			<biblScope unit="volume">2022</biblScope>
			<biblScope unit="page">9580991</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Deep learning techniques for medical image segmentation: achievements and challenges</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Hesamian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kennedy</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10278-019-00227-x</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of digital imaging</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="582" to="596" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Medical image segmentation with limited supervision: a review of deep network models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2021.3062380</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="36827" to="36851" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">U-net: convolutional networks for biomedical image segmentation</title>
		<author>
			<persName><forename type="first">O</forename><surname>Ronneberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fischer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Brox</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-24574-4_28</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Medical image computing and computer-assisted intervention</title>
				<meeting>the International Conference on Medical image computing and computer-assisted intervention<address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="234" to="241" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Unet++: redesigning skip connections to exploit multiscale features in image segmentation</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M R</forename><surname>Siddiquee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tajbakhsh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liang</surname></persName>
		</author>
		<idno type="DOI">10.1109/TMI.2019.2959609</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE transactions on medical imaging</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="1856" to="1867" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">2D to 3D evolutionary deep convolutional neural networks for medical image segmentation</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hassanzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Essam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sarker</surname></persName>
		</author>
		<idno type="DOI">10.1109/TMI.2020.3035555</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Medical Imaging</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="712" to="721" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Image segmentation evaluation: a survey of methods</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhu</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10462-020-09830-9</idno>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="page" from="5637" to="5674" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A convnet for the 2020s</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Feichtenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Darrell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Xie</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR52688.2022.01167</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</title>
				<meeting>the IEEE/CVF Conference on Computer Vision and Pattern Recognition</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="11976" to="11986" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Swin-unet: Unet-like pure transformer for medical image segmentation</title>
		<author>
			<persName><forename type="first">H</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-25066-8_9</idno>
	</analytic>
	<monogr>
		<title level="m">Computer Vision -ECCV 2022 Workshops. ECCV 2022</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">L</forename><surname>Karlinsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Michaeli</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">13803</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">Y</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2109.03201</idno>
		<title level="m">nnformer: Interleaved transformer for volumetric segmentation</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yuan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2109.07162</idno>
		<title level="m">Missformer: An effective medical image segmentation transformer</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Unetr: Transformers for 3d medical image segmentation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hatamizadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Nath</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF winter conference on applications of computer vision</title>
				<meeting>the IEEE/CVF winter conference on applications of computer vision</meeting>
		<imprint>
			<biblScope unit="page" from="574" to="584" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Swin unetr: Swin transformers for semantic segmentation of brain tumors in MRI images</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hatamizadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Nath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Roth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Xu</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-08999-2_22</idno>
	</analytic>
	<monogr>
		<title level="m">Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">A</forename><surname>Crimi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bakas</surname></persName>
		</editor>
		<meeting><address><addrLine>BrainLes; Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">12962</biblScope>
			<biblScope unit="page" from="272" to="284" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Landman</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2209.15076</idno>
		<idno type="arXiv">arXiv:2209.15076</idno>
		<title level="m">3D UX-Net: A large kernel volumetric ConvNet modernizing hierarchical transformer for medical image segmentation</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Adeli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Yuille</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2102.04306</idno>
		<idno type="arXiv">arXiv:2102.04306</idno>
		<title level="m">Transunet: Transformers make strong encoders for medical image segmentation</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Examination of abnormal behavior detection based on improved YOLOv3</title>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Przystupa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-J</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics10020197</idno>
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">197</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives</title>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.media.2023.102762</idno>
	</analytic>
	<monogr>
		<title level="j">Medical Image Analysis</title>
		<imprint>
			<biblScope unit="volume">85</biblScope>
			<biblScope unit="page">102762</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Antoniou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Storkey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Edwards</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1711.04340</idno>
		<idno type="arXiv">arXiv:1711.04340</idno>
		<title level="m">Data augmentation generative adversarial networks</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Dual attention GANs for semantic image synthesis</title>
		<author>
			<persName><forename type="first">H</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sebe</surname></persName>
		</author>
		<idno type="DOI">10.1145/3394171.3416270</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 28th ACM International Conference on Multimedia</title>
				<meeting>the 28th ACM International Conference on Multimedia</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1994" to="2002" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Improving data augmentation for medical image segmentation</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Eaton-Rosen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bragman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ourselin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Cardoso</surname></persName>
		</author>
		<ptr target="https://openreview.net/pdf?id=rkBBChjiG" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks</title>
		<author>
			<persName><forename type="first">G</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Aertsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Deprest</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ourselin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vercauteren</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.neucom.2019.01.103</idno>
	</analytic>
	<monogr>
		<title level="j">Neurocomputing</title>
		<imprint>
			<biblScope unit="volume">338</biblScope>
			<biblScope unit="page" from="34" to="45" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">N</forename><surname>Metaxas</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2203.00131</idno>
		<idno type="arXiv">arXiv:2203.00131</idno>
		<title level="m">A multi-scale transformer for medical image segmentation: architectures, model efficiency, and benchmarks</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Prior-aware neural network for partially-supervised multi-organ segmentation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICCV.2019.01077</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF International Conference on Computer Vision</title>
				<meeting>the IEEE/CVF International Conference on Computer Vision</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="10672" to="10681" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xia</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-87199-4_16</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of International conference on medical image computing and computer-assisted intervention</title>
				<meeting>International conference on medical image computing and computer-assisted intervention<address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="171" to="180" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Nnu-net: Self-adapting framework for u-net-based medical image segmentation</title>
		<author>
			<persName><forename type="first">F</forename><surname>Isensee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Petersen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zimmerer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">F</forename><surname>Jaeger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kohl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wasserthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Koehler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Norajitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wirkert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Maier-Hein</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-658-25326-4_7</idno>
	</analytic>
	<monogr>
		<title level="m">Bildverarbeitung für die Medizin 2019</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Handels</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Deserno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Maier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Maier-Hein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Palm</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Tolxdorff</surname></persName>
		</editor>
		<meeting><address><addrLine>Wiesbaden</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Vieweg</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">An end-to-end approach to segmentation in medical images with CNN and posterior-CRF</title>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">S</forename><surname>Gamechi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Dubost</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Van Tulder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Bruijne</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.media.2021.102311</idno>
	</analytic>
	<monogr>
		<title level="j">Medical Image Analysis</title>
		<imprint>
			<biblScope unit="volume">76</biblScope>
			<biblScope unit="page">102311</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Learning with context feedback loop for robust medical image segmentation</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">B</forename><surname>Girum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Créhange</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lalande</surname></persName>
		</author>
		<idno type="DOI">10.1109/TMI.2021.3060497</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Medical Imaging</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="1542" to="1554" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Reducing the hausdorff distance in medical image segmentation with convolutional neural networks</title>
		<author>
			<persName><forename type="first">D</forename><surname>Karimi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Salcudean</surname></persName>
		</author>
		<idno type="DOI">10.1109/TMI.2019.2930068</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on medical imaging</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="499" to="513" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
