<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Radical-Conditioned Diffusion Model for Oracle Bone Character Generation and Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zengmao Ding</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaoping He</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiao Li</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qi Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xin Yan</string-name>
          <email>xyan@bupt.edu.cn</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xia Zhang</string-name>
          <email>xzhang@bupt.edu.cn</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bang Li</string-name>
          <email>libang@aynu.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Science and Engineering, Ritsumeikan University</institution>
          ,
          <addr-line>1-1-1 Noji-higashi, Kusatsu, Shiga, 525-8577</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Key Laboratory of Oracle Bone Inscriptions Information Processing, Anyang Normal University</institution>
          ,
          <addr-line>Anyang</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Computer &amp; Information Engineering, Anyang Normal University</institution>
          ,
          <addr-line>Anyang</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>State Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications</institution>
          ,
          <addr-line>Beijing 100876</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>39</fpage>
      <lpage>50</lpage>
      <abstract>
        <p>Oracle bone inscriptions (OBI), the earliest form of Chinese writing, are composed of complex characters built from recurring radical components. Understanding how these components form full characters is critical to studying the semantics and structure of early writing systems. In this paper, we propose a radical-conditioned diffusion model that synthesizes plausible OBI characters given a set of radicals and their counts. Our method encodes radical identity and positional context into structured embeddings, which condition the generation process via cross-attention in a U-Net backbone. To better preserve radical morphology and visual coherence, we introduce a perceptual loss that adapts dynamically during denoising. Experiments show that our model not only generates visually consistent and structurally valid characters, but also improves multi-label classification when used for data augmentation. These results demonstrate the potential of component-level generation as a tool for character reconstruction and structural analysis in ancient scripts.</p>
      </abstract>
      <kwd-group>
        <kwd>Oracle Bone Inscriptions</kwd>
        <kwd>Multi-instance Image Generation</kwd>
        <kwd>DDPM</kwd>
        <kwd>Deep Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        However, it is important to emphasize that as an early stage in the development of Chinese
characters, OBI exhibits a certain regularity in its radical structure, but the forms of its
components are not absolutely fixed [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A key observation is that the same radical or component
often displays subtle yet significant morphological variations across different OBI character
forms. These differences are not arbitrary scribbles [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]; they frequently represent adaptive
adjustments made by the scribe. These adjustments aimed to better integrate the component
into the specific meaning or overall structure of the character it was part of, or were constrained
by factors like the writing implement and spatial layout [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These nuanced morphological
variations contain rich information about character construction, semantic associations, and
writing conventions. They constitute a valuable window for researching the character-creation
mindset and evolutionary patterns of OBI, holding extremely high value for systematic study.
      </p>
      <p>[Figure 1: Oracle bone characters (OBC) shown alongside their radical annotations.]</p>
      <p>
        To delve deeper into these characteristics of OBI radicals and their role in character formation,
we model the relationship between a radical in an oracle bone character and its instances in
different characters. As shown in Figure 1, the relationships between different radicals are not
simply linear combinations, but involve more complex structural compositions. This task presents
certain challenges, manifested primarily in data scarcity and in designing relational modeling
specifically tailored to the aforementioned characteristics. To address these challenges, we
utilize radical data from the YinQiWenYuan [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] database and leverage the powerful associative
generative capabilities of diffusion models. We train a radical-guided diffusion model for OBI
based on this data. The core idea is to input specific radical set information as a condition to
guide the model in generating target oracle bone characters containing that radical set. During
this process, we observe that the model spontaneously evolves a bias for different radicals
under this training paradigm. This bias is manifested as the model spontaneously capturing and
preserving the subtle morphological differences of radicals when they participate in forming
different characters (as mentioned above), as well as summarizing the positional regularity of
radicals within OBI characters. This bias exhibited during the generation process provides us
with a novel perspective to analyze the morphological characteristics and variation patterns of OBI
radicals, and their deep associations with character meaning and structure.
      </p>
      <p>In addition to structural analysis, radical-conditioned character generation provides a practical
benefit in low-resource settings. Given the scarcity and imbalance of radical annotations in
existing OBI datasets, we hypothesize that our model can generate structurally valid yet diverse
character forms to augment the training data. These synthetic samples may help improve
downstream classification tasks, particularly in the presence of rare radical combinations. We
validate this hypothesis through a controlled data augmentation experiment in Section 3.3.</p>
      <p>Overall, our main contributions can be summarized below:
• We propose a new generative task that maps oracle radicals to full characters, framing
character construction as a component-conditioned generation process.
• We show that the generation behavior of a diffusion model inherently reflects structural
biases, allowing us to analyze radical similarity, spatial regularity, and semantic stability
through its learned latent space.</p>
      <p>The remainder of this article is organized as follows. In Section 1, we review related work on
oracle bone recognition and multi-instance image generation. Section 2 details our proposed
radical-conditioned diffusion framework, including the embedding design and loss functions.
Section 3 presents experimental settings, evaluation metrics, and both quantitative and
qualitative analyses. Finally, Section 4 concludes the paper and discusses future directions.</p>
    </sec>
    <sec id="sec-2">
      <title>1. Related Work</title>
      <sec id="sec-2-1">
        <title>1.1. OBI Recognition</title>
        <p>
          The recognition of OBI aims to classify characters in hand-written or authentic OBI images.
Recent advancements [
          <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
          ] in Oracle Bone Inscription (OBI) recognition, particularly
to address challenges in recognizing complete characters, have highlighted the potential of
component-level analysis. For instance, frameworks like the Oracle Bone Inscription Component
Analysis proposed by Zhao et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] utilize image similarity metrics to extract and compare
radical components, revealing their structural roles across different OBI characters. Similarly,
studies on character evolution, such as those employing few-shot learning to trace morphological
changes from OBI to modern scripts, demonstrate that radicals undergo simplification, merging,
and stroke variations to adapt to diverse character compositions [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. These findings underscore
the necessity of modeling radical-specific variations to achieve robust character generation.
To support such modeling, large-scale datasets like HUST-OBC [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] provide a rich resource,
containing 77,064 deciphered and 62,989 undeciphered character images, many of which exhibit
significant radical variations due to evolving writing styles.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>1.2. Multi-instance Image Generation</title>
        <p>Multi-instance image generation (MIIG) focuses on synthesizing complex scenes containing
multiple objects with precise spatial relationships and instance-specific attributes.</p>
        <p>[Figure 2: Overview of the Radical-Conditioned Diffusion framework. Radical labels and count labels
are mapped to class and count embeddings, concatenated, and projected by an MLP; the resulting
conditioning vectors guide the U-Net (down blocks, mid block, up blocks) through cross-attention,
producing per-radical attention maps.]</p>
        <p>
          Early text-to-image models struggled with compositional consistency, leading to innovations in instance-level
control [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. InstanceDiffusion [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] pioneered unified instance conditioning via its UniFusion
module, supporting flexible location inputs (points, masks, boxes) and per-instance textual
descriptions. Large-scale text-to-image diffusion models like Stable Diffusion [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], GLIDE [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ],
Imagen [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], and DALL·E 2 [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] generate rich multi-instance scenes from compositional text prompts.
        </p>
        <p>These works collectively address core MIIG challenges: unifying diverse instance conditions,
mitigating inter-instance interference, and scaling relational reasoning. However, handling
overlapping instances and abstract spatial instructions remains challenging. Our radical-conditioned
diffusion task is particularly demanding due to the non-fixed morphology of radicals across
different characters and the need to model complex spatial and compositional relationships.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Methodology</title>
      <p>
        Our core objective is to develop a generative model capable of synthesizing OBI
characters under the guidance of specified radical components. Formally, we are given a set of
radicals ℛ = {r1, r2, ..., rK} and their corresponding frequencies within the target character
𝒞 = {c1, c2, ..., cK}, where rk denotes the type of radical and ck the number of times it
appears. The frequency ck of each radical is included because repetition is semantically and
structurally meaningful in oracle characters. For example, repeating a radical may imply intensity or
plurality, and some characters are explicitly formed by duplicating a component. Capturing
such information enables the model to generate character structures that more accurately reflect
historical compositional rules. Our model aims to generate a plausible OBI character image x0
that incorporates the specified radicals with their respective frequencies. The overview of the
Radical-Conditioned Diffusion framework is shown in Figure 2.
2.1. Conditioned Generation via Radical-Guided Diffusion
We employ a Denoising Diffusion Probabilistic Model [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] as the backbone generative framework.
The key innovation lies in how we effectively condition the diffusion process on the specified
radical set (ℛ, 𝒞).
      </p>
      <p>Radical Representation with Positional Context: To encode both the radical type and
its positional information within the target character’s composition, we design a specialized
embedding module. For each radical rk appearing ck times in the character, we generate
two embeddings: a type embedding e_type(k) ∈ R^d representing the semantic category of radical
rk, and a positional embedding e_pos(j) ∈ R^d representing the sequential order j (where j = 1, ..., ck) of
occurrence for that radical type within the character. These embeddings are concatenated as
[e_type(k); e_pos(j)] and projected into a unified conditioning vector e(k,j) ∈ R^d via a non-linear transformation:
e(k,j) = Proj([e_type(k); e_pos(j)])
where Proj(·) denotes a projection function implemented by a multi-layer perceptron (MLP)
with a GELU activation. The complete conditioning signal for the diffusion model is the set
of all such vectors {e(k,j)} for all radical types rk ∈ ℛ and their j = 1, ..., ck occurrences. This
structured embedding explicitly informs the model what radicals are needed and in what
sequence they are expected to appear, capturing potential positional biases observed in oracle
bone script composition.</p>
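      <p>As an illustration, the embedding pipeline described above can be sketched in plain Python. This is a minimal sketch, not the paper's implementation: the dimension D, the random lookup tables, the single linear layer standing in for the MLP Proj(·), and the tanh approximation of GELU are all assumptions made for the example.</p>

```python
import math
import random

random.seed(0)

D = 8                # embedding dimension d (assumed)
NUM_RADICALS = 185   # radical vocabulary size, as in the dataset
MAX_COUNT = 5        # maximum occurrences of one radical (assumed)

# Lookup tables for type and positional embeddings (random stand-ins).
type_table = [[random.gauss(0, 1) for _ in range(D)] for _ in range(NUM_RADICALS)]
pos_table = [[random.gauss(0, 1) for _ in range(D)] for _ in range(MAX_COUNT)]

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1 + math.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x ** 3)))

# Proj(.): a single linear layer + GELU mapping R^{2d} to R^d
# (a stand-in for the paper's MLP).
W = [[random.gauss(0, 0.1) for _ in range(2 * D)] for _ in range(D)]

def proj(vec):
    return [gelu(sum(w * v for w, v in zip(row, vec))) for row in W]

def conditioning_set(radicals, counts):
    """Build {e(k,j)}: one conditioning vector per (radical type k, occurrence j)."""
    vectors = []
    for k, c_k in zip(radicals, counts):
        for j in range(c_k):
            concat = type_table[k] + pos_table[j]  # [e_type(k) ; e_pos(j)]
            vectors.append(proj(concat))
    return vectors

# A character built from radical 3 (appearing twice) and radical 17 (once)
# yields three conditioning vectors of dimension D.
cond = conditioning_set([3, 17], [2, 1])
```

      <p>Note that the two occurrences of radical 3 receive different vectors because their positional embeddings differ, which is what lets the model distinguish repeated components.</p>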
      <p>Integration into Diffusion: These conditioning vectors {e(k,j)} are integrated into the
diffusion model’s U-Net backbone using cross-attention layers. At each denoising step t, the
intermediate features of the U-Net decoder attend to the conditioning embeddings, allowing the
generation process to be dynamically guided by the specified radical composition throughout
the diffusion trajectory.</p>
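      <p>The cross-attention mechanism can be sketched as follows. This is an illustrative single-head version in plain Python: the learned query/key/value projections and multi-head structure of a real U-Net cross-attention layer are omitted, and all tensor shapes are toy choices.</p>

```python
import math
import random

random.seed(1)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, cond_vectors):
    """Single-head cross-attention: each spatial U-Net feature (query)
    attends over the radical conditioning embeddings {e(k,j)}, which act
    here as both keys and values. Learned Q/K/V projections are omitted."""
    d = len(cond_vectors[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in cond_vectors]
        weights = softmax(scores)            # one weight per radical occurrence
        out.append([sum(w * v[i] for w, v in zip(weights, cond_vectors))
                    for i in range(d)])      # convex combination of the values
    return out

D = 4
queries = [[random.gauss(0, 1) for _ in range(D)] for _ in range(6)]  # 6 spatial positions
cond = [[random.gauss(0, 1) for _ in range(D)] for _ in range(3)]     # 3 radical embeddings
attended = cross_attention(queries, cond)
```

      <p>Because the softmax weights sum to one, every attended feature is a convex combination of the conditioning vectors, so the radical embeddings directly steer the intermediate features at each denoising step.</p>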
      <p>The standard training objective for the diffusion model is to minimize the noise prediction
error. Specifically, during the forward process of diffusion, the clean OBI character image x0 is
progressively corrupted by adding Gaussian noise at timestep t, resulting in x_t. The model ε_θ
aims to predict the noise ε added to x0. This standard diffusion loss is defined as:
ℒ_noise = E_{x0, ε∼𝒩(0,I), t}[ ‖ε − ε_θ(x_t, t | {e(k,j)})‖²₂ ].</p>
      <p>Here, ε_θ(x_t, t | {e(k,j)}) denotes our conditional diffusion model, which takes the noisy image
x_t, the timestep t, and the radical conditioning embeddings {e(k,j)} as input to predict the noise ε.</p>
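      <p>One training step of this noise-prediction objective can be sketched on a toy flattened "image". This is a sketch only: the linear beta schedule, the number of timesteps T, and the trivial all-zero predictor are assumptions, not values from the paper.</p>

```python
import math
import random

random.seed(2)

T = 1000
# Linear beta schedule (a common DDPM choice; assumed, not from the paper).
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alpha_bar = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bar.append(prod)  # cumulative product of (1 - beta_t)

def forward_noise(x0, t):
    """q(x_t | x_0): x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = [random.gauss(0, 1) for _ in x0]
    a = alpha_bar[t]
    xt = [math.sqrt(a) * x + math.sqrt(1 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

def noise_loss(eps, eps_pred):
    """L_noise = mean squared error between true and predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps, eps_pred)) / len(eps)

x0 = [random.gauss(0, 1) for _ in range(16)]  # a toy flattened "image"
xt, eps = forward_noise(x0, t=500)
loss = noise_loss(eps, [0.0] * len(eps))      # a trivial all-zero predictor
```

      <p>Inverting the same relation, x0 can be recovered exactly from x_t and the true noise, which is the identity the predicted clean image x̂0 in Section 2.2 relies on.</p>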
      <sec id="sec-3-1">
        <title>2.2. Content-Aware Perceptual Loss</title>
        <p>
          Standard diffusion training optimizes the prediction of the noise ε added at step t. To enhance
the semantic fidelity and structural coherence of generated characters, particularly respecting
the subtle morphological variations of radicals, we introduce a supplementary Content-Aware
Perceptual Loss ℒ_percep [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
        <p>This loss operates on the predicted clean image x̂0 at an intermediate denoising step t, derived
from the current noisy state x_t and the predicted noise ε̂:
x̂0 = (x_t − √(1 − ᾱ_t) ε̂) / √ᾱ_t,
where ᾱ_t is the cumulative product of the variance schedule. We extract multi-scale features
φ_l(·) from different layers l of a pre-trained feature extractor (a VGG16 network) for both the
true clean image x0 and the predicted x̂0.</p>
        <p>Crucially, ℒ_percep weights the contribution of different feature levels based on the current
timestep t:</p>
        <p>ℒ_percep = (1/Z_t) ∑_l w_l(t) · ‖φ_l(x0) − φ_l(x̂0)‖²₂
Here, w_l(t) is a time-dependent weighting function. Low-level features (capturing edges,
textures) are emphasized during high-noise stages (large t), as they are crucial for establishing
the fundamental radical shapes and layout early in denoising. Conversely, high-level features
(capturing semantic structures) are emphasized during low-noise stages (small t), refining the
semantic coherence and fine details of the radicals and their integration as the image nears
completion. Z_t is a normalization factor accounting for the number of active feature elements
at each timestep. This dynamic weighting ensures the loss focuses on the most relevant visual
aspects at each denoising phase, significantly improving the preservation of radical morphology
and overall character integrity.</p>
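        <p>The time-dependent weighting can be sketched as follows. The linear ramp form of w_l(t) and the per-timestep normalization are assumed choices for illustration; the paper does not specify the exact schedule, only that low-level layers dominate at high noise and high-level layers at low noise.</p>

```python
def layer_weights(t, T, num_layers):
    """w_l(t): emphasize low-level layers (small l) at high noise (large t)
    and high-level layers (large l) at low noise. The linear ramp is an
    assumed form; the paper does not give the exact schedule."""
    s = t / T  # 1.0 near pure noise, 0.0 near the clean image
    raw = [s * (num_layers - 1 - l) + (1 - s) * l for l in range(num_layers)]
    z = sum(raw)
    return [r / z for r in raw]

def perceptual_loss(feats_x0, feats_x0_hat, t, T):
    """L_percep = sum_l w_l(t) * ||phi_l(x0) - phi_l(x0_hat)||^2, taking
    precomputed per-layer features phi_l as flat lists of floats."""
    ws = layer_weights(t, T, len(feats_x0))
    total = 0.0
    for w, f, g in zip(ws, feats_x0, feats_x0_hat):
        total += w * sum((a - b) ** 2 for a, b in zip(f, g))
    return total
```

        <p>At t = T the first (low-level) layer receives the largest weight, while at t = 0 the last (high-level) layer does, matching the behavior described above.</p>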
      </sec>
      <sec id="sec-3-2">
        <title>2.3. Overall Training Objective</title>
        <p>The complete loss function for training our radical-guided diffusion model combines the standard
DDPM noise prediction loss ℒ_noise and the Content-Aware Perceptual Loss ℒ_percep:
ℒ_total = ℒ_noise + λ ℒ_percep
where λ is a hyperparameter balancing the contribution of each loss term. The model is trained
end-to-end, learning to generate semantically coherent and structurally accurate oracle bone
characters conditioned on the specified radical set and counts, while simultaneously developing
rich internal representations of radical morphology and compositional rules.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Experiment</title>
      <p>In this section, we evaluate our radical-conditioned diffusion model for oracle character
generation. Conventional metrics like FID or Inception Score are unsuitable, since oracle characters
admit multiple valid forms for the same radical set. Instead, we adopt structure-aware
evaluation: multi-label classification (Section 3.3), case studies (Section 3.4), and semantic embedding
visualization (Figure 5) to assess whether generated characters contain the intended radicals
while maintaining structural diversity and positional patterns observed in real OBI samples.</p>
      <p>[Figure 3: (a) Distribution of radical counts per character (number of characters per radical-set
size: 1898, 1071, 919, 243); (b) frequency distribution of radical occurrence (185 radicals in total;
maximum 555 occurrences, minimum 1).]</p>
      <sec id="sec-4-1">
        <title>3.1. Dataset</title>
        <p>
          We construct our dataset based on the HWOBC [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] collection, which contains handwritten oracle
character images covering 3,881 distinct character categories. A large portion of these categories
have been structurally annotated with radical information in the Yinqi Wenyuan project. Using
these annotations, we associate 185 distinct radicals with 3,767 oracle character categories,
resulting in 80,823 annotated character images. Each annotated character is associated with one
or more radicals that reflect its structural components. The distribution of radical annotations
per category is visualized in Figure 1, and the distribution of radicals is shown in Figure 3.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Implementation details</title>
        <p>All experiments were performed on a platform with an NVIDIA GeForce RTX 4090 graphics
card. The deep learning framework used was PyTorch 2.6.1 with CUDA 12.4. The number of epochs and
the batch size were set to 400 and 64, respectively. All models were trained using the AdamW optimizer,
with an initial and final learning rate of 0.0001, momentum of 0.937, and weight decay of 0.0005.
The learning rate followed a warm-up cosine annealing schedule, gradually increasing within
the first 3 epochs. In the experiment, nearly all models triggered early stopping around epoch 400,
indicating that sufficient training was conducted on the whole dataset.</p>
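        <p>The warm-up cosine annealing schedule described above can be sketched as below. The base rate 0.0001, 3 warm-up epochs, and 400 total epochs come from the setup; the minimum learning rate floor is an assumption, since the paper does not state one.</p>

```python
import math

def lr_schedule(epoch, total_epochs=400, warmup_epochs=3,
                base_lr=1e-4, min_lr=1e-6):
    """Warm-up cosine annealing: linear ramp over the first warmup_epochs,
    then cosine decay from base_lr toward min_lr (min_lr is an assumed
    floor; the paper states only the 0.0001 rate and 3 warm-up epochs)."""
    if epoch < warmup_epochs:
        # Linear warm-up: reaches base_lr at the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

        <p>The schedule ramps to the base rate over three epochs and then decays smoothly, which stabilizes the early steps of diffusion training.</p>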
        <p>
          To evaluate the effectiveness of our radical-guided diffusion model and assess the utility
of the generated samples in downstream multi-label classification tasks, we construct a
controlled experimental setup based on the YinQiWenYuan [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] oracle bone character dataset. Each
character in the dataset is annotated with one or more radicals, and the task is formulated as
multi-label classification over these radicals.
        </p>
        <p>Due to the highly imbalanced nature of radical distributions and the limited number of
samples for certain rare radicals, it is crucial to ensure that all radicals are observed during
training while avoiding trivial memorization of character samples. To this end, we adopt a
coverage-based greedy selection strategy to construct the training set. Specifically, we iteratively
select samples that contribute the most previously unseen radicals until the entire radical set
is covered. This ensures that the model is exposed to all radical types during training while
leaving a subset of unseen character forms for evaluation.</p>
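        <p>The coverage-based greedy selection can be sketched as follows, with each character represented by its set of radical labels. The mini-corpus and label names are hypothetical; only the greedy rule (always pick the sample adding the most unseen radicals) comes from the text.</p>

```python
def coverage_greedy_split(samples):
    """Coverage-based greedy selection: repeatedly pick the sample that
    contributes the most previously unseen radicals, until every radical
    in the corpus is covered. Returns the indices of the selected seed set."""
    all_radicals = set()
    for rads in samples:
        all_radicals |= rads
    covered, selected = set(), []
    while covered != all_radicals:
        # Among unselected samples, take the one with the largest gain.
        best = max((i for i in range(len(samples)) if i not in selected),
                   key=lambda i: len(samples[i] - covered))
        selected.append(best)
        covered |= samples[best]
    return selected

# Hypothetical mini-corpus: each character is its set of radical labels.
chars = [{"a", "b"}, {"b"}, {"c", "d", "e"}, {"e", "f"}, {"a", "f"}]
seed = coverage_greedy_split(chars)
```

        <p>The loop always terminates: while any radical is uncovered, some unselected sample must contain it, so every iteration can make progress.</p>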
        <p>The remaining samples are filtered to construct a test set such that each test sample contains
only radicals already present in the training set. This constraint ensures that the evaluation
focuses on compositional generalization rather than extrapolation to unseen radical types.
Finally, we randomly sample 10% of the total dataset to form the test set, with the remaining
90% used for training. Both the radical-guided diffusion model and the baseline multi-label
classifier are trained on this split.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Performance analysis</title>
        <p>To evaluate diffusion-based augmentation, we generate 8,000 synthetic oracle bone character
images for each radical set size (1–5) using our radical-conditioned diffusion model. The
synthetic data are combined with the original training set to retrain the multi-label classifier.</p>
        <p>Importantly, the classification model used to evaluate augmentation effects is trained from
scratch under two conditions: with and without the inclusion of diffusion-generated data. This
enables a direct comparison of the impact of synthetic data on the classifier’s performance
across multiple metrics.</p>
        <p>Table 1 summarizes the performance of the classifier before and after introducing
diffusion-generated samples into the training set. Overall, the inclusion of generated samples leads to
consistent improvements across all evaluation metrics. Notably, the Average Precision improves
from 0.671 to 0.702, and the Subset Accuracy increases from 0.286 to 0.307. The Hamming Loss
is also slightly reduced, indicating better precision in multi-label predictions.</p>
        <p>These results demonstrate that the diffusion model not only generates plausible character
forms conditioned on radical sets but also introduces useful variance into the training data
that benefits generalization. The improvement in recall and F1 score further suggests that the
model becomes more capable of identifying rare or co-occurring radicals after exposure to the
synthetic examples. This validates our hypothesis that the morphological diversity captured by
the diffusion model can enhance downstream classification performance.</p>
        <p>3.4. Case study</p>
        <p>[Figure 4: Qualitative case study. For easy and complex instances, each column shows the input
radical(s), generation samples, and real samples.]</p>
        <p>To further evaluate the fidelity and diversity of generated samples, we conduct a qualitative
case study comparing generated characters with real ones sharing the same radical components,
as shown in Figure 4. For each example, we provide the input radical(s), multiple generated
results, and real oracle bone characters containing those radicals.</p>
        <p>In the Easy Instance setting, radicals are typically standalone or structurally dominant.
The generated characters preserve visual similarity to real samples while capturing stylistic
nuances—stroke thickness, spatial balance, and subtle morphological variations typical of
handwritten oracle scripts.</p>
        <p>In the Complex Instance setting, inputs include multiple radicals with diverse layouts and
structural entanglement. The model produces plausible characters that retain radical identity and
arrangement, often applying adaptive transformations—stretching, rotation, or compression—to
emulate real OBI spatial strategies. These results demonstrate the model’s ability to internalize
compositional flexibility and radical-level variation.</p>
        <p>Overall, the visual results support our claim that the radical-conditioned diffusion model
effectively captures both the identity and adaptive behavior of radicals in context, enabling
realistic and semantically meaningful character generation.</p>
        <p>To investigate how the model organizes radical information, as shown in Figure 5, we
extract the embedding vectors e(k,j) and visualize them with t-distributed Stochastic Neighbor</p>
        <p>
          Embedding (t-SNE) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. We find that radicals with similar morphological structures cluster
closely in the embedding space, suggesting the model captures structural similarity. However,
this can cause confusion during generation, as the model may struggle to distinguish between
similar radicals, especially those sharing stroke patterns or symmetry. In contrast, radicals
with larger shape differences are easier to cluster and preserve stably during generation, with
less distortion or substitution. This implies a trade-off between embedding expressiveness and
discriminability, which may be improved via contrastive regularization or additional supervision.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>In this work, we propose a radical-conditioned diffusion model for oracle bone character
generation, which effectively captures the morphological variations and combinatorial patterns
of radicals. By incorporating embeddings that encode both radical identity and positional
information, the model preserves subtle visual features of radicals across different character
contexts. Our experiments demonstrate that such a generative approach facilitates structural
understanding of oracle bone script and models the compositional relationships among specific
radicals. Future work may integrate structural priors or contrastive learning to further enhance
radical disambiguation, and extend the task to generate corresponding glyphs from different
historical stages using radical-based priors derived from oracle bone inscriptions.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research was supported by the Natural Science Foundation of China (Grant No. 62506007),
the Natural Science Foundation of Henan Province (Grant No. 242300420680), the Paleography
and Chinese Civilization Inheritance and Development Program (Grant Nos. G1807, G1806,
G2821), the Henan Province Science and Technology Research Project (Grant Nos. 242102210116,
252102321071), Major Science and Technology Project of Anyang (Grant No. 2025A02SF007) and
the Henan Province High-Level Talents International Training Program (Grant No. GCC2025028).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jiao</surname>
          </string-name>
          , et al.,
          <article-title>Oracle bone inscriptions components analysis based on image similarity</article-title>
          ,
          <source>in: 2020 IEEE 9th joint international information technology and artificial intelligence conference (ITAIC)</source>
          , volume
          <volume>9</volume>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1666</fpage>
          -
          <lpage>1670</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>On the improvement of the oracle radical system</article-title>
          ,
          <source>Lexicographical Studies</source>
          (
          <year>2013</year>
          )
          <fpage>27</fpage>
          -
          <lpage>33</lpage>
          . doi:10.16134/j.cnki.cn31-1997/g2.2013.05.004.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>On the radicals in oracle bone inscriptions: The earliest set of pictographs in china</article-title>
          ,
          <source>Journal of Ancient Books Collation and Studies</source>
          (
          <year>2002</year>
          )
          <fpage>32</fpage>
          -
          <lpage>35</lpage>
          . In Chinese.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A Study on the Evolution of the Radical System in Oracle Bone Inscriptions, Master's thesis</article-title>
          , Zhengzhou University,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>HWOBC: A handwriting oracle bone character recognition database</article-title>
          , in:
          <source>Journal of Physics: Conference Series</source>
          , volume
          <volume>1651</volume>
          , IOP Publishing,
          <year>2020</year>
          , p.
          <fpage>012050</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <article-title>Yinqi Wenyuan: Oracle bone inscriptions information platform</article-title>
          , https://jgw.aynu.edu.cn/,
          <year>2025</year>
          . Maintained by Anyang Normal University in collaboration with the Chinese Academy of Social Sciences, updated in 2025.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <article-title>OraclePoints: A hybrid neural representation for oracle character</article-title>
          ,
          <source>in: Proceedings of the 31st ACM International Conference on Multimedia, MM '23</source>
          ,
          Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , p.
          <fpage>7901</fpage>
          -
          <lpage>7911</lpage>
          . URL: https://doi.org/10.1145/3581783.3612534. doi:10.1145/3581783.3612534.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.-F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Mix-up augmentation for oracle character recognition with imbalanced data distribution</article-title>
          , in:
          <source>Document Analysis and Recognition - ICDAR 2021: 16th International Conference, Lausanne, Switzerland, September 5-10, 2021, Proceedings, Part I</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2021</year>
          , p.
          <fpage>237</fpage>
          -
          <lpage>251</lpage>
          . URL: https://doi.org/10.1007/978-3-030-86549-8_16. doi:10.1007/978-3-030-86549-8_16.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Unsupervised structure-texture separation network for oracle character recognition</article-title>
          ,
          <source>IEEE Transactions on Image Processing</source>
          <volume>31</volume>
          (
          <year>2022</year>
          )
          <fpage>3137</fpage>
          -
          <lpage>3150</lpage>
          . doi:10.1109/TIP.2022.3165989.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-g.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <article-title>Image translation for oracle bone character interpretation</article-title>
          ,
          <source>Symmetry</source>
          <volume>14</volume>
          (
          <year>2022</year>
          )
          <fpage>743</fpage>
          . URL: https://api.semanticscholar.org/CorpusID:247959908.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <article-title>Study on the evolution of chinese characters based on few-shot learning: From oracle bone inscriptions to regular script</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>17</volume>
          (
          <year>2022</year>
          )
          <fpage>e0272974</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Bai</surname>
          </string-name>
          , et al.,
          <article-title>An open dataset for oracle bone character recognition and decipherment</article-title>
          ,
          <source>Scientific Data</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>976</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thickstun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gulrajani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <article-title>Diffusion-LM improves controllable text generation</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>4328</fpage>
          -
          <lpage>4343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Darrell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Rambhatla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Girdhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <article-title>InstanceDiffusion: Instance-level control for image generation</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>6232</fpage>
          -
          <lpage>6242</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rombach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blattmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lorenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Esser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ommer</surname>
          </string-name>
          ,
          <article-title>High-resolution image synthesis with latent diffusion models</article-title>
          ,
          <source>in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2022</year>
          . URL: http://dx.doi.org/10.1109/cvpr52688.2022.01042. doi:10.1109/cvpr52688.2022.01042.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nichol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>McGrew</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Saharia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Whang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghasemipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Gontijo</given-names>
            <surname>Lopes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Karagol</given-names>
            <surname>Ayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Salimans</surname>
          </string-name>
          , et al.,
          <article-title>Photorealistic text-to-image diffusion models with deep language understanding</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>36479</fpage>
          -
          <lpage>36494</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nichol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Hierarchical text-conditional image generation with clip latents</article-title>
          ,
          <source>arXiv preprint arXiv:2204.06125</source>
          <volume>1</volume>
          (
          <year>2022</year>
          )
          <fpage>3</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <article-title>Denoising diffusion probabilistic models</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>6840</fpage>
          -
          <lpage>6851</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <article-title>FontDiffuser: One-shot font generation via denoising diffusion with multi-scale content aggregation and style contrastive learning</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>38</volume>
          ,
          <year>2024</year>
          , pp.
          <fpage>6603</fpage>
          -
          <lpage>6611</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>van der Maaten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <article-title>Visualizing data using t-SNE</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>9</volume>
          (
          <year>2008</year>
          )
          <fpage>2579</fpage>
          -
          <lpage>2605</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>