<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Enhancing Hierarchical Knowledge Editing for LLMs: Instance-to-Concept Relationship Perspective</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Zhaoyuan</forename><surname>Zhang</surname></persName>
							<email>zhaoyuanzhang@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tao</forename><surname>Luo</surname></persName>
							<email>luo_tao@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Xiaowang</forename><surname>Zhang</surname></persName>
							<email>xiaowangzhang@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sai</forename><surname>Zhang</surname></persName>
							<email>zhang_sai@tju.edu.cn</email>
							<affiliation key="aff0">
								<orgName type="department">College of Intelligence and Computing</orgName>
								<orgName type="institution">Tianjin University</orgName>
								<address>
									<postCode>300350</postCode>
									<settlement>Tianjin</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Enhancing Hierarchical Knowledge Editing for LLMs: Instance-to-Concept Relationship Perspective</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">2CDA9664C5067FAFC4AD1E853AD02B8C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:48+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Knowledge Editing</term>
					<term>Large Language Model</term>
					<term>Hierarchical Knowledge</term>
					<term>Layered Distillation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Knowledge editing has emerged as a promising way to update knowledge in large language models (LLMs) efficiently. However, current knowledge editing methods focus on undifferentiated factual knowledge and neglect the significance of editing hierarchically structured knowledge. Moreover, cognitive science has revealed the importance of hierarchical knowledge for human learning. This poster introduces hierarchical knowledge editing from the instance-to-concept relationship perspective. Through a Layered Distillation strategy, we perform knowledge distillation between the original and edited models, thereby preserving the instance-to-concept hierarchical knowledge of the original model. Experimental results demonstrate that integrating our strategy with existing knowledge editing methods enhances the performance of hierarchical knowledge editing.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the performance improvements and wide application of large language models (LLMs), their problems, such as providing outdated, erroneous, or toxic information, have become a focus of criticism. Retraining LLMs to address these issues is time-consuming and costly. In contrast, knowledge editing offers a low-cost way to update trained models, making the development of efficient and reliable knowledge editing methods for LLMs a key area of research <ref type="bibr" target="#b0">[1]</ref>. However, most existing LLM knowledge editing methods do not differentiate between types of knowledge and primarily concentrate on editing factual knowledge one item at a time, which is inefficient for LLMs whose massive number of parameters stores vast amounts of knowledge. Moreover, they lack effective retention of structured hierarchical information within LLMs. Based on these observations, this poster proposes hierarchical knowledge editing.</p><p>LLMs memorize various kinds of hierarchical knowledge; the hierarchical knowledge editing task covers editing both instance-to-concept and other inter-conceptual relationships. Focusing on the instance-to-concept relationship, we want to retain the integrity of this relationship in the original model after editing a concept. Consider a simple piece of instance-to-concept hierarchical knowledge in LLMs: the instance "tiger" belongs to the concept "feline," which is defined as a carnivorous mammal known for a flexible body and sharp claws. If the definition of "feline" is edited to "winged animal," then, since a tiger is wingless, human cognition naturally concludes that "tiger" no longer belongs to the edited "feline"; that is, we do not want this edit to distort the hierarchical knowledge that "tiger" belongs to the original "feline" in the LLM <ref type="bibr" target="#b1">[2]</ref>. Thus, it is vital to maintain the original instance-to-concept hierarchical knowledge when editing LLMs. While a recent work proposes editing conceptual knowledge, focusing on the effects of modifying concept definitions within LLMs <ref type="bibr" target="#b1">[2]</ref>, hierarchical knowledge editing concentrates on editing approaches that preserve hierarchical knowledge within LLMs.</p><p>In this poster, we define the hierarchical knowledge editing task with a corresponding metric, design the Layered Distillation strategy for preserving instance-to-concept hierarchical knowledge, and present experimental evidence of its effectiveness in enhancing hierarchical knowledge editing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Approach</head><p>LLMs memorize various kinds of hierarchical knowledge; the hierarchical knowledge editing task includes managing both instance-to-concept and other inter-conceptual relationships. Focusing on the instance-to-concept relationship, our goal is to design an editing approach that retains the integrity of this relationship when concept definitions are updated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Task Definition</head><p>The hierarchical knowledge editing task is formally defined as follows: given a concept 𝐶 = (𝑐, 𝑑), where 𝑐 is the concept name and 𝑑 is the concept definition, and a set of instances 𝐼, every instance 𝑖 in 𝐼 belongs to concept 𝐶, denoted 𝑖 ∈ 𝐶. When the definition 𝑑 is edited to 𝑑*, resulting in the modified concept 𝐶* = (𝑐, 𝑑*), an effective hierarchical knowledge editing approach must ensure that the modification of the concept does not result in 𝑖 ∈ 𝐶*, maintain the integrity of 𝐼, minimize instance migration, and avoid unacceptable changes to the original model.</p></div>
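The success condition above can be sketched in code. This is a minimal illustration, not part of the poster's implementation: `Concept`, `successful_hierarchical_edit`, and the `belongs_to` oracle are all hypothetical names, and in practice membership would be judged by querying the LLM rather than by string matching.

```python
from dataclasses import dataclass

@dataclass
class Concept:
    name: str        # concept name c
    definition: str  # concept definition d

def successful_hierarchical_edit(belongs_to, instances, c_orig, c_edit):
    """Check the task's core condition: after editing d to d*, no instance i
    in I should satisfy i ∈ C*.

    `belongs_to(i, c)` is a hypothetical membership oracle (in practice an
    LLM query) returning True iff instance i belongs to concept c."""
    return all(not belongs_to(i, c_edit) for i in instances)

feline = Concept("feline", "a carnivorous mammal with a flexible body and sharp claws")
winged = Concept("feline", "a winged animal")
# Toy oracle: a tiger matches only the original, carnivore-based definition.
oracle = lambda i, c: (i == "tiger") and ("carnivorous" in c.definition)
print(successful_hierarchical_edit(oracle, ["tiger"], feline, winged))  # True
```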
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Knowledge Editing with Layered Distillation</head><p>We introduce knowledge distillation to ensure that the edited model inherits the original model's hierarchical knowledge <ref type="bibr" target="#b2">[3]</ref>. We first use Locate-Then-Edit as the base editing method <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. This paradigm targets specific model layers for editing, enabling focused distillation between the original and edited model layers rather than over the entire model. Layered Distillation then updates the edited layers again, taking the corresponding original and edited model layers as input and using Mean Squared Error as the distillation loss. The overall process is shown in Figure <ref type="figure" target="#fig_0">1</ref>: the base method edits the concept definition in the original model, which causes instance migration; updating the edited model again through the Layered Distillation strategy retains the instance-to-concept relationship of the original model.</p></div>
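The distillation component of the strategy can be sketched for a single linear layer. This is a simplified illustration under stated assumptions: the real method operates on the located transformer layers and must balance distillation against the edit objective, whereas the sketch below shows only the MSE distillation step, with all names (`layered_distillation`, the probe inputs `X`) being illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def layered_distillation(W_orig, W_edit, X, lr=0.1, steps=200):
    """One-layer sketch: update the edited layer so its outputs match the
    original layer's outputs on probe inputs X, minimizing the MSE
    distillation loss L = mean((X @ W - X @ W_orig)**2) by gradient descent."""
    W = W_edit.copy()
    target = X @ W_orig                    # teacher: original model layer
    n = X.shape[0] * W.shape[1]
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ W - target) / n   # dL/dW
        W -= lr * grad
    return W

W_orig = rng.normal(size=(8, 4))                        # original layer
W_edit = W_orig + rng.normal(scale=0.5, size=(8, 4))    # after base editing
X = rng.normal(size=(32, 8))                            # probe activations
W_dist = layered_distillation(W_orig, W_edit, X)
mse = lambda W: np.mean((X @ W - X @ W_orig) ** 2)
print(mse(W_edit), mse(W_dist))  # distillation shrinks the layer-output MSE
```

In the full method, this pull toward the original layer is what preserves the instance-to-concept knowledge that the base edit would otherwise disturb.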
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Metrics</head><p>To more effectively assess the impact of our editing approach on hierarchical knowledge, we devise the Instance Retention (IR) metric, defined as follows:</p><formula xml:id="formula_0">𝐼𝑅 = 1 − (1/𝑛) ∑_{𝑖∈𝐶} [𝐺(𝑖, 𝐶*) + |𝐻(𝑖, 𝐶) − 𝐻(𝑖, 𝐶*)|]<label>(1)</label></formula><p>Instance Retention measures the proportion of instance-to-concept relationship knowledge preserved in the edited model. It is determined by two functions: 𝐺(𝑖, 𝐶), which indicates whether instance 𝑖 belongs to concept 𝐶 (𝐺 = 1 if it does, 𝐺 = 0 otherwise), and 𝐻(𝑖, 𝐶), which indicates whether the model cannot determine if instance 𝑖 belongs to concept 𝐶 (𝐻 = 1 in this case, 𝐻 = 0 otherwise). These functions are evaluated using the reasoning capability of the LLM <ref type="bibr" target="#b1">[2]</ref>. We also use Reliability (Rel.), Generalization (Gen.), and Locality (Loc.) as comprehensive metrics to evaluate the success, scope, and impact of knowledge editing <ref type="bibr" target="#b0">[1]</ref>. Utilizing ConceptEdit <ref type="bibr" target="#b1">[2]</ref>, derived from the DBpedia ontology dataset, we conducted experiments on the open-source LLMs GPT2-XL (1.5B) <ref type="bibr" target="#b5">[6]</ref> and TinyLlama (1.1B) <ref type="bibr" target="#b6">[7]</ref>, using the ROME <ref type="bibr" target="#b3">[4]</ref> and MEMIT <ref type="bibr" target="#b4">[5]</ref> methods combined with Layered Distillation (LD), on a GeForce RTX 4090 GPU. ConceptEdit's Intra and Inter modules represent modifications to concepts within and between superclasses, respectively.</p></div>
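Equation (1) can be computed directly once the per-instance judgments are available. A minimal sketch, assuming the G and H values have already been obtained from LLM membership queries; `instance_retention` and the dictionary inputs are illustrative names.

```python
def instance_retention(instances, G_edit, H_orig, H_edit):
    """Instance Retention (Eq. 1) over a set of instances of concept C:
    G_edit[i]  = 1 if the edited model judges i ∈ C*, else 0;
    H_orig[i]/H_edit[i] = 1 if membership of i is undecidable before/after
    the edit, else 0. Each migrated or newly-undecidable instance lowers IR."""
    n = len(instances)
    penalty = sum(G_edit[i] + abs(H_orig[i] - H_edit[i]) for i in instances)
    return 1.0 - penalty / n

insts = ["tiger", "lion", "cat"]
G = {"tiger": 0, "lion": 1, "cat": 0}    # "lion" migrated into C*
H = {"tiger": 0, "lion": 0, "cat": 0}    # all decidable before the edit
H2 = {"tiger": 1, "lion": 0, "cat": 0}   # "tiger" undecidable after the edit
print(instance_retention(insts, G, H, H2))  # 1 - 2/3 ≈ 0.333
```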
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head><p>The results in Table <ref type="table" target="#tab_0">1</ref> indicate that both editing methods struggle with hierarchical knowledge editing in both models, although the IR of the ROME method is significantly higher. Meanwhile, combining the Layered Distillation strategy with these methods enhances IR without notably altering the other metrics, particularly for ROME editing TinyLlama. This enhancement validates the efficacy of our strategy for editing instance-to-concept hierarchical knowledge. Furthermore, the relatively modest enhancement observed in the MEMIT+LD configuration can be attributed to MEMIT editing across multiple MLP layers, which accounts for its suboptimal performance on the GPT2-XL model but concurrently offers a higher degree of Locality.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion and Future Work</head><p>This poster proposes hierarchical knowledge editing for LLMs and a Layered Distillation strategy that enhances existing knowledge editing methods for editing instance-to-concept hierarchical knowledge. Our experiments provide initial evidence of the efficacy of Layered Distillation. Future research will broaden validation to additional methods and larger LLMs. Editing knowledge of other hierarchical relationships will also be explored.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The overall process of Knowledge Editing with Layered Distillation.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Main results of the Hierarchical Knowledge Editing experiment. "+LD" stands for combined with our strategy. Bold results denote optimal performance.</figDesc><table><row><cell>Model</cell><cell>Method</cell><cell cols="4">Intra</cell><cell cols="4">Inter</cell></row><row><cell></cell><cell></cell><cell>Rel.↑</cell><cell>Gen.↑</cell><cell>Loc.↑</cell><cell>IR↑</cell><cell>Rel.↑</cell><cell>Gen.↑</cell><cell>Loc.↑</cell><cell>IR↑</cell></row><row><cell>GPT2-XL</cell><cell>ROME</cell><cell>86.45</cell><cell>49.67</cell><cell>84.76</cell><cell>21.75</cell><cell>82.85</cell><cell>45.52</cell><cell>86.21</cell><cell>20.22</cell></row><row><cell></cell><cell>ROME+LD</cell><cell>85.06</cell><cell>48.29</cell><cell>84.43</cell><cell>25.12</cell><cell>80.37</cell><cell>44.53</cell><cell>85.62</cell><cell>25.31</cell></row><row><cell></cell><cell>MEMIT</cell><cell>43.97</cell><cell>33.09</cell><cell>96.11</cell><cell>2.70</cell><cell>39.28</cell><cell>29.77</cell><cell>95.93</cell><cell>3.34</cell></row><row><cell></cell><cell>MEMIT+LD</cell><cell>45.74</cell><cell>33.53</cell><cell>96.97</cell><cell>6.03</cell><cell>40.79</cell><cell>30.50</cell><cell>95.70</cell><cell>5.42</cell></row><row><cell>TinyLlama</cell><cell>ROME</cell><cell>97.55</cell><cell>77.28</cell><cell>92.83</cell><cell>18.84</cell><cell>96.91</cell><cell>74.81</cell><cell>92.98</cell><cell>21.52</cell></row><row><cell></cell><cell>ROME+LD</cell><cell>95.56</cell><cell>75.70</cell><cell>94.12</cell><cell>24.71</cell><cell>95.26</cell><cell>72.12</cell><cell>92.92</cell><cell>25.30</cell></row><row><cell></cell><cell>MEMIT</cell><cell>93.94</cell><cell>66.85</cell><cell>92.97</cell><cell>16.99</cell><cell>92.81</cell><cell>63.69</cell><cell>94.59</cell><cell>16.90</cell></row><row><cell></cell><cell>MEMIT+LD</cell><cell>93.76</cell><cell>66.56</cell><cell>93.18</cell><cell>18.75</cell><cell>92.73</cell><cell>62.91</cell><cell>94.54</cell><cell>18.63</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Xi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ni</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2401.01286</idno>
		<title level="m">A comprehensive study of knowledge editing for large language models</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Editing conceptual knowledge for large language models</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.06259</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Knowledge distillation: A survey</title>
		<author>
			<persName><forename type="first">J</forename><surname>Gou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Maybank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Vision</title>
		<imprint>
			<biblScope unit="volume">129</biblScope>
			<biblScope unit="page" from="1789" to="1819" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Locating and editing factual associations in gpt</title>
		<author>
			<persName><forename type="first">K</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Andonian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Belinkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 36th International Conference on Neural Information Processing Systems</title>
				<meeting>the 36th International Conference on Neural Information Processing Systems</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="17359" to="17372" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Mass-editing memory in a transformer</title>
		<author>
			<persName><forename type="first">K</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Andonian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Belinkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bau</surname></persName>
		</author>
		<ptr target="https://openreview.net/pdf?id=MkbcAHIYgyS" />
	</analytic>
	<monogr>
		<title level="m">The Eleventh International Conference on Learning Representations, ICLR 2023</title>
				<meeting><address><addrLine>Kigali, Rwanda</addrLine></address></meeting>
		<imprint>
			<publisher>OpenReview</publisher>
			<date type="published" when="2023">May 1-5, 2023. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Language models are unsupervised multitask learners</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Luan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">OpenAI blog</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page">9</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Lu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2401.02385</idno>
		<title level="m">Tinyllama: An open-source small language model</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
