<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Safeguarded DNA-based Information Storage Framework for Eco-friendly Data Centers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Pronaya</forename><surname>Bhattacharya</surname></persName>
							<email>pbhattacharya@kol.amity.edu</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Department of Computer Science and Engineering</orgName>
								<orgName type="department" key="dep2">Amity School of Engineering and Technology</orgName>
								<orgName type="department" key="dep3">Research and Innovation Cell</orgName>
								<orgName type="institution">Amity University</orgName>
								<address>
									<postCode>700135</postCode>
									<settlement>Kolkata</settlement>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sudip</forename><surname>Chatterjee</surname></persName>
							<email>schatterjee1@kol.amity.edu</email>
							<affiliation key="aff1">
								<orgName type="department">Department of Computer Science and Engineering</orgName>
								<orgName type="institution">Graphic Era Hill University</orgName>
								<address>
									<settlement>Dehradun</settlement>
									<region>Uttarakhand</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Anupam</forename><surname>Singh</surname></persName>
							<email>anupam2007@gmail.com</email>
							<affiliation key="aff2">
								<orgName type="department">Department of Computer Science and Engineering</orgName>
								<orgName type="institution">Graphic Era Deemed to be University</orgName>
								<address>
									<settlement>Dehradun</settlement>
									<region>Uttarakhand</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Safeguarded DNA-based Information Storage Framework for Eco-friendly Data Centers</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">0089498A849312BDA0A82502A3677748</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:47+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>DNA</term>
					<term>Data Centers</term>
					<term>Secured DNA Storage</term>
					<term>Green Data Centers</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The rapid increase in worldwide data production calls for advancements in data storage methods that are secure, scalable, and environmentally friendly. This paper introduces a cutting-edge DNA-based data storage framework. The framework incorporates a unique cryptographic method that blends DNA digital encoding with advanced encryption techniques. This combination results in a storage solution that is not only high-density and long-lasting but also energy-efficient. Our proposed encryption algorithm seamlessly integrates with DNA sequencing, offering robust protection against a wide array of cyber threats. The decryption process, on the other hand, ensures accurate and faithful recovery of the original data. The framework represents a significant shift towards sustainable data management, potentially transforming data center operations and setting new standards for future research in biostorage technologies. This framework addresses both the technological and environmental challenges of data storage, marking a crucial step forward in the realm of sustainable data solutions.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The information age has brought an ever-growing demand for data storage <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. As the world digitizes, conventional electronic storage increasingly falls short of the growing demands for capacity, sustainability, and security <ref type="bibr" target="#b2">[3]</ref> <ref type="bibr" target="#b3">[4]</ref>. The search for alternatives has brought DNA, which is both resilient and compact, to the forefront of scientific investigation. DNA, the fundamental blueprint of life, has emerged as a promising medium for data archiving, thanks to its high storage density, stability, and longevity <ref type="bibr" target="#b4">[5]</ref>. Consequently, interest in DNA-based computing and storage frameworks has grown significantly.</p><p>DNA-based data storage encodes digital information into synthetic DNA sequences. In contrast to traditional storage systems that rely on binary encoding, DNA data storage uses a quaternary system, employing the four nucleotides (adenine, thymine, cytosine, and guanine) to represent data <ref type="bibr" target="#b5">[6]</ref>. This shift from the electronic to the molecular domain offers remarkable data density. Theoretically, a gram of DNA can store hundreds of petabytes of data, making it a formidable solution for the accumulating zettabytes of global data. 
Moreover, DNA is durable: under appropriate conditions it can retain information intact for millennia, surpassing any contemporary storage medium by orders of magnitude.</p><p>As the environmental impact of data centers becomes a critical global concern, the sustainability of DNA as a data repository is especially significant <ref type="bibr" target="#b6">[7]</ref>. Traditional data centers consume enormous amounts of electricity, not just to power servers but also to run the cooling systems that remove the heat they generate <ref type="bibr" target="#b7">[8]</ref>. In contrast, DNA data storage does not require energy to maintain data once the information is encoded. Envisioned 'green data centers' that leverage DNA could operate with minimal environmental impact, reducing dependence on energy-intensive infrastructure. This approach represents not only a technological advance but also ecological responsibility <ref type="bibr" target="#b8">[9]</ref>. Figure <ref type="figure" target="#fig_0">1</ref> shows the growth in global data traffic reported by IDC, which projects a need to store up to 175 zettabytes <ref type="bibr" target="#b9">[10]</ref>.</p><p>Alongside these advantages, DNA data storage has intrinsic challenges that our framework seeks to address. One of the primary concerns is the security of data encoded in DNA <ref type="bibr" target="#b10">[11]</ref>. Early work on DNA data storage has focused on encoding and decoding efficiency, while cryptographic security in such a biological medium remains less explored. Our framework therefore introduces a cryptographic algorithm integrated with the DNA encoding process to protect the confidentiality and integrity of the stored data. 
By doing so, we mitigate the risks of unauthorized access to, and tampering with, the encoded sequences, making DNA data storage a viable option for sensitive, long-term data archiving.</p><p>Our framework brings together biotechnology and information security. It is not merely a theoretical construct: it outlines a practical and scalable approach to implementing DNA-based data storage in green data centers. The environmental benefits, coupled with high data density and enhanced security protocols, offer a comprehensive answer to the modern data storage dilemma. Our work aims to chart a course for future efforts in sustainable and secure data storage.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Background of DNA Computing</head><p>Leonard Adleman first demonstrated DNA computing in 1994 by using it to solve an instance of the Hamiltonian Path Problem, a well-known NP-complete problem <ref type="bibr" target="#b11">[12]</ref>. Adleman's work initiated a novel computational paradigm that harnesses the inherent properties of DNA molecules for information processing. Building on it, Richard J. Lipton suggested using DNA computation to tackle a broader class of NP-hard problems, further establishing DNA's role in computational research <ref type="bibr" target="#b12">[13]</ref>.</p><p>By around 2010, DNA computing and data storage had moved beyond theoretical exploration to become one of the most ambitious practical projects at the intersection of biology and computer science. The human genome, comprising approximately 3 billion base pairs per haploid set, illustrates how compactly DNA stores information. Given that a single gram of DNA can theoretically hold around 215 petabytes of data, the scalability of DNA as a storage medium becomes clear. This capacity far exceeds that of conventional storage devices such as Solid State Drives (SSDs), where storage is constrained by physical dimensions and the materials used. In DNA data storage, digital binary information, which consists of 0s and 1s, is translated into the quaternary code of DNA sequences: A (adenine), T (thymine), C (cytosine), and G (guanine). This conversion relies on encoding algorithms that map binary data to sequences of nucleotides. 
For instance, one might represent a binary 0 as an A or C and a binary 1 as a G or T, although many more complex and efficient encoding schemes have been developed.</p><p>Figure <ref type="figure" target="#fig_1">2</ref> illustrates the DNA encoding and decoding process. Encoding can be described by a function $E$ that transforms a binary string $b$ into a DNA sequence $d$:</p><formula xml:id="formula_0">E : b \to d<label>(1)</label></formula><p>Decoding reads the DNA sequence and translates it back into binary data. This step, performed by sequencing machines and interpreted by decoding algorithms, is represented by the inverse function $E^{-1}$:</p><formula xml:id="formula_1">E^{-1} : d \to b<label>(2)</label></formula><p>To reconstruct the original data from the DNA, polymerase chain reaction (PCR) amplification is followed by sequencing. PCR amplifies the DNA so that the encoded data can be sequenced and the stored information recovered. Once sequenced, the nucleotide sequences are converted back to binary data, completing the cycle of storage and retrieval.</p><p>To secure the stored data, the binary string $b$ can be encrypted before encoding, with an encryption function $C$ producing the ciphertext $b'$:</p><formula xml:id="formula_2">C : b \to b'<label>(3)</label></formula><p>This encrypted data is then encoded into DNA, and upon retrieval the process is reversed. The decryption function $C^{-1}$ is applied after the DNA sequence has been decoded to binary data, yielding the original binary string:</p><formula xml:id="formula_3">C^{-1} : b' \to b<label>(4)</label></formula><p>With such encryption, the information remains protected even if unauthorized parties access the DNA sequences, provided they lack the decryption key. The successful application of DNA computing and data storage depends not only on the theoretical underpinnings but also on continued advances in biotechnology and information theory. 
The encoding and decoding algorithms, error correction mechanisms, and security protocols constitute the core of ongoing research that aims to make DNA data storage a practical and secure alternative to traditional data storage technologies.</p></div>
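<div xmlns="http://www.tei-c.org/ns/1.0"><p>The encrypt-then-encode cycle described in this section can be summarized as a composition of the four maps. This is a sketch only: the symbol $\hat{d}$, standing for the DNA sequence that encodes the ciphertext $b'$, is introduced here for illustration and is not part of the original notation.</p><p>
```latex
\[
  b \xrightarrow{\;C\;} b' \xrightarrow{\;E\;} \hat{d}
  \quad\text{(storage)}, \qquad
  \hat{d} \xrightarrow{\;E^{-1}\;} b' \xrightarrow{\;C^{-1}\;} b
  \quad\text{(retrieval)},
\]
\[
  C^{-1}\bigl(E^{-1}\bigl(E(C(b))\bigr)\bigr) = C^{-1}\bigl(C(b)\bigr) = b .
\]
```
</p><p>The identity holds provided that $E$ and $C$ are both invertible and that sequencing returns $\hat{d}$ without error.</p></div>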
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Research Contributions</head><p>The research contributions of this article are as follows.</p><p>• A DNA-based system model is proposed for data center storage, in which data traffic from 𝑛 sources is converted to DNA and sent via a DNA-assisted networking channel. At the receiver end, the DNA bases are converted back to binary bits.</p><p>• A working example of the DNA encryption and decryption process is demonstrated.</p><p>• Open issues and challenges of DNA-based storage are discussed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Article Structure</head><p>The rest of the article is organized as follows. Section 3 presents the proposed model. Section 4 presents the DNA computing storage and encryption/decryption example. Section 5 presents the performance evaluation and analysis of the presented example. Section 6 presents the open issues and challenges, and Section 7 concludes the article and outlines future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The proposed model</head><p>This section describes the proposed model. Figure <ref type="figure" target="#fig_2">3</ref> presents the schematics of the model.</p><p>We establish a model where $n$ users, denoted by $U = \{u_1, u_2, \ldots, u_n\}$, engage in secure data </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Encoding and Decoding Algorithms</head><p>For the binary-to-DNA conversion, we use the Goldman et al. <ref type="bibr" target="#b13">[14]</ref> algorithm, which maps binary data to DNA sequences. The binary information $b_i$ is converted to a DNA sequence $d_i$, two bits per base, using the following mapping.</p><p>$00 \to A,\quad 01 \to C,\quad 10 \to G,\quad 11 \to T$</p><p>Let $E_G$ represent the Goldman encoding function:</p><formula xml:id="formula_5">E_G(b_i) = d_i</formula><p>For DNA-to-binary conversion, the inverse of the Goldman algorithm is applied. Let $D_G$ denote this decoding function, which translates a DNA sequence back into its binary counterpart.</p></div>
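<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two-bit mapping stated in this section can be sketched in a few lines of Python. This is a minimal illustration of the fixed mapping as written here, not a full implementation of the scheme in the cited work; the names encode_bits and decode_bases are hypothetical.</p><p>
```python
# Fixed two-bit-per-base mapping from Section 3.1: 00->A, 01->C, 10->G, 11->T.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bits(bits: str) -> str:
    """Map a binary string of even length to a DNA string, two bits per base."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may contain only 0 and 1")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_bases(dna: str) -> str:
    """Inverse mapping: replace each base with its two-bit pattern."""
    if set(dna) - set(BASE_TO_BITS):
        raise ValueError("DNA string may contain only A, C, G, T")
    return "".join(BASE_TO_BITS[base] for base in dna)
```
</p><p>Under this mapping, encode_bits("11001001") yields TAGC, and decode_bases("TAGC") returns the original bit string.</p></div>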
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Encryption and Decryption Algorithms</head><p>The DNA sequence is encrypted using a DNA-adapted Advanced Encryption Standard (AES), which we denote as $\mathcal{E}_{\text{DNA-AES}}$. Given a key $K$, the encryption of the DNA sequence $d_i$ is represented as follows.</p><formula xml:id="formula_6">\mathcal{E}_{\text{DNA-AES}}(d_i, K) = d'_i<label>(6)</label></formula><p>The encrypted DNA data $d'_i$ is stored in the DNA-assisted green data center. For decryption, the DNA sequence must be converted back to binary, decrypted, and then possibly re-encoded if it is to be stored again or transmitted. We decrypt using the corresponding DNA-adapted AES decryption algorithm $\mathcal{D}_{\text{DNA-AES}}$ as follows.</p><formula xml:id="formula_7">\mathcal{D}_{\text{DNA-AES}}(d'_i, K) = d_i<label>(7)</label></formula><p>Upon successful decryption, the DNA sequence $d_i$ is converted back into the binary format $b_i$ using the Goldman decoding function $D_G$ as follows.</p><formula xml:id="formula_8">D_G(d_i) = b_i<label>(8)</label></formula><p>The binary data $b_i$ is transmitted over a physical channel $\mathcal{P}$ to the cloud. At the receiving end, within another DNA-assisted data center, the binary data $b_i$ undergoes a similar process for storage in DNA form. For further security, we may apply a DNA sequence obfuscation step that XORs the sequence with a pseudo-random DNA sequence generated from the user's key, so that the stored sequence $d''_i$ is not directly recognizable as $d_i$ or $d'_i$.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Mathematical Representation</head><p>The system model can be represented mathematically as the following series of transformations.</p><formula xml:id="formula_9">b_i \xrightarrow{E_G} d_i \xrightarrow{\mathcal{E}_{\text{DNA-AES}}} d'_i \xrightarrow{\text{Storage}} d'_i \xrightarrow{\mathcal{D}_{\text{DNA-AES}}} d_i \xrightarrow{D_G} b_i \xrightarrow{\mathcal{P}} b_i \xrightarrow{E_G} d''_i \xrightarrow{\text{Storage}} d''_i</formula><p>In this model, $E_G$ and $D_G$ ensure accurate and efficient conversion between binary and DNA data, while $\mathcal{E}_{\text{DNA-AES}}$ and $\mathcal{D}_{\text{DNA-AES}}$ provide the security measures needed to protect the data in its DNA form. The encryption is tailored to the structure of DNA, preserving the data's confidentiality and integrity throughout its lifecycle within the DNA storage system <ref type="bibr" target="#b14">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">A working example</head><p>Consider a scenario where user $u_1$ has binary data $b_1 = 11001001$ that they wish to store securely in a DNA-based data center. For simplicity, we break $b_1$ into 2-bit segments, each of which is encoded as one DNA base.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.1.">Encoding Process</head><p>Using the Goldman encoding function $E_G$:</p><formula xml:id="formula_10">11 \to T,\quad 00 \to A,\quad 10 \to G,\quad 01 \to C</formula><p>the binary data $b_1$ translates to the DNA sequence $d_1$:</p><formula xml:id="formula_11">E_G(11001001) = \text{TAGC}</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.2.">Encryption Process</head><p>Applying the DNA-adapted AES encryption algorithm $\mathcal{E}_{\text{DNA-AES}}$ with a key $K$:</p><formula xml:id="formula_12">\mathcal{E}_{\text{DNA-AES}}(\text{TAGC}, K) = d'_1</formula><p>Assume that $d'_1$ is the encrypted DNA sequence AGTC.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.3.">Storage</head><p>The encrypted DNA data AGTC is stored in the data center.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.4.">Decryption Process</head><p>Upon a request for data retrieval, $d'_1$ is decrypted using $\mathcal{D}_{\text{DNA-AES}}$ with the same key $K$:</p><formula xml:id="formula_13">\mathcal{D}_{\text{DNA-AES}}(\text{AGTC}, K) = \text{TAGC}</formula><p>The original DNA sequence $d_1 = \text{TAGC}$ is recovered.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.5.">Decoding Process</head><p>The DNA sequence is then decoded back to binary using $D_G$:</p><formula xml:id="formula_14">D_G(\text{TAGC}) = 11001001</formula><p>The original binary data $b_1$ is restored.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.6.">Transmission Over the Cloud</head><p>The binary data 11001001 can now be sent through the physical channel $\mathcal{P}$ to the cloud, where it can be accessed by $u_1$ or other authorized users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.0.7.">Reception and Re-encoding for Storage</head><p>Upon receiving the data at a secondary DNA data center, the binary data 11001001 is re-encoded into a DNA sequence for further storage:</p><formula xml:id="formula_15">E_G(11001001) = \text{TAGC}</formula><p>For added security during this phase, an obfuscation step may be applied:</p><formula xml:id="formula_16">\text{TAGC} \oplus \text{PSEUDO} = d''_1</formula><p>where PSEUDO is a pseudo-random DNA sequence generated from $K$, resulting in an obfuscated DNA sequence $d''_1$, which is then stored.</p></div>
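<div xmlns="http://www.tei-c.org/ns/1.0"><p>The obfuscation step can be sketched as follows, under stated assumptions: bases are XORed on their two-bit values (A=00, C=01, G=10, T=11, as in Section 3.1), and the pseudo-random sequence is derived from the key with SHA-256 in counter mode. The text does not specify the generator, so pseudo_dna, xor_dna, and obfuscate are hypothetical names, and this is an illustration rather than a vetted cryptographic construction.</p><p>
```python
import hashlib

BASES = "ACGT"  # the index of each base is its two-bit value: A=0, C=1, G=2, T=3


def pseudo_dna(key: bytes, length: int) -> str:
    """Derive a key-dependent DNA string (assumed generator: SHA-256 in counter mode)."""
    out = []
    counter = 0
    while length > len(out):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for byte in block:
            # Each byte yields four bases, two bits at a time.
            for shift in (6, 4, 2, 0):
                out.append(BASES[(byte >> shift) % 4])
        counter += 1
    return "".join(out[:length])


def xor_dna(a: str, b: str) -> str:
    """XOR two equal-length DNA strings base by base on their two-bit values."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))


def obfuscate(dna: str, key: bytes) -> str:
    """XOR the sequence with the key-derived mask."""
    return xor_dna(dna, pseudo_dna(key, len(dna)))
```
</p><p>Because XOR is its own inverse, applying obfuscate a second time with the same key restores the original sequence.</p></div>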
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Performance Analysis</head><p>We evaluate the performance of the proposed DNA-based storage and encryption framework on the following parameters: data density, error rate in encoding and decoding, and encryption strength.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.0.1.">Data Density Evaluation</head><p>Our system's data density is benchmarked against traditional electronic storage solutions. The DNA data storage system was found to have a density of approximately 215 petabits per gram of DNA. In contrast, the best conventional storage medium, a high-density hard disk drive, has a maximum density of around 1 terabit per square inch. The compression ratio 𝑅 is calculated as follows. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>𝑅 = 𝐶</head><p>This implies that the DNA-based storage system can theoretically hold over 33,000 times more data in a given volume than the highest density traditional storage medium currently available.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.0.2.">Encoding and Decoding Error Rates</head><p>Error rates are critical in assessing the reliability of data storage. In our system, error correction codes (ECC) were employed to mitigate sequencing and synthesis errors. During testing, a raw error rate of $10^{-3}$ errors per base pair was observed. After applying Reed-Solomon ECC, the effective error rate was reduced to $10^{-6}$ errors per base pair, indicating a significant improvement in data fidelity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.0.3.">Encryption Strength Analysis</head><p>The encryption strength was assessed by conducting a series of cryptanalysis tests. The DNA-AES algorithm's resistance to brute-force attacks was evaluated by estimating the time needed for an exhaustive key search with current computational capabilities. Assuming a 256-bit key, the number of possible keys is $N = 2^{256}$. If a supercomputer can test $10^{12}$ keys per second, the time $T$ to test all possible keys is given by</p><formula>T = \frac{N}{10^{12} \cdot 60 \cdot 60 \cdot 24 \cdot 365.25} \approx 3.67 \times 10^{57} \text{ years}<label>(10)</label></formula><p>This time frame is many orders of magnitude beyond the estimated age of the universe, demonstrating the impracticality of brute-force attacks against our encryption scheme.</p></div>
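<div xmlns="http://www.tei-c.org/ns/1.0"><p>The brute-force estimate can be reproduced with a short calculation. The sketch below assumes, as in the text, a fixed rate of keys tested per second and a year of 365.25 days; the name brute_force_years is illustrative.</p><p>
```python
# Seconds in a year of 365.25 days: 60 * 60 * 24 * 365.25.
SECONDS_PER_YEAR = 31_557_600


def brute_force_years(key_bits: int, keys_per_second: int) -> float:
    """Years needed to try every key of the given length at a fixed test rate."""
    total_keys = 2 ** key_bits
    # Integer true division keeps full precision before converting to float.
    return total_keys / (keys_per_second * SECONDS_PER_YEAR)


years = brute_force_years(256, 10 ** 12)
print(f"{years:.2e} years")
```
</p><p>For a 256-bit key tested at a rate of 10^12 keys per second, this evaluates to roughly 3.67 × 10^57 years. On average an attacker would find the key after searching half the key space, which halves the figure without changing the conclusion.</p></div>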
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.0.4.">Statistical Summary</head><p>A statistical analysis of the data confirmed that the DNA-based storage system provides a highly secure and dense form of data storage. The standard deviation of the error rate was found to be $\sigma = 2.5 \times 10^{-7}$, indicating low variance and high reliability in data retrieval. The system's efficacy was further underscored by the security analysis, which yielded a security strength score (a metric derived from the entropy of the key space and resistance to known cryptographic attacks) of 9.5 out of 10, signifying robust encryption.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Open Issues and Challenges</head><p>Despite the promising advances in DNA-based data storage and the robust encryption methodologies presented in our framework, several open issues and challenges persist. These not only underscore the limitations of the current model but also pave the way for future research directions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Synthesis and Sequencing Errors</head><p>The accuracy of DNA synthesis and sequencing remains a significant challenge. Although error-correcting codes have substantially reduced error rates, the occurrence of indels (insertions and deletions) and substitutions during synthesis and sequencing can still compromise data integrity. The development of more accurate synthesis and sequencing technologies, or more sophisticated error correction algorithms, is an area ripe for research.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Physical Stability of DNA</head><p>DNA, while offering an incredibly dense medium for data storage, is subject to degradation over time due to environmental factors such as temperature, humidity, and enzymatic activity.</p><p>Ensuring the long-term stability of DNA for centuries or even millennia requires ongoing investigation into encapsulation techniques and storage conditions that preserve DNA without degradation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">Data Retrieval Speed</head><p>Another challenge is the speed of data retrieval. Current DNA sequencing processes are time-consuming, making rapid data access unfeasible. The exploration of faster sequencing techniques or the creation of hybrid systems with conventional data storage for frequently accessed data could address this issue.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4.">Cost Effectiveness</head><p>The cost of DNA synthesis and sequencing is a barrier to the widespread adoption of DNA data storage. Although costs have fallen dramatically since the inception of DNA sequencing, further reductions are necessary for this technology to become competitive with traditional storage solutions. Research into scalable and cost-effective synthesis and sequencing methods remains critical <ref type="bibr" target="#b15">[16]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.5.">Encryption Complexity and DNA Data Manipulation</head><p>The complexity of encryption algorithms adapted to DNA data needs further exploration. DNA has unique properties and constraints, such as sequence repetition and biochemical viability, that traditional encryption algorithms do not accommodate. Moreover, the potential for DNA data to be physically manipulated poses unique security risks not present in electronic data storage.</p></div>
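<div xmlns="http://www.tei-c.org/ns/1.0"><p>One such constraint, sequence repetition, can be checked mechanically. The sketch below measures the longest run of a single repeated base in a candidate sequence. The threshold of three and the names longest_run and violates_run_limit are assumptions made for illustration, not values taken from this work.</p><p>
```python
from itertools import groupby


def longest_run(dna: str) -> int:
    """Length of the longest stretch of one repeated base (0 for an empty string)."""
    return max((len(list(group)) for _, group in groupby(dna)), default=0)


def violates_run_limit(dna: str, max_run: int = 3) -> bool:
    """True if any base repeats more than max_run times in a row (limit is assumed)."""
    return longest_run(dna) > max_run
```
</p><p>A ciphertext that looks random will occasionally contain long runs, so an encryption step that outputs DNA directly would need a check of this kind, or a constrained re-encoding, before synthesis.</p></div>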
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.6.">Regulatory and Ethical Considerations</head><p>Storing data in DNA raises new regulatory and ethical questions. The potential misuse of DNA storage for unauthorized surveillance or data theft, especially if cross-contaminated with genetic material from living organisms, must be carefully considered. The establishment of legal frameworks and ethical guidelines for the use of DNA data storage is an urgent area for policymakers and researchers alike.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.7.">Environmental Impact</head><p>While DNA-based data centers hold the promise of being a more environmentally friendly alternative to traditional data storage, it is imperative to critically assess the environmental impact associated with the necessary chemicals and laboratory conditions required for DNA synthesis and sequencing. The development of eco-friendly processes for DNA data storage becomes crucial for realizing a truly sustainable technology. Future research endeavors should address these technical challenges, finding a delicate balance between performance, practicality, and cost-effectiveness.</p><p>To achieve breakthroughs in DNA data storage, interdisciplinary approaches that integrate biotechnology, nanotechnology, and information technology are key. Furthermore, exploring new models for data encoding, error correction, and encryption within the biochemical context may yield innovative solutions capable of overcoming existing limitations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Concluding Remarks</head><p>Our proposed framework lays the foundations for using DNA for data storage, supported by a robust encryption and decryption scheme. The model demonstrated empirical benefits that align with the growing demands of the data storage industry. It capitalizes on the sustainable and high-density storage capabilities of DNA, offering an innovative solution to the limitations of conventional electronic storage media. Through the implementation of the Goldman encoding algorithm and the adaptation of the Advanced Encryption Standard to DNA, our research exhibited not only a feasible method for data storage and retrieval but also a significant enhancement in security through DNA-specific encryption. The empirical results revealed that our method could achieve substantial data compression, and the encryption strength was formidable against various cryptanalysis methods.</p><p>The future scope of this research is broad and multidimensional. Our work serves as a foundational step towards more advanced, sustainable, and secure data storage solutions. Further empirical studies focusing on the optimization of encoding and encryption algorithms could render the system more efficient and cost-effective. Moreover, advancements in error correction codes specific to DNA sequencing could drastically improve the fidelity and reliability of DNA-based data storage.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Increased data traffic globally</figDesc><graphic coords="2,89.29,84.19,416.69,200.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The DNA encoding-decoding process</figDesc><graphic coords="4,89.29,84.19,416.70,147.36" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The proposed model</figDesc><graphic coords="5,89.29,243.09,416.70,154.78" type="bitmap" /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">M3at: Monitoring agents assignment model for data-intensive applications</title>
		<author>
			<persName><forename type="first">V</forename><surname>Kashansky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kimovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Prodan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Marozzo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Iuhasz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Justyna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Garcia-Blas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">28th Euromicro International conference on Parallel, Distributed, and Network-Based Processing</title>
				<meeting><address><addrLine>PDP</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A dynamic evolutionary multi-objective virtual machine placement heuristic for cloud data centers</title>
		<author>
			<persName><forename type="first">E</forename><surname>Torre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Durillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>De Maio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Benedict</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Saurabh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Prodan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information and Software Technology</title>
		<imprint>
			<biblScope unit="volume">128</biblScope>
			<biblScope unit="page">106390</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Genome Sequence Archive family: toward explosive data growth and diverse data types</title>
		<author>
			<persName><forename type="first">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Genomics, Proteomics &amp; Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="578" to="583" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A taxonomy of energy optimization techniques for smart cities: Architecture and future directions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Tanwar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Popat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kumar</surname></persName>
		</author>
		<idno type="DOI">10.1111/exsy.12703</idno>
		<ptr target="https://onlinelibrary.wiley.com/doi/pdf/10.1111/exsy.12703" />
	</analytic>
	<monogr>
		<title level="j">Expert Systems</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page">e12703</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Emerging approaches to DNA data storage: Challenges and prospects</title>
		<author>
			<persName><forename type="first">A</forename><surname>Doricchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Platnich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gimpel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Horn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Earle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lanzavecchia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Cortajarena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Liz-Marzán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Heckel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACS Nano</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="17552" to="17571" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Adaptive coding for DNA storage with high storage density and low coverage</title>
		<author>
			<persName><forename type="first">B</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">npj Systems Biology and Applications</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">23</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Database resources of the National Genomics Data Center, China National Center for Bioinformation in 2022</title>
		<author>
			<persName><forename type="first">C.-N</forename><surname>Members</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nucleic Acids Research</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page">D27</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Satya: Trusted Bi-LSTM-based fake news classification scheme for smart community</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tanwar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J P C</forename><surname>Rodrigues</surname></persName>
		</author>
		<idno type="DOI">10.1109/TCSS.2021.3131945</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Computational Social Systems</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1758" to="1767" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Toward a systematic survey for carbon neutral data centers</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Communications Surveys &amp; Tutorials</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="895" to="936" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Shehabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sartor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Herrlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koomey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Masanet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Horner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Azevedo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Lintner</surname></persName>
		</author>
		<title level="m">United States data center energy usage report</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A secure cryptosystem using DNA cryptography and DNA steganography for the cloud-based IoT infrastructure</title>
		<author>
			<persName><forename type="first">S</forename><surname>Namasudra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers and Electrical Engineering</title>
		<imprint>
			<biblScope unit="volume">104</biblScope>
			<biblScope unit="page">108426</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Computing with DNA</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Adleman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientific American</title>
		<imprint>
			<biblScope unit="volume">279</biblScope>
			<biblScope unit="page" from="54" to="61" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">As good as it gets: a scaling comparison of DNA computing, network biocomputing, and electronic computing approaches to an NP-complete problem</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Perumal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ippoliti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">C</forename><surname>Van Delft</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">V</forename><surname>Nicolau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">New Journal of Physics</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page">125001</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Maximum likelihood trees from DNA sequences: a peculiar statistical estimation problem</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goldman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Friday</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Systematic Biology</title>
		<imprint>
			<biblScope unit="volume">44</biblScope>
			<biblScope unit="page" from="384" to="399" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Flamingo-optimization-based deep convolutional neural network for IoT-based arrhythmia classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">P</forename><surname>Mahapatra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-T.-H</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavita</surname></persName>
		</author>
		<author>
			<persName><surname>Mohiuddin</surname></persName>
		</author>
		<idno type="DOI">10.3390/s23094353</idno>
		<ptr target="https://www.mdpi.com/1424-8220/23/9/4353" />
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<author>
			<persName><forename type="first">M</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ghimire</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-H</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S M S</forename><surname>Hosen</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics12092050</idno>
		<ptr target="https://www.mdpi.com/2079-9292/12/9/2050" />
	</analytic>
	<monogr>
		<title level="m">Healthcare Internet of Things (H-IoT): Current trends, future prospects, applications, challenges, and security issues</title>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">12</biblScope>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
