DNA storage is the most important innovation you’ve never heard of
As the amount of data generated by Internet activity, digital devices and IoT sensors continues to grow at an aggressive rate, businesses are running out of time to solve a critical problem: where to put everything.
According to a recent IDC report, the amount of data created within the next five years will be more than double the amount generated since digital storage first came into use.
Less than 2% of the 64.2 ZB (68.9 billion TB) created last year was retained for any length of time (the rest was overwritten or only temporarily cached), but global demand for data storage is still outpacing growth in total capacity.
Hard disk drives (HDDs) and solid state drives (SSDs) do a great job of retaining and serving the data that everyday devices need to function, but neither is well suited to storing information in bulk for long periods of time.
When it comes to archive storage, Linear Tape-Open (LTO) magnetic tape rules the roost, with the lowest cost per capacity of any technology. The current generation of tape, LTO-8, has a native capacity of 12 TB and can be purchased for as little as $75 (or $6.25/TB).
However, while cost-effective, tape also has its weaknesses. Data can only be accessed serially, which makes it difficult to locate a particular file. Enterprises also need to migrate to fresh tape on a semi-regular basis to avoid data loss.
To solve the looming data crisis, researchers are looking for new ultra-dense, ultra-durable storage technologies. Several candidates have emerged, but one concept looks particularly promising: deoxyribonucleic acid, better known as DNA.
What is DNA Storage? How does it work?
DNA, the basic substance of living organisms, is composed of four molecular building blocks: adenine (A), guanine (G), cytosine (C), and thymine (T). These compounds connect in pairs (AT and GC) to form the rungs of the famous double helix ladder.
This structure can be used as an extremely dense and durable form of data storage by converting binary 1s and 0s into the four-letter genetic alphabet. A single gram of DNA can store 215 PB (220,000 TB) of data.
“DNA data storage is the process of encoding and decoding binary data to and from synthesized DNA strands,” explained a spokeswoman for the DNA Data Storage Alliance (DDSA), founded last year by Microsoft, Western Digital, Twist Bioscience and Illumina.
“To store data in DNA, the original digital data is encoded, then written (synthesized using chemical/biological processes) and stored. When the stored data is needed again, the DNA molecules are sequenced, revealing the individual A, C, G and T bases in order, which are then mapped back from DNA bases to 1s and 0s.”
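The encode and decode steps described above can be sketched in a few lines of Python. This is only an illustration: the two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for the example, not the scheme used by the DDSA or any of its members, and real encodings add constraints and redundancy that are omitted here.

```python
# Hypothetical mapping: each pair of bits becomes one DNA base.
# Real DNA storage encodings are more elaborate; this is a teaching sketch.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn raw bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Map a sequenced strand back from bases to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

With this mapping every byte becomes four bases, so the two-byte input yields an eight-base strand.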
DNA is superior to current archive storage technology in almost every category. A recent paper estimates that 9 TB of data encoded in DNA can be tucked into a space of just 1 mm^3. In other words, DNA occupying the volume of a single LTO cartridge could hold 2 million TB of data, roughly 167,000 times the capacity of an LTO-8 tape.
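The comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted in this article (9 TB per mm^3, 2 million TB per cartridge volume, 12 TB native for LTO-8); the cartridge volume it prints is merely what those figures imply, not a measured dimension.

```python
# Figures quoted in the article.
DNA_TB_PER_MM3 = 9                       # estimated DNA density
DNA_TB_PER_CARTRIDGE_VOLUME = 2_000_000  # DNA filling one cartridge's volume
LTO8_NATIVE_TB = 12                      # native capacity of one LTO-8 tape

# How many LTO-8 tapes' worth of data fit in one cartridge volume of DNA.
capacity_ratio = DNA_TB_PER_CARTRIDGE_VOLUME / LTO8_NATIVE_TB

# Cartridge volume implied by the two DNA figures (derived, not measured).
implied_volume_cm3 = DNA_TB_PER_CARTRIDGE_VOLUME / DNA_TB_PER_MM3 / 1000

print(f"{capacity_ratio:,.0f}x the capacity of LTO-8")       # 166,667x
print(f"implied cartridge volume: {implied_volume_cm3:.0f} cm^3")  # 222 cm^3
```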
In a real-world scenario, DNA could be used to store the whole of YouTube (which is thought to host roughly 400,000 TB of new video every year) in a small refrigerator, as opposed to a few acres of data centers.
Unlike magnetic tape, which needs to be replaced every decade or two depending on usage, DNA lasts for thousands of years under the right conditions. This means the total cost of ownership (TCO) can be very low.
DNA is also very environmentally friendly because it is biodegradable, easily replicated, and consumes very little power beyond the energy required to maintain the necessary climate.
But there are still many reasons why DNA hasn’t made tape storage obsolete. The technology is still in its infancy, and there are problems to be solved at almost every stage of the process, from encoding to synthesis to sequencing.
According to Turguy Goker, Director of Advanced Development, LTO at storage company Quantum, it is still too early to “bet on this horse.”
“Currently, DNA storage is swimming in volatile waters, and it will take years before we can safely move to commercial shores,” he explained.
High density and durable, but slow and expensive
Early signs may be promising, but there are still many hurdles to clear before DNA can begin to make a dent in the global storage capacity problem. The main issues are related to cost and speed.
DNA requires a very special climate to prevent degradation. This is difficult to maintain and can be costly. Specifically, DNA should be kept at a very low temperature or exposed to carefully controlled airflow.
With current technology, the process of writing data to DNA is also very time consuming compared to existing technology. Until this is improved, DNA storage will not be available on a large scale.
“DNA writing is a chemical process, which is essentially much slower than the digital electronics we use today,” Goker explains. “Writing to DNA-based storage without overcoming this barrier is like using a straw to empty the pool.”
Reading the data stored in DNA is also a challenge, because the likelihood of errors increases during the sequencing process. For this reason, the DDSA expects early adopters to use the technology in write once, read never (WORN) or write once, read seldom (WORSE) use cases, such as storing specific data types to meet regulatory requirements.
Aside from technical issues, the lack of common standards must also be addressed to ensure that DNA storage technologies are interoperable both with each other and with legacy technologies.
However, as DNA storage attracts both attention and investment from governments, storage incumbents, and tech giants, work is underway to find solutions to these problems.
For example, the US Office of the Director of National Intelligence launched the Molecular Information Storage (MIST) program last year, with the goal of developing DNA technology capable of writing 1TB and reading 10TB within 24 hours at a cost of less than $1,000.
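To put that goal in more familiar terms, the sketch below converts the 24-hour targets into sustained throughput. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant rate across the full 24 hours, neither of which is stated in the program goal itself.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # assumption: decimal terabytes
BYTES_PER_MB = 10**6   # assumption: decimal megabytes

# Program targets quoted above: write 1 TB and read 10 TB within 24 hours.
write_mb_per_s = 1 * BYTES_PER_TB / SECONDS_PER_DAY / BYTES_PER_MB
read_mb_per_s = 10 * BYTES_PER_TB / SECONDS_PER_DAY / BYTES_PER_MB

print(f"write: {write_mb_per_s:.2f} MB/s")  # write: 11.57 MB/s
print(f"read:  {read_mb_per_s:.2f} MB/s")   # read:  115.74 MB/s
```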
Separately, Twist Bioscience has increased DNA synthesis yield 1,000-fold by using a silicon platform that miniaturizes the required chemistry.
According to the DDSA, concerns about data accuracy are mitigated by software that can correct sequencing errors. The organization also believes it has time to establish specifications that prevent fragmentation across the industry.
“Unlike synthesis for health care, which must be perfect, DNA storage can tolerate errors thanks to the correction algorithms commonly used in storage today. To mitigate this risk, DNA storage pioneers are already working on improved encoding and error correction algorithms that recover the data accurately,” said the spokeswoman.
“As commercially viable DNA data storage methods and tools become better understood and more widely available, the Alliance will consider creating specifications and standards (e.g. encoding, physical interfaces, retention, file systems) to facilitate the emergence of interoperable DNA data storage-based solutions that complement existing storage tiers.”
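One simple way to picture how storage can tolerate sequencing errors is consensus voting: the same strand is read several times, and the most common base at each position wins. The sketch below is an illustrative assumption, not the algorithm the DDSA or its members use; it only handles substitution errors in reads of equal length, and the function name and sample reads are hypothetical.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Return the per-position majority base across equal-length reads."""
    if not reads:
        raise ValueError("at least one read is required")
    if any(len(read) != len(reads[0]) for read in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one column of bases per strand position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Five hypothetical reads of one strand; two contain a single wrong base.
reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC", "CTGACGGC", "CAGACGGC"]
print(consensus(reads))  # CAGACGGC
```

Insertions and deletions, which shift every later position, would defeat this approach, which is one reason practical systems need stronger error-correcting codes.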
Is this the end for tape?
The advent of DNA storage raises questions about the lasting usefulness of magnetic tape, but some believe the writing is not on the wall just yet.
For example, when asked whether it felt DNA threatened its tape storage products, IBM pointed to improvements in tape density, as well as the fact that tape is tried and tested in a commercial context.
Andy Walls, IBM’s CTO and Chief Architect for Flash Storage, said:
“It’s also the most environmentally friendly storage technology available, it consumes no power and it lasts for decades. And as we continue to increase tape density, a single IBM cartridge today (smaller than a VHS cassette) can hold an incredible 60TB of compressed data. These are some of the qualities that make tape a solution the largest hyperscalers rely on for cheap and reliable archive storage.”
At the end of last year, IBM also broke the world record for areal density with a prototype tape made of strontium ferrite (SrFe) developed by Fujifilm. The pair achieved a record of 317 GB/in^2, which equates to 580 TB per cartridge and suggests tape still has some way to go before it reaches maximum density.
Although the attributes of DNA storage are most comparable to tape, Quantum believes that DNA is more likely to be incorporated into an existing setup than to completely replace existing technology.
“Tape shows no signs of disappearing any time soon, especially for long-term on-premises archiving,” Goker said. “It is the most economical form of storage per megabyte, it can store large amounts of data per cartridge, its running costs are very low, and it is one of the most secure storage media because the data is stored offline. It can also serve as an active archive, a critically important feature for hyperscalers.”
“Rather than seeing the two storage options as competing, we should recognize their complementary nature. DNA will complement tape in the future by coexisting with it as a tier within hyperscale data centers. DNA is unlikely to replace magnetic tape in the coming years, but it is ideal for big data archiving scenarios, occupying the bottom tier where data is rarely accessed.”
However, although tape is unlikely to be displaced in the short term and has earned its place at the heart of enterprise storage systems, that does not mean the decades-old technology will be able to withstand the tsunami of data on the horizon, regardless of R&D.
Tape capacity tends to nearly double with each LTO generation, significantly outpacing SSD and HDD capacity growth, but even this exponential expansion cannot keep up with the amount of data being generated.
The next frontier of data storage
If analysts are to be believed, the data storage crisis will come to a head within the next five years. If storage technology does not catch up in time, the consequences could be varied.
For example, an inability to store a sufficient amount of data means companies will not be properly prepared to recover from disruption, whether caused by a cyberattack or a change in socioeconomic conditions. The full value of analytics will also remain untapped (and unknown), because enterprises will have to work with incomplete datasets.
From the consumer’s point of view, social media platforms, email providers and other businesses may start deleting old data and posts to make room for the constant stream of fresh content. For example, Google recently announced it will start deleting data attached to Gmail, Drive and Photos from accounts that have been inactive for more than two years.
DNA storage isn’t the only hope. Microsoft researchers are also using lasers to etch data into quartz glass, or to store data in hologram form inside crystals.
However, DNA, with its unique set of properties and characteristics, is probably the most likely savior.
According to Luis Ceze, a DNA storage expert at the University of Washington, it will take eight to ten years for DNA to be adopted in large-scale commercial situations. Other experts we consulted agreed with this evaluation.
However, Ceze also said that research trends are “favorable” and that “the boutique market for smaller data needs is already viable today.” Therefore, there is hope that you can win the battle against time and avoid data disasters.