Using the CRISPR gene-editing tool, scientists from Harvard University have developed a technique that permanently records data into living cells. Incredibly, the information imprinted onto these microorganisms can be passed down to the next generation.
CRISPR/Cas9 is turning into an incredibly versatile tool. The cheap and easy-to-use molecular editing system that burst onto the biotech scene only a few years ago is being used for a host of applications, including genetic engineering, RNA editing, disease modelling and fighting retroviruses like HIV. And now, as described in a new Science paper, it can also be used to turn lowly microorganisms into veritable hard drives.
Scientists have stored data in DNA before, but in a completely artificial way from start to finish. In these prior experiments, information was encoded into a DNA sequence and the DNA was synthesised, and that was it: all the information remained outside the realm of living organisms. In the new study, a Harvard research team led by geneticists Seth Shipman and Jeff Nivala went about DNA data storage in a completely different way.
"We write the information directly into the genome," Nivala told Gizmodo. "While the overall amount of DNA data we have currently stored within a genome is relatively small compared to the completely synthetic DNA data storage systems, we think genome-based information storage has many potential advantages." These advantages, he says, could include higher fidelity and the capability to directly interface with biology. For example, a bacterium could be taught to recognise, provide information and even kill other microorganisms in its midst, or provide a record of genetic expression.
"Depending on how you calculate it, we stored between about 30 to 100 bytes of information," said Nivala. "Which is quite high compared to the previous record set within a living cell, which was ~11 bits."
To do it, the researchers used the bacteria's built-in immune system — in the form of CRISPR — to write data directly onto the genome of the bacterial cells. This allowed the modified bacteria to pass on this customised information to the next generation, making this form of biological data storage extremely efficient and powerful.
Whenever a virus attacks a bacterium, CRISPR diligently records the event in the bacterium's DNA, which it can then reference in the event of a renewed viral attack. It does this by storing tiny sequences of the viral DNA itself, called spacers. In their experiment, Shipman and Nivala wanted to see whether these spacers could be added in a known order, so that the stored spacers would form a timeline of when each one was added.
The researchers figured that this temporal ordering of spacers could form the basis of a molecular recording device. During the experiment, loose segments of DNA were injected into a strain of E. coli bacteria equipped with CRISPR/Cas9. But these bits of DNA weren't arbitrary: each carried a specific sequence of letters chosen by the scientists. The segments were introduced one at a time, and the bacteria integrated them into their genomes in the same order in which they were introduced.
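The idea of an ordered recording can be illustrated with a toy model. The sketch below is not the team's method: the two-bits-per-letter code, the chunk size and every name in it are assumptions made up for illustration. It also assumes, as a simplification, that each new spacer lands at the front of the array, so reading from the far end plays the record back oldest first.

```python
BASES = "ACGT"  # hypothetical code: A=00, C=01, G=10, T=11 (not the paper's scheme)


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four DNA letters, two bits per letter."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Decode a string produced by bytes_to_dna back into bytes."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


class ToyCrisprArray:
    """Toy stand-in for a CRISPR array: an ordered list of spacers."""

    def __init__(self) -> None:
        self._spacers: list[str] = []

    def acquire(self, spacer: str) -> None:
        # Assumption: the newest spacer is inserted at the front of the array.
        self._spacers.insert(0, spacer)

    def read_oldest_first(self) -> list[str]:
        # Reversing the array recovers the order of introduction.
        return list(reversed(self._spacers))


def record(message: bytes, chunk_size: int = 2) -> ToyCrisprArray:
    """Introduce the message one chunk at a time, as one spacer per chunk."""
    array = ToyCrisprArray()
    for i in range(0, len(message), chunk_size):
        array.acquire(bytes_to_dna(message[i:i + chunk_size]))
    return array


def playback(array: ToyCrisprArray) -> bytes:
    """Rebuild the message by reading the spacers in order of acquisition."""
    return b"".join(dna_to_bytes(s) for s in array.read_oldest_first())


if __name__ == "__main__":
    cell = record(b"hello")
    print(cell.read_oldest_first())
    print(playback(cell))
```

Because the order of the spacers carries the order of the chunks, the message survives only if playback reads the array in the right direction. Real cells are messier, which is one reason the researchers currently need a population of cells rather than a single one to recover the data.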
The researchers only added a few spacers to demonstrate their theory. But given that other spacers are available, there's an absolutely staggering number of possible combinations.
"These experiments lay the foundations for a recording system that could be used to monitor molecular events that occur over long time periods," said Nivala. "For instance, it could eventually help us answer questions like what happens to the gene regulation inside a cell as it goes from a healthy to disease state. Or it could also be used to record information on the cell's outside environment, for example the presence of specific chemicals, toxins, or pathogens."
Moving forward, the team would like to boost the system so that data can be stored more completely at the level of single cells, instead of having to use a population of cells to encode and decode the information.