
Catalog Technologies, Inc., a leader in DNA-based digital data storage and computation, achieved a historic breakthrough in DNA computing by demonstrating the ability to search DNA-stored data in a massively parallel and scalable way, with resource use almost independent of data size.
This demo is the result of CATALOG’s ongoing collaboration with potential customers and partners to understand their use cases. With its revolutionary platform and using text as an example, CATALOG was able to show how chemistry could be exploited to compute on archives in parallel.
In September, using an innovative combinatorial writing scheme, CATALOG encoded approximately 17,000 words from Shakespeare’s Hamlet into DNA in minutes on Shannon, CATALOG’s flagship writer. On this DNA archive, CATALOG performed a parallel search computation and successfully retrieved all occurrences of a query word. The approach did not require any complex pre-processing or indexing. Instead, CATALOG’s approach took advantage of the massively parallel nature of chemistry to retrieve all occurrences of the query word in a number of steps that is almost independent of the size of the dataset. Thus, the number of steps required would be approximately the same whether the dataset contained 170,000 or 170 million words.
To show this, in November, CATALOG encoded around 200,000 words from eight Shakespearean tragedies into DNA. Searching and retrieving all occurrences of a query word across all eight works would require approximately the same number of chemical computation steps, time, and resources as the original Hamlet search. CATALOG is on track to demonstrate this search scalability on datasets containing more than 100 million words by mid-2023. CATALOG’s innovative approach shows, for the first time, how to take advantage of the massive parallelism of DNA chemistry to search almost any amount of data stored in DNA without the expected commensurate increase in resources.
DNA-Based Versus Traditional Computers for Search
Search is a fundamental part of computing. When searching the Internet, query results are often returned quickly thanks to the time-consuming and expensive process of indexing the data beforehand. However, over 90% of enterprise data is unstructured, making effective search costly and in some cases impossible. This is a critical hurdle in cases where a lack of timely search results can lead to missed information that can have costly long-term implications across many industries, including oil and gas, finance, manufacturing, and government.
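The indexing trade-off described above can be illustrated with a minimal sketch in conventional software. This is not CATALOG’s DNA-based method; it only contrasts scanning unindexed text with looking up a prebuilt index. The function names and the sample text are hypothetical and chosen for illustration.

```python
from collections import defaultdict

def linear_search(words, query):
    """Scan every word; the work grows with the size of the text."""
    return [i for i, w in enumerate(words) if w == query]

def build_index(words):
    """One-time pass over all the data, mapping each word to its positions.
    This is the up-front cost that makes later queries fast."""
    index = defaultdict(list)
    for i, w in enumerate(words):
        index[w].append(i)
    return index

def indexed_search(index, query):
    """Look up the query directly, without touching the rest of the text."""
    return index.get(query, [])

# Hypothetical sample text, used only for illustration.
words = "to be or not to be that is the question".split()
index = build_index(words)

print(linear_search(words, "be"))   # positions found by scanning
print(indexed_search(index, "be"))  # same positions, found via the index
```

Both calls return the same positions. The difference is where the cost falls: the scan pays on every query, while the index pays once in advance and must be maintained as the data changes.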
Why DNA for Computing
In recent years, the computing industry has witnessed a proliferation of adaptive technologies, including accelerators like GPUs, quantum computers, and extreme parallel computers.
However, this performance and scalability come at the expense of higher power consumption, larger memory and long-term storage demands, and higher management complexity. This has generated tremendous interest and momentum in chemistry-based DNA computing systems, which have a much smaller physical footprint, consume orders of magnitude less power, and are resistant to traditional electronic security vulnerabilities.
Not all data stored in DNA is created equal
While many researchers and academics are developing approaches to use DNA as a storage platform for archival purposes, CATALOG’s proprietary approach to encoding data in DNA is ideally positioned for large-scale computing in order to obtain essential information about the data stored in DNA.
Many researchers and labs testing DNA-based storage focus on the dense storage of information inside the DNA molecule. CATALOG reverses this idea and stores information in specific collections of DNA molecules. Unlike other approaches, this gives CATALOG latitude to design the optimal DNA sequence for computation and to make writing orders of magnitude more efficient.
In addition to proving the computational capability of DNA, with this achievement CATALOG also demonstrated how powerful computational capabilities can increase, by orders of magnitude, the efficiency and cost-effectiveness of reading DNA data, which is currently an important challenge for the field.
“This historic and transformational achievement is based on years of working with partners and collaborators who have helped make DNA-based computing a reality,” said Hyunjun Park, Ph.D., Founder and CEO of CATALOG. “With the advantages of DNA-based data storage and computation demonstrated, we are now turning our attention to more sophisticated applications, from signal processing to machine learning on massive datasets. In parallel, we are working closely with partners and collaborators to reduce the size and complexity of our platform and to identify specific workloads to target with commercial offerings.”
CATALOG’s Vision for DNA Computing Technologies in the Enterprise
This landmark achievement is a key part of CATALOG’s plans to develop DNA-based storage and compute solutions.
CATALOG is accelerating the vision of DNA computing by advancing DNA computational algorithms and applications with potentially widespread commercial use in areas such as artificial intelligence, machine learning, data analytics, and secure computing. In addition, CATALOG is developing solutions for DNA-based information security, rack- and desktop-sized DNA data storage and computing platforms, DNA data storage as a service, and an API for storing and computing on DNA data.