Geneticist Petri Auvinen and his research team are using DNA samples to find out what has been happening in the Baltic ecosystem during the past 10,000 years.
Samples obtained from the seabed by drilling can be used to study past and present species and their habitats. This is useful in the study of biodiversity and climate change. The samples are taken from sediments, that is, layered soil that has been moved by water, wind or ice. If DNA can be isolated from the sediment samples, it can be used to learn about the organisms found at a certain depth of the sediment.
“Our aim is to collect sediment samples as far down as possible in the Baltic Sea bed in order to study the history of the Baltic Sea basin. We also take samples deep in marshland, providing us with information about the history of the soil,” says Auvinen.
Sediments are not created everywhere, but they can be found in the Baltic Sea and on marshland.
“These sediments have never been studied as extensively as we are doing now. At best, we may be able to reach back to samples dating from the Ice Age, which was when sediments began to accumulate in the sea.”
Petri Auvinen is a research director at the Institute of Biotechnology, University of Helsinki. His research focuses on genomics and metagenomics. While genomics looks at the entire genome of an organism, metagenomics can study and sequence a number of organisms, such as microbes, from a single sample at the same time. The study of micro-organisms has advanced in leaps and bounds. A sequence sample can be taken from any environment, soil or gut to determine the microbiota composition. The term used is microbiome, denoting the microbiota of a specific habitat and its genome, that is, the metagenome.
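As a rough illustration of what "determining the microbiota composition" means in practice, the sketch below turns per-read taxonomic labels into relative abundances. The labels and the function name `relative_abundance` are hypothetical and chosen for illustration only. In a real metagenomics workflow the labels would come from a dedicated classification step, which is not shown here.

```python
from collections import Counter


def relative_abundance(read_labels):
    """Return each taxon's share of the classified reads (fractions summing to 1)."""
    counts = Counter(read_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical labels, one per sequenced read (illustrative data, not from the study).
reads = [
    "Cyanobacteria", "Proteobacteria", "Cyanobacteria", "Bacteroidetes",
    "Proteobacteria", "Cyanobacteria", "Actinobacteria", "Cyanobacteria",
]

profile = relative_abundance(reads)

# Print the taxa from most to least abundant.
for taxon, share in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {share:.1%}")
```

Comparing such profiles between sediment layers of different ages is one simple way to see whether a community has shifted over time.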
“Analysing the sediment, we are able to determine when it was created. We analyse the sediments to find what microbes, other organisms and plants lived in a specific period.”
Auvinen’s research group has been studying environmental samples from soil and composts for a long time. For example, by isolating DNA samples from composts, they have been able to identify thousands of species of bacteria.
Auvinen has spent a lot of time studying the genetic origin of microbes.
“We published our first microbiome study of the Baltic Sea in 2010. We were already then using next-generation sequencing methods. These methods can determine up to billions of DNA sequences at the same time from a single sample.”
As the Baltic Sea is a shallow basin mostly filled with brackish water, it suffers from eutrophication, toxic blue-green algae blooms, and oxygen deficit, all of which have an impact on the microbial community. The research group carried out thorough sequencing to determine the structures of bacterial communities in the northern part of the Baltic.
Previously, research focused on one molecule at a time, but sequencing volumes are now a million times greater. Next-generation sequencing can be used to identify the microbes in a sediment sample.
Microbes can provide a surprising amount of new information about climate change and biodiversity.
“It would not surprise me if we discovered from sediments that as the environment has changed, so have the microbes. It is worth keeping in mind that practically all matter used by organisms has been dissolved from sediments by microbes. This means that if microbiota undergoes major changes in the environment, it is possible that some other ecological services change as well.”
By this Auvinen refers to “services” provided by nature, such as pollination, conversion of nutrients suitable for humans, and clean water.
“If the environment changes, then forests, for example, may disappear or be damaged for long periods, meaning that these services would no longer be available. Humans can manage up to a point with technology, but at some stage life may become difficult or even impossible. On the other hand, you may look at the situation from the viewpoint that as ecological services change or are reduced, the environment can no longer support such a large number of people.”
Auvinen’s group has researchers from a range of fields. You need experts from different fields in order to arrive at an accurate analysis of bygone environments.
“Our plan is specifically to study reconstruction, that is, how DNA and RNA data can be combined with sample dating, enabling us to know the precise age of samples.”
Auvinen mentions stable isotopes that can be used for reconstructing the dating of environmental conditions. Furthermore, botanists can analyse the DNA of, say, pollen, and this can be combined with isotope dating, enabling us to see what the environment was like thousands of years ago. As sediments contain both old and new DNA, and it is not possible to tell them apart, dating is crucial.
“In order for us to know which way the environment is heading, we should know what it was like earlier. We are able to tell, going back 10,000 years, what has been happening in the environment. This can be used as reference material for what will happen in the future.”
Important areas of study include not only biodiversity loss but also the excessive use of chemicals. Both affect not only us but also the generations that come after us.
“Man-made chemicals not originating from nature will circulate in the environment. We have pharmaceuticals and detergents that may never disappear from nature. We do not know how these chemicals will affect the environment in the long run. The spread of microplastics in nature is one manifestation of excessive use of chemicals.”
Some of the study’s sediment and data samples are analysed, while some are placed in storage in order to study other aspects of them later. In addition to microbiome study, the Institute of Biotechnology conducts plenty of research on human sequence samples and other species. All of this requires a huge amount of computing power.
“Our institute alone produces 8 terabytes of sequence data on one device in a week. This is a lot more than it used to be ten years ago. When all start doing research this way, I see a big challenge ahead in data processing.”
For example, the DNA sequence that covers the entire genome is in separate parts in the cell, and these must be assembled in the proper order. Then there is annotation to do, seeking the genes and their function in the sequence.
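The assembly step can be pictured with a toy example. The sketch below is a minimal greedy overlap merger, assuming error-free fragments that are all read in the same direction. It is not the method used by the research group, and production assemblers rely on far more elaborate overlap-graph or de Bruijn-graph approaches. The fragment strings and function names are invented for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments that share the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; the pieces stay as separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical fragments of one short sequence, given in shuffled order.
fragments = ["CGTTAGC", "ATGGCGTA", "CGTACGTT"]
contigs = greedy_assemble(fragments)
print(contigs)
```

Every fragment is compared with every other fragment, so all of them have to be held in memory at once. Even in this toy form, that hints at why assembling real genomes is so demanding.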
“When assembling genomic data, you must have plenty of RAM, because all sequences must be analysed in the same memory space. We will also need more disk space for data storage. We have a great number of highly trained people that use most of their day to copy data from one place to another.”
The greatest bottleneck is the storage of sensitive data.
“The storage space we are using is too small.”
Another challenge is the software used for the calculations. Some of the data is analysed using the CSC – IT Center for Science’s ePouta hardware, but some must be analysed on the team’s own hardware.
“The software can be so complex that it cannot be run on the CSC system. We use some virtual computers, but they have their limitations, too. We also have runs that may continue uninterrupted for months. CSC has thousands of users, and the CSC environment obviously has some maintenance outages. For example, when we were working on the genome of the Saimaa ringed seal, the first major assemblies took a thousand hours.”
“Currently we are able to work on large genomes a hundred times better than a few years ago. But there is more and more data all the time, and we must be able to store it efficiently and in a way that it is comparable to other data. We will continue to work with CSC in data storage, transfer and calculation.”
Ari Turunen
20.3.2023
More information:
Institute of Biotechnology (University of Helsinki)
https://www.helsinki.fi/en/hilife-helsinki-institute-life-science/units/institute-biotechnology
CSC – IT Center for Science is a non-profit, state-owned company administered by the Ministry of Education and Culture. CSC maintains and develops the state-owned, centralised IT infrastructure.
https://research.csc.fi/cloud-computing
ELIXIR builds infrastructure in support of the biological sector. It brings together the leading organisations of 21 European countries and the EMBL European Molecular Biology Laboratory to form a common infrastructure for biological information. CSC – IT Center for Science is the Finnish centre within this infrastructure.