The pandemic may be over, but the COVID-19 virus, or coronavirus, has certainly not disappeared, and could mutate into a dangerous form again. Research professor Markus Perola of the Finnish Institute for Health and Welfare (THL) and his team are using registry and genomic data to determine which factors raise the risk of some people developing a severe coronavirus infection that requires hospitalisation. The research requires a large amount of computing and the processing of sensitive data.
The emergence of the COVID-19 virus in late 2019 caused a pandemic that shocked the world. The disease was life-threatening for elderly people. By March 2021 – in a period of about 15 months – around 2.5 million people worldwide had died from the disease. The global crisis affected both the global economy and the health sector.
“Pandemics come about because of population growth and because we are living closer to farm animals. At the same time, biodiversity is declining and we have a narrower diet. The globe is essentially a petri dish in which pandemics are bred,” Perola says.
Because new forms of the COVID-19 virus can emerge, it’s crucial to understand how the virus works and how to fight it. The COVID-19 Host Genetics Initiative project, among others, brought together researchers from around the world to gather information on the characteristics of coronavirus infection.
The aim is to identify individuals with a high risk of developing the serious form of the disease. As a result of the project, more than 50 genomic regions were identified that may contain genes that predispose a person to COVID-19. Some of these also predispose a person to a particularly severe form of the disease.
“This information will be used in THL’s own study to find out why some people who contract coronavirus end up in hospital. One reason may be found in genes.”
The study led by Perola involves collecting data on more than 3,000 people who were hospitalised with the virus, or, in milder cases, sought coronavirus testing. The survey uses registry data. Sample collection is carried out in collaboration with biobanks. Blood samples are tested for other co-infectious diseases, the severity of the inflammation, and other values that indicate the biological balance of the body.
“There is always talk of different risk groups, but we forget that a large proportion of people at risk of the coronavirus either do not end up in intensive care or do not die of the disease. For example, the mortality rate for people over the age of 80 is around 10 per cent, but several times that number do not die. So what is the difference between these groups? And why do some very overweight people end up in intensive care with the coronavirus, but not others? Our aim is to find the at-risk groups that would benefit most from vaccination.”
According to Perola, combining genetics and registry data will shed further light on these questions. Perola’s team has amassed some exceptionally interesting research results from analysing large amounts of data. Working with Tero Hiekkalinna and Joseph Terwilliger, Perola ran a simulation to test the use of data from a million human genomes. The anonymised data, which included not only genomic data but also clinical phenotype data and information on health, family relationships, age and gender, was analysed on a supercomputer at the Finnish ELIXIR Node, CSC – IT Center for Science. This test yielded valuable new insights into how big data could be used in public health in the future.
Why is understanding national health variations important for the country’s healthcare?
“If we don’t know what’s specific to Finns, no one else will research this. A good example is the diseases of the Finnish disease heritage, of which there are about forty rare diseases that are concentrated in Finland. There is intensive international collaboration in gene research to identify these genes and understand how they work. But it is researchers in Finland who are making it a clinical reality.”
Finland is a genetic isolate – a population with minimal genetic mixing due to geographical isolation or similar mechanisms – because it has historically developed in a somewhat isolated way from other European countries. The Finnish population has its own pattern of inheritance, which, from a research point of view, lends itself to a different approach than other populations do. Some of the biological characteristics found in Finland are not found anywhere else in the world. Several hundred disease variants that are not seen in other populations have been found here – in other words, they are uniquely Finnish disease variants.
According to Perola, the Finnish population is, in a way, the largest genetic isolate in the world.
“We have the statistical power to find more of these variants compared to other isolated populations, such as the population of Iceland. Rare gene variants offer new insights into disease biology that are not available from other populations. They can open up whole new ways of understanding diseases. Is there already a cure for a given disease, or do we need to develop one?”
According to Perola, Finland is the country that others look to when it comes to the use of registry data. This was the case, for instance, with the creation of the European Health Data Space (EHDS). Registry data has been collected in Finland for decades. For example, a cancer registry was set up as early as the 1950s.
“We have extensive data in registries, for example the Kanta Services, which stores people’s health data and prescriptions. Not many countries have a similar situation – for example, there are not many countries where all laboratory data is available as it is in Finland today. We have access to data from the entire population, regardless of the different data systems or governance structures.”
Perola gives the example of one of his previous studies, which involved using registry data to find out what distinguished people in Finland who took the first coronavirus vaccine from those who refused it.
“We wanted to determine the factors that describe the proportion of the Finnish population – almost 20 per cent – who did not take the first vaccine. We looked at family relationships and socioeconomic variables: whether the person was in paid employment, their area of residence and native language. The data allowed us to scientifically justify that the message about vaccines did not reach migrants in time, and that there were people who did not have the resources to find out about vaccination themselves.”
Another subject of study was respiratory syncytial virus (RSV) infection in children under the age of 1. RSV is a ribonucleic acid (RNA) virus that causes millions of respiratory infections worldwide every year and is a major cause of infections in young children.
“The registry data was used to follow families whose child had been hospitalised after contracting RSV. The study used data related to socioeconomic status, use of intoxicants by the child’s parents, and the child’s birth characteristics.”
According to Perola, this was highly valuable information that was obtained using artificial intelligence. The computer was fed registry data and taught to identify certain features in the dataset.
“This could not be done with anything other than CSC’s sensitive data services and supercomputing environment.”
Perola uses genetic and registry data in his research.
“Research infrastructure is important. Research needs organisations such as the CSC to enable analysis. It doesn’t matter whether the scientist is an astronomer or a geneticist – they both use the same infrastructure. It’s always difficult to get money for infrastructure when foundations don’t fund it but instead just assume that the state will pay. But the state says, ‘Get outside funding,’ so we’re in a catch-22 situation. Supporting research infrastructure is essential to ensuring research excellence in science in Finland.”
Ari Turunen
25.6.2024
More information:
CSC SD-connect
https://docs.csc.fi/data/sensitive-data/sd_connect/
Finnish Institute for Health and Welfare THL
CSC – IT Center for Science
is a non-profit, state-owned company administered by the Ministry of Education and Culture. CSC maintains and develops the state-owned, centralised IT infrastructure.
https://research.csc.fi/cloud-computing
ELIXIR
builds infrastructure in support of the biological sector. It brings together the leading organisations of 21 European countries and the EMBL European Molecular Biology Laboratory to form a common infrastructure for biological information. CSC – IT Center for Science is the Finnish centre within this infrastructure.