8.5.2025 Alkaloids derived from tree bark destroy cancer cells
Tree bark acts as an important chemical defence mechanism against pests. When a plant comes under threat from bacteria or an insect, alkaloids secreted by the plant may, for example, inhibit cell division or the activity of DNA in the insect, preventing reproduction. This is the operating mechanism of paclitaxel and camptothecin, two compounds isolated from the bark of different trees and developed into effective anticancer drugs. Data analyses and databases have now become available to help identify bioactive compounds in trees and other plants.
There are about half a million plant species in the world, of which an estimated 7% are used for medicinal purposes. Around 25% of prescription medicines in use today are plant-based, meaning that they consist of natural compounds isolated from plants or of synthetic derivatives developed from them. Preserving biodiversity therefore matters for pharmaceutical research as well: new plant species are constantly being discovered, and the chemical composition of even known plants is largely uncharted.
Paclitaxel and camptothecin are examples of anticancer drugs that were discovered when samples from potential medicinal plants were systematically screened. The US National Cancer Institute (NCI) screened more than 35,000 plant samples in a research programme that started in 1956 and continued until 1981. The aim of the programme was to identify plant compounds that could be used to prevent or treat cancers.
The ambitious programme also drew on ethnobotany and history. Programme director Jonathan Hartwell compiled an extensive collection of ancient Chinese, Egyptian, Greek and Roman texts on the medicinal uses of plants. To find the samples and obtain accurate botanical information, Hartwell turned to the U.S. Department of Agriculture (USDA). USDA botanists began collecting plants from around the world to be analysed in laboratories.
Research Triangle Institute’s chemists Monroe E. Wall and Mansukh C. Wani received samples of Camptotheca acuminata for study. Known as the Happy Tree in China, Camptotheca acuminata grows naturally on wet banks of the Yangtze River. In traditional Chinese medicine, its leaves and bark have been used to treat various inflammations and infections.
Wall and Wani discovered that extracts of C. acuminata were highly active against the L1210 mouse leukaemia cell line, meaning that they affected cancer cells. The L1210 line, originally isolated from a mouse with lymphocytic leukaemia, is commonly used in cancer research and for testing new anticancer therapeutics. Wall and Wani isolated the active compound from the wood of the tree and named it camptothecin. It proved highly effective against leukaemia cells.
Camptothecin binds to the complex formed by DNA and an important cellular enzyme, topoisomerase I. This prevents cancer cells from replicating their DNA, resulting in cell death. Despite its effectiveness, camptothecin has serious side effects and dissolves poorly in water. Water solubility matters because it affects how a therapeutic agent is absorbed and distributed in the body. Derivatives of camptothecin were later developed that are water-soluble and better tolerated while retaining their efficacy. These include topotecan and irinotecan. Topotecan (Hycamtin) is used for ovarian, lung and cervical cancer, while irinotecan (Camptosar) is used primarily for colon and rectal cancers.
Synthetic derivatives developed from a natural compound can be significantly more effective than the original compound. In the 1980s, the Japanese company Yakult Honsha developed irinotecan, a derivative of camptothecin. It was then discovered that irinotecan’s active form in the body is its metabolic product 7-ethyl-10-hydroxycamptothecin, which is about 100 to 1,000 times more active than irinotecan itself. The metabolite is known by the development code SN-38. Irinotecan is thus not very active as such but acts as a prodrug: it is converted in the liver and other tissues to SN-38, a potent anticancer agent. SN-38 is a modified version of naturally occurring camptothecin with added ethyl and hydroxyl groups, and these changes resulted in a highly effective therapeutic agent.
The body eliminates SN-38 with the help of the enzyme encoded by the UGT1A1 gene. Some individuals carry a variant of this gene, such as UGT1A1*28, that may reduce the activity of the enzyme and slow down the elimination of SN-38, thereby increasing the drug’s toxicity and side effects. The Ensembl database can be used to study the UGT1A1 gene, its variants and their possible effects on SN-38 metabolism, for example.
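As a minimal sketch of how the UGT1A1 gene could be looked up programmatically, the Python example below builds a request to the public Ensembl REST service and summarises the record it returns. The helper names (gene_lookup_url, fetch_gene, summarise_gene) are illustrative and not part of Ensembl. The response fields read in summarise_gene are assumptions about the JSON layout and are therefore accessed defensively.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def gene_lookup_url(symbol: str, species: str = "homo_sapiens") -> str:
    """Build the URL of Ensembl's symbol lookup endpoint for one gene."""
    return (
        f"{ENSEMBL_REST}/lookup/symbol/{species}/{symbol}"
        "?content-type=application/json"
    )


def fetch_gene(symbol: str, species: str = "homo_sapiens") -> dict:
    """Download the gene record as a dict. Needs network access."""
    url = gene_lookup_url(symbol, species)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def summarise_gene(record: dict) -> str:
    """Format identifier and location; missing fields are shown as None."""
    return (
        f"{record.get('display_name')} ({record.get('id')}), "
        f"chromosome {record.get('seq_region_name')}: "
        f"{record.get('start')}-{record.get('end')}"
    )
```

With network access, summarise_gene(fetch_gene("UGT1A1")) would give a one-line description of the gene. The variants themselves, such as UGT1A1*28, would need a further query against Ensembl’s variation data.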
Wall and Wani continued to study the plant samples after the discovery of camptothecin. They were asked to analyse samples of Pacific yew (Taxus brevifolia).
The Pacific yew belongs to Taxus, one of five genera in the Taxaceae family. It is a slow-growing tree native to North America, where it is found in the shade of giant conifers on the banks of streams, in deep ravines and in wet passes. Its wood is hard but of limited use. The tree has few natural pests because most parts of it are poisonous. In 1971, Wall, Wani and their colleagues published a study in which they presented a compound isolated from the bark of the yew tree. It prevents microtubules from breaking down, stopping cancer cells from dividing. The compound was named paclitaxel (Taxol).
Paclitaxel was an effective cancer drug, but its production raised environmental concerns: stripping the bark to extract the compound killed the rare, slow-growing tree. As the natural source (yew tree bark) was not sufficient for large-scale production of the drug, a semi-synthetic method was developed in the 1990s using 10-deacetylbaccatin (10-DAB) from the needles of the yew tree as the raw material. 10-DAB is a precursor to paclitaxel, and by chemically attaching the missing side chain to it, pure paclitaxel can be produced in an ecologically sustainable way. Paclitaxel is one of the most commonly used medicines for breast and ovarian cancers.
The ELIXIR Core Data Resources (CDRs) have been selected based on their quality, wide usage, and long-term significance. They are essential to many fields of research, including genomics, proteomics, and drug development. ELIXIR Core Data Resources provide researchers with open and reliable access to biological datasets, promoting new discoveries and accelerating, for example, the development of new drugs, the understanding of diseases, and the identification of biomarkers.
The data analysis services and machine learning models provided by the ELIXIR infrastructure can help identify new drug candidates from large datasets. These resources and databases allow natural compounds to be analysed more quickly and accurately, supporting their development into safe and effective pharmaceuticals.
ENA (European Nucleotide Archive) is a database maintained by the European Bioinformatics Institute (EMBL-EBI) that stores and shares sequencing data from various organisms, including microbes, plants, animals, and humans.
Since ENA contains genomic and sequencing data from all forms of life, it is a key resource for biodiversity researchers analysing species’ genetic diversity, population genetics, and evolution. It aids in the identification of new species (via DNA barcoding and metagenomics) and the study of relationships between species (through phylogenetic analyses).
The genetic databases included in ENA enable large-scale meta-analyses and comparisons of genetic information across different populations or species. This supports progress in a wide range of research areas such as evolutionary biology, disease research, and medicine. ENA is openly accessible to researchers worldwide.
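To illustrate how sequencing records for a plant species might be listed from ENA, the Python sketch below composes a query for the ENA Portal API search endpoint. The function name ena_runs_url and the choice of returned fields are assumptions made for this example. The taxon identifier has to be looked up separately in the NCBI Taxonomy, so it is a parameter and is not hard-coded here.

```python
from urllib.parse import urlencode

ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def ena_runs_url(taxon_id: int, limit: int = 10) -> str:
    """Build a search URL for sequencing runs of a taxon and its subtree."""
    params = {
        "result": "read_run",               # search sequencing runs
        "query": f"tax_tree({taxon_id})",   # the taxon and all its descendants
        # Field selection is an assumption; adjust to the fields you need.
        "fields": "run_accession,scientific_name,instrument_platform",
        "format": "json",
        "limit": limit,
    }
    return f"{ENA_SEARCH}?{urlencode(params)}"
```

Passing the taxonomy identifier of Camptotheca acuminata or Taxus brevifolia would yield a URL that can be opened in a browser or fetched with urllib.request.urlopen.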
ChEBI (Chemical Entities of Biological Interest) is a curated biochemical database that contains information about biologically relevant small-molecule compounds. It provides accurate chemical and biological data on compounds such as drugs, metabolites, and natural products.
ChEBI includes precise information on chemical structure, molecular formula, mass, and isomeric details, which helps researchers analyse the chemical properties of pharmaceutical compounds.
Search example: The database can be used to look up the biological effects of paclitaxel and its target molecules.
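The search can also be scripted. The Python sketch below does not use ChEBI’s own web services. It assumes instead that ChEBI entries are searched by name through EMBL-EBI’s Ontology Lookup Service (OLS), which indexes ChEBI as an ontology. The helper names are illustrative, and the response layout read in extract_hits (a "response" object holding a "docs" list with "obo_id" and "label" keys) is an assumption, so missing keys fall back to empty values.

```python
import json
import urllib.request
from urllib.parse import urlencode

OLS_SEARCH = "https://www.ebi.ac.uk/ols4/api/search"


def chebi_search_url(name: str) -> str:
    """Build an OLS search URL restricted to the ChEBI ontology."""
    return f"{OLS_SEARCH}?{urlencode({'q': name, 'ontology': 'chebi'})}"


def extract_hits(payload: dict) -> list[tuple[str, str]]:
    """Return (identifier, label) pairs from a search response."""
    docs = payload.get("response", {}).get("docs", [])
    return [(d.get("obo_id", ""), d.get("label", "")) for d in docs]


def search_chebi(name: str) -> list[tuple[str, str]]:
    """Run the search over the network and return the matching entries."""
    with urllib.request.urlopen(chebi_search_url(name), timeout=30) as resp:
        return extract_hits(json.load(resp))
```

Calling search_chebi("paclitaxel") with network access would list the matching ChEBI identifiers and names, from which the full entry with its biological roles can then be opened.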
Ensembl is a genomics and bioinformatics database that provides analysed genomic data from a range of organisms, including humans, animals, plants, and microbes.
Search example: The main molecular target of paclitaxel is the tubulin protein. Ensembl provides genetic and protein structure data on tubulin and related genes, aiding research into drug resistance and the effects of mutations.
Ensembl includes information on genetic variations that may affect the efficacy and side effects of paclitaxel (Taxol). For instance, the genes encoding the enzymes CYP3A4 and CYP2C8, which metabolise paclitaxel, can carry variants that affect the drug’s effectiveness.
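As a rough sketch of how such variation data could be retrieved, the Python example below asks the Ensembl REST service for the known variants that overlap a gene and tallies them by predicted consequence. The gene identifier is a parameter: it would be the stable Ensembl ID of, for example, CYP3A4 or CYP2C8, obtained from a prior gene lookup. The function names are illustrative, and the "consequence_type" key is an assumption about the response format.

```python
import json
import urllib.request
from collections import Counter

ENSEMBL_REST = "https://rest.ensembl.org"


def variants_request(gene_id: str) -> urllib.request.Request:
    """Prepare a request for the variants overlapping the given gene."""
    url = f"{ENSEMBL_REST}/overlap/id/{gene_id}?feature=variation"
    return urllib.request.Request(url, headers={"Content-Type": "application/json"})


def fetch_variants(gene_id: str) -> list[dict]:
    """Download the overlapping variants. Needs network access."""
    with urllib.request.urlopen(variants_request(gene_id), timeout=60) as resp:
        return json.load(resp)


def count_consequences(features: list[dict]) -> Counter:
    """Count variants per consequence type; absent values become 'unknown'."""
    return Counter(f.get("consequence_type", "unknown") for f in features)
```

A tally of this kind only shows which types of variant are catalogued for a gene. Whether a given variant actually alters paclitaxel metabolism has to be established from pharmacogenetic studies.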
Ari Turunen
8.5.2025