Natural products & food informatics:
10:30am - 12:30pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Jose Medina-Franco, Organizer, Universidad Nacional Autonoma de Mexico; Dr. Abraham Madariaga, Presider, Universidad Nacional Autonoma de Mexico
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Division/Committee: [CINF] Division of Chemical Information

The symposium presents advances in informatics applications for natural products research, including but not limited to drug discovery and food chemistry. Examples of contributions include developments in compound databases, web servers, molecular modeling, screening, and analysis of natural products and food chemicals. Contributions to de novo design of compounds inspired by natural or food chemicals are also welcome.

Monday
CAS: providing access to more than 110 years of natural products research
10:30am - 10:50am USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Dr. Elaine N Cheeseman, Presenter, CAS
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Rapid growth in research on the use of natural products and related compounds for consumer goods and medical therapies is driving the need for scientific and patent information to guide research and protect innovations. For over 110 years, CAS has been providing access to research on chemistry and surrounding areas, including natural products, from patents, non-patents, and other document types. CAS’ intellectual indexing includes concepts, exemplified compounds, Markush structures, and reactions to enable the retrieval of information relating to natural products research. The sources can be searched in a variety of ways depending on the type of information which is required. This information is accessible on both STNext and SciFinder-n. A new CAS visualization tool, ChemScape, will be shown using a natural products example.
Monday
Collecting and standardizing natural products: The COCONUT project
10:50am - 11:10am USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Dr. Maria Sorokina, Presenter, Friedrich-Schiller University Jena; Christoph Steinbeck
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual

Natural products (NP), biomolecules produced by living organisms, inspire the pharmaceutical industry and drug design due to their bioactive structural properties. To facilitate in silico NP studies, we assembled the COllection of Open Natural Products (COCONUT) from more than 55 open resources and it is currently the biggest repository for validated and predicted NP. A multitude of molecular features is precalculated for the NPs in COCONUT, facilitating compound search and selection on physicochemical and structural properties.
The COCONUT database is free and open to all users. Its web interface allows for diverse simple searches (e.g. by molecule name, InChI, InChI key, SMILES, molecular formula, predicted bioactivity), advanced search by molecular features, but also chemical searches based on structure similarity and substructure. The database offers simplified downloads of all data and search results. The database can also be queried via a REST API. COCONUT is also one of the first big chemical databases entirely relying on document-based NoSQL technology, which greatly facilitates future data model enhancements and the addition of new features.
COCONUT web is freely available at https://coconut.naturalproducts.net.
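The structure-similarity search mentioned above is commonly built on a fingerprint comparison such as the Tanimoto coefficient. The following is a minimal sketch of that idea, not COCONUT's implementation: fingerprints are represented as sets of on-bit indices, and the compound names and bit values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each set lists the indices of the bits that are on.
library = {
    "compound_A": {1, 4, 9, 16, 25},
    "compound_B": {1, 4, 9, 30},
    "compound_C": {2, 3, 5},
}
query = {1, 4, 9, 16}

for name, score in similarity_search(query, library):
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit, and the threshold would be tuned to the fingerprint type.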

Monday
SistematX, a free web portal for the management of secondary metabolites
11:10am - 11:30am USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Natural products (NP) and their secondary metabolites are promising starting points for the development of prototypes and new drugs, and they account, directly or indirectly, for a large part of the new treatments against countless diseases. Computational approaches currently have a prominent role in NP-based drug discovery. These include the development and use of NP databases, which have provided access to chemical, biological, pharmacological, toxicological, and structural data on NPs. Here, a web portal of secondary metabolites called SistematX is described (freely available to consult at http://sistematx.ufpb.br). It was developed at the Federal University of Paraíba, Brazil, and first introduced in 2018, considering the following aspects: (a) the ability to search by structure, SMILES (Simplified Molecular-Input Line-Entry System) code, compound name and species; (b) compound data results include important characteristics for natural products chemistry, including the generation and visualization of Hydrogen-1 and Carbon-13 nuclear magnetic resonance spectra, calculation of some physicochemical, drug-like, and lead-like properties, as well as biological activity profiles; (c) the user can find specific information for the taxonomic rank (from family to species) of the plant from which the compound was isolated, the searched-for molecule, the bibliographic reference and Global Positioning System (GPS) coordinates; and (d) the ability to save chemical structures found by searching with the batch download option. SistematX also allows registered users to log in to the data management area and gain access to the administration pages.
Monday
NP navigator: A new look at the natural product chemical space
11:30am - 11:50am USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Natural products (NPs), being evolutionarily selected to bind to biological macromolecules, have remained an important source of inspiration for medicinal chemists even after the advent of efficient drug discovery technologies. Thus, there is a strong demand for efficient and user-friendly computational tools for analyzing large libraries of NPs. Over the past 20 years, many scientific reports have exhaustively analyzed the chemical space of NPs in the medicinal chemistry context. Most of them, however, simply report static results of a particular compound library analysis, without the possibility of exploring the chemical space of NPs via an interactive interface.
In this context, we present NP Navigator, a free, intuitive online tool for visualization and navigation through the NP chemical space. It is based on a hierarchical ensemble of more than 200 generative topographic maps (GTM), featuring NPs from the COlleCtion of Open NatUral producTs (COCONUT), bioactive compounds from ChEMBL and commercially available molecules from ZINC. Being a nonlinear probabilistic dimensionality reduction method, GTM is well suited to power NP Navigator. It has already proven to be a successful approach for the visualization and analysis of large chemical libraries. The hierarchical extension of GTM, combined with maximum common substructure (MCS) detection, links the generalized visualization of the known chemical space of NPs/NP-like molecules to the structural features of small compound clusters.
As a result, NP Navigator enables efficient analysis of different aspects of NPs: chemotype distribution, physicochemical properties, reported and/or predicted biological activity and commercial availability. The latter concerns not only purchasable NPs but also their close analogs that can be considered as pseudo-NPs. Users are welcome not only to browse through hundreds of thousands of compounds from ZINC, ChEMBL and COCONUT but also to project several external molecules that play the role of “chemical trackers”, making it possible to trace particular chemotypes in the NP chemical space and detect analogs of the compound of interest. The web-based implementation of NP Navigator is freely accessible at https://infochm.chimie.unistra.fr/npnav/chematlas_userspace.

Monday
Navigating the known natural products chemical space
11:50am - 12:10pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Natural products have traditionally played a key role in the discovery and development of flavour ingredients and biologically active molecules alike. The breakthroughs in analytical technologies, particularly NMR and a plethora of LCs, led to a dramatic increase in published structures of natural compounds. This progress was fueled by the interest in natural compounds as leads for novel active pharmaceutical ingredients and cross-fertilized by the major progress in organic synthesis.

Traditionally, the flavour and fragrance industry has intensively leveraged worldwide nature-derived materials generating highly effective ingredients. However, navigating the known natural products chemical space in a systematic way to identify high value compounds can be a daunting task for researchers. In order to fully leverage the biological relevance and biosynthetic origin of natural products, as well as their chemical characteristics and diversity, tools are needed to allow for a systematic representation and navigation of the chemical space of natural products.

Here we describe approaches for the representation and interactive visualization of the natural products chemical space, allowing the rapid identification of areas of relatively low interest, as well as high-value sub-spaces. The integration of structural data with physico-chemical parameters, occurrence in nature and biological activities enables effective navigation and the identification of spaces of interest and of structurally closely related natural products.

Monday
Exploring microbial and plant natural products in the MAP4 chemical space
12:10pm - 12:30pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Alice Capecchi, Presenter, Universität Bern; Prof. Jean-Louis Reymond, University of Bern
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
We recently reported the MinHashed Atom-Pair fingerprint up to a radius of four bonds (MAP4) as a new type of molecular fingerprint suitable for big data settings and applicable across very different molecule families, spanning from small molecule drugs to complex natural products (NPs), peptides and oligonucleotides. Here we used MAP4 to analyze NPs in NPAtlas and COCONUT (Collection of Open Natural Products), two recently reported public collections of NPs and NP-like molecules. We show that MAP4 organizes NPs in different structural families visible in a TMAP layout, enabling a global understanding of the database. These and other interactive MAP4 TMAPs can be found at https://tm.gdb.tools/map4/. Furthermore, we show that a support vector machine (SVM) classifier trained with MAP4 data can be used to classify NPs according to their plant, bacterial or fungal origin. These tools provide new opportunities to better understand the structural diversity of NPs.
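The MinHash step that gives MAP4 its name can be illustrated on its own. The sketch below is not the MAP4 implementation: it assumes each molecule has already been turned into a set of string "shingles" (the strings shown are hypothetical placeholders, not real atom-pair encodings) and uses a seeded SHA-1 from the standard library as a stand-in hash family.

```python
import hashlib
from typing import Iterable, List


def _hash(shingle: str, seed: int) -> int:
    """Seeded 64-bit hash; one seed stands in for one random permutation."""
    digest = hashlib.sha1(f"{seed}:{shingle}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(shingles: Iterable[str], n_perm: int = 256) -> List[int]:
    """Signature holding the minimum hash value of the set for each seed."""
    items = list(shingles)
    if not items:
        raise ValueError("need at least one shingle")
    return [min(_hash(s, seed) for s in items) for seed in range(n_perm)]


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching positions, which estimates the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical shingle sets; the exact Jaccard similarity of these two is 3/5.
mol_a = {"C|1|O", "C|2|N", "C|3|C", "O|4|N"}
mol_b = {"C|1|O", "C|2|N", "C|3|C", "N|2|N"}

print(estimated_jaccard(minhash(mol_a), minhash(mol_b)))
```

Because signatures have a fixed length regardless of molecule size, they can be compared and indexed quickly, which is what makes this family of fingerprints attractive for large collections.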

Natural products & food informatics:
02:00pm - 04:00pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Jose Medina-Franco, Organizer, Universidad Nacional Autonoma de Mexico; Dr. Abraham Madariaga, Presider, Universidad Nacional Autonoma de Mexico
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Division/Committee: [CINF] Division of Chemical Information


Monday
Most common functional groups occurring in natural products: A cheminformatics analysis
02:00pm - 02:20pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Dr. Peter Ertl, Presenter, Novartis
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual

The two most typical features that discriminate natural products from synthetic molecules are their characteristic scaffolds and unique functional groups (FGs). In this study we systematically investigate the distribution of FGs in natural products from a cheminformatics perspective by comparing FG frequencies in natural products with those found in average synthetic molecules. We thereby aim for the identification of FGs that are characteristic for molecules produced by living organisms. In our analysis we also include information about the natural origins of the structures investigated allowing us to link the occurrence of specific FGs to the individual producing species. Our findings have the potential for being applied in a medicinal chemistry context concerning the synthesis of natural product-like libraries and natural product-inspired fragment collections. The results may be used also to support compound derivatization strategies and the design of “non-natural” natural products.
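The frequency comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: functional group (FG) detection is assumed to have been done already, each molecule is represented by the set of FG labels it contains, and the toy data and the pseudo-count are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def fg_frequencies(molecules: Iterable[Set[str]]) -> Dict[str, float]:
    """Fraction of molecules that contain each functional group at least once."""
    mols = list(molecules)
    if not mols:
        raise ValueError("need at least one molecule")
    counts = Counter(fg for groups in mols for fg in groups)
    return {fg: n / len(mols) for fg, n in counts.items()}


def enrichment(
    np_freq: Dict[str, float], syn_freq: Dict[str, float], pseudo: float = 0.01
) -> Dict[str, float]:
    """Ratio of NP to synthetic frequency; the pseudo-count avoids division by zero."""
    groups = set(np_freq) | set(syn_freq)
    return {
        fg: (np_freq.get(fg, 0.0) + pseudo) / (syn_freq.get(fg, 0.0) + pseudo)
        for fg in groups
    }


# Hypothetical toy data: one set of FG labels per molecule.
natural = [
    {"hydroxyl", "ester"},
    {"hydroxyl", "lactone"},
    {"hydroxyl", "ether"},
    {"ester", "alkene"},
]
synthetic = [
    {"amide", "aryl halide"},
    {"amide", "sulfonamide"},
    {"hydroxyl", "amide"},
    {"ether", "aryl halide"},
]

ratios = enrichment(fg_frequencies(natural), fg_frequencies(synthetic))
for fg, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{fg}: {ratio:.2f}")
```

Ratios well above 1 flag groups that are over-represented in the natural product set; ratios well below 1 flag groups more typical of synthetic molecules.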

Monday
Fragment library of natural products for drug discovery
02:20pm - 02:40pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Ana Luisa Chávez Hernández, Presenter, UNAM; Jose Medina-Franco, Universidad Nacional Autonoma de Mexico; Dr. Norberto Sánchez-Cruz, Chemotargets SL
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Natural products (NP) possess unique functional groups. However, only small amounts of them are typically obtained from extraction and purification procedures. To maximize the use of NP, we propose generating fragment libraries from NP that can be used as building blocks for the synthesis of pseudo-natural products. We recently generated fragment libraries from the COlleCtion of Open NatUral producTs (COCONUT) and from other reference data sets: food chemical compounds (FooDB); compounds that showed no activity despite having been thoroughly tested (DCM); and two data sets related to COVID-19 research, Chemical Abstracts Service (CAS) and inhibitors of the main protease of SARS-CoV-2 (3CLP). Fragments were generated using the Retrosynthetic Combinatorial Analysis Procedure (RECAP) algorithm, and we then calculated structural features such as the fraction of chiral carbons, the fraction of sp3 carbons, and the number of carbon, oxygen, and nitrogen atoms. In general, molecular fragments retained the structural characteristics of their original compounds. The fragments could also be used as building blocks in de novo design.

Monday
Isothiocyanates: A recent chemoinformatic study
02:40pm - 03:00pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Araceli Guerrero Alonso, Presenter; Mayra Antunez-Mojica; Jose Medina-Franco, Universidad Nacional Autonoma de Mexico
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Isothiocyanates (ITCs) are naturally occurring organosulfur molecules derived from glucosinolates. Most of them are present in cruciferous vegetables, where they are responsible for the plant's sharp taste and act as a defense system. ITCs have recently come to be considered of important nutritional and medical interest.

This study aimed to extract chemical structures containing the isothiocyanate moiety reported in four databases: COCONUT, FooDB, PDB, and DrugBank. As a result of this review, we obtained 154 isothiocyanate-like compounds, which were classified according to their chemical skeleton into seven categories: acyclic (69), cyclic (3), polycyclic (41), aromatic (14), polyaromatic (3), indolic (18) and glycosylated (6) (Figure 1). 14 of 24 were found in Brassica oleracea varieties, according to FooDB.

Molinspiration, SwissTargetPrediction, and DataWarrior were used to obtain the physicochemical properties. The results indicated that most ITCs had a molecular weight of 100 to 450 Da, a clogP of around 1 to 6 and a PSA of 40 to 120 Å².

Moreover, SwissTargetPrediction, PASSonline, ChEMBL, and Epigenetic Target Profiler were used to predict the activity type and molecular targets of the 154 ITCs. In general, the predictions showed activity on macrophage migration inhibitory factor, transient receptor potential cation channel subfamily A member 1, membrane receptors such as Serotonin 1b (5-HT1b), and epigenetic targets such as serine-protein kinase ATM and DNA lyase. The ITCs were also predicted to inhibit Cytochrome P450 2E1 and to act as apoptosis agonists.

This review has provided insights into the physicochemical properties of 154 isothiocyanates reported in four databases. In general, ITCs could be involved in anti-inflammatory and cancer processes with a potential pharmacological effect.
Figure 1. Classification of Isothiocyanates (ITCs).


Monday
In silico characteristics of bioactive peptides from bovine milk proteins – biological and chemical approach
03:00pm - 03:20pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
The aim of the study was to characterize in silico the fragments of milk protein fractions in relation to their biological activity in the prevention of diet-related diseases.
Sequences of milk proteins, taken from the UniProt database, were subjected to simulated proteolysis via the joint action of pepsin, trypsin and chymotrypsin, followed by a search for bioactive compounds among the resulting peptides using the BIOPEP-UWM database.
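The simulated proteolysis step can be sketched as a rule-based digestion followed by a table lookup. This is a minimal illustration, not the BIOPEP-UWM procedure: the cleavage rules are deliberately simplified assumptions (each enzyme cuts on the C-terminal side of the listed residues, with no exceptions), and the sequence and the lookup table are hypothetical placeholders rather than database content.

```python
from typing import Dict, List

# Simplified, assumed specificities: cleave on the C-terminal side of these residues.
CLEAVE_AFTER = {
    "trypsin": set("KR"),
    "chymotrypsin": set("FYWL"),
    "pepsin": set("FL"),
}


def digest(sequence: str, enzymes: List[str]) -> List[str]:
    """Cut the sequence after every residue recognised by any listed enzyme."""
    sites = set().union(*(CLEAVE_AFTER[e] for e in enzymes))
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing piece with no cleavage site
    return fragments


def annotate(fragments: List[str], known: Dict[str, str]) -> Dict[str, str]:
    """Keep only fragments that appear in a table of known bioactive peptides."""
    return {frag: known[frag] for frag in fragments if frag in known}


# Hypothetical sequence and lookup table, for illustration only.
sequence = "AVKGFSTRPLM"
known_peptides = {"GF": "placeholder activity A", "PL": "placeholder activity B"}

fragments = digest(sequence, ["pepsin", "trypsin", "chymotrypsin"])
print(fragments)
print(annotate(fragments, known_peptides))
```

Treating the joint action of the three enzymes as the union of their cleavage sites is itself a simplification; real digestion depends on conditions and on neighbouring residues.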
Hydrophobicity, isoelectric point, charge at pH 7 and amphiphilicity were calculated on the basis of amino acid sequences. Parameters such as solubility in water, probability of intestinal absorption, half-life and volume of distribution were predicted with the ADMETLab program.
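Two of the sequence-based descriptors named above can be computed directly from the amino acid sequence. The sketch below is an assumption-laden illustration, since the abstract does not say which scales were used: hydrophobicity is taken as the mean Kyte-Doolittle hydropathy, and net charge is estimated with the Henderson-Hasselbalch equation using one illustrative set of pKa values (published pKa sets differ).

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Illustrative pKa values (an assumption; several alternative sets exist).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy of the sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def net_charge(sequence: str, ph: float = 7.0) -> float:
    """Estimated net charge from fractional protonation of ionisable groups."""
    def positive(pka: float) -> float:
        return 1.0 / (1.0 + 10 ** (ph - pka))

    def negative(pka: float) -> float:
        return -1.0 / (1.0 + 10 ** (pka - ph))

    charge = positive(N_TERM_PKA) + negative(C_TERM_PKA)
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += positive(POSITIVE_PKA[aa])
        elif aa in NEGATIVE_PKA:
            charge += negative(NEGATIVE_PKA[aa])
    return charge


for peptide in ("IL", "KK", "DD"):
    print(peptide, round(gravy(peptide), 2), round(net_charge(peptide), 2))
```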
Fragments with 19 types of bioactivity have been identified in the sequences of milk proteins. All tested proteins contained fragments corresponding to the amino acid sequences with the activity of ACE, DPP-IV or DPP-III inhibitors, antioxidant, opioid and anticoagulant peptides. The greatest number of bioactive fragments was detected in the following amino acid sequences: lactoferrin, BSA and β-CN. Comparing casein and whey proteins, the latter turned out to be a better potential source of peptides in total and of DPP-IV inhibitory peptides, while the former was a better source of ACE-inhibitory peptides. As a result of simulated hydrolysis of cow's milk proteins, 53 dipeptides and 7 tripeptides were obtained. ACE and DPP-IV inhibitors were the most numerous. Only 5 out of 60 peptides showed antioxidant activity. Digestive enzymes released DPP-IV inhibitors more effectively than ACE inhibitors. All peptides described in this study have been classified by the ToxinPred program as non-toxic. All analyzed peptides are predicted to be highly water-soluble compounds. Dipeptides containing N-terminal isoleucine residues are characterized by a relatively high predicted probability of absorption in the gut. Dipeptides containing glycine, alanine, leucine, serine and threonine residues at the C-terminus are characterized by a relatively long predicted half-life.

Monday
Discovery and design of non-hemolytic AMPs using artificial intelligence
03:20pm - 03:40pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Dr. Fabien Plisson, Presenter, LANGEBIO CINVESTAV-IPN
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Antimicrobial peptides (AMPs) are polypeptide sequences of 12-50 residues characterized by their charged and hydrophobic cores that were long thought to kill bacteria by a general mechanism: disrupting their membranes, leading to cell lysis and death. Their direct antibacterial activities and the lack of bacterial resistance have stimulated interest in their therapeutic use against antibiotic-resistant infections. Major limitations preventing AMPs from translating to the clinic are their low metabolic stability, poor oral bioavailability and high toxicity. Reducing hurdles to clinical trials without compromising the therapeutic promise of peptide candidates has become an essential step in peptide-based drug design.

In this presentation, I will discuss the development of machine-learning models and outlier detection methods that ensure robust predictions for the discovery of AMPs and the design of novel peptides with reduced hemolytic activity. Our best models, gradient boosting classifiers, predicted the hemolytic nature from any peptide sequence with 95–97% accuracy. Nearly 70% of AMPs were predicted as hemolytic peptides. Applying multivariate outlier detection models, we found that 273 AMPs (9%) could not be predicted reliably. Our combined approach led to the discovery of 34 high-confidence non-hemolytic natural AMPs, the de novo design of 507 non-hemolytic peptides, and the guidelines for non-hemolytic peptide design. Finally, we will present HAPMOD, our new web server for hemolytic activity prediction with multivariate outlier detection.
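The idea of flagging peptides whose predictions cannot be trusted can be illustrated with a simple distance-based check. The abstract does not say which outlier detection models were used, so the sketch below swaps in a Mahalanobis distance test on two descriptors as a stand-in; the descriptor values are hypothetical, and the cutoff is the 99% quantile of a chi-square distribution with two degrees of freedom.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Cov = Tuple[float, float, float]  # (var_x, cov_xy, var_y)


def fit_2d(points: List[Point]) -> Tuple[Point, Cov]:
    """Sample mean and covariance of two-dimensional training descriptors."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three training points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point: Point, mean: Point, cov: Cov) -> float:
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def is_reliable(point: Point, mean: Point, cov: Cov, cutoff: float = 9.21) -> bool:
    """True when the point lies inside the region covered by the training data."""
    return mahalanobis_sq(point, mean, cov) <= cutoff


# Hypothetical descriptors (for example, net charge and a hydrophobicity score).
training = [(1.0, 0.2), (2.0, 0.5), (3.0, 0.1), (2.5, 0.9), (1.5, 0.6), (2.0, 0.3)]
mean, cov = fit_2d(training)

print(is_reliable((2.0, 0.4), mean, cov))
print(is_reliable((10.0, 5.0), mean, cov))
```

A classifier's prediction would then be reported only for peptides that pass the check, which mirrors the combined approach described in the abstract without claiming to reproduce it.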

Monday
Nuisance substructures and aggregators in a database of food compounds (FooDB) as source for putative false positives and promiscuity in their bioassays
03:40pm - 04:00pm USA / Canada - Eastern - August 23, 2021 | Room: Zoom Room 03
Division: [CINF] Division of Chemical Information
Session Type: Oral - Virtual
Understanding the mechanism of the biological action in health produced by food constituents is an area of intense research, which is performed through the testing of food compounds, mainly non-nutrient ones, in biochemical and biological assays. A positive result in these assays can be artifactual due to some properties of the compound, such as chemical reactivity, membrane disruption or redox cycling, or through the formation of colloidal aggregates. Also, in many cases, these features result in promiscuous compounds that act non-specifically on all types of proteins. The results of assays with these compounds are potentially misleading. Within the field of drug discovery, a wide set of so-called “nuisance” filters has been generated after decades of extensive experience in high-throughput screening and medicinal chemistry, to identify substructures that frequently result in assay artifacts and/or promiscuity, e.g. the Pan-Assay INterference compoundS (PAINS), although there are others. In the sub-area of natural products, of great importance for food compounds, an analogous concept has been proposed through the so-called Invalid Metabolic PanaceaS (IMPs), a reduced set of natural products which display a huge set of activities and have absorbed much of the work in the field, suggesting the need to divert the focus to other unexplored molecules. Finally, predictive tools to identify putative aggregators have also been developed.
In this presentation we will analyze the presence of nuisance substructures, IMPs and aggregators in a large database of food compounds (the FooDB), which should be useful to the researchers working in the field, in order to be aware of possible artifactual/promiscuity issues in their assays. The latest developments of our work in this area will be presented.