CoVex stories - Part one
(April 11, 2020)
Check the CoVex website: https://exbio.wzw.tum.de/covex/
This blog is continued in part 2, which documents the CoVex progress and the hackathon developments.
Coronavirus Disease-2019 (COVID-19), caused by the SARS-CoV-2 virus, has become a pandemic. To help address this crisis, drug repurposing is an attractive approach that offers new therapeutic options through the identification of alternative usages for already approved drugs. Numerous research groups around the world are working on treatment options for COVID-19 and are joining efforts to identify drugs that can be repurposed. A promising strategy for identifying drugs is to use network and systems medicine approaches, which provide a comprehensive understanding of the infection mechanisms while not only focussing on the virus and its direct interaction partners, but also including the host protein interaction network.
What is CoVex?
CoVex is the first network and systems medicine online data analysis platform that integrates virus-human interaction data for SARS-CoV-2 and SARS-CoV-1 (see the following sections for more detailed descriptions).
Some figures and an introduction video
Because time matters, we decided to go online with preliminary results and a first prototype of our software, which we will continuously improve over the next few weeks. Since the main functionality is implemented, we hope that this beta version is already useful for other researchers. However, we would like to emphasize that this is a very first prototype release without any deeper scientific evaluation and with no preclinical or clinical validation. Under normal circumstances, we would not have gone public with CoVex at this stage; we would first have verified and validated our methods and findings. This evaluation and validation will follow over the next weeks and months. We will also add more experimental molecular data as we receive it. As soon as new data or results become available, we will share them on this blog.
What is the rationale behind CoVex?
In the first release of CoVex, we included the experimental virus-host protein-protein interaction (PPI) data of Gordon and colleagues (Gordon et al. 2020). The goal of CoVex is not only to make this data more accessible and to establish the virus-host interactome of SARS-CoV-2 as a basis for additional research, but also to crowd-source host protein drug repurposing efforts by offering different network analysis approaches to obtain drug repurposing candidates. CoVex is available for biological, medical and computational researchers, as well as the general public. Users can already explore the latest data available and perform custom analyses at https://exbio.wzw.tum.de/covex/. CoVex is made publicly available "AS IS" without any warranty whatsoever.
How the CoVex project began and why we do this
We are a team of researchers from the Chair of Experimental Bioinformatics (ExBio) at the Technical University of Munich (TUM) and started working on predicting candidates for drug repurposing against COVID-19 (codename: ExBio vs. COVID-19) on March 26, 2020. We assembled 15 ExBio lab members and two TUM virologists in three sub-teams coordinated by the ExBio chair Jan Baumbach and our science manager Nina Wenke: (1) a network and drugs team: Sepideh Sadegh, Gihanna Galindez, Marisol Salgado Albarrán, Tim Rose, David B. Blumenthal, Nina Kerstin Wenke, Tim Kacprowski, Josch Pauling, (2) a web team: Julian Matschinske, Julian Späth, Reza Nasirigerdeh, Mhaned Oubounyt, Kevin Yuan, Markus List, and (3) a virology team: Andreas Pichlmair, Alexey Stukalov. We used a mix of Telegram, Slack and Skype channels to communicate and to organize an internal lab hackathon, from which CoVex, the Coronavirus Explorer, recently emerged.
With CoVex, we aim to make network and systems medicine approaches, and their direct application to relevant data, available and easily accessible for a broad audience. Importantly, with its network-based approach, the CoVex platform may significantly speed up the identification of potential drug targets for the treatment of COVID-19. For this purpose, CoVex provides a user-friendly interface and integrates virus-host protein interactions of SARS-CoV-2 and SARS-CoV-1, the human protein-protein interaction network, and a comprehensive drug-target protein network. CoVex hence offers a systems-level approach to identify candidates for drug repurposing against SARS-CoV-2.
SARS-CoV-1 and SARS-CoV-2 virus/host interactions
SARS-CoV-2 (Gordon et al., 2020): SARS-CoV-2 viral proteins and their interactions with human host proteins were obtained from the publicly available affinity purification-mass spectrometry (AP-MS) dataset from the Krogan Lab. This dataset contains the human interactors for 26 of the total of 29 SARS-CoV-2 proteins plus one mutated virus protein and revealed 332 SARS-CoV-2-human protein-protein interactions (Gordon et al. 2020).
SARS-CoV-1 (VirHostNet 2.0): VirHostNet included 24 interactions from 14 publications with experimental validation by at least one of the following assays: co-immunoprecipitation, two-hybrid, pull-down, mass spectrometry (Guirimand, Delmotte, and Navratil 2015).
SARS-CoV-1 (Pfefferle et al., 2011): Pfefferle and colleagues identified 86 human proteins that interact with SARS-CoV-1 viral proteins from yeast two-hybrid screening. These human proteins comprise 44 high-confidence interactors and 42 low-confidence interactors. In addition, they compiled 28 SARS-CoV-1-human interactions from previously published studies that are supported by experiments, such as yeast two-hybrid, co-immunoprecipitation, and glutathione S-transferase (GST) pull-down assays (Pfefferle et al. 2011). For all interactors, the gene symbols were mapped to UniProt IDs for integration into our network.
Protein-protein interaction data
The human protein-protein interaction (PPI) network was obtained from the integrated interactions database (IID) (Kotlyar et al. 2016), which compiles interaction data from multiple databases (BioGRID, DIP, HPRD, I2D, InnateDB, IntAct, and MINT) (Oughtred et al. 2019; Xenarios et al. 2000; Liu and Hu 2010; Breuer et al. 2013; Kerrien et al. 2012; Licata et al. 2012; Brown and Jurisica 2005).
Drug-target edges were obtained from multiple sources, namely ChEMBL, DrugBank, DrugCentral, Therapeutic Target Database, Guide To Pharmacology (approved drugs), PharmGKB, and BindingDB (Gaulton et al. 2017; Wishart et al. 2018; Ursu et al. 2019; Thorn, Klein, and Altman 2013; Gilson et al. 2016; Wang et al. 2020; Armstrong et al. 2020). For drug-target edges based on bioactivity data, only drugs with strong affinity values (EC50, IC50, Kd, or Ki < 10 μM) were considered, since these parameters are indicators of drug potency. All drugs with DrugBank IDs were included in the integrated network.
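To make the affinity cutoff concrete, here is a minimal sketch of how such a filter could look. The record layout (drug, target, measure, value in μM) and the drug and protein names are hypothetical and chosen only for illustration; this is not the actual CoVex import code.

```python
# Hypothetical bioactivity records: (drug, target, measure, value in micromolar).
AFFINITY_MEASURES = {"EC50", "IC50", "Kd", "Ki"}
THRESHOLD_UM = 10.0  # cutoff stated in the text: strictly below 10 micromolar


def filter_drug_target_edges(records, threshold_um=THRESHOLD_UM):
    """Keep (drug, target) pairs backed by at least one strong affinity value."""
    edges = set()
    for drug, target, measure, value_um in records:
        if measure in AFFINITY_MEASURES and value_um < threshold_um:
            edges.add((drug, target))
    return edges


records = [
    ("drugA", "P1", "IC50", 0.5),   # kept: well below the cutoff
    ("drugA", "P2", "IC50", 50.0),  # dropped: too weak
    ("drugB", "P1", "Ki", 10.0),    # dropped: not strictly below 10
    ("drugC", "P3", "Kd", 2.0),     # kept
]
print(sorted(filter_drug_target_edges(records)))
# [('drugA', 'P1'), ('drugC', 'P3')]
```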
Clinical trial data
Drugs undergoing clinical trials for the treatment of COVID-19 were collected and integrated from ClinicalTrials (clinicaltrials.gov), the EU Clinical Trials Register (https://www.clinicaltrialsregister.eu/) and the International Clinical Trials Registry Platform, WHO (https://www.who.int/ictrp/en/).
Network and systems medicine tools
Several network-based algorithms are implemented in CoVex, which can be used to prioritize drug targets for the treatment of COVID-19. Users can perform several types of network analyses: TrustRank (Gyöngyi, Garcia-Molina, and Pedersen 2004), our newly developed multi-Steiner tree algorithm, KeyPathwayMiner (Alcaraz et al. 2016), and closeness centrality. These tools output a subnetwork or set of proteins that can provide insights into the underlying mechanisms during SARS-CoV-2 infection. This allows researchers to identify the associated biological pathways and the drugs that target them and are hence candidates for treatment.
TrustRank is a node centrality measure that ranks nodes in a network based on how well they are connected to a (trusted) set of seed nodes (Gyöngyi, Garcia-Molina, and Pedersen 2004). It is a variant of Google’s PageRank algorithm, where “trust” is iteratively propagated from seed nodes to neighboring nodes using the network structure. The node centralities are initialized by assigning uniform probabilities to all seeds and zero probabilities to all non-seed nodes. In CoVex, the TrustRank algorithm can be run starting from a user-defined set of seed proteins to obtain a ranked list of proteins in the PPI network that could be prioritized as drug targets. TrustRank can also be run to rank the drugs in the integrated network. A good introduction to the use of TrustRank in biomedicine can be found, for example, in Hyung et al. (2019).
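The propagation described above can be sketched in a few lines of plain Python. This is an illustrative toy version, not the CoVex implementation: the damping factor of 0.85, the fixed number of iterations, the function name and the five-node network are all assumptions made for this example.

```python
def trustrank(adjacency, seeds, damping=0.85, iterations=100):
    """Propagate trust from seed nodes through an undirected network.

    adjacency: dict mapping each node to a list of its neighbors.
    Nodes without neighbors simply keep no propagated trust in this sketch.
    """
    seed_set = set(seeds)
    # Uniform probability on the seeds, zero on every other node.
    prior = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in adjacency}
    scores = dict(prior)
    for _ in range(iterations):
        # Each round: a (1 - damping) share returns to the seeds ...
        updated = {n: (1.0 - damping) * prior[n] for n in adjacency}
        # ... and the rest is split evenly among each node's neighbors.
        for node, neighbors in adjacency.items():
            if neighbors:
                share = damping * scores[node] / len(neighbors)
                for neighbor in neighbors:
                    updated[neighbor] += share
        scores = updated
    return scores


# Toy PPI network: seeds A and B both bind C; D and E lie further away.
adjacency = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = trustrank(adjacency, seeds=["A", "B"])
non_seeds = sorted(["C", "D", "E"], key=lambda n: scores[n], reverse=True)
print(non_seeds)
# ['C', 'D', 'E']
```

In this toy network, the non-seed protein closest to both seeds (C) receives the highest score, which is the behavior one would want when prioritizing candidate drug targets around a set of virus-interacting proteins.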
Closeness centrality is a node centrality measure that ranks the nodes in a network based on the lengths of their shortest paths to all other nodes in the network. In CoVex, we use a modified version suggested by Kacprowski and colleagues (Kacprowski, Doncheva, and Albrecht 2013), where only the shortest paths to a set of selected seed nodes are considered. Like TrustRank, it can be used both for detecting drug targets and for identifying promising drugs.
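As an illustration of the seed-restricted idea, the sketch below scores each node by the inverse of its mean shortest-path distance to the seeds. This particular formula is an assumption made for the example; the exact definition used in CoVex and by Kacprowski and colleagues may differ. The function names and the toy network are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def seed_closeness(adjacency, seeds):
    """Score = number of seeds / summed distance to all seeds.

    Nodes that cannot reach every seed, or whose summed distance is zero
    (a lone seed compared with itself), get a score of 0.0 in this sketch.
    """
    seeds = list(seeds)
    totals = {n: 0 for n in adjacency}
    reached = {n: 0 for n in adjacency}
    for seed in seeds:
        for node, distance in bfs_distances(adjacency, seed).items():
            totals[node] += distance
            reached[node] += 1
    return {
        n: (len(seeds) / totals[n]
            if reached[n] == len(seeds) and totals[n] > 0 else 0.0)
        for n in adjacency
    }


adjacency = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
closeness = seed_closeness(adjacency, seeds=["A", "B"])
print(closeness["C"], closeness["D"])
# 1.0 0.5
```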
The Steiner tree problem is a classical combinatorial optimization problem. It asks for a subgraph of minimum size that connects a given set of seed nodes (in our case, proteins). CoVex uses a novel method on the PPI network which approximately computes multiple Steiner trees. The user can select the set of proteins of interest and specify the number of Steiner trees to be found. The algorithm returns one or several PPI subnetwork(s) that connect the selected seed proteins as candidate mechanism(s) involved in COVID-19 progression. From these mechanistic subnetwork(s), we can then extract essential proteins and, thus, the most promising drug targets and repurposable drugs for COVID-19.
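To give a feeling for what "connecting the seeds with a small subnetwork" means, the sketch below uses a classical shortest-path heuristic: start from one seed and repeatedly attach the nearest seed that is not yet connected. This is explicitly not the multi-Steiner method used in CoVex; it computes a single approximate tree on an unweighted network, and the function name and toy network are assumptions made for illustration.

```python
from collections import deque


def steiner_tree_heuristic(adjacency, seeds):
    """Greedy approximation: grow a tree by attaching the nearest missing seed."""
    seeds = list(seeds)
    tree_nodes = {seeds[0]}
    tree_edges = set()
    remaining = set(seeds[1:])
    while remaining:
        # Multi-source breadth-first search from every node already in the tree.
        parent = {node: None for node in tree_nodes}
        queue = deque(tree_nodes)
        found = None
        while queue:
            node = queue.popleft()
            if node in remaining:
                found = node
                break
            for neighbor in adjacency[node]:
                if neighbor not in parent:
                    parent[neighbor] = node
                    queue.append(neighbor)
        if found is None:
            raise ValueError("seeds are not all in one connected component")
        # Walk back from the found seed to the tree and add that path.
        node = found
        while parent[node] is not None:
            tree_edges.add(frozenset((node, parent[node])))
            tree_nodes.add(node)
            node = parent[node]
        remaining.discard(found)
    return tree_nodes, tree_edges


# Toy network: X links all three seeds directly; Y-Z is a longer detour.
adjacency = {
    "A": ["X", "Y"],
    "B": ["X", "Z"],
    "C": ["X"],
    "X": ["A", "B", "C"],
    "Y": ["A", "Z"],
    "Z": ["Y", "B"],
}
nodes, edges = steiner_tree_heuristic(adjacency, seeds=["A", "B", "C"])
print(sorted(nodes))
# ['A', 'B', 'C', 'X']
```

Here the non-seed protein X is pulled into the subnetwork because it is needed to connect the seeds. Such connector proteins are the kind of candidate that a mechanistic subnetwork is meant to surface.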
KeyPathwayMiner is a network enrichment tool that identifies condition-specific subnetworks (key pathways) (Alcaraz et al. 2016). In CoVex, KeyPathwayMiner can extract a maximally connected subnetwork of the human PPI network starting from a user-defined set of proteins of interest that, with a few exceptions, consists of proteins directly associated with the virus. The exceptions can be interpreted as potential key proteins participating in the deregulated subnetwork. The subnetwork contains proteins that are putatively affected by SARS-CoV-2 infection and thus forms a candidate mechanism that could be targeted with drugs.
Here, we provide a short overview of the most important analysis features available in CoVex.
Visualization of virus-human interaction network
One can select the interaction dataset to use (SARS-CoV-2 or SARS-CoV-1). The networks can be zoomed, and one may select the viral proteins and visualize the human interactor proteins. Additional information is shown in the right panel.
Network analysis and visualization
By selecting a set of proteins of interest (seed proteins) as input, one may extract and visualize sub-networks (candidate mechanisms driving disease progression). Choose from four analysis options: TrustRank, Closeness Centrality, KeyPathwayMiner, and Multi-Steiner (see “Network and systems medicine tools” for a short description of the algorithms).
Identification of candidate drugs for repurposing
One may retrieve drugs targeting the mechanistic sub-networks or the set of proteins potentially affected by the virus. CoVex will output the drugs that target those proteins in sub-networks/mechanisms.
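Conceptually, this step is a lookup in the drug-target network. The sketch below shows one simple way to do it, listing drugs by how many proteins of the subnetwork they target. Ranking by hit count is an assumption made for this example (CoVex itself offers TrustRank and closeness centrality for ranking drugs), and all drug and protein names are hypothetical.

```python
from collections import defaultdict


def drugs_targeting(subnetwork_proteins, drug_target_edges):
    """Return (drug, targeted proteins) pairs, most hits first, then by name."""
    proteins = set(subnetwork_proteins)
    hits = defaultdict(set)
    for drug, target in drug_target_edges:
        if target in proteins:
            hits[drug].add(target)
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


drug_target_edges = [
    ("drugA", "P1"),
    ("drugA", "P2"),
    ("drugB", "P2"),
    ("drugC", "P9"),  # targets a protein outside the subnetwork
]
candidates = drugs_targeting({"P1", "P2", "P3"}, drug_target_edges)
print([drug for drug, _ in candidates])
# ['drugA', 'drugB']
```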
In the coming weeks, our team will continue to update CoVex by including the latest experimental data for SARS-CoV-2 and implementing new functionality. We would be very pleased to receive feedback on how to improve our platform and meet the needs of the COVID-19 community. We are convinced that only by combining scientific efforts will we be able to find solutions for COVID-19 treatment. Thanks in advance!
Alcaraz, Nicolas, Markus List, Martin Dissing-Hansen, Marc Rehmsmeier, Qihua Tan, Jan Mollenhauer, Henrik J. Ditzel, and Jan Baumbach. 2016. “Robust de Novo Pathway Enrichment with KeyPathwayMiner 5.” F1000Research 5 (June): 1531.
Armstrong, Jane F., Elena Faccenda, Simon D. Harding, Adam J. Pawson, Christopher Southan, Joanna L. Sharman, Brice Campo, et al. 2020. “The IUPHAR/BPS Guide to PHARMACOLOGY in 2020: Extending Immunopharmacology Content and Introducing the IUPHAR/MMV Guide to MALARIA PHARMACOLOGY.” Nucleic Acids Research 48 (D1): D1006–21.
Breuer, Karin, Amir K. Foroushani, Matthew R. Laird, Carol Chen, Anastasia Sribnaia, Raymond Lo, Geoffrey L. Winsor, Robert E. W. Hancock, Fiona S. L. Brinkman, and David J. Lynn. 2013. “InnateDB: Systems Biology of Innate Immunity and Beyond—recent Updates and Continuing Curation.” Nucleic Acids Research 41 (D1): D1228–33.
Brown, Kevin R., and Igor Jurisica. 2005. “Online Predicted Human Interaction Database.” Bioinformatics 21 (9): 2076–82.
Gaulton, Anna, Anne Hersey, Michał Nowotka, A. Patrícia Bento, Jon Chambers, David Mendez, Prudence Mutowo, et al. 2017. “The ChEMBL Database in 2017.” Nucleic Acids Research 45 (D1): D945–54.
Gilson, Michael K., Tiqing Liu, Michael Baitaluk, George Nicola, Linda Hwang, and Jenny Chong. 2016. “BindingDB in 2015: A Public Database for Medicinal Chemistry, Computational Chemistry and Systems Pharmacology.” Nucleic Acids Research 44 (D1): D1045–53.
Gordon, D. E., G. M. Jang, M. Bouhaddou, J. Xu, and K. Obernier. 2020. “A SARS-CoV-2-Human Protein-Protein Interaction Map Reveals Drug Targets and Potential Drug-Repurposing.” BioRxiv.
Guirimand, Thibaut, Stéphane Delmotte, and Vincent Navratil. 2015. “VirHostNet 2.0: Surfing on the Web of Virus/host Molecular Interactions Data.” Nucleic Acids Research 43 (Database issue): D583–87.
Gyöngyi, Zoltán, Hector Garcia-Molina, and Jan Pedersen. 2004. “Combating Web Spam with TrustRank.” In Proceedings 2004 VLDB Conference, edited by Mario A. Nascimento, M. Tamer Özsu, Donald Kossmann, Renée J. Miller, José A. Blakeley, and Berni Schiefer, 576–87. St Louis: Morgan Kaufmann.
Hyung, Daejin, Ann-Marie Mallon, Dong Soo Kyung, Soo Young Cho, and Je Kyung Seong. 2019. “TarGo: Network Based Target Gene Selection System for Human Disease Related Mouse Models.” Laboratory Animal Research 35 (November): 23.
Kacprowski, Tim, Nadezhda T. Doncheva, and Mario Albrecht. 2013. “NetworkPrioritizer: A Versatile Tool for Network-Based Prioritization of Candidate Disease Genes or Other Molecules.” Bioinformatics 29 (11): 1471–73.
Kerrien, Samuel, Bruno Aranda, Lionel Breuza, Alan Bridge, Fiona Broackes-Carter, Carol Chen, Margaret Duesbury, et al. 2012. “The IntAct Molecular Interaction Database in 2012.” Nucleic Acids Research 40 (Database issue): D841–46.
Kotlyar, Max, Chiara Pastrello, Nicholas Sheahan, and Igor Jurisica. 2016. “Integrated Interactions Database: Tissue-Specific View of the Human and Model Organism Interactomes.” Nucleic Acids Research 44 (D1): D536–41.
Licata, Luana, Leonardo Briganti, Daniele Peluso, Livia Perfetto, Marta Iannuccelli, Eugenia Galeota, Francesca Sacco, et al. 2012. “MINT, the Molecular Interaction Database: 2012 Update.” Nucleic Acids Research 40 (Database issue): D857–61.
Liu, Baolin, and Bo Hu. 2010. “HPRD: A High Performance RDF Database.” International Journal of Parallel, Emergent and Distributed Systems 25 (2): 123–33.
Oughtred, Rose, Chris Stark, Bobby-Joe Breitkreutz, Jennifer Rust, Lorrie Boucher, Christie Chang, Nadine Kolas, et al. 2019. “The BioGRID Interaction Database: 2019 Update.” Nucleic Acids Research 47 (D1): D529–41.
Pfefferle, Susanne, Julia Schöpf, Manfred Kögl, Caroline C. Friedel, Marcel A. Müller, Javier Carbajo-Lozoya, Thorsten Stellberger, et al. 2011. “The SARS-Coronavirus-Host Interactome: Identification of Cyclophilins as Target for Pan-Coronavirus Inhibitors.” PLoS Pathogens 7 (10): e1002331.
Thorn, Caroline F., Teri E. Klein, and Russ B. Altman. 2013. “PharmGKB: The Pharmacogenomics Knowledge Base.” Methods in Molecular Biology 1015: 311–20.
Ursu, Oleg, Jayme Holmes, Cristian G. Bologa, Jeremy J. Yang, Stephen L. Mathias, Vasileios Stathias, Dac-Trung Nguyen, Stephan Schürer, and Tudor Oprea. 2019. “DrugCentral 2018: An Update.” Nucleic Acids Research 47 (D1): D963–70.
Wang, Yunxia, Song Zhang, Fengcheng Li, Ying Zhou, Ying Zhang, Zhengwen Wang, Runyuan Zhang, et al. 2020. “Therapeutic Target Database 2020: Enriched Resource for Facilitating Research and Early Development of Targeted Therapeutics.” Nucleic Acids Research 48 (D1): D1031–41.
Wishart, David S., Yannick D. Feunang, An C. Guo, Elvis J. Lo, Ana Marcu, Jason R. Grant, Tanvir Sajed, et al. 2018. “DrugBank 5.0: A Major Update to the DrugBank Database for 2018.” Nucleic Acids Research 46 (D1): D1074–82.
Xenarios, I., D. W. Rice, L. Salwinski, M. K. Baron, E. M. Marcotte, and D. Eisenberg. 2000. “DIP: The Database of Interacting Proteins.” Nucleic Acids Research 28 (1): 289–91.