Research Overview:

Cancer Drivers

Cancer genomics databases have accumulated millions of somatic mutations that remain to be explored. Because most of these mutations are unrelated to cancer, the great challenge is to identify the somatic mutations that drive it. Under the notion that carcinogenesis is a form of somatic-cell evolution, we developed the software CanDriS (Cancer Driver Sites), which is based on a two-component mixture model: the ground component corresponds to passenger mutations, while the rapidly evolving component corresponds to driver mutations.
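To make the two-component idea concrete, here is a minimal sketch of a mixture model fitted by expectation-maximization. It is not the CanDriS implementation: the Poisson likelihood, the per-site mutation counts, the starting values, and the names `fit_two_component_mixture` and `site_counts` are all assumptions made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k events at rate lam (lam > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_two_component_mixture(counts, n_iter=100):
    """Fit a two-component Poisson mixture to per-site counts with EM.

    Returns (weight_fast, lam_ground, lam_fast, posteriors), where
    posteriors[i] is the probability that site i belongs to the
    rapidly evolving component. Illustrative only (assumed model).
    """
    mean = sum(counts) / len(counts)
    # Assumed starting values: a slow ground rate and a fast rate.
    lam_ground = max(0.5 * mean, 1e-6)
    lam_fast = max(float(max(counts)), 2.0 * mean, 1e-3)
    weight_fast = 0.1
    posteriors = [0.0] * len(counts)

    for _ in range(n_iter):
        # E-step: posterior probability of the fast component per site.
        for i, k in enumerate(counts):
            p_fast = weight_fast * poisson_pmf(k, lam_fast)
            p_ground = (1.0 - weight_fast) * poisson_pmf(k, lam_ground)
            total = p_fast + p_ground
            posteriors[i] = p_fast / total if total > 0 else 0.5

        # M-step: re-estimate the mixing weight and both rates.
        resp_fast = sum(posteriors)
        resp_ground = len(counts) - resp_fast
        if resp_fast < 1e-12 or resp_ground < 1e-12:
            break  # one component has emptied out; stop updating
        weight_fast = resp_fast / len(counts)
        lam_fast = max(
            sum(r * k for r, k in zip(posteriors, counts)) / resp_fast, 1e-6
        )
        lam_ground = max(
            sum((1.0 - r) * k for r, k in zip(posteriors, counts)) / resp_ground,
            1e-6,
        )

    return weight_fast, lam_ground, lam_fast, posteriors


# Hypothetical data: most sites carry 0 or 1 mutation, two are recurrent.
site_counts = [0] * 90 + [1] * 8 + [12, 15]
weight_fast, lam_ground, lam_fast, posteriors = fit_two_component_mixture(site_counts)
```

In this toy setting, sites with a high posterior for the fast component would be the driver candidates, and the remaining sites would be treated as passengers.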


Neoantigens are recognized as ideal targets for cancer immunotherapies such as cancer vaccines and T-cell immunotherapies. Recent studies also indicate that neoantigens are closely related to the therapeutic effect of immune checkpoint blockade therapies. Because experimental verification requires a large workload, the practical solution is to take advantage of bioinformatics. We developed the neoantigen prediction method DeepHLApan, the one-stop neoantigen prediction tool TSNAD, and the neoantigen database TSNAdb to support better clinical use of neoantigens.


Since its outbreak in late December 2019, Coronavirus Disease 2019 (COVID-19) has brought significant harm and challenges to over 200 countries and regions around the world. Considering the seriousness of the recent outbreaks of zoonotic coronaviruses, therapeutic agents and vaccines against pan-coronaviruses should be developed to cope with present and future hCoV outbreaks. Here, we predict all the potential B/T cell epitopes for SARS-CoV, 2019-nCoV, and MERS-CoV, and develop the Coronavirus Immuno-Epitope database COVIEdb to provide potential targets for pan-coronavirus vaccine development. RaTG13-CoV is included because of its high homology with 2019-nCoV (96% whole genome identity).





We developed TSNAD, an integrated software package for the Linux operating system that calls a series of state-of-the-art tools. It is a completely automated and user-friendly framework, designed mainly for users who have little programming experience. It consists of two toolkits, mutation detection and antigen prediction, which detect somatic mutations and predict potential tumor-specific mutated antigens. Each toolkit is a two-step process: (1) configure the parameters, (2) run the corresponding toolkit.



We present an updated version of TSNAD that implements new features and improvements: (i) all embedded tools are updated to their latest versions, (ii) RNA-Seq data analysis is added, including gene expression and gene fusion analyses, (iii) both the GRCh38 and GRCh37 versions of the reference genome are supported when calling mutations, (iv) neoantigen prediction now covers INDELs and gene fusions in addition to SNVs, (v) NetMHCpan is replaced with our own tool DeepHLApan, and a web service of TSNAD is provided, (vi) a Docker-based installation is provided that includes all the required tools and reference files. TSNAD v2.0 achieves high performance on the standard dataset provided by the Tumor Neoantigen Selection Alliance (TESLA). TSNAD v2.0 is implemented in Perl and Python. With Docker installed, TSNAD v2.0 can be used on any operating system, including Linux and Windows.
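As an illustration of what SNV-derived neoantigen prediction starts from, the sketch below enumerates the candidate peptides that span a single amino-acid substitution. It does not reproduce TSNAD's code: the function name `mutant_peptides`, the 0-based position, the example sequence, and the 8 to 11 residue length range (a common choice for class I peptides) are assumptions.

```python
def mutant_peptides(protein, pos, alt, lengths=(8, 9, 10, 11)):
    """Return every peptide of the given lengths that contains the
    substituted residue.

    protein: wild-type amino-acid sequence
    pos:     0-based index of the substituted residue (assumed convention)
    alt:     the new amino acid at that position
    """
    if not 0 <= pos < len(protein):
        raise ValueError("mutation position is outside the protein")

    mutated = protein[:pos] + alt + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # Windows must start early enough to cover pos and end inside
        # the protein.
        first_start = max(0, pos - k + 1)
        last_start = min(pos, len(mutated) - k)
        for start in range(first_start, last_start + 1):
            peptides.append(mutated[start:start + k])
    return peptides


# Hypothetical example: substitute residue 10 (M) with A in a toy sequence.
candidates = mutant_peptides("ACDEFGHIKLMNPQRSTVWY", pos=10, alt="A")
```

Each candidate would then be scored against the patient's HLA alleles by a binding predictor; that scoring step is outside this sketch.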



We developed a novel software called DeepHLApan that predicts peptide binding affinity accurately. It is based on recurrent neural networks (RNNs) and was trained on a large dataset of 335,102 peptide-HLA pairs, which allows it to predict peptide binding affinity with HLA alleles pan-specifically.
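A recurrent network consumes a sequence of fixed-size vectors, so each peptide first has to be turned into such a sequence. The sketch below shows one common way to do that, one-hot encoding with padding. The actual input encoding used by DeepHLApan is not described here, so the alphabet order, the padding symbol, the maximum length, and the name `one_hot_encode` are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PAD_INDEX = len(AMINO_ACIDS)          # extra channel marking padding


def one_hot_encode(seq, max_len):
    """Encode seq as max_len rows of 21 values, one row per position.

    Positions beyond the end of seq are marked with the padding channel,
    so peptides of different lengths share one input shape.
    """
    if len(seq) > max_len:
        raise ValueError("sequence is longer than max_len")

    rows = []
    for i in range(max_len):
        row = [0] * (len(AMINO_ACIDS) + 1)
        if i < len(seq):
            row[AA_INDEX[seq[i]]] = 1  # KeyError on non-standard residues
        else:
            row[PAD_INDEX] = 1
        rows.append(row)
    return rows


# Hypothetical example: an 8-residue peptide padded to 11 positions.
encoded = one_hot_encode("SIINFEKL", max_len=11)
```

A pan-specific model would also need a representation of the HLA allele, for example its sequence encoded the same way, paired with the peptide as input.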




Tumor-specific neoantigens have attracted increasing attention for their importance to cancer diagnosis, prognosis and targeted therapy, as they are crucial tumor biomarkers for identifying tumor cells and are potential targets for cancer immunotherapy. We analysed whole genome/exome sequencing data of 9,155 patients in the International Cancer Genome Consortium (ICGC) database and predicted tumor-specific neoantigens, including extracellular mutations of membrane proteins and neoantigens presented by class I Major Histocompatibility Complex (MHC) molecules. We mapped all the missense mutations to the extracellular regions of membrane proteins and obtained a dataset containing 88,354 extracellular mutations. We used NetMHCpan (v2.8) to predict the affinity between Human Leukocyte Antigen (HLA) molecules and the collected peptides, and obtained a large number of records of binding and specific binding information.
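The mapping step can be pictured as an interval lookup: a missense mutation is kept if its residue position falls inside an annotated extracellular region of the protein. The sketch below assumes 1-based inclusive coordinates and uses made-up protein identifiers and regions. The data structures and the name `extracellular_mutations` are hypothetical and do not come from the database pipeline.

```python
def extracellular_mutations(mutations, regions):
    """Keep the mutations that fall inside an extracellular region.

    mutations: iterable of (protein_id, residue_position) pairs
    regions:   dict mapping protein_id to a list of (start, end) pairs,
               1-based and inclusive (assumed convention)
    """
    hits = []
    for protein_id, position in mutations:
        for start, end in regions.get(protein_id, []):
            if start <= position <= end:
                hits.append((protein_id, position))
                break  # one matching region is enough
    return hits


# Hypothetical annotation: protein "P1" has two extracellular segments.
example_regions = {"P1": [(1, 50), (120, 180)]}
example_mutations = [("P1", 10), ("P1", 100), ("P1", 120), ("P2", 5)]
example_hits = extracellular_mutations(example_mutations, example_regions)
```

Proteins without any annotated region, such as "P2" in the example, contribute no hits.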



CandrisDB is a platform that comprehensively profiles cancer-driving sites at the pan-cancer and tumor-type levels for the somatic mutations collected from The Cancer Genome Atlas (TCGA PanCanAtlas project) and the International Cancer Genome Consortium (ICGC Release 25), using the in-house method CanDriS. CandrisDB also combines lists of known driver genes with the predictions of published bioinformatics algorithms (the in-house CNCS calculator and ten other algorithms) to compile a list of candidate driver genes. We also collected functional and pharmacogenomics annotations for the cancer-driving sites from other public databases, to provide guidance on clinical medication in the upcoming era of precision medicine.



We predict all the potential B/T cell epitopes for 2019-nCoV, RaTG13-CoV, SARS-CoV and MERS-CoV based on the proteins they express, to provide potential vaccine targets that could be effective against different coronaviruses. RaTG13-CoV is added because of its high homology with 2019-nCoV (96% whole genome identity).

1. Wu J, Chen W, Zhou J, Zhao W, Sun Y, Zhu H, Yao P, Chen S, Jiang J*, Zhou Z*, COVIEdb: A database for potential immune epitopes of coronaviruses, Front Pharmacol, 2020, 11:572249.
Jingcheng Wu:

Copyright © 2021. Pharmacogenomics group. All rights reserved.