Preprint / Version 1

Protein Domain Annotations of the SARS-CoV-2 Proteomics as a Blue-Print for Mapping the Features for Drug and Vaccine Designs


  • Arli Aditya Parikesit, Indonesia International Institute for Life Sciences


Protein domain, CDD, ATP, ADP, features


The SARS-CoV-2 virus, the causal agent of the COVID-19 pandemic, remains an enigma from a bioinformatics perspective. Current efforts in drug and vaccine design primarily target generally defined protein domains while overlooking the specific features in the proteomic repertoire. However, the NCBI Conserved Domain Database (CDD) can annotate the specific features that are indispensable for more advanced drug and vaccine design. Accordingly, the annotation was carried out with the CDD and visualized with the Cn3D 3D protein viewer. ATP- and ADP-binding proteins, with their respective domains, were found to be promising targets for drug design. A nucleoside inhibitor that mimics the ATP molecule is therefore suggested as a potential drug lead against SARS-CoV-2.

