pangolin lineage covid

We compare these divergence time estimates to those obtained using the MERS-CoV-centred rate priors for NRR1, NRR2 and NRA3. Because coronaviruses are known to be highly recombinant, we used three different approaches to identify non-recombinant regions for use in our Bayesian time-calibrated phylogenetic inference. The red and blue boxplots represent the divergence time estimates for SARS-CoV-2 (red) and the 2002-2003 SARS-CoV (blue) from their most closely related bat virus, with the light- and dark-coloured versions based on the HCoV-OC43- and MERS-CoV-centred priors, respectively. The canine viral genome was excluded from the Bayesian phylogenetic analyses because temporal signal analyses (see below) indicated that it was an outlier. In outbreaks of zoonotic pathogens, identification of the infection source is crucial because this may allow health authorities to separate human populations from the wildlife or domestic animal reservoirs posing the zoonotic risk (refs. 9, 10). BEAST inferences made use of the BEAGLE v.3 library (ref. 68) for efficient likelihood computations. Phylogenetic Assignment of Named Global Outbreak Lineages (Pangolin) was developed to implement the dynamic nomenclature of SARS-CoV-2 lineages, known as the Pango nomenclature. The presence in pangolins of an RBD very similar to that of SARS-CoV-2 means that we can infer this was also probably in the virus that jumped to humans. July 26, 2021.
In early January, the aetiological agent of the pneumonia cases was found to be a coronavirus (ref. 3), subsequently named SARS-CoV-2 by an International Committee on Taxonomy of Viruses (ICTV) Study Group (ref. 4) and also named hCoV-19 by Wu et al. (ref. 5). Here, we analyse the evolutionary history of SARS-CoV-2 using available genomic data on sarbecoviruses. Means and 95% HPD intervals are 0.080 [0.058–0.101] and 0.530 [0.304–0.780] for the patristic distances between SARS-CoV-2 and RaTG13 (green) and 0.143 [0.109–0.180] and 0.154 [0.093–0.231] for the patristic distances between SARS-CoV-2 and Pangolin 2019 (orange). To begin characterizing any ancestral relationships for SARS-CoV-2, NRRs of the genome must be identified so that reliable phylogenetic reconstruction and dating can be performed. Pangolin is available as a command-line tool and a web application. The relatively fast evolutionary rate means that it is most appropriate to estimate shallow nodes in the sarbecovirus evolutionary history. The coronavirus genome that these researchers had assembled, from pangolin lung-tissue samples, contained some gene regions that were ninety-nine per cent similar to equivalent parts of the SARS ... A bioinformatics program called Pangolin was developed. [...] = 0.00075 and one with a mean of 0.00024 and s.d. [...] When the first genome sequence of SARS-CoV-2, Wuhan-Hu-1, was released on 10 January 2020 (GMT) on Virological.org by a consortium led by Zhang (ref. 6), it enabled immediate analyses of its ancestry. [...] EPI_ISL_410721) and Beijing Institute of Microbiology and Epidemiology (W.-C. Cao, T.T.-Y.L., N. Jia, Y.-W. Zhang, J.-F. Jiang and B.-G. Jiang, nos. ...
This statement informs us of the possibility that a virus has spilled over from a very rare and shy reptile-looking mammal. Using a third consensus-based approach for identifying recombinant regions in individual sequences, with six different recombination detection methods in RDP5 (ref. 5 ...) [...] wrote the first draft of the manuscript, and all authors contributed to manuscript editing. Because 3SEQ identified ten BFRs >500 nt, we used GARD's (v.2.5.0) inference on 10, 11 and 12 breakpoints. To estimate non-synonymous over synonymous rate ratios for the concatenated coding genes, we used the empirical Bayes Renaissance counting procedure (ref. 67). We thank A. Chan and A. Irving for helpful comments on the manuscript. Grey tips correspond to bat viruses, green to pangolin, blue to SARS-CoV and red to SARS-CoV-2. We compare both MERS-CoV- and HCoV-OC43-centred prior distributions (Extended Data Fig. ...). Concurrent evidence also proposed pangolins as a potential intermediate species for SARS-CoV-2 emergence and suggested them as a potential reservoir species (refs. 11–13). Unlike other viruses that have emerged in the past two decades, coronaviruses are highly recombinogenic (refs. 14–16). Despite the SARS-CoV-2 lineage's acquisition of residues in its Spike (S) protein's receptor-binding domain (RBD) permitting the use of human ACE2 (ref. ...
Complete genome sequence data were downloaded from GenBank and ViPR; accession numbers of all 68 sequences are available in Supplementary Table 4. First, we took an approach that relies on identification of mosaic regions (via 3SEQ v.1.7; ref. 14) that are also supported by PI signals (ref. 19). [...] the development of viral diversity. There is a 90% genetic match between SARS-CoV-2 and a coronavirus in pangolins. The rate of genome generation is unprecedented, yet there is currently no coherent or accepted scheme for naming the expanding ... In the presence of time-dependent rate variation, a widely observed phenomenon for viruses (refs. 43, 44, 52), slower prior rates appear more appropriate for sarbecoviruses that currently encompass a sampling time range of about 18 years. [...] with an alignment on which an initial recombination analysis was done. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent for the current coronavirus disease (COVID-19) pandemic that has affected more than 35 million people and caused ... Since experts have suggested that pangolins may be the reservoir species for COVID-19, the scaly anteater has been catapulted into headlines, news reports and conversations, and some are calling COVID-19 "the revenge of the ... Below, we report divergence time estimates based on the HCoV-OC43-centred rate prior for NRR1, NRR2 and NRA3 and summarize corresponding estimates for the MERS-CoV-centred rate priors in Extended Data Fig. ... In such cases, even moderate rate variation among long, deep phylogenetic branches will substantially impact expected root-to-tip divergences over a sampling time range that represents only a small fraction of the evolutionary history (ref. 40). Uncertainty measures are shown in Extended Data Fig. ...
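The root-to-tip reasoning in this paragraph can be illustrated with a simple least-squares regression of root-to-tip divergence on sampling date. This is only a sketch and not the Bayesian procedure used in the analysis. The function name and the toy data are hypothetical: the divergences are constructed to accumulate at exactly 0.00024 substitutions per site per year from a root placed in 1990.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least squares of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared). The slope is a crude rate
    estimate in substitutions per site per year.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical, noise-free toy data (not taken from the paper).
TOY_RATE = 0.00024
TOY_ROOT_YEAR = 1990.0
dates = [2003.0, 2008.0, 2013.0, 2019.0]
distances = [TOY_RATE * (t - TOY_ROOT_YEAR) for t in dates]

slope, intercept, r_squared = root_to_tip_regression(dates, distances)
root_date = -intercept / slope  # date at which the fitted divergence is zero

print(f"rate estimate: {slope:.5f} substitutions per site per year")
print(f"implied root date: {root_date:.1f}")
print(f"R^2: {r_squared:.4f}")
```

With real sarbecovirus data the fit would be far noisier. When the sampling window covers only a small fraction of the tree's history, rate variation along deep branches can swamp the signal, which is the concern raised in the paragraph.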
Our most conservative approach attempted to ensure that putative NRRs had no mosaic or phylogenetic incongruence signals. We thank all authors who have kindly deposited and shared genome data on GISAID. Green boxplots show the TMRCA estimate for the RaTG13/SARS-CoV-2 lineage and its most closely related pangolin lineage (Guangdong 2019). N. China corresponds to Jilin, Shanxi, Hebei and Henan provinces, and the N. China clade also includes one sequence sampled in Hubei Province in 2004. Regions A–C had similar phylogenetic relationships among the southern China bat viruses (Yunnan, Guangxi and Guizhou provinces), the Hong Kong viruses, northern Chinese viruses (Jilin, Shanxi, Hebei and Henan provinces, including Shaanxi), pangolin viruses and the SARS-CoV-2 lineage. Developed by the Centre for Genomic Pathogen Surveillance. Specifically, using a formal Bayesian approach (ref. 42; see Methods), we estimate a fast evolutionary rate (0.00169 substitutions per site per year, 95% highest posterior density (HPD) interval (0.00131, 0.00205)) for SARS viruses sampled over a limited timescale (1 year), a slower rate (0.00078 (0.00063, 0.00092) substitutions per site per year) for MERS-CoV on a timescale of about 4 years and the slowest rate (0.00024 (0.00019, 0.00029) substitutions per site per year) for HCoV-OC43 over almost five decades. Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig. ...
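The rate posteriors quoted above can be related to the 10-fold inflated prior standard deviations mentioned elsewhere in the text. The sketch below assumes that each 95% HPD interval is roughly a symmetric normal interval (mean ± 1.96 s.d.). That is an approximation introduced here for illustration and not a statement of how the priors were actually derived. The function names are hypothetical.

```python
def sd_from_interval(lower, upper, z=1.96):
    """Approximate s.d. from a 95% interval, assuming it is normal and symmetric."""
    return (upper - lower) / (2.0 * z)


def inflated_prior(mean, lower, upper, factor=10.0):
    """Return (mean, s.d.) of a normal prior whose s.d. is inflated by `factor`."""
    return mean, factor * sd_from_interval(lower, upper)


# Posterior mean and 95% HPD bounds as quoted in the text
# (substitutions per site per year).
posteriors = {
    "MERS-CoV": (0.00078, 0.00063, 0.00092),
    "HCoV-OC43": (0.00024, 0.00019, 0.00029),
}

priors = {name: inflated_prior(*values) for name, values in posteriors.items()}

for name, (mean, sd) in priors.items():
    print(f"{name}: normal prior with mean {mean:.5f} and s.d. about {sd:.5f}")
```

Under this approximation the inflated standard deviations come out near 0.00074 and 0.00026, close to the s.d. values of 0.00075 and 0.00025 that appear in the text.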
We named the length-sorted BFRs as: BFR A (nt positions 13,291–19,628, length = 6,338 nt), BFR B (nt positions 3,625–9,150, length = 5,526 nt), BFR C (nt positions 9,261–11,795, length = 2,535 nt), BFR D (nt positions 27,702–28,843, length = 1,142 nt) and six further regions (E–J). The assumption of long-term purifying selection would imply that coronaviruses are in endemic equilibrium with their natural host species, horseshoe bats, to which they are presumably well adapted. We aimed to analyze 3 naso-oropharyngeal swab samples collected between August and December 2021 to describe the amino acid changes present in the sequence reads that may have a role in the emergence of new ... DOI: https://doi.org/10.1038/s41564-020-0771-4. Using the most conservative approach to identification of a non-recombinant genomic region (NRR1), SARS-CoV-2 forms a sister lineage with RaTG13, with genetically related cousin lineages of coronavirus sampled in pangolins in Guangdong and Guangxi provinces (Fig. ... As illustrated by the dashed arrows, these two posteriors motivate our specification of prior distributions with standard deviations inflated 10-fold (light colour). Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. The difficulty in inferring reliable evolutionary histories for coronaviruses is that their high recombination rate (refs. 48, 49) violates the assumption of standard phylogenetic approaches because different parts of the genome have different histories. Lack of root-to-tip temporal signal in SARS-CoV-2.
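As a quick consistency check on the breakpoint-free regions listed at the start of this paragraph, the sketch below recomputes each length from its inclusive nucleotide coordinates and sorts the regions by length. The dictionary layout and function name are illustrative choices. Only the four regions whose coordinates are given in the text are included.

```python
# Inclusive (start, end) nucleotide positions as quoted in the text.
BFR_COORDS = {
    "A": (13291, 19628),
    "B": (3625, 9150),
    "C": (9261, 11795),
    "D": (27702, 28843),
}


def inclusive_length(start, end):
    """Number of nucleotides in a region whose start and end are both included."""
    return end - start + 1


lengths = {name: inclusive_length(*coords) for name, coords in BFR_COORDS.items()}

# Sort by decreasing length, which is the order used to name the regions.
by_length = sorted(lengths, key=lengths.get, reverse=True)

for name in by_length:
    start, end = BFR_COORDS[name]
    print(f"BFR {name}: nt {start}-{end}, length = {lengths[name]} nt")
```

The recomputed lengths (6,338, 5,526, 2,535 and 1,142 nt) match the values quoted in the text, and the length-sorted order is A, B, C, D.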
The 2009 influenza pandemic and subsequent outbreaks of MERS-CoV (2012), H7N9 avian influenza (2013), Ebola virus (2014) and Zika virus (2015) were met with rapid sequencing and genomic characterization. Specifically, progenitors of the RaTG13/SARS-CoV-2 lineage appear to have recombined with the Hong Kong clade (with inferred breakpoints at 11.9 and 20.8 kb) to form the CoVZXC21/CoVZC45 lineage. These shy, quirky but cute mammals are one of the most heavily trafficked yet least understood animals in the world. The Pangolin (Phylogenetic Assignment of Named Global Outbreak LINeages) web app is maintained by the Centre for Genomic Pathogen Surveillance. [...] clusters with viruses from provinces in the centre, east and northeast of China. Open reading frames are shown above the breakpoint plot, with the variable-loop region indicated in the S protein. By 2009, however, rapid genomic analysis had become a routine component of outbreak response. The time-calibrated phylogeny represents a maximum clade credibility tree inferred for NRR1. However, on closer inspection, the relative divergences in the phylogenetic tree (Fig. ... Pangolin assigns the most likely lineage (Pango lineage) to SARS-CoV-2 query sequences.
The SARS-CoV divergence times are somewhat earlier than dates previously estimated (ref. 15) because previous estimates were obtained using a collection of SARS-CoV genomes from human and civet hosts (as well as a few closely related bat genomes), which implies that evolutionary rates were predominantly informed by the short-term SARS outbreak scale and probably biased upwards. Genomic surveillance has been a hallmark of the COVID-19 pandemic that, in contrast to other pandemics, achieves tracking of the virus evolution and spread worldwide almost in real time (4). Concatenated region ABC is NRR1. Of the nine breakpoints defining these ten BFRs, four showed phylogenetic incongruence (PI) signals with bootstrap support >80%, adopting previously published criteria on using a combination of mosaic and PI signals to show evidence of past recombination events (ref. 19). In our analyses of the sarbecovirus datasets, we incorporated the uncertainty of the sampling dates when exact dates were not available. c, Maximum likelihood phylogenetic trees rooted on a 2007 virus sampled in Kenya (BtKy72; root truncated from images), shown for five BFRs of the sarbecovirus alignment. Accurate estimation of ages for deeper nodes would require adequate accommodation of time-dependent rate variation. [...] performed recombination analysis for non-recombining regions 1 and 2, breakpoint analysis and phylogenetic inference on recombinant segments. 95% credible interval bars are shown for all internal node ages. [...] = 0.00025. We thank originating laboratories at South China Agricultural University (Y. Shen, L. Xiao and W. Chen; no. ...
Regions A–C were further examined for mosaic signals by 3SEQ, and all showed signs of mosaicism. TMRCAs for SARS-CoV and SARS-CoV-2. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic, https://doi.org/10.1038/s41564-020-0771-4. In region A, we removed subregion A1 (nt positions 3,872–4,716 within region A) and subregion A4 (nt 1,642–2,113) because both showed PI signals with other subregions of region A. Subsequently, a bat sarbecovirus (RaTG13, sampled from a Rhinolophus affinis horseshoe bat in 2013 in Yunnan Province) was reported that clusters with SARS-CoV-2 in almost all genomic regions with approximately 96% genome sequence identity (ref. 2). These datasets were subjected to the same recombination masking approach as NRA3 and were characterized by a strong temporal signal (Fig. ... The ongoing pandemic spread of a new human coronavirus, SARS-CoV-2, which is associated with severe pneumonia/disease (COVID-19), has resulted in the generation of tens of thousands of virus ...