[Virology foreign-language literature] 2006 Adaptive evolution of the spike gene of SARS coronavirus: changes in positively selected sites in different epidemi
BioMed Central
BMC Microbiology
Open Access
Research article

Adaptive evolution of the spike gene of SARS coronavirus: changes in positively selected sites in different epidemic groups

Chi-Yu Zhang 1, Ji-Fu Wei 2 and Shao-Heng He 2

Address: 1 Department of Biochemistry and Molecular Biology, Jiangsu University School of Medical Technology, Zhenjiang, Jiangsu 212001, China and 2 The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210026, China

Email: Chi-Yu Zhang - zhangcy1999; Ji-Fu Wei - weijifu; Shao-Heng He - shoahenghe
Corresponding author; Equal contributors

Abstract

Background: It is believed that animal-to-human transmission of severe acute respiratory syndrome (SARS) coronavirus (CoV) is the cause of the SARS outbreak worldwide. The spike (S) protein is one of the best characterized proteins of SARS-CoV, which plays a key role in SARS-CoV overcoming species barrier and accomplishing interspecies transmission from animals to humans, suggesting that it may be the major target of selective pressure. However, the process of adaptive evolution of S protein and the exact positively selected sites associated with this process remain unknown.

Results: By investigating the adaptive evolution of S protein, we identified twelve amino acid sites (75, 239, 244, 311, 479, 609, 613, 743, 765, 778, 1148 and 1163) in the S protein under positive selective pressure. Based on phylogenetic tree and epidemiological investigation, SARS outbreak was divided into three epidemic groups: 02-04 interspecies, 03-early-mid and 03-late epidemic groups in the present study. Positive selection was detected in the first two groups, which represent the course of SARS-CoV interspecies transmission and of viral adaptation to human host, respectively. In contrast, purifying selection was detected in 03-late group. These indicate that S protein experiences variable positive selective pressures before reaching stabilization. A total of 25 sites in 02-04 interspecies epidemic group and 16 sites in 03-early-mid epidemic group
were identified under positive selection. The identified sites were different between these two groups except for site 239, which suggests that positively selected sites are changeable between groups. Moreover, it was shown that a larger proportion (24%) of positively selected sites was located in receptor binding domain (RBD) than in heptad repeat (HR)1-HR2 region in 02-04 interspecies epidemic group (p = 0.0208), and a greater percentage (25%) of these sites occurred in HR1-HR2 region than in RBD in 03-early-mid epidemic group (p = 0.0721). These suggest that functionally different domains of S protein may not experience the same positive selection in each epidemic group. In addition, three specific replacements (F360S, T487S and L665S) were only found between 03-human SARS-CoVs and strains from 02-04 interspecies epidemic group, which reveals that selective sweep may also force the evolution of S genes before the jump of SARS-CoVs into human hosts. Since certain residues at these positively selected sites are associated with receptor recognition and/or membrane fusion, they are likely to be the crucial residues for animal-to-human transmission of SARS-CoVs and subsequent adaptation to human hosts.

Conclusion: The variation of positive selective pressures and positively selected sites are likely to contribute to the adaptive evolution of S protein from animals to humans.

Published: 04 October 2006
BMC Microbiology 2006, 6:88 doi:10.1186/1471-2180-6-88
Received: 16 April 2006
Accepted: 04 October 2006
This article is available from:
© 2006 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

SARS is a new infectious disease that emerged in the Guangdong province of China in
November 2002. It caused 8,096 infection cases including 774 deaths worldwide during its epidemic [1]. The causative pathogen of SARS was identified as a novel strain of human coronavirus, named as SARS-CoV, and its complete genome was sequenced in March 2003 [2-5]. In May 2003, SARS-CoVs were also isolated from a few Himalayan palm civets (Paguma larvata) and a raccoon dog (Nyctereutes procyonoides) in a food market in Shenzhen, Guangdong, China [6]. These isolations provided the first evidence that wild animals could be reservoirs for SARS-CoV and that the virus might be transmitted from animals to humans. The re-emergence of SARS in 2003-2004 in Guangdong, China confirmed that SARS-CoV was independently transmitted from animals to humans [7].

The S protein of SARS-CoV is composed of 1,255 amino acids and is responsible for viral attachment and entry into host cells [4,5]. It is also a major antigenic determinant that induces generation of neutralizing antibodies and protective immunity, at least in human host [8]. Unlike some coronaviruses in which S protein can be cleaved into two functional subunits, S1 and S2, the S protein of SARS-CoV is not cleavable due to the absence of the proteolytic cleavage site. However, two domains, S1 (residues 14-680) and S2 (residues 681-1,255), were identified in SARS-CoV S protein in the light of their homology with the S1 and S2 subunits [9]. Domain S1 is responsible for binding to angiotensin-converting enzyme 2 (ACE2), which serves as the functional receptor of SARS-CoV [10,11]. Domain S2 mediates viral entry into host cells [12,13]. Previous works indicated that interspecies transmission may be due to the acquisition of mutations in S protein which allows human infection, suggesting that S protein ought to be a major target of selective pressure [6,7,14].

A criterion for the determination of selective pressure is to compare nonsynonymous (amino acid changing; dN) with synonymous (silent; dS) substitution rates in protein-coding genes. The nonsynonymous/synonymous rate ratio (ω = dN/dS) provides a straightforward measurement of selective pressure at the protein level. The values of ω < 1, = 1 and [...]

[...] 0.95, four, twelve and eight sites of S protein were identified to be under positive selection (ω > 1) by selection models M2a, M3 and M8, respectively (Table 1). Twelve positively selected sites detected by M3 were also identified by M2a and M8 at the level of posterior probability > 0.9 (Table 1). The number of positively selected sites discovered in the present study was similar to the number of the sites identified in previous reports [16,17].

Detection of recombination and positive selection on S genes of SARS-CoV in different epidemic groups

Recombination can influence the detection of positive selection [18,19], and previous studies had proposed that recombination occurs in the origin of SARS-CoV [20]. In 02-04 interspecies epidemic group, human sequence GZ03-02 split 03-pcSARS-CoVs from 04-pcSARS-CoVs, whereas other 04-huSARS-CoVs clustered with 04-pcSARS-CoVs (Fig. 1), suggesting that GZ03-02 may be a recombinant between 03-pcSARS-CoVs and 04-pcSARS-CoVs. However, the bootscan analysis of GZ03-02 using SimPlot software showed that the majority of GZ03-02 S gene had the percent of permuted trees less than 40% (Fig. S1; see additional file 1), indicating that they possess similar identity to other sequences and suggesting that no recombination occurred in this strain [21]. In addition, viral recombination requires the co-infection of different virus strains [22], and there was little chance for GZ03-02 patients to be co-infected with 03-pcSARS-CoV and 04-pcSARS-CoV during 2004 epidemic [7], further supporting the view that no recombination occurred in S gene of GZ03-02 [21].

Three selection models, M2a, M3 and M8, showed that positive selection occurred in both 02-04 interspecies and 03-early-mid epidemic groups (Table 2). For instance, M8 showed that 0.6% of the sites in 02-04 interspecies epidemic group were under positive selection with ω values between 66.0-67.2, and 2.7% of the sites in 03-early-mid epidemic
group were under positive selection with ω = 40.9. LRT statistic revealed that three selection models fitted the data better than three null models in both groups of 02-04 interspecies epidemic and 03-early-mid epidemic, which supports further the presence of amino acid sites under positive selection in S protein (Table 2). In contrast, we were unable to identify any site under positive selection with any of the six models in the 03-late epidemic group. Instead, the results for this group were consistent with purifying selection with ω values of 0.25-0.26 (Table 2).

Figure 1. Phylogenetic tree of 47 S gene sequences from human patients and animals. The evolutionary process of S proteins during whole epidemic was simplified into three epidemic groups: 02-04 interspecies, 03-early-mid and 03-late epidemic groups. Each group includes 15 unique S gene sequences after deleting all duplicate sequences (Table S1; see additional file 2). Two sequences isolated from bats were used as the outgroup in phylogenetic tree construction. The tree was constructed by the maximum likelihood method with 1000 bootstrap replicates using PHYML v2.4.4. Only the branch bootstrap values > 50% are shown. PC: palm civet; HP: human patient.

Table 2. Phylogenetic analysis by ML estimation for SARS S gene sequences from different epidemic groups

| Epidemic phase | Model code | lnL | dN/dS | Estimates of parameters | 2Δl | Positive selection |
|---|---|---|---|---|---|---|
| 02-04 interspecies epidemic group | M0 (one-ratio) | -5339.90 | 0.64 | ω = 0.64 | 40.23 | Yes |
| | M3 (discrete) | -5319.78 | 0.77 | p0 = 0.00000, p1 = 0.99407, p2 = 0.00593, ω0 = 0.00, ω1 = 0.38, ω2 = 66.02 | (13.28) | |
| | M1a (NearlyNeutral) | -5334.74 | 0.35 | p0 = 0.65391, p1 = 0.34609 | 29.92 | Yes |
| | M2a (PositiveSelection) | -5319.78 | 0.77 | p0 = 0.99407, p1 = 0.00000, p2 = 0.00593, ω2 = 66.01 | (9.21) | |
| | M7 (beta) | -5336.08 | 0.30 | p = 0.01702, q = 0.03977 | 32.23 | Yes |
| | M8 (beta & ω) | -5319.97 | 0.80 | p0 = 0.99405, p1 = 0.00595, p = 0.01217, q = 0.01714, ω = 67.24 | (9.21) | |
| 03-early-mid epidemic group | M0 (one-ratio) | -5194.47 | 0.94 | ω = 0.94 | 24.25 | Yes |
| | M3 (discrete) | -5182.34 | 1.10 | p0 = 0.05053, p1 = 0.92251, p2 = 0.02696, ω0 = ω1 = 0.00, ω2 = 40.88 | (13.28) | |
| | M1a (NearlyNeutral) | -5192.69 | 0.45 | p0 = 0.54981, p1 = 0.45019 | 20.70 | Yes |
| | M2a (PositiveSelection) | -5182.34 | 1.10 | p0 = 0.97303, p1 = 0.00000, p2 = 0.02696, ω2 = 40.88 | (9.21) | |
| | M7 (beta) | -5192.73 | 0.40 | p = 0.00510, q = 0.00776 | 20.78 | Yes |
| | M8 (beta & ω) | -5182.34 | 1.10 | p0 = 0.97303, p1 = 0.02697, p = 0.00500, q = 1.39962, ω = 40.87 | (9.21) | |
| 03-late epidemic group | M0 (one-ratio) | -5121.66 | 0.26 | ω = 0.26 | NA | No |
| | M3 (discrete) | -5121.66 | 0.26 | p0 = 0.43056, p1 = 0.37490, p2 = 0.19455, ω0 = 0.25, ω1 = 0.26, ω2 = 0.26 | (13.28) | |
| | M1a (NearlyNeutral) | -5121.66 | 0.26 | p0 = 1.00000, p1 = 0.00000 | NA | No |
| | M2a (PositiveSelection) | -5121.66 | 0.26 | p0 = 1.00000, p1 = 0.00000, p2 = 0.00000, ω0 = 0.26, ω1 = ω2 = 1.00 | (9.21) | |
| | M7 (beta) | -5121.66 | 0.25 | p = 33.88884, q = 99.00000 | NA | No |
| | M8 (beta & ω) | -5121.68 | 0.26 | p0 = 1.00000, p1 = 0.00000, p = 0.64576, q = 1.83663, ω = 2.08 | (9.21) | |

The values in parentheses represent the significant level of 0.01 with a χ2 distribution at d.f. = 4 (M0 vs. M3) or 2 (M1a vs. M2a and M7 vs. M8). NA: not applicable.

Comparison of positively selected sites on S genes in different epidemic groups

The positively selected sites in both groups of 02-04 interspecies epidemic and 03-early-mid epidemic were identified using Codeml program. Although three selection models, M2a, M3 and M8, detected the same positively selected sites on S genes, only the results from M8 are shown in Table 3. Four positively selected sites (479, 609, 743 and 765) in 02-04 interspecies epidemic group and four sites (75, 239, 778 and 1163) in 03-early-mid epidemic group were identified at the level of posterior probability > 0.95, respectively. In addition, 25 and 16 sites were detected under positive selection (ω > 1) in 02-04 interspecies epidemic and 03-early-mid epidemic groups at the level of posterior probability > 0.50, respectively. By REL method, completely identical 25 sites in 02-04 interspecies epidemic group and 16 sites in 03-early-mid epidemic group were identified under positive selection at significant level of Bayes factor > 50 (Table 3). When FEL and SLAC methods were used, these sites were also identified under positive selection despite not reaching the significant level of p [...] 0.50 implemented in Codeml program and three methods implemented in DataMonkey package (Table 3), indicating that this group was experiencing purifying selection.

In order to investigate the association of positively selected sites with the function of S protein, we compared their location between groups of 02-04 interspecies epidemic and 03-early-mid epidemic. The results show that apart from the site 239, the two groups had completely different sites (Table 3), suggesting for the first time that positively selected sites are variable in different epidemic groups. It was found that 72% (18 out of 25) positively selected sites in 02-04 interspecies epidemic group were located in S1 domain, which is greater than 50% (8 out of 16) of that located in S1 domain in 03-early-mid epidemic group (p = 0.0768) (Table 3). Moreover, 24% of positively selected sites in 02-04 interspecies epidemic group were concentrated in the region of receptor binding domain (RBD), only 4% in heptad repeat (HR)1-HR2 region (p = 0.0208), but 0% in HR2 region (p = 0.0045). Contrarily, 25% of positively selected sites in 03-early-mid epidemic group were concentrated in HR1-HR2 region (p = 0.0721), 18.8% in HR2 region (p = 0.1425), but only 6.3% in RBD region (Table 3 and 4). These results suggest that positive selection tends to selectively influence certain functions of S protein but not others in each epidemic group.

Lineage fixation of positively selected sites on S genes for the adaptation of SARS-CoV to human host

Four positively selected sites (479, 609, 743 and 765) identified in 02-04 interspecies epidemic group were fixed in 03-early-mid epidemic group (Fig. 2). The 04-pcSARS-CoVs diverged from 03-pcSARS-CoVs after the split between 03-pcSARS-CoVs and 03-huSARS-CoV (Fig. 1) [7]. The comparison of amino acid sequences between 03-pcSARS-CoV and 03-huSARS-CoV suggested that variants N479K and T743A play a dominant role in transition of viral host tropism from animals to humans (Fig. 2). The comparison between 03-huSARS-CoV and 03-pcSARS-CoV sequences discovered two additional variants, L609A and V765A, which may favor viral adaptation to palm civet. Four sites (75T, 239S, 778Y and 1163K) identified under positive selection in the 03-early-mid epidemic group were fixed in the 03-late epidemic group (Fig. 2). The fixation of these amino acids suggests that they are likely to contribute to the adaptation process of S protein to human receptors.

Discussion

The S protein of SARS-CoV is responsible for the receptor binding and membrane fusion [10]. It is also a major antigen to stimulate humoral immunity of its host [8]. The amino acid variation of S protein affects virus entry, tissue tropism and host range of SARS-CoV [11,23]. Here we confirmed that the S gene undergoes strong positive selection [7,14,16] and identified twelve positively selected amino acid sites, including 75, 239, 244, 311, 479, 609, 613, 743, 765, 778, 1148 and 1163, during the whole SARS outbreak (Table 1). Among these sites, positions 239, 311, 479, 609, 743, 778, 1148 and 1163 appeared to be exposed on the surface of S protein [9,24], suggesting that they are likely to play a key role in viral transmission and survival. In addition, it is worth pointing out that SARS-CoV is a rapidly evolving RNA virus with a mutation rate of 0.8-2.38 × 10^-3 nucleotide substitution per site per year [25]. The S gene sequences used in the present study were sampled during a year period, and some mutations might be accumulated in late sampled sequences [26,27]. However, whether the accumulation of these mutations influences the detection of positive selection and the identification of positively selected sites remains unclear [26]. This requires further investigation to confirm.

Adaptation of an animal virus to a new human host usually faces two crucial bottlenecks: the receptor adaptation of viral surface
protein to its new host, followed by the adaptation of key enzymes (e.g., viral replicases) associated with viral replication to new cellular components that possibly support poorly productive infection (e.g., non-permissive cells) [21,28]. The latter is not always the step that limits host expansion, and most viruses can establish productive infection after their entry of host cells [29]. We found that two key replicases of SARS-CoV, RNA-dependent RNA polymerase (RdRp) and helicase, were not under positive selection (Zhang CY et al., unpublished data), which suggests that receptor adaptation of S protein to human host determines the animal-to-human transmission of SARS-CoV [11,29]. The receptor adaptation of an animal virus to a new human host usually requires two key steps: initial breakthrough of receptor barrier (animal-to-human transmission), followed by the molecular adaptation to human cellular receptors (human-to-human transmission). The two steps together result in eventual establishment of stable infection necessary for efficient spread within human hosts.

In order to better reflect the course of viral trans-species transmission and subsequent adaptation to human hosts, the collection of SARS isolates was reclassified into three epidemic groups: 02-04 interspecies, 03-early-mid and 03-late epidemic groups in the present study. The 02-04 interspecies epidemic group reflects the process of viral trans-species transmission, and 03-early-mid epidemic group represents the crucial phase of SARS-CoVs to adapt to human host. The two groups correspond with the two key steps described above for a virus to adapt to a new cellular receptor. We found that S genes underwent strong positive selection in both groups of 02-04 interspecies epidemic and 03-early-mid epidemic, whereas no positive selection was observed in 03-late epidemic group (Table 2). It suggests that S protein experiences a step-by-step adaptation process to human cellular receptors. On the other hand, the amino acid sites under positive selection in 02-04 interspecies epidemic group differed clearly from those in 03-early-mid epidemic group, suggesting for the first time the changes in positively selected sites in different epidemic groups.

Table 3. Positively selected sites identified by Codeml program and REL in DataMonkey package

| Domains | 02-04 interspecies | 03-early-mid | 03-late |
|---|---|---|---|
| S1 | 78, 113, 139, 147, 227, 239, 261, 336, 425, 462, 472, 479, 480, 558, 607, 608, 609, 613 | 49, 75, 77, 144, 239, 244, 311, 344 | None |
| S2 | 701, 714, 743, 754, 765, 856, 894 | 778, 860, 861, 1001, 1148, 1163, 1179, 1247 | None |

Positively selected sites identified by program Codeml at the level of posterior probability > 0.95 are shown in boldface. The underlines represent the sites locating in receptor binding domain (RBD, residues 318-510) of S1 domain or in heptad repeat (HR)1 (residues 889-972)-HR2 (residues 1142-1185) region of S2 domain.

Table 4. The number of positively selected sites in different functional domains of S protein

| Functional domain | 02-04 interspecies epidemic group | 03-early-mid epidemic group | Fisher's exact test (p) |
|---|---|---|---|
| RBD | 6 | 1 | |
| HR1-HR2 | 1 | 4 | 0.045455 |
| HR2 | 0 | 3 | 0.033333 |

The Fisher's exact test was performed for HR1-HR2 vs. RBD and HR2 vs. RBD, respectively.

It was reported previously that two functional domains, S1 and S2, of SARS-CoV S protein are responsible for receptor recognition and membrane fusion, respectively [...] 510, which is adequate for binding to human ACE2 [30]. In domain S2, two highly conserved regions, HR1 (residues 889-972) and HR2 (residues 1142-1185), are crucial for membrane fusion [31]. Importantly, a larger proportion of positively selected sites was located in RBD than in HR1-HR2 or HR2 regions in 02-04 interspecies epidemic group, and a greater percentage of these sites occurred in HR1-HR2 region than in RBD in 03-early-mid epidemic group (Fisher's exact test, p = 0.045 for HR1-HR2 vs. RBD and p = 0.033 for HR2 vs. RBD) (Table 4). These differences suggest that
positive selection p-
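
The likelihood ratio statistics in Table 2 and the Fisher's exact test p-values in Table 4 can be re-derived from the tabulated numbers. The following is a minimal Python sketch under stated assumptions, not the authors' pipeline (the paper reports using Codeml and the DataMonkey package). The function names are hypothetical. It assumes that the lnL values are negative (the minus signs appear to be missing in this copy) and that the Table 4 tests are one-sided, an assumption inferred only from the fact that it matches the reported 0.045455 and 0.033333.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic 2*(lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable.

    Uses the closed form that holds only for even degrees of freedom,
    which covers the d.f. = 2 and d.f. = 4 comparisons in Table 2.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form implemented for positive even df only")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns P(top-left cell >= a) with all margins held fixed
    (hypergeometric distribution).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = math.comb(n, row1)
    hits = sum(
        math.comb(col1, k) * math.comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return hits / total


if __name__ == "__main__":
    # M1a vs. M2a, 02-04 interspecies group (lnL signs assumed negative).
    stat = lrt_statistic(-5334.74, -5319.78)
    print(f"2*delta_l = {stat:.2f}, p = {chi2_sf_even_df(stat, 2):.2e}")

    # Table 4 counts: rows are domains, columns are the two epidemic groups.
    print(f"RBD vs. HR1-HR2: p = {fisher_one_sided(6, 1, 1, 4):.6f}")
    print(f"RBD vs. HR2:     p = {fisher_one_sided(6, 1, 0, 3):.6f}")
```

With the Table 2 values, the M1a vs. M2a statistic comes out at 29.92, above the 9.21 threshold listed for d.f. = 2, and the two one-sided Fisher's tests give 36/792 and 4/120, which round to the p-values printed in Table 4. Only the standard library is used so that the arithmetic stays easy to check by hand.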