
ESyPred3D: protein tertiary structure prediction

BIOINFORMATICS Vol. 18 no. 9 2002, Pages 1250–1256

ESyPred3D: Prediction of proteins 3D structures

Christophe Lambert∗, Nadia Léonard, Xavier De Bolle and Eric Depiereux

Facultés Universitaires Notre-Dame de la Paix, Unité de Recherche en Biologie Moléculaire, Rue de Bruxelles, 61, B-5000 Namur, Belgium

Received on February 1, 2002; revised on March 7, 2002; accepted on March 18, 2002

ABSTRACT

Motivation: Homology or comparative modeling is currently the most accurate method to predict the three-dimensional structure of proteins. It generally consists of four steps: (1) databank searching to identify the structural homolog, (2) target–template alignment, (3) model building and optimization, and (4) model evaluation. The target–template alignment step is generally accepted as the most critical step in homology modeling.

Results: We present here ESyPred3D, a new automated homology modeling program. The method benefits from the improved alignment performance of a new alignment strategy. Alignments are obtained by combining, weighting and screening the results of several multiple alignment programs. The final three-dimensional structure is built using the modeling package MODELLER. ESyPred3D was tested on 13 targets in the CASP4 experiment (Critical Assessment of Techniques for Protein Structure Prediction). Our alignment strategy obtains better results than PSI-BLAST alignments, and ESyPred3D alignments are among the most accurate compared to those of participants having used the same template.

Availability: ESyPred3D is available through its web site at http://www.fundp.ac.be/urbm/bioinfo/esypred/.

Contact: mbert@fundp.ac.be; http://www.fundp.ac.be/~lambertc

INTRODUCTION

Three-dimensional (3D) protein structure is an important source of information to better understand the function of a protein, its interactions with other compounds (ligands, proteins, DNA, ...) and the phenotypical effects of mutations (Tramontano, 1998). The 3D protein structure can be predicted according to three main
categories of methods (Rost and O’Donoghue, 1997): (1) homology or comparative modeling (described below); (2) fold recognition (predicting the global fold of a protein); (3) ab initio techniques (trying to model the 3D structure of proteins using only the sequence and a force field).

∗To whom correspondence should be addressed.

Homology modeling is historically the first (Browne et al., 1969) and the most accurate method (Sanchez and Sali, 1997). It was shown during the last CASP experiment (Venclovas et al., 1999) (Critical Assessment of Techniques for Protein Structure Prediction) that the critical steps are: (1) template selection, (2) target–template alignment, (3) modeling of regions not present in, or significantly different from, the template and (4) modeling of side chains. Among these critical steps, it is commonly accepted that the target–template alignment step is the most critical (Mosimann et al., 1995; Martin et al., 1997).

It is known that above 50% sequence identity between target and template, pairwise alignments provide accurate models. Between 30% and 50% identity, multiple alignments between target, template and similar proteins must be used, and the pairwise alignment between target and template must be extracted from this multiple alignment. Below 30% identity, only heuristic combinations of multiple alignments, experimental data and the know-how of an expert are able to generate an accurate model.

A large number of techniques have been developed to predict 3D structures of proteins by homology modeling. For the target–template alignment step, most of them use PSI-BLAST (Altschul et al., 1997), PileUp (Wisconsin Package Version 9.1, Genetics Computer Group (GCG), Madison, Wisc.), ClustalW (Thompson et al., 1994), 3D-PSSM (Fischer et al., 1999), SAM-T99 (Karplus et al., 1997), or the alignment producing the best model out of a collection computed from various alignment programs (Yang and Honig, 1999).

Our laboratory developed the MATCHBOX multiple sequence alignment software in
the early 1990s (Depiereux and Feytmans, 1992) and it has proved to be one of the most accurate in terms of specificity (Depiereux et al., 1997). Much effort has been devoted to improving alignment accuracy by adding information such as secondary structure predictions, solvent accessibility predictions, specific scoring matrices and combination with ClustalW. In all cases, it was only possible to slightly improve multiple alignment accuracy (unpublished results). Meanwhile, no significant improvements in alignment performance have been published by other groups. Furthermore, no alignment method can be qualified as the absolute most reliable one. Indeed, benchmarks (Briffeuil et al., 1998; Thompson et al., 1999) have shown that the comparative performance of alignment programs depends strongly on the set of aligned sequences.

In this work, we tackle the target–template alignment problem by developing a specific program to align target and template sequences in homology modeling. Matching of homologous segments is improved by incorporating the results of several multiple alignment programs. Results are scored to optimize performance and screened to remove incompatible matches. Several algorithmic problems required specific developments in order to generate and efficiently screen the database of the various and often incompatible alignments proposed by the different algorithms. This new alignment strategy is included in our ESyPred3D program (http://www.fundp.ac.be/urbm/bioinfo/esypred/), which predicts the 3D structure of proteins using the homology modeling approach.

© Oxford University Press 2002

SYSTEM AND METHODS

Our automatic program (ESyPred3D) implements the four steps of the homology modeling approach (Eisenhaber et al., 1995): (1) databank searching to identify the structural homolog, (2) target–template alignment, (3) model building and optimization and (4) model evaluation (not implemented at the time of the
CASP4 experiment). ESyPred3D was run on an SGI Octane dual-processor 225 MHz workstation under IRIX 6.5.

Identifying the structural homolog

To find homologs to the target sequence, PSI-BLAST 2.0.14 (downloaded from NCBI and run locally) is run using the latest possible version of the NR databank (NCBI). The chosen template is the sequence from the latest version of the PDB databank with the lowest expected value after four iterations. The cutoff for the expected value is 0.0001 (-h flag). If no template is found with these criteria, the program stops.

Aligning sequences and constructing the 3D model

According to Thompson et al. (1999), the quality of the alignment of sequences depends highly on the context of the alignment. The results obtained for a given pair of sequences may differ depending on the set of sequences submitted to multiple alignment programs. So, after fetching all sequences retrieved by PSI-BLAST, two sets of sequences are generated in order to create two different computational conditions for running multiple alignment programs. Set A contains the 50 best hits including the target and the template (the number of sequences is limited to 50 to reduce computing time). Set B is a subset of at least seven sequences, including the target and the template, produced by dropping overly redundant sequences with the PURGE program (provided with the Gibbs package (Neuwald et al., 1995)). The BLAST score (using the -b flag) to select or eliminate sequences during the PURGE operation is 250.

The building of the target–template alignment is performed in these steps (see Figure 1):

(a) Matching. Both sets of sequences (A and B) are aligned by five alignment programs emerging from two benchmarks (Briffeuil et al., 1998; Thompson et al., 1999). These programs are: ClustalW, Dialign2 (Morgenstern, 1999), Match-Box (Depiereux et al., 1997), Multalin (Corpet, 1988) and PRRP (Gotoh, 1996). Ten multiple alignments are generated, each one including the target and template sequences. Then, the pairwise alignments between the
target and template sequences are extracted, leading to ten different pairwise alignments between the target and the template.

(b) Database building. Each position of the alignments is stored in a database; all the redundant results, i.e. the same amino acid placed at the same position by different programs, are scored in a frequency table.

(c) Screening. The position with the highest score is taken as the first anchor point to build the final target–template alignment. Incompatible results (see Figure 2), aligning regions located upstream and downstream of anchor points, are removed from the database. The process continues, with new anchor positions being determined and incompatible regions being eliminated, until all results are selected or removed. The final target–template alignment is thus composed of the most frequent aligned positions, under the condition of compatibility.

This final pairwise alignment is used by the MODEL homology modeling routine of MODELLER release 4 (Sali and Blundell, 1993; Sali et al., 1997) to build a 3D model of the target protein. This routine includes the satisfaction of spatial and geometric restraints and a very fast molecular dynamics annealing; no other refinements were applied.

Fig. 1. Flowchart of the ESyPred3D target–template alignment method. See the text for details.

Fig. 2. Example of compatible and incompatible results on two hypothetical sequences. Three cases are reported: (a) Alignments I–I and I–L are not compatible because the same amino acid in sequence 1 is aligned to two amino acids in sequence 2. (b) Alignments P–P and A–A are not compatible: P in sequence 1 is to the right of A, but P in sequence 2 is to the left of A. (c) Alignments I–I and P–P are compatible: the prolines are both to the right of the isoleucines.

Participation in the CASP4 experiment

The ESyPred3D server participated in the CASP4 experiment (see complete results at /casp4/; group 218: LAMBERT-CHRISTOPHE). All the models generated by MODELLER were submitted to the CASP4 contest without any geometric or energetic evaluation. The number of homology modeling targets used during the CASP4 competition was too small to draw a very robust statistical conclusion about the performance of our method. However, the results obtained provide a first estimate of performance. For more statistical results, see the continuous evaluation of servers performed by EVA (Eyrich et al., 2001) (/eva/).

For the purpose of the CASP4 experiment, two models were built for each of the 13 comparative modeling targets (Table 1) for which ESyPred3D was able to predict a 3D structure:

(1) The first model was built using the complete strategy described above (ESyPred3D) (models T0xxxTS2181).

(2) The second model was built using the same strategy as ESyPred3D but using the rough sequence–structure alignment provided by PSI-BLAST (models T0xxxTS2182).

Scoring schemes used to compare target structures to models

To compare ESyPred3D models to PSI-BLAST models and ESyPred3D models to models of other CASP4 participants, the AL0 and the GDT_TS scores were chosen. Both scores were calculated using the LGA (Local–Global Alignment) program (Zemla, 2000).

AL0

AL0 is the number of correctly aligned residues in the target–template alignment. This score is very significant in this case because our method was designed to generate optimal alignment performance. This number is evaluated by first making a structural alignment of the prediction and the target structure with the DALI server, and then counting the number of residues in the model for which the closest residue in the target is the correct one (the distance between their α-carbons being less than 3.8 Å).

GDT_TS

The Global Distance Test, GDT_{d_i}, is the number of α-carbons of a prediction not deviating by more than d_i Å from the α-carbons of the target, after optimal superimposition. This optimal superimposition is computed in such a way that the number of residues (α-carbons) that fit under the distance cutoff d_i is maximal. If NT is the total number of
residues of the target, GDT_TS (GDT Total Score), computed according to the formula given below, is the mean fraction of residues of the target not deviating from the prediction after four optimal superimpositions with 1.0, 2.0, 4.0 and 8.0 Å α-carbon distance cutoffs. The GDT_TS score represents the overall quality of the model. This score was used to evaluate the complete procedure of ESyPred3D: identifying the structural homolog, aligning target to template and building the 3D model.

\[
\mathrm{GDT\_TS} = 100 \cdot \frac{\sum_{d_i} \mathrm{GDT}_{d_i} / NT}{4}, \qquad d_i \in \{1.0,\ 2.0,\ 4.0,\ 8.0\}
\]

Table 1. Homology modeling targets for the CASP4 experiment

Target  Description                                                  PDB code
T0090   ADP-ribose pyrophosphatase, E. coli                          1g0s, 1g9q, 1ga7
T0092   Hypothetical protein HI0319, H. influenzae
T0099   No description
T0103   Pepstatin insensitive carboxyl proteinase, Pseudomonas sp.   1ga6
T0111   Enolase, E. coli
T0112   Ketose Reductase/Sorbitol Dehydrogenase, B. argentifolii     1e3j
T0113   Short chain 3-hydroxyacyl-CoA dehydrogenase, rat             1e3w, 1e3s, 1e6w
T0117   Deoxyribonucleoside kinase, D. melanogaster
T0121   MalK, T. litoralis                                           1g29
T0122   Tryptophan Synthase alpha subunit, P. furiosus               1geq
T0123   Beta-lactoglobulin, pig                                      1exs
T0125   Sp18 protein, H. fulgens                                     1gak
T0128   Manganese superoxide dismutase homolog, P. aerophilum

RESULTS AND DISCUSSION

The performance of our homology modeling server is analyzed in three steps. In the first section, ESyPred3D alignments are compared to PSI-BLAST alignments. In the second section, ESyPred3D alignments are compared to those of other participants having used the same template. Since our alignment method is specifically designed for homology modeling, in the third section ESyPred3D models are compared to those of other CASP4 competitors in order to evaluate the global performance of our homology modeling strategy.

Alignment performances of ESyPred3D models compared to those of PSI-BLAST models

Table 2 contains AL0 scores for all models. Out of the 13 models, ESyPred3D obtains nine AL0 scores greater than PSI-BLAST and only one AL0 score significantly
lower (more than two amino acids incorrectly aligned) than PSI-BLAST, for T0112. Two reasons explain the poor alignment for T0112:

(1) The number of homologs found by PSI-BLAST was so large that the non-redundant set could not be computed with PURGE.

(2) Four regions of T0112 shared only a very low similarity with homologs, so the different alignment programs produced contradictory results in these regions, and only a poor alignment could be established by our method.

From this first evaluation, we can conclude that the quality of the target–template alignment is generally better with the ESyPred3D alignment methodology than with the target–template alignment provided by PSI-BLAST.

Comparison of ESyPred3D alignments with those of the participants having used the same template

The number of groups that used the same templates as ESyPred3D varies strongly (from two to 65 groups, Figure 3). In Figure 3, using AL0 scores, ESyPred3D models obtained first place once, second place five times, third place three times and fourth place once. ESyPred3D models are thus in the top four places for ten of the 13 targets. Taking into account that a group that performs better than the ESyPred3D model for one target is rarely the same as the one that performs better for another target, one can conclude that our methodology is among the most efficient.

Comparison of ESyPred3D models with those of all CASP4 participants

In this section, the complete strategy of ESyPred3D is evaluated and its performance is compared to that of other CASP4 participants using the GDT_TS score.
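The GDT_TS score used for this comparison follows the formula given in the scoring-schemes section. A minimal Python sketch of that formula is shown here under a simplifying assumption: the α-carbon distances after the optimal superimposition for each cutoff are taken as given (in the paper they come from the LGA program, which is not reproduced here). The function name `gdt_ts` and its input layout are illustrative choices, not part of ESyPred3D or LGA.

```python
from typing import Mapping, Sequence

# Distance cutoffs (in Å) used by GDT_TS, as listed in the text.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances_by_cutoff: Mapping[float, Sequence[float]], n_target: int) -> float:
    """Return GDT_TS (0-100) from precomputed alpha-carbon distances.

    distances_by_cutoff[d] holds the model-to-target alpha-carbon distances
    (in Å) measured after the superimposition optimized for cutoff d.
    n_target is NT, the total number of residues of the target.
    """
    if n_target <= 0:
        raise ValueError("n_target must be positive")
    # GDT_d = number of alpha-carbons within d Å; sum it over the four cutoffs.
    total_within = sum(
        sum(1 for dist in distances_by_cutoff[d] if dist <= d) for d in CUTOFFS
    )
    # Mean over the four cutoffs of GDT_d / NT, expressed as a percentage.
    return 100.0 * total_within / (len(CUTOFFS) * n_target)


# Hypothetical distances for a 5-residue target; the same superimposition is
# reused for every cutoff purely to keep the example short.
example = [0.5, 1.5, 3.0, 6.0, 10.0]
print(gdt_ts({d: example for d in CUTOFFS}, n_target=5))  # 50.0
```

In this toy case 1, 2, 3 and 4 residues fall under the four cutoffs, giving 100 × (1 + 2 + 3 + 4) / (4 × 5) = 50. The hard part of a real GDT computation, the search for the superimposition that maximizes the count at each cutoff, is deliberately left out.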
Table 2. Scores for ESyPred3D and PSI-BLAST models. The last column shows templates that lead to the best models presented at CASP4

Model        RMSD (all α-carbons)  GDT_TS  AL0  Template  Templates leading to the best models
T0090TS2181   6.52                 30.15    41  1tum      1mut
T0090TS2182   6.44                 23.37    30  1tum      1mut
T0092TS2181  14.70                 35.69    73  1d2g      1xva, 1d2h
T0092TS2182   5.22                 34.69    66  1d2g      1xva, 1d2h
T0099TS2181   5.54                 52.23    26  1qly      1a0n, 1ad5, 2hck, 1qcf, 2src
T0099TS2182   5.56                 50.00    21  1qly      1a0n, 1ad5, 2hck, 1qcf, 2src
T0103TS2181  11.95                 38.59   128  1sbh      1mee, 1sup
T0103TS2182  12.41                 33.76   113  1sbh      1mee, 1sup
T0111TS2181   2.29                 83.55   383  1one      1pdz, 1pdy, 1ykf, 4-7enl
T0111TS2182   2.26                 82.39   381  1one      1pdz, 1pdy, 1ykf, 4-7enl
T0112TS2181   5.35                 54.31   174  1hdy      1teh, 1ykf
T0112TS2182   4.13                 59.19   197  1hdy      1teh, 1ykf
T0113TS2181   3.32                 81.86   214  1hdc      1hdc, 2hsd
T0113TS2182   3.68                 80.49   207  1hdc      1hdc, 2hsd
T0117TS2181   8.24                 56.85   114  1qhi      1e2k, 1kim, 1ki2-7
T0117TS2182   3.87                 55.71   109  1qhi      1e2k, 1kim, 1ki2-7
T0121TS2181   3.35                 41.94   143  1b0u      1b0u
T0121TS2182   3.38                 40.93   141  1b0u      1b0u
T0122TS2181   2.43                 79.15   203  1cw2      1a5a, 1a5b, 1beu, 1cw2
T0122TS2182   2.41                 74.58   190  1cw2      1a5a, 1a5b, 1beu, 1cw2
T0123TS2181   4.15                 63.91   102  2a2u      2a2g, 1beb
T0123TS2182   3.75                 65.47   102  2a2u      2a2g, 1beb
T0125TS2181   4.15                 61.13    74  3lyn      2lis, 3lyn
T0125TS2182   4.07                 60.40    75  3lyn      2lis, 3lyn
T0128TS2181   1.74                 86.73   185  1abm      1b06, 1sss
T0128TS2182   1.65                 87.32   187  1abm      1b06, 1sss

The number of models submitted for each target was always above 200. So, to enable a rapid interpretation of the distributions of scores, we computed the third quartile of these distributions. The third quartile (Q3) of a distribution is the value such that 75% of the values in the list are less than or equal to it. All information provided in Figure 4 has been normalized by the Q3 value for each target. For each target, Figure 4 contains: (1) the GDT_TS score of ESyPred3D models; (2) the GDT_TS score of the best model received by CASP4 organizers and (3) the third quartile, which is equal to 1.0 because of the normalization.
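The third-quartile normalization described above can be sketched as follows. This is only an illustration: the GDT_TS values are made up, the function names are hypothetical, and the paper does not state which quantile convention was used, so the `inclusive` method of Python's `statistics.quantiles` is an assumption.

```python
import statistics
from typing import Dict, Sequence


def third_quartile(scores: Sequence[float]) -> float:
    """Q3 of the scores, using the (assumed) 'inclusive' quantile convention."""
    # quantiles(..., n=4) returns the three quartile cut points [Q1, Q2, Q3].
    return statistics.quantiles(scores, n=4, method="inclusive")[2]


def normalize_by_q3(all_scores: Sequence[float], own_score: float) -> Dict[str, float]:
    """Express one model's score, the best score and Q3 itself in units of Q3."""
    q3 = third_quartile(all_scores)
    return {
        "own": own_score / q3,          # the model under study
        "best": max(all_scores) / q3,   # best model submitted for the target
        "q3": q3 / q3,                  # always 1.0 after normalization
    }


# Hypothetical GDT_TS scores of all models submitted for one target.
hypothetical_scores = [20.0, 40.0, 60.0, 80.0, 100.0]
print(normalize_by_q3(hypothetical_scores, own_score=60.0))
# {'own': 0.75, 'best': 1.25, 'q3': 1.0}
```

A normalized value above 1.0 places a model in the top quarter of submissions for that target, which is how the "above Q3" counts in the discussion can be read.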
Figure 4 shows that ESyPred3D built three models with scores close to the best model: second place was obtained for targets T0103, T0121 and T0122 (see full tables at /casp4/). ESyPred3D predicted eight models above Q3 values, i.e. in the top 25% of participants. Further analysis of the data at the CASP4 web site shows that few groups reached such a number of scores above Q3. It is also important to note that the group that obtained the best model for one target is rarely the same as the one that submitted the best model for another target.

The analysis of GDT_TS scores (Figure 4) showed that seven targets (T0090, T0092, T0099, T0112, T0117, T0123 and T0128) obtained values significantly lower than those of the best models. For targets T0090, T0092, T0099, T0117, T0123 and T0128, the low values of GDT_TS are due to the selection of a template that was not fully adequate. Indeed, for these targets, the alignment performance remains good when comparing only to groups that used the same template, as shown by the AL0 score in Figure 3. Although the template selection process in our methodology has to be improved, it is important to note that a completely inadequate template was never chosen. The result for T0112 is due to the quality of the alignment, as can be seen in Figure 3.

The fact that eight models out of 13 are above Q3 shows that our alignment method, combined with the PSI-BLAST template selection and the use of MODELLER to obtain the 3D model, is a good strategy. Even if the template selection or the alignment quality is not optimal, the global quality of the ESyPred3D modeling strategy remains good.

Fig. 3. AL0 scores for targets studied in this work. Two series are reported for each target: the score of ESyPred3D models (black bullets) and the scores of models of other CASP4 participants having used the same template (blank bullets). AL0 scores are expressed as a fraction of the length of the target. [Plot not reproduced. x-axis: targets T0090 to T0128; y-axis: AL0 (in % of the length), 10 to 100.]

Fig. 4. GDT_TS scores for targets studied in this work. Three points are reported for each target: the score of the model that obtained the best score (bold line), the ESyPred3D model (box) and the third quartile value (dotted line). All values are expressed as a fraction of the third quartile value. The group IDs of best predictors with their selected templates are also reported. [Plot not reproduced. x-axis: targets T0090 to T0128; y-axis: GDT_TS (in % of Q3).]

CONCLUSION

A new alignment methodology for homology modeling of proteins has been developed. The program has been tested on 13 targets of CASP4 for its alignment performance and for the general quality of the models provided.

Our alignment strategy produced better results than PSI-BLAST alignments, and ESyPred3D alignments are among the most accurate compared to those of participants having used the same template. Furthermore, our ESyPred3D program provides models that are among the best of the CASP4 experiment.

Nevertheless, our alignment methodology could be improved. The benchmarks of Thompson et al. (1999) and Briffeuil et al. (1998) showed that alignment programs have different levels of performance. We plan to use this information to improve the computing of the alignment, by weighting each multiple alignment method with a numeric representation of its mean performance. Additional information such as secondary structure predictions can also be used in the box selection in order to improve the alignment quality.

The template problem remains troublesome in homology modeling, especially when the target and template sequences share a low identity rate. To improve the template selection, the use of better parameters or better scoring matrices for PSI-BLAST (like the one described in Kann et al. (2000)) needs to be investigated. In the same way, PSI-BLAST can also be replaced by
SAM-T99 (Karplus et al., 1998) or other programs. The intrinsic quality of the possible template structures (NMR, resolution, ...) and the selection of multiple templates will also be taken into account to improve our modeling strategy.

The model evaluation step of our homology modeling methodology has not yet been developed. Geometric and energetic evaluation of the model can be done using ANOLEA (Melo and Feytmans, 1997), PROCHECK (Laskowski et al., 1993) or Verify3D (Luthy et al., 1992). The results of these evaluations will be used to change our target–template alignment or to select a more appropriate template. The process will be iterated in order to find the template that provides the best evaluated model. A similar iteration procedure was used by the Blundell group in CASP4.

ACKNOWLEDGMENTS

We thank the organizers and assessors of the CASP4 experiment for their valuable contributions to the structure prediction field. Christophe Lambert holds a specialized grant from the ‘Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture’ (F.R.I.A.). We particularly want to thank Guy Baudoux, Katalin de Fays and Johan Wouters for helpful and fruitful discussions.

REFERENCES

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

Briffeuil, P., Baudoux, G., Reginster, I., De Bolle, X., Vinals, C., Feytmans, E. and Depiereux, E. (1998) Comparative analysis of seven multiple protein sequence alignment servers: clues to enhance predictions reliability. Bioinformatics, 14, 357–366.

Browne, W.J., North, A.C. and Phillips, D.C. (1969) A possible three-dimensional structure of bovine alpha-lactalbumin based on that of hen’s egg-white lysozyme. J. Mol. Biol., 42, 65–86.

Corpet, F. (1988) Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res., 16, 10881–10890.

Depiereux, E., Baudoux, G., Briffeuil, P., Reginster, I., De Bolle, X., Vinals, C. and Feytmans, E. (1997) Match-Box server: a multiple sequence alignment tool placing emphasis on reliability. Comput. Appl. Biosci., 13, 249–256.

Depiereux, E. and Feytmans, E. (1992) Match-Box: a fundamentally new algorithm for simultaneous alignment of several protein sequences. Comput. Appl. Biosci., 8, 501–509.

Eisenhaber, F., Persson, B. and Argos, P. (1995) Protein structure prediction: recognition of primary, secondary, and tertiary structural features from amino acid sequence. Crit. Rev. Biochem. Mol. Biol., 30, 1–94.

Eyrich, V.A., Marti-Renom, M.A., Przybylski, D., Madhusudhan, M.S., Fiser, A., Pazos, F., Valencia, A., Sali, A. and Rost, B. (2001) EVA: continuous automatic evaluation of protein structure prediction servers. Bioinformatics, 17, 1242–1243.

Fischer, D., Barret, C., Bryson, K., Elofsson, A., Godzik, A., Jones, D., Karplus, K.J., Kelley, L.A., MacCallum, R.M., Pawlowski, K., Rost, B., Rychlewski, L. and Sternberg, M. (1999) CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins, (Suppl 3), 209–217.

Gotoh, O. (1996) Significant Improvement in Accuracy of Multiple Protein Sequence Alignments by Iterative Refinement as Assessed by Reference to Structural Alignments. J. Mol. Biol., 264, 823–838.

Kann, M., Qian, B. and Goldstein, R.A. (2000) Optimization of a new score function for the detection of remote homologs. Proteins, 41, 498–503.

Karplus, K., Barrett, C. and Hughey, R. (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics, 14, 846–856.

Karplus, K., Sjolander, K., Barrett, C., Cline, M., Haussler, D., Hughey, R., Holm, L. and Sander, C. (1997) Predicting protein structure using hidden Markov models. Proteins, (Suppl 1), 134–139.

Laskowski, R.A., Moss, D.S. and Thornton, J.M. (1993) Main-chain bond lengths and bond angles in protein structures. J. Mol. Biol., 231, 1049–1067.

Luthy, R., Bowie, J.U. and Eisenberg, D. (1992) Assessment of protein models with three-dimensional profiles. Nature, 356, 83–85.

Martin, A.C., MacArthur, M.W. and Thornton, J.M. (1997) Assessment of comparative modeling in
CASP2. Proteins, (Suppl 1), 14–28.

Melo, F. and Feytmans, E. (1997) Novel Knowledge-based Mean Force Potential at Atomic Level. J. Mol. Biol., 267, 207–222.

Morgenstern, B. (1999) DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics, 15, 211–218.

Mosimann, S., Meleshko, R. and James, M.N. (1995) A critical assessment of comparative molecular modeling of tertiary structures of proteins. Proteins, 23, 301–317.

Neuwald, A.F., Liu, J.S. and Lawrence, C.E. (1995) Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Sci., 4, 1618–1632.

Rost, B. and O’Donoghue, S. (1997) Sisyphus and prediction of protein structure. Comput. Appl. Biosci., 13, 345–356.

Sali, A. and Blundell, T.L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol., 234, 779–815.

Sali, A., Sanchez, R. and Badretdinov, A. (1997) MODELLER: A Program for Protein Structure Modeling Release 4.

Sanchez, R. and Sali, A. (1997) Advances in comparative protein-structure modeling. Curr. Opin. Struct. Biol., 7, 206–214.

Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.

Thompson, J.D., Plewniak, F. and Poch, O. (1999) A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res., 27, 2682–2690.

Tramontano, A. (1998) Homology modeling with low sequence identity. Methods, 14, 293–300.

Venclovas, C., Zemla, A., Fidelis, K. and Moult, J. (1999) Some measures of comparative performance in the three CASPs. Proteins, (Suppl 3), 231–237.

Yang, A.S. and Honig, B. (1999) Sequence to structure alignment in comparative modeling using PrISM. Proteins, (Suppl 3), 66–72.

Zemla, A. (2000) LGA program: A Method for Finding 3D Similarities in Protein Structures. Accessed at http://PredictionCenter./local/lga.
