Answers to tree reliability test question

Dear all,

Many thanks for all the replies to my question. The comments were really helpful, and for the benefit of all I am posting the answers back to Evoldir.

My question was:

I am using a nuclear DNA marker to study the genetic differentiation among populations within an insect species complex. Using the MEGA software I obtained an unrooted Neighbor-Joining tree inferred from a matrix of Fst values. However, for pairwise distance input files the program does not offer an option to assess the reliability of the tree, since there is no nucleotide information on which to perform a bootstrap test. I was wondering if it would be possible to use the bootstrap resampling technique to compute 1000 pairwise distance matrices. From this point, several trees could then be reconstructed. Next, the topologies of those trees could be compared to that of the original tree, and the bootstrap value for each branch could be computed. Does anybody know of software that can perform the test in this manner?

Here are the answers I received:

Answer 1.
Your proposal to generate bootstrap samples sounds reasonable to me. PHYLIP has a program that generates pseudosamples from sequence data. You could then use other programs from PHYLIP, or any other program, to analyze the samples. Jeff Thorne's statalign program had a module that allowed one to generate pseudosamples of distance matrices when the SE of the pairwise distances could be estimated. But I think MEGA also has the capability to create and analyze pseudosamples.

Peter

Answer 2.
Good morning Luiz,

To resample the data I used PHYLIP (http://evolution.genetics.washington.edu/phylip.html), but I think it has to be done with unlinked genetic markers. In PHYLIP, open "Seqboot" to replicate your data. Next, open "Gendist" to calculate the distance matrices (with option M: Multiple data), then "Neighbor" to build the trees, then "Consense" to build the consensus NJ tree, and finally "drawtree" (or "drawgram") to view it with the support values. At each step PHYLIP writes an outfile, or an outfile and an outtree. You have to use these as input for the next step. It is best to rename them to something like whatever_name_you_like.txt. The program TreeView (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html) is also very nice for viewing NJ trees with bootstrap values.

Good luck,
Kind regards,
Stéphane

Answer 3.
I have been trying to do the same, with limited success. You can try PAUP, but you will run into the same problem with bootstraps. There is also the program Populations by Langella, which you can download from http://www.bioinformatics.org/project/filelist.php?group_id=84. You can enter your data and ask the program to calculate Fst for you, and then it performs the bootstrap. The problem I've had is that it crashes after 100 bootstraps. If anyone passes on a solution that works, could you email me please?

Kind regards
Clare Marsden

Answer 4.
Hi Luiz,

If you are using genotypic nuclear data (allozymes, microsats, SNPs) and need several distance matrices and trees with bootstrap, I would simply suggest using the software POPULATIONS from Langella, available at http://www.bioinformatics.org/project/?group_id=84.

Cheers,
Greg

Answer 5.
Luis Guilherme Bauzer --

My package PHYLIP can bootstrap gene frequency data, make a distance matrix for each bootstrap sample, and compute trees from each sample. The bootstrapping is of whole loci, which means you must have many loci, not just one locus. Of course, you are assuming that the relationships among populations are adequately summarized by a tree, and not much influenced by gene flow after splitting of populations. Otherwise you should investigate using coalescent methods, such as Mary Kuhner's LAMARC package or Peter Beerli's MIGRATE.
If both splitting and gene flow are important, Rasmus Nielsen and Jody Hey's package IMa would be needed.

J.F.

Answer 6.
Luiz,

You can do this with the PHYLIP package from Felsenstein's group: http://evolution.genetics.washington.edu/phylip.html

First you use SeqBoot to create the bootstrap samples, then GenDist to get your genetic distance matrices (it doesn't compute Fst), then Neighbor to make your trees, and then Consense to get your bootstrap consensus tree and nodal support values. It is a little cumbersome, but not so bad once you learn to use it. I hope that this helps.

Carlos

Answer 7.
You can do this with Phylip (http://evolution.genetics.washington.edu/phylip.html). You may need to write a script to do it. I did a similar thing on Mac OS X; I will attach the Perl script so you can see how it might be accomplished. I think it is also discussed on the Phylip pages.

Mike

Answer 8.
You should download Olivier Langella's POPULATIONS, which will allow you to build neighbor-joining or UPGMA trees with various genetic distance coefficients or Fst, and also to bootstrap over loci. You can generate the distances from your raw data with the program or supply your own pairwise distance matrix. There are Linux, Windows (command-line window) and Mac builds of the program. The tree output files can be visualized in Tree Explorer in MEGA. It will take a GenePop file as input: http://www.bioinformatics.org/project/?group_id=84

Alan

Answer 9.
If you have multiple markers (multiple loci), then resampling can be done in the same way as for nucleotide or amino acid sequences. This was implemented in my software DAMBE quite a long time ago. Also, if your marker is microsatellite data, then you should use genetic distances based on the stepwise mutation model (SMM) instead of conventional distances such as Nei's, Reynolds et al.'s, or Cavalli-Sforza's. An SMM-based distance is also implemented in DAMBE.

Best
Xuhua

Answer 10.
Hello Luiz,

You should use Phylip (http://evolution.genetics.washington.edu/phylip.html); it does exactly what you want to do!

Best regards,
Céline

Answer 11.
Dear Luiz Bauzer,

I am not sure I understand why you would want to base your NJ tree on Fst values instead of nucleotide distances. But I guess that a proper bootstrap analysis would involve resampling your original DNA sequences into pseudoreplicates and subsequently recomputing the Fst matrix and creating an NJ tree for each pseudoreplicate. Unfortunately, I am not aware of any software that performs such a test. Hope this helps a little bit.

Best wishes,
Robin van Velzen

Answer 12.
Hi Luiz,

If you have access to the original sequence data you can do the bootstrapping (and NJ tree building) in Phylip (free) or Paup (commercial). I use Phylip, and here is a brief "how-to". If you Google "Phylip" it is the first thing that comes up. The web site for Phylip has lots of helpful info. You can download executables for Unix/Linux, Windows, or the Mac. Phylip is a collection of executable files that work together.

To obtain the NJ tree for your original data, format the data for use with one of the genetic distance calculating modules. There are a couple of these modules to choose from, and I don't believe any of them will calculate population pairwise Fst from sequence data. GenDist will calculate Reynolds' distance, which is a function of Fst, but it expects marker data. Assuming you find what you need, the genetic distance modules typically expect the input data file to be named "infile". Next, take the output genetic distance file "outfile" and rename it "infile" for use with the program "Neighbor" to construct the NJ tree. Neighbor will generate two output files. One is called "outfile" and provides a crude, character-based representation of the NJ tree. The other is called "outtree", and it contains a representation of the tree for use with other Phylip programs, such as Consense (see below).
For bootstrapping your data, you will first need to take your Phylip-formatted original data set, name it "infile", and run "SeqBoot" to obtain N bootstrap data sets (say N = 1000). Then use the output of SeqBoot as the "infile" for the genetic distance calculating module you've chosen to use. Then use the output genetic distance file "outfile" as the "infile" for "Neighbor" to construct the NJ trees. Then use the output file "outtree" from Neighbor as the "intree" file for "Consense" to obtain the consensus tree.

With all this infile and outfile business it is easy to get confused. I put each Phylip program in its own folder so that I can keep each program's input and output files separate and organized. Phylip also has modules to draw trees, but the last time I used them the trees were not rendered with the same quality as those generated by, for example, FigTree, Mega, Paup, or TreeView. With some of these you may have to add bootstrap support values by hand...

Hope this helps!
Cheers,
John

Answer 13.
Hey.

The software WINDIST/WINBOOT will generate bootstrap values for UPGMA trees for a given distance matrix. I am not aware of its utility for NJ analyses. http://www.irri.org/science/software/

Cheers,
Luke

Answer 14.
The PHYLIP package (http://evolution.gs.washington.edu/phylip.html) can do what you ask: you would use the SEQBOOT program to make the replicate data sets, DNADIST to convert them into distance matrices, and NEIGHBOR to make the neighbor-joining trees. DNADIST and NEIGHBOR have options to take multiple data sets, so you need make only a single run of each. The program CONSENSE will then make a consensus tree of all the bootstrap results. The only flaw in PHYLIP for this purpose is that it does not contain a graphics program which can print a tree with bootstrap values marked on the branches.

Mary Kuhner

Answer 15.
Dear Dr. Bauzer,

We have a computer program called POPTREE2, which is a Windows version of POPTREE for making neighbor-joining and UPGMA trees with bootstrapping. It is about to be released to the public. You may write to Professor Naoko Takezaki to get the program. Naoko's email address is listed above.

Masatoshi

Answer 16.
Dear Dr. Bauzer:

As in Dr. Nei's message, POPTREE2 can construct the neighbor-joining tree and the UPGMA tree with a bootstrap test. Once you prepare the allele frequency data of your populations as an input file for POPTREE2, the tree can be constructed easily through the Windows interface. POPTREE2 is available at the following webpage: http://www.bio.psu.edu/misc/nxm2/poptree2_HP/poptree2_index.htm

If you have any questions or problems, please let me know.

Best wishes,
Naoko Takezaki

lbauzer@ioc.fiocruz.br
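
For anyone who wants to see the logic of the procedure raised in the question and in several replies (resample whole loci with replacement, recompute the distance matrix, rebuild the NJ tree, and count how often each branch of the original tree recurs), here is a minimal Python sketch. It is only an illustration and does not come from any of the replies or from any of the packages named above. The assumptions are: multi-locus allele-frequency data are available; the distance used is a simple placeholder (mean squared allele-frequency difference), not Fst; and all function names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def distance_matrix(loci, n_pops):
    """Placeholder distance (an assumption, not Fst): mean over loci of the
    summed squared allele-frequency differences between two populations."""
    d = [[0.0] * n_pops for _ in range(n_pops)]
    for i, j in combinations(range(n_pops), 2):
        total = sum(
            sum((a - b) ** 2 for a, b in zip(locus[i], locus[j]))
            for locus in loci
        )
        d[i][j] = d[j][i] = total / len(loci)
    return d


def nj_splits(d):
    """Run neighbor joining on d and return the non-trivial bipartitions of
    the unrooted tree, each stored as the side that lacks population 0."""
    n = len(d)
    everyone = frozenset(range(n))
    clusters = {i: frozenset([i]) for i in range(n)}  # node id -> its leaves
    dist = {(i, j): d[i][j] for i in range(n) for j in range(n) if i != j}
    splits = set()
    next_id = n
    while len(clusters) > 3:
        nodes = list(clusters)
        m = len(nodes)
        r = {i: sum(dist[i, k] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises the usual NJ criterion Q.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (m - 2) * dist[p] - r[p[0]] - r[p[1]],
        )
        new = next_id
        next_id += 1
        for k in nodes:
            if k != i and k != j:
                dist[new, k] = dist[k, new] = 0.5 * (
                    dist[i, k] + dist[j, k] - dist[i, j]
                )
        merged = clusters.pop(i) | clusters.pop(j)
        clusters[new] = merged
        splits.add(merged if 0 not in merged else everyone - merged)
    return splits


def bootstrap_support(loci, n_pops, n_reps=1000, seed=0):
    """Fraction of locus-bootstrap replicates whose NJ tree contains each
    bipartition of the NJ tree built from the original data."""
    rng = random.Random(seed)
    reference = nj_splits(distance_matrix(loci, n_pops))
    counts = {split: 0 for split in reference}
    for _ in range(n_reps):
        sample = rng.choices(loci, k=len(loci))  # whole loci, with replacement
        replicate = nj_splits(distance_matrix(sample, n_pops))
        for split in reference & replicate:
            counts[split] += 1
    return {split: c / n_reps for split, c in counts.items()}


# Hypothetical toy data: 4 biallelic loci x 5 populations. Each row gives the
# frequency p of one allele in populations 0..4; the other allele is 1 - p.
P = [
    [0.10, 0.12, 0.50, 0.90, 0.88],
    [0.20, 0.18, 0.55, 0.80, 0.85],
    [0.15, 0.10, 0.40, 0.95, 0.90],
    [0.05, 0.10, 0.60, 0.85, 0.80],
]
loci = [[[p, 1.0 - p] for p in row] for row in P]
support = bootstrap_support(loci, n_pops=5, n_reps=200)
for split, value in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(split), value)
```

Each printed line is one internal branch of the original tree, written as the set of populations on the side away from population 0, followed by its bootstrap proportion. As several replies point out, this only makes sense with many loci; with a single locus there is nothing to resample. For real data the placeholder distance would have to be replaced by the Fst or other genetic distance actually used.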