Hi all,

I recently posted a question about incomplete lineage sorting and hybridization, and I am forwarding here all the relevant and excellent answers I received (thanks a lot!!). I was glad to see that this is a topic many of you are thinking about.

Original post:

Dear all,

Are there any methods that can specifically distinguish incomplete lineage sorting from introgressive hybridization in sequence data? I'm studying two sister species that are well separated in microsatellites, allozymes, morphology and ecology but share all common haplotypes in one mitochondrial gene and two nuclear introns (which also lack deep divergence between any haplotypes). MDIV gives maximum likelihood estimates of M (mNe) up to 2 in some populations, and because of this the divergence time estimates are uncertain (non-zero ends on the posterior probability distributions). Given the high estimates of M, we would like to argue that the sharing of haplotypes is indeed mainly due to introgression. Nevertheless, reviewers of the manuscript argue that this could still be due to incomplete lineage sorting and claim that tests specifically designed to distinguish between these two scenarios do exist (without mentioning which). I haven't been able to find such a test in the literature, so I ask for your help.

Thank you in advance.

-----
Answers:
-----

Dear Petri,

Your reviewer is wrong. The IM and MDIV packages, which implement the models described by Nielsen and Wakeley (2001) and by Hey and Nielsen (2004), are specifically designed to address exactly this question.

Do you have non-zero tails at both ends of your Tdiv distribution? The right tail of your Tdiv posterior will always be non-zero for analyses of single-locus data, no matter what your estimate of M is; this is shown in the original Nielsen and Wakeley paper. If you analyzed several of your data sets together using IM instead of MDIV, this non-zero tail in Tdiv MIGHT go away.
As far as demonstrating introgression rather than lineage sorting, you could use the posterior distribution of M to calculate the probability that it is different from some benchmark. In a paper I published in Molecular Ecology (see refs below), I summed under the curve of the posterior distribution to show that I could reject the hypothesis that M was greater than or equal to 1; there is a boundary condition at zero, so I couldn't calculate the probability that M was actually different from zero, but I was able to reject panmixia. Depending on the shape of your posterior distribution, you MAY be able to reject the hypothesis that M is less than one.

If you look at the IM discussion board, http://groups.google.com/group/Isolation-with-Migration, you can find a recent exchange I had with Jody Hey about using this procedure to test demographic hypotheses:
http://groups.google.com/group/Isolation-with-Migration/browse_thread/thread/64dd7f6187960c1f

Hey, J., and R. Nielsen. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics 167:747-760.

Nielsen, R., and J. Wakeley. 2001. Distinguishing migration from isolation: a Markov chain Monte Carlo approach. Genetics 158:885-896.

Smith, C. I., and B. D. Farrell. 2005. Phylogeography of the longhorn cactus beetle Moneilema appressum LeConte (Coleoptera: Cerambycidae): was the differentiation of the Madrean sky-islands driven by Pleistocene climate changes? Molecular Ecology 14:3049-3065.

(Petri's comment: this is what I will first and foremost do for my data! Thanks a lot, Chris!)

-----

Hi Petri,

There are a couple of recent papers that talk about identifying introgression as inconsistent alleles from particular genes with respect to (most) other genes.
They do this using IM or IMa, programs that estimate divergence times and ancestral population sizes under a coalescent model (which accounts for incomplete lineage sorting). The latter also includes a parameter for secondary gene flow following divergence. The logic is that if a set of genes or alleles gives a significantly more recent divergence time estimate than the combined dataset, or than the dataset without those alleles, they are likely to have been introduced by introgression. Jody Hey, who co-authored these programs, has a Google discussion group as well. The papers are below. I'd be interested in any other suggestions you get. Good luck!

Peters et al. 2007. Evolution 61(8): 1992-2006.
Carling & Brumfield. 2008. Genetics 178: 363-377.

-----

Hi Petri,

Check out my paper, attached. The method has its problems but might be OK in some situations. Also, you should check out the models of Wakeley and Hey and the IM program.

Cheers,
Thomas

-----

Dear Petri,

I think there is an important ambiguity in the distinction between lineage sorting and introgression that you should think about. Introgression is in fact part and parcel of the lineage sorting process until lineage sorting has gone to completion. Subsequent introgression would reintroduce previously lost allelic lineages into one or both gene pools. This would, in effect, re-establish lineage sorting processes, although I recognize this is not what you or others mean when trying to draw this distinction. The question is really between primary and secondary lineage sorting, where introgression would be the trigger for secondary lineage sorting and merely a possible component of primary lineage sorting. I would essentially describe your current task as trying to determine whether or not primary lineage sorting has ever gone to completion for the loci you are studying.
It is, of course, irrelevant whether it did or did not go to completion for any other loci, and I would object as a reviewer if you tried to extend your conclusions about lineage sorting at these loci to the rest of the genome.

Cheers,
Guy

-----

Dear Petri,

There are attempts at disentangling these two processes leading to reticulated species relationships, like that of Sang and Zhong (2000, Syst. Biol.), based on expectations of coalescence times. However, these methods are flawed and do not really allow one to tell hybridisation from introgression. My conclusion on this dilemma was that the distinction is really hard to make unless you bring geography into play. Hybridisation is expected to show a geographic pattern; stochastic lineage sorting is not. Please have a look at the paper by Gómez-Zurita and Vogler (2006, J. Mol. Evol. 62: 421-433). [I attached it to my mail, but your server returned it to me because it exceeded the attachment size limit.]

Hope it helps.
Txus

-----

Hi Petri,

Ryan Garrick and I came up with an approach to this, in:

Garrick RC, Sands CJ, Rowell DM, Hillis DM, Sunnucks P (2007). Catchments catch all: long-term population history of a giant springtail from the southeast Australian highlands - a multigene approach. Molecular Ecology, 16, 1865-1882.

I remember that Thomas R Buckley et al. had a paper in Syst Biol in 2006 that made another attempt at this.

Paul Sunnucks

-----

Petri Kemppainen, PhD student
Tjärnö Marine Biological Laboratory, 45296, Strömstad, Sweden
Tel: +46 526 686 83
Fax: +46 526 686 07
Mob: +46 709 360 124
petri.kemppainen@marecol.gu.se
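
-----

A note on the posterior-summing idea from Chris's reply above, for anyone who wants to try it. This is only a minimal sketch in Python, not anyone's actual analysis code. It assumes you have already exported the marginal posterior of M as evenly spaced bin midpoints with a relative density for each bin; the function name tail_probability and all the numbers are made up for illustration. The share of the curve at or above a benchmark (here M = 1) approximates the posterior probability of that hypothesis.

```python
def tail_probability(m_values, posterior, threshold):
    """Share of posterior mass at or above `threshold`.

    Assumes evenly spaced bins, so sums of densities are
    proportional to areas under the curve.
    """
    if not m_values or len(m_values) != len(posterior):
        raise ValueError("m_values and posterior must be non-empty and of equal length")
    total = sum(posterior)
    if total <= 0:
        raise ValueError("posterior must have positive total mass")
    upper = sum(p for m, p in zip(m_values, posterior) if m >= threshold)
    return upper / total


# Hypothetical binned posterior for M (illustration only, not real data).
m_values = [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3]
posterior = [10.0, 30.0, 25.0, 20.0, 10.0, 4.0, 1.0]

p_ge_1 = tail_probability(m_values, posterior, 1.0)
print(f"P(M >= 1 | data) = {p_ge_1:.3f}")
print(f"P(M <  1 | data) = {1.0 - p_ge_1:.3f}")
```

If the share at or above 1 is small, one could argue that M >= 1 is rejected; the complementary share speaks to the hypothesis M < 1. As Chris points out, the boundary at zero means this kind of sum cannot test whether M differs from zero.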