Dear all,

Many thanks to all of you who answered my question on null allele estimates. As requested by many of you, below are the answers I received, together with those of Mike Ritchie, who had a similar question and kindly sent me the answers he got.

I hope it helps,
Best wishes
Carolina

Original post:

Dear all,

There are currently several methodologies available to estimate null allele frequencies (and the evidence for them) when working with microsatellite data. My question is: what is the best method to use? If anyone can shed some light on it, I'll appreciate it.

Thank you very much,
Sincerely,
Carolina

Carolina I. Miño, MSc.

ANSWERS I RECEIVED

The routine method is the maximum likelihood method with the EM algorithm. You can use the likelihood ratio test or information theoretic indices for statistical support.

Best
Xuhua

Xuhua Xia, Professor of Biology
CAREG and Biology Department
University of Ottawa
30 Marie Curie, P.O. Box 450, Station A
Ottawa, Ontario Canada K1N 6N5
Tel: (613) 562-5800 ext 6886
Fax: (613) 562-5486
URL: http://dambe.bio.uottawa.ca

Hi Carolina,

I use Microchecker and Cervus to check our data sets for the presence and frequencies of null alleles (nf). I generally cull loci with nf>0.10 for population genetic studies and nf>0.05 for parentage analyses. I hope this helps.

Cheers,
Rick

Rick Brenneman, PhD
Conservation Geneticist
Center for Conservation and Research
Omaha's Henry Doorly Zoo
3701 South 10th Street, Omaha, NE 68107
Office: 402-738-6904
Fax: 402-733-0490

Hola Carolina,

I imagine you will receive many replies to your question, as you are correct: there are many methods available to estimate the suspected null allele frequencies. However, perhaps this paper is of interest? We published a paper a few years ago estimating null alleles and mutations, as I was dealing with real cases of erroneous estimates of illegitimate offspring in a seabird species we were studying.
Perhaps it is useful if you are trying to determine what the case is for specific individuals if you are working on parentage studies. (A formula example is given in the paper. There is a small typo on pg. 213 where it says 'The probability of randomly sampling from the same breeding site a chick and a male that appear by chance to be related was calculated as 1 in 216,000 or less', which should read 'was calculated as 1 in 11111 or less'. This corresponds to dividing 1 by the more conservative probability given above in the same paragraph, 9.0 x 10^-5.)

Best wishes,
Gabriela Ibarguchi

MIKE RITCHIE’S ANSWERS

Good news on null alleles.

Very many thanks to all those who replied to my query about null alleles and population structure. I had around 50 answers, which is testament to the extent of the problem.

Shortly before my submission Chapuis & Estoup published a simulation of the effects of various ways of estimating the frequency of “the” null allele and adjusting Fst estimates. They have also developed software to do this, which will surely become an essential resource:

Chapuis & Estoup 2007. Microsatellite null alleles and estimation of population differentiation. Mol. Biol. Evol. 24: 621-631.
http://www.montpellier.inra.fr/URLB/

Van Oosterhout’s MICROCHECKER (the current version is a major update from the last time I looked) applies one of these and is available here:

http://www.microchecker.hull.ac.uk/

van Oosterhout, C., D. Weetman, and W. F. Hutchinson. 2006. Estimation and adjustment of microsatellite null alleles in nonequilibrium populations. Mol Ecol Notes 6:255-256.

So, if you are prepared to live with the assumptions of these (the main ones being that your problem is due to a single null, and no Wahlund effects) there is great scope for Fst-style analyses.

A nice surprise to me is that an imminent release of STRUCTURE allows treating loci as dominant markers, so allowing analyses of potential sub-structure despite nulls.
This is not available yet, but is to be released soon on the STRUCTURE website, and Daniel Falush and colleagues have a paper just out in MEN on this.

http://www.blackwell-synergy.com/doi/abs/10.1111/j.1471-8286.2007.01758.x
http://pritch.bsd.uchicago.edu/structure.html

It may be worth mentioning here that there are also methods for relatedness and parentage:

Wagner AP, Creel S, Kalinowski ST (2006) Estimating relatedness and relationships using microsatellite loci with null alleles. Heredity 97:336-345.

I will not send all the responses I received out on EvolDir, as these are lengthy and the new developments bypass many of them, but I have compiled all of the answers here:

http://bio.st-and.ac.uk/supplemental/ritchie/NullResponses.html

I would like to finish with two nice quotes:

My instincts are to think a lot about the data, keep your eye on the biology and total evidence, and be prepared to argue your case with editors. Sometimes they are unreasonable, but that can happen on any issue.
Paul Sunnucks

If we combine our problems, maybe there is food for an article "S.O.S. - Save Our Samples: survival guide to get the best out of the worst". This might get a lot of citations.
Dieter Anseeuw

Best wishes & thanks again,
Mike

A rather poorly edited compilation of many of the responses (they are still coming in but are mainly pleas to see this). [Note that I have indicated attachments but do not provide them]

---------- Forwarded message ----------
Date: Fri, 23 Mar 2007 10:21:31 -0700
From: Mark Camara
To: mgr@st-andrews.ac.uk
Subject: RE: Other: Nulls and pop structure

Mike-

I'd definitely be interested in any replies you get on this question. I work on oysters, and null alleles are a huge problem. Pretty much every microsat we look at has null alleles, sometimes at frequencies of 20-30%, so there's really no way around them. We're now using AFLPs out of desperation with the rationale that these can be treated as biallelic (i.e. null vs.
detectable), but this also has issues as there are probably multiple nulls in reality. It's possible to estimate null frequencies for microsats, but this would also require assuming a single null allele for analysis, which is unlikely to hold.

Best-
Mark D. Camara
USDA/ARS Shellfish Genetics
OSU - Hatfield Marine Science Center
2030 SE Marine Science Dr.
Newport, OR 97365
Office: 541-867-0296
Fax: 541-867-0138
Mailto: Mark.Camara@oregonstate.edu

---------- Forwarded message ----------
Date: Fri, 23 Mar 2007 09:52:15 -0600
From: "Kalinowski, Steven"
To: mgr@st-andrews.ac.uk
Subject: Nulls and pop structure

Hi Mike,

I agree that the question has not received the attention it deserves, which is unfortunate, because the questions you raise would not be difficult to answer. Simulation could be used to see how null alleles affect estimates of FST etc, and then, most, if not all, methods for analyzing data could be extended to accommodate null alleles.

See Wagner et al. (2006) for an example of how null alleles were included in relatedness calculations.

Wagner AP, Creel S, Kalinowski ST (2006) Estimating relatedness and relationships using microsatellite loci with null alleles. Heredity 97:336-345.

See Kalinowski and Taper (2006) for a paper describing an improved method for estimating the frequency of null alleles in populations in Hardy-Weinberg equilibrium.

Kalinowski ST, ML Taper (2006) Maximum likelihood estimation of the frequency of null alleles at microsatellite loci. Conservation Genetics 7:991-995.

Good luck.
Steven Kalinowski

---------- Forwarded message ----------
Date: Fri, 23 Mar 2007 09:33:21 -0600
From: Ruth Hufbauer
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

Dear Mike,

I'd like to see the responses you get to this. My solution generally has been to toss loci with too many nulls (e.g. 25%) and try to keep the rest. Then hope, as you said, that reviewers will be kind.

Best,
Ruth

Ruth A.
Hufbauer
Associate Professor
http://lamar.colostate.edu/~hufbauer/
Department of Bioagricultural Sciences and Pest Management
Colorado State University
1177 Campus Mail
Fort Collins, CO 80523-1177 USA
office: C147 Plant Sciences (970) 491-6945
lab: E113/115 Plant Sciences (970) 491-5984
fax: (970) 491-3862
email: hufbauer@lamar.colostate

Date: Fri, 23 Mar 2007 08:16:43 -0700
From: Jennifer DeWoody
To: mgr@st-andrews.ac.uk
Subject: Nulls.and.pop.structure

Hello, Dr. Ritchie,

I have had similar experiences trying to estimate population structure from microsatellite data containing nulls. I used van Oosterhout's program, which detects nulls and other scoring errors and then recalculates allele frequencies taking into account the perceived frequency of the null at each locus. This provides allele frequencies for the actual null allele as well, and allows you to use the allele frequencies in programs like Spatial Genetic Structure (SGS) or even PHYLIP to conduct population-level analyses. I know of no way to adjust the genotypic data to reflect nulls (short of sequencing every sample), which means that any individual-based analyses (inbreeding, fine-scale population structure, parentage) will still be biased by the nulls.

I am attaching van Oosterhout's most recent paper describing the revised version of his program, as well as a short review I co-authored on the challenges of using SSRs in these applications. Hope you find them helpful.

I'd be interested in reading the other feedback you get. How to treat nulls (and other scoring errors) is a persistent and difficult question when using this powerful marker type.

Thanks!
Cheers,
Jenn

(See attached file: VanOosterhout et al MolEcoNotes 2006.pdf)
(See attached file: DeWoody et al mitigating errors MolEcoNotes 2006.pdf)

DeWoody, J., J. D. Nason, and V. D. Hipkins. 2006. Mitigating scoring errors in microsatellite data from wild populations. Mol Ecol Notes 6:951-957.
Jennifer DeWoody, Biologist
USDA Forest Service, NFGEL
2480 Carson Road, Placerville, CA 95667
530-295-3028 (voice), 530-622-2633 (fax)
http://www.fs.fed.us/psw/programs/nfgel/

From par.ingvarsson@emg.umu.se Fri Mar 23 06:19:03 2007
Date: Fri, 23 Mar 2007 07:18:56 +0100
From: Pelle Ingvarsson
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

Dear Mike,

There's a recent paper out in MBE by Chapuis and Estoup (Microsatellite Null Alleles and Estimation of Population Differentiation, Mol Biol Evol 2007 24: 621-631) that describes a method for estimating null allele frequencies and that allows you to calculate Fst with or without null alleles. It might help you out. They also have a nice discussion about population differentiation with null alleles. It's definitely worth a look.

Cheers,
-Pelle

--
Pär K. Ingvarsson
Senior Researcher, Swedish Research Council
Associate Professor
Umeå Plant Science Centre
Department of Ecology and Environmental Science
Umeå University, SE-901 87 Umeå
tel. +46-(0)90-786-7414, fax. +46-(0)90-786-6705
web: http://mendel.eg.umu.se

From smouse@ias.huji.ac.il Fri Mar 23 06:27:28 2007
Date: Fri, 23 Mar 2007 08:26:56 +0200
From: Peter Smouse
To: mgr@st-andrews.ac.uk
Subject: RE: Other: Nulls and pop structure

Mike, quick answer for the moment.

Several years ago, I looked into this problem (even before microsats, we had null alleles, but usually not quite so bad), and I discovered that if you were to toss out the 'no types' as lab-artifacts (sometimes the magic doesn't work), then Wahlund Effect and Null alleles for single loci were indistinguishable in their effects. Both F-stat routines and Amova are vulnerable, and I don't see any way out. If you are comfortable treating the 'no-types' as null homozygotes, however, you can do a likelihood analysis that sorts it out, at least mathematically. Of course, that assumes that you have no lab artifacts, a big assumption with microsats.
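To illustrate the point above that a Wahlund effect and null alleles leave the same single-locus signature, here is a minimal Python sketch. It is not from the original message: the function names, the two-subpopulation setup and the allele frequencies are all assumptions chosen for illustration. Both functions assume Hardy-Weinberg proportions within each true breeding unit, and the null-allele case assumes that null homozygotes are discarded as failed reactions.

```python
def f_from_wahlund(p_by_subpop):
    """Apparent inbreeding coefficient when equally sized subpopulations,
    each in HWE at a biallelic locus, are pooled into one sample.
    p_by_subpop: frequency of allele A in each subpopulation (hypothetical)."""
    n = len(p_by_subpop)
    # Observed heterozygosity is the mean of the within-subpopulation 2pq.
    h_obs = sum(2 * p * (1 - p) for p in p_by_subpop) / n
    # Expected heterozygosity is computed from the pooled allele frequency.
    p_bar = sum(p_by_subpop) / n
    h_exp = 2 * p_bar * (1 - p_bar)
    return 1 - h_obs / h_exp


def f_from_null(p_a, p_b, r):
    """Apparent inbreeding coefficient in a single HWE population with two
    visible alleles (frequencies p_a, p_b) and one null allele (frequency r),
    where p_a + p_b + r == 1 and null homozygotes are dropped as 'no types'."""
    scored = 1 - r ** 2                      # fraction of individuals that amplify
    het = 2 * p_a * p_b / scored             # scored A/B heterozygotes
    hom_a = (p_a ** 2 + 2 * p_a * r) / scored  # A/A plus A/null, both scored as A/A
    freq_a = hom_a + het / 2                 # apparent frequency of A
    freq_b = 1 - freq_a                      # apparent frequency of B
    h_exp = 2 * freq_a * freq_b
    return 1 - het / h_exp


if __name__ == "__main__":
    print("Wahlund effect:", f_from_wahlund([0.2, 0.8]))
    print("Null allele:   ", f_from_null(0.4, 0.4, 0.2))
```

Both calls return a clearly positive value, so a heterozygote deficit at one locus cannot by itself tell the two explanations apart.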
What was not clear to me then (and it still is not today) was what nulls would do to cross-locus disequilibrium. Legitimate Wahlund Effect also gives you (virtually impossible to avoid) cross-locus associations, usually viewed as disequilibrium. I'm not sure what to expect from null alleles, unless you can assume (can you?) that null alleles are not just generic lab problems (failure to amplify) with the assay that might affect more than one locus.

Must run for now, but yell if you need more. Love to hear all the responses.

Peter Smouse

From n.barton@ed.ac.uk Fri Mar 23 06:32:25 2007
Date: Fri, 23 Mar 2007 06:32:06 +0000
From: Nick Barton
To: mgr@st-andrews.ac.uk
Subject: Nulls

Dear Mike,

In the attached, we use a ML method to estimate null frequencies & also use information from missing data. John Brookfield also has a paper on this: there's a discussion in the Goodman et al paper.

Hope this helps,
Nick

Goodman, S. J., N. H. Barton, G. Swanson, K. Abernethy, and J. M. Pemberton. 1999. Introgression through rare hybridization: A genetic study of a hybrid zone between red and sika deer (genus Cervus) in Argyll, Scotland. Genetics 152:355-371.

From Paul.Sunnucks@sci.monash.edu.au Fri Mar 23 06:32:39 2007
Date: Fri, 23 Mar 2007 17:32:26 +1100
From: Paul Sunnucks
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

hi Mike

this issue has been going on for a long time, and I agree that a centralized solution has not been particularly forthcoming. There has been a lot of hysteria about nulls, although of course in an ideal world they would be detected before you spend too much time and money on them, and avoided. As you suggest, sometimes the only loci you can get for an organism have problems with nulls - this will be endemic for certain taxonomic groups (see below), and over-reaction to this situation will have harmful impacts on the biology of certain taxa.
There are a lot of nulls out there in real life, particularly in species with large Ne and thus long coalescence times, or ones with high flanking region mutation rate relative to slippage. Invertebrates seem to suffer disproportionately. The invertebrate projects in my lab encounter nulls frequently, the most extreme case being nearly all 20 loci in

Wilson ACC, Sunnucks P, Barker JSF (2002) Isolation and characterization of 20 polymorphic microsatellite loci for Scaptodrosophila hibisci. Molecular Ecology Notes, 2, 242-244.

(Ironically, the papers following from this, like the one below, got put through the wringer simply because Stewart Barker had done much more than usual - using large and numerous pedigrees - to show that there were nulls. Usually people just pretend they are not there, and massage them away with Bonferroni correction!).

Two recent papers do a good job with nulls

Selkoe KA, Toonen RJ (2006) Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters, 9, 615-629.

particularly how to identify them (without having to do anything too outlandish)

Barker JSF (2005) Population structure and host-plant specialization in two Scaptodrosophila flower-breeding species. Heredity, 94, 129-138.

contains the nearest I know of to a manual for how to deal with nulls (see 'wringer' comment above!).

There are a number of issues.

(1) are they really nulls? You have probably checked, but many people do not do this well. Homozygous excess does not necessarily signal nulls - it is at least as likely to be Wahlund effect. Wahlund effect is relatively easily checked for in a genotypic data set, because you can look for groups of individuals that make sensible groupings of like-genotypes, and share some characteristic such as common sublocation, cohort or some other biologically meaningful feature. People often don't check their markers for sex-linkage, which leads to homozygous excess in the homogametic sex.
If there are many nulls, there should be many putative homozygous nulls, which you can count. How do you tell homozygous null from failed PCR? (a) repeat failed reactions, (b) multiplex as internal control for PCR.

(2) do nulls really matter?

(a) If you have lots of loci, and nulls are just as likely to be at equal frequencies through samples, nulls might have very little effect. You can use the proportion of homozygous nulls as an estimate of whether nulls are at approximately equal frequency. You can fiddle with data sets to see if nulls really matter - eg assume that all homozygotes actually have a null allele, replace the assumed null with a non-existent allele (so that the other allele will be counted only once and not twice), recalculate everything and see if it makes much difference. Even when nulls are not equally distributed among samples, this probably means something (nulls are alleles too - they may not be any more heterogeneous in sequence than alleles that people accept simply because they are the same size. However, whenever we have looked into allele size homoplasy, it has been quite common).

(b) genotypic analyses (assignment tests etc) might be a lot less sensitive to nulls, because the signal is a distilled essence of the individual over many loci.

Don't know if that helps, but my instincts are to think a lot about the data, keep your eye on the biology and total evidence, and be prepared to argue your case with editors. Sometimes they are unreasonable, but that can happen on any issue.
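The "fiddle with data sets" check in point (2)(a) above can be sketched in Python as follows. This is only an illustration under assumptions: the genotype layout (a list of allele-size pairs for one locus), the dummy allele code and both function names are hypothetical and are not taken from any of the programs mentioned in this thread.

```python
from collections import Counter

DUMMY_NULL = -1  # hypothetical code for the "non-existent allele"


def recode_homozygotes(genotypes, dummy=DUMMY_NULL):
    """Worst-case recoding: treat every apparent homozygote (a, a) as a
    heterozygote (a, dummy), so allele a is counted once instead of twice."""
    return [(a, dummy) if a == b else (a, b) for a, b in genotypes]


def allele_frequencies(genotypes):
    """Plain allele counting over a list of (allele, allele) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


if __name__ == "__main__":
    # Made-up genotypes at one locus (allele sizes in bp).
    locus = [(100, 100), (100, 104), (104, 104), (100, 108)]
    print("as scored:", allele_frequencies(locus))
    print("recoded:  ", allele_frequencies(recode_homozygotes(locus)))
```

Running the downstream statistics (Fst, distances, assignment) on both the scored and the recoded data set then brackets how much the conclusions could depend on nulls.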
Paul

--
Dr Paul Sunnucks
Senior Lecturer in Zoology
School of Biological Sciences
Monash University
Clayton Campus 3800
Victoria Australia
email Paul.Sunnucks@sci.monash.edu.au
phone + 61 3 99059593
http://www.biolsci.monash.edu.au/staff/sunnucks/index.html

From Bo.Simonsen@forensic.ku.dk Fri Mar 23 07:04:46 2007
Date: Fri, 23 Mar 2007 08:05:07 +0100
From: Bo Thisted Simonsen
To: mgr@st-andrews.ac.uk
Subject: VS: Other: Nulls and pop structure

Dear Mike,

You have a good point! I certainly agree that nulls are a problem you have to address if you want to do good and trustworthy analyses of population structure etc. I mainly do paternity and other kinship testing (on humans), and for these calculations of kinship indices it is possible to include the possibility of silent alleles (null alleles). However, overall I think the best approach, if you have good reason to suspect that you have a relatively high frequency of null alleles, will be to redesign your primers and get the missing alleles. Costly and time-consuming, but maybe the best way not to have to rely on kind and understanding reviewers...

Good luck,
Bo

Bo Thisted Simonsen
Forensic Geneticist, MSc., Ph.D
Department of Forensic Genetics
Institute of Forensic Medicine
11 Frederik V's Vej
DK-2100 Copenhagen
e-mail: bo.simonsen@forensic.ku.dk
Tel (+45) 3532 6110
Fax (+45) 3532 6120

From gernot.segelbacher@wildlife.uni-freiburg.de Fri Mar 23 07:17:23 2007
Date: Fri, 23 Mar 2007 08:17:33 +0100
From: Gernot Segelbacher
To: mgr@st-andrews.ac.uk
Subject: null alleles

Dear Mike,

good you brought this issue to the attention of evoldir. I would be very much interested in receiving any responses to that. In my experience there is either one of two strategies:

- omitting all loci which have high rates of null alleles (but there is no defined threshold for what is high) and only using the ones which are in HW equilibrium,

or

- using all loci and stating that some of them have high frequencies of null alleles.
Comparing both strategies in my datasets I did not find significant differences in estimates of FST and population differentiation. Of course if you start estimating pedigrees and paternity analyses you need to be more cautious. The only reference I know which is dealing with this problem is (Dakin & Avise 2004).

would be interesting to see if there is a consensus on dealing with that problem.

cheers
Gernot

--
Dr. Gernot Segelbacher
Wildlife Ecology and Management
University Freiburg
Tennenbacher Str. 4
D-79106 Freiburg
Germany
gernot.segelbacher@wildlife.uni-freiburg.de
http://www.wildlife.uni-freiburg.de/team/segelbacher

From luisa.orsini@helsinki.fi Fri Mar 23 07:46:17 2007
Date: Fri, 23 Mar 2007 09:46:23 +0200
From: Luisa Orsini
To: mgr@st-andrews.ac.uk
Subject: null alleles

Dear Mike,

I was going to post the same enquiry on evoldir list. I have the same problem as you and it applies to microsatellites and possibly SNPs in my species (Glanville fritillary butterfly). I would very much appreciate it if you could share suggestions on how to treat the null allele problem.

thank you
best wishes
Luisa Orsini

Dr Luisa Orsini
Metapopulation Research Group
Department of Biological and Environmental Sciences
PO Box 65 (Viikinkaari 1)
00014 University of Helsinki
Finland/Europe
Phone: +358 9 191 57737
Fax: +358 9 191 57694
e-mail addresses: luisa.orsini@helsinki.fi
http://www.helsinki.fi/science/metapop/english/People/Luisa.htm

From Dieter.Anseeuw@kuleuven-kortrijk.be Fri Mar 23 08:08:23 2007
Date: Fri, 23 Mar 2007 09:09:46 +0100
From: Dieter Anseeuw
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

Hi Mike,

same problem here. Please, keep me informed of the responses you get on your query.
As an overall suggestion, I think (depending on the number of loci + how badly assumptions are violated, and can be corrected for) that it may still be possible to draw certain conclusions on the population level, but when it comes to individual-based analyses (like structure, baps, ...) it becomes a bit too tricky. But, you may want to get in contact with Dr. van Oosterhout, who is settled in Hull and has quite some experience with nulls.

Hehe, if we combine our problems, maybe there is food for an article "S.O.S. - Save Our Samples: survival guide to get the best out of the worst". This might get a lot of citations, though.

The email address is c.van-oosterhout@hull.ac.uk

Keep in touch,
Best regards,
Dieter Anseeuw

--
Dieter Anseeuw
Katholieke Universiteit Leuven Campus Kortrijk
Subfaculteit Wetenschappen
Etienne Sabbelaan 53
B-8500 Kortrijk
Belgium
Direct phone: +32.(0)56.24.61.72
Fax: +32.(0)56.24.69.99
Skype contact: Diddeka
http://www.kuleuven-kortrijk.be/~danseeuw
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

From cpapetti@mail.bio.unipd.it Fri Mar 23 08:24:07 2007
Date: Fri, 23 Mar 2007 09:23:37 +0100
From: chiara papetti
To: mgr@st-andrews.ac.uk
Subject: Nulls and pop structure

Dear Mike,

I read your evoldir mail and I agree with you that the nulls problem is quite widespread (I have got the same problem with my micros) while the literature seems to lack a "best solution" for this. I tried different approaches, for instance "MICROCHECKER genotypes correction", but I still don't think this is the best way. Could you please let me know about the answers you will get?

Thank you very much
Best regards
Chiara

Dr. Chiara Papetti
Biology Dept
University of Padova
Via G.
Colombo
I-35100 Padova
Italy
e-mail cpapetti@bio.unipd.it chiara.papetti@unipd.it
Tel 0039 049 8276222

From tbata@daimi.au.dk Fri Mar 23 08:25:17 2007
Date: Fri, 23 Mar 2007 09:25:09 +0100
From: Thomas Bataillon
To: mgr@st-andrews.ac.uk
Subject: null alleles and Fst

Dear Mike

I saw your post on Evoldir and you may want to look into a recent paper published in MBE entitled Microsatellite Null Alleles and Estimation of Population Differentiation by Chapuis and Estoup

http://mbe.oxfordjournals.org/cgi/content/short/24/3/621

Best
Thomas.

--
Thomas Bataillon
INRA-UMR 1097 (Montpellier, France) & University of Aarhus (Denmark).
Email: tbata@daimi.au.dk
Homepage: http://www.daimi.au.dk/~tbata
Tel +45 89 42 33 59
Fax +45 89 42 30 77
Working address:
BiRC - Bioinformatics Research Center
University of Aarhus
Hoegh-Guldbergs Gade 10, Building 1090
DK-8000 Aarhus C.
Denmark

From thierry.demeeus@mpl.ird.fr Fri Mar 23 09:28:36 2007
Date: Fri, 23 Mar 2007 10:27:55 +0100
From: "Thierry de Meeûs"
To: mgr@st-andrews.ac.uk
Subject: Nulls and pop structure

Dear Mike

Null alleles influence population structure by increasing the response variance and thus decrease the power of your tests. The G based test of Fstat with the option not assuming HW is independent of intra-individual correlation between alleles. It should thus work without major problems. As far as I understood the BAPS and Structure documentation, these programs are multilocus based and not "heterozygosity" based. I think these should work OK with a probable loss of accuracy.

Nevertheless, nobody (as far as I know) has ever precisely analysed the extent of the impact that null alleles and their frequency may have on statistical decisions regarding population differentiation and on biases in the inferences (Nm or DSigma² estimates) that can be made. To my knowledge, much less can be found on more recent techniques such as clustering and assignment procedures.
My advice would be to compensate with the nicest possible sampling designs (more individuals in more restricted areas). If some of your loci have no nulls then you can also use them as safeguards to check for possible biases coming from loci with nulls.

Good luck
Sincerely
Thierry

Thierry de Meeûs
Génétique et Evolution des Maladies Infectieuses
UMR CNRS/IRD 2724, UR IRD 165
Team: Structures Génétiques et Adaptation dans les Systèmes Symbiotiques (SGASS)
Centre IRD de Montpellier
911 Avenue Agropolis, B.P. 64501
34394 Montpellier Cedex 5, France.
Tel: +33 (0)467 41 63 10
Secretariat: +33 (0)467 41 61 97
Fax: +33 (0)467 41 62 99
http://gemi.mpl.ird.fr/cepm/SiteWebESS/Fr/deMeeus/TdeMeeus.html
Website of the group "Tiques et maladies à tiques"
http://gemi.mpl.ird.fr/cepm/SiteWebESS/GroupeTiqueREID/GroupeTiqueREID.html

From Martim.PINHEIRO-DE-MELO@cefe.cnrs.fr Fri Mar 23 09:47:28 2007
Date: Fri, 23 Mar 2007 10:47:05 +0100
From: Martim PINHEIRO-DE-MELO
To: mgr@st-andrews.ac.uk
Subject: RE: Other: Nulls and pop structure

Dear Mike,

I think FreeNA attempts to address what you are looking for.

All the best
Martim

FreeNA: http://www.montpellier.inra.fr/URLB/
Marie-Pierre Chapuis & Arnaud Estoup

FreeNA is a PC computer program which performs three major tasks on any microsatellite dataset harboring null alleles:
(1) it estimates null allele frequencies for each locus and population analysed following the Expectation Maximization (EM) algorithm of Dempster, Laird, and Rubin (1977);
(2) it estimates unbiased FST (Weir 1996) following the ENA method described in Chapuis and Estoup (in press);
(3) it computes a genotype dataset corrected for null alleles following the INA method described in Chapuis and Estoup (in press), which may be used to calculate the Cavalli-Sforza and Edwards' (1967) genetic distance.
All methods used in this program are tested and discussed in Chapuis and Estoup (in press).
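For readers who want to see what an EM estimate of a null allele frequency looks like in practice, here is a minimal Python sketch for a single locus in a single population. It is an illustration under stated assumptions, not FreeNA's code: it assumes Hardy-Weinberg proportions, a single null allele, and that non-amplifying individuals (n_blank) are true null homozygotes. The function name, argument names and starting values are hypothetical.

```python
from collections import Counter


def em_null_frequency(genotypes, n_blank=0, max_iter=500, tol=1e-10):
    """EM estimate of the null allele frequency r and visible allele
    frequencies p at one locus, assuming HWE and a single null allele.

    genotypes: list of scored (allele, allele) pairs.
    n_blank:   individuals with no product, assumed to be null homozygotes.
    """
    n_total = len(genotypes) + n_blank
    hom = Counter(a for a, b in genotypes if a == b)   # apparent homozygotes
    het = Counter()                                    # allele copies in heterozygotes
    for a, b in genotypes:
        if a != b:
            het[a] += 1
            het[b] += 1
    alleles = sorted(set(hom) | set(het))

    r = 0.1                                            # arbitrary starting value
    p = {a: (1 - r) / len(alleles) for a in alleles}
    for _ in range(max_iter):
        # E-step: split each apparent homozygote a/a into true a/a and a/null.
        null_copies = 2.0 * n_blank
        copies = {}
        for a in alleles:
            w = 2 * r / (p[a] + 2 * r)   # P(true genotype is a/null | scored a/a)
            copies[a] = het[a] + hom[a] * (2 - w)
            null_copies += hom[a] * w
        # M-step: allele frequencies are expected copies over 2N.
        new_r = null_copies / (2 * n_total)
        p = {a: copies[a] / (2 * n_total) for a in alleles}
        converged = abs(new_r - r) < tol
        r = new_r
        if converged:
            break
    return r, p


if __name__ == "__main__":
    # Made-up counts: 32 A/A, 32 B/B, 32 A/B scored, plus 4 blanks.
    data = [("A", "A")] * 32 + [("B", "B")] * 32 + [("A", "B")] * 32
    r_hat, p_hat = em_null_frequency(data, n_blank=4)
    print(r_hat, p_hat)
```

The made-up counts in the example are exactly those expected for 100 individuals with two visible alleles at frequency 0.4 each and a null at 0.2, so the estimate should settle near r = 0.2. If blanks may also be failed reactions, passing n_blank=0 gives the alternative treatment, and the two estimates will differ.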
The paper has now been published:

Marie-Pierre Chapuis and Arnaud Estoup. Microsatellite Null Alleles and Estimation of Population Differentiation. Mol. Biol. Evol., March 2007; 24: 621 - 631.

From philippe.jarne@cefe.cnrs.fr Fri Mar 23 09:48:41 2007
Date: Fri, 23 Mar 2007 10:48:21 +0100
From: Philippe JARNE
To: mgr@st-andrews.ac.uk
Subject: RE: Other: Nulls and pop structure

Dear Mike,

I have been struggling with similar problems over the years. Chapuis & Estoup have a paper which is either in press, or was published recently, on the influence of nulls on differentiation, and the good news is that they have a minor effect on genetic distance, and a slightly larger one on Fst. Fis is of course strongly affected, and you might be interested in using a recently developed technique based on multilocus heterozygosity for estimating inbreeding (it was indeed developed for estimating the selfing rate, but can be used in any species) which is not affected by nulls. If your species has separate sexes and you expect it on the whole to be at HW equilibrium, the difference between F and multilocus estimates of inbreeding even gives you an idea about the magnitude of nulls. This is by P. David and co-workers and in press in Mol Ecol. Software can be downloaded from my institute web site (cefe.cnrs.fr).

Cheers,
Philippe

CEFE-CNRS
1919 route de Mende
34293 Montpellier cedex 5
France
phone (0)4 67 61 32 27
fax (0)4 67 41 21 38

-----Original message-----

From anne.chenuil-maurel@univmed.fr Fri Mar 23 09:50:44 2007
Date: Fri, 23 Mar 2007 10:50:36 +0100
From: Anne CHENUIL-MAUREL
To: mgr@st-andrews.ac.uk

Dear Mike

Please send me the answers you will get. I have the same problem. It is a good idea to send this call. I would think that for population differentiation (Fst permutation and exact tests) null alleles would not bias the results in a dangerous way... If differentiation is observed it may not be an artefact of null alleles.
Null alleles would rather mask differentiation (conservative).

thank you in advance
Anne

Quoting Dieter Anseeuw :

From C.Van-Oosterhout@hull.ac.uk Fri Mar 23 10:20:11 2007
Date: Fri, 23 Mar 2007 10:20:02 +0000
From: C.Van-Oosterhout@hull.ac.uk
To: Dieter.Anseeuw@kuleuven-kortrijk.be
Cc: mgr@st-andrews.ac.uk, Bill
Subject: Re: Other: Nulls and pop structure

Hi Mike and Dieter,

As you are probably both aware, there are correction algorithms for null alleles which reduce the bias in visible allele frequencies and estimate the null frequency. Micro-Checker has algorithms that do just that and is pretty easy to use. Presently, this is the accepted way of dealing with this problem.

Unpublished simulation studies show that the effects of nulls are unpredictable in that they can over- or underestimate the observed level of differentiation (Fst, Gst). The null allele correction reduces the error. I should really be writing this up but am too busy with other things.

Sorry not to be of more help. Mike, what type of analyses were you thinking of using?

Best wishes,
Cock

--
Dr. Cock van Oosterhout
University of Hull
Hull HU6 7RX, UK
Tel.: +44(0)1482 465505
Tel.: +44(0)1482 466434
Fax.: +44(0)1482 465458
http://www.hull.ac.uk/gyro-scope/
http://www.microchecker.hull.ac.uk/

Quoting Dieter Anseeuw :

From M.Mc-Mullan@biosci.hull.ac.uk Fri Mar 23 11:34:52 2007
Date: Fri, 23 Mar 2007 11:34:24 -0000
From: Mark McMullan
To: mgr@st-andrews.ac.uk
Subject: Null alleles

Hello there,

I'm in the first year of my PhD. I've not used microsatellite analysis yet but I am working with MHC, renowned for high numbers of alleles. The van Oosterhout et al. 2006 Heredity paper showed the likelihood of having found all alleles by plotting discovery curves. These are curves of the number of alleles discovered against the number of clones screened. As you screen more clones you expect to find fewer new alleles.
As this curve levels off you can say with some certainty that there are few alleles left to discover, or even make an assumption as to how many alleles you think are left undiscovered.

Let me know if this is any help, or even if it's not,
Mark

From jordanma@ipfw.edu Fri Mar 23 13:10:23 2007
Date: Fri, 23 Mar 2007 09:10:11 -0400
From: Mark Jordan
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

Hello Mike,

Thanks for bringing up this question as I am working on a similar problem. A partial solution that I have recently discovered is found in a new paper that addresses the effect of null alleles on population structure (Chapuis, M. and A. Estoup. 2007. Mol. Biol. Evol. 24:621-631). Fst and genetic distance are considered in simulations and two empirical datasets. Software that "corrects" a dataset for null alleles so that both parameters can be estimated is available (FreeNA, http://www.montpellier.inra.fr/URLB/ ). I may be wrong, but I imagine that the corrected dataset could be used in Structure or BAPS as well.

I look forward to learning of other ideas generated from your query. Although it is not part of your question, if you get responses that discuss measures of genetic diversity in this context I would like to know of them.

Regards,
Mark

Mark A. Jordan, Ph.D.
Assistant Professor
Department of Biology
Indiana University-Purdue University
2101 E. Coliseum Blvd.
Fort Wayne, IN 46805-1499
Phone: 260-481-6315
FAX: 260-481-6087
Email: jordanma@ipfw.edu
Web: http://users.ipfw.edu/JordanMA/index.htm

From elmerk@biology.queensu.ca Fri Mar 23 13:42:14 2007
Date: Fri, 23 Mar 2007 09:41:28 -0400
From: Kathryn Elmer
To: mgr@st-andrews.ac.uk
Subject: Re: Other: Nulls and pop structure

Mike,

I am not totally clear on what you've already covered, but the program Micro-checker can give you a "corrected" allele frequency (ie correcting for nulls) locus by locus but it assumes HWE.
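To show why an HWE-based correction of this kind depends so heavily on the HWE assumption, here is a small Python sketch of a moment estimator derived directly from that assumption. It is not necessarily what Micro-Checker computes; the function name and input layout are hypothetical. The reasoning: if a population is in HWE with one null allele at frequency r and null homozygotes are dropped, the observed heterozygosity Ho among scored individuals equals He * (1 - r) / (1 + r), where He is the heterozygosity expected from the apparent allele frequencies, so r = (He - Ho) / (He + Ho).

```python
from collections import Counter


def null_freq_from_het_deficit(genotypes):
    """Moment estimate of null allele frequency at one locus, assuming HWE,
    a single null allele, and that null homozygotes are absent from the data.
    genotypes: list of scored (allele, allele) pairs."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    # Expected heterozygosity from the apparent (scored) allele frequencies.
    h_exp = 1 - sum((c / (2 * n)) ** 2 for c in counts.values())
    # Observed heterozygosity among scored individuals.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    if h_exp + h_obs == 0:
        return 0.0  # monomorphic locus: the deficit carries no information
    return (h_exp - h_obs) / (h_exp + h_obs)


if __name__ == "__main__":
    data = [("A", "A")] * 32 + [("B", "B")] * 32 + [("A", "B")] * 32
    print(null_freq_from_het_deficit(data))
```

Because the estimate is driven entirely by the heterozygote deficit, anything else that produces a deficit (Wahlund effect, inbreeding, isolation by distance) will be reported as a null allele.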
There are also some options out of U Alberta, I think Strobeck's Lab, but I haven't used them. Myself I don't think lack of HWE is good evidence for nulls but it depends on the pattern of allele sizes and sampling strategy. For example, isolation by distance may give the same pattern of homozygote excess. Below are a couple of recent useful reviews.

It would definitely be interesting to develop a correction factor based on rate of null alleles. This would be tricky though because the homozygote could be a biological rather than artefactual reality. Null alleles are such a headache!

best,
Kathryn

Dakin, E. E., & Avise, J. C. (2004). Microsatellite null alleles in parentage analysis. Heredity, 93, 504-09.

DeWoody, J., Nason, J. D., & Hipkins, V. D. (2006). Mitigating scoring errors in microsatellite data from wild populations. Molecular Ecology Notes, 6(4), 951-57.

Kathryn R. Elmer, Ph.D.
elmerk@biology.queensu.ca
http://www.walled.net/~kelmer/
(416) 535-5838

From bwestfall@fs.fed.us Fri Mar 23 13:53:33 2007
Date: Fri, 23 Mar 2007 07:51:03 -0700
From: Bob Westfall
To: mgr@st-andrews.ac.uk
Subject: Fw: Other: Nulls and pop structure

Mike,

Because nulls are, in effect, 'missing' data, consider multiple imputation. See:

http://en.wikipedia.org/wiki/Imputation_%28statistics%29

Although there is not a lot of detail in the Wikipedia entry, there are a number of useful links. If you have access to SAS, the statistical package does have a MI procedure with a number of options of potential use for genetic data.

Bob

Robert D. Westfall, Quantitative Geneticist
Sierra Nevada Research Center
PSW Research Station, USDA Forest Service
Fall-Spring: PO Box 245, Berkeley, CA 94701 USA; PH: 510-559-6438; FAX: 510-559-6499
Summer: c/o Inyo National Forest, Box 429, Lee Vining, CA 93541 USA; PH: 760-647-3026; FAX: 760-647-3027
email: bwestfall@fs.fed.us

----- Forwarded by Bob Westfall/PSW/USDAFS on 03/23/07 06:43 AM -----

evoldir@evol.biology.mcmaster.ca wrote on 03/22/2007 09:43:43 PM:

Carolina Minio