Dear EvolDir members,

Thank you very much for your suggestions. The consensus seems to be the multitube approach and a pilot survey. All the communications are given below.

Original query: I amplify macaque faecal DNA using human microsatellite primers. In addition to the risk of contamination, the chance of allelic dropout is high due to the degraded condition of the source DNA. Many reviews discuss plans (e.g. Taberlet et al. 1999, Pompanon et al. 2005) and methods (e.g. the multiple-tubes approach) to tackle this problem as a subset of genotyping error in general, but I was wondering whether there is any consensus in the scientific community about the solution. Any other suggestions about paternity studies in macaques will also be appreciated. Thank you in advance. Deb

debapriyo@ncbs.res.in
--
Debapriyo Chakraborty
Research Scholar
Nature Conservation Foundation
3076/5, 4th Cross, Gokulam Park, Mysore - 570002, India
Website: www.ncf-india.org
--
Current Address: Laboratory III, National Centre For Biological Sciences, Tata Institute of Fundamental Research, University of Agricultural Sciences Campus, Bangalore - 560065, India
Telephone: 91-80-23666031
Website: www.ncbs.res.in

Debapriyo Chakraborty

Dear Deb, Without simplifying too much, you can categorise all microsatellite errors into two classes: stochastic (allelic dropout, false alleles) and systematic (null alleles, contamination). Given that you're using DNA from faeces and primers from another species, you can expect every type of error in your data. I don't know much about dealing with systematic errors in paternity analysis, but there is a review of the effect of null alleles (Dakin & Avise, Heredity 93:504-9). There is also a parentage analysis program that deals with null alleles (NewPatXL, available on Bill Amos' website http://www.zoo.cam.ac.uk/zoostaff/amos).
For general approaches to estimating null allele frequencies, see last Saturday's EvolDir post on null alleles and population differentiation. Regarding stochastic errors, I'm not sure there's a consensus, but I think the best strategy is:

1. Clean the data by repeat genotyping. How often? You'll never make it perfect, so I'd say repeat as often as possible without limiting the number of loci you type. For example, in a typical study I suspect that repeating 10 loci 4 times will give you more informative data than repeating 5 loci 8 times, even if the error rate is high (provided your analysis method uses a realistic error model - see below). There are also ways of targeting repeats on lower-quality samples (qPCR: Wandeler et al. Mol Ecol 12, 1087-1093; statistical: Miller et al. Genetics 160, 357-366).

2. Estimate the residual error rate in your data. This is typically done by duplicating a subset of your genotypes. You'll get the most out of your data if your analysis method (see below) can tell the difference between allelic dropouts and false alleles, in which case you need to estimate these rates separately. Dan Haydon and I have written a method (Johnson & Haydon, Genetics 175, 827-842) and program (http://www.stats.gla.ac.uk/~paulj/pedant.html) for doing this.

3. Perform the parentage analysis, incorporating your estimated error rate(s). The best approach, particularly for data with frequent errors, is Hadfield et al. Mol Ecol 15, 3715-3730. The program (a package for R) estimates and adjusts for allelic dropout and false allele rates, so you can skip step 2. There is also a sibship reconstruction method by Jinliang Wang (Colony, http://www.zoo.cam.ac.uk/ioz/software.htm; Genetics 166, 1963-1979) that uses separate error rates.
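[Editor's note] The multiple-tubes logic behind step 1 can be sketched as a simple consensus-calling rule. This is only an illustration in Python, not the method of any of the papers cited; the thresholds `min_het` and `min_hom` are assumed values that a pilot study would have to justify.

```python
import collections

def consensus_genotype(replicates, min_het=2, min_hom=3):
    """Call a consensus genotype for one locus from replicate PCRs.

    A minimal multiple-tubes rule (illustrative thresholds):
    - accept a heterozygote if both alleles were seen together in
      at least `min_het` independent replicates;
    - accept a homozygote only if at least `min_hom` replicates all
      showed the same single allele (guards against allelic dropout);
    - otherwise return None: the genotype stays ambiguous and the
      sample should be rerun.
    Failed PCRs are passed as None and ignored.
    """
    calls = [tuple(sorted(r)) for r in replicates if r is not None]
    counts = collections.Counter(calls)
    for geno, n in counts.most_common():
        if geno[0] != geno[1] and n >= min_het:
            return geno  # heterozygote seen in >= min_het replicates
        if geno[0] == geno[1] and n >= min_hom:
            return geno  # homozygote seen in >= min_hom replicates
    return None
```

For example, `consensus_genotype([(152, 156), (152, 152), (152, 156), None])` accepts the heterozygote (152, 156): the single homozygous observation is consistent with dropout of the 156 allele, so it does not block the call.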
Good luck, Paul
________
Paul Johnson
Robertson Centre for Biostatistics
Level 11, Boyd Orr Building
University of Glasgow
University Avenue
Glasgow G12 8QQ, UK
paulj@stats.gla.ac.uk
http://www.stats.gla.ac.uk/~paulj/index.html

Deb, I doubt a consensus will emerge for lcnDNA samples, since each study is so unique. Different combinations of sample type, marker, sample size, analysis method and inference drawn will encounter unique challenges. An easy solution is to repeat all specimens 8 times, but this is neither feasible nor necessary for all types of studies. The self-critical approach advocated by Gilbert et al. 2005 (TREE 20, 541-544) is on the right track. Ken

"Ken Petren"

[Attachment: Woodruff-209_NoninvasiveGenotyping.pdf]
--
David S. Woodruff
Professor of Biological Sciences
Ecology, Behavior and Evolution Section
Division of Biological Sciences
University of California San Diego
La Jolla CA 92093-0116
Telephone (private answering machine): 858 534 2375
FAX (not confidential): 858 534 7108
dwoodruf@ucsd.edu

Hello Deb. The best approach is to use the most up-to-date extraction protocol (probably the Qiagen Stool kit, which you can modify to your needs, or maybe the Promega DNA IQ system) and PCR protocol (definitely the Qiagen PCR Multiplex mix; PE Gold Taq is also OK. Maybe something else? SuperTaq HT Biotechno). Then run a pilot study to compare the best methods you've found, with 7-8 replicates and enough samples to estimate dropout. From that you can estimate the number of replicates needed for the multitube approach. Yes, sorry, it always comes down to the multitube approach. That said, contamination among stool samples is not as big a problem as one could think, but contamination from other tissue samples (references, or from other studies) has to be monitored very closely. I've published my method in Conservation Genetics (Regnaut 2006) if you want some details, but the main idea is in this email. Good luck. Sebastien.

The Zoological Society of London is incorporated by Royal Charter. Principal Office England.
Company Number RC000749. Registered address: Regent's Park, London, England NW1 4RY. Registered Charity in England and Wales no. 208728.

"Sebastien Regnaut"

Hmmm. Too bad. If something else had come up, it would have been published, I guess. The last improvement is the Taq: the Qiagen Multiplex mix appears to be a lot better than PE Gold Taq - better results, easier to use, faster cycles.

Money matters: the Qiagen Stool kit comes down to ca. 5 USD per extraction. But extraction success is usually ca. 50-70%, so if you have constraints on the number of samples, or some particular samples are especially important, you might have to budget 2 extractions per sample on average. Taq is expensive: I count 1 USD per PCR including disposables. That's where negotiations with your provider and local VAT matter, because with 20 markers on 100 samples and 5 repeats, you have 20 x 100 x 5 = 10,000 planned reactions; at 80% PCR success, you are already using some 12,000 USD for the genotyping PCR alone. To that you must add any further reruns and the standards. Of course, if you manage to multiplex some of your markers in the PCR, you can reduce this cost very significantly (by half, maybe?). The cost of genotyping will depend on the technology you use and on how well you can optimise your multiplex. By finding a good multiplex pattern you could cut the price of runs by a third, which can be very important in the budget, depending on the price of your standard size markers - these can be very, very expensive. On some sequencers, if you have the time, the ability or a competent, creative technician, you can make your own size standards from your own samples, which saves a lot. Also, from hearsay, the Qiagen Multiplex mix, because it uses far less primer and fewer chemicals, more than doubles the life span of the capillaries (if you are using a capillary sequencer). This reduces the cost, but not by much.

Time matters: my extraction protocol is 3 hours per 11 samples + blank.
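[Editor's note] Seb's back-of-the-envelope arithmetic can be bundled into a quick estimator. This is only a sketch with his figures as defaults; the function names and the assumed 6-hour hands-on lab day are illustrative, so adjust the parameters to your own costs.

```python
def pcr_budget(samples=100, markers=20, repeats=5,
               pcr_success=0.80, usd_per_pcr=1.0):
    """Rough PCR-only genotyping budget.

    Planned reactions are scaled up by the failure rate (a failed
    PCR must be rerun), so 10,000 planned reactions at 80% success
    come to ~12,500 reactions, roughly the 12,000 USD quoted.
    Returns (planned reactions, estimated USD including reruns).
    """
    planned = samples * markers * repeats
    with_reruns = planned / pcr_success
    return planned, round(with_reruns * usd_per_pcr)

def extraction_throughput(hours_per_batch=3, batch_size=11,
                          hours_per_day=6):
    """Samples extracted per day: two 3-hour batches of 11 + blank."""
    return (hours_per_day // hours_per_batch) * batch_size
```

With the defaults, `pcr_budget()` gives 10,000 planned reactions and about 12,500 USD, and `extraction_throughput()` gives the 22 samples per day Seb describes; multiplexing markers would shrink the `markers` term directly.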
22 samples per day may seem fine if you have 70% success and 100 samples, but this step can end up taking a long time. You need a plate PCR thermocycler: if you use only tubes, you will increase the handling time too much to keep sample handling safe anyway. You will also need a good-quality PCR machine to ensure homogeneity across samples and to avoid dramatic loss of valuable samples when the hot lid is not doing its job. General organisation and high-throughput, supply-chain-style logistics will cut the lab time a lot (both the number of hours of handling and the number of days of lab use), but the time needed in the end will vary greatly depending on your sequencer and on your ability to organise your work. I guess you could go through 100 samples and 20 markers in 4 months if you are mildly experienced. If you need to provide a precise prospective budget for your finance department or your funding source, you should include time and budget for the pilot study and for data analysis, which takes a hell of a long time (checking, double-checking, getting consensus genotypes, etc.). In case anything I mention in this email is unclear, please do not hesitate to say so. All the best. Seb.

Dear Debapriyo, unfortunately, rather than offering a consensus, I can provide another paper characterising SSR genotyping errors from low-quality template DNA (attached PDF). I found that choosing markers with small fragment sizes (or redesigning the primers) helped greatly. Sincerely, Kristina

Dr. Kristina Sefc
Department of Zoology
University of Graz
Universitätsplatz 2
8010 Graz
Austria
Tel.
+43-(0)316-3805601
Fax +43-(0)316-3809875

> ----- Original message -----
> From: evoldir@evol.biology.mcmaster.ca
> [mailto:evoldir@evol.biology.mcmaster.ca]
> Sent: Monday, April 02, 2007 7:41 AM
> To: Sefc, Kristina (kristina.sefc@uni-graz.at)
> Subject: Other: Allelic dropout

Debapriyo Chakraborty