Hi Evoldirs,

Thank you very much for your answers. It appears a similar question was asked not so long ago. I have pasted the replies below.

Hi,
Are you using the automatic scoring from Genotyper? My advice is: don't. You cannot really trust the machine to score your peaks based only on the peak size. Are you defining categories? If you define a category for each band individually, by drawing the square on a peak and then pressing ctl+l (give it a name and select highest peak, exclusive, with (scaled) height of at least___), you are more likely to extract the real data. It is like giving standards for each band size individually. You'll probably find that each band size has its own characteristics, and then you'll be able to tell noise from a real peak. You may have to avoid overlapping categories, since they are very noisy. You'll find that some peaks need a (scaled) height of 1000 and others of 90. It is kind of halfway between hand-scoring and the automatic method. Yes, it is time-consuming, a good two weeks of scoring only, but it is worth it. I have also found that Nei's distances are not appropriate for AFLPs, because they assume the same mutation rate across loci. For a dendrogram, try the Reynolds, Weir and Cockerham (1983) distance as implemented in Phylip. That one does not assume equal mutation rate.
Hope this helps,
Alejandro Nettel-Hernanz
Ph.D. Candidate
Dodd Lab
UC Berkeley
ESPM-Ecosystem Sciences
323 Mulford Hall

Hi,
I have also spent a lot of time thinking about the problem with calling AFLPs. I have been working with Steve Keller at UVA, and I think an email he sent me recently might be helpful.

Second, I've fiddled with this quite a bit, and thought about it even more (sometimes to the point of keeping me up at night!). As you say, it's subjective. I don't see any way around that.
However, my best solution has been the following: First, make sure you assign each individual sample a relative signal strength (Genotyper has a way of doing this automatically), so you normalize signal strength among your samples. Next, define your loci across the entire sample using a very stringent criterion (i.e., a high RFU threshold, say 200). The intent here is that only "real" loci are going to get defined. This approach requires a high signal strength in only one individual for a locus to get defined. After I get my loci definitions this way, I then score all the samples with a less stringent RFU requirement (say 50). That way, you're scoring individuals for loci that you have high confidence are real and not artifactual, but you are lenient in calling the locus present in any given individual.

I have started to use this technique and feel more confident about whether I'm seeing a peak or noise. I think that 20 is too low.
Hope this helps.
Amy

Hi,
I don't know what instrument you are using to run your AFLP samples, but the ABI capillary system we've been using (ABI 3100) only processes peaks above 50. It's suggested that you do not score low peaks, as they are close to the instrument threshold. So, if I'm correct, people usually score peaks that are mostly above 100. The problem with smaller peaks is not only the risk that another small peak won't be detected, but also that some small peaks can be just AFLP artefacts --> hence 2 possible ways of introducing noise into your dataset. Using AFLP, you should have plenty of markers, so using fewer but more reliable ones should only be an advantage. Another thing to look at is the size of your bins (categories) in Genotyper. I think the recommended size is around 1 bp. So adopting rather broader categories - as the size of peaks may vary slightly between runs or capillaries - may also reduce some noise in your data.
Especially if you have a lot of markers that are quite close to each other and/or many diverse populations or even species.
Hope this helps & good luck with your work!
Sarka

Sarka Jahodova
Department of Ecology
Charles University
Vinicna 7
128 44 Prague
Czech Republic

Hi,
I have been wrestling with my AFLPs for a couple of years now, so I can tell you what I've learned. I'd be interested to hear what others advise you as well. Abundant noise is common for AFLPs, as are changes in signal strength between runs and differences in peak size due to heterozygosity and superimposition of different fragments of the same length. Because of these things, it is appropriate to limit your 'loci' to those that you can score reliably, even in weak samples. I focus on peaks that appear several times the strength of the background noise in at least some samples, then go through and look carefully for peaks of any size in those positions in the other samples. I do not feel that differentiating heterozygotes from homozygotes is possible or reliable, or that they can be distinguished from superimposed fragments of different origin, so I score any size peak at a locus equally. It is important to run samples in a randomized manner and to run some duplicates in order to verify that your scored loci are reliably detectable. Given the noise, it is also important to generate a dataset with at least about 150 loci, or more if your resolution is still poor. There is another important issue if you are using an analyzer that can detect multiple fluorescent dyes. It is not acceptable to overlay multiple primers with different dye labels over one another, as many people can do with microsats. Peaks in one dye consistently interfere with detection of other dyes, which is a huge problem for peak-intensive samples like AFLPs. It sounds like you may also have multiple species?
This may be okay if they are closely related, but if they are distant or differ in ploidy, your AFLPs may be too divergent to use in the same analysis - i.e. fragments of the same size in different species may represent different regions of the DNA.
Hope this helps.
Kind regards,
Katrina

Katrina M. Dlugosch
Ecology and Evolutionary Biology
University of California
Santa Cruz, CA 95064

Dr. Miguel Toro has forwarded to me your e-mail regarding your trouble scoring AFLPs. We have been working on the same subject for the last 3 years and we found the same kind of problems as you. We only scored AFLPs over 50, and because we had a family structure with our fish, we were able to differentiate parents and full sibs using microsatellites. After we assembled the families, we found that some of the peaks under 50 were good and we corrected them, but only after a visual inspection. We ran into other problems: from the beginning to the end of the work, the sensitivity of the apparatus employed (an ABI 310) changed due to the exhaustion of the laser, so we lost some of the markers of the highest molecular weight. Another problem was using different apparatus (ABI 310 and 3100), because the behaviour of the ROX marker was a little different, and the scoring was different too. My advice is to run all the samples in the minimum amount of time on the best machine available, but only on that one, and to score only the best markers (exclude those with ambiguous bands or very close to the baseline) with a baseline of 50 or even 100. It is better to get a lower number of good loci than a lot of bad ones. Only experience and time will tell you the right thing, and if you are able to travel to a laboratory with a lot of experience using your same device, you will get your best results.
Good luck,
Gonzalo

Gonzalo Martinez-Rodriguez
Instituto de Ciencias Marinas de Andalucia
Consejo Superior de Investigaciones Cientificas
Avda.
Republica Saharaui, 2
11510 Puerto Real - Cadiz

Hi,
I have never worked with AFLPs, but I have a lot of experience with genotyping microsatellites with Genotyper, and I never score any peaks with an intensity < 80. Even at this intensity I will usually only score size standard peaks. I think the general rule of thumb is only to score peaks that have an intensity of around 200. Peaks smaller than 100 or 200 could be the result of non-specific amplification or simply background noise from your equipment. Optimizing PCR conditions may help you get rid of some of these smaller confounding bands. If this doesn't work, you may want to consult the operator's manual of your genotyping platform to determine if some equipment setting is contributing to all the noise you see. I know that ABI at least puts a decent troubleshooting section in the back of their equipment manuals.
I hope this is helpful!
Andrea Drauch
Graduate Student
Department of Forestry and Natural Resources
Purdue University
715 West State St
West Lafayette, IN 47907

With AFLPs you face a classic trade-off between quality and quantity. If you score only the most reliable ones, you will end up with few markers per reaction. Peak height is usually a good indicator of marker quality, but not always. If you can, you should assess the bimodality of your band intensities as one possible quality check (I don't know how you could automate that, but manually it should be possible). If you have a clear a (absence) and b (band), then your marker is most likely a "true" Mendelian marker; if you also have all values in between, it could just be a variable PCR artifact.
Sincerely,
Olav

Dr. Olav Rueppell
Assistant Professor
Department of Biology, 105 Eberhart Bldg.
University of North Carolina, Greensboro
Greensboro, NC 27403
USA

Hi,
If you run your samples twice, preferably from the digestion step on, then you will know whether those peaks are noise or not.
It is a very good idea to do that for at least a few samples, so you know how repeatable everything is. If that is highly repeatable, it is still a good idea to run samples through both PCRs twice. I've found some small peaks (in the 20 range) that are repeatable; however, I worry about using them. I haven't analysed all my data yet, but my thinking is that it is more worthwhile to work with more primer pairs and just the larger peaks than it is to try to capture all the variation that doesn't amplify well. We are working with GeneMarker in our lab, so I don't know anything about Genotyper. Also, if you run all your samples twice, there is a freeware program out there called PeakMatcher, which will analyze the raw data for you and tell you whether a peak was repeatable in your samples and how repeatable it was among samples, too. I've just played with it a little so can't give too many details, but it is pretty easy to work with.
Good Luck,
Steve

There is always a lot of noise in AFLP gels, and there is little consistency between lanes and especially between gels. My solution to this problem was to run a few individuals as standards on every gel and, unfortunately, to score everything by hand; this becomes very time consuming. Using size standards with more peaks seems to help the resolution of the peaks. Some of the gels run at a very low resolution; the way I dealt with that was to decide visually whether I thought there was a peak, even if it was below 50. I know some of the literature says they used Genotyper and set the detection at 50, so I don't think you'll lose too much information; they also usually had close to 300 or more bands that they were scoring, and multiple primer combinations. If you do this, I would suggest re-amplifying gels that are low resolution to get a cleaner one with higher peaks.

Hi,
I can think of a few possible explanations for population genetic "noise" in your AFLP data.
One is that there really is very little genetic differentiation between your populations due to high gene flow or hybridization events -- we see that in some plants I've been working with. However, the fact that you get better differentiation with the strong bands suggests there may be an issue with the AFLP reactions themselves. One common source of AFLP noise is poor ligations. If this happens, you end up with so few genome equivalents in your preamplification template that bands that should be present will randomly drop out, and those that are present will vary widely in intensity from sample to sample relative to other bands. The other possibility is that the fainter bands represent mismatch amplification, which tends to happen at low levels but can be amplified to the point of being visible if there are relatively few "real" fragments (the intense bands) to use up the primers during the selective amplifications. If the ligation issue is the culprit, re-doing the template preparation using fresh T4 DNA ligase for the ligation should solve the problem. I have had lots of problems with the ligase that comes with the kits. It's a finicky enzyme, and I prefer to buy fresh ligase separately from the kits. This may require adjusting some volumes due to concentration differences. If you have mismatch amplification, the solution is to use selective primer combinations that amplify more bands, e.g. ones that are more A/T rich in the selective extensions, lack CG dinucleotides, or even ones with fewer selective bases. If your organism has a small genome and you are using E+3 and M+3 primers, you may have better luck using E+2 primers.
I hope this helps -- keep me posted.
Dave

David Remington
Assistant Professor
Department of Biology
University of North Carolina at Greensboro
P.O.
Box 26170
Greensboro, NC 27402-6170

Hello,
We also had some trouble with this and had to calibrate the system for the species we worked with by testing the markers used on control groups of known origin and relationship. We picked the informative peaks and ignored the overly noisy ones. Unfortunately, AFLPs are not as 'out of the box' as advertised. We needed a high level of repeatability because we use it to identify plant clones in the wild, so just a little noise can result in large errors. Also, because it is a non-specific marker, any contamination at all (from any species, large or small) will cause real problems. This is almost inevitable in samples collected from the wild. You will find some of these issues addressed in the attached article.

Unfortunately, this is a common problem with this technique, and there is no really perfect solution. Basically, the data from AFLPs are messy, and you must be very careful not to mislead yourself. While your idea of an arbitrary cutoff (at the expense of losing bands) is a good one, I think you'll also find that the overall intensity might vary from lane to lane, so choosing a single threshold might not work very well. The best solution is to get your size/intensity information into a spreadsheet and find one or more bands that are shared by all individuals. Use the intensity of these bands to "normalize" the intensity of the other bands for each individual. Then select a normalized intensity level for each lane to use as a cutoff. It's also very important to avoid any bias when you're working with these kinds of data. One way is to randomize individuals on your gels and resist the temptation to look at sample IDs while you're doing your scoring!
Good luck!
Jeffrey Markert, Ph.D.
Oak Ridge / EPA Postdoctoral Fellow
Atlantic Ecology Division
27 Tarzwell Drive
Narragansett, RI 02882

Hi,
I had the opportunity to work with AFLP markers, run on an ABI 3100-Avant sequencer.
After playing with different reaction conditions, different polymerases and such, my rule of thumb is that you'd better discard everything below 70 bp. You have fewer bands, true, but 1. larger fragments are less sensitive to size homoplasy (lower band density) and 2. they are more likely to be polymorphic. If you don't have enough peaks above 70 bp, then maybe you should use fewer selective nucleotides?
I'd be happy to answer any other questions,
cheers,
Antoine

Hi,
I worked with Cerastium alpinum (Caryophyllaceae) and used AFLP last year. I used fragments in the 70-500 bp size range. The lowest peaks I scored as present had a peak height/intensity of at least about 50-100. To decide what is noise, I just compared with a blank or control: if the noise was never above 40, I could interpret a peak of 50 in height as real, otherwise not. I strongly recommend that you use a control. I did this in the following way: one sample was processed in parallel exactly like the individual samples, except that no DNA was added, so the blank went through all the same steps as the other samples, from restriction/ligation to the selective amplification. In addition, I used DNA from a reference individual in all subsets of PCRs so that I could compare the histograms from time to time. This might help you to get an idea of where you should put your limits on what is a peak or not. Another tip is to run several duplicate samples: extract DNA several times, and use these as independent controls in different amplifications so that you can validate your data!
Good luck and best wishes!
Anna-Britt "Ammie" Berglund

Hi,
I think the potential solution to your problem depends on the source of your noise. An RFU of 20 points would be close to the detection limit for an ABI 377 and Genescan/Genotyper or similar. If the peaks which you can see are "clean" in relation to the background noise, then I would suggest simply loading more PCR product when you do a genotyping run.
I don't think there is a way round excluding data from AFLP alleles which cannot be scored reliably. If data are unreliable they should be chucked out! One sensible way of gauging the reliability of your data would be to conduct duplicate extractions and AFLPs on the same set of genetic individuals, and calculate an error rate in scoring among replicated AFLPs. Vary your scoring thresholds to find an optimum which maximises information content with an acceptable level of scoring error. This reference might be useful:

Bonin et al. (2004) How to track and assess genotyping errors in population genetic studies. Molecular Ecology 13 (11): 3261-3273.

If the process of removing unreliable data leaves you with too few scorable alleles, part of the AFLP reaction itself could be sub-optimal, or could have failed. Potential sources of poor AFLPs include (but are not limited to):

Poor quality / degraded DNA
Incomplete restriction enzyme digestion
Incomplete / unsuccessful ligation
PCR primers which do not have a high enough degree of selectivity for the study organism
Too much MgCl2 in PCRs
Taq polymerase (some seem to work better than others in an AFLP)

These need to be tested one by one so you can find the source of the problem and rectify it. Agarose gels are useful here!

Dr. Rajenda Whitlock
Department of Animal and Plant Sciences
University of Sheffield
Sheffield S10 2TN

Hi,
When scoring AFLP profiles, knowing which peaks to score as absent is very difficult and often problematic. In my experience, the best way to avoid the problem of background noise is to increase the peak heights by maximizing the DNA quality and optimizing the PCR conditions. In general, if you have good peak heights (>1000 intensity), then you can score peaks of lower intensity as absent (<100 was what I used). Many of these smaller peaks are not consistently amplified.
If you are worried about which peaks to exclude from the analysis, it is a good idea to run the analysis based on different methods of scoring (i.e. <50 or <100 absent, or only including characteristic and highly repeatable peaks) and compare the results.
I hope this helps in some way.
Cheers,
Paul Rymer
Postdoctoral Researcher
Department of Plant Sciences
University of Oxford
South Park Road
Oxford OX1 3RB, UK

Hi,
You should NEVER use such small peaks with AFLPs, never mind what AFLPSurv might say. Because of the technique's nature, there is always residual amplification from the restriction-ligation to the preselective PCR and from there to the selective PCR, so even if you use only water and no DNA you would get a lot of peaks of small size which correspond to nothing, just "noise" as you have put it, and which are mainly due to the amplification of excess primers from the preselective PCR. By the way, this is in Vos' 1995 paper where the AFLP technique is described. My advice: start scoring at 75 bp peaks.
Cheers,
Rafael Rubio de Casas

Hi,
I've recently run and scored AFLPs using Genescan/Genotyper for a colleague. I don't think you can use a cutoff value, as all traces vary in intensity depending on the concentration and quality of your starting material. You'll obviously score more noise as peak intensity decreases. I labelled the peaks by size and peak height and then filtered the labels by 10% of the maximum peak height. This method will give you a measure of PCR efficiency and generate scaled data sets that should account for background noise.
I hope this helps.
Jake

Eddy Brede
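
For anyone who exports their peak tables and wants to try the relative filter from the last reply outside Genotyper, here is a minimal Python sketch. It is only an illustration under assumptions: each trace is taken to be a list of (size in bp, peak height) pairs, the maximum is taken per trace rather than across the whole data set, and the function name and example values are made up for the illustration, not taken from any of the replies or from Genotyper itself.

```python
def filter_peaks(peaks, fraction=0.10):
    """Keep peaks whose height is at least `fraction` of the tallest peak.

    `peaks` is a list of (size_bp, height) tuples for ONE trace.
    Assumption: the cutoff is relative to the tallest peak in that trace.
    """
    if not peaks:
        return []
    tallest = max(height for _, height in peaks)
    cutoff = fraction * tallest
    return [(size, height) for size, height in peaks if height >= cutoff]


# Hypothetical trace: (fragment size in bp, peak height in RFU)
trace = [(75.2, 1200), (101.8, 95), (143.5, 640), (210.1, 130), (356.7, 40)]
kept = filter_peaks(trace)
print(kept)
```

With the tallest peak at 1200, the cutoff is 120, so the peaks at 95 and 40 are dropped while the peak at 130 is kept. A strong trace and a weak trace therefore get different absolute cutoffs, which is the point of scaling instead of using one fixed RFU value.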