Thank you to everyone who responded to my inquiry. The answers are below. Of all the suggestions, I chose GenAlEx 6 (thanks to Peter Smouse) and have already tried it. It is a module for Excel that works on both Mac and PC and is easy and intuitive to use. Based on my three days of experience, I can recommend it for such analyses, and perhaps for many more.

Yuval

Here is the original question:

>Dear EvolDir members:
>I have genotype data for >50 microsatellite loci from ~400 samples of three Helianthus species; one is a homoploid hybrid species and the other two are its parents. I want to plot the individuals based on their genotypes in order to test the relationships between the species. I am not interested in the phylogeny, just in the relative distances between individuals on a multi-dimensional scale (k = no. of loci). The goal is to examine the position (in k dimensions) of the hybrid species relative to its parents. The problem with microsatellite data is that it is di-allelic, i.e., for each character (= locus) I have two states for each individual, which can be identical (homozygote) or different (heterozygote) and are not independent. I assume (for now) that each locus is independent of the other loci (unrealistic, but this is corrected in a different analysis).
>Is anyone aware of a method, or better, software, that does this kind of thing? Freeware is preferable, of course.

ANSWERS:

**************
Data modification advice -

You could calculate a genetic distance called DPS, based on the proportion of alleles shared between individuals. This is an individual-based genetic distance suitable for codominant markers. The computer program MSA (Microsatellite Analyzer) by Dieringer & Schlotterer can do this. The software was published in Molecular Ecology Notes. Once you have the individual-based distance matrix, you can analyze it with one of many multivariate techniques. E.g.
you could use MDS (multidimensional scaling) or PCA (principal component analysis). Both methods can be carried out with general stats software packages, e.g. SPSS or JMP.

----------

(Rodney Dyer:) I have software on my server (Multivariate Genotypes) at:
http://dyerlab.bio.vcu.edu/wiki/index.php/Software
that takes diploid multilocus data and turns it into multivariate normal data. You can look at the 2GenerV paper (pdf #10 on my publications page) to get an overview of how this works, or I would be happy to discuss it with you directly if you like.

----------

Look at this page:
http://dyerlab.bio.vcu.edu/wiki/index.php/Software#Multivariate_Genotypes
With it you'll have your data ready for PCA, CDA and related analyses.

----------

You may use PCO, which is like a PCA but on genetic distances. GenAlEx performs this kind of analysis, as does the ecodist package for R.

----------
Freeware -

(Peter Smouse:)
Dear Registered User of GenAlEx

We are pleased to advise that the official release of GenAlEx 6 is now available:
http://www.anu.edu.au/BoZo/GenAlEx/

This version includes all of the features listed in Peakall and Smouse (2006):

Peakall, R., Smouse, P.E., 2006. GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6, 288-295.
http://www.blackwell-synergy.com/doi/abs/10.1111/j.1471-8286.2005.01155.x

Molecular Ecology Notes
http://www.blackwell-synergy.com/loi/men

Please note that the current program and documentation files supersede all previous versions. Therefore, it is strongly recommended that you update your program and documentation. Further information about updating GenAlEx is provided in the 'Read me' file when you download the new documentation.

We thank the many users of our beta releases of GenAlEx 6 for their positive and supportive feedback and bug reports.

Enjoy!
Rod Peakall and Peter Smouse
April 7, 2006

----------

There’s a program called GenAlEx, a Microsoft Excel add-in that will create PCO plots for both dominant and codominant data. I'm not sure of the exact details (like the non-independence you mentioned), but you could look into it. It's free at:
http://www.anu.edu.au/BoZo/GenAlEx/
It was pretty easy to use as well, which is always helpful with these things!

----------

GenAlEx (http://www.anu.edu.au/BoZo/GenAlEx/) can do a PCA based on inter-individual genetic distances, and it's free.

----------

This is just a very simple way of doing it :-)
Make a table with one column for each allele (50 loci with an average of 10 alleles per locus would give a table with about 500 columns), and for each individual (each row) enter 0 or 1 for absence/presence of that allele. This large 0/1 matrix can be analyzed by several multivariate methods, like principal coordinate analysis (PCoA) or principal component analysis (PCA). Both of these methods are easy to run in the freeware PAST (http://folk.uio.no/ohammer/past/). At the least, the score plots would give you some idea of the distances between the different species. If your major aim is to have measures of the distances, you may use the coordinates on the components you choose to focus on.

----------

I have used the following software, with good results in plotting and identifying hybrid individuals:
1) Genetix, which performs Factorial Correspondence Analysis: http://www.genetix.univ-montp2.fr/genetix/genetix.htm
2) PCAGEN: http://www2.unil.ch/popgen/softwares/pcagen.htm

----------

I have done this for microsats using a number of techniques.
1) Bayesian assignment programs such as "Structure" by Pritchard et al. or a very new one, "structurama" by Huelsenbeck, are well suited for this question and freely available.
2) Non metric multidimensional scaling is another approach.
This method is analogous to PCA, except that the number of axes you select determines the placement of each individual in multivariate space. First I calculate a pairwise distance matrix. Then I input the matrix into a non-metric multidimensional scaling program. I use NTSYS for this. An iterative process is used to minimize the "stress", which, as I understand it, is a measure of goodness of fit. Models with low stress are better. The one major problem with NTSYS is that it can only handle a limited number of individuals. I know there are a number of programs out there for NMMDS, and I am pretty sure SAS has such a module. So all you have to do is generate a genetic distance matrix; then you should be able to use that file in whichever program you end up using.

----------

PCAgen (Goudet - http://www.unil.ch/dee/page6767_en.html#3), with 2D interface, 2-digit coding of msat, and statistical test of significance of the components (broken-stick).

Genetix (Belkhir - www.univ-montp2.fr/~genetix/genetix.htm), with 3D interface, quite a good one, easy data import from Fstat, genepop or text, includes other stats on allele freqs., in French (but still manageable).

********************
$$$$$ware -

It sounds like you might want to try principal coordinates analysis. It could be done using the software NTSYS and the modules SIMGEND, DCENTER, EIGEN, MOD3D, in that order. A number of distance coefficients might be used, but offhand I'd recommend "BAND", which is a simple band sharing coefficient that is applicable to codominant data like microsatellites. The output would be a 2, 3, or k-dimensional plot (your choice) with clusters (or perhaps not, depending on the structure of the data) corresponding to the two parental species and the homoploid hybrid derivative.

----------

NTSys does various multidimensional scalings based on genetic distances created in e.g. the freeware microsat or others.
NTSys is not freeware, but it can be purchased at a favourable price if you are an academic.

Yuval Sapir, PhD
Dept of Biology
Indiana University
Bloomington, IN 47405 USA
http://www.bio.indiana.edu/~rieseberglab/yuval_sapir.html
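
----------

Two of the data preparations suggested in the replies above (the 0/1 allele table and an individual-based shared-allele distance) can be sketched in a few lines of plain Python. This is only an illustration, not the code of MSA, GenAlEx, PAST or NTSYS. It assumes diploid genotypes with no missing data, and it uses one common formulation of the distance: 1 minus the number of shared alleles divided by twice the number of loci, which may differ in detail from what a given program calls DPS. The sample names, allele sizes and function names are all hypothetical.

```python
from collections import Counter

def allele_table(genotypes):
    """Encode genotypes as 0/1 rows, one column per (locus index, allele)."""
    n_loci = len(next(iter(genotypes.values())))
    columns = sorted({(locus, allele)
                      for g in genotypes.values()
                      for locus in range(n_loci)
                      for allele in g[locus]})
    rows = {name: [1 if allele in g[locus] else 0 for locus, allele in columns]
            for name, g in genotypes.items()}
    return columns, rows

def shared_allele_distance(g1, g2):
    """1 - (shared alleles / (2 * loci)); assumes diploid data, no missing values."""
    shared = 0
    for a, b in zip(g1, g2):
        # Multiset intersection: a homozygote shares at most one allele
        # copy with a heterozygote carrying that allele.
        shared += sum((Counter(a) & Counter(b)).values())
    return 1.0 - shared / (2 * len(g1))

def distance_matrix(genotypes):
    """Square matrix of pairwise distances, in the insertion order of the dict."""
    names = list(genotypes)
    matrix = [[shared_allele_distance(genotypes[p], genotypes[q]) for q in names]
              for p in names]
    return names, matrix

# Hypothetical toy data: two loci, alleles given as fragment sizes.
toy = {
    "parentA_1": [(100, 100), (200, 204)],
    "parentB_1": [(104, 108), (208, 208)],
    "hybrid_1":  [(100, 104), (204, 208)],
}

if __name__ == "__main__":
    columns, rows = allele_table(toy)
    print(columns)
    for name, row in rows.items():
        print(name, row)
    names, matrix = distance_matrix(toy)
    for name, dists in zip(names, matrix):
        print(name, dists)
```

The 0/1 rows can be pasted into a program that does PCA or PCoA on a presence/absence table, and the square distance matrix into one that does PCoA or MDS on distances. Real data would need explicit handling of missing genotypes before either step.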