---
file_transformation:
  - glyma.Wm82.gnm2.div3.Lee_Jeong_2015.Original_Korean222.txt.gz: Transformed into VCF file format using a custom script
  - glyma.Wm82.gnm2.div3.Lee_Jeong_2015.SNPdata.vcf.gz: Transformed into HapMap file format using Tassel 5.0
  - glyma.Wm82.gnm2.div3.Lee_Jeong_2015.SNPdata.hmp.gz: Transformed into Flapjack format by removing the columns "alleles", "chrom", "pos", "strand", "assembly#", "center", "protLSID", "assayLSID", "panelLSID", and "QCcode". Heterozygotes in IUPAC code are converted to a double allele code separated by "/", e.g. R = A/G. The file is then transposed so that SNP IDs are columns and soybean accessions are rows.

changes:
  - 2018-02-20: Changed file names from glysp.mixed.div1.Lee_Jeong_2015 to mixed.gnm2.div3.dDzw to reflect that the variants were called with respect to Williams82 genome assembly 2
  - 2018-02-20: Renamed mixed.gnm2.div3.Lee_Jeong_2015.SNPdata.hmp.txt to mixed.gnm2.div3.dDzw.SNPdata.hmp
  - 2018-02-28: Changed prefix of marker files from "mixed" to "glyma.Wm82" to reflect that the variants were called with respect to the Williams82 genome assembly
  - 2018-02-28: Created MANIFEST files
  - 2018-02-28: Renamed mixed.gnm.div3.Lee_Jeong_2015.SNPdata.hmp to glyma.Wm82.gnm2.div3.dDzw.hmp
  - 2018-06-18: |-
      Updated the VCF and HapMap files. A little over 10,000 SNPs were called on the negative strand, so the allele calls at these positions were switched to the positive strand using Perl:
      perl -F'\t' -Wlane 'if($F[6] eq "-"){tr/ACTG/TGAC/ foreach(@F[8..$#F])}; print join("\t", @F);'
  - 2021-04-27: Added genome prefix
  - 2021-04-27: Changed key from dDzw to Lee_Jeong_2015
  - 2021-12-19: adf - fixed header and INFO issues, most of which were pointed out in https://github.com/legumeinfo/datastore-issues/issues/63