---
The authors selected markers from assembly 1 (Wm82.gnm1) and reported those coordinates.
In preparing the markers for the Data Store, we incorrectly assumed that the markers came from assembly 2 (Wm82.gnm2);
To fix the problem, find the marker sequences from assembly 1 and find their locations in assembly 2.
Flanking sequence around markers was extracted from Wm82.gnm1 using bedtools getfasta; then
that sequence was used to identify marker position in Wm82.gnm2

Also, the SLAF-Seq (aka SLAFSeq) markers have been used in at least two publications: Zhao_Teng_2017 and
Zhao_Han_2015 (doi: 10.1111/tpj.12810). Rename marker filenames to refer to the respective papers:
glyma.Wm82.gnm1.mrk.Zhao_Teng_2017.gff3 and glyma.Wm82.gnm1.mrk.Zhao_Han_2015.gff3


From the publication (Zhao et al. BMC Genomics (2017) 18:462 DOI 10.1186/s12864-017-3843-y ),
generate a GFF representing the 13 SLAFSeq markers reported in the paper:
 Gm07 rs35898587
 Gm07 rs36423980
 Gm08 rs7631207
 Gm08 rs7640250
 Gm08 rs7671170
 Gm08 rs7662003
 Gm08 rs7664479
 Gm08 rs7622492
 Gm08 rs7661660
 Gm08 rs16268025
 Gm14 rs3853672
 Gm15 rs38522986
 Gm18 rs46625879

From the GFF, derive the BED file. The GFF is in one-based coords and BED is zero-based coords.
Use bedops to do the right thing:
  conda create -n bedops
  conda install -n bedops -c conda-forge -c bioconda bedops
  conda activate bedops

  gff2bed < glyma.Wm82.gnm1.mrk.SLAFseq.gff3 > glyma.Wm82.gnm1.mrk.SLAFseq.bed
  cat glyma.Wm82.gnm1.mrk.SLAFseq.bed |
    awk -v OFS="\t" '{print $1, $2-1000, $3+1000, $10}' |
    perl -pe 's/ID=([^;]+);.+/$1/; s/glyma.Wm82.gnm1.rs/rs/' > glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.bed

Activate bedtools and extract the sequences from gnm1
  conda activate bedtools

  bedtools getfasta -name -fi glyma.Wm82.gnm1.FCtY.genome_main.fna \
                    -bed glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.bed \
                    -fo glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.fna

Simplify the marker names, stripping out the coordinate information added by bedtools
  perl -pi -e 's/::\S+//' glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.fna

Activate BLAST and search the genome sequence - first in Wm82.gnm1 to confirm sequence locations 
The locations should be same as BED.
  conda activate blast

  mkdir blastdb blastout

  makeblastdb -in glyma.Wm82.gnm1.FCtY.genome_main.fna \
              -dbtype nucl -hash_index -out blastdb/glyma.Wm82.gnm1

  blastn -db blastdb/glyma.Wm82.gnm1 \
         -query glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.fna \
         -evalue 1e-10 -outfmt 6 -perc_identity 99 \
         -out blastout/glyma.Wm82.gnm1.mrk.SLAFseq.x.glyma.Wm82.gnm1.bln &

Confirm that the query sequences find the correct coordinates on Wm82.gnm1: 
  grep -v "^#" blastout/glyma.Wm82.gnm1.mrk.SLAFseq.x.glyma.Wm82.gnm1.bln | top_line.awk

Search glyma.Wm82.gnm2.DTC4
  makeblastdb -in glyma.Wm82.gnm2.DTC4.genome_main.fna \
              -dbtype nucl -hash_index -out blastdb/glyma.Wm82.gnm2
  blastn -db blastdb/glyma.Wm82.gnm2 \
         -query glyma.Wm82.gnm1.mrk.SLAFseq_1kflank.fna \
         -evalue 1e-10 -outfmt 6 -perc_identity 99 \
         -out blastout/glyma.Wm82.gnm1.mrk.SLAFseq.x.glyma.Wm82.gnm2.bln &

Find locations in Wm82.gnm2 and print the marker coordinates (at hit start position + 1000)
  cat blastout/glyma.Wm82.gnm1.mrk.SLAFseq.x.glyma.Wm82.gnm2.bln |
    awk -v OFS="\t" '$3>=99 && $4>=1900 && $11==0 {print $2, $9+1001, $9+1001, $1}' |
    cat > glyma.Wm82.gnm2.mrk.SLAFseq.bed

Reshape bed file into gff
  cat glyma.Wm82.gnm2.mrk.SLAFseq.bed |
    awk -v OFS="\t" '{print $1, "Zhao_Teng_2017", "genetic_marker", $2, $3, ".", ".", ".", "ID=" $1 "." $4 ";Name=" $4}' |
    cat > glyma.Wm82.gnm2.mrk.SLAFseq.gff3

RESULT: glyma.Wm82.gnm2.mrk.SLAFseq.gff3 contains the marker locations on Wm82.gnm2 .


Check the work by looking at the sequences in gnm2 and comparing with those from gnm1.
Activate bedtools and extract the sequences from gnm2
  conda activate bedtools

  cat glyma.Wm82.gnm2.mrk.SLAFseq.bed | awk -v OFS="\t" '{print $1, $2-1000, $3+1000, $4}' > glyma.Wm82.gnm2.mrk.SLAFseq_1kflank.bed

  bedtools getfasta -name -fi glyma.Wm82.gnm2.DTC4.genome_main.fna \
                    -bed glyma.Wm82.gnm2.mrk.SLAFseq_1kflank.bed \
                    -fo glyma.Wm82.gnm2.mrk.SLAFseq_1kflank.fna