A standard set of annotation and assembly files released as part of Phytozome v7.0. All files are compressed with gzip to reduce file size and speed up downloads.

Note: the number 109 in all file names is a Phytozome internal identifier for the current release of this genome and can be safely ignored.

Files in the annotation subdirectory:

1) Gmax_109_annotation_info.txt
   A summary of the annotation details available in Phytozome. This is a tab-delimited file with the following columns (a column is blank if no corresponding data are available):

    1: Phytozome transcript name
    2: PFAM
    3: Panther
    4: KOG
    5: KEGG ec
    6: KEGG Orthology
    7: best Arabidopsis TAIR10 hit name
    8: best Arabidopsis TAIR10 hit symbol
    9: best Arabidopsis TAIR10 hit defline
   10: best rice hit name
   11: best rice hit symbol
   12: best rice hit defline

2) Gmax_109_cds.fa
   Nucleotide FASTA format file of all gene coding sequences.

3) Gmax_109_peptide.fa
   Amino acid FASTA format file of all gene coding sequences.

4) Gmax_109_transcript.fa
   Nucleotide FASTA format file of spliced mRNA transcripts (UTR, exons).

5) Gmax_109_gene.gff3
   GFF3 format representation of all mRNA sequences (UTR, CDS). Genomic coordinates are relative to the reference sequence named in column 1.

6) Gmax_109_gene_exons.gff3
   GFF3 format representation of all mRNA sequences as above, but with exon subfeatures. Genomic coordinates are relative to the reference sequence named in column 1.

The following two files are present only if the current annotation contains these data:

1) Gmax_109_synonym.txt
   Tab-delimited list of all gene symbols/synonyms for the Phytozome transcript named in the first column. Some transcripts may have no such annotations.

2) Gmax_109_defline.txt
   Tab-delimited list of all defline descriptions for the Phytozome transcript named in the first column. Some transcripts may have no such annotations.
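
The 12-column layout of Gmax_109_annotation_info.txt described above can be read with a few lines of Python. The following is only a minimal sketch, not an official Phytozome tool. It assumes that the downloaded file may still carry a .gz suffix, that the file has no header row, and that any line starting with "#" is a comment; none of these points is stated in this README. The short column names and the function name read_annotation_info are hypothetical labels chosen for this example.

```python
import csv
import gzip

# Labels for the 12 columns listed in this README. The short names are
# hypothetical and chosen only for this sketch.
COLUMNS = [
    "transcript", "pfam", "panther", "kog", "kegg_ec", "kegg_orthology",
    "tair10_name", "tair10_symbol", "tair10_defline",
    "rice_name", "rice_symbol", "rice_defline",
]


def read_annotation_info(path):
    """Yield one dict per row; blank columns are returned as None."""
    # Assumption: a ".gz" suffix means the file is still gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Assumption: skip empty lines and "#" comment lines, if any.
            if not row or row[0].startswith("#"):
                continue
            # Pad short rows so that trailing blank columns are kept.
            row = row + [""] * (len(COLUMNS) - len(row))
            yield {name: (value or None) for name, value in zip(COLUMNS, row)}
```

A caller might, for example, collect every transcript that has a TAIR10 hit by iterating over read_annotation_info(path) and keeping the rows where "tair10_name" is not None.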
-----

Files in the assembly subdirectory:

1) Gmax_109.fa
   Nucleotide FASTA format of the current genomic assembly.

2) Gmax_109_softmasked.fa
   Gmax_109_hardmasked.fa
   Nucleotide FASTA format of the current genomic assembly, masked for repetitive sequence by RepeatMasker (softmasked sequence is in lower case; hardmasked replaces masked sequence with Ns). Not all species have masked assemblies.

-----

Files in the related_data subdirectory are releases of data related to the current annotation, such as EST sequences, and are not always available for all organisms.
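
For the masked assemblies in the assembly subdirectory, the lower-case convention of the softmasked file can be inspected directly. The sketch below is illustrative only. It assumes a plain or gzip-compressed FASTA file, and the function names masked_fraction and hardmask are hypothetical. It shows how lower-case bases map to Ns, but it makes no claim that Gmax_109_hardmasked.fa was produced in exactly this way.

```python
import gzip


def hardmask(seq):
    """Replace lower-case (soft-masked) bases with N."""
    return "".join("N" if base.islower() else base for base in seq)


def masked_fraction(path):
    """Return the fraction of sequence characters that are lower case."""
    # Assumption: a ".gz" suffix means the file is still gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    masked = 0
    total = 0
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                continue  # FASTA header line, not sequence
            seq = line.strip()
            total += len(seq)
            masked += sum(1 for base in seq if base.islower())
    return masked / total if total else 0.0
```

Reading the file line by line keeps memory use low, which matters for a whole-genome FASTA file.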