An insight into structure and composition of the fig genome

E. Barghini, F. Mascagni, T. Giordani, L.J. Solorzano Zambrano, L. Natali, A. Cavallini
Ficus carica L. is a diploid species with a genome size of 0.36 pg/2C that remains poorly characterized at the genetic and genomic levels. To analyse the structure of the fig genome, we used Illumina technology to produce 25.64 genome equivalents of 35-511 nt long MiSeq sequences and 12.96 genome equivalents of 25-100 nt long HiSeq paired-end reads. The two libraries were first assembled separately, then a hybrid assembly was performed; finally, contigs and supercontigs were scaffolded. This first rough assembly is composed of 264,088 scaffolds, up to 41,760 nt in length, covering 323,708,138 nt, which corresponds to 87.5% of the fig genome, with an N50 of 2,523 nt. Masking the scaffolds with a transcriptome of Rosaceae, from which sequences related to repetitive elements had been removed, allowed us to establish that coding genes account for at least 6.8% of the fig genome. Gene prediction analysis produced 44,419 putative genes. A sample of around 5,000 predicted genes was annotated with regard to gene ontology and function. Concerning the repetitive component, around 58% of the fig genome is composed of repeated sequences, none of which was especially redundant. Among the identified repeats, the most represented were LTR-retrotransposons, with Gypsy elements more frequent than Copia.
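The N50 reported for the assembly can be read as follows: with scaffolds sorted from longest to shortest, it is the length of the scaffold at which the running total first reaches half of the assembled bases. The sketch below illustrates that standard definition on a made-up list of lengths; the function name and the toy values are assumptions for illustration only, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Standard definition: sort lengths in descending order and return the
    length at which the cumulative sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not taken from the fig assembly):
# total = 30, half = 15; cumulative sums are 10, 18, ... so N50 is 8.
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))
```

Applied to the real scaffold lengths, the same calculation would be expected to return the reported value of 2,523 nt, assuming the authors used this conventional definition.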
Barghini, E., Mascagni, F., Giordani, T., Solorzano Zambrano, L.J., Natali, L. and Cavallini, A. 2017. An insight into structure and composition of the fig genome. Acta Hort. (ISHS) 1173:69-74
Ficus carica, Illumina sequencing, genome structure, repetitive DNA, gene prediction

Acta Horticulturae