SNP discovery and SSR mining from Carya illinoinensis - RAD sequences
Pecan (Carya illinoinensis) is recognized as one of the major nut trees worldwide. Long juvenility, large tree size, and heterozygosity due to outbreeding hamper genomic characterization of this species. We sequenced restriction-site associated DNA (RAD) from nine pecan cultivar genomes. In total, 87.7 M useful short reads with an average length of 45.91 bp, equivalent to ~5× coverage of the pecan genome, were produced from four restriction enzyme-digested RAD libraries. Sequences were assembled into a total of 64,388 contigs with an average N50 of 269 bp. Cultivar 'Wichita' generated the most contigs (10,455) and 'Sumner' the fewest (5,441). Contigs were mapped to scaffolds of two previously sequenced pecan cultivars (87MX3-2.11 and 'Pawnee'), generating 78,264 and 56,047 SNP markers, respectively. Contigs were also mined for SSRs, yielding 4,698 SSR motifs (di-nucleotides or higher), of which 3,014 (64.2%) allowed design of SSR primers. Of the four restriction enzymes, SbfI generated the highest number (41.6 M) of useful reads from the nine cultivars, of which approximately 18 M reads (43%) were assembled, followed by FseI and NotI. AscI yielded the fewest reads and the fewest SNPs. The rates of SSR discovery from the four RAD libraries followed the same trend as SNP discovery. Based on these preliminary results, SbfI was the optimal enzyme for RAD-based marker discovery in pecan. In addition, SNP variation did not differ significantly among pecan cultivars, but appeared to depend on each cultivar's genetic/geographic distance from the reference genome. This study provides not only useful molecular markers for population association mapping and genotyping, but also a strategy for pecan whole-genome sequencing and subsequent gene discovery.
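As an illustration of the SSR mining step, the following is a minimal Python sketch of how di-nucleotide or higher repeat motifs might be located in assembled contigs. It is not the tool or the parameters used in the study: the function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen only for the example.

```python
import re

# Assumed thresholds (not from the study): minimum number of tandem copies
# required for each motif length, from di- to hexa-nucleotide.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for SSRs found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_rep - 1)
        # further copies of the same motif.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, such as "AA"
            # or "ATAT", so each SSR is reported under its shortest motif.
            is_redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_redundant:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical contig containing an (AT)7 and a (GAA)5 repeat.
example_contig = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
example_hits = find_ssrs(example_contig)
```

Under these assumed thresholds, `example_hits` holds one di-nucleotide and one tri-nucleotide SSR. Whether a given SSR allows primer design depends additionally on the flanking sequence, which this sketch does not evaluate.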
Wang, X. and Grauke, L.J. (2021). SNP discovery and SSR mining from Carya illinoinensis - RAD sequences. Acta Hortic. 1318, 165-176
restriction enzyme, scaffold, contig, RAD-based marker, geographic distance