Cassandria Tay Fernandez
Thesis: Assembling a pan pan-genome to identify universal core genes in legumes
A pan-genome is the complete set of genes found in a species, divided into two groups: a 'core genome' of genes present in all individuals of that species, and a 'variable genome' of genes present in only some individuals. Pan-genomes are particularly useful references, but they are usually constructed for a single species or genus. A pan pan-genome is a pan-genome for a higher taxonomic group, built from the data in existing pan-genomes. This project will use publicly available references and bioinformatic tools to assemble a legume pan pan-genome. I will compare and map the genes from 11 different legume species to identify which genes are essential to, or universally inherited within, the legume family.
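The core/variable split described above can be illustrated with a minimal sketch. It assumes each individual's gene content is already available as a set of gene identifiers; the individual and gene names are hypothetical and are not taken from the project's data. It also ignores the harder step of deciding which genes in different individuals or species count as the same gene, which a real pan-genome workflow has to resolve first.

```python
# Hypothetical presence/absence data: names are made up for illustration only.
gene_sets = {
    "individual_A": {"gene1", "gene2", "gene3"},
    "individual_B": {"gene1", "gene2", "gene4"},
    "individual_C": {"gene1", "gene2", "gene3", "gene5"},
}


def split_pan_genome(gene_sets):
    """Return (core, variable) gene sets from a dict of per-individual gene sets."""
    all_sets = list(gene_sets.values())
    if not all_sets:
        # No individuals: both groups are empty.
        return set(), set()
    pan = set().union(*all_sets)        # every gene seen in any individual
    core = set.intersection(*all_sets)  # genes present in all individuals
    variable = pan - core               # genes present in only some individuals
    return core, variable


core, variable = split_pan_genome(gene_sets)
print("core:", sorted(core))
print("variable:", sorted(variable))
```

The same intersection logic would apply one level up for a pan pan-genome, with one gene set per species instead of per individual, again assuming the genes have already been matched across species.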
Why my research is important
With Next Generation Sequencing (NGS) making genomic sequences more widely available, there is a need to understand the scope of gene diversity within a species. Many genetic and genomic studies use a single reference genome, but one genome is not enough to represent this diversity. To circumvent this problem, pan-genomes are used instead, as they capture a broader range of allelic and structural variation within a species and help characterise phenotypic differences.
A pan pan-genome will capture the scope of variation within the whole family, not just a single species, and allow the characterisation of genes essential to the legume family. It can be used for gene mining and for creating resources or genetic maps that help accelerate legume breeding. Establishing a pan pan-genome will also be a stepping stone towards studying the evolutionary relationships between these plants and examining the variation between these distantly related species.