Project 463280

Assembly and annotation of genomes, transcriptomes, and metagenomes

$753,526
Project Information
Study Type: Unclear
Research Theme: Biomedical
Institution & Funding
Principal Investigator(s): Birol, Inanc
Institution: BC Cancer, part of PHSA (Vancouver)
CIHR Institute: Genetics
Program: Project Grant
Peer Review Committee: Genomics: Systems and computational biology
Competition Year: 2022
Term: 5 yrs 0 mth
Abstract Summary

Over the last two decades, technological advances have made DNA sequencing a routine and cost-effective method in many fields of life sciences research. New sequencing technologies are generating increasingly reliable information about increasingly long stretches of input DNA. When coupled with bioinformatics tools that can leverage their rich information content, long reads will continue to open up new and exciting fields of research and applications in health genomics. The proposed project builds on our highly successful research program on sequence assembly and annotation, where we have established strong expertise in building, disseminating, and maintaining widely used bioinformatics tools. Here, we will develop innovative algorithms for genome, transcriptome, and metagenome assembly problems. Following recent advances in sequencing technologies, these algorithms will be designed from the ground up for long-read platforms. We will also introduce novel, scalable algorithms to assess sequence assembly and annotation quality. Our tools will quickly, accurately, and efficiently assemble and analyze large sequencing datasets, and provide advanced capabilities in a range of downstream research and precision medicine applications, such as tracking infectious disease outbreaks, using genetic information to guide drug selection in cancer care, and diagnosing the genetic causes of rare diseases. These tools will build on innovative data structures with low memory footprints, allowing rapid run times for large datasets. These data structures will include error-tolerant sequence representations, succinct sequence linkage graphs, and general-purpose sequence transformations. Our preliminary work on novel data structures and advanced scalable algorithms has produced encouraging results. Some of our tools are already prototyped and have demonstrated proof of principle.

No special research characteristics identified

This project does not include any of the advanced research characteristics tracked in our database.

Keywords
Assembly Assessment; Bioinformatics; De Novo Assembly; Genome Annotation