32. MONOCLONAL ANTIBODY SEQUENCING: A DE NOVO-MEDIATED DATABASE APPROACH
Department: Computer Science & Engineering
Faculty Advisor(s):
Vineet Bafna
Primary Student
Name: Natalie E Castellana
Email: ncastell@ucsd.edu
Phone: 858-534-8855
Grad Year: 2011
Abstract
An antibody's preference and efficiency in detecting and removing the antigens it encounters depend heavily on its amino acid sequence. Often, an antibody's sequence can be determined early in its lifetime by sequencing the DNA of the source cell line. However, few direct protein sequencing options exist when the source is unavailable or when antibody integrity must be verified. Somatic hypermutation and chemical post-translational modifications confound tandem mass spectrometry-based peptide identification methods, both database search tools and de novo sequencing. We present a hybrid approach that draws on the strengths of both methods, combining a database search against known antibody sequences with guided de novo sequencing of regions that cannot be annotated using the database.
In the database phase of the method, regions of the antibody consistent with the germline sequence are identified using InsPecT. The protein database contains all germline immunoglobulin genes, as well as a splice graph representation that permits the identification of peptides spanning splice junctions. The regions identified in the database phase are then used as anchors to guide subsequent de novo sequencing. We demonstrate the accuracy of this method by sequencing an antibody designed against the B- and T-lymphocyte attenuator (BTLA) molecule. The efficiency and throughput of our method greatly exceed those of Edman degradation, the traditional method of antibody sequencing, reducing days of work to a few hours. We believe this method can be generalized to a variety of other applications, such as the discovery of splice junctions and fusion genes.
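The anchor idea can be illustrated with a toy sketch. This is not the authors' implementation and does not use InsPecT; it only assumes that the database phase yields anchor intervals on a template sequence, and that the mass of an unannotated gap between two anchors is known. The function names (unannotated_gaps, fill_gap), the reduced residue table, and the mass tolerance are hypothetical choices made for illustration.

```python
# Toy illustration of anchor-guided gap filling (hypothetical, not the authors' code).

# Monoisotopic residue masses in Da for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "T": 101.04768,
    "L": 113.08406,
    "D": 115.02694,
    "K": 128.09496,
    "E": 129.04259,
    "F": 147.06841,
}


def unannotated_gaps(template_length, anchors):
    """Return half-open (start, end) intervals not covered by any anchor.

    `anchors` is a list of half-open (start, end) intervals, standing in
    for the regions annotated during the database phase.
    """
    gaps = []
    cursor = 0
    for start, end in sorted(anchors):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < template_length:
        gaps.append((cursor, template_length))
    return gaps


def fill_gap(gap_mass, max_len, tol=0.02):
    """Enumerate residue strings whose summed mass is within `tol` of `gap_mass`.

    Exhaustive search, so only practical for short gaps; a real de novo
    tool would score candidates against the observed fragment ions.
    """
    results = []

    def extend(prefix, mass):
        if prefix and abs(mass - gap_mass) <= tol:
            results.append(prefix)
        # Stop when the length limit is reached or the mass already overshoots.
        if len(prefix) >= max_len or mass > gap_mass + tol:
            return
        for residue, residue_mass in RESIDUE_MASS.items():
            extend(prefix + residue, mass + residue_mass)

    extend("", 0.0)
    return results


if __name__ == "__main__":
    # Two anchors on a 15-residue template leave two regions to sequence de novo.
    print(unannotated_gaps(15, [(0, 5), (8, 12)]))
    # Candidate fillers for a gap whose mass equals that of G + A.
    print(sorted(fill_gap(RESIDUE_MASS["G"] + RESIDUE_MASS["A"], max_len=2)))
```

The mass constraint alone leaves the residue order ambiguous, which is why candidates would still need to be ranked against the spectrum itself.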