Frédéric Ducongé

Improvement of aptamers using PATTERNITY-seq (high-throughput analysis of sequence patterns and paternity relationship between them).

 

Nam Nguyen Quang1,2,3, Clément Bouvier1,2,3, Benoit Lelandais1,2,3 and Frédéric Ducongé1,2,3.

 

1, The Neurodegenerative Diseases Laboratory (LMN), CEA-CNRS UMR 9199, Université Paris-Saclay, Fontenay-aux-Roses, France.

2, Molecular Imaging Research Center (MIRCen), Institute of Biomedical Imaging (I²BM), CEA SCIENCES, Fontenay-aux-Roses, France.

frederic.duconge@cea.fr

 

 

Aptamers are usually identified by a molecular evolution process named SELEX. This process can be described as Darwinian evolution carried out in a test tube. A random population of nucleic acid species has to survive a selection pressure (binding to a target) before artificial reproduction by enzymatic amplification. Repeating the selection and reproduction steps should gradually enrich the population in sequences whose inherited traits (3D structures) are best adapted to the selection pressure. Furthermore, these sequences can evolve by mutation during the reproduction steps, leading to the appearance of new sequences that bind the target even more strongly than their parents.

To identify aptamers, a small part of the population (about a hundred sequences) is usually cloned and Sanger-sequenced at the end of SELEX. This method has successfully identified the vast majority of aptamers to date, but it requires enough rounds to amplify aptamers to a level (more than 1% of the pool) that can be reliably detected by such low-throughput sequencing. Next-generation sequencing now makes it possible to analyse a much larger part of the population (millions of sequences). Most importantly, it allows, for the first time, millions of sequences from the pool to be analysed at each round of selection. Such data have already been used to improve the classical analysis of sequence patterns and to reduce the number of SELEX rounds. However, they also open the door to a more complete phylogenetic analysis of SELEX, which can be used to identify the best possible aptamer from a structural class.
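The link between sequencing depth and the roughly 1% threshold can be illustrated with a simple sampling calculation. The sketch below assumes that reads are independent draws from the pool (a binomial model) and that a sequence must be seen at least twice to stand out; both assumptions, and the function name, are ours and are used for illustration only.

```python
from math import comb

def detection_probability(freq, n_reads, min_copies=2):
    """Probability of sampling a sequence at least `min_copies` times.

    Assumes each read is an independent draw from the pool (binomial model).
    `min_copies=2` is an illustrative choice, not a published criterion.
    """
    # Probability of observing fewer than `min_copies` copies
    p_fewer = sum(
        comb(n_reads, k) * freq**k * (1 - freq) ** (n_reads - k)
        for k in range(min_copies)
    )
    return 1.0 - p_fewer

# About a hundred Sanger clones, sequence present at 1% of the pool
sanger = detection_probability(0.01, 100)

# One million reads, sequence present at 0.01% of the pool
ngs = detection_probability(0.0001, 1_000_000)

print(f"Sanger, 1% of the pool:  {sanger:.3f}")
print(f"NGS, 0.01% of the pool:  {ngs:.3f}")
```

Under these assumptions, a sequence at 1% of the pool is seen twice or more in only about a quarter of hundred-clone samples, whereas a sequence a hundred times rarer is almost certain to be detected among a million reads.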

Here we developed a bioinformatics pipeline for high-throughput analysis of several patterns: the evolution of primary sequences, the clustering of sequences into families, the predicted paternity relationships between variants within these families, and the evolution of predicted sub-structures shared between sequences or families. Analysing these patterns allows us to build a new kind of dendrogram that can be used to identify aptamers and to predict the structural motifs involved in binding. Additionally, it can provide important knowledge about the influence of selection variables (e.g. target concentration, incubation time, washes). We validated our approach by re-analysing previously published data from a cell-SELEX experiment and demonstrated that we were able to identify better aptamers against Annexin-A2.
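As a rough illustration of two of these patterns, the sketch below groups sequences into families and predicts a putative parent for each variant. It is not the PATTERNITY-seq implementation: the Hamming-distance threshold, the greedy clustering, the rule that a parent must be more abundant than its child, the function names and the toy reads are all assumptions made for this example.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def build_families(counts, max_dist=1):
    """Greedy clustering, from the most to the least abundant sequence.

    A sequence joins the first family that contains a member of the same
    length within `max_dist` mismatches; otherwise it seeds a new family.
    """
    families = []
    for seq, _ in counts.most_common():
        for fam in families:
            if any(len(seq) == len(m) and hamming(seq, m) <= max_dist for m in fam):
                fam.append(seq)
                break
        else:
            families.append([seq])
    return families

def predict_parents(family, counts, max_dist=1):
    """Assign each variant a putative parent (hypothetical rule).

    The parent is the most abundant family member that lies within
    `max_dist` mismatches and is more abundant than the variant itself.
    """
    parents = {}
    for seq in family:
        candidates = [
            m for m in family
            if m != seq and counts[m] > counts[seq] and hamming(seq, m) <= max_dist
        ]
        parents[seq] = max(candidates, key=counts.__getitem__) if candidates else None
    return parents

# Toy reads, invented for illustration
reads = ["ACGTAC"] * 5 + ["ACGTAT"] * 3 + ["GGGCCC"] * 2 + ["ACGTTT"]
counts = Counter(reads)

families = build_families(counts)
parents = predict_parents(families[0], counts)

for child, parent in parents.items():
    print(child, "<-", parent)
```

In this toy example the three related sequences form one family in which ACGTAC is the founder, ACGTAT derives from it, and ACGTTT derives from ACGTAT. A real analysis would also need to handle insertions and deletions (for example with an edit distance) and could use the round in which each variant first appears to orient the relationships.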