Deep learning accelerates designer gene therapy delivery vehicles

Deep learning accelerates designer capsids

Date: 17th February 2021

Gene therapy is set to transform the treatment of many genetic disorders, and adeno-associated viruses (AAVs) have become promising delivery vehicles for transferring therapeutic DNA into target cells. In fact, several already approved gene therapies are AAV-based. However, whilst there have been many efforts to engineer and improve new AAV variants, there is still an unmet clinical need for a more diverse catalogue of AAVs that can meet gene therapeutic challenges. Now, researchers have applied computational deep learning (DL) algorithms to design highly diverse capsid variants from AAVs that remain viable for packaging a DNA payload.

Current AAV capsids – the protein shell of the virus that encloses the genetic material – are restricted clinically as they lack homing abilities to target specific organs or tissues. In addition, they can evoke an immune response that produces neutralising antibodies, limiting subsequent gene therapy treatments. Such antibodies commonly arise if the patient has previously been exposed to comparable naturally occurring AAVs, which are widespread. Whilst efforts are being made to engineer AAV variants that can evade the immune system, currently available methods can only create a limited diversity of capsids, and most still resemble the naturally occurring AAV variants, known as serotypes.

Now, scientists from the Wyss Institute, Harvard Medical School, and Dyno Therapeutics, US, led by Eric Kelsic, George Church, and Lucy Colwell from the University of Cambridge, UK, and Google Research, US, have applied a deep learning approach to design highly diverse capsid variants from the AAV2 serotype. They generated 201,426 variants, of which over half yielded viable capsids and >57,000 exhibited much higher diversity than is seen in natural AAV serotype sequences.

The team based the research on the AAV2 capsid, which is the delivery vehicle for the first gene therapy to be approved by the US Food and Drug Administration for use in humans. As epitopes for neutralising antibodies can occur at several surface regions on the capsid, they reasoned that numerous engineered changes would have to be made in order to avoid neutralising serum. Therefore, they chose to focus on a highly immunogenic 28-amino-acid segment, making variable numbers of mutations.
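To make "number of mutations" concrete, here is a minimal sketch of one common way to count changes between a reference segment and a variant: the Levenshtein edit distance, which counts substitutions, insertions and deletions. Using it here is an assumption for illustration; the article does not say how the study counted mutations. The function name and both toy sequences are hypothetical and are not real AAV2 sequence.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimum number of single-residue substitutions, insertions or
    deletions needed to turn `reference` into `variant` (Levenshtein)."""
    # prev[j] = distance between the first i-1 residues of reference
    # and the first j residues of variant.
    prev = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        curr = [i]  # turning i reference residues into an empty string
        for j, var_res in enumerate(variant, start=1):
            cost = 0 if ref_res == var_res else 1
            curr.append(min(
                prev[j] + 1,         # delete a residue from the reference
                curr[j - 1] + 1,     # insert a residue into the reference
                prev[j - 1] + cost,  # keep or substitute a residue
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy sequences (NOT real capsid sequence):
toy_reference = "ACDEFGHIKL"
toy_variant = "ACDQFGHIKWL"  # one substitution (E to Q) and one insertion (W)

mutation_count = edit_distance(toy_reference, toy_variant)
print(mutation_count)
```

Because insertions are counted alongside substitutions, a variant can differ from the reference by more edits than the reference segment has positions.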

Using multiple design strategies, the team first generated smaller data sets on which they could train several machine learning models. High-throughput synthesis of mutated capsid sequences and in vitro viability testing were used to assess their overall approach. Once the pilot learning study was performed, the team moved on to create a diverse functional capsid library.

They used the DL algorithms to design >200,000 capsid variants, of which 110,689 produced viable AAVs. Of these, 57,348 surpassed the average diversity of natural AAV serotype sequences and contained between 12 and 29 mutations in the 28-amino-acid region. The mutations were combinations of substituted and additionally inserted amino acids. The result is the highest functional diversity of any capsid library produced to date.
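As a quick check on the proportions behind these figures, the short sketch below works out the shares from the three counts reported in this article. The variable names are illustrative, and the calculation assumes, as the wording suggests, that the 57,348 highly diverse variants are a subset of the 110,689 viable ones.

```python
# Counts as reported in the article.
designed = 201_426        # capsid variants designed
viable = 110_689          # variants that produced viable AAVs
diverse_viable = 57_348   # viable variants more diverse than natural serotypes


def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


viable_fraction = share(viable, designed)
diverse_of_designed = share(diverse_viable, designed)
diverse_of_viable = share(diverse_viable, viable)

print(f"Viable, of all designed:         {viable_fraction:.1%}")
print(f"Highly diverse, of all designed: {diverse_of_designed:.1%}")
print(f"Highly diverse, of viable:       {diverse_of_viable:.1%}")
```

On these numbers, roughly 55% of the designed variants were viable, and just over half of the viable variants exceeded the average diversity of natural serotypes.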

Conclusions and future applications

The team has created a vast capsid library, leveraging deep learning and high-throughput synthetic testing, to deliver an unprecedented deep diversification of an AAV capsid protein. The approach unlocks vast functional areas of previously inaccessible sequence space, and holds promise for the next generation of improved viral vectors and protein therapeutics.

There are many applications for improved delivery vectors, such as improved selectivity for target tissues, more efficient gene therapies, and reduced immunogenicity. Indeed, the Church lab also published last week an engineered ‘cloaked’ AAV that can evade the innate and adaptive immune response, creating safer and more effective tools for gene therapy.

By combining expert knowledge of AAVs with artificial intelligence (AI), researchers can design catalogues of capsids engineered in regions known to be involved in targeting specificity, increased expression or, as here, immunogenicity. These catalogues will produce the viral vehicles of the future.

The next milestone will be to develop and train models to simultaneously predict multiple phenotypes and to jointly optimise variants for numerous desirable properties, a task that will be substantially more challenging.

The work here highlights the vast predictive power of AI, which is starting to have a high impact on research and medical fields. AI has been used to predict risk factors for autism, early-stage breast cancer, COVID-19-associated pneumonia and cancer-killing drug combinations, and to evaluate anti-senescent drugs and multiregional dynamics of brain networks. The addition of an AI model to predict designer gene therapy vehicles will be a valuable tool in the synthetic biologist’s toolbox.


For more information please see the press release from the Wyss Institute


Bryant, D. H., A. Bashir, S. Sinai, N. K. Jain, P. J. Ogden, P. F. Riley, G. M. Church, L. J. Colwell and E. D. Kelsic (2021). “Deep diversification of an AAV capsid protein by machine learning.” Nature Biotechnology.