Measuring gene relatedness by machine learning

Date: 12th March 2019

The ability to construct biological networks that enable the prediction of novel genomic functions has far-reaching implications for basic research and drug development, whether in understanding basic developmental pathways and disease states or in creating synthetic life. One way to add to our understanding here is to measure the relatedness between a pair of genes, through either expression similarities (used for co-expression analysis and so-called conditional relatedness) or prior-knowledge-based similarities (derived from an analysis of published information on the genes of interest).
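As a rough illustration of what an expression similarity can look like, the sketch below computes the Pearson correlation between the expression profiles of two genes measured across the same samples. The choice of Pearson correlation is an assumption made for illustration (the measure used in the cited work is not stated here), and the gene names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels for three genes across five samples.
gene_a = [2.0, 4.0, 6.0, 8.0, 10.0]
gene_b = [1.0, 2.0, 3.0, 4.0, 5.0]  # rises together with gene_a
gene_c = [5.0, 4.0, 3.0, 2.0, 1.0]  # falls as gene_a rises

print(pearson(gene_a, gene_b))
print(pearson(gene_a, gene_c))
```

A value near 1 suggests the two genes are co-expressed, while a value near -1 suggests opposing expression. On its own, a score like this says nothing about what is already known of the genes, which is the gap the prior-knowledge-based similarities are meant to cover.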

Both methods, however, have their own strengths and flaws, the latter including high false-positive rates. To overcome these, researchers in the US and China have turned to machine learning to develop a more robust method known as Multi-Feature Relatedness (MFR), a novel machine learning model that leverages both expression similarities and prior-knowledge-based similarities. When subjected to cross-validation, MFR outperformed the other models currently in use. Furthermore, it was used to predict metabolic pathways in four cancer types that are affected by an increase in glutamine and glutamate metabolism (which is elevated in many types of cancer), with MFR predicting the largest number of pathways.

Plans are already in place to improve this model further by integrating deep learning models, with the aim of producing an even more accurate and robust system.

Wang, Y., S. Yang, J. Zhao, W. Du, Y. Liang, C. Wang, F. Zhou, Y. Tian and Q. Ma (2019). “Using Machine Learning to Measure Relatedness Between Genes: A Multi-Features Model.” Scientific Reports 9(1): 4192.

https://doi.org/10.1038/s41598-019-40780-7