The coronavirus crisis has brought an unprecedented level of worldwide scientific collaboration. Artificial Intelligence (AI) based on Neural Networks (NNs) and Deep Learning [DL1] can help to fight Covid-19 in many ways. The basic principle is simple. Teach NNs to detect patterns in data from viruses, patients, and other sources. Use those NNs to predict future consequences of possible actions. Act to minimize damage. Numerous examples were discussed at the recent ELLIS workshops on Machine Learning against the virus [EL1] [EL2]. (Disclaimer: I am an ELLIS Fellow.) Here are my very incomplete and subjective notes on various relevant approaches.
1. Track populations through pattern recognition [EL2]. Example: Peer-to-peer Bluetooth apps on smartphones may help people avoid potentially dangerous contacts (not so much AI needed for that). More challenging: use deep NNs to recognise faces or gaits of persons and their contacts in videos. Detect mass behavior and predict outbreaks and other consequences to build early warning systems (compare Covid-19 forecasting challenge [KAG1]). This may be harder in countries with strict privacy laws [GEO] [DEC]. Identify groups at risk and predict results of therapies. Predict future demand for limited resources (ventilators, doctors) to optimize logistics [EL1]. Sequence virus genomes and detect their cities of origin [EL1]; predict where similar genomes will show up next. Build causal models of the spread of the disease [EL2]. Use NNs to obtain improved epidemiological models from data [DER].
2. Observe single patients. Teach NNs to monitor bio signals, heart rates (e.g., from smart watches), breathing [CHO], coughs [IMR] [EL2], other signals, e.g., [HYL]. Detect & predict asymptomatic cases in time. Analyze X-ray [MAG] [EL2] and other types of images; diagnose pathologies. (The first medical imaging contest won by NNs dates back to 2012 [MED] [TOP1] [GPUCNN5] when compute was almost 100 times more expensive than today.)
3. Partially automate drug design [EL1] and use AI to advance the field of immunology. Find molecules that dock on the (few) proteins of the simple virus to inhibit its activity (like antibodies). E.g., predict folding of proteins to find docking stations. Already 13 years ago, when compute was almost 1000 times more expensive than today, Long Short-Term Memory (LSTM) excelled at protein folding prediction [HO1]. See also Google DeepMind's recent computational predictions of protein structures associated with Covid-19 [DMCO].
Teach NNs relevant chemistry and molecular biology. 1. Indirectly: use NN-based Natural Language Processing to mine scientific articles, e.g., [KAG2]. 2. Directly: Feedforward NNs or LSTM [DEC] [GAU] or Graph NNs (since 1995-96 [GOL] [KU]) can learn from examples to model chemical reactions: input ingredients plus conditions (temperature, catalysts, etc.) yield output molecules with certain properties. Then work the NN backwards: I want a substance that does this - which ingredients do I need? Then try the NN's suggestions in the real world. This may save lots of time & physical resources. NNs are sometimes good enough to replace wet lab tests (assays) [EL1]. NNs won the Merck Molecular Activity Challenge [JMA] [MER] and the Tox21 data challenge on predicting the toxicity of substances [TOX]. NNs can design new molecules [SEG] [GOM] and find the antibody needle in an antibody repertoire haystack [WID]. Ligand-based approach: given a molecule, an NN can predict to which proteins it will bind [UNT].
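One common way to "work the NN backwards" is to keep the trained weights fixed and run gradient descent on the inputs until the predicted property matches the target. Here is a minimal sketch of that idea. Everything in it is an assumption for illustration: the weights W and B are a hypothetical stand-in for a trained forward model, the three inputs are imagined as two ingredient amounts and one condition rescaled to [0, 1], and the names forward, input_gradient and invert are made up. A real system would use a trained NN and a proper molecular representation.

```python
import math

# Hypothetical stand-in for a trained forward model (NOT a real chemistry model).
# Inputs: two ingredient amounts and one condition, each rescaled to [0, 1].
# Output: one predicted property in (-1, 1).
W = [0.8, -0.5, 0.3]
B = 0.1

def forward(x):
    """Predict the property for input vector x."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return math.tanh(z)

def input_gradient(x, target):
    """Gradient of 0.5 * (forward(x) - target)**2 with respect to the inputs."""
    y = forward(x)
    dy_dz = 1.0 - y * y  # derivative of tanh
    return [(y - target) * dy_dz * w for w in W]

def invert(target, x0, lr=0.5, steps=500):
    """Gradient descent on the inputs; the model weights stay fixed."""
    x = list(x0)
    for _ in range(steps):
        g = input_gradient(x, target)
        # Clip each input to its admissible range after the update.
        x = [min(1.0, max(0.0, xi - lr * gi)) for xi, gi in zip(x, g)]
    return x

suggestion = invert(target=0.6, x0=[0.5, 0.5, 0.5])
print("suggested inputs:", suggestion)
print("predicted property:", forward(suggestion))
```

The result is only a suggestion from the model. It may exploit model errors, which is why the next step is to try it in the real world.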
Typical drug discovery & development pipeline: 6+ years for selecting 5 out of 10,000 compounds, 7 years of clinical trials, 1+ years for approval [EL1]. Speed this up by fast virtual screening: Use a large database such as ZINC which contains descriptions of 1 billion molecules. Pipe the data through a deep NN called SmilesLSTM [SMI] to suggest 30,000 top scoring molecules as SARS-CoV-2 inhibitors [HOF]. Test in the wet lab. Apply this approach also to drugs on the market, to reduce costly clinical trials. See the JEDI challenge of 9 April 2020: A billion molecules against Covid-19 [JEDI]. See [RCO] for many additional resources to help with Covid-19 research.
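The screening step above boils down to: score every molecule in a large library, keep the k best, send those to the wet lab. Here is a minimal sketch of that selection step. It is not the method of [HOF] or [SMI]. The function toy_score is a placeholder with no chemical meaning (a real pipeline would call a trained model there), and the names screen and library as well as the five SMILES strings are chosen only for illustration.

```python
import heapq

def toy_score(smiles):
    """Placeholder scorer with NO chemical meaning: fraction of N/O characters.

    A real pipeline would call a trained activity model here.
    """
    if not smiles:
        return 0.0
    return sum(ch in "NOno" for ch in smiles) / len(smiles)

def screen(smiles_iter, score, k):
    """Return the k highest-scoring (score, smiles) pairs, best first.

    heapq.nlargest keeps only k candidates in memory, so the library can be
    streamed from disk instead of being loaded at once.
    """
    return heapq.nlargest(k, ((score(s), s) for s in smiles_iter))

# Tiny illustrative library of SMILES strings.
library = ["CCO", "CC(=O)O", "c1ccccc1", "NCCO", "CCN(CC)CC"]
top = screen(library, toy_score, k=2)
for value, smiles in top:
    print(f"{smiles}\t{value:.3f}")
```

With the numbers quoted above, k = 30,000 out of 1 billion molecules is a shortlist of 0.003% of the library.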
[EL1] ELLIS against Covid-19 (weekly online workshop series). 1 April 2020: Machine learning applications in Covid-19 research. Contributions by B. Schoelkopf, O. Stegle, N. Oliver, R. Neher, C. Mason, G. Klambauer, Y. Bengio, M. v. d. Schaar, F. Hamprecht, D. Rishi. YouTube video. (Of particular interest for drug discovery through neural nets: Klambauer's contribution at 1:19.)
[CHO] Y. Cho, N. Bianchi-Berthouze, S. J. Julier (2017). DeepBreath: Deep Learning of Breathing Patterns for Automatic Stress Recognition using Low-Cost Thermal Imaging in Unconstrained Settings. Preprint arXiv:1708.06026v1 [cs.HC].
[MAG] H. S. Maghdid, A. T. Asaad, K. Z. Ghafoor, A. S. Sadiq, M. K. Khan (2020). Diagnosing COVID-19 pneumonia from X ray and CT images using deep learning and transfer learning algorithms. Preprint arXiv:2004.00038.
[IMR] A. Imran, I. Posokhova, H. N. Qureshi, U. Masood, S. Riaz, K. Ali, C. N. John, M. Nabeel (2020). AI4COVID-19: AI Enabled Preliminary Diagnosis for COVID-19 from Cough Samples via an App. Preprint arXiv:2004.01275.
[HYL] S. L. Hyland, M. Faltys, M. Hueser, X. Lyu, T. Gumbsch, C. Esteban, C. Bock, M. Horn, M. Moor, B. Rieck, M. Zimmermann, D. Bodenham, K. Borgwardt, G. Raetsch, T. M. Merz (2020). Early prediction of circulatory failure in the intensive care unit using machine learning. Nature Medicine, vol. 26, pages 364-373, 2020.
[KAG1] Kaggle competition on COVID19 Global Forecasting. Forecast daily COVID-19 spread in regions around world (2020).
[KAG2] Kaggle: COVID-19 Open Research Dataset Challenge (CORD-19). An AI challenge with AI2, CZI, MSR, Georgetown, NIH & The White House.
[MER] Kaggle: Merck Molecular Activity Challenge (2012). Winners: G. Dahl, R. Salakhutdinov, N. Jaitly, C. Jordan-Squire, G. Hinton.
[JMA] J. Ma, R. P. Sheridan, A. Liaw, G. E. Dahl, V. Svetnik (2015). Deep neural nets as a method for quantitative structure-activity relationships. Journal of Chemical Information and Modeling, 55(2), 263-274.
[DMCO] Google DeepMind (2020): Computational predictions of protein structures associated with COVID-19.
[GOM] R. Gomez-Bombarelli, J. N. Wei, D. Duvenaud, J. M. Hernandez-Lobato, B. Sanchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams, A. Aspuru-Guzik (2018). Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 4(2), 268-276.
[UNT] T. Unterthiner, A. Mayr, G. Klambauer, M. Steijaert, J. K. Wegner, H. Ceulemans, S. Hochreiter (Dec 2014). Deep learning as an opportunity in virtual screening. Proc. of the deep learning workshop at NIPS 2014 (Vol. 27, pp. 1-9).
[SMI] A. Mayr, G. Klambauer, T. Unterthiner, M. Steijaert, J. K. Wegner, D. Clever, H. Ceulemans, S. Hochreiter (2018). Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chemical Science, 9(24), 5441-5451, 2018.
[GAU] P. Schwaller, T. Gaudin, D. Lanyi, C. Bekas, T. Laino (2018). Found in Translation: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chemical Science, 9(28), 6091-6098.
[WID] M. Widrich, B. Schaefl, M. Pavlovic, G. K. Sandve, S. Hochreiter, V. Greiff, G. Klambauer (2020). DeepRC: Immune repertoire classification with attention-based deep massive multiple instance learning. bioRxiv.
[HOF] M. Hofmarcher, A. Mayr, E. Rumetshofer, P. Ruch, P. Renz, J. Schimunek, P. Seidl, A. Vall, M. Widrich, S. Hochreiter, G. Klambauer (2020). Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks. Available at SSRN 3561442.
[GOL] C. Goller & A. Küchler (1996). Learning task-dependent distributed representations by backpropagation through structure. Proceedings of International Conference on Neural Networks (ICNN'96). Vol. 1. IEEE, 1996. Based on TR AR-95-02, TU Munich, 1995.
[KU] A. Küchler & C. Goller (1996). Inductive learning in symbolic domains using structure-driven recurrent neural networks. Lecture Notes in Artificial Intelligence, vol 1137. Springer, Berlin, Heidelberg.
[RCO] Reddit Machine Learning (2020). Resources and channels to help with COVID-19 research.
[JEDI] JEDI 2020 Grand Challenge: Billion Molecules Against Covid-19. Press release.
[DL1] J. Schmidhuber (2015). Deep Learning in neural networks: An overview. Neural Networks, 61, 85-117.
[MED] J. Schmidhuber (2012). First Deep Learner to win a medical imaging contest (and first to win a contest on object detection in large images) by D. Ciresan et al. (2011-2012).
[GPUCNN5] J. Schmidhuber (March 2017). History of computer vision contests won by deep CNNs on GPU. [How D. Ciresan et al. at the Swiss AI Lab IDSIA used deep and fast GPU-based CNNs to win four important computer vision competitions 2011-2012 before others started using similar approaches.]
[DEC] J. Schmidhuber (02/20/2020). The 2010s: Our Decade of Deep Learning / Outlook on the 2020s. The relevant Sec. 6 compares super-organisms such as cities & states & companies to biological multicellular organisms whose individual cells enjoy little privacy. It asks: Are surveillance and loss of privacy inevitable consequences of increasingly complex societies? Especially during pandemics, some nations may find it easier than others to become more complex kinds of super-organisms at the expense of the privacy rights of their constituents.
[GEO] J. Schmidhuber (4/14/2020). Coronavirus Geopolitics. Germs and the Rise of Empires. Pros of Surveillance States. Chinese Marshall Plan? One century ago, the "Spanish flu" emerged, apparently from America. A few months ago, Covid-19 emerged, apparently from China. Long before that, germs and viruses contributed to the rise and fall of empires.