Truyen Tran

Deep Learning for Biomedical Discovery and Data Mining
A tutorial @PAKDD18, Melbourne, June 2018.
Slides (Part I; Part II)
Abstract: This tutorial has two goals: to give the general PAKDD audience knowledge and materials about a promising venture for KDD research, the intersection between deep learning and biomedicine, and to give the deep learning community relatively new, high-impact research problems within biomedicine. The tutorial introduces the state of the field for deep learning and argues that biomedicine is an ideal data-intensive domain. It gives a brief review of deep learning, covering classic neural architectures (feedforward, recurrent and convolutional nets) and more advanced topics including CapsNet, memory-augmented neural nets (MANN), and models for graph data. Two major subtopics of genomics are covered: nanopore sequencing (converting electrical signals into DNA character sequences) and genomics modeling (making sense of DNA sequences for multiple biological processes). For healthcare, the coverage is on data mining of Electronic Medical Records, where two main problems are considered: modeling time series and predicting mid-term health trajectories. The tutorial then covers recent advances in data-efficient methods, namely few-shot learning and deep generative models (RBM, VAE and GAN), and describes how to apply these advances to drug design. It closes with a future outlook, over a 5-year horizon and beyond, on the joint venture of deep learning and biomedicine.
Prerequisite: the tutorial does not require detailed prior knowledge of biomedicine or deep learning, but basic familiarity with machine learning is assumed.
Outline:
Part I
Topic 1: Introduction (20 mins)
Topic 2: Brief review of deep learning (30 mins)
Classic architectures
Capsules & graphs
Memory & attention
Topic 3: Genomics (30 mins)
Nanopore sequencing
Genomics modelling
QA (10 mins)
Part II

Topic 4: Healthcare (40 mins)
Time series (regular & irregular)
EMR analysis: Trajectories prediction
EMR analysis: Sequence generation
Topic 5: Data efficiency methods (40 mins)
Few-shot learning
Generative models
Unsupervised learning of drugs
Topic 6: Future outlook
QA (10 mins)

References


Genomics & drug design

Altae-Tran, Han, et al. "Low Data Drug Discovery with One-Shot Learning." ACS central science 3.4 (2017): 283-293.
Alipanahi, Babak, et al. "Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning." Nature biotechnology 33.8 (2015): 831-838.
Angermueller, Christof, et al. "Deep learning for computational biology." Molecular systems biology 12.7 (2016): 878.
Boža, Vladimír, Broňa Brejová, and Tomáš Vinař. "DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads." PloS one 12.6 (2017): e0178751.
Ching, Travers, et al. "Opportunities And Obstacles For Deep Learning In Biology And Medicine." bioRxiv (2018): 142760.
Duvenaud, David K., et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015.
Eser, Umut, and L. Stirling Churchman. "FIDDLE: An integrative deep learning framework for functional genomic data inference." bioRxiv (2016): 081380.
Gilmer, Justin, et al. "Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017).
Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven continuous representation of molecules." ACS Central Science (2016).
Gupta, Anvita, et al. "Generative Recurrent Networks for De Novo Drug Design." Molecular Informatics (2017).
Jin, W., Barzilay, R., & Jaakkola, T. (2018). "Junction Tree Variational Autoencoder for Molecular Graph Generation". ICML'18.
Kadurin, A., Aliper, A., Kazennov, A., Mamoshina, P., Vanhaelen, Q., Khrabrov, K., & Zhavoronkov, A. (2017). "The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology". Oncotarget, 8(7), 10883.
Kadurin, A., Nikolenko, S., Khrabrov, K., Aliper, A., & Zhavoronkov, A. (2017). "druGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico". Molecular pharmaceutics, 14(9), 3098-3104.
Kien Do, et al. "Attentional Multilabel Learning over Graphs: A message passing approach." arXiv preprint arXiv:1804.00293 (2018).
Kusner, Matt J., Brooks Paige, and José Miguel Hernández-Lobato. "Grammar Variational Autoencoder." arXiv preprint arXiv:1703.01925 (2017).
Lanchantin, Jack, Ritambhara Singh, and Yanjun Qi. "Memory Matching Networks for Genomic Sequence Classification." arXiv preprint arXiv:1702.06760 (2017).
Leung, Michael KK, et al. "Deep learning of the tissue-regulated splicing code." Bioinformatics 30.12 (2014): i121-i129.
Olivecrona, Marcus, et al. "Molecular De Novo Design through Deep Reinforcement Learning." arXiv preprint arXiv:1704.07555 (2017).
Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X-ray structure of dopamine transporter elucidates antidepressant mechanism." Nature 503.7474 (2013): 85-90.
Pham, Trang et al. "Graph Classification via Deep Learning with Virtual Nodes." Third Representation Learning for Graphs Workshop (ReLiG 2017).
Pham, Trang, Truyen Tran, and Svetha Venkatesh. "Graph Memory Networks for Molecular Activity Prediction." ICPR'18.
Quang, Daniel, and Xiaohui Xie. "DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences." Nucleic acids research 44.11 (2016): e107-e107.
Romero, Adriana, et al. "Diet Networks: Thin Parameters for Fat Genomics." arXiv preprint arXiv:1611.09340 (2016).
Roses, Allen D. "Pharmacogenetics in drug discovery and development: a translational perspective." Nature reviews Drug discovery 7.10 (2008): 807-817.
Segler, Marwin HS, et al. "Generating focussed molecule libraries for drug discovery with recurrent neural networks." arXiv preprint arXiv:1701.01329 (2017).
Segler, Marwin, Mike Preuß, and Mark P. Waller. "Towards 'AlphaChem': Chemical Synthesis Planning with Tree Search and Deep Neural Network Policies." arXiv preprint arXiv:1702.00020 (2017).
Simonovsky, M., & Komodakis, N. (2018). "GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders". arXiv preprint arXiv:1802.03480.
Stoiber, Marcus, and James Brown. "BasecRAWller: Streaming Nanopore Basecalling Directly from Raw Signal." bioRxiv (2017): 133058.
Teng, Haotian, et al. "Chiron: Translating nanopore raw signal directly into nucleotide sequence using deep learning." GigaScience 7.5 (2018): giy037.

Healthcare

Acharya, U. Rajendra, et al. "Application of deep convolutional neural network for automated detection of myocardial infarction using ECG signals." Information Sciences 415 (2017): 190-198.
Che, Zhengping, et al. "Recurrent neural networks for multivariate time series with missing values." arXiv preprint arXiv:1606.01865 (2016).
Choi, Edward, et al. "Generating Multi-label Discrete Electronic Health Records using Generative Adversarial Networks." arXiv preprint arXiv:1703.06490 (2017).
Choi, Edward, et al. "Doctor AI: Predicting clinical events via recurrent neural networks." Machine Learning for Healthcare Conference. 2016.
Choi, Edward, et al. "GRAM: Graph-based attention model for healthcare representation learning." Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2017.
Choi, Edward, et al. "RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism." Advances in Neural Information Processing Systems. 2016.
Do, Kien et al. "Learning Recurrent Matrix Representation." Third Representation Learning for Graphs Workshop (ReLiG 2017), also: arXiv preprint arXiv:1703.01454.
Esteva, Andre, et al. "Dermatologist-level classification of skin cancer with deep neural networks." Nature 542.7639 (2017): 115-118.
Hung Le, Truyen Tran, and Svetha Venkatesh. "Dual Control Memory Augmented Neural Networks for Treatment Recommendations." PAKDD'18.
Hung Le, Truyen Tran, and Svetha Venkatesh. "Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning." KDD'18.
Lipton, Zachary C., et al. "Learning to diagnose with LSTM recurrent neural networks." arXiv preprint arXiv:1511.03677 (2015).
Miotto, Riccardo, et al. "Deep patient: An unsupervised representation to predict the future of patients from the electronic health records." Scientific reports 6 (2016): 26094.
Nguyen, Phuoc. "Deep Learning to Attend to Risk in ICU", IJCAI'17 Workshop on Knowledge Discovery in Healthcare II: Towards Learning Healthcare Systems (KDH 2017).
Nguyen, Phuoc et al. "Deepr: A Convolutional Net for Medical Records". IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 22–30, Jan. 2017, Doi: 10.1109/JBHI.2016.2633963
Phuoc Nguyen, Truyen Tran, and Svetha Venkatesh. "Resset: A Recurrent Model for Sequence of Sets with Applications to Electronic Medical Records." IJCNN (2018).
Nguyen, Tu et al. "Tensor-variate Restricted Boltzmann Machines", AAAI 2015.
Pham, Trang et al. "Predicting healthcare trajectories from medical records: A deep learning approach". Journal of Biomedical Informatics, April 2017, DOI: 10.1016/j.jbi.2017.04.001.
Tran, Truyen. "Living in the future: AI for healthcare". Blog, Feb 2017.
Zhang et al. "Leap: Learning to prescribe effective and safe treatment combinations for multimorbidity." KDD'17.

Deep learning fundamentals

Goodfellow, Ian et al., "Generative Adversarial Nets". NIPS, 2014.
Graves, Alex et al. "Hybrid computing using a neural network with dynamic external memory", Nature, 2016.
Hochreiter, Sepp, et al. "Learning to learn using gradient descent". In Artificial Neural Networks (ICANN) 2001, pp. 87–94. Springer, 2001.
Kingma, Diederik P., and Max Welling. "Auto-encoding variational Bayes." arXiv preprint arXiv:1312.6114 (2013).
Koch, Gregory et al. "Siamese neural networks for one-shot image recognition." ICML Deep Learning Workshop. Vol. 2. 2015.
Kumar, Ankit, et al. "Ask me anything: Dynamic memory networks for natural language processing." International Conference on Machine Learning. 2016.
Mishra, Nikhil, et al. "Meta-Learning with Temporal Convolutions." arXiv preprint arXiv:1707.03141 (2017).
Santoro, Adam, et al. "Meta-learning with memory-augmented neural networks." International Conference on Machine Learning. 2016.
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.
Wagstaff, K. L. (2012, June). "Machine learning that matters". In Proceedings of the 29th International Conference on Machine Learning (pp. 1851-1856). Omnipress.