I am a fifth-year PhD candidate studying computational biology and machine learning in the Paul G. Allen School of Computer Science and Engineering at the University of Washington with Su-In Lee. I love working on the application of machine learning to personalized health. Over the next few decades I believe automated data analysis will lead to significant advances in our understanding and treatment of health and disease. Before UW I had the opportunity to study graph theory at Colorado State University, and led research projects at Numerica for several years.
My current work focuses on actionable machine learning in both basic biology and predictive medicine in the hospital. In both areas a combination of interpretable models and transparent visualizations of the learned structure is important. This has led to our development of broadly applicable methods and tools for interpreting complex machine learning models.
Open source software
- SHAP – A unified approach to explain the output of any machine learning model. Under certain assumptions it can be shown to be the optimal linear explanation of any model’s prediction. It includes an implementation of an exact polynomial-time algorithm for tree models such as random forests or gradient boosted trees, making it particularly useful for these types of models.
- ChromNet.jl – A network learning method that ingests BAM/BED files and other pre-processed data bundles (such as the one provided for all human ENCODE ChIP-seq data).
For a full list of open source packages see GitHub
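To illustrate the attribution idea behind the SHAP entry above, here is a minimal sketch of exact Shapley values computed by brute-force subset enumeration. It is not the SHAP package's API or its polynomial-time tree algorithm. The toy `model`, the input `x`, the all-zero `baseline`, and the helper names `shapley_values` and `value_fn` are all hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features.

    value_fn takes a frozenset of "present" feature indices and returns a number.
    The cost is exponential in n_features, so this is only usable for tiny inputs.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for a subset of this size: |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model: one additive term plus one interaction term.
def model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]          # input to explain (assumed for illustration)
baseline = [0.0, 0.0, 0.0]   # reference values used for "absent" features (assumed)


def value_fn(present):
    # Present features take their value from x, absent ones from the baseline.
    z = [x[j] if j in present else baseline[j] for j in range(len(x))]
    return model(z)


phi = shapley_values(value_fn, len(x))
print(phi)
```

For this toy model the attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. The enumeration grows exponentially with the number of features, which is why specialized algorithms matter for tree models.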
- ChromNet – An online network visualization of the chromatin network estimated from ENCODE ChIP-seq data, or of custom networks that users upload.
- S. Lundberg, B. Nair, M. Vavilala, M. Horibe, M. Eisses, T. Adams, D. Liston, D. Low, S. Newman, J. Kim, and S. Lee “Explainable machine-learning predictions for the prevention of hypoxaemia during surgery,” Nature Biomedical Engineering volume 2, pages 749–760 (2018). (Selected to be the cover article) (free ShareIt link)
- S. Lundberg, G. Erion, and S. Lee “Consistent Individualized Feature Attribution for Tree Ensembles,” pre-print.
- S. Lee, S. Celik, B. Logsdon, S. Lundberg, T. Martins, V. Oehler, E. Estey, C. Miller, S. Chien, J. Dai, and A. Saxena “A machine learning approach to integrate big data for precision medicine in acute myeloid leukemia,” in Nature Communications, 2018.
- S. Lundberg and S. Lee “A unified approach to interpreting model predictions,” NIPS 2017 (selected for oral presentation) (3 min overview video) (errata).
- G. Erion, H. Chen, S. Lundberg, and S. Lee. “Anesthesiologist-level forecasting of hypoxemia with only SpO2 data using deep learning,” in Neural Information Processing Systems (NIPS) 2017 Workshop ML4H: Machine Learning for Health
- H. Chen, S. Lundberg, and S. Lee. “Hybrid Gradient Boosting Trees and Neural Networks for Forecasting Operating Room Data,” in Neural Information Processing Systems (NIPS) 2017 Workshop ML4H: Machine Learning for Health
- N. Hiranuma, S. Lundberg, and S. Lee. “CloudControl: Leveraging many public ChIP-seq control experiments to better remove background noise,” in Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM, 2016.
- S. Lundberg and S. Lee “An unexpected unity among methods for interpreting model predictions,” presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems. (best paper award)
- S. Lundberg, W. Tu, B. Raught, L. Penn, M. Hoffman, and S. Lee “ChromNet: Learning the human chromatin network from all ENCODE ChIP-seq data,” in Genome Biology, 2016. (F1000Prime recommended)
- S. Lundberg, C. Calderon, and R. Paffenroth, “Detecting Clustered Chem/Bio Signals in Noisy Sensor Feeds Using Adaptive Fusion,” in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 8393, p. 1, 2012.
- R. Nong, R. Paffenroth, S. Lundberg, and W. Leed, “Method for Lossy Compression of Point Clouds with Pointwise Error Constraints,” Patent application filed 2012.
- C. Calderon, A. Jones, S. Lundberg, and R. Paffenroth, “A data-driven approach for processing heterogeneous categorical sensor signals,” Proceedings of SPIE, vol. 8137, p. 813704, 2011.
- B. Joeris, S. Lundberg, and R. McConnell, “O(m log n) split decomposition of strongly-connected graphs,” Discrete Applied Mathematics, vol. 158, no. 7, pp. 779–799, 2010.
- S. Lundberg, R. Paffenroth, and J. Yosinski, “Analysis of CBRN sensor fusion methods,” in Information Fusion (FUSION), 2010 13th Conference on, pp. 1–8, IEEE, 2010.
- A. Curtis, C. Izurieta, B. Joeris, S. Lundberg, and R. McConnell, “An implicit representation of chordal comparability graphs in linear time,” Discrete Applied Mathematics, vol. 158, no. 8, pp. 869–875, 2010.
- S. Lundberg, R. Paffenroth, and J. Yosinski, “Algorithms for Distributed Chemical Sensor Fusion,” Proceedings of SPIE, vol. 7698, p. 769806, 2010.
- S. Lundberg, “O(m log n) split decomposition of directed graphs,” Master’s thesis, Colorado State University, 2008.
- D. Moore, J. Stevens, S. Lundberg, and B. Draper, “Top down image segmentation using congealing and graph-cut,” in Pattern Recognition, 2008. ICPR 2008. 19th International Conference on, pp. 1–4, IEEE, 2008.