Hi! I’m a Ph.D. student in EECS at UC Berkeley, where I’m fortunate to be advised by Prof. Yi Ma. I’m affiliated with BAIR and supported by a UC Berkeley College of Engineering fellowship. Prior to my Ph.D., I completed a B.A. in CS and an M.S. in EECS, also at UC Berkeley.
My research interests broadly lie in simplifying deep learning. More specifically, I’m interested in developing theory to understand, improve, and simplify empirical deep learning methodology. I work on this problem through the following research threads:
- understanding the latent representations learned by modern deep neural networks;
- connecting modern deep learning practice to classical signal processing and statistics;
- and leveraging the resulting conceptual insights to design interpretable, efficient, and principled learning algorithms.
I’m particularly interested in settings where data is high-dimensional yet richly structured, such as computer vision, natural language processing, and multi-modal contexts.
In my free time, I play basketball, chess, and TFT, and read sci-fi novels.
Notes for undergraduate and master's students.
Note 1: I'm happy to chat about my research or general advising. Please send me an email and we can work out a time.
Note 2: If you are interested in research collaboration, please send me an email with your background and specific interests (the more detail, the better). The recommended time commitment is at least 15 hours per week. Unfortunately, my schedule is currently tight and generally does not permit consistent long-term mentoring of younger students, so some degree of self-sufficiency is highly valued. For a fruitful collaboration, it is best to have the technical knowledge to read and understand deep learning papers, especially theory-oriented work. Thank you for your understanding.
- (January 2024) Our paper Masked Completion via Structured Diffusion with White-Box Transformers, which develops a connection between iterative denoising in diffusion models and representation learning in transformer-like deep networks, and uses it to construct a performant, efficient, and interpretable transformer-like autoencoder, was accepted to ICLR 2024. The contents of this paper are also included in the comprehensive pre-print below.
- (November 2023) New comprehensive pre-print reviewing our “White-Box Transformers” line of work: deriving efficient, interpretable, and performant transformer-like architectures from first principles in information theory and signal processing.
- (November 2023) Our papers Emergence of Segmentation with Minimalistic White-Box Transformers, Closed-Loop Transcription via Convolutional Sparse Coding, and Masked Completion via Structured Diffusion with White-Box Transformers were accepted to CPAL 2024.
- (October 2023) Our paper Emergence of Segmentation with Minimalistic White-Box Transformers was accepted to NeurIPS 2023 XAIA Workshop.
- (September 2023) Our paper White-Box Transformers via Sparse Rate Reduction, proposing an interpretable and parameter-efficient transformer-like architecture derived from first principles, was accepted to NeurIPS 2023.
- (August 2023) Started my Ph.D. program in EECS at UC Berkeley!