About Me

Hi! I’m a Ph.D. student in EECS at UC Berkeley, where I’m fortunate to be advised by Prof. Yi Ma and Prof. Jiantao Jiao. I’m affiliated with BAIR and supported by a UC Berkeley College of Engineering fellowship. Prior to my Ph.D., I completed a B.A. in CS and an M.S. in EECS, also at UC Berkeley.

My research interests lie broadly in developing theory for the empirical methodology of large-scale deep learning. I pursue this goal through two intertwined threads:

  • Finding theoretical principles for deep learning that are relevant at large scales.
  • Building theoretically principled deep learning systems at large scales.

I’m particularly interested in domains where data is high-dimensional yet richly structured, such as computer vision and natural language processing, and in how this structure interacts with mechanisms for representation and generation within deep neural networks.

Here are some specific problems I’m interested in:
Large Language Models (LLMs): What concepts and algorithms do LLMs learn, and how are they represented mechanistically? How do approximate retrieval and approximate reasoning manifest in LLMs? How do the (pre-)training dynamics of LLMs adapt to the structure of the training data and produce high-level model behaviors?

Diffusion Models: What allows diffusion models to generalize beyond the empirical distribution of their training data? What structures within data and network architecture enable diffusion models to succeed in some domains and not others?

Multi-Modal Deep Learning: What are the key information-theoretic principles of cross-modal learning? What is the relationship between the representations of text and visual data (both in modern vision-language models and in conditional diffusion models), and how is this relationship mechanistically enforced by the underlying deep neural network?

Vision Self-Supervised Learning: How can we learn faithful, high-quality representations of visual data for recognition tasks? I’m especially interested in developing and applying principles for two problems: (1) continual self-supervised learning, and (2) self-supervised learning from dynamic, time-correlated data (such as frames of videos).

Finally: How can we leverage answers to the above questions to build more powerful, more sample-efficient, multi-modal deep learning models at large scale?


Notes for undergraduate and master’s students
Note 1: I’m happy to chat about my research or offer general advising. Please send me an email and we can work out a time. Please include "[Advising Inquiry]" in your email subject line.

Note 2: If you are interested in a research collaboration, please send me an email with your background and specific interests (the more detailed, the better). Please include "[Research Collaboration Inquiry]" in your email subject line. The recommended time investment is at least 15 hours per week. Unfortunately, my schedule is currently tight and generally does not permit consistent long-term mentoring of younger students, so some degree of self-sufficiency is highly valued. For a fruitful collaboration, it is best if you have the technical background to read and understand deep learning papers, especially theory-oriented work. Thank you for your understanding.


Recent Updates