I am a second-year PhD candidate in the Department of Statistical Science at Duke University, where I am advised by Professor Surya Tokdar. My interests lie broadly in Bayesian statistics, in particular nonparametric density estimation, the predictive Bayes framework, and empirical Bayes. Within these areas I aim to develop novel statistical methods that are both practical and theoretically sound.
My most recent work deals with Newton's algorithm, otherwise known as predictive recursion. In its original 1998 formulation, predictive recursion arose from pretending that a stream of data came from a sequence of nested Dirichlet process priors. This intuition is perfectly in line with the framework of predictive Bayes. Indeed, recent progress in the theory of martingale posteriors suggests that the success of predictive recursion may be fundamentally linked to a predictive interpretation of Bayesian statistics. I am interested not only in investigating the theoretical underpinnings of this relationship, but also in leveraging the predictive Bayes framework to gain practical advantages over standard Bayesian methodologies.
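To fix ideas, here is the standard form of the recursion for a kernel mixture model \(f(y) = \int k(y \mid \theta)\, dP(\theta)\): starting from an initial guess \(P_0\) and a weight sequence \(w_i\) decreasing to zero, each new observation \(y_i\) updates the estimate of the mixing distribution via

\[
P_i(d\theta) \;=\; (1 - w_i)\, P_{i-1}(d\theta) \;+\; w_i\, \frac{k(y_i \mid \theta)\, P_{i-1}(d\theta)}{\int k(y_i \mid \theta')\, P_{i-1}(d\theta')},
\]

so that a single pass through the data produces a density estimate without any posterior sampling. (A common default choice, used here purely for illustration, is \(w_i = (i+1)^{-\gamma}\) with \(\gamma \in (1/2, 1]\).)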
Before coming to Duke, I double-majored in Statistics and English at UC Berkeley, where my statistical thinking was shaped by Sandrine Dudoit and Steven Evans. Outside of research, I write novels, short stories, and poetry. I also enjoy cooking.
Nonparametric conditional density estimation is difficult because it requires extracting an entire density at a target covariate value from data points that may lie far away from it. Existing methods for density regression often take a traditional Bayesian approach in which one specifies a clever prior over some quantity of interest; these methods, while often yielding good results, require prohibitively expensive MCMC runs that can take hours or even days. Predictive Bayes offers a way out by prompting the statistician to specify a sequence of predictive distributions instead of a likelihood-prior combination. Philosophically, this places the subjective burden on observables rather than latent quantities; more practically, it allows one to bypass expensive MCMC in favor of faster, recursive schemes.
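As a concrete illustration of the kind of recursive scheme I have in mind, the following is a minimal sketch of predictive recursion for unconditional density estimation, assuming a normal kernel with fixed bandwidth and the weight sequence \(w_i = (i+1)^{-\gamma}\); the function names and defaults are my own, chosen for illustration rather than taken from any particular implementation.

```python
import numpy as np
from scipy.stats import norm

def predictive_recursion(y, grid, h=0.5, gamma=0.67):
    """One pass of predictive recursion for the location mixture
    f(y) = int N(y | theta, h^2) dP(theta), tracking the mixing
    density p(theta) on a uniform grid."""
    dx = grid[1] - grid[0]
    p = np.full(grid.shape, 1.0 / (grid[-1] - grid[0]))  # flat initial guess for P
    for i, yi in enumerate(y, start=1):
        w = (i + 1.0) ** (-gamma)              # slowly decaying weight sequence
        k = norm.pdf(yi, loc=grid, scale=h)    # kernel k(y_i | theta) on the grid
        post = k * p / ((k * p).sum() * dx)    # one-step Bayes update of the mixing density
        p = (1.0 - w) * p + w * post           # the predictive recursion update
    return p

def density_estimate(x, grid, p, h=0.5):
    """Data density implied by the estimated mixing density."""
    dx = grid[1] - grid[0]
    K = norm.pdf(np.asarray(x)[:, None], loc=grid[None, :], scale=h)
    return (K * p).sum(axis=1) * dx

# Example: a single pass over 2,000 draws from a bimodal target.
rng = np.random.default_rng(0)
y = rng.permutation(np.concatenate([rng.normal(-2, 1, 1000),
                                    rng.normal(2, 1, 1000)]))
grid = np.linspace(-6.0, 6.0, 400)
p_hat = predictive_recursion(y, grid)
f_hat = density_estimate([-2.0, 0.0, 2.0], grid, p_hat)
```

The whole estimate comes from a single sweep through the data, which is exactly the source of the speed advantage over MCMC; note that the output depends on the ordering of the observations, which is why the example permutes them first.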