Naveen Narisetty, Assistant Professor of Statistics, has recently received the distinguished Faculty Early Career Development (CAREER) award from the National Science Foundation (NSF). The CAREER Program is a Foundation-wide activity that offers the National Science Foundation's most prestigious awards in support of early-career faculty who have the potential to serve as academic role models in research and education and to lead advances in the mission of their department or organization.
Professor Narisetty's work will explore several key aspects of Big Data and how statisticians and data scientists can work more efficiently within the Bayesian framework. The research is expected to influence statistical practice in multiple disciplines, including biology, economics, environmental sciences, marketing, and medical sciences. It will be integrated into teaching through special topics courses for graduate students and research projects for undergraduates, and into outreach through a K-12 workshop that provides exposure to modern statistics and its applications.
The abstract at the time of the award is provided below:
Title: Flexible and Efficient Exploration of the Bayesian Framework for High Dimensional Modeling
The modern era of Big Data brings unique opportunities as well as challenges to the statistician. While the Big Data revolution brings a great opportunity to obtain valuable and profound insights from the richness of data and to enhance data-driven decision making, it also brings challenging demands for innovation and knowledge discovery in three crucial aspects from statisticians and data scientists: (i) development of flexible models that can appropriately describe the complexities of the data, (ii) development of efficient and valid statistical estimation and inferential procedures, and (iii) development of computational algorithms that scale up to large datasets. The purpose of this project is to make advances in all three aspects by fully exploring the Bayesian framework, which treats the parameters of a model as random and provides an efficient mechanism to quantify the uncertainty of the model parameters. In particular, the techniques developed will be useful for analyzing datasets containing a large number of covariates, for learning the dependence structures between a large number of outcome variables, and for obtaining a comprehensive description of the impact of covariates on outcome variables by modeling their relationships at different quantile levels. The research developed will have an impact on statistical practice in various disciplines including biology, economics, environmental sciences, marketing, and medical sciences. The training component will integrate research into teaching by offering special topics courses to graduate students based on the proposed research and by developing undergraduate research projects that incorporate research concepts at an accessible level. The PI will mentor high school research projects and organize a K-12 outreach workshop to provide exposure to modern statistics and its applications to high school students and teachers.
Statistically rigorous and computationally efficient Bayesian methodologies and inferential procedures will be developed for a variety of complex high dimensional models, including generalized linear models, quantile regression models, and graphical models. General classes of Bayesian regularization priors will be proposed, and their regularization properties will be rigorously studied for a variety of commonly used likelihood functions. In contrast to most of the existing Bayesian approaches that focus on high dimensional estimation, a novel Bayesian framework for performing high dimensional Bayesian inference with valid frequentist properties will be developed. Scalable computational techniques that do not involve large matrix operations will be devised, both for obtaining point estimators from the posteriors and for sampling the full posterior distributions, and their statistical properties will be studied. An attractive feature of the computational developments is that they will be applicable to a diverse range of statistical models commonly used in practice. The research developed will be closely related to several highly active areas of modern statistics including high dimensional modeling, Bayesian computation, nonconvex regularization, post-selection inference, graphical models, and quantile regression, and will contribute to the advancement of and interaction between these areas. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.