Sounak Paul

Research Statistician Developer | PhD in Statistics | AI/ML Specialist

Intro

About Me

Brief Introduction:

I am an applied scientist and developer focused on building intelligent, scalable systems that combine deep learning, forecasting, and statistical modeling. My current work centers on turning advanced AI methods into practical, high-performance software solutions.

I have contributed to patents and patent-pending innovations in AI-driven code triage and automated forecasting, integrating RAG architectures, agent frameworks, and MLOps for large-scale analytical systems. I work extensively with Python, R, and SAS, leveraging tools like PyTorch, Docker, and Kubernetes to deliver robust, production-ready models.

My broader interests include generative AI, computer vision, and applied statistics, with a research background in deep learning–based cryo-EM reconstruction, an area I worked on during my PhD at UChicago. I’m passionate about bridging the gap between theory and engineering, developing solutions that are both scientifically sound and operationally impactful.

Education

Academic Journey

University of Chicago

PhD in Statistics

Oct 2019 – Aug 2024

GPA: 3.92/4.00

  • Thesis: On Learning and Optimization in Inverse Problems with Group Structured Latent Variables

University of Alberta

MSc in Mathematics

Sep 2017 – Aug 2019

GPA: 4.00/4.00

  • Thesis: Marcinkiewicz Strong Law of Large Numbers for Products of Long Range Dependent and Heavy Tailed Linear Processes

Indian Statistical Institute

Bachelor of Mathematics (B.Math)

Aug 2014 – May 2017

  • First Division with Distinction
  • Class Rank: 2

Technical Skills

Technical Expertise

Research areas:

Deep learning, computer vision, generative AI, applied statistics, time series forecasting

Languages:

Python, R, C/C++, SQL, SAS, Bash

Libraries:

PyTorch, TensorFlow, numpy, scipy, pandas, scikit-learn, OpenCV, PyTorch3D, MLflow

Tools and frameworks:

Git, Docker, Kubernetes, JIRA, AWS, LangGraph, OpenAI Agents SDK

Professional Experience

Professional Journey

Research Statistician Developer

SAS Institute Inc.

Aug 2024 – Present

Cary, NC

  • Contributed to patent-pending innovations in an AI-driven code triage system that uses agents, RAG, and MCP to integrate static and dynamic analysis tools for automated identification of performance bottlenecks, security vulnerabilities, and code inefficiencies.
  • Streamlined multistep code triage processes, reducing manual effort, accelerating diagnostics, and improving code quality; targeted optimizations cut the run time of computational tasks by up to 93%.
  • Collaborated with customers, consultants, technical support, and testers to develop analytical components for forecasting and scientific computing, gather and analyze business requirements, and build expertise in applying ML methods such as regression, clustering, decision/regression trees, neural nets, CNNs, LSTMs, and transformers in a data science setting.
  • Developed TASK options for multiple forecasting nodes using Python, C, and SAS, leveraging industry-standard technologies such as GitHub CI/CD, containers, Kubernetes, cloud platforms, and MLOps practices for model tracking, deployment, and lifecycle management.

Forecasting R&D Intern

SAS Institute Inc.

Jun – Aug 2022 and Jun – Aug 2023

Remote

  • Developed a novel multi-objective black-box optimization method using genetic algorithms to autotune the seasonal and subset model parameters of general ARIMA models jointly with the Box-Cox parameter.
  • Achieved a significant improvement in out-of-sample fit statistics (40% for RMSE) over popular methods such as auto-arima and SAS Diagnose, averaged over a large collection of data sets.
  • This work led to a patent (US12380369B1) and a paper (under review).

Projects

Research Projects

Second order methods for stochastic ERM and EM algorithms in orbit recovery setting

  • Used second-order methods (Newton and quasi-Newton) to accelerate stochastic variance-reduced gradient descent and EM algorithms for orbit recovery problems.
  • Achieved ≈ 75% reduction in run time using variance-reduced methods on simulated signals.
  • Tools Used: numpy, scipy, PyTorch, MLflow, matplotlib

Deep learning priors for orbit recovery problems

  • Developed neural network architectures for supervised learning of signals and rotational distributions.
  • Demonstrated that the method accelerates convergence when reconstructing signals from their moments (by up to 83%).
  • Tools Used: numpy, scipy, PyTorch, OpenCV

Estimation of the amount of heavy-tailedness and long-range dependence in linear processes

  • Used Marcinkiewicz strong laws of large numbers (MSLLN) to find rates of convergence for heavy-tailed multivariate products of long-range dependent two-sided linear processes.
  • Developed a novel method to estimate how much long-range dependence (LRD) and heavy-tailedness (HT), if any, a sequential data set possesses, and tested it on real financial data using R.

Smaller projects

  • Instance-Level Object Detection using SIFT descriptors. (Computer vision)
  • Does BCG vaccine have a protective effect against severe COVID-19? (Applied statistics)
  • Reimplementation of various machine learning and deep learning papers (ML and DL)

Publications

Publications and Patents

Patents:

M.V. Joshi, S. Paul, I.V. Farahani, and Y. Park. Hyperparameter tuning in autoregressive integrated moving average (ARIMA) models. US Patent 12,380,369, 5 Aug 2025.

Papers:

M. A. Kouritzin and S. Paul. On almost sure limit theorems for heavy-tailed products of long-range dependent linear processes. Stochastic Process. Appl., 152 (2022), pp. 208-232. arXiv

Y. Khoo, S. Paul, and N. Sharon. Deep Neural-network Prior for Orbit Recovery from Method of Moments. J. Comput. Appl. Math., 444 (2024), 115782. arXiv

S. Paul, I.V. Farahani, M.V. Joshi, and Y. Park. On the Use of Derivative-Free Optimization for Autotuning ARIMA Models. Int. J. Forecast. Under review.

Awards

Honors & Awards

Academic Honors

  • Dr. Josephine M. Mitchell Scholarship, University of Alberta (2018)
  • Pundit RD Sharma Memorial Graduate Award, University of Alberta (2017)
  • University of Alberta Master's Scholarship, University of Alberta (2017)
  • Visiting Student Research Program Fellowship, TIFR Bombay & TIFR CAM (2016)
  • KVPY Scholarship, IISc Bangalore and Dept. of Science and Technology, India (2013)