Ahmed Imtiaz Humayun

I am a PhD student at Rice University, where I work on deep learning theory and generative modeling, advised by Dr. Richard Baraniuk. I also work as a Student Researcher at Google on adapting text-to-image generative models.

My interests lie in the interpretation and improvement of deep neural network models via approximation theory, e.g., spline theory. I have received the Lowernstein Fellowship at Rice and have published in ICLR, CVPR, ICASSP, and INTERSPEECH, to name a few. I also founded Bengali.AI, a non-profit initiative that crowdsources datasets and open-sources them through international ML competitions, e.g., Out-of-Distribution Speech Recognition @ Kaggle.

My work has been featured in multiple news sources, e.g., WIRED, New Scientist, Futurism, Tom's Hardware, The Daily Star, The Business Standard, Prothom Alo (Bengali), and IEEE Signal Processing Magazine.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github

Research

My research spans deep learning theory, generative modeling, interpretability, and optimization. Some of my representative projects are listed below.

SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundary
Ahmed Imtiaz Humayun, Randall Balestriero, Guha Balakrishnan, Richard Baraniuk
CVPR (Highlight), 2023
website / codes / arXiv

The first provably exact method for computing the geometry of ANY DNN's mapping, including its decision boundary. For a specified region of the input space, SplineCam can be used to compute and visualize the 'linear regions' formed by any DNN with piecewise linear non-linearities, e.g., LeakyReLU, Sawtooth.

Training Dynamics of Deep Network Linear Regions
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
ArXiv 2023
arXiv

We present empirical evidence showing that the local complexity of DNNs, measured in terms of linear region density, undergoes an epoch-wise double descent phenomenon during training. Starting with an initial descent phase, after a number of epochs the network accumulates its non-linearities around training points, resulting in an ascent of local complexity. Finally, through another descent phase, all the non-linearities of the network accumulate around the decision boundary.

Self-consuming Generative Models go MAD
Sina Alemohammad*, Josue Casco-Rodriguez*, Lorenzo Luzi, Ahmed Imtiaz Humayun, Hossein Babaei, Daniel Lejeune, Ali Siahkoohi, Richard Baraniuk
ArXiv 2023
arXiv / news

We study the phenomenon of training new generative models with synthetic data from previous generative models. Our primary conclusion is that without enough fresh real data in each generation of a self-consuming or autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease.

Provable Instance Specific Robustness via Linear Constraints
Ahmed Imtiaz Humayun*, Josue Casco-Rodriguez*, Randall Balestriero, Richard Baraniuk
ICML 2023 AdvML Workshop
website / arXiv (coming soon)

Using spline theory, we present a novel method for imposing analytical constraints directly on the decision boundary for provable robustness. Our method can provably ensure robustness for any set of instances, e.g., training samples from a specific class, against adversarial, backdoor, or poisoning attacks.

Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
CVPR (Oral Presentation), 2022
codes / arXiv / video

A provable method for controllable generation based on quality and diversity from any pre-trained deep generative model. We show that increasing the sampling diversity helps surpass SOTA image generation.

MaGNET: Uniform Sampling from Deep Generative Network Manifolds Without Retraining
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
ICLR, 2022
codes / arXiv / video

A novel and theoretically motivated latent space sampler for any pre-trained deep generative network (DGN) that produces samples uniformly distributed on the learned output manifold. Applications in fairness and data augmentation.

Exact Visualization of Deep Neural Network Geometry and Decision Boundary
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
NeurIPS 2022 Workshop on Symmetry and Geometry (NeurReps)
website / arXiv / poster

Using spline theory, we present a method for exact visualization of deep neural networks that allows us to visualize the decision boundary and also sample arbitrarily many inputs that provably lie on the model's decision boundary.

No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds
Ahmed Imtiaz Humayun, Randall Balestriero, Anastasios Kyrillidis, Richard Baraniuk
ICASSP, 2022
codes / arXiv

Repeated samples and sampling bias can lead to imbalanced clusters with K-Means. We propose the first method to impose a hard radius constraint on K-Means, achieving robustness to sampling inconsistencies.

Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
S Alemohammad, H Babaei, R Balestriero, MY Cheung, Ahmed Imtiaz Humayun, D Lejeune, L Luzi, Richard Baraniuk
ICASSP, 2021
codes / arXiv

We extend existing methods that rely on the use of kernels to variable-length sequences by using the Recurrent Neural Tangent Kernel (RNTK).

Towards Domain Invariant Heart Sound Abnormality Detection using Learnable Filterbanks
Ahmed Imtiaz Humayun, Shabnam Gaffarzadegan, Zhe Feng, Taufiq Hasan
IEEE JBHI, 2020
codes / arXiv

We show that novel Convolutional Neural Network (CNN) layers that emulate different classes of Finite Impulse Response (FIR) filters can perform domain invariant heart sound abnormality detection.

End-to-end Sleep Staging with Raw Single Channel EEG using Deep Residual ConvNets
Ahmed Imtiaz Humayun, Asif Sushmit, Taufiq Hasan, MIH Bhuiyan
IEEE BHI, 2019
codes / arXiv

Very deep convolutional residual networks achieve state-of-the-art results in sleep staging, using only raw single-channel EEG.

X-Ray Image Compression Using Convolutional Recurrent Neural Networks
Asif Sushmit, SU Zaman, Ahmed Imtiaz Humayun, Taufiq Hasan, MIH Bhuiyan
IEEE BHI, 2019
codes / paper

Convolutional Recurrent Neural Networks outperform SOTA RNN based compression methods as well as JPEG 2000 for X-ray image compression.

An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound Classification
Ahmed Imtiaz Humayun, MT Khan, Shabnam Gaffarzadegan, Zhe Feng, Taufiq Hasan
INTERSPEECH, 2018
codes / arXiv

State-of-the-art heart abnormality classification using an ensemble of Representation Learning methods.

Learning Front-end Filter-bank Parameters using Convolutional Neural Networks for Abnormal Heart Sound Detection
Ahmed Imtiaz Humayun, Shabnam Gaffarzadegan, Zhe Feng, Taufiq Hasan
IEEE EMBC, 2018
codes / arXiv

We propose novel linear phase and zero phase convolutional neural networks that can be used as learnable filterbank front-ends.

Predictive Real-time Beat Tracking from Music for Embedded Application
Irfan Hussaini, Ahmed Imtiaz Humayun, Shariful Foysal, Samiul Alam, Ahmed Masud, Rakib Hyder, SS Chowdhury, MA Haque
IEEE MIPR, 2018
codes / paper / video

IEEE Signal Processing Cup Honorable Mention for Real-time Music Beat Tracking Embedded System.

Bengali.AI

Bengali.AI is a non-profit in Bangladesh where we create novel datasets to accelerate Bengali Language Technologies (e.g., OCR, ASR) and open-source them through machine learning competitions (e.g., Grapheme 2020, ASR 2022).

OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking
FR Rakib, SS Dip, S Alam, N Tasnim, MIH Shihab, +5 authors, Farig Sadeque, Tahsin Reasat, AS Sushmit, Ahmed Imtiaz Humayun
INTERSPEECH, 2023
competition / arXiv

Jointly the largest open-source Bengali ASR dataset and the first Bengali Out-of-Distribution Speech Recognition benchmarking dataset.

BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset
MIH Shihab, MR Hassan, M Rahman, SM Hossen, +11 authors, AS Sushmit*, Ahmed Imtiaz Humayun*
ICDAR, 2023
dataset / arXiv / competition

The first Multi-Domain Bengali Document Layout Analysis Dataset, with 700K polygon annotations from image-captured documents in the wild.

Bengali Common Voice Speech Dataset for Automatic Speech Recognition
Samiul Alam, Asif Sushmit, Zaowad Abdullah, Shahrin Nakkhatra, MD Ansary, Syed Hossen, Sazia Mehnaz, Tahsin Reasat, Ahmed Imtiaz Humayun
ArXiv, 2022
competition / arXiv

We have crowdsourced the first public 500-hour Bengali Speech Dataset on the Mozilla Common Voice platform, with speech contributed by over 20K people from Bangladesh and India.

A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes
Samiul Alam, Tahsin Reasat, Asif Sushmit, SM Siddiquee, Fuad Rahman, Mahady Hasan, Ahmed Imtiaz Humayun
ICDAR, 2021
competition / arXiv / news

A benchmark dataset for multi-target classification of handwritten Bengali Graphemes, with novel implications for all alpha-syllabary languages, e.g., Hindi, Gujarati, and Thai.

NumtaDB: Assembled Bengali Handwritten Digits
Samiul Alam, Tahsin Reasat, Rashed Doha, Ahmed Imtiaz Humayun
ArXiv, 2018
competition / arXiv

The first large-scale Multi-Domain Bengali Handwritten Digit Recognition Dataset.


Yet another steal of Jon Barron's amazing website.