Selected Projects
Natural Language Processing
- InfoPopularity | python • pytorch
Code and dataset for our paper, 'Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents' at the 33rd ACM Conference on Hypertext and Social Media (HT '22)
- Summaformers | python • pytorch
Code for our paper 'Summaformers @ LaySumm 20, LongSumm 20' at the Scholarly Document Processing Workshop, EMNLP 2020
- Multilingual Hate Speech Detection | python • pytorch
Code for our paper 'Leveraging Multilingual Transformers for Hate Speech Detection' at FIRE 2020
- Augment4Gains | python • pytorch
Enhancing Hate Speech Detection through Data Augmentation Techniques
- Acronym Sense Disambiguator | python • nltk
Identifies acronyms in a text file and disambiguates their possible expansions using a guided disambiguator module that runs queries on an indexed Wikipedia dump
- Detecting parse errors due to linguistic repair | python • spacy
Detecting errors in dependency parsing of speech transcripts that arise primarily from active linguistic repair, with an analysis of correlations between types of parse failures and the existence of repair in the utterance
- Extractive summarization unit | python • scikit-learn • spacy
A lightweight, easy-to-train extractive summarization formulation for the CNN dataset
- Language Modeling | python
From-scratch implementation of a language modeling scheme with Kneser-Ney Smoothing and Interpolation, plus a sentence and Tweet generator
- English ⟶ Bengali NMT | python • pytorch
Supervised neural MT system for English ⟶ Bengali using a seq2seq encoder-decoder architecture with an attention mechanism
- Wikisearch | python • xml-sax
A Search Engine for the Wikipedia Dump built from scratch
- Brill’s Tagger Illustration | python
An Implementation of a Modular Brill's Tagger with Analysis on Formulation of Disambiguation Schemes
- English ⟶ Hindi SMT | python
Statistical MT system for English ⟶ Hindi utilizing IBM models and an HMM formulation for relative token alignment
- LexiNER | python
A simple Named Entity Recognition procedure for Bengali purely based on Lexical Rules
- Bengali Anaphora Resolution Challenges | python
Outline of a Rule Based Anaphora Resolution System for Bengali with granular performance analysis
Computer Vision
- Manifold Learning | python • scikit-learn
Illustration of various Spectral Clustering Techniques and Manifold Visualization schemes such as MDS, LLE, and Isomap, with applications including dimensionality reduction and Image Classification on CIFAR10
- Image classification formulations | python • scikit-learn • TensorFlow
Performance of various deep learning architectures for face recognition and miscellaneous image classification tasks: insights and analysis
- Multiclass Classification using Logistic Regression | python • scikit-learn
Study of different classification schemes, including a set of one-vs-rest classifiers and binary classifiers with majority voting, for the task of Image Classification
Machine Learning Fundamentals
- COVID-19 modeling for India | python • scipy
Modeling the Coronavirus Statistics for India using the Susceptible-Infected-Recovered (SIR) Model
- LDA Demonstration | python • scipy • scikit-learn
Linear Discriminant Analysis as both a dimensionality reduction and a classification technique, with applications in document understanding
- relevance-paradox | python • spacy
Analyzing trends and demonstrating Simpson's, Berkson's, and Lord's Paradoxes in real-world Natural Language Data
- Approaches to Principal Component Analysis | python
Different formulations for PCA, one using the Data Covariance Matrix and one using Gradient Descent, with comparisons among different types of regularization schemes
- Extending K-means | python • scikit-learn
Extending the K-means Clustering Algorithm for Classifying Points into 2 Straight Lines, with insight into how a change in K affects the Objective Value
- Inside Cross Val | python • scikit-learn
Insights into how the error varies with K while performing K-fold Cross Validation
Core Deep Learning
- Time Series Prediction using Recurrent Neural Networks | python • pytorch
Formulations for an autoregressive model and a moving-average model using LSTMs, and a base model using a non-recurrent single-layer perceptron
- Optimization and Learning | python • scikit-learn
From-scratch implementation of various optimization techniques such as the Adam Optimizer, Nesterov's Accelerated GD, Polyak's Learning Rate, Newton's Method, Rprop, and Quickprop, tested on classification & regression tasks. Bonus: an illustration of the usefulness of Data Normalization
Optimization Methods
- Linear Programming Problems | python • cvxpy
Formulations and Solutions to a set of Problems using Linear Programming Paradigms
Principles of Programming Languages
- Functional Interpreter | haskell
Interpreter for a Functional Arithmetic Language with support for assignments and stores
- Functional Programming in Racket | racket
Functional Programming constructs on Lists & Trees, and an Interpreter for a simple Arithmetic Language
Operating Systems
- C shell | c
From-scratch implementation of a user-defined interactive shell program, similar to a bash shell, capable of creating and managing new processes
- Proc-Sync | c
Applications of Process Synchronization in Simulating real-life situations
Artificial Intelligence
- Markov Decision Process | python
Computes the utility map of states in a described grid world after every step of the Value Iteration algorithm until convergence, and experiments with various discount factors and step costs
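For readers unfamiliar with Value Iteration, the idea can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's code: the 3x4 grid layout, the terminal rewards, the wall position, the deterministic moves, and the names `move` and `value_iteration` are all assumptions made for the example.

```python
# Hypothetical 3x4 grid world. The layout, rewards, and deterministic moves
# are illustrative assumptions, not the project's actual setup.
ROWS, COLS = 3, 4
TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}       # terminal cell -> fixed utility
WALLS = {(1, 1)}                              # blocked cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def move(state, action):
    """Return the next cell; bumping into a wall or the edge stays in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) not in WALLS:
        return (r, c)
    return state


def value_iteration(gamma=0.9, step_cost=-0.04, tol=1e-6, max_iters=1000):
    """Run value iteration; return (utility map, number of sweeps performed)."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in WALLS]
    utilities = {s: 0.0 for s in states}
    for sweep in range(1, max_iters + 1):
        new_utilities = {}
        for s in states:
            if s in TERMINALS:
                new_utilities[s] = TERMINALS[s]
            else:
                # Bellman backup: step cost plus discounted best successor utility.
                best = max(utilities[move(s, a)] for a in ACTIONS)
                new_utilities[s] = step_cost + gamma * best
        # Largest change in any state's utility during this sweep.
        delta = max(abs(new_utilities[s] - utilities[s]) for s in states)
        utilities = new_utilities
        if delta < tol:
            return utilities, sweep
    return utilities, max_iters
```

Calling `value_iteration(gamma=0.5)` or `value_iteration(step_cost=-0.5)` gives a rough feel for how the discount factor and step cost reshape the utility map. A stochastic transition model would replace the `max` over single successors with an expectation over possible outcomes.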
Computer Networks
- File Server Lite | c
Basic file server and client using socket connections
Game Design
- Mario on Terminal | python
Simple version of the popular Mario game playable on your terminal
- Funride | c++ • openGL
A basic twist on the classic Jetpack Joyride
- Space Invaders Retro | python • pygame
Space Invaders recreated with a retro feel
- Angry Paper Jet | c++ • openGL
Frugal Fighter Jet Simulation Game
- Green Lawn Surfers | javascript • webGL
Subway Surfers but on a lawn & with Angry Birds
Web Development
- Project Showcase | jekyll • css • html
Little website theme featuring a personal introduction & a projects showcase, powered by Jekyll and GitHub Pages and built upon Alembic
- Retro Portfolio | javascript • css • html
A portfolio website theme with a retro feel, built purely out of HTML, CSS, and JavaScript without the use of any framework
Created Resources
- Introduction to Deep Learning Tutorial
Teaching Material for Intro to DL Workshop, Summer '20, '21
- Data Collection & Preprocessing
Teaching Material for the 'Data Collection and Preprocessing' course, AMPBA, ISB, 2021
- NLP Course at ISB
Tutorial Code for the 'Natural Language Processing' course, AMPBA, ISB, 2022
- Linguistics Mine
Collection of Solved Assignments from various courses related to Linguistics taken at IIIT-H
- Analysis
Problem sets and solutions from Multivariate Analysis
- JAVA for schools
High-school-level programming problems solved using Java
- Neural and Cognitive Modelling Coursework
Multiple experiments and write-ups from the course, ‘Intro to Neural and Cognitive Modeling’
- Probabilistic Graphical Models Coursework
A set of solved problems on core machine learning & Gaussian Mixture Models (GMM) and a mathematical project on bounds for VC Dimensions
- Intro to Philosophy Coursework
Lecture Notes, Solved Assignments, and more from Introduction to Philosophy, 2020
- YouTube scraper
YouTube information scraper plus video downloader
- Data Systems Coursework
Solutions to group assignments and exercises from the Data Systems course