Hi, I'm Shaligram. I am entering the third year of my undergraduate degree in Artificial Intelligence at IIT Guwahati. I was previously a Research Intern at IISc, Bangalore, where I worked on building AI models for legged robots. I have also trained a GPT-2-sized model on 20 billion tokens for Indic languages, which outperformed the original GPT-2 on a Hindi benchmark.
I am deeply interested in building large deep neural networks that can supercharge the potential of humanity. If you're interested in something similar, maybe we could be good friends!
Mar 2025 - May 2025 | GitHub Repo
Pre-trained a 124M-parameter, decoder-only dense transformer model on 20 billion English and Hindi tokens from the FineWeb-Edu and FineWeb-2 datasets. The model outperformed OpenAI GPT-2 (124M) on Hindi-HellaSwag by ~10% and achieved similar performance on the English HellaSwag benchmark. My BPE tokenizer achieved 3x better average tokenization efficiency on English and Hindi text.
Jun 2023 - Aug 2023
Built a four-legged, 12-DoF robot based on the Stanford Pupper v1 project that can trot, walk, and jump.
Oct 2023 - Aug 2027 (ongoing) | 4 Years | CPI: 8.6/10
Major in Data Science and Artificial Intelligence
Dec 2023 - Feb 2025 | 1.2 Years
Advised by Prof. Shishir NY, I worked on vision-based deep RL models for quadruped robots, deep-learning-based models for actuators, and various other projects.
Dec 2024 | Artificial Intelligence Confluence, IIT Guwahati
Vision based Deep RL models for Agile Industrial Robots, Shaligram Dewangan, et al.
Very familiar with Python and its ecosystem
Python
PyTorch
NumPy
Pandas
Matplotlib
Tensorboard
Datasets
Including but not limited to the following
Deep Learning
Reinforcement Learning
Natural Language Processing
Large Language Models
Pre-training
Evaluation
Tokenization