Projects

Built a decoder-only transformer language model from scratch, implementing every component from layer normalization to multi-head attention.
Wrote articles on the mathematics of self-attention and decoder-only transformer models.
Implementation of "Language Models are Unsupervised Multitask Learners" (Radford et al.).
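
For readers unfamiliar with the components named above, the following is a minimal sketch of layer normalization and single-head causal self-attention in plain Python. It is an illustration under simplifying assumptions (no learned weights, no batching, one head), not the project's actual code; the function names `layer_norm` and `causal_attention` are hypothetical.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    # Normalize one feature vector to zero mean and unit variance.
    # Assumption: the learned scale and shift parameters are omitted.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def causal_attention(q, k, v):
    # q, k, v: lists of per-token vectors (one head, no projections).
    # Token i attends only to tokens 0..i, which is the causal mask
    # used by decoder-only models.
    d = len(q[0])
    out = []
    for i in range(len(q)):
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out
```

Because of the causal mask, the first token can only attend to itself, so `causal_attention(q, k, v)[0]` always equals `v[0]`.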

qwen-0.5b-reasoning | article | v2 download

Supervised fine-tuned version of Qwen2.5-0.5B.
Demonstrated that small models with no prior reasoning capability can learn to produce chain-of-thought reasoning traces.
Created a custom symbolic chain-of-thought dataset in the teacher model's format.
Implementation of arXiv:2306.14050.

SCoTD-deepseek-math-7B

Custom dataset for symbolic chain-of-thought distillation into smaller models.
Each entry pairs a MATH dataset question with 6 unique reasoning traces generated by deepseek-math-7B.
300+ Hugging Face downloads.

Proximal Policy Optimization (PPO) & Actor-Critic agent architecture

Implemented a reusable PPO from scratch for both continuous and discrete action spaces, with a clipped surrogate policy objective and Generalized Advantage Estimation (GAE).
Trained and tuned actor-critic reinforcement learning agents on tasks such as CartPole-v1 and HalfCheetah-v5.
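
The two techniques named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions (a single rollout, list inputs, default hyperparameters chosen for the example), not the project's implementation; the names `compute_gae` and `clipped_surrogate_loss` are hypothetical.

```python
import math

def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    # Generalized Advantage Estimation over one rollout, computed backwards.
    # values[t] is the critic's estimate for state t; last_value is the
    # estimate for the state that follows the final step.
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; bootstrapping stops at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages

def clipped_surrogate_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    # PPO clipped objective, negated so it can be minimized.
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nlp - olp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

Working with log-probabilities keeps the same loss usable for discrete (categorical) and continuous (for example Gaussian) policies, since only the way `new_logp` is computed differs between them.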

hyprwindow

Minimal Rust and GTK-based workspace and application manager for Wayland desktop environments.

Notate

Image annotation service for annotating and storing images in seconds.

Dante

Truly free, fast, and simple-to-use learning app based on proven spaced-repetition algorithms.