shaheennabi/README.md

Thanks for tuning in here 👋






Who I am

╔════════════════════════════════════════╗
║ research -- thinking, reasoning models ║
╚════════════════════════════════════════╝


Maintainer of an open-source dataset of 25k+ images on Hugging Face, used by thousands of developers. The Falcon 9 sonic boom 💥 is my caffeine.

* I am a fast learner, and I know the foundations that current systems are built on * * I love SpaceX rockets *

Pinned

  1. Reinforcement-Learning-Zero-to-Hero Public

    Reinforcement Learning (RL)! This repository is your hands-on guide to implementing RL algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. Build smart agents, l…

    Python 5

  2. Production-Ready-Instruction-Finetuning-of-Meta-Llama-3.2-3B-Instruct-Project Public

    Instruction Fine-Tuning of Meta Llama 3.2-3B Instruct on Kannada Conversations. Tailoring the model to follow specific instructions in Kannada, enhancing its ability to generate relevant, context-a…

    Jupyter Notebook 23 6

  3. My_Datasets Public

    A collection of custom, open-source and essential datasets, perfect for powering machine learning and data science projects. Organized for easy access and reuse, this repo is updated regularly to f…

    2

  4. Q-Learning-Off-policy Public

    Q-learning is an off-policy temporal-difference control algorithm. It learns the optimal action-value function directly, independent of the policy the agent follows while exploring.

    Python

  5. SARSA-On-Policy Public

    A minimal, from-scratch implementation of SARSA (on-policy, model-free RL) on a custom GridWorld with no external RL libraries. Emphasizes algorithmic clarity and correct temporal dynamics for unde…

    Python
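
The difference between the last two pinned projects comes down to a single term in the update rule. The sketch below is illustrative only and is not taken from either repository: the function names, the list-of-lists Q-table, and the toy numbers are all assumptions. Q-learning bootstraps from the best action in the next state, while SARSA bootstraps from the action the agent actually takes next.

```python
# Illustrative sketch; names and table layout are hypothetical,
# not the code from the pinned repositories.

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy TD update: the target uses the greedy value of s_next."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD update: the target uses the action actually chosen in s_next."""
    td_target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])


def make_table():
    # Two states, two actions; state 1 has values 1.0 and 5.0 (toy numbers).
    return [[0.0, 0.0], [1.0, 5.0]]


q_off = make_table()
q_learning_update(q_off, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=1.0)

q_on = make_table()
# Assume the agent explores and picks the worse action (0) in the next state.
sarsa_update(q_on, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=1.0)

print(q_off[0][0], q_on[0][0])  # prints: 3.0 1.0
```

With the same transition, Q-learning moves the estimate toward 1 + 5 = 6 and SARSA toward 1 + 1 = 2, because the exploratory next action only affects the on-policy target.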