ZERO

Pre-trained Foundation Model

LLaMA-3.1-instruct-8B

ALPHA

Training Approach

Supervised Fine-Tuning

Dataset

Reddit-science-sft

Training Results

Neptune

BETA

Training Approach

DPO

Dataset

Reddit-science-dpo

Training Results

Neptune

GAMMA

Training Approach

PPO

Dataset

reddit-questions

Training Results

Neptune

DELTA

Training Approach

RLVR

Dataset

gsm8k (math QA)

Training Results

Neptune