Following the RFT Overview? This is the Single-Turn Training path: the fastest way to get started with RFT. In this quickstart, you'll train a small model, Qwen3 0.6B, to solve mathematical reasoning problems from the GSM8K dataset.
What you’ll learn
- How to set up and test an evaluator locally using the Eval Protocol SDK
- How to take that evaluator and use it in an RFT job, from the command line
- How to monitor training progress and evaluate accuracy improvements
Prerequisites
- Python 3.10+
- A Fireworks API key (stored in your shell or .env)
- Command-line access (terminal or shell)
1. Install dependencies and set up files
- Clone repository (recommended): Clone the quickstart-gsm8k repository and install dependencies.
- Download files manually: Create the gsm8k_artifacts/ folder structure and copy the files into it.

The repository includes:
- Evaluator (evaluation.py): Defines how to evaluate math answers
- Dataset (gsm8k_sample.jsonl): Contains example math problems to test on
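For orientation, each line of the dataset is a standalone JSON object. A minimal sketch of reading one record, assuming GSM8K's conventional fields and "#### <number>" answer marker (the exact field names in gsm8k_sample.jsonl may differ; inspect the file to confirm):

```python
import json

# Hypothetical GSM8K-style record; the real schema in
# gsm8k_sample.jsonl may differ -- check the downloaded file.
line = '{"question": "Sam has 3 boxes of 6 pens. How many pens?", "answer": "3 * 6 = 18\\n#### 18"}'

record = json.loads(line)

# GSM8K gold answers conventionally end with "#### <number>".
gold = record["answer"].split("####")[-1].strip()
print(gold)  # -> 18
```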
2. Test your evaluator locally
In this step, you'll run your evaluator locally and inspect its output. Feel free to iterate on the evaluator you downloaded in the last step until it gives the output you want.
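At its core, the evaluator turns a model completion into a score. A minimal sketch of the kind of numeric answer check this quickstart relies on (the function name and parsing here are illustrative, not the shipped code in evaluation.py):

```python
import re

def score_math_answer(completion: str, gold_answer: str) -> float:
    """Return 1.0 if the last number in the completion matches the
    gold answer, else 0.0. Illustrative only; see evaluation.py for
    the actual logic used in this quickstart."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    if not numbers:
        return 0.0
    return 1.0 if float(numbers[-1]) == float(gold_answer) else 0.0

print(score_math_answer("6 * 3 = 18, so the answer is 18.", "18"))  # -> 1.0
print(score_math_answer("I am not sure.", "18"))                    # -> 0.0
```

Taking only the last number is a common convention for chain-of-thought answers, since the model emits intermediate arithmetic before its final result.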
Start the local UI server
Open a terminal and start the local UI server. Once it is running, navigate to http://localhost:8000 in your browser, and keep this terminal running.
Run the test script
In a new terminal, run the test script to evaluate the dataset of sample math problems. This command discovers and runs your @evaluation_test with pytest.

As the test runs, you'll see evaluation scores appear in the browser, with detailed logs for each problem the model attempts. pytest will also register your evaluator and dataset with Fireworks automatically, so you can use them in the next step for RFT.
3. Start training
First, set your Fireworks API key so the Fireworks CLI can authenticate you. Then create the RFT job. This quickstart uses a small base model (qwen3-0p6b) to keep training fast and inexpensive. Because your evaluator and dataset were already registered with Fireworks in the last step, you don't need to specify them again here.
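It can help to confirm the key is actually visible to your shell before launching the job, so a missing key fails fast rather than partway through. A small sketch, assuming the key is exported as FIREWORKS_API_KEY (the conventional variable name; adjust if your setup stores it differently or in .env):

```python
import os

def require_api_key(env_var: str = "FIREWORKS_API_KEY") -> str:
    """Fail fast if the key is missing. FIREWORKS_API_KEY is assumed
    to be the variable the Fireworks tooling reads; adjust if your
    setup differs."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it or add it to .env")
    return key
```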

Monitor your training progress
Your RFT job is now running. You can monitor progress via the dashboard links in the CLI output.
Evaluate accuracy regularly
Re-run the pytest evaluation command to measure your model's performance on new checkpoints. This helps you see how your model's accuracy improves over time and decide when to stop training.
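One way to track the trend across checkpoints is to aggregate the per-problem scores from each evaluation run. A minimal sketch with made-up scores (the checkpoint names and values are illustrative, not real training output):

```python
# Per-problem scores from successive evaluation runs
# (illustrative values, not real training output).
runs = {
    "base":         [1.0, 0.0, 0.0, 1.0, 0.0],
    "checkpoint-1": [1.0, 0.0, 1.0, 1.0, 0.0],
    "checkpoint-2": [1.0, 1.0, 1.0, 1.0, 0.0],
}

for name, scores in runs.items():
    accuracy = sum(scores) / len(scores)
    print(f"{name}: {accuracy:.0%}")
```

A plateau in this number across several checkpoints is a reasonable signal to stop training.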
Customize your evaluation
You can adjust the evaluation logic to better fit your needs:
- Modify reward shaping: Edit the scoring logic in test_pytest_math_example.py to match your answer format expectations
- Use your own data: Replace the sample dataset by either editing the JSONL file locally or passing --dataset-jsonl when creating the RFT job
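As an example of reward shaping, you could award partial credit for a well-formatted numeric answer even when the value is wrong, so the model is first pushed toward emitting answers in the right shape. A hedged sketch, not the shipped scoring logic in test_pytest_math_example.py:

```python
import re

def shaped_reward(completion: str, gold_answer: str) -> float:
    """Illustrative shaped reward: 1.0 for a correct final answer,
    0.2 for producing any numeric answer at all, 0.0 otherwise.
    Adjust the weights and format check for your own data."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    if not numbers:
        return 0.0
    if float(numbers[-1]) == float(gold_answer):
        return 1.0
    return 0.2  # partial credit: right format, wrong value

print(shaped_reward("The answer is 42.", "42"))  # -> 1.0
print(shaped_reward("The answer is 41.", "42"))  # -> 0.2
print(shaped_reward("I am not sure.", "42"))     # -> 0.0
```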
What’s happening behind the scenes
Understanding the training workflow:
- Evaluation registration: The pytest script evaluates a small GSM8K subset using numeric answer checking, then automatically registers both your evaluator and dataset with Fireworks
- RFT job creation: The create rft command connects your registered evaluator and dataset to a Reinforcement Fine-Tuning job for your chosen base model
- Continuous improvement: As training progresses, evaluation scores on the held-out set reflect improved accuracy, allowing you to iterate quickly before scaling to larger experiments