You can tune models for free on Fireworks. Models under 16B parameters are available for free tuning: when creating a fine-tuning job in the UI, filter for free-tuning models in the model selection area of the fine-tuning creation page. If kicking off jobs from the terminal, you can find the model ID in the Model Library.
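For example, a model ID from the Model Library is referenced in its fully qualified form when launching a job from the terminal:
--base-model accounts/fireworks/models/llama-v3p1-8b-instruct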
The Eval Protocol CLI provides the fastest, most reproducible way to launch RFT jobs. This page covers everything you need to know about using eval-protocol create rft.
Before launching, review Training Prerequisites & Validation for requirements, validation checks, and common errors.
Already familiar with firectl? Use it as an alternative to eval-protocol.

Installation and setup

The following guide will help you:
  • Upload your evaluator to Fireworks. If you don’t have one yet, see Concepts > Evaluators
  • Upload your dataset to Fireworks
  • Create and launch the RFT job

1. Install Eval Protocol CLI

pip install eval-protocol
Verify installation:
eval-protocol --version

2. Set up authentication

Configure your Fireworks API key:
export FIREWORKS_API_KEY="fw_your_api_key_here"
Or create a .env file:
FIREWORKS_API_KEY=fw_your_api_key_here

3. Test your evaluator locally

Before training, verify your evaluator works. The command below discovers your @evaluation_test tests and runs them with pytest. If a Dockerfile is present, it builds an image and runs the tests in Docker; otherwise it runs on your host.
cd evaluator_directory
ep local-test
If using a Dockerfile, it must use a Debian-based image (no Alpine or CentOS), be single-stage (no multi-stage builds), and only use supported instructions: FROM, RUN, COPY, ADD, WORKDIR, USER, ENV, CMD, ENTRYPOINT, ARG. Instructions like EXPOSE and VOLUME are ignored. See Dockerfile constraints for RFT evaluators for details.
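For reference, a minimal Dockerfile satisfying these constraints might look like the sketch below (the python:3.11-slim base image and the requirements.txt file are illustrative assumptions, not requirements):
# Single-stage build on a Debian-based image (the official Python slim images are Debian-based)
FROM python:3.11-slim
WORKDIR /app
# Install the evaluator's dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the evaluator code into the image
COPY . .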

4. Create the RFT job

From the directory where your evaluator and dataset (dataset.jsonl) are located, run:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model my-model-name 
The CLI will:
  • Upload evaluator code (if changed)
  • Upload dataset (if changed)
  • Create the RFT job
  • Display dashboard links for monitoring
Expected output:
Created Reinforcement Fine-tuning Job
   name: accounts/your-account/reinforcementFineTuningJobs/abc123

Dashboard Links:
   Evaluator: https://app.fireworks.ai/dashboard/evaluators/your-evaluator
   Dataset:   https://app.fireworks.ai/dashboard/datasets/your-dataset
   RFT Job:   https://app.fireworks.ai/dashboard/fine-tuning/reinforcement/abc123

5. Monitor training

Click the RFT Job link to watch training progress in real-time. See Monitor Training for details.

Common CLI options

Customize your RFT job with these flags:
Model and output:
--base-model accounts/fireworks/models/llama-v3p1-8b-instruct  # Base model to fine-tune
--output-model my-custom-name                                   # Name for fine-tuned model
Training parameters:
--epochs 2                    # Number of training epochs (default: 1)
--learning-rate 5e-5          # Learning rate (default: 1e-4)
--lora-rank 16                # LoRA rank (default: 8)
--batch-size 65536            # Batch size in tokens (default: 32768)
Rollout (sampling) parameters:
--temperature 0.8   # Sampling temperature (default: 0.7)
--n 8               # Number of rollouts per prompt (default: 4)
--max-tokens 4096   # Max tokens per response (default: 32768)
--top-p 0.95        # Top-p sampling (default: 1.0)
--top-k 50          # Top-k sampling (default: 40)
Remote environments:
--remote-server-url https://your-evaluator.example.com  # For remote rollout processing
Force re-upload:
--force                       # Re-upload evaluator even if unchanged
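These flags can be combined in a single invocation; for example (the output model name below is illustrative):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model tuned-with-custom-params \
  --epochs 2 \
  --learning-rate 5e-5 \
  --lora-rank 16 \
  --n 8 \
  --temperature 0.8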
See all options:
eval-protocol create rft --help

Advanced options

Track training metrics in W&B for deeper analysis:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --wandb-project my-rft-experiments \
  --wandb-entity my-org
Set WANDB_API_KEY in your environment first.
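For example, with a placeholder key:
export WANDB_API_KEY="your_wandb_api_key"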
Save intermediate checkpoints during training:
firectl rftj create \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --checkpoint-frequency 500  # Save every 500 steps
  ...
Available in firectl only.
Speed up training with multiple GPUs:
firectl rftj create \
  --base-model accounts/fireworks/models/llama-v3p1-70b-instruct \
  --accelerator-count 4  # Use 4 GPUs
  ...
Recommended for large models (70B+).
For evaluators that need more time:
firectl rftj create \
  --rollout-timeout 300  # 5 minutes per rollout
  ...
Default is 60 seconds. Increase for complex evaluations.

Examples

Fast experimentation (small model, 1 epoch):
eval-protocol create rft \
  --base-model accounts/fireworks/models/qwen3-0p6b \
  --output-model quick-test
High-quality training (more rollouts, higher temperature):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model high-quality-model \
  --n 8 \
  --temperature 1.0
Remote environment (for multi-turn agents):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --remote-server-url https://your-agent.example.com \
  --output-model remote-agent
Multiple epochs with custom learning rate:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --epochs 3 \
  --learning-rate 5e-5 \
  --output-model multi-epoch-model

Using firectl CLI (Alternative)

If you're already familiar with Fireworks firectl, you can create RFT jobs directly:
firectl rftj create \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --dataset accounts/your-account/datasets/my-dataset \
  --evaluator accounts/your-account/evaluators/my-evaluator \
  --output-model my-finetuned-model
Differences from eval-protocol:
  • Requires fully qualified resource names (accounts/…)
  • Must manually upload evaluators and datasets first
  • More verbose but offers finer control
  • Same underlying API as eval-protocol
See firectl documentation for all options.

Next steps