The Eval Protocol CLI provides the fastest, most reproducible way to launch RFT jobs. This page covers everything you need to know about using eval-protocol create rft.
Before launching, review Training Prerequisites & Validation for requirements, validation checks, and common errors.
Already familiar with firectl? Use it as an alternative to eval-protocol.

Installation and setup

The following guide will help you:
  • Upload your evaluator to Fireworks. If you don’t have one yet, see Concepts > Evaluators
  • Upload your dataset to Fireworks
  • Create and launch the RFT job
1. Install Eval Protocol CLI

pip install eval-protocol
Verify installation:
eval-protocol --version
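If you prefer to keep the CLI isolated from other Python packages, a standard virtual-environment setup also works (ordinary Python tooling, nothing eval-protocol-specific):
python -m venv .venv
source .venv/bin/activate
pip install eval-protocol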
2. Set up authentication

Configure your Fireworks API key:
export FIREWORKS_API_KEY="fw_your_api_key_here"
Or create a .env file:
FIREWORKS_API_KEY=fw_your_api_key_here
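If you used the export approach, a quick sanity check (prints a confirmation only when the variable is set and non-empty):
echo "${FIREWORKS_API_KEY:+FIREWORKS_API_KEY is set}"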
3. Test your evaluator locally

Before training, verify your evaluator works. This command discovers and runs your @evaluation_test with pytest. If a Dockerfile is present, it builds an image and runs the test in Docker; otherwise it runs on your host.
cd evaluator_directory
ep local-test
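As an illustration, the working directory might look like this (the test file name is hypothetical; the Dockerfile is optional, as described above):
evaluator_directory/
├── test_my_evaluator.py   # pytest file containing your @evaluation_test
├── dataset.jsonl          # dataset used when creating the RFT job in the next step
└── Dockerfile             # optional; if present, ep local-test runs inside Docker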
4. Create the RFT job

From the directory containing your evaluator and dataset (dataset.jsonl), run:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model my-model-name 
The CLI will:
  • Upload evaluator code (if changed)
  • Upload dataset (if changed)
  • Create the RFT job
  • Display dashboard links for monitoring
Expected output:
Created Reinforcement Fine-tuning Job
   name: accounts/your-account/reinforcementFineTuningJobs/abc123

Dashboard Links:
   Evaluator: https://app.fireworks.ai/dashboard/evaluators/your-evaluator
   Dataset:   https://app.fireworks.ai/dashboard/datasets/your-dataset
   RFT Job:   https://app.fireworks.ai/dashboard/fine-tuning/reinforcement/abc123
5. Monitor training

Click the RFT Job link to watch training progress in real time. See Monitor Training for details.

Common CLI options

Customize your RFT job with these flags:
Model and output:
--base-model accounts/fireworks/models/llama-v3p1-8b-instruct  # Base model to fine-tune
--output-model my-custom-name                                   # Name for fine-tuned model
Training parameters:
--epochs 2                    # Number of training epochs (default: 1)
--learning-rate 5e-5          # Learning rate (default: 1e-4)
--lora-rank 16                # LoRA rank (default: 8)
--batch-size 65536            # Batch size in tokens (default: 32768)
Rollout (sampling) parameters:
--inference-temperature 0.8   # Sampling temperature (default: 0.7)
--inference-n 8               # Number of rollouts per prompt (default: 4)
--inference-max-tokens 4096   # Max tokens per response (default: 2048)
--inference-top-p 0.95        # Top-p sampling (default: 1.0)
--inference-top-k 50          # Top-k sampling (default: 40)
Remote environments:
--remote-server-url https://your-evaluator.example.com  # For remote rollout processing
Force re-upload:
--force                       # Re-upload evaluator even if unchanged
See all options:
eval-protocol create rft --help
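As an illustration, several of the flags above can be combined in a single command (the values and output model name below are placeholders, not tuning recommendations):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model my-tuned-model \
  --epochs 2 \
  --learning-rate 5e-5 \
  --inference-n 8 \
  --inference-temperature 0.8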

Advanced options

Track training metrics in W&B for deeper analysis:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --wandb-project my-rft-experiments \
  --wandb-entity my-org
Set WANDB_API_KEY in your environment first.
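For example (the value is a placeholder for your own Weights & Biases key):
export WANDB_API_KEY="your_wandb_api_key"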
Save intermediate checkpoints during training:
firectl create rftj \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --checkpoint-frequency 500  # Save every 500 steps
  ...
Available in firectl only.
Speed up training with multiple GPUs:
firectl create rftj \
  --base-model accounts/fireworks/models/llama-v3p1-70b-instruct \
  --accelerator-count 4  # Use 4 GPUs
  ...
Recommended for large models (70B+).
For evaluators that need more time:
firectl create rftj \
  --rollout-timeout 300  # 5 minutes per rollout
  ...
Default is 60 seconds. Increase for complex evaluations.

Examples

Fast experimentation (small model, 1 epoch):
eval-protocol create rft \
  --base-model accounts/fireworks/models/qwen3-0p6b \
  --output-model quick-test
High-quality training (more rollouts, higher temperature):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --output-model high-quality-model \
  --inference-n 8 \
  --inference-temperature 1.0
Remote environment (for multi-turn agents):
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --remote-server-url https://your-agent.example.com \
  --output-model remote-agent
Multiple epochs with custom learning rate:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --epochs 3 \
  --learning-rate 5e-5 \
  --output-model multi-epoch-model

Using firectl CLI (Alternative)

If you are already familiar with Fireworks firectl, you can create RFT jobs with it directly:
firectl create rftj \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --dataset accounts/your-account/datasets/my-dataset \
  --evaluator accounts/your-account/evaluators/my-evaluator \
  --output-model my-finetuned-model
Differences from eval-protocol:
  • Requires fully qualified resource names (accounts/…)
  • Must manually upload evaluators and datasets first (see the dataset upload sketch after this list)
  • More verbose but offers finer control
  • Same underlying API as eval-protocol
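For the dataset half of that manual upload, a minimal sketch (assuming the firectl create dataset command; confirm the exact syntax with firectl create dataset --help, and see the firectl documentation for uploading evaluators):
firectl create dataset my-dataset dataset.jsonl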
See firectl documentation for all options.

Next steps