If you already have an agent running in your product, or need to run rollouts on your own infrastructure, you can integrate it with RFT using the RemoteRolloutProcessor, which delegates rollout execution to an HTTP service you control. Remote agents are ideal for:
  • Multi-turn agentic workflows with tool use
  • Access to private databases, APIs, or internal services
  • Integration with existing agent codebases
  • Complex simulations that require your infrastructure
New to RFT? Start with local agents instead. They’re simpler and cover most use cases. Only use a remote agent environment when you need access to private infrastructure or have an existing agent to integrate.

How remote rollouts work

[Diagram: remote rollout processor flow, showing the interaction between Eval Protocol, your remote server, and Fireworks Tracing]
1. Fireworks triggers rollout: During training, Fireworks calls your service’s POST /init endpoint with the dataset row and correlation metadata.
2. Your service processes the rollout: Your agent executes the task (e.g., multi-turn conversation, tool calls, simulation steps), logging progress via Fireworks tracing.
3. Status reporting: Your service sends structured logs tagged with rollout metadata to Fireworks so the system can track completion.
4. Evaluation: Once Fireworks detects completion, it pulls the full trace and evaluates it using your scoring logic.
Everything except implementing your remote server is handled automatically by Eval Protocol. You only need to implement the /init endpoint and add Fireworks tracing.

Implementing the /init endpoint

Your remote service must implement a single /init endpoint that accepts rollout requests.

Request schema

  • completion_params (object, required): Model configuration, including the model name and inference parameters such as temperature and max_tokens.
  • messages (array): Conversation messages to send to the model.
  • tools (array): Tools available to the model (for function calling).
  • model_base_url (string): Base URL for making LLM calls through Fireworks tracing (includes correlation metadata).
  • metadata (object, required): Rollout execution metadata for correlation (rollout_id, run_id, row_id, etc.).
  • api_key (string): Fireworks API key to use for model calls.

Example request

{
  "completion_params": {
    "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "temperature": 0.7,
    "max_tokens": 2048
  },
  "messages": [
    { "role": "user", "content": "What is the weather in San Francisco?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          }
        }
      }
    }
  ],
  "model_base_url": "https://tracing.fireworks.ai/rollout_id/brave-night-42/invocation_id/wise-ocean-15/experiment_id/calm-forest-28/run_id/quick-river-07/row_id/bright-star-91",
  "metadata": {
    "invocation_id": "wise-ocean-15",
    "experiment_id": "calm-forest-28",
    "rollout_id": "brave-night-42",
    "run_id": "quick-river-07",
    "row_id": "bright-star-91"
  },
  "api_key": "fw_your_api_key"
}
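
If your server is written without Eval Protocol’s InitRequest model, you can mirror this schema yourself. Below is a rough Pydantic sketch, with field shapes inferred from the schema above; treat it as illustrative, not the canonical model.

from typing import Any, Optional

from pydantic import BaseModel

class RolloutMetadata(BaseModel):
    # Correlation IDs generated by RemoteRolloutProcessor
    invocation_id: str
    experiment_id: str
    rollout_id: str
    run_id: str
    row_id: str

class RolloutInitRequest(BaseModel):
    # Hypothetical stand-in for eval_protocol's InitRequest
    completion_params: dict[str, Any]
    messages: list[dict[str, Any]] = []
    tools: Optional[list[dict[str, Any]]] = None
    model_base_url: Optional[str] = None
    metadata: RolloutMetadata
    api_key: Optional[str] = None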

Metadata correlation

The metadata object contains correlation IDs that you must include when logging to Fireworks tracing. This allows Eval Protocol to match logs and traces back to specific evaluation rows. Required metadata fields:
  • invocation_id - Identifies the evaluation invocation
  • experiment_id - Groups related experiments
  • rollout_id - Unique ID for this specific rollout (most important)
  • run_id - Identifies the evaluation run
  • row_id - Links to the dataset row
RemoteRolloutProcessor automatically generates these IDs and sends them to your server. You don’t need to create them yourself; just pass them through to your logging.
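
For instance, a hypothetical helper that forwards every ID into your log records might look like this. The RolloutIdFilter shown below handles rollout_id tagging for you; this is only a sketch of manual pass-through.

import logging

logger = logging.getLogger("eval_server")

def log_with_correlation(request, message):
    # Attach all correlation IDs from the request metadata to this record
    logger.info(message, extra={
        "invocation_id": request.metadata.invocation_id,
        "experiment_id": request.metadata.experiment_id,
        "rollout_id": request.metadata.rollout_id,
        "run_id": request.metadata.run_id,
        "row_id": request.metadata.row_id,
    })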

Fireworks tracing integration

Your remote server must use Fireworks tracing to report rollout status. Eval Protocol polls these logs to detect when rollouts complete.

Basic setup

import logging

from fastapi import FastAPI
from eval_protocol import Status, InitRequest, FireworksTracingHttpHandler, RolloutIdFilter

app = FastAPI()

# Configure Fireworks tracing handler globally
fireworks_handler = FireworksTracingHttpHandler()
logging.getLogger().addHandler(fireworks_handler)

@app.post("/init")
def init(request: InitRequest):
    # Create rollout-specific logger with filter
    rollout_logger = logging.getLogger(f"eval_server.{request.metadata.rollout_id}")
    rollout_logger.addFilter(RolloutIdFilter(request.metadata.rollout_id))

    try:
        # Execute your agent logic here (execute_agent is your own function)
        result = execute_agent(request)

        # Log successful completion with structured status
        rollout_logger.info(
            f"Rollout {request.metadata.rollout_id} completed",
            extra={"status": Status.rollout_finished()}
        )

        return {"status": "success"}

    except Exception as e:
        # Log errors with structured status
        rollout_logger.error(
            f"Rollout {request.metadata.rollout_id} failed: {e}",
            extra={"status": Status.rollout_error(str(e))}
        )
        raise

Key components

  1. FireworksTracingHttpHandler: Sends logs to Fireworks tracing service
  2. RolloutIdFilter: Tags logs with the rollout ID for correlation
  3. Status objects: Structured status reporting that Eval Protocol can parse
    • Status.rollout_finished() - Signals successful completion
    • Status.rollout_error(message) - Signals failure with error details

Alternative: Environment variable approach

If your server processes one rollout at a time (e.g., serverless functions, or a container per request), you can use the EP_ROLLOUT_ID environment variable instead of manual filters:
import os
import logging

from fastapi import FastAPI
from eval_protocol import Status, InitRequest, FireworksTracingHttpHandler

app = FastAPI()

# Configure handler (picks up EP_ROLLOUT_ID from the environment)
fireworks_handler = FireworksTracingHttpHandler()
logging.getLogger().addHandler(fireworks_handler)

logger = logging.getLogger(__name__)

@app.post("/init")
def init(request: InitRequest):
    # Set the rollout ID for this process so logs are tagged automatically
    os.environ["EP_ROLLOUT_ID"] = request.metadata.rollout_id

    # Logs are automatically tagged with rollout_id
    logger.info("Processing rollout...")
    # ... execute agent logic ...

How Eval Protocol uses tracing

  1. Your server logs completion: Uses Status.rollout_finished() or Status.rollout_error()
  2. Eval Protocol polls: Searches Fireworks logs by rollout_id tag until completion signal found
  3. Status extraction: Reads structured status fields (code, message, details) to determine outcome
  4. Trace retrieval: Fetches full trace of model calls and tool use for evaluation
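
Put together, the polling behavior is roughly the following. This is a simplified sketch for intuition only; search_logs and fetch_trace are stand-ins, not real Eval Protocol functions, and the status codes are shown as plain strings for illustration.

import time

def wait_for_rollout(rollout_id, poll_interval=5.0, timeout=3600.0):
    deadline = time.time() + timeout
    while time.time() < deadline:
        # search_logs: stand-in for querying Fireworks logs by rollout_id tag
        for record in search_logs(rollout_id=rollout_id):
            status = record.get("status", {})
            if status.get("code") == "rollout_finished":
                # fetch_trace: stand-in for pulling the full trace to evaluate
                return fetch_trace(rollout_id)
            if status.get("code") == "rollout_error":
                raise RuntimeError(status.get("message", "rollout failed"))
        time.sleep(poll_interval)
    raise TimeoutError(f"Rollout {rollout_id} did not complete in time")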

Complete example

Here’s a minimal but complete remote server implementation:
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from eval_protocol import InitRequest, FireworksTracingHttpHandler, RolloutIdFilter, Status
import logging

app = FastAPI()

# Setup Fireworks tracing
fireworks_handler = FireworksTracingHttpHandler()
logging.getLogger().addHandler(fireworks_handler)

@app.post("/init")
async def init(request: InitRequest):
    # Create rollout-specific logger
    rollout_logger = logging.getLogger(f"eval_server.{request.metadata.rollout_id}")
    rollout_logger.addFilter(RolloutIdFilter(request.metadata.rollout_id))

    rollout_logger.info(f"Starting rollout {request.metadata.rollout_id}")

    try:
        # Your agent logic here
        # 1. Make model calls using request.model_base_url
        # 2. Call tools, interact with environment
        # 3. Collect results

        result = run_your_agent(
            messages=request.messages,
            tools=request.tools,
            model_config=request.completion_params,
            api_key=request.api_key,
            model_base_url=request.model_base_url,
        )

        # Signal completion
        rollout_logger.info(
            f"Rollout {request.metadata.rollout_id} completed successfully",
            extra={"status": Status.rollout_finished()}
        )

        return {"status": "success", "result": result}

    except Exception as e:
        # Signal error
        rollout_logger.error(
            f"Rollout {request.metadata.rollout_id} failed: {str(e)}",
            extra={"status": Status.rollout_error(str(e))}
        )
        return JSONResponse(
            status_code=500,
            content={"status": "error", "message": str(e)}
        )

def run_your_agent(messages, tools, model_config, api_key, model_base_url):
    # Implement your agent logic here: make model calls through
    # model_base_url, execute tools, and return the final result
    pass
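
As a concrete starting point, here is a sketch of run_your_agent that routes a single chat completion through the traced endpoint using an OpenAI-compatible client. It assumes model_base_url speaks the OpenAI chat-completions API and treats completion_params as a plain dict; a real agent would add a tool-execution loop.

from openai import OpenAI

def run_your_agent(messages, tools, model_config, api_key, model_base_url):
    # Point an OpenAI-compatible client at the traced Fireworks endpoint
    client = OpenAI(base_url=model_base_url, api_key=api_key)

    kwargs = {}
    if tools:
        kwargs["tools"] = tools

    response = client.chat.completions.create(
        model=model_config["model"],
        messages=messages,
        temperature=model_config.get("temperature", 1.0),
        max_tokens=model_config.get("max_tokens", 1024),
        **kwargs,
    )

    # A real agent would check response.choices[0].message.tool_calls here,
    # execute the requested tools, append the results as tool messages, and
    # call the model again until it produces a final answer
    return response.choices[0].message.content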

Testing locally

Before deploying, test your remote server locally:
1. Start your server:
uvicorn main:app --reload --port 8080
2. Configure RemoteRolloutProcessor. In your evaluator test, point it at your local server:
from eval_protocol.pytest import RemoteRolloutProcessor

rollout_processor = RemoteRolloutProcessor(
    remote_base_url="http://localhost:8080"
)
3. Run a test evaluation:
pytest my-evaluator-name.py -vs
This sends test rollouts to your local server and verifies the integration works.
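
You can also smoke-test the endpoint directly, without RemoteRolloutProcessor, by posting a request shaped like the schema above. The IDs below are made-up test values, and the server will actually run your agent logic, so keep the task trivial.

import requests

payload = {
    "completion_params": {
        "model": "accounts/fireworks/models/llama-v3p1-8b-instruct"
    },
    "messages": [{"role": "user", "content": "ping"}],
    "metadata": {
        "invocation_id": "test-invocation",
        "experiment_id": "test-experiment",
        "rollout_id": "test-rollout",
        "run_id": "test-run",
        "row_id": "test-row",
    },
    "api_key": "fw_your_api_key",
}

resp = requests.post("http://localhost:8080/init", json=payload)
print(resp.status_code, resp.json())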

Deploying your service

Once tested locally, deploy to production:
  • ✅ Service is publicly accessible (or accessible via VPN/private network)
  • ✅ HTTPS endpoint with valid SSL certificate (recommended)
  • ✅ Authentication/authorization configured
  • ✅ Monitoring and logging set up
  • ✅ Auto-scaling configured for concurrent rollouts
  • ✅ Error handling and retry logic implemented
  • ✅ Service availability SLA meets training requirements
Platform-specific notes:
Vercel/Serverless:
  • One rollout per function invocation
  • Use environment variable approach
  • Configure timeout for long-running evaluations
AWS ECS/Kubernetes:
  • Handle concurrent requests with proper worker configuration (see the launcher sketch after this list)
  • Use RolloutIdFilter approach
  • Set up load balancing
On-premise:
  • Ensure network connectivity from Fireworks
  • Configure firewall rules
  • Set up VPN if needed for security
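
For container platforms, one simple way to handle concurrent rollouts is to run several uvicorn workers from a small launcher script. A sketch; tune the worker count to your expected rollout concurrency:

# run.py
import uvicorn

if __name__ == "__main__":
    # workers > 1 lets one container process several rollouts in parallel
    uvicorn.run("main:app", host="0.0.0.0", port=8080, workers=4)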

Connecting to RFT

Once your remote server is deployed, create an RFT job that uses it:
eval-protocol create rft \
  --base-model accounts/fireworks/models/llama-v3p1-8b-instruct \
  --remote-server-url https://your-evaluator.example.com \
  --dataset my-dataset
During training, the RFT job sends every rollout to your remote server for execution, then evaluates the resulting traces.

Troubleshooting

Symptoms: Rollouts show as timed out or never complete.
Solutions:
  • Check that your service is logging Status.rollout_finished() correctly
  • Verify Fireworks tracing handler is configured
  • Ensure rollout_id is included in log tags
  • Check for exceptions being swallowed without logging
Symptoms: Eval Protocol can’t match logs to rollouts.
Solutions:
  • Verify you’re using the exact rollout_id from request metadata
  • Check that RolloutIdFilter or EP_ROLLOUT_ID is set correctly
  • Ensure logs are being sent to Fireworks (check tracing dashboard)
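
When logs aren’t showing up at all, a quick sanity check is to confirm the tracing handler is actually attached to the root logger:

import logging

# Should include FireworksTracingHttpHandler if your setup code has run
print([type(h).__name__ for h in logging.getLogger().handlers])
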
Symptoms: Training is slow, with high rollout latency.
Solutions:
  • Scale your service to handle concurrent requests
  • Optimize your agent logic (caching, async operations)
  • Add more workers or instances
  • Profile your code to find bottlenecks
Symptoms: Model calls fail or return API errors.
Solutions:
  • Verify API key is passed correctly from request
  • Check that your service has network access to Fireworks
  • Ensure model_base_url is used for traced calls
