Getting Started
Overview
CSF Core RT is the end-to-end SDK for training CSF Core object-detection models and running them in production. Follow the six steps below in order — each maps to a section in this guide.
Step 01
Activate Licence
Bind your key to this device. One internet call, then fully offline.
Step 02
Install Libraries
pip install myelion-csf and myelion-csf-ops from the MYELION private registry.
Step 03
Train a Model
Run csfrt.train() on your labelled YOLO-format dataset.
Step 04
Export .csf
result.export() produces the encrypted .csf model file.
Step 05
Device Setup
Install SDK on the inference machine, activate, copy .csf over.
Step 06
Run Inference
csfrt.load() + model.infer() — Python. Android SDK available separately.
Architecture split: Training and export run on a GPU workstation. The .csf file is then copied to every inference device. The .mkey key file is never shipped alongside the model; the SDK fetches it from MYELION servers automatically at activation time.
Getting Started · Step 1
Activate Your Licence Key
Every machine that runs the SDK must be activated before it can load models. Activation binds a signed token to that machine's hardware fingerprint. After this step the device works fully offline — no repeated network calls.
Licence key formats
| Format | Duration | Use |
| MYELION-XXXX-XXXX-frt | 14 days | Evaluation trial. Prototyping only. |
| MYELION-XXXX-XXXX-1yr | 1 year | Standard production licence. |
| MYELION-XXXX-XXXX-2yr | 2 years | 2-year licence. |
| MYELION-XXXX-XXXX-5yr | 5 years | Long-term / embedded deployment. |
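The licence term is carried by the final segment of the key. As a minimal sketch, assuming the suffix after the last hyphen is the only part that encodes the term, a deployment script could look it up as follows. The `licence_term` helper and the `LICENCE_TERMS` mapping are hypothetical and not part of the SDK.

```python
# Hypothetical helper, not part of csfrt: maps a key suffix to its licence term.
LICENCE_TERMS = {
    "frt": "14-day evaluation trial",
    "1yr": "1-year standard production licence",
    "2yr": "2-year licence",
    "5yr": "5-year long-term / embedded deployment",
}


def licence_term(key: str) -> str:
    """Return the term described by the final hyphen-separated segment."""
    suffix = key.rsplit("-", 1)[-1].lower()
    if suffix not in LICENCE_TERMS:
        raise ValueError(f"unrecognised licence suffix: {suffix!r}")
    return LICENCE_TERMS[suffix]


print(licence_term("MYELION-XXXX-XXXX-1yr"))
```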
Online activation
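Run this once on each machine. It requires a one-time internet connection and the SDK packages from Step 2 (Install Libraries), because the commands below are csfrt modules.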
# Replace with your actual licence key
python -m csfrt.activate MYELION-XXXX-XXXX-1yr
# Confirm activation
python -m csfrt.status
# Status: ACTIVE Key: MYELION-XXXX-… Expires: 2027-06-01 Device: a3f7c2…
The token is written to /etc/csfrt/license.mkey by default. To use a custom path, set the environment variable CSFRT_LICENSE_PATH before activating.
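The lookup can be pictured as below. This is a sketch only: it assumes an explicit path wins over CSFRT_LICENSE_PATH, which in turn wins over the default, which is how the `license` parameter of csfrt.load() is described. The `resolve_token_path` function is a hypothetical illustration, not an SDK call.

```python
import os
from typing import Optional

DEFAULT_TOKEN_PATH = "/etc/csfrt/license.mkey"


def resolve_token_path(explicit: Optional[str] = None) -> str:
    """Pick the token path: explicit argument, then env var, then default."""
    if explicit:
        return explicit
    # Assumed precedence: an empty or unset variable falls back to the default.
    return os.environ.get("CSFRT_LICENSE_PATH") or DEFAULT_TOKEN_PATH


print(resolve_token_path())
```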
Air-gap / no-internet devices: Extract the hardware fingerprint, generate your token in the MYELION portal, and copy it to the device. Full instructions are in the Air-Gap Activation section.
Getting Started · Step 2
Install Libraries
Two packages are required and must always be installed together. Both come from the MYELION private registry — you need your customer token to access it.
Configure pip registry
# Run once per environment — replace YOUR_TOKEN with your registry token
pip config set global.index-url https://pypi.fury.io/YOUR_TOKEN/myelion/simple/
pip config set global.extra-index-url https://pypi.org/simple/
Install
pip install myelion-csf myelion-csf-ops
# Verify both packages
python -c "import csfrt, csfrt_ops; print(csfrt.__version__)"
# → 1.0.0
System requirements
| Platform | Minimum | Recommended |
| Python | 3.9 | 3.11 |
| CUDA | 11.8 | 12.1+ |
| Linux | Ubuntu 20.04 (x86-64 / ARM64) | Ubuntu 22.04 |
| Windows | Windows 10 (x86-64) | Windows 11 |
| Android | API 27 (ARM64) | API 30+ on Snapdragon 8 Gen 1+ |
Getting Started · Step 3
Train a Model
Training runs on your GPU workstation. Prepare a labelled dataset in YOLO format, then call csfrt.train(). The SDK handles the full pipeline: data loading, augmentation, AdamW optimisation with cosine scheduling, mixed-precision training, and checkpoint saving.
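To make "cosine scheduling" concrete, here is a sketch of a linear-warmup plus cosine-decay schedule using the default values of lr, lr_min and warmup_epochs from the csfrt.train() reference. The exact formula the SDK uses is not documented here, so `lr_at` is an assumed illustration. It is at least consistent with the sample history output in this guide (0.0002 at epoch 0, 1e-5 at the final epoch).

```python
import math


def lr_at(epoch: int, epochs: int = 100, lr: float = 1e-3,
          lr_min: float = 1e-5, warmup_epochs: int = 5) -> float:
    """Learning rate for a 0-indexed epoch: linear warmup, then cosine decay."""
    if epoch < warmup_epochs:
        # Assumed warmup shape: ramp linearly and reach the peak on the last warmup epoch.
        return lr * (epoch + 1) / warmup_epochs
    # progress runs from 0.0 (first post-warmup epoch) to 1.0 (final epoch)
    progress = (epoch - warmup_epochs) / max(1, epochs - 1 - warmup_epochs)
    return lr_min + 0.5 * (lr - lr_min) * (1.0 + math.cos(math.pi * progress))


for e in (0, 4, 5, 50, 99):
    print(e, f"{lr_at(e):.6f}")
```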
Dataset structure
dataset/
  images/
    frame_001.jpg
    frame_002.jpg
    ...
  labels/
    frame_001.txt
    frame_002.txt
    ...
  classes.txt
Each label file contains one detection per line in normalised YOLO format:
# class_id cx cy w h (all values 0–1)
0 0.512 0.384 0.224 0.318
1 0.712 0.206 0.109 0.154
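As an illustration of the label format, the sketch below converts one normalised line into pixel corner coordinates for an image of known size. `parse_label_line` and `PixelBox` are hypothetical names for this example. The SDK reads label files itself, so this is only useful for checking your own annotations.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x1: float
    y1: float
    x2: float
    y2: float


def parse_label_line(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert 'class_id cx cy w h' (normalised) to pixel corner coordinates."""
    class_id, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        raise ValueError(f"values must be normalised to 0-1: {line!r}")
    return PixelBox(
        int(class_id),
        (cx - w / 2) * img_w,  # left edge
        (cy - h / 2) * img_h,  # top edge
        (cx + w / 2) * img_w,  # right edge
        (cy + h / 2) * img_h,  # bottom edge
    )


box = parse_label_line("0 0.512 0.384 0.224 0.318", img_w=640, img_h=640)
print(box)
```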
Run training
import csfrt
result = csfrt.train(
    dataset    = "dataset/",
    epochs     = 100,
    batch_size = 16,
    img_size   = 640,
    device     = "cuda",
    save_dir   = "csf_runs/",
)
# Inspect last-epoch metrics
print(result.history[-1])
# {'epoch': 99, 'loss': 0.312, 'box': 0.084, 'cls': 0.201, 'dfl': 0.027, 'lr': 1e-5}
Resume from checkpoint
result = csfrt.train(
    dataset = "dataset/",
    epochs  = 200,
    resume  = "csf_runs/last.pt",  # continues from saved epoch
)
Checkpoints: Training saves best.pt (lowest loss) and last.pt (most recent epoch) into save_dir. Both are standard PyTorch checkpoint files that feed directly into result.export().
Getting Started · Step 4
Export the Model
Export converts the trained PyTorch checkpoint into the encrypted .csf format. The weights are encrypted before being written to disk. The output is a single .csf file.
# Directly from the training result
csf_path = result.export("detector.csf")
print(csf_path) # → detector.csf ← copy this to your inference devices
# Or export from a saved checkpoint
from csfrt.csf_exporter import CSFExporter
from csfrt.csf_core.csf_core_model import CSFCoreModel
import torch
ckpt = torch.load("csf_runs/best.pt")
model = CSFCoreModel(num_classes=ckpt["num_classes"])
model.load_state_dict(ckpt["model_state"])
csf_path = CSFExporter().export(model, ".", model_name="detector")
Output file
| File | Description |
| detector.csf | Your encrypted model file. Transfer this to every inference device that needs to run the model. |
Getting Started · Step 5
Device Setup
Configure the inference machine. The steps differ slightly per platform.
Linux / CUDA server
# 1. Install SDK on the inference machine
pip install myelion-csf myelion-csf-ops
# 2. Activate this device (one-time internet call)
python -m csfrt.activate MYELION-XXXX-XXXX-1yr
# 3. Transfer the model file from your training machine
scp detector.csf user@inference-server:/opt/models/detector.csf
# 4. Quick sanity check
python -c "import csfrt; m=csfrt.load('/opt/models/detector.csf'); print(m.get_info())"
NVIDIA Jetson (edge device)
# Jetson runs ARM64 Linux — identical SDK, identical commands
pip install myelion-csf myelion-csf-ops
python -m csfrt.activate MYELION-XXXX-XXXX-1yr
# Push model over SSH
scp detector.csf jetson@192.168.1.100:~/models/
# Benchmark latency on device
python -m csfrt.bench ~/models/detector.csf --iterations 200 --device cuda
Getting Started · Step 6
Run Inference
Python — synchronous
import csfrt
import numpy as np
# Load model — licence verified automatically from /etc/csfrt/license.mkey
model = csfrt.load("detector.csf", device="cuda", precision="fp16")
# Prepare image — [B, 3, H, W] float32
image = np.random.rand(1, 3, 640, 640).astype(np.float32)
# Run inference
result = model.infer(image)
print(result.shape) # → (1, 144, 80, 80) — P3 scale detections
# Model metadata
print(model.get_info())
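The guide does not spell out how the (1, 144, 80, 80) shape arises. One reading, offered as an assumption: the P3 scale uses a stride of 8 (640 / 8 = 80), and the 144 channels are 4 box sides × 16 distribution bins plus 80 class scores. The stride and bin count below are borrowed from common YOLO-style heads and are not confirmed by this guide. `p3_output_shape` is a hypothetical helper.

```python
def p3_output_shape(batch: int, img_size: int, num_classes: int,
                    stride: int = 8, reg_max: int = 16) -> tuple:
    """Expected P3 output shape under the assumed stride and bin count."""
    channels = 4 * reg_max + num_classes  # box distribution bins + class scores
    grid = img_size // stride             # cells per side at this scale
    return (batch, channels, grid, grid)


print(p3_output_shape(batch=1, img_size=640, num_classes=80))
```

Under the same assumptions, a 1280-pixel input would give a 160 × 160 grid at this scale.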
Python — real-time stream
from csfrt import CSFStream
def on_result(result, metadata):
    print(f"Frame {metadata['id']} shape={result.shape}")
stream = CSFStream(model_path="detector.csf", callback=on_result, batch_size=4)
stream.start()
for i, frame in enumerate(camera_feed):
    stream.push(frame, metadata={"id": i})
stream.stop()
Python SDK
csfrt.load()
| Parameter | Type | Description |
| path | str | Absolute or relative path to the .csf model file. |
| device | str = "cuda" | "cuda", "cuda:N" for a specific GPU, "cpu", or "rocm". |
| precision | str = "fp16" | "fp16" (default, full accuracy) or "int8" (~2× throughput, requires calibration data in the .csf). |
| license | str | None = None | Path to a .mkey token file. If None, reads from $CSFRT_LICENSE_PATH or /etc/csfrt/license.mkey. |
| num_threads | int = 4 | CPU threads for preprocessing. Ignored on GPU. |
Returns: CSFModel
import csfrt
# Default — CUDA FP16
model = csfrt.load("detector.csf")
# Explicit precision
model = csfrt.load("detector.csf", precision="int8")
# CPU inference (no GPU required)
model = csfrt.load("detector.csf", device="cpu")
# Second GPU in multi-GPU setup
model = csfrt.load("detector.csf", device="cuda:1")
# Custom token path
model = csfrt.load("detector.csf", license="/opt/keys/license.mkey")
model.infer()
| Parameter | Type | Description |
| inputs | np.ndarray | torch.Tensor | Input in NCHW format — [batch, 3, height, width] float32. Shape must match model.input_shape. |
| return_numpy | bool = True | Return np.ndarray when True (default). Return torch.Tensor on the model device when False. |
import numpy as np
# Single image
img = np.random.rand(1, 3, 640, 640).astype(np.float32)
out = model.infer(img)
print(out.shape) # (1, 144, 80, 80)
# Batch of 4 images
batch = np.random.rand(4, 3, 640, 640).astype(np.float32)
results = model.infer(batch) # (4, 144, 80, 80)
# Stay on GPU (avoid CPU transfer)
tensor_out = model.infer(img, return_numpy=False)
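Real frames usually arrive as height × width × channel uint8 arrays rather than NCHW float32. The sketch below shows one way to make the conversion with NumPy. The 0 to 1 scaling is an assumption (the examples above feed values from np.random.rand), and channel order and resizing to model.input_shape are left to the caller. `to_nchw` is a hypothetical helper, not an SDK function.

```python
import numpy as np


def to_nchw(image_hwc: np.ndarray) -> np.ndarray:
    """Convert one HxWx3 uint8 image to a [1, 3, H, W] float32 array in 0-1."""
    if image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError(f"expected HxWx3, got {image_hwc.shape}")
    chw = np.transpose(image_hwc, (2, 0, 1))         # HWC -> CHW
    scaled = chw.astype(np.float32) / 255.0          # assumed 0-1 scaling
    return np.ascontiguousarray(scaled[np.newaxis])  # add the batch dimension


frame = np.zeros((640, 640, 3), dtype=np.uint8)  # stand-in for a decoded frame
batch = to_nchw(frame)
print(batch.shape, batch.dtype)
```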
model.infer_async()
| Method | Type | Description |
| future.get(timeout) | float = 10.0 | Block until result is ready. Raises TimeoutError if not done within timeout seconds. |
| future.done() | bool | Returns True if inference has completed. |
# Submit — returns immediately
future = model.infer_async(frame_1)
# Preprocess next frame while frame_1 is running
frame_2 = preprocess(next_frame)
# Collect result (blocks if not ready, raises TimeoutError after 5 s)
result_1 = future.get(timeout=5.0)
# Pipeline 10 frames
futures = [model.infer_async(f) for f in frames]
results = [fut.get() for fut in futures]
CSFStream
| Parameter | Type | Description |
| model_path | str | Path to the .csf model file. |
| callback | Callable[[ndarray, dict], None] | Called with (result, metadata) for each completed inference. Runs on a background thread. |
| batch_size | int = 1 | Frames accumulated before dispatching a single inference call. |
| precision | str = "fp16" | Inference precision — "fp16" or "int8". |
| device | str = "cuda" | Inference device. |
Methods
| Method | Description |
| stream.start() | Start the background inference thread. |
| stream.push(frame, metadata={}) | Queue a frame. Non-blocking. Metadata dict passed through to callback. Raises RuntimeError if called after stop(). |
| stream.stop() | Flush remaining frames, wait for completion, and stop the thread. After calling this, push() will raise. |
from csfrt import CSFStream
def on_result(result, metadata):
    print(f"Frame {metadata['id']}: {result.shape}")
stream = CSFStream(
    model_path = "detector.csf",
    callback   = on_result,
    batch_size = 4,
    precision  = "fp16",
)
stream.start()
for i, frame in enumerate(source):
    stream.push(frame, metadata={"id": i})
stream.stop()
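The push / batch / flush behaviour described in the tables above can be modelled with a queue and one worker thread. The toy class below is a stand-in written for this guide, not the SDK's implementation: `MiniStream` and its `infer_batch` argument are hypothetical, and a plain function plays the role of the model.

```python
import queue
import threading


class MiniStream:
    """Toy model of the documented semantics: batch frames, flush on stop()."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, infer_batch, callback, batch_size=1):
        self._infer_batch = infer_batch
        self._callback = callback
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._stopped = False

    def start(self):
        self._thread.start()

    def push(self, frame, metadata=None):
        if self._stopped:
            raise RuntimeError("push() called after stop()")
        self._queue.put((frame, metadata or {}))  # non-blocking for the caller

    def stop(self):
        self._stopped = True
        self._queue.put(self._STOP)
        self._thread.join()  # wait until remaining frames are processed

    def _dispatch(self, batch):
        frames = [frame for frame, _ in batch]
        for result, (_, meta) in zip(self._infer_batch(frames), batch):
            self._callback(result, meta)  # runs on the background thread

    def _run(self):
        batch = []
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            batch.append(item)
            if len(batch) == self._batch_size:
                self._dispatch(batch)
                batch = []
        if batch:  # flush a final partial batch
            self._dispatch(batch)


results = []
stream = MiniStream(
    infer_batch=lambda frames: [f * 2 for f in frames],  # fake model
    callback=lambda result, meta: results.append((meta["id"], result)),
    batch_size=2,
)
stream.start()
for i in range(5):
    stream.push(i, metadata={"id": i})
stream.stop()
print(results)
```

Because the callback runs on the worker thread, anything it touches in a real application (counters, UI state, file handles) needs to be thread-safe.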
model.benchmark()
| Parameter | Type | Description |
| n_images | int = 1000 | Total number of inference passes to time. |
| batch_size | int = 1 | Images per inference call. Throughput is reported per image. |
Return keys
| Key | Type | Description |
| fps | float | Images processed per second. |
| ms_per_image | float | Mean latency per image in milliseconds. |
| ms_std | float | Standard deviation of per-image latency. |
| device | str | Device used. |
| precision | str | Precision mode used. |
stats = model.benchmark(n_images=1000, batch_size=1)
print(f"FPS: {stats['fps']:.1f} Latency: {stats['ms_per_image']:.2f} ms")
# FPS: 168.4 Latency: 5.94 ms (A100, fp16)
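The returned keys relate to each other in a simple way: in the sample output, 1000 / 5.94 ms ≈ 168 FPS. The sketch below shows how such figures could be derived from raw timings of any callable. It is an assumed illustration (`time_calls` is hypothetical, with no warm-up pass and one timing sample per call), not the SDK's measurement code.

```python
import statistics
import time


def time_calls(fn, n_images: int = 100, batch_size: int = 1) -> dict:
    """Time repeated calls of fn and summarise latency per image."""
    n_calls = max(1, n_images // batch_size)
    per_image_ms = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        per_image_ms.append(elapsed_ms / batch_size)
    mean_ms = statistics.fmean(per_image_ms)
    return {
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
        "ms_per_image": mean_ms,
        "ms_std": statistics.pstdev(per_image_ms),
    }


# A cheap CPU workload stands in for one inference call.
stats = time_calls(lambda: sum(range(10_000)), n_images=50)
print(f"FPS: {stats['fps']:.1f}  Latency: {stats['ms_per_image']:.4f} ms")
```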
model.get_info()
print(model.get_info())
# {
# 'sdk_version': '1.0.0',
# 'model_name': 'detector',
# 'arch_version': 'CSF-Core-v1',
# 'num_params': '17.8M',
# 'num_classes': 80,
# 'input_shape': [1, 3, 640, 640],
# 'precision': 'fp16',
# 'device': 'cuda:0',
# 'licence_key': 'MYELION-XXXX-…',
# 'expires_at': '2027-06-01',
# }
csfrt.train()
| Parameter | Type | Description |
| dataset | str | Path to the dataset root. Must contain images/ and labels/; classes.txt is optional. |
| epochs | int = 100 | Total training epochs. |
| batch_size | int = 16 | Images per gradient step. Reduce if VRAM is exhausted. |
| img_size | int = 640 | Square training resolution. Use 1280 for high-res aerial / dense scenes. |
| device | str = "cuda" | Training device. |
| num_classes | int | None = None | Override class count. If None, read from classes.txt. |
| lr | float = 1e-3 | Peak learning rate (AdamW). |
| lr_min | float = 1e-5 | Minimum LR at end of cosine schedule. |
| weight_decay | float = 5e-4 | AdamW weight decay. Applied to conv/linear params only (not BN/bias). |
| warmup_epochs | int = 5 | Linear warmup duration in epochs. |
| amp | bool = True | Mixed-precision training. Auto-disabled on CPU. |
| grad_clip | float = 1.0 | Gradient clipping max norm. |
| save_dir | str = "csf_runs" | Output directory for best.pt and last.pt checkpoints. |
| resume | str | None = None | Path to a .pt checkpoint to resume from. Restores model weights, optimizer state, and epoch counter. |
| depth_mult | float = 1.0 | Backbone depth multiplier. 1.0 = full model (~19.8M params). |
| width_mult | float = 1.0 | Backbone width multiplier. |
| num_workers | int = 4 | DataLoader worker processes. |
Parameter validation: csfrt.train() raises ValueError before training starts if you pass invalid parameters, for example a non-positive learning rate (lr <= 0), warmup_epochs >= epochs, or batch_size < 1. Corrupt images in the dataset are skipped with a warning rather than crashing training.
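If you build training configurations programmatically, the same three rules can be checked up front in your own code. The sketch below mirrors only the conditions named above. `validate_train_params` is a hypothetical pre-flight helper, and the SDK may check more than this.

```python
def validate_train_params(epochs: int, batch_size: int, lr: float,
                          warmup_epochs: int) -> None:
    """Raise ValueError for the invalid combinations listed in this guide."""
    if lr <= 0:
        raise ValueError(f"lr must be positive, got {lr}")
    if batch_size < 1:
        raise ValueError(f"batch_size must be at least 1, got {batch_size}")
    if warmup_epochs >= epochs:
        raise ValueError(
            f"warmup_epochs ({warmup_epochs}) must be smaller than epochs ({epochs})"
        )


# The documented defaults pass without raising.
validate_train_params(epochs=100, batch_size=16, lr=1e-3, warmup_epochs=5)
```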
import csfrt
# Standard training run
result = csfrt.train(
    dataset    = "dataset/",
    epochs     = 150,
    batch_size = 32,
    img_size   = 640,
    device     = "cuda",
    lr         = 1e-3,
    save_dir   = "runs/exp1",
)
# Print training history
for epoch in result.history:
    print(epoch)
# {'epoch': 0, 'loss': 2.41, 'box': 0.92, 'cls': 1.21, 'dfl': 0.28, 'lr': 0.0002}
# {'epoch': 1, 'loss': 1.98, ...}
# Inspect best checkpoint path
print(result.model_path) # runs/exp1/best.pt
result.export()
| Parameter | Type | Description |
| output_path | str | Destination path for the .csf file. Must end in .csf. |
| int8_calib | bool | Set to True to embed INT8 calibration data in the .csf, as required for loading with precision="int8". |
# Standard export
csf_path = result.export("models/detector.csf")
print(csf_path) # → models/detector.csf ← copy to inference devices
# Export with INT8 calibration flag
csf_path = result.export("models/detector.csf", int8_calib=True)
CLI Commands
python -m csfrt.activate
Activates the SDK on the current machine. Writes a signed token to disk. Requires internet access once.
# Activate with your licence key (from the portal)
python -m csfrt.activate MYELION-XXXX-XXXX-1yr
# Or activate with your registered MYELION email
python -m csfrt.activate your@email.com
# Custom token path
python -m csfrt.activate MYELION-XXXX-XXXX-1yr --output /opt/keys/license.mkey
| Flag | Default | Description |
| <key> | required | Your MYELION licence key, or your registered MYELION email as in the example above. |
| --output PATH | /etc/csfrt/license.mkey | Destination path for the signed token file. |
python -m csfrt.status
Shows the activation status of the current device. No network call required.
python -m csfrt.status
# Status: ACTIVE
# Key: MYELION-XXXX-…1yr
# Expires: 2027-06-01
# Device: a3f7c2b19d4e…
# Token path: /etc/csfrt/license.mkey
python -m csfrt.fingerprint
Prints the hardware fingerprint for this device as a base64 string. Use this value in the MYELION portal to generate an offline activation token.
python -m csfrt.fingerprint
# a3f7c2b19d4e8f2a6c0d1b5e3f7a9b2c4d6e8f0a1b3c5d7e9f2a4b6c8d0e2f4…
python -m csfrt.bench
Command-line benchmark utility. Profiles throughput and latency on any supported device.
# Basic — 100 iterations, FP16, batch 1
python -m csfrt.bench detector.csf --iterations 100 --precision fp16
# Compare FP16 vs INT8 side-by-side
python -m csfrt.bench detector.csf --precision fp16 int8 --compare
# CPU benchmark
python -m csfrt.bench detector.csf --device cpu --iterations 50
# Save JSON report
python -m csfrt.bench detector.csf --output benchmark.json
| Flag | Default | Description |
| <path> | required | Path to the .csf model file. |
| --iterations N | 100 | Number of timed inference passes. |
| --batch N | 1 | Batch size per inference call. |
| --precision MODE | fp16 | fp16, int8, or both (space-separated). |
| --device DEVICE | cuda | cuda, cuda:N, or cpu. |
| --compare | — | Print a side-by-side comparison table for all listed precision modes. |
| --output PATH | — | Write results to a JSON file. |
python -m csfrt.train (CLI)
Run training from the command line without writing a Python script.
python -m csfrt.train --dataset dataset/ --epochs 100 --batch 16 --img-size 640 --device cuda --save-dir csf_runs/
# Resume from checkpoint
python -m csfrt.train --dataset dataset/ --epochs 200 --resume csf_runs/last.pt
Deployment
Air-Gap Activation
For devices with no internet access. Takes three steps — fingerprint extraction, token request, and token deployment.
Step 1 — Extract fingerprint (on the isolated device)
python -m csfrt.fingerprint
# Copy the printed base64 string
Step 2 — Download your token from the portal
Log in to myelion.com/portal, go to Devices → Add Device, paste the fingerprint string, and download the signed token file.
Step 3 — Deploy token (on the isolated device)
# Transfer the token file by any means (USB, SCP, etc.)
sudo cp license.mkey /etc/csfrt/license.mkey
sudo chmod 644 /etc/csfrt/license.mkey
# Verify — no network required
python -m csfrt.status
Docker Images
Pre-built images with all dependencies. Mount your licence file and model at runtime — never bake them into image layers.
# Pull
docker pull ghcr.io/myelion/csfrt:1.0-cuda11.8
docker pull ghcr.io/myelion/csfrt:1.0-cuda12.1
docker pull ghcr.io/myelion/csfrt:1.0-cpu
# Run — licence and model injected at runtime
docker run --gpus all \
-v /etc/csfrt:/etc/csfrt:ro \
-v /path/to/detector.csf:/model.csf:ro \
ghcr.io/myelion/csfrt:1.0-cuda11.8 \
python -c "import csfrt; m=csfrt.load('/model.csf'); print(m.benchmark())"
| Tag | Base image | Use case |
| 1.0-cuda11.8 | nvidia/cuda:11.8-runtime | NVIDIA GPU servers |
| 1.0-cuda12.1 | nvidia/cuda:12.1-runtime | Newer NVIDIA GPUs |
| 1.0-cpu | python:3.11-slim | CPU-only inference |
Precision Modes
| Mode | Flag | Throughput | Notes |
| FP16 | "fp16" | Baseline | Default. Full accuracy. Recommended for all GPU deployments. |
| INT8 | "int8" | ~2× FP16 | 8-bit quantised. Requires calibration data bundled in the .csf at export time. Minor accuracy reduction on some models — always benchmark before deploying. |
INT8 requires calibration data embedded at export time. If the model was exported without it, csfrt.load(precision="int8") raises CSFPrecisionError.
Dataset Format
| Path | Required | Description |
| images/ | Yes | Image files. Supported extensions: .jpg .jpeg .png .bmp .webp. |
| labels/ | Yes | One .txt per image (same stem). Each line: class_id cx cy w h — all values normalised 0–1. |
| classes.txt | Optional | One class name per line. If absent, class names default to their integer indices. |
Image–label pairing: images/frame_042.jpg must have its labels at labels/frame_042.txt. Images with no label file are treated as background-only (zero detections).
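Before a long training run it can be worth listing which images will be treated as background-only and which label files have no matching image. The sketch below does this with pathlib, using the extension list from the table above. `check_pairing` is a hypothetical helper, and the demo builds a throwaway dataset in a temporary directory so it runs anywhere.

```python
import tempfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def check_pairing(root: str) -> dict:
    """Report images without a label file and label files without an image."""
    root_path = Path(root)
    images = {p.stem for p in (root_path / "images").iterdir()
              if p.suffix.lower() in IMAGE_EXTS}
    labels = {p.stem for p in (root_path / "labels").glob("*.txt")}
    return {
        "background_only": sorted(images - labels),  # images with no label file
        "orphan_labels": sorted(labels - images),    # labels with no image
    }


# Demo on a tiny temporary dataset: frame_002 has no label file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    (root / "images" / "frame_001.jpg").touch()
    (root / "images" / "frame_002.jpg").touch()
    (root / "labels" / "frame_001.txt").touch()
    report = check_pairing(tmp)

print(report)
```

On a real dataset, call `check_pairing("dataset/")` and review the `orphan_labels` list in particular, since those annotations are never used.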
Reference
Exceptions
Python
| Exception | Raised when |
| MyelionLicenceError | No valid token found, key not in database, or device limit exceeded. |
| MyelionLicenceExpiredError | Token validity period elapsed. Renew at myelion.com. |
| MyelionHardwareError | Device fingerprint changed since activation (hardware replaced, OS reinstalled). |
| MyelionModelError | .csf corrupt, wrong magic bytes, or incompatible SDK version. |
| MyelionNetworkError | Online activation cannot reach the MYELION licence server. |
| CSFPrecisionError | INT8 requested but the .csf has no calibration data. |
| CSFShapeError | Input tensor shape does not match model.input_shape. |
| CSFDeviceError | Requested device (cuda, rocm) not available on this machine. |
import csfrt
try:
    model = csfrt.load("detector.csf")
    result = model.infer(image)
except csfrt.MyelionLicenceError as e:
    print(f"Licence: {e}")
except csfrt.MyelionModelError as e:
    print(f"Model: {e}")
except csfrt.CSFShapeError as e:
    print(f"Shape: {e}")