# ML-Dash — Full documentation

> ML-Dash is a simple, flexible SDK for ML experiment tracking and data storage — log parameters, metrics, files, and time-series tracks locally or against a remote dash.ml server.

Generated from https://docs.dash.ml. 14 pages.

---

Source: https://docs.dash.ml

# ML-Dash Documentation

ML-Dash is a simple, flexible SDK for ML experiment tracking and data storage. Log parameters, metrics, files, and time-series tracks with one API — locally or against a remote dash.ml server.

## Installation

```bash
pip install ml-dash
```

## Quick Start

```python
from ml_dash import Experiment

with Experiment(prefix="my-user/my-project/exp1", dash_root=".dash").run as exp:
    exp.params.set(learning_rate=0.001, batch_size=32)

    for epoch in range(10):
        loss = train_one_epoch()  # placeholder for your own training step; returns a scalar loss
        exp.metrics("train").log(loss=loss, epoch=epoch)
```

Swap `dash_root=".dash"` for `dash_url="https://api.dash.ml"` (after `ml-dash login`) to sync to a remote server. See [Getting Started](/getting-started.md) for both modes in detail.

Using [Claude Code](https://claude.ai/download)? Install the plugin for in-editor help: `/plugin marketplace add fortyfive-labs/ml-dash` then `/plugin install ml-dash@ml-dash`.

## Documentation

**Core**

- [Getting Started](/getting-started.md) — install, auth, first experiment
- [Experiments](/experiments.md) — lifecycle and configuration
- [Parameters](/parameters.md) — hyperparameter logging
- [Metrics](/metrics.md) — scalar metrics and batching
- [Logging](/logging.md) — text logs and console capture
- [Files](/files.md) — artifacts and uploads
- [CLI Commands](/cli.md) — `ml-dash` command reference
- [API Reference](/api-reference.md) — full Python API
- [Complete Examples](/complete-examples.md) — end-to-end scripts

**Advanced**

- [Background Buffering](/buffering.md) — non-blocking I/O with auto-batching
- [Tracks](/tracks.md) — time-series data for robotics and RL
- [Images](/images.md) — numpy to PNG/JPEG conversion

## Links

- **GitHub**: https://github.com/fortyfive-labs/ml-dash
- **PyPI**: https://pypi.org/project/ml-dash/
- **Dashboard**: https://dash.ml

---

Source: https://docs.dash.ml/getting-started

# Getting Started

Get up and running with ML-Dash in under 5 minutes.

## Installation

```bash
pip install ml-dash
```

## Your First Experiment (Local)

Local mode stores everything on your filesystem — no account, no network, perfect for trying things out.

```python
from ml_dash import Experiment

# Prefix format: owner/project/experiment-name
with Experiment(
    prefix="alice/tutorial/my-first-experiment",
    dash_root=".dash",
).run as exp:
    exp.log("Training started", level="info")

    exp.params.set(
        learning_rate=0.001,
        batch_size=32,
        epochs=10,
    )

    for epoch in range(10):
        loss = 1.0 - epoch * 0.08
        exp.metrics("train").log(loss=loss, epoch=epoch)

    exp.log("Training completed", level="info")
```

Your data lives in `.dash/alice/tutorial/my-first-experiment/`:

```
.dash/
└── alice/                              # owner
    └── tutorial/                       # project
        └── my-first-experiment/        # experiment
            ├── logs/logs.jsonl
            ├── parameters/parameters.json
            └── metrics/train/data.jsonl
```

## Your First Experiment (Remote)

Ready to sync to the ML-Dash server? Authenticate once, then pass a `dash_url`.

### 1. Authenticate

```bash
ml-dash login
```

This opens your browser for OAuth2 and stores a token in your system keychain.

### 2. Run with `dash_url`

```python
from ml_dash import Experiment

with Experiment(
    prefix="alice/my-project/training-run",
    dash_url="https://api.dash.ml",  # token auto-loaded from keychain
).run as exp:
    exp.log("Running on remote server", level="info")
    exp.params.set(learning_rate=0.001)

    for epoch in range(10):
        loss = 1.0 - epoch * 0.08
        exp.metrics("train").log(loss=loss, epoch=epoch)
```

The API is identical — only the constructor args change.

## Next Steps

- **Concepts**: read the [Overview](/index.md) for the mental model
- **Feature guides**: [Logging](/logging.md), [Parameters](/parameters.md), [Metrics](/metrics.md), [Files](/files.md)
- **Going deeper**: [Experiments](/experiments.md) for advanced patterns (training loops, file uploads, tracks)

Using [Claude Code](https://claude.ai/download)? Install the companion plugin for in-editor help: `/plugin marketplace add fortyfive-labs/ml-dash` then `/plugin install ml-dash@ml-dash`.

## Install the docs as a skill

These docs ship as an [Agent Skill](/llm-readable.md) so your agent can answer
ML-Dash questions accurately — without you pasting context. Install it once and
Claude loads it on demand.

**Claude Code — this project only** (drop it in the project's skills dir):

```bash
curl -L https://docs.dash.ml/skills/dash-docs.zip -o dash-docs.zip
unzip dash-docs.zip -d .claude/skills/ && rm dash-docs.zip
```

**Claude Code — every project** (install under your home config):

```bash
curl -L https://docs.dash.ml/skills/dash-docs.zip -o dash-docs.zip
unzip dash-docs.zip -d ~/.claude/skills/ && rm dash-docs.zip
```

The skill is `dash-docs/` with a `SKILL.md` and one markdown reference file
per docs page. It's regenerated on every deploy, so it never drifts from the
site.

> **Note:** No install needed for one-off questions. Point any agent at
> [`https://docs.dash.ml/llms.txt`](https://docs.dash.ml/llms.txt)
> (an index) or `https://docs.dash.ml/llms-full.txt` (the whole site in one
> file). See [LLM-Readable Docs](/llm-readable.md) for all the ways to consume these
> docs as markdown.

---

Source: https://docs.dash.ml/experiments

# Experiments

The `Experiment` class is the core abstraction in ML-Dash. A single instance owns one run's logs, parameters, metrics, and files.

## Prefix Format

The prefix is a universal key that identifies an experiment:

```
owner/project/path.../name
```

- **owner**: first segment (e.g. your username)
- **project**: second segment
- **path**: any number of intermediate segments forming a folder path
- **name**: last segment

Reopening the same prefix resumes that experiment (upsert semantics).
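To make the segment rules above concrete, here is a minimal sketch that splits a prefix into its four parts. `parse_prefix` and `PrefixParts` are hypothetical names used for illustration, not part of the ML-Dash API, and the requirement of at least three segments is an assumption.

```python
from typing import NamedTuple


class PrefixParts(NamedTuple):
    owner: str
    project: str
    path: str  # intermediate folder segments joined by "/", may be empty
    name: str


def parse_prefix(prefix: str) -> PrefixParts:
    """Split 'owner/project/path.../name' into its four parts."""
    segments = [s for s in prefix.strip("/").split("/") if s]
    if len(segments) < 3:
        # Assumption: owner, project, and name are all required.
        raise ValueError(f"expected at least owner/project/name, got {prefix!r}")
    owner, project, *folder, name = segments
    return PrefixParts(owner, project, "/".join(folder), name)


parts = parse_prefix("alice/cv/sweeps/lr/run-01")
print(parts)
```

With a three-segment prefix such as `alice/tutorial/my-first-experiment`, the `path` part is simply empty.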

## Constructor

```python
Experiment(
    prefix: str | None = None,        # owner/project/.../name (or DASH_PREFIX env)
    *,
    readme: str | None = None,        # human-readable description
    tags: list[str] | None = None,    # categorization
    bindrs: list[str] | None = None,  # resource/team association
    metadata: dict | None = None,
    dash_url: str | bool | None = None,  # remote API URL (None = local-only)
    dash_root: str | None = ".dash",     # local storage root (None = remote-only)
    **run_params,                        # forwarded to the RUN namespace
)
```

Mode is inferred from `dash_url` / `dash_root`:

- `dash_root` only (default): local-only, writes to `.dash/`
- `dash_url` + `dash_root`: hybrid (local + remote)
- `dash_url`, `dash_root=None`: remote-only

The auth token is auto-loaded from `~/.dash/token.enc` when `dash_url` is set.
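The mode rules above can be sketched as a small decision function. This is an illustration only: `infer_mode` is a hypothetical helper, and raising an error when both arguments are empty is an assumption rather than documented behavior.

```python
from typing import Optional, Union


def infer_mode(
    dash_url: Optional[Union[str, bool]] = None,
    dash_root: Optional[str] = ".dash",
) -> str:
    """Return 'local', 'hybrid', or 'remote' following the rules listed above."""
    has_remote = bool(dash_url)        # None, False, or "" means no remote
    has_local = dash_root is not None
    if has_remote and has_local:
        return "hybrid"
    if has_remote:
        return "remote"
    if has_local:
        return "local"
    # Assumption: with neither target there is nowhere to write.
    raise ValueError("set dash_url, dash_root, or both")


print(infer_mode())                                   # defaults
print(infer_mode("https://api.dash.ml"))              # dash_root keeps its default
print(infer_mode("https://api.dash.ml", None))        # local storage disabled
```

Under these rules, passing only `dash_url` leaves `dash_root` at its default and would therefore select hybrid mode; pass `dash_root=None` explicitly if you want no local copy.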

## The `.run` Lifecycle

Every `Experiment` exposes `.run`, a context manager that drives the lifecycle:

| Method | Status set | Notes |
|---|---|---|
| `exp.run.start()` | RUNNING | also performed by `__enter__` |
| `exp.run.complete()` | COMPLETED | called by `__exit__` on clean exit |
| `exp.run.fail()` | FAILED | called by `__exit__` if an exception propagates |
| `exp.run.cancel()` | CANCELLED | manual only |

Status updates require remote mode; local mode does not track status.
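The table above maps naturally onto Python's context-manager protocol. The sketch below uses a hypothetical `SketchRun` class to show how a clean exit and a propagating exception could lead to different statuses; it is not ML-Dash's implementation, and it returns itself from `__enter__` where the real `.run` yields the experiment.

```python
class SketchRun:
    """Hypothetical stand-in for the object behind `exp.run`."""

    def __init__(self):
        self.status = None

    def start(self):
        self.status = "RUNNING"

    def complete(self):
        self.status = "COMPLETED"

    def fail(self):
        self.status = "FAILED"

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.complete()
        else:
            self.fail()
        return False  # let the exception keep propagating


clean = SketchRun()
with clean:
    pass

failed = SketchRun()
try:
    with failed:
        raise RuntimeError("boom")
except RuntimeError:
    pass

print(clean.status, failed.status)
```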

## Usage

Context manager (recommended):

```python
from ml_dash import Experiment

with Experiment(prefix="alice/project/my-experiment").run as exp:
    exp.log("Training started")
    exp.params.set(learning_rate=0.001)
    exp.metrics("train").log(loss=0.5, epoch=1)
    exp.files("models").save("model.pth")
```

For decorator-style usage (`@ml_dash_experiment(...)`) and the `RUN.entry = __file__` auto-prefix pattern, see [Getting Started](/getting-started.md). For full multi-step training scripts, see [Complete Examples](/complete-examples.md).

## What Lives Where

Once an experiment is open, feature-specific APIs are documented on their own pages:

- [Logging](/logging.md) — `exp.log(...)`
- [Parameters](/parameters.md) — `exp.params.set(...)`
- [Metrics](/metrics.md) — `exp.metrics(track).log(...)`
- [Files](/files.md) — `exp.files(prefix).save(...)`

---

**Next:** Learn about [Logging](/logging.md) to track events and progress.

---

Source: https://docs.dash.ml/parameters

# Parameters

Hyperparameters and configuration values for an experiment. Parameters are static key-value pairs set once (or merged across calls) — for time-series data, see [Metrics](/metrics.md).

## Setting Parameters

```python
from ml_dash import Experiment

with Experiment(prefix="alice/project/run-01").run as exp:
    exp.params.set(
        learning_rate=0.001,
        batch_size=32,
        optimizer="adam",
    )
```

Nested dicts are flattened to dot notation:

```python
exp.params.set(
    model={"architecture": "resnet50", "pretrained": True},
    optimizer={"type": "adam", "lr": 0.001},
)
# Stored as: model.architecture, model.pretrained, optimizer.type, optimizer.lr
```

## Updating Parameters

Multiple calls merge — later values overwrite earlier ones:

```python
exp.params.set(learning_rate=0.001, batch_size=32)
exp.params.set(learning_rate=0.0001)  # overrides; batch_size preserved
```
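The flattening and merge behavior described above can be reproduced in a few lines of plain Python. This is a sketch of the idea, not the library's code; `flatten` and `stored` are hypothetical names.

```python
def flatten(params: dict, parent: str = "") -> dict:
    """Turn nested dicts into a flat dict with dot-notation keys."""
    flat = {}
    for key, value in params.items():
        dotted = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Flattening nested dicts.
nested = {
    "model": {"architecture": "resnet50", "pretrained": True},
    "optimizer": {"type": "adam", "lr": 0.001},
}
print(flatten(nested))

# Merging: each call updates the stored dict, so later values win.
stored = {}
stored.update(flatten({"learning_rate": 0.001, "batch_size": 32}))
stored.update(flatten({"learning_rate": 0.0001}))
print(stored)
```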

## From Config Objects

Pass class objects directly — works with `params-proto`, plain classes, or any object with attributes. Private attributes (prefixed with `_`) are skipped.

```python
class Args:
    batch_size = 64
    learning_rate = 0.001

exp.params.set(Args=Args)
# → Args.batch_size = 64, Args.learning_rate = 0.001
```
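As a rough illustration of how attributes on a class object could be collected, the sketch below walks `vars()` and skips names that start with `_`. `public_attrs` is a hypothetical helper; skipping callables and ignoring inherited attributes are assumptions, not documented behavior.

```python
def public_attrs(name: str, obj) -> dict:
    """Collect non-private, non-callable attributes under 'name.attr' keys."""
    return {
        f"{name}.{attr}": value
        for attr, value in vars(obj).items()
        if not attr.startswith("_") and not callable(value)
    }


class Args:
    batch_size = 64
    learning_rate = 0.001
    _secret = "skipped"


collected = public_attrs("Args", Args)
print(collected)
```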

For dataclasses, splat with `asdict`:

```python
from dataclasses import asdict
exp.params.set(**asdict(config))
```

## Reading Parameters

```python
exp.params.get()                # flat dict with dot notation (default)
exp.params.get(flatten=False)   # nested dict
```
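The two return shapes are inverses of each other. A minimal sketch of rebuilding the nested form from dot-notation keys (the hypothetical `unflatten` below is for illustration and assumes no key is both a scalar and a parent):

```python
def unflatten(flat: dict) -> dict:
    """Rebuild a nested dict from dot-notation keys."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat = {"learning_rate": 0.001, "model.architecture": "resnet50", "model.layers": 50}
print(unflatten(flat))
```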

## API

- `exp.params.set(**kwargs)` — set or merge parameters. Accepts scalars, nested dicts, and class objects. Returns self.
- `exp.params.log(**kwargs)` — alias for `set()`, identical behavior.
- `exp.params.get(flatten=True)` — retrieve current parameters.

---

**Next:** [Metrics](/metrics.md) for time-series tracking.

---

Source: https://docs.dash.ml/metrics

# Metrics

Time-series data that changes over the course of a run: loss, accuracy, learning rate, and any custom scalars you want to chart.

## Streams

Every metric value lives on a *stream*. A stream is just a named series identified by a prefix, like `train` or `eval`. You select one by calling `exp.metrics(prefix)`, then `.log(...)` appends a point.

The unprefixed `exp.metrics.log(...)` writes to the root stream, which is the right place for run-wide scalars like `epoch` or `step`.

## Basic Logging

Log scalars by keyword. Nested `dict`s automatically route to child streams:

```python
from ml_dash import Experiment

with Experiment(prefix="alice/project/my-experiment").run as exp:
    for epoch in range(10):
        # train_one_epoch, evaluate, and model come from your own training code
        train_loss, train_acc = train_one_epoch(model)
        eval_loss, eval_acc = evaluate(model)

        exp.metrics.log(
            epoch=epoch,
            train=dict(loss=train_loss, accuracy=train_acc),
            eval=dict(loss=eval_loss, accuracy=eval_acc),
        )
```

The `train=dict(...)` form is equivalent to `exp.metrics("train").log(...)`. Pick whichever reads better at the call site.

## Explicit Streams

When metrics for a stream are produced at different points in the loop, address each stream directly and flush once you have a coherent snapshot:

```python
for epoch in range(10):
    exp.metrics("train").log(loss=train_loss, accuracy=train_acc)
    exp.metrics("eval").log(loss=eval_loss, accuracy=eval_acc)
    exp.metrics.log(epoch=epoch).flush()
```

Stream names are arbitrary; `system`, `lr`, `grad_norm`, etc. all work the same way.

## Reading

Read points back by index range:

```python
result = exp.metrics("train").read(start_index=0, limit=10)
for point in result["data"]:
    print(point["index"], point["data"])
```

## Per-Batch Logging

Logging every batch with `.log()` is fine but produces a lot of points. To accumulate batch-level values and emit one summarized row per epoch, see [Buffering](/buffering.md).

For configuration like learning rate and batch size, use [Parameters](/parameters.md). For text events and structured logs, see [Logging](/logging.md).

---

**Next:** Learn about [Files](/files.md) to upload models, plots, and artifacts.

---

Source: https://docs.dash.ml/logging

# Logging

Structured event logging with timestamps, levels, and optional metadata. For numeric series use [Metrics](/metrics.md); for hyperparameters use [Parameters](/parameters.md).

## Basic Usage

```python
from ml_dash import Experiment

with Experiment(prefix="alice/project/my-experiment").run as exp:
    exp.log("Training started")
    exp.log("GPU memory low", level="warn")
    exp.log("Failed to load checkpoint", level="error")
```

## Log Levels

`debug`, `info` (default), `warn`, `error`, `fatal`.

```python
exp.log("Detailed debugging info", level="debug")
exp.log("Learning rate decreased", level="warn")
exp.log("Out of memory - aborting", level="fatal")
```

## Structured Metadata

Attach arbitrary fields via `metadata=`:

```python
exp.log(
    "Epoch completed",
    level="info",
    metadata={"epoch": 5, "train_loss": 0.234, "val_loss": 0.456},
)
```

## Error Tracking

```python
try:
    result = risky_operation()
except Exception as e:
    exp.log(
        f"Operation failed: {e}",
        level="error",
        metadata={"error_type": type(e).__name__},
    )
    raise
```

## Storage and Retrieval

**Local mode** writes JSONL to `.dash/<prefix>/logs/logs.jsonl`. Each line:

```json
{"timestamp": "2025-10-29T10:30:00Z", "level": "info", "message": "Training started", "metadata": null, "sequenceNumber": 0}
```

**Remote mode** stores entries in MongoDB, indexed by timestamp and level. Retrieve via the dashboard UI or the experiment API.
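Because local logs are plain JSONL, they can be inspected with the standard library alone. The sketch below reads a file in the line format shown above and filters by level. `read_logs` is a hypothetical helper, and the severity ordering is assumed from the order in which the levels are listed; the demo writes its own throwaway file so it runs anywhere.

```python
import json
import tempfile
from pathlib import Path

LEVELS = ["debug", "info", "warn", "error", "fatal"]  # assumed severity order


def read_logs(path, min_level="debug"):
    """Yield log records from a JSONL file at or above `min_level`."""
    threshold = LEVELS.index(min_level)
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if LEVELS.index(record["level"]) >= threshold:
                yield record


# Demo against a temporary file that mimics the documented line format.
sample = [
    {"timestamp": "2025-10-29T10:30:00Z", "level": "info",
     "message": "Training started", "metadata": None, "sequenceNumber": 0},
    {"timestamp": "2025-10-29T10:31:00Z", "level": "warn",
     "message": "GPU memory low", "metadata": None, "sequenceNumber": 1},
]
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "logs.jsonl"
    log_path.write_text("\n".join(json.dumps(r) for r in sample) + "\n", encoding="utf-8")
    everything = list(read_logs(log_path))
    warnings = list(read_logs(log_path, min_level="warn"))

print([r["message"] for r in warnings])
```

For a real experiment, point `read_logs` at `.dash/<prefix>/logs/logs.jsonl`.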

---

**Next:** Learn about [Parameters](/parameters.md) to track hyperparameters and configuration.

---

Source: https://docs.dash.ml/files

# Files

Upload and manage experiment artifacts — checkpoints, configs, results, and arbitrary blobs. Files are automatically checksummed, organized by prefix, and addressable by path.

For image uploads see [/images](/images.md). For frame buffers and ring-buffer patterns see [/buffering](/buffering.md). For time-series media tracks see [/tracks](/tracks.md).

## API at a Glance

```python
from ml_dash import Experiment

with Experiment(prefix="alice/project/my-experiment").run as exp:

    # Upload a file from disk
    exp.files("checkpoints").upload("./model.pt")
    exp.files("checkpoints").upload("./model.pt", to="best.pt")

    # Save Python objects directly
    exp.files("configs").save_json({"lr": 1e-3}, to="config.json")
    exp.files("configs").save_text("yaml: content\n", to="view.yaml")
    exp.files("data").save_blob(b"\x00\x01", to="data.bin")
    exp.files("checkpoints").save_torch(model, to="model.pt")
    exp.files("data").save_pkl(obj, to="data.pkl")
    exp.files("plots").save_fig(fig, to="loss.png")
    exp.files("videos").save_video(frames, to="rollout.mp4", fps=30)
```

`save_*` methods accept the same metadata kwargs as `upload()` (see below) and return a result dict with `id`, `filename`, `path`, `sizeBytes`, and `checksum`.

The unified `save()` method dispatches based on content type: strings that are file paths go through `upload()`, bytes through `save_blob()`, dicts/lists through `save_json()`, and numpy arrays through `save_image()`.
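The dispatch described above amounts to a type check on the content. The sketch below returns the name of the method that would be chosen; `pick_save_method` and `FakeArray` are hypothetical, the order of the checks is an assumption, and treating every string as a file path is a simplification of whatever path detection the library actually performs.

```python
import os


def pick_save_method(content) -> str:
    """Name the method a unified save() might route this content to."""
    if isinstance(content, (bytes, bytearray)):
        return "save_blob"
    if isinstance(content, (dict, list)):
        return "save_json"
    if hasattr(content, "__array_interface__"):  # attribute exposed by numpy arrays
        return "save_image"
    if isinstance(content, (str, os.PathLike)):
        return "upload"  # assumption: every string is treated as a file path
    raise TypeError(f"no save method for {type(content).__name__}")


class FakeArray:
    """Hypothetical stand-in so the demo runs without numpy installed."""
    __array_interface__ = {"shape": (2, 2), "typestr": "|u1", "version": 3}


for sample in ("./model.pt", b"\x00\x01", {"lr": 1e-3}, [1, 2], FakeArray()):
    print(type(sample).__name__, "->", pick_save_method(sample))
```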

## Save with Metadata

Attach a description, tags, and arbitrary metadata to any saved file:

```python
result = exp.files("models").save_torch(
    model,
    to="best_model.pt",
    description="Best model from epoch 50",
    tags=["checkpoint", "best"],
    metadata={"epoch": 50, "val_accuracy": 0.95},
)
```

| Param        | Type              | Description                                  |
|--------------|-------------------|----------------------------------------------|
| `to`         | `str`             | Destination filename (relative to prefix)    |
| `description`| `str`             | Human-readable description                   |
| `tags`       | `list[str]`       | Tags for filtering and search                |
| `metadata`   | `dict`            | Custom JSON-serializable metadata            |

## Paths and Organization

The argument to `exp.files(...)` is a logical prefix. Use it to group related artifacts:

```python
exp.files("models").upload("model.pth")
exp.files("models/checkpoints").upload("best.pth")
exp.files("config").save_json(config, to="config.json")
exp.files("results").upload("results.csv")
```

### Direct Style (Alternative)

You can also skip the prefix call and embed the full path in `to=`:

```python
exp.files.upload("./model.pt", to="models/checkpoints/best.pt")
exp.files.save_json({"k": "v"}, to="configs/config.json")
exp.files.save_text("content", to="notes/view.yaml")
```

## Listing, Downloading, Deleting

```python
# List
all_files   = exp.files().list()
model_files = exp.files("models").list()
pngs        = exp.files("images").list("*.png")
configs     = exp.files().list("**/*.json")

# Download
exp.files("model.pt").download()                       # current dir
exp.files("model.pt").download(to="./local.pt")        # custom dest
paths = exp.files("images").download("*.png", to="./out")

# Delete
exp.files("some.txt").delete()
exp.files("images").delete("*.png")
```

Downloads automatically verify SHA256 checksums against the recorded value.
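If you want to repeat that check yourself, for example on a file copied by other means, the standard library is enough. This sketch assumes the recorded `checksum` is a hex-encoded SHA256 digest, which the docs do not state explicitly; `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 against a recorded hex digest."""
    return sha256_of(path) == expected_hex.lower()


# Demo with a temporary file and a digest computed from the same bytes.
with tempfile.TemporaryDirectory() as tmp:
    blob = Path(tmp) / "data.bin"
    blob.write_bytes(b"\x00\x01")
    recorded = hashlib.sha256(b"\x00\x01").hexdigest()
    ok = verify(blob, recorded)
    tampered = verify(blob, "0" * 64)

print(ok, tampered)
```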

## Updating Metadata

Update the description, tags, or metadata of an existing file by its `file_id`:

```python
exp.files(
    file_id="abc123",
    description="Updated description",
    tags=["new", "tags"],
    metadata={"updated": True},
).update()
```

## Deduplication

Each file is stored under a unique snowflake ID, so saving the same logical filename multiple times preserves all versions rather than overwriting. The path layout on disk is:

```
files/{prefix}/{snowflake_id}/{filename}
```

In remote mode, files land at `s3://bucket/files/{namespace}/{project}/{experiment}/{prefix}/{file_id}/filename` with metadata (path, size, SHA256, tags, description) in MongoDB. Maximum file size: **100 GB**.

## End-to-End Example

```python
import torch
from ml_dash import Experiment

# `model`, `loader`, and `train_one_epoch` stand in for your own training code.

with Experiment(prefix="alice/cv/resnet").run as exp:
    exp.params.set(model="resnet50", epochs=100)

    best_acc = 0.0
    for epoch in range(100):
        loss, acc = train_one_epoch(model, loader)
        exp.metrics.log(epoch=epoch, train=dict(loss=loss, accuracy=acc))

        if (epoch + 1) % 10 == 0:
            exp.files("checkpoints").save_torch(
                model.state_dict(),
                to=f"epoch_{epoch + 1}.pt",
                tags=["checkpoint"],
                metadata={"epoch": epoch + 1, "accuracy": acc},
            )

        if acc > best_acc:
            best_acc = acc
            exp.files("models").save_torch(
                model.state_dict(),
                to="best.pt",
                description=f"Best model (acc={best_acc:.4f})",
                tags=["best"],
                metadata={"epoch": epoch + 1, "accuracy": best_acc},
            )
```

---

Source: https://docs.dash.ml/cli

# Command-Line Interface (CLI)

The `ml-dash` CLI authenticates, queries, and manages projects and experiments on a remote server. It is installed along with the Python package:

```bash
pip install ml-dash
ml-dash --help
```

| Command | Description |
|---|---|
| `version` | Show ml-dash version |
| `login` | Authenticate via OAuth2 device flow |
| `logout` | Clear stored token |
| `profile` | Show current user |
| `api` | Send raw GraphQL queries/mutations |
| `list` | List projects, experiments, or tracks |
| `create` | Create a project |

All commands accept `--dash-url` / `--api-url` (defaults to stored config or `https://api.dash.ml`) and `--help`.

---

## `ml-dash version`

Print the installed version.

```bash
ml-dash version
```

---

## `ml-dash login`

Authenticate using the OAuth2 device authorization flow. The command displays a URL, a user code, and a QR code, opens a browser, polls for authorization (10-minute timeout), and then stores the token in the system keychain.

```bash
ml-dash login [--dash-url URL] [--auth-url URL] [--no-browser]
```

| Flag | Description |
|---|---|
| `--dash-url`, `--api-url` | ML-Dash server URL |
| `--auth-url` | OAuth authorization server URL |
| `--no-browser` | Don't open the browser automatically |

```bash
ml-dash login --dash-url https://your-server.com
```

After login, all commands pick up the stored token; no `--api-key` flag needed.

---

## `ml-dash logout`

Clear the stored token from the system keychain.

```bash
ml-dash logout
```

---

## `ml-dash profile`

Show the currently authenticated user. By default the command fetches live data from the server; `--cached` reads only the stored token payload.

```bash
ml-dash profile [--dash-url URL] [--json] [--cached]
```

| Flag | Description |
|---|---|
| `--json` | Output as JSON |
| `--cached` | Use cached token data instead of fetching |

The output includes the username, user ID, name, email, remote URL, and token expiration status.

```bash
ml-dash profile --json
```

---

## `ml-dash api`

Send raw GraphQL queries or mutations.

```bash
ml-dash api (--query QUERY | --mutation MUTATION) [--jq PATH] [--dash-url URL]
```

| Flag | Description |
|---|---|
| `--query`, `-q` | GraphQL query string |
| `--mutation`, `-m` | GraphQL mutation string |
| `--jq PATH` | Extract a value by dot-path (e.g. `.me.username`) |

Notes:
- Single quotes are auto-converted to double quotes.
- `--jq` paths skip the top-level `data` key — use `.me.username`, not `.data.me.username`.
- Bare bodies are auto-wrapped: `me { username }` becomes `{ me { username } }`.

```bash
ml-dash api --query "me { username }" --jq ".me.username"
```
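To see what the notes above mean in practice, the following sketch mimics the three conveniences in plain Python. It is not the CLI's code: `normalize_query` and `extract` are hypothetical helpers, the rule used to detect a bare body is an assumption, and the `project(slug: ...)` field is invented purely to show quote conversion.

```python
def normalize_query(body: str) -> str:
    """Convert single quotes to double quotes and wrap bare bodies in braces."""
    body = body.strip().replace("'", '"')
    # Assumption: anything not starting with "{", "query", or "mutation" is bare.
    if not body.startswith(("{", "query", "mutation")):
        body = "{ " + body + " }"
    return body


def extract(response: dict, path: str):
    """Follow a dot-path such as '.me.username', starting below 'data'."""
    node = response["data"]
    for key in path.strip(".").split("."):
        if key:
            node = node[key]
    return node


print(normalize_query("me { username }"))
print(normalize_query("project(slug: 'demo') { id }"))  # hypothetical field
print(extract({"data": {"me": {"username": "alice"}}}, ".me.username"))
```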

---

## `ml-dash create`

Create a new project. If the project already exists, the command exits successfully with a warning.

```bash
ml-dash create -p PROJECT [-d DESCRIPTION] [--dash-url URL]
```

| Flag | Description |
|---|---|
| `-p`, `--project` | `project` or `namespace/project` (required). Namespace auto-resolves from the authenticated user if omitted. |
| `-d`, `--description` | Project description |

```bash
ml-dash create -p tom/my-project -d "Baseline experiments"
```

---

## `ml-dash list`

List projects, experiments, or tracks with server-side pagination (50 per page; navigate with `n`/`→`/`Space` for next, `p`/`←` for previous, any other key to quit).

```bash
ml-dash list [-p PROJECT] [-n NAMESPACE] [--status STATUS] [--tags TAGS]
             [--detailed] [--tracks] [--topic-filter TOPIC]
             [--dash-url URL] [-v]
```

| Flag | Description |
|---|---|
| `-p`, `--project` | Project filter. Supports glob patterns — **always quote them** to prevent shell expansion |
| `-n`, `--namespace` | Namespace (defaults to authenticated user) |
| `--status` | Filter by `COMPLETED`, `RUNNING`, `FAILED`, or `ARCHIVED` |
| `--tags` | Comma-separated tag filter |
| `--detailed` | Show tags and created time |
| `--tracks` | List tracks inside an experiment (requires full `namespace/project/experiment` path) |
| `--topic-filter` | Filter tracks by topic pattern (e.g. `robot/*`) |
| `-v`, `--verbose` | Show full error tracebacks |

Glob patterns expand against the current namespace: `tes*` becomes `{namespace}/tes*/*`; `tom/tes*` becomes `tom/tes*/*`; fully-qualified patterns are left unchanged.

```bash
# List your projects
ml-dash list

# Wildcard search (quote the pattern)
ml-dash list -p 'tom/tes*'

# Tracks inside an experiment, filtered by topic
ml-dash list --tracks -p tom/test/my-experiment --topic-filter "robot/*"
```
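The glob expansion rules for `-p` can be written out as a small function. This is a sketch under the assumption that "fully qualified" means three or more segments and that the input already contains a wildcard; `expand_project_pattern` is a hypothetical name, not something the CLI exposes.

```python
def expand_project_pattern(pattern: str, namespace: str) -> str:
    """Expand a project glob the way the rules above describe."""
    segments = pattern.strip("/").split("/")
    if len(segments) == 1:
        # "tes*" -> "{namespace}/tes*/*"
        return f"{namespace}/{segments[0]}/*"
    if len(segments) == 2:
        # "tom/tes*" -> "tom/tes*/*"
        return f"{segments[0]}/{segments[1]}/*"
    # Three or more segments: treated as fully qualified and left unchanged.
    return "/".join(segments)


print(expand_project_pattern("tes*", namespace="tom"))
print(expand_project_pattern("tom/tes*", namespace="alice"))
print(expand_project_pattern("tom/test/my-*", namespace="alice"))
```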

See [Experiments](/experiments.md) and [Metrics](/metrics.md) for working with the data these commands surface.

---

## CI / scripting

Store the token in `~/.dash/config.json` to skip `login`:

```json
{
  "remote_url": "https://api.dash.ml",
  "api_key": "your-jwt-token"
}
```
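In a CI job this file is usually generated from a secret rather than committed. The sketch below writes the two documented keys and restricts the file to its owner. `write_config` is a hypothetical helper, and the environment variable name mentioned in the comment is an assumption; the demo writes to a temporary directory so it is safe to run.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config(path: Path, remote_url: str, token: str) -> None:
    """Write the documented keys to `path`, readable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"remote_url": remote_url, "api_key": token}
    path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    os.chmod(path, 0o600)  # has limited effect on Windows


# Demo against a temporary directory. In CI the target would be
# Path.home() / ".dash" / "config.json" and the token would come from a
# secret, e.g. os.environ["DASH_TOKEN"] (a hypothetical variable name).
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".dash" / "config.json"
    write_config(target, "https://api.dash.ml", "your-jwt-token")
    loaded = json.loads(target.read_text(encoding="utf-8"))

print(loaded)
```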

If `ml-dash` isn't on PATH, run `python -m ml_dash.cli` or `uv run ml-dash`.

---

Source: https://docs.dash.ml/api-reference

# ML-Dash API Reference

Complete API reference for the ML-Dash Python SDK. For tutorials and workflow examples, see [/parameters](/parameters.md), [/metrics](/metrics.md), [/files](/files.md), and [/complete-examples](/complete-examples.md).

## Table of Contents

- [Experiment](#experiment)
  - [Constructor](#constructor)
  - [Properties](#properties)
  - [RunManager (`exp.run`)](#runmanager)
- [Parameters (`exp.params`)](#parameters)
- [Logging (`exp.log`, `exp.logs`)](#logging)
- [Metrics (`exp.metrics`)](#metrics)
- [Files (`exp.files`)](#files)
- [Auto-Start (`dxp`)](#auto-start)

---

## Experiment

The `Experiment` class is the main entry point for ML-Dash. It represents a single machine learning experiment run.

```python
from ml_dash import Experiment
```

### Constructor

```python
Experiment(
    prefix: Optional[str] = None,
    *,
    readme: Optional[str] = None,
    tags: Optional[List[str]] = None,
    bindrs: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    dash_url: Optional[Union[str, bool]] = None,
    dash_root: Optional[str] = ".dash",
)
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prefix` | `str` | `None` | Universal key in format `owner/project/path.../name`; read from the `DASH_PREFIX` environment variable when omitted |
| `readme` | `str` | `None` | Human-readable experiment readme/description |
| `tags` | `List[str]` | `None` | Tags for categorization and search |
| `bindrs` | `List[str]` | `None` | Binders for advanced organization |
| `metadata` | `Dict[str, Any]` | `None` | Additional structured metadata |
| `dash_url` | `str \| bool` | `None` | Remote API URL. `None` = local-only (no remote); `True` = use default remote (`https://api.dash.ml`); string = custom URL. Token auto-loaded from `~/.dash/token.enc` |
| `dash_root` | `str` | `".dash"` | Local storage root path. Set to `None` for remote-only mode |

**Prefix format:** `owner/project/path.../name` — first segment is owner, second is project, remaining segments form the folder path, last segment becomes the experiment name.

**Example:**

```python
# Local mode
exp = Experiment(prefix="alice/my-project/exp-001", dash_root=".dash")

# Remote mode
exp = Experiment(
    prefix="alice/my-project/exp-001",
    dash_url="https://custom-server.com",
)
```

### Properties

| Property | Type | Description |
|----------|------|-------------|
| `experiment.name` | `str` | Experiment name |
| `experiment.project` | `str` | Project name |
| `experiment.readme` | `str` | Experiment readme/description |
| `experiment.tags` | `List[str]` | Experiment tags |
| `experiment.bindrs` | `List[str]` | Experiment bindrs |
| `experiment.folder` | `str` | Folder path |
| `experiment.id` | `str` | Experiment ID (remote mode only, after start) |
| `experiment.data` | `dict` | Full experiment data (remote mode only, after start) |

### RunManager

`exp.run` returns a `RunManager` supporting three usage patterns: context manager, decorator, or manual control.

#### Context Manager

```python
with Experiment(prefix="alice/proj/exp").run as exp:
    exp.log("Training started")
    # Auto-completes on success, auto-fails on exception
```

#### Decorator

```python
exp = Experiment(prefix="alice/proj/exp")

@exp.run
def train(experiment):
    experiment.log("Training...")
    return "done"

result = train()
```

#### Manual Control

```python
exp.run.start()
try:
    exp.log("Training...")
    exp.run.complete()
except Exception:
    exp.run.fail()
    raise  # re-raise so the failure is not silently swallowed
```

#### Methods

| Method | Description | Status Set |
|--------|-------------|------------|
| `run.start()` | Start the experiment | `RUNNING` |
| `run.complete()` | Mark as successfully completed | `COMPLETED` |
| `run.fail()` | Mark as failed | `FAILED` |
| `run.cancel()` | Mark as cancelled | `CANCELLED` |

---

## Parameters

Access via the `exp.params` property. See [/parameters](/parameters.md) for usage patterns.

### `params.set(**kwargs)`

Set or merge experiment parameters. Supports nested dicts (auto-flattened to dot notation for storage).

**Returns:** `ParametersBuilder`

```python
exp.params.set(learning_rate=0.001, batch_size=32)
exp.params.set(model={"architecture": "resnet50", "layers": 50})
# Stored as: {"learning_rate": 0.001, "batch_size": 32,
#             "model.architecture": "resnet50", "model.layers": 50}
```

### `params.get(flatten=True)`

Retrieve parameters.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `flatten` | `bool` | `True` | If `True`, return dot-notation flat dict; if `False`, return hierarchical dict |

**Returns:** `dict`

```python
flat = exp.params.get()
# {"learning_rate": 0.001, "model.architecture": "resnet50"}

nested = exp.params.get(flatten=False)
# {"learning_rate": 0.001, "model": {"architecture": "resnet50"}}
```

---

## Logging

Log messages with severity levels and optional metadata. See [/logging](/logging.md) for patterns.

### `exp.log(message, level="info", metadata=None, **kwargs)`

Log a message.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `message` | `str` | required | Log message |
| `level` | `str` | `"info"` | Log level: `debug`, `info`, `warn`, `error`, `fatal` |
| `metadata` | `dict` | `None` | Structured metadata |
| `**kwargs` | any | — | Metadata as keyword arguments |

```python
exp.log("Training started")
exp.log("Warning: Low GPU memory", level="warn")
exp.log("Epoch completed", level="info", epoch=1, loss=0.5)
```

### Fluent API: `exp.logs`

Convenience methods for each level.

| Method | Equivalent |
|--------|------------|
| `exp.logs.debug(msg, **meta)` | `exp.log(msg, level="debug", **meta)` |
| `exp.logs.info(msg, **meta)` | `exp.log(msg, level="info", **meta)` |
| `exp.logs.warn(msg, **meta)` | `exp.log(msg, level="warn", **meta)` |
| `exp.logs.error(msg, **meta)` | `exp.log(msg, level="error", **meta)` |
| `exp.logs.fatal(msg, **meta)` | `exp.log(msg, level="fatal", **meta)` |

```python
exp.logs.info("Training started", epoch=1)
exp.logs.error("Failed to load data", error_code=500)
```

### Log Levels

| Level | Description |
|-------|-------------|
| `debug` | Detailed diagnostic information |
| `info` | General informational messages (default) |
| `warn` | Warning messages for potential issues |
| `error` | Error messages for failures |
| `fatal` | Fatal errors causing termination |

---

## Metrics

Track time-series metrics. Access via `exp.metrics`. See [/metrics](/metrics.md) for patterns.

### `exp.metrics(prefix)`

Create/get a `MetricBuilder` for the given namespace prefix.

**Parameters:** `prefix: str`

**Returns:** `MetricBuilder`

```python
exp.metrics("train").log(loss=0.5, accuracy=0.85)
```

### `metrics.log(**data)`

Log a metric data point. When called on a prefixed builder, the prefix is applied. When called on the top-level `metrics`, supports grouped fields like `train=dict(...)`, `eval=dict(...)`.

**Returns:** `MetricBuilder`

```python
exp.metrics("train").log(loss=0.5, accuracy=0.85)
exp.metrics.log(
    epoch=epoch,
    train=dict(loss=train_loss, accuracy=train_acc),
    eval=dict(loss=val_loss, accuracy=val_acc),
)
```

### `metrics("prefix").buffer(**data)`

Buffer per-batch values in memory (not yet written). Use with `log_summary()` to aggregate.

**Returns:** `None`

```python
for batch in dataloader:
    loss = train_step(batch)
    exp.metrics("train").buffer(loss=loss)
```

### `metrics.buffer.log_summary(*aggregations)`

Compute and log summary statistics from buffered values.

**Parameters:** `*aggregations: str` — names of aggregations. Defaults to `"mean"`. Supported: `mean`, `std`, `min`, `max`, `count`, percentile codes like `p50`, `p90`, `p95`, `p99`.

**Returns:** `None`

```python
exp.metrics.buffer.log_summary()                          # mean only
exp.metrics.buffer.log_summary("mean", "std", "min", "max", "count")
exp.metrics.buffer.log_summary("p50", "p90", "p95", "p99")
```
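
To make the aggregation names concrete, here is a minimal, self-contained sketch of what each one could compute over a list of buffered values. It does not call ML-Dash: `summarize` and `percentile` are hypothetical helpers, and the population standard deviation and linear-interpolation percentiles are assumptions, since the library's exact conventions are not stated here.

```python
import math
import statistics


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100] (an assumed method)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def summarize(values, *aggregations):
    """Reduce buffered values to the requested statistics; defaults to mean."""
    if not values:
        raise ValueError("nothing buffered")
    names = aggregations or ("mean",)
    reducers = {
        "mean": statistics.fmean,
        "std": statistics.pstdev,  # population std: an assumption
        "min": min,
        "max": max,
        "count": len,
    }
    summary = {}
    for name in names:
        if name in reducers:
            summary[name] = reducers[name](values)
        elif name.startswith("p") and name[1:].isdigit():
            summary[name] = percentile(values, int(name[1:]))
        else:
            raise ValueError(f"unknown aggregation: {name}")
    return summary


losses = [0.9, 0.7, 0.6, 0.4]
print(summarize(losses))                                  # mean only
print(summarize(losses, "min", "max", "count", "p50"))
```

Reducing a whole epoch of per-batch values to a handful of statistics is what keeps the number of logged points small.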

### `metrics.flush()`

Flush pending metric writes to disk/remote.

**Returns:** `None`

```python
exp.metrics.log(epoch=epoch).flush()
```

### `metrics("prefix").read(start_index=0, limit=100)`

Read logged data points for the given metric.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `start_index` | `int` | `0` | First index to read |
| `limit` | `int` | `100` | Maximum points to return |

**Returns:** `dict` — `{"data": [...], "startIndex", "endIndex", "total", "hasMore"}`

```python
data = exp.metrics("train_loss").read(start_index=0, limit=100)
```
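
Because `read()` returns at most `limit` points plus a `hasMore` flag, reading a long metric means paging. The sketch below shows one way to do that. `read_all` and `make_fake_reader` are hypothetical helpers, and the fake reader only imitates the documented return shape so the example runs without a server; the `index` key on each point is an assumption.

```python
def read_all(read_page, page_size=100):
    """Collect every point by calling read_page until hasMore is false."""
    points = []
    start = 0
    while True:
        page = read_page(start_index=start, limit=page_size)
        points.extend(page["data"])
        if not page["hasMore"] or not page["data"]:
            break
        start += len(page["data"])       # advance by what was actually returned
    return points


def make_fake_reader(total):
    """Hypothetical stand-in for exp.metrics("train").read, for illustration."""
    stored = [{"index": i, "data": {"loss": 1.0 / (i + 1)}} for i in range(total)]

    def read_page(start_index=0, limit=100):
        chunk = stored[start_index:start_index + limit]
        end = start_index + len(chunk)
        return {
            "data": chunk,
            "startIndex": start_index,
            "endIndex": end,
            "total": total,
            "hasMore": end < total,
        }

    return read_page


all_points = read_all(make_fake_reader(250))
print(len(all_points))  # 250
```

With a real experiment you would pass the bound method instead, for example `read_all(exp.metrics("train").read)`. Advancing by the length of the returned page avoids relying on whether `endIndex` is inclusive.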

### `metrics("prefix").stats()`

Get summary statistics for a metric.

**Returns:** `dict` — `{"count", "firstValue", "lastValue", ...}`

```python
stats = exp.metrics("train_loss").stats()
```

### Method Summary

| Method | Returns | Description |
|--------|---------|-------------|
| `metrics(prefix)` | `MetricBuilder` | Get builder with prefix |
| `metrics.log(**data)` | `MetricBuilder` | Log data point |
| `metrics("p").log(**data)` | `MetricBuilder` | Log with prefix |
| `metrics("p").buffer(**data)` | `None` | Buffer for summary |
| `metrics.buffer.log_summary(*aggs)` | `None` | Log aggregated summary |
| `metrics.flush()` | `None` | Flush pending writes |
| `metrics("p").read(start_index, limit)` | `dict` | Read data points |
| `metrics("p").stats()` | `dict` | Get statistics |

---

## Files

Upload, download, and manage experiment files. Access via `exp.files`. See [/files](/files.md) for patterns.

### `exp.files(**kwargs)`

Create a `FileBuilder`.

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `path` | `str` | Logical path/prefix (positional, e.g. `"models"`, `"configs"`) |
| `description` | `str` | File description |
| `tags` | `List[str]` | File tags |
| `bindrs` | `List[str]` | File bindrs |
| `metadata` | `dict` | File metadata |
| `file_id` | `str` | File ID (for download/update/delete) |
| `dest_path` | `str` | Destination path (for download) |

```python
builder = exp.files("models")
builder = exp.files(file_id="abc123")
```

### `save(local_path, **kwargs)`

Upload a file from local disk.

**Returns:** `dict`

```python
exp.files("models").save("./model.pt")
exp.files("models").save(
    "./model.pt",
    description="Best checkpoint",
    tags=["best"],
    metadata={"epoch": 50},
    bindrs=["v1", "production"],
)
```

### `save_json(content, to)`

Serialize a Python object as JSON and upload.

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `content` | `Any` | JSON-serializable object |
| `to` | `str` | Target filename |

**Returns:** `dict`

```python
exp.files("configs").save_json({"lr": 0.001}, to="config.json")
```

### `save_torch(model, to)`

Save a PyTorch model (or state dict) via `torch.save`.

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `Any` | `nn.Module` or state dict |
| `to` | `str` | Target filename |

**Returns:** `dict`

```python
exp.files("models").save_torch(model, to="model.pt")
exp.files("models").save_torch(model.state_dict(), to="model.pth")
```

### `save_pkl(content, to)`

Serialize a Python object via `pickle` and upload.

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `content` | `Any` | Picklable object |
| `to` | `str` | Target filename |

**Returns:** `dict`

```python
exp.files("data").save_pkl({"results": [1, 2, 3]}, to="data.pkl")
```

### `save_fig(fig=None, to, **kwargs)`

Save a matplotlib figure.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `fig` | `Optional[Any]` | `None` | Figure (uses current via `plt.gcf()` if `None`) |
| `to` | `str` | required | Target filename |
| `**kwargs` | any | — | Passed to `fig.savefig` (e.g. `dpi`, `bbox_inches`) |

**Returns:** `dict`

```python
exp.files("plots").save_fig(to="plot.png")
exp.files("plots").save_fig(fig=fig, to="plot.pdf", dpi=150, bbox_inches="tight")
```

### `save_video(frames, to, fps=20, **kwargs)`

Encode a list/array of frames as MP4 or GIF.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `frames` | `Union[List, Any]` | required | Frames (grayscale or RGB ndarray, or list of frames) |
| `to` | `str` | required | Target filename (`.mp4`, `.gif`) |
| `fps` | `int` | `20` | Frames per second |
| `codec` | `str` | — | Codec name (e.g. `"libx264"`) |
| `quality` | `int` | — | Quality setting |

**Returns:** `dict`

```python
exp.files("videos").save_video(frames, to="output.mp4", fps=30)
exp.files("videos").save_video(frames, to="hq.mp4", codec="libx264", quality=8)
```

### `list()`

List files for the current builder filters.

**Returns:** `List[dict]`

```python
exp.files().list()                       # all files
exp.files("models").list()               # by prefix
exp.files(tags=["checkpoint"]).list()    # by tags
```

### `download(dest_path=None)`

Download a file (requires `file_id`).

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `dest_path` | `Optional[str]` | `None` | Target local path; if `None`, uses original filename |

**Returns:** `str` — local path of the downloaded file

```python
path = exp.files(file_id="123").download()
path = exp.files(file_id="123").download(dest_path="./model.pt")
```

### `update()`

Update file metadata (requires `file_id`).

**Returns:** `dict`

```python
exp.files(
    file_id="123",
    description="Updated",
    tags=["new"],
    metadata={"version": "2.0"},
).update()
```

### `delete()`

Soft-delete a file (requires `file_id`).

**Returns:** `dict`

```python
exp.files(file_id="123").delete()
```

### Method Summary

| Method | Parameters | Returns | Description |
|--------|------------|---------|-------------|
| `files(**kwargs)` | see above | `FileBuilder` | Create builder |
| `save(local_path, **kwargs)` | `str`, kwargs | `dict` | Upload file |
| `save_json(content, to)` | `Any`, `str` | `dict` | Save JSON |
| `save_torch(model, to)` | `Any`, `str` | `dict` | Save PyTorch model |
| `save_pkl(content, to)` | `Any`, `str` | `dict` | Save pickle |
| `save_fig(fig, to, **kwargs)` | `Optional[Any]`, `str`, kwargs | `dict` | Save matplotlib figure |
| `save_video(frames, to, fps, **kwargs)` | `Union[List, Any]`, `str`, `int`, kwargs | `dict` | Save video |
| `list()` | — | `List[dict]` | List files |
| `download(dest_path)` | `Optional[str]` | `str` | Download |
| `update()` | — | `dict` | Update metadata |
| `delete()` | — | `dict` | Soft delete |

---

## Auto-Start

The `ml_dash.auto_start` module exposes `dxp`, a pre-configured, auto-started `Experiment` singleton for quick prototyping and notebooks.

```python
from ml_dash.auto_start import dxp
```

### Behavior

- **Pre-configured:** name `"dxp"`, project `"scratch"`, local storage at `.dash`
- **Auto-started:** ready to use on import; no `.run.start()` needed
- **Auto-completed:** closed on Python exit via `atexit`
- **Full API:** supports all `Experiment` methods

### Fixed Configuration

| Property | Value | Mutable |
|----------|-------|---------|
| Name | `"dxp"` | No |
| Project | `"scratch"` | No |
| Storage Mode | Local (`.dash`) | No |
| Local Path | `".dash"` | No |
| Parameters | empty initially | Yes |
| Tags | empty | No |
| Readme | `None` | No |

### Usage

```python
from ml_dash.auto_start import dxp

dxp.params.set(lr=0.001, batch_size=32)
dxp.metrics("train").log(loss=0.5, step=0)
dxp.files("models").save("model.pt")
# Auto-completed on Python exit
```

### Manual Lifecycle

The standard `run` methods still work:

```python
dxp.run.complete()   # close manually
dxp.run.start()      # reopen
dxp.run.fail()       # mark failed
```

### Regular Experiment vs. `dxp`

| Feature | `Experiment` | `dxp` |
|---------|--------------|-------|
| Import | `from ml_dash import Experiment` | `from ml_dash.auto_start import dxp` |
| Configuration | User-defined | Fixed |
| Lifecycle | Manual / context manager / decorator | Auto-started, auto-completed |
| Storage | Local or remote | Local only |
| Instances | Many | Single global |
| Use case | Production / multi-run | Prototyping / notebooks |

---

Source: https://docs.dash.ml/buffering

# Background Buffering

ML-Dash writes are non-blocking. Logs, [metrics](/metrics.md), [tracks](/tracks.md), and [files](/files.md) are queued and flushed from a background daemon thread so the hot path stays fast.

## Flush Triggers

Buffered data flushes when any of these occur:

1. **Time-based**: every `flush_interval` seconds (default `5.0`).
2. **Size-based**: when a queue reaches its batch size (default `100` items per queue).
3. **Manual**: `experiment.flush()` blocks until queues drain.
4. **Context exit**: leaving the `Experiment(...).run` context waits for a full drain (no timeout).
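
The interaction between these triggers can be sketched with a toy buffer. This is an illustration only, not ML-Dash's implementation: `ToyBuffer` is a hypothetical class, it checks the interval only when an item is added (the real library uses a background thread), and the clock is injected so the demo is deterministic.

```python
import time


class ToyBuffer:
    """Illustrative queue that flushes on size, elapsed time, or request."""

    def __init__(self, sink, batch_size=100, flush_interval=5.0, clock=time.monotonic):
        self.sink = sink                  # callable that receives a list of items
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.clock = clock
        self.items = []
        self.last_flush = clock()

    def add(self, item):
        self.items.append(item)
        size_hit = len(self.items) >= self.batch_size
        time_hit = self.clock() - self.last_flush >= self.flush_interval
        if size_hit or time_hit:
            self.flush()

    def flush(self):
        """Manual trigger; a context exit would call this as well."""
        if self.items:
            self.sink(self.items)
            self.items = []
        self.last_flush = self.clock()


batches = []
now = [0.0]                               # fake clock, advanced by hand
buf = ToyBuffer(batches.append, batch_size=3, flush_interval=5.0, clock=lambda: now[0])

for i in range(4):
    buf.add(i)                            # the third add hits the size trigger
now[0] = 6.0
buf.add(4)                                # interval elapsed, so this add flushes
buf.add(5)
buf.flush()                               # manual drain of the remainder
print(batches)  # [[0, 1, 2], [3, 4], [5]]
```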

## Forcing a Flush

Call `flush()` before any action that depends on uploads being durable (checkpoint markers, downstream readers, etc.):

```python
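import torch
from ml_dash import Experiment

# `model` and `loss` come from your own training code.
with Experiment("my-project/exp").run as experiment:
    experiment.metrics("train").log(loss=loss)
    experiment.flush()  # block until queued writes have drained
    torch.save(model, "checkpoint.pt")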
```

## Configuration

### Environment Variables

```bash
export ML_DASH_BUFFER_ENABLED=true       # default: true
export ML_DASH_FLUSH_INTERVAL=5.0        # seconds
export ML_DASH_LOG_BATCH_SIZE=100
export ML_DASH_METRIC_BATCH_SIZE=100
export ML_DASH_TRACK_BATCH_SIZE=100
export ML_DASH_FILE_UPLOAD_WORKERS=4
```

### Programmatic

```python
from ml_dash import Experiment
from ml_dash.buffer import BufferConfig

config = BufferConfig(
    flush_interval=10.0,
    log_batch_size=200,
    metric_batch_size=500,
    file_upload_workers=8,
)

with Experiment("my-project/exp", buffer_config=config).run as exp:
    exp.log("custom buffer config")
```

Pass `BufferConfig(buffer_enabled=False)` to make every write synchronous. Useful only for debugging.

## Context Exit

On `__exit__`, the buffer manager drains all queues before returning. Expect a short pause and console output like:

```
[ML-Dash] Flushing buffered data...
[ML-Dash]   - 1000 log(s), 100 metric(s), 50 track(s), 10 file(s)
[ML-Dash]   Uploading 10 file(s)...
[ML-Dash] All data flushed successfully
```

Upload failures are logged as warnings, not raised, so a flaky network won't crash training. If the warnings report authentication errors, re-run `ml-dash login`.

## File Uploads

When you call `save_image`, `save_json`, etc., content is written to a temp file, queued, uploaded by one of `file_upload_workers` threads, then the temp file is removed. Cleanup runs even if the upload is delayed.
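
The write, queue, upload, and cleanup sequence can be sketched with the standard library. This is a rough model under stated assumptions, not ML-Dash's code: `fake_upload`, `upload_then_cleanup`, and `queue_json` are hypothetical names, and the "upload" only measures the file size.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fake_upload(path):
    """Hypothetical uploader: just reports the file size."""
    return os.path.getsize(path)


def upload_then_cleanup(path, upload):
    """Run the upload and remove the temp file whether or not it succeeds."""
    try:
        return upload(path)
    finally:
        os.remove(path)


def queue_json(pool, content, upload=fake_upload):
    """Write content to a temp file and hand it to a worker thread."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as handle:
        json.dump(content, handle)
        path = handle.name
    return path, pool.submit(upload_then_cleanup, path, upload)


with ThreadPoolExecutor(max_workers=4) as pool:   # plays the role of file_upload_workers
    pending = [queue_json(pool, {"step": i}) for i in range(10)]
    sizes = [future.result() for _, future in pending]

leftover = [path for path, _ in pending if os.path.exists(path)]
print(len(sizes), leftover)  # 10 []
```

Putting the removal in a `finally` block is what makes cleanup independent of when, or whether, the upload finishes.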

## Thread Safety

Queues are thread-safe; logging from multiple worker threads against the same `Experiment` is supported.
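
As a standard-library analogy for what "thread-safe queue" means here, the sketch below has four threads push into one shared `queue.Queue` without any extra locking. The queue stands in for one of the experiment's internal queues; the names are illustrative and no ML-Dash call is involved.

```python
import queue
import threading

shared = queue.Queue()          # stand-in for one internal write queue


def worker(worker_id, n):
    """Each worker enqueues n entries tagged with its own id."""
    for step in range(n):
        shared.put({"worker": worker_id, "step": step})


threads = [threading.Thread(target=worker, args=(i, 100)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(shared.qsize())  # 400
```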

For end-to-end examples, see [Complete Examples](/complete-examples.md).

---

Source: https://docs.dash.ml/tracks

# Track API

Tracks are timestamp-indexed, multi-modal streams: robot poses, sensor readings, per-step state. Each entry carries a float timestamp and an arbitrary dict payload, and entries that share a timestamp on the same topic merge into one row.

## Tracks vs. Metrics

Use a **track** when entries are timestamp-indexed and the schema may vary across topics (poses, cameras, lidar, RL transitions). Use a **metric** when you have step-indexed scalars for plotting (loss, accuracy). See [/metrics](/metrics.md).

## Basic Usage

```python
from ml_dash import Experiment

with Experiment("robotics/training").run as exp:
    for step in range(1000):
        t = step / 30.0  # 30 Hz simulator clock
        exp.tracks("robot/pose").append(
            q=[0.1, -0.22, 0.45],
            e=[0.5, 0.0, 0.6],
            _ts=t,
        )
```

`exp.tracks(topic)` returns a `TrackBuilder` bound to a topic path (e.g. `"robot/pose"`). `append(**fields, _ts=...)` writes one entry.

## Timestamps

`_ts` is **required** on every `append` call and must be numeric (cast to `float` internally). There is no auto-generated or inherited timestamp — omitting `_ts` raises `ValueError`. Pick a consistent clock per experiment (simulator time, wall clock, sensor timestamp).

Two `append` calls to the same topic at the same `_ts` merge: later fields overwrite earlier ones at the same keys. This lets you split a sample across calls without duplicating rows:

```python
exp.tracks("camera/rgb").append(frame_id=0, _ts=0.0)
exp.tracks("camera/rgb").append(path="frame_0.png", _ts=0.0)
# -> one row at _ts=0.0 with {frame_id: 0, path: "frame_0.png"}
```
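
The merge rule can be modelled in memory with a dict keyed by topic and timestamp. `ToyTrackStore` below is a hypothetical illustration of the semantics described above (required `_ts`, float cast, later fields win), not the library's storage layer; returning `_ts` as a key on each row is an assumption.

```python
from collections import defaultdict


class ToyTrackStore:
    """Illustrative in-memory model of per-topic, timestamp-keyed rows."""

    def __init__(self):
        self.topics = defaultdict(dict)   # topic -> {timestamp: row dict}

    def append(self, topic, _ts=None, **fields):
        if _ts is None:
            raise ValueError("_ts is required")
        row = self.topics[topic].setdefault(float(_ts), {})
        row.update(fields)                # later fields overwrite earlier ones

    def rows(self, topic):
        return [{"_ts": ts, **row} for ts, row in sorted(self.topics[topic].items())]


store = ToyTrackStore()
store.append("camera/rgb", frame_id=0, _ts=0.0)
store.append("camera/rgb", path="frame_0.png", _ts=0.0)
store.append("camera/rgb", frame_id=1, _ts=0.5)
print(store.rows("camera/rgb"))
# [{'_ts': 0.0, 'frame_id': 0, 'path': 'frame_0.png'}, {'_ts': 0.5, 'frame_id': 1}]
```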

Different topics keep independent timestamp tables, so log multi-modal samples at the same `_ts` across topics to align them later.
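
A minimal sketch of that later alignment step, assuming each topic has been read back as a list of dicts that carry a `_ts` key (the row shape and the `align` helper are assumptions for illustration):

```python
def align(rows_a, rows_b):
    """Pair rows from two topics that carry exactly the same timestamp."""
    by_ts = {row["_ts"]: row for row in rows_b}
    return [(row, by_ts[row["_ts"]]) for row in rows_a if row["_ts"] in by_ts]


# Both topics reuse the same computed `t`, so the float keys match exactly.
poses, frames = [], []
for step in range(6):
    t = step / 30.0
    poses.append({"_ts": t, "q": [0.1 * step, 0.0, 0.0]})
    if step % 2 == 0:                     # camera runs at half the pose rate
        frames.append({"_ts": t, "path": f"frame_{step:05d}.png"})

pairs = align(poses, frames)
print(len(pairs))  # 3
```

Exact matching only works when both topics log the identical float value. If the clocks differ slightly, a nearest-timestamp join with a tolerance is the usual alternative.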

## Flexible Schema

The data dict is free-form per call — different fields per entry are allowed. The backend reconciles columns at read time.
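
One plausible reading of "reconciles columns" is a union of all keys with gaps filled in. The sketch below shows that idea on plain dicts. `reconcile` is a hypothetical helper, and using `None` as the fill value is an assumption rather than documented backend behavior.

```python
def reconcile(rows, fill=None):
    """Give every row the union of all keys, filling gaps with `fill`."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)       # keep first-seen column order
    return columns, [{col: row.get(col, fill) for col in columns} for row in rows]


rows = [
    {"_ts": 0.0, "q": [0.1, 0.2]},
    {"_ts": 0.1, "q": [0.1, 0.3], "gripper_force": 0.5},
    {"_ts": 0.2, "contact": True},
]
columns, table = reconcile(rows)
print(columns)   # ['_ts', 'q', 'gripper_force', 'contact']
print(table[0])  # {'_ts': 0.0, 'q': [0.1, 0.2], 'gripper_force': None, 'contact': None}
```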

## Reading

```python
TrackBuilder.read(
    start_timestamp: float | None = None,
    end_timestamp: float | None = None,
    columns: list[str] | None = None,
    format: str = "json",  # "json" | "jsonl" | "parquet" | "mocap"
)
```

`read()` returns the topic's entries (optionally filtered by timestamp range and projected to selected columns). Flush before reading in the same process.

```python
exp.tracks.flush()

data    = exp.tracks("robot/pose").read()
window  = exp.tracks("robot/pose").read(start_timestamp=0.0, end_timestamp=10.0)
parquet = exp.tracks("robot/pose").read(format="parquet")
```

## Flushing

```python
exp.tracks.flush()                  # flush all topics
exp.tracks("robot/pose").flush()    # flush one topic
```

Appends are non-blocking and batched by the background uploader. See [/buffering](/buffering.md) for batch size and flush interval configuration.

## Aligning with Frames

To pair a track entry with an image, log the filename alongside the data and use a consistent zero-padded index. See [/images](/images.md).

```python
for step in range(1000):
    t = step / 30.0
    fname = f"frame_{step:05d}.jpg"
    exp.files("frames").save_image(frame, to=fname)
    exp.tracks("robot/pose").append(frame=fname, q=q_step, _ts=t)
```
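
The reason for zero-padding is that filenames then sort in step order. A quick, library-free check:

```python
steps = (0, 2, 10, 100)
padded = [f"frame_{step:05d}.jpg" for step in steps]
unpadded = [f"frame_{step}.jpg" for step in steps]

print(sorted(padded) == padded)   # True: lexicographic order matches step order
print(sorted(unpadded))
# ['frame_0.jpg', 'frame_10.jpg', 'frame_100.jpg', 'frame_2.jpg']
```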


---

Source: https://docs.dash.ml/images

# Image Saving

`save_image()` writes a numpy array directly to PNG or JPEG — no manual conversion needed. Useful for MuJoCo/PyBullet renders, RL observations, model predictions, or any HxW / HxWxC array.

Requires Pillow: `pip install Pillow`.

## Basic Usage

```python
import numpy as np
from ml_dash import Experiment

with Experiment("vision/training").run as experiment:
    pixels = renderer.render()  # numpy array from MuJoCo, OpenCV, etc.

    experiment.files("frames").save_image(pixels, to="frame_001.png")  # lossless
    experiment.files("frames").save_image(pixels, to="frame_001.jpg")  # smaller
```

`save()` auto-detects numpy arrays and dispatches to `save_image()`, so these are equivalent:

```python
experiment.files("images").save(pixels, to="frame.png")
experiment.files("images").save_image(pixels, to="frame.png")
```

Works with any numpy array: OpenCV frames (convert first with `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)`, since OpenCV stores channels in BGR order), arrays built from a PIL image via `np.array(image)`, model outputs, etc.

## Array Types

- **uint8** — passed through directly. Shape `HxW` (grayscale), `HxWx3` (RGB), or `HxWx4` (RGBA).
- **float in `[0.0, 1.0]`** — multiplied by 255 and cast to uint8.
- **float in any other range** — normalized via `(value - min) / (max - min) * 255`.

```python
experiment.files("images").save_image(np.random.rand(480, 640, 3), to="norm.png")
experiment.files("images").save_image(np.random.rand(480, 640) * 1000, to="scaled.png")
```
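
The float conversion rules can be written out per pixel. The sketch below uses plain Python on a flat list of values so it runs without numpy. `to_uint8` is a hypothetical helper, and both the truncating cast and the handling of a constant input are assumptions, since the rules above do not spell them out.

```python
def to_uint8(values):
    """Map a flat list of float pixel values to integers in 0..255."""
    lo, hi = min(values), max(values)
    if 0.0 <= lo and hi <= 1.0:
        return [int(v * 255) for v in values]          # already in [0, 1]
    if hi == lo:
        return [0 for _ in values]                     # constant input: an assumption
    return [int((v - lo) / (hi - lo) * 255) for v in values]


print(to_uint8([0.0, 0.5, 1.0]))        # [0, 127, 255]
print(to_uint8([0.0, 500.0, 1000.0]))   # [0, 127, 255]
print(to_uint8([-1.0, 0.0, 1.0]))       # [0, 127, 255]
```

Note that min-max normalization is relative to each array, so two frames with different value ranges will not share a brightness scale. Convert to `uint8` yourself when frames must be comparable.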

## Format: PNG vs JPEG

| Aspect       | PNG               | JPEG                |
|--------------|-------------------|---------------------|
| Compression  | Lossless          | Lossy               |
| File Size    | Larger            | Smaller             |
| Transparency | Yes               | No                  |
| Quality      | Perfect           | Configurable        |
| Best For     | Graphics, text    | Photos, renders     |
| Speed        | Slower            | Faster              |

The extension picks the encoder: `.png`, `.jpg`, and `.jpeg` all work. JPEG drops the alpha channel (transparent pixels composited onto white) and applies optimization.
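
Compositing onto white means each color channel is blended with 255 in proportion to the pixel's transparency. The sketch below shows the arithmetic for a single RGBA pixel. `composite_on_white` is a hypothetical helper, and the rounding is an assumption that may differ from what the encoder does internally.

```python
def composite_on_white(pixel):
    """Blend one RGBA pixel (0..255 per channel) onto a white background."""
    r, g, b, a = pixel

    def blend(channel):
        # Weighted average of the channel and white, rounded to nearest integer.
        return (channel * a + 255 * (255 - a) + 127) // 255

    return blend(r), blend(g), blend(b)


print(composite_on_white((255, 0, 0, 255)))   # (255, 0, 0): opaque red is unchanged
print(composite_on_white((255, 0, 0, 0)))     # (255, 255, 255): fully transparent turns white
print(composite_on_white((0, 0, 0, 128)))     # (127, 127, 127): half-transparent black turns grey
```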

## Quality

JPEG only. Default is 95. Range 1–100.

```python
experiment.files("frames").save_image(pixels, to="frame.jpg", quality=85)
```

Rough guide: **95–100** is near-lossless, **85–90** is balanced (recommended for frame sequences), **70–80** shows visible compression, and **below 50** is poor.

## API Reference

### `save_image(array, *, to, quality=95)`

Save a numpy array as an image file.

**Parameters**
- `array` (`numpy.ndarray`) — image array, shape `HxW` or `HxWxC`.
- `to` (`str`) — target filename with extension (`.png`, `.jpg`, `.jpeg`).
- `quality` (`int`, optional) — JPEG quality 1–100. Default 95. Ignored for PNG.

**Returns** — dict with file metadata, or a queued-status dict when the write is buffered (see [/buffering](/buffering.md)).

**Raises** — `ImportError` if Pillow is missing; `ValueError` for invalid `array` or missing `to`.

## Example: MuJoCo Renders

```python
import mujoco
import numpy as np
from ml_dash import Experiment

with Experiment("robotics/mujoco-renders").run as experiment:
    # xml_content is assumed to hold your MJCF model as an XML string.
    model = mujoco.MjModel.from_xml_string(xml_content)
    data = mujoco.MjData(model)
    renderer = mujoco.Renderer(model, height=480, width=640)

    for i in range(1000):
        mujoco.mj_step(model, data)

        if i % 10 == 0:
            renderer.update_scene(data)
            pixels = renderer.render()  # (480, 640, 3) uint8

            experiment.files("robot/frames").save_image(
                pixels,
                to=f"frame_{i:05d}.jpg",
                quality=85,
            )
```

Image writes are buffered for non-blocking uploads; see [/buffering](/buffering.md). To align frames with numeric data, share a step index across `save_image()` and `tracks(...).append()` calls; see [/tracks](/tracks.md).

---

Source: https://docs.dash.ml/llm-readable

# LLM-Readable Docs

These docs are built to be read by agents as easily as by people. Every page
has a markdown twin, and the whole corpus is published in the formats LLM
tooling already looks for — so you can point Claude (or any agent) at ML-Dash
and have it answer accurately.

## Fetch a single page

Append `.md` to any docs URL to get the raw markdown — no nav, no chrome:

```bash
curl https://docs.dash.ml/getting-started.md
```

Every page also advertises its markdown twin in the HTML head:

```html
<link rel="alternate" type="text/markdown" href="/getting-started.md" />
```

## The whole site, two ways

- **[`/llms.txt`](https://docs.dash.ml/llms.txt)** — a short, linked index
  of every page ([llmstxt.org](https://llmstxt.org) standard). The entry point
  an agent reads first to decide what to fetch.
- **[`/llms-full.txt`](https://docs.dash.ml/llms-full.txt)** — every page
  concatenated into one markdown file. Drop the entire product into a context
  window in a single request.

## Import it as a skill

The docs are also packaged as an [Agent Skill](https://docs.dash.ml/skills/dash-docs.zip)
— a `SKILL.md` plus one markdown reference file per page. Install it so your
agent loads ML-Dash knowledge on demand:

```bash
# Claude Code: drop it into your project (or ~/.claude) skills directory
curl -L https://docs.dash.ml/skills/dash-docs.zip -o dash-docs.zip
unzip dash-docs.zip -d .claude/skills/
```

> **Note: Always current.** Every surface above (the `.md` pages, both `llms` files,
> and the skill) is generated from the same source on each deploy, so none of
> them can drift from what you read on the site.

---

Source: https://docs.dash.ml/complete-examples

# Complete Examples

End-to-end, runnable examples for distinct ML-Dash use cases. Each example is self-contained — copy, paste, and run.

For a first-time walkthrough, see [Getting Started](/getting-started.md). Replace `alice/...` prefixes with your own `owner/project` path.

## Minimal Experiment

The smallest useful ML-Dash program: open an experiment, set parameters, write a log line, and log a metric.

```python
"""Hello, ML-Dash."""
from ml_dash import Experiment

with Experiment(
    prefix="alice/tutorials/hello-ml-dash",
    readme="My first ML-Dash experiment",
    tags=["tutorial"],
).run as experiment:
    experiment.params.set(learning_rate=0.001, batch_size=32)
    experiment.log("Hello from ML-Dash!", level="info")

    for epoch in range(5):
        loss = 1.0 / (epoch + 1)
        experiment.metrics("train").log(loss=loss, epoch=epoch)
```

Data is written under `./.dash/` by default. Pass `dash_url=True` to mirror to the remote server.

## Three Usage Styles

ML-Dash supports three equivalent ways to scope an experiment. Pick the one that fits your code; all three produce identical data on disk.

```python
"""Decorator, context manager, and imperative styles."""
from ml_dash import Experiment, ml_dash_experiment

# 1. Decorator — cleanest for a training function.
@ml_dash_experiment(
    prefix="alice/usage-styles/decorator",
    readme="Decorator style",
    tags=["decorator"],
)
def train_decorated(experiment):
    experiment.params.set(learning_rate=0.001)
    for epoch in range(3):
        experiment.metrics("train").log(loss=1.0 / (epoch + 1), epoch=epoch)

# 2. Context manager — best for scripts and notebooks.
def train_context():
    with Experiment(
        prefix="alice/usage-styles/context",
        readme="Context manager style",
    ).run as experiment:
        experiment.params.set(learning_rate=0.002)
        for epoch in range(3):
            experiment.metrics("train").log(loss=0.8 / (epoch + 1), epoch=epoch)

# 3. Imperative — when the experiment must span multiple scopes.
def train_imperative():
    experiment = Experiment(
        prefix="alice/usage-styles/imperative",
        readme="Imperative style",
    )
    experiment.run.start()
    try:
        experiment.params.set(learning_rate=0.003)
        for epoch in range(3):
            experiment.metrics("train").log(loss=0.6 / (epoch + 1), epoch=epoch)
    finally:
        experiment.run.complete()

if __name__ == "__main__":
    train_decorated()
    train_context()
    train_imperative()
```

The decorator injects `experiment` as a keyword argument. The context manager auto-closes on exit (and marks `FAILED` if an exception is raised). The imperative form needs an explicit `try`/`finally` to ensure `complete()` runs.
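
The lifecycle rules in the paragraph above can be mimicked in a few lines of plain Python. This is a toy model for intuition, not ML-Dash's implementation: `ToyRun` and `toy_experiment` are hypothetical, and every status name other than `FAILED` is an assumption.

```python
import functools


class ToyRun:
    """Illustrative stand-in for an experiment run; not the ML-Dash class."""

    def __init__(self, name):
        self.name = name
        self.status = "PENDING"

    def __enter__(self):
        self.status = "RUNNING"
        return self

    def __exit__(self, exc_type, exc, tb):
        self.status = "FAILED" if exc_type else "COMPLETED"
        return False                      # never swallow the exception


def toy_experiment(name):
    """Decorator that opens a ToyRun and injects it as `experiment=`."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            with ToyRun(name) as run:
                wrapper.last_run = run    # kept only so the demo can inspect it
                return fn(*args, experiment=run, **kwargs)
        return wrapper
    return decorate


@toy_experiment("demo/ok")
def train_ok(experiment):
    return experiment.status              # "RUNNING" while inside the function


@toy_experiment("demo/bad")
def train_bad(experiment):
    raise RuntimeError("diverged")


print(train_ok(), train_ok.last_run.status)   # RUNNING COMPLETED
try:
    train_bad()
except RuntimeError:
    print(train_bad.last_run.status)          # FAILED
```

The imperative style has no `__exit__` to do this bookkeeping, which is why it needs the explicit `try`/`finally`.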

## Parameters from a Config Class

`params.set()` accepts class objects directly — their public attributes are extracted into a namespaced parameter group. This pairs naturally with `params-proto` config classes.

```python
"""Pass a config class straight into params.set()."""
from ml_dash import Experiment

class Args:
    learning_rate = 0.001
    batch_size = 64
    optimizer = "adam"
    weight_decay = 1e-4

class ModelArgs:
    architecture = "resnet50"
    pretrained = True
    num_classes = 10

with Experiment(
    prefix="alice/config-class/run-001",
    readme="Config classes as parameter groups",
    tags=["config"],
).run as experiment:
    # Class attributes are flattened into Args.learning_rate, Args.batch_size, ...
    experiment.params.set(Args=Args, Model=ModelArgs)

    for epoch in range(Args.batch_size // 16):
        experiment.metrics("train").log(epoch=epoch, loss=1.0 / (epoch + 1))
```

The same call shape works with `params_proto.PrefixProto` subclasses, so a CLI-configurable class can be logged with a single line.
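
As a rough mental model of the flattening described in this section, the sketch below collects the public attributes of each class under a `Group.attr` key. `flatten_config` is a hypothetical helper, and the dotted key format follows the comment in the example above rather than any documented storage format.

```python
def flatten_config(**groups):
    """Turn Name=Class pairs into {"Name.attr": value} for public attributes."""
    flat = {}
    for group, cls in groups.items():
        for attr, value in vars(cls).items():
            if attr.startswith("_") or callable(value):
                continue                  # skip dunders, private names, methods
            flat[f"{group}.{attr}"] = value
    return flat


class Args:
    learning_rate = 0.001
    batch_size = 64


class ModelArgs:
    architecture = "resnet50"
    pretrained = True


print(flatten_config(Args=Args, Model=ModelArgs))
# {'Args.learning_rate': 0.001, 'Args.batch_size': 64, 'Model.architecture': 'resnet50', 'Model.pretrained': True}
```

The group name comes from the keyword, not the class name, which is why `ModelArgs` appears under `Model`.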

## PyTorch Training with Checkpoints

Full MNIST training loop with parameters, metrics, structured logs, and best/final model uploads.

```python
"""PyTorch MNIST training with ML-Dash tracking."""
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from ml_dash import Experiment

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(-1, 784)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)

def train_mnist():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    batch_size, epochs, lr = 64, 5, 0.001

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,)),
    ])
    train_ds = datasets.MNIST("./data", train=True, download=True, transform=transform)
    test_ds = datasets.MNIST("./data", train=False, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test_ds, batch_size=batch_size)

    model = SimpleNet().to(device)
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    with Experiment(
        prefix="alice/computer-vision/mnist-pytorch",
        readme="MNIST classification with PyTorch",
        tags=["mnist", "pytorch"],
    ).run as experiment:
        experiment.params.set({
            "model": {"architecture": "SimpleMLP", "layers": [784, 128, 64, 10]},
            "training": {"optimizer": "adam", "learning_rate": lr,
                         "batch_size": batch_size, "epochs": epochs},
            "device": str(device),
            "dataset": "MNIST",
        })
        experiment.log(f"Training on {device}", level="info")

        best_accuracy = 0.0
        for epoch in range(epochs):
            model.train()
            train_loss, correct, total = 0.0, 0, 0
            for data, target in train_loader:
                data, target = data.to(device), target.to(device)
                optimizer.zero_grad()
                output = model(data)
                loss = criterion(output, target)
                loss.backward()
                optimizer.step()
                train_loss += loss.item()
                correct += output.argmax(dim=1).eq(target).sum().item()
                total += target.size(0)
            train_loss /= len(train_loader)
            train_acc = correct / total

            model.eval()
            val_loss, correct, total = 0.0, 0, 0
            with torch.no_grad():
                for data, target in test_loader:
                    data, target = data.to(device), target.to(device)
                    output = model(data)
                    val_loss += criterion(output, target).item()
                    correct += output.argmax(dim=1).eq(target).sum().item()
                    total += target.size(0)
            val_loss /= len(test_loader)
            val_acc = correct / total

            experiment.metrics.log(
                epoch=epoch,
                train=dict(loss=train_loss, accuracy=train_acc),
                eval=dict(loss=val_loss, accuracy=val_acc),
            )
            experiment.log(f"Epoch {epoch + 1}/{epochs}", level="info",
                           metadata={"train_loss": train_loss, "val_acc": val_acc})

            if val_acc > best_accuracy:
                best_accuracy = val_acc
                torch.save(model.state_dict(), "best_model.pth")
                experiment.files("models").save(
                    "best_model.pth",
                    description=f"Best model (accuracy: {best_accuracy:.4f})",
                    tags=["best"],
                    metadata={"epoch": epoch, "accuracy": best_accuracy},
                )

        torch.save(model.state_dict(), "final_model.pth")
        experiment.files("models").save("final_model.pth", tags=["final"])
        experiment.log("Training complete!", level="info")

if __name__ == "__main__":
    train_mnist()
```

## Hyperparameter Sweep

Grid search across configurations. Each run becomes a separate experiment for side-by-side comparison in the dashboard.

```python
"""Hyperparameter grid search."""
import random
from itertools import product
from ml_dash import Experiment

def train_with_config(lr, batch_size, experiment):
    epochs = 10
    accuracy = 0.0
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1) * (lr / 0.01) + random.uniform(-0.05, 0.05)
        accuracy = min(0.95, 0.5 + epoch * 0.05 * (32 / batch_size))
        experiment.metrics.log(epoch=epoch, train=dict(loss=loss, accuracy=accuracy))
    return accuracy

def sweep():
    results = []
    for lr, bs in product([0.1, 0.01, 0.001], [16, 32, 64]):
        run_name = f"search-lr{lr}-bs{bs}"
        with Experiment(
            prefix=f"alice/hyperparameter-search/{run_name}",
            readme=f"Grid search: lr={lr}, batch_size={bs}",
            tags=["grid-search", f"lr-{lr}", f"bs-{bs}"],
        ).run as experiment:
            experiment.params.set(learning_rate=lr, batch_size=bs,
                                  optimizer="sgd", epochs=10)
            experiment.log(f"Starting run lr={lr} bs={bs}")
            acc = train_with_config(lr, bs, experiment)
            results.append({"lr": lr, "batch_size": bs, "accuracy": acc})

    best = max(results, key=lambda r: r["accuracy"])
    print(f"Best: lr={best['lr']} bs={best['batch_size']} acc={best['accuracy']:.4f}")

if __name__ == "__main__":
    sweep()
```

## Resume an Experiment

Reopening an existing prefix upserts the experiment instead of creating a new one. Metrics and logs append rather than overwrite, and parameters merge key by key, which is useful for resuming a crashed run or extending a finished one with a second analysis pass.

```python
"""Resume an existing experiment and append more metrics."""
from ml_dash import Experiment

PREFIX = "alice/resume-demo/run-001"

# First pass: train for a few epochs.
with Experiment(prefix=PREFIX, readme="Resume demo").run as experiment:
    experiment.params.set(learning_rate=0.001, batch_size=32)
    for epoch in range(3):
        experiment.metrics("train").log(loss=1.0 / (epoch + 1), epoch=epoch)
    experiment.log("Initial run complete", level="info")

# Later (different process, same prefix): read back and continue.
with Experiment(prefix=PREFIX, readme="Resume demo - continued").run as experiment:
    prev = experiment.metrics("train").read(start_index=0, limit=1000)
    last_epoch = max(p["data"]["epoch"] for p in prev["data"])
    experiment.log(f"Resuming from epoch {last_epoch}", level="info")

    # Bump a hyperparameter and append more epochs.
    experiment.params.set(learning_rate=0.0001)
    for epoch in range(last_epoch + 1, last_epoch + 4):
        experiment.metrics("train").log(loss=0.3 / (epoch + 1), epoch=epoch)
```

`params.set()` merges into existing parameters, so the second `learning_rate` overrides the first. `metrics(...).read()` returns `data`, `total`, and `hasMore` — see [Metrics](/metrics.md).

## Project Root Pattern

For multi-experiment repos, set `RUN.project_root` once, then register each training script's `__file__` as the run entry (the example calls `RUN.__post_init__(entry=__file__)`). ML-Dash derives the prefix from the script's path relative to the project root, so no names are hardcoded.

```python
"""experiments/__init__.py — one-time setup."""
from pathlib import Path
from ml_dash import RUN

RUN.project_root = str(Path(__file__).parent)
```

```python
"""experiments/vision/resnet/train.py — auto-prefixed by file path."""
from ml_dash import RUN, Experiment
import experiments  # triggers project_root setup

# Compute prefix from this file's location relative to project_root.
# experiments/vision/resnet/train.py  ->  RUN.prefix = "vision/resnet/train"
RUN.__post_init__(entry=__file__)

with Experiment(
    prefix=f"alice/my-project/{RUN.prefix}",
    readme="Auto-prefixed from filesystem layout",
).run as experiment:
    experiment.params.set(script=RUN.entry, model="resnet50", lr=0.001)
    for epoch in range(5):
        experiment.metrics("train").log(loss=1.0 / (epoch + 1), epoch=epoch)
```

Moving `train.py` to a new directory automatically gives it a new prefix; no constants to update. The same pattern accepts a directory (e.g. one containing `sweep.jsonl`) instead of a file.
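
The path-to-prefix mapping can be pictured with plain `pathlib`. This is only a sketch of the behavior described above, not ML-Dash's implementation; `derive_prefix` is a hypothetical name, and POSIX-style paths are assumed.

```python
"""Illustrate deriving a prefix from a script path relative to a project root."""
from pathlib import PurePosixPath


def derive_prefix(entry: str, project_root: str) -> str:
    """Return entry's path relative to project_root, without the file suffix.

    Raises ValueError if entry is not located under project_root.
    """
    relative = PurePosixPath(entry).relative_to(PurePosixPath(project_root))
    return relative.with_suffix("").as_posix()


print(derive_prefix("experiments/vision/resnet/train.py", "experiments"))
# vision/resnet/train

# Moving the script changes the derived prefix without touching any constants.
print(derive_prefix("experiments/vision/resnet_v2/train.py", "experiments"))
# vision/resnet_v2/train
```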

## Robotics: Timestamped Tracks and Buffered Telemetry

Log timestamped joint state on a track while accumulating per-step values into a buffer, then flush per-epoch summary statistics. Tracks store time-aligned data with a required `_ts=` timestamp; the metric buffer reduces per-step write overhead by aggregating values before logging.

```python
"""Robotics episode: joint-state track + buffered telemetry summaries."""
import numpy as np
from ml_dash import Experiment

def run_episode():
    with Experiment(
        prefix="alice/robotics/pick-and-place-001",
        readme="Pick-and-place demo with joint telemetry",
        tags=["robot", "episode"],
    ).run as experiment:
        experiment.params.set(
            robot="ur5",
            task="pick_and_place",
            control_hz=100,
        )

        steps_per_epoch = 100
        for epoch in range(5):
            for i in range(steps_per_epoch):
                step = epoch * steps_per_epoch + i
                t = step / 100.0

                # Timestamped track entry: _ts is required (float seconds).
                experiment.tracks("robot/joints").append(
                    q=[np.sin(t), np.cos(t), np.sin(2 * t), np.cos(2 * t)],
                    gripper_force=0.5 + 0.5 * np.sin(t),
                    _ts=t,
                )

                # Accumulate scalars in the buffer instead of logging every step.
                experiment.metrics("control").buffer(
                    gripper_force=0.5 + 0.5 * np.sin(t),
                    q0_abs=abs(float(np.sin(t))),
                )

                if step == 250:
                    experiment.log("Object grasped", level="info",
                                   metadata={"step": step, "t": t})

            # Flush per-epoch summary stats: logs control/gripper_force.mean,
            # control/q0_abs.mean, plus .max for each.
            experiment.metrics.buffer.log_summary("mean", "max")
            experiment.metrics("epoch").log(epoch=epoch)

        experiment.tracks.flush()
        experiment.log("Episode complete", level="info")

if __name__ == "__main__":
    run_episode()
```

See [Tracks](/tracks.md) and [Buffering](/buffering.md) for details on the APIs used here.
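
To make the buffer-then-summarize idea concrete, here is a minimal pure-Python sketch. It is not ML-Dash's buffer; `SummaryBuffer` and its methods are hypothetical, and only the `namespace/name.stat` key naming is taken from the comment in the example above.

```python
"""Accumulate per-step scalars, then reduce them to summary statistics."""
from collections import defaultdict
from statistics import fmean


class SummaryBuffer:
    """Hypothetical stand-in for a metric buffer."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, namespace, **scalars):
        # Store each value in memory instead of writing one record per step.
        for name, value in scalars.items():
            self._values[f"{namespace}/{name}"].append(float(value))

    def summarize(self, *stats):
        # Reduce every buffered series, then clear the buffer for the next epoch.
        reducers = {"mean": fmean, "max": max, "min": min}
        summary = {}
        for key, values in self._values.items():
            for stat in stats:
                summary[f"{key}.{stat}"] = reducers[stat](values)
        self._values.clear()
        return summary


buf = SummaryBuffer()
for force in (0.25, 0.5, 0.75):
    buf.add("control", gripper_force=force)

summary = buf.summarize("mean", "max")
print(summary)
```

Three buffered steps collapse into two summary values per series, which is where the reduction in per-step write overhead comes from.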

## Structured Logging for Debugging

Use log levels and metadata to make runs easy to triage. Filter by `level` in the dashboard.

```python
"""Training with structured debug/warn/info logs."""
import random
from ml_dash import Experiment

def train_with_debug():
    with Experiment(
        prefix="alice/debugging/debug-training",
        readme="Training with debug logging",
        tags=["debug"],
    ).run as experiment:
        experiment.params.set(learning_rate=0.001, batch_size=32, model="debug_net")
        experiment.log("Training started", level="info")
        experiment.log("Initializing model", level="debug")

        for epoch in range(5):
            experiment.log(f"Starting epoch {epoch + 1}", level="debug")
            loss = 1.0 / (epoch + 1)

            if epoch == 2:
                experiment.log(
                    "Learning rate may be too high",
                    level="warn",
                    metadata={"current_lr": 0.001, "suggested_lr": 0.0001},
                )

            # Simulated event: roughly one epoch in five reports clipping.
            if random.random() < 0.2:
                experiment.log(
                    "Gradient clipping applied",
                    level="warn",
                    metadata={"gradient_norm": 15.5, "max_norm": 10.0},
                )

            experiment.metrics("train").log(loss=loss, epoch=epoch)
            experiment.log(f"Epoch {epoch + 1} complete", level="info",
                           metadata={"loss": loss})

        experiment.log("Training complete", level="info")

if __name__ == "__main__":
    train_with_debug()
```
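
The same kind of level filtering can be applied offline to exported log records. The sketch below assumes a hypothetical record shape (dicts with `message`, `level`, and `metadata` keys) and an assumed severity ordering of `debug < info < warn`; neither is confirmed by ML-Dash, and the dashboard's own filter may behave differently.

```python
"""Filter log records by minimum severity (illustrative sketch)."""

# Assumed ordering of the levels used in the example above.
LEVEL_ORDER = {"debug": 10, "info": 20, "warn": 30}


def filter_logs(records, min_level="info"):
    """Return the records whose level is at or above min_level."""
    threshold = LEVEL_ORDER[min_level]
    return [r for r in records if LEVEL_ORDER[r["level"]] >= threshold]


records = [
    {"message": "Initializing model", "level": "debug", "metadata": {}},
    {"message": "Training started", "level": "info", "metadata": {}},
    {
        "message": "Learning rate may be too high",
        "level": "warn",
        "metadata": {"current_lr": 0.001, "suggested_lr": 0.0001},
    },
]

for record in filter_logs(records, min_level="warn"):
    print(record["level"], record["message"], record["metadata"])
```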

## See Also

- [Getting Started](/getting-started.md) — install and first experiment
- [Experiments](/experiments.md) — `Experiment` lifecycle and prefixes
- [Parameters](/parameters.md) — hyperparameter tracking
- [Metrics](/metrics.md) — time-series metrics
- [Logging](/logging.md) — structured logs with levels and metadata
- [Files](/files.md) — checkpoints and artifacts
- [Tracks](/tracks.md) — time-aligned media streams
- [Images](/images.md) — image logging and formats
- [Buffering](/buffering.md) — batched writes for high-frequency loops
- [CLI](/cli.md) — `ml-dash` command-line tools
- [API Reference](/api-reference.md) — full API surface