# Background Buffering

ML-Dash writes are non-blocking. Logs, [metrics](/metrics.md), [tracks](/tracks.md), and [files](/files.md) are queued and flushed from a background daemon thread so the hot path stays fast.

## Flush Triggers

Buffered data flushes when any of these occur:

1. **Time-based**: every `flush_interval` seconds (default `5.0`).
2. **Size-based**: when a queue reaches its batch size (default `100` items per queue).
3. **Manual**: `experiment.flush()` blocks until queues drain.
4. **Context exit**: leaving the `Experiment(...).run` context waits for a full drain (no timeout).
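
The four triggers above can be pictured with a small standalone buffer. This is a minimal sketch for illustration only, not ML-Dash's implementation: `SketchBuffer`, `sink`, and `close()` are hypothetical names, and the size-based flush here runs on the calling thread for brevity, whereas ML-Dash flushes from its background thread.

```python
import threading


class SketchBuffer:
    """Illustrative buffer: flushes on size, on a timer, manually, or on close."""

    def __init__(self, sink, flush_interval=5.0, batch_size=100):
        self._sink = sink                      # callable that receives a list of items
        self._flush_interval = flush_interval
        self._batch_size = batch_size
        self._items = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add(self, item):
        with self._lock:
            self._items.append(item)
            full = len(self._items) >= self._batch_size
        if full:
            self.flush()                       # size-based trigger

    def flush(self):
        # Manual trigger: swap the list out under the lock, send outside it.
        with self._lock:
            batch, self._items = self._items, []
        if batch:
            self._sink(batch)

    def _run(self):
        # Event.wait() returns False on timeout and True once close() sets it.
        while not self._stop.wait(self._flush_interval):
            self.flush()                       # time-based trigger

    def close(self):
        self._stop.set()
        self._thread.join()
        self.flush()                           # final drain, as on context exit


# Demo with a long interval so only the size and close triggers fire.
batches = []
buf = SketchBuffer(batches.append, flush_interval=60.0, batch_size=3)
for i in range(7):
    buf.add(i)
buf.close()
```

With a batch size of 3, the first six items leave in two full batches and the seventh is sent by the final drain in `close()`.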

## Forcing a Flush

Call `flush()` before any action that depends on uploads being durable (checkpoint markers, downstream readers, etc.):

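```python
import torch
from ml_dash import Experiment

# `model` and `loss` come from your training loop.
with Experiment("my-project/exp").run as experiment:
    experiment.metrics("train").log(loss=loss)
    experiment.flush()  # blocks until the queues have drained
    torch.save(model, "checkpoint.pt")  # written after the flush returns
```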

## Configuration

### Environment Variables

```bash
export ML_DASH_BUFFER_ENABLED=true       # default: true
export ML_DASH_FLUSH_INTERVAL=5.0        # seconds
export ML_DASH_LOG_BATCH_SIZE=100
export ML_DASH_METRIC_BATCH_SIZE=100
export ML_DASH_TRACK_BATCH_SIZE=100
export ML_DASH_FILE_UPLOAD_WORKERS=4
```
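
As a rough picture of how variables like these could map onto settings, the sketch below reads them into a dataclass. It is an assumption-laden illustration: `SketchBufferSettings`, `settings_from_env`, and the boolean parsing rule are hypothetical, and treating `4` as the default worker count is a guess based on the value shown above. ML-Dash's own parsing may differ.

```python
import os
from dataclasses import dataclass


@dataclass
class SketchBufferSettings:
    """Hypothetical container; field names mirror the variables above."""
    buffer_enabled: bool = True
    flush_interval: float = 5.0
    log_batch_size: int = 100
    metric_batch_size: int = 100
    track_batch_size: int = 100
    file_upload_workers: int = 4   # assumed default, not confirmed by the docs


def _as_bool(value: str) -> bool:
    # Assumed rule: a few common truthy spellings, everything else is False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def settings_from_env(env=None) -> SketchBufferSettings:
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    d = SketchBufferSettings()
    return SketchBufferSettings(
        buffer_enabled=_as_bool(env.get("ML_DASH_BUFFER_ENABLED", "true")),
        flush_interval=float(env.get("ML_DASH_FLUSH_INTERVAL", d.flush_interval)),
        log_batch_size=int(env.get("ML_DASH_LOG_BATCH_SIZE", d.log_batch_size)),
        metric_batch_size=int(env.get("ML_DASH_METRIC_BATCH_SIZE", d.metric_batch_size)),
        track_batch_size=int(env.get("ML_DASH_TRACK_BATCH_SIZE", d.track_batch_size)),
        file_upload_workers=int(env.get("ML_DASH_FILE_UPLOAD_WORKERS", d.file_upload_workers)),
    )


# Unset variables fall back to the defaults.
settings = settings_from_env(
    {"ML_DASH_FLUSH_INTERVAL": "2.5", "ML_DASH_BUFFER_ENABLED": "false"}
)
```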

### Programmatic

```python
from ml_dash import Experiment
from ml_dash.buffer import BufferConfig

config = BufferConfig(
    flush_interval=10.0,
    log_batch_size=200,
    metric_batch_size=500,
    file_upload_workers=8,
)

with Experiment("my-project/exp", buffer_config=config).run as exp:
    exp.log("custom buffer config")
```

Pass `BufferConfig(buffer_enabled=False)` to make every write synchronous. This is useful only for debugging, because each call then blocks until its write completes.

## Context Exit

On `__exit__`, the buffer manager drains all queues before returning. Expect a short pause and console output like:

```text
[ML-Dash] Flushing buffered data...
[ML-Dash]   - 1000 log(s), 100 metric(s), 50 track(s), 10 file(s)
[ML-Dash]   Uploading 10 file(s)...
[ML-Dash] All data flushed successfully
```

Upload failures are logged as warnings, not raised, so a flaky network won't crash training. An authentication error usually means you need to re-run `ml-dash login`.
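
The warn-instead-of-raise behavior can be sketched in a few lines. This is an illustration under assumptions, not ML-Dash's code: `flush_batch` and `flaky_upload` are hypothetical names, and the warning category and message format are invented for the example.

```python
import warnings


def flush_batch(upload, batch):
    """Try to send one batch; on failure, warn and report False instead of raising."""
    try:
        upload(batch)
        return True
    except Exception as exc:
        warnings.warn(
            f"upload failed, {len(batch)} item(s) not sent: {exc}",
            RuntimeWarning,
        )
        return False


def flaky_upload(batch):
    # Stand-in for a network call that fails.
    raise ConnectionError("network unreachable")


# Capture warnings so the demo can inspect them; training code would just continue.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = flush_batch(flaky_upload, [1, 2, 3])
```

The caller gets a boolean back and the exception never propagates, which is why a failed upload does not interrupt the training loop.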

## File Uploads

When you call `save_image`, `save_json`, etc., the content is written to a temp file and queued. One of the `file_upload_workers` threads uploads it, after which the temp file is removed. Cleanup runs even if the upload is delayed.
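
The stage, upload, and clean-up sequence might look roughly like the sketch below, which uses only the standard library. `stage_to_temp`, `upload_then_cleanup`, and `fake_upload` are hypothetical helpers, and removing the temp file in a `finally` block (so it is deleted even when the upload raises) is an assumption about the cleanup policy rather than documented ML-Dash behavior.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def stage_to_temp(data: bytes, suffix: str = "") -> str:
    """Write content to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def upload_then_cleanup(path, upload):
    """Run the upload, then remove the temp file whether or not it succeeded."""
    try:
        return upload(path)
    finally:
        os.remove(path)


def fake_upload(path):
    # Stand-in for a real upload: read the file back and return its bytes.
    with open(path, "rb") as f:
        return f.read()


payloads = [b'{"a": 1}', b'{"b": 2}', b'{"c": 3}']
paths = [stage_to_temp(p, suffix=".json") for p in payloads]

# A small pool plays the role of the `file_upload_workers` threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(upload_then_cleanup, p, fake_upload) for p in paths]
    results = [f.result() for f in futures]
```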

## Thread Safety

Queues are thread-safe; logging from multiple worker threads against the same `Experiment` is supported.
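
To show why a thread-safe queue makes concurrent logging safe, the sketch below has several threads push records into one `queue.Queue` from the standard library. It does not use ML-Dash itself; `log_queue` and `worker` are hypothetical stand-ins for the library's internal queue and your own worker threads.

```python
import queue
import threading

log_queue = queue.Queue()   # stand-in for one of the internal buffers


def worker(worker_id, n):
    # Each thread enqueues its own records; Queue.put handles the locking.
    for step in range(n):
        log_queue.put((worker_id, step))


threads = [threading.Thread(target=worker, args=(i, 250)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Drain on a single thread, as a background flusher would.
drained = []
while True:
    try:
        drained.append(log_queue.get_nowait())
    except queue.Empty:
        break
```

No record is lost or duplicated, although records from different threads may interleave in any order.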

For end-to-end examples, see [Complete Examples](/complete-examples.md).