Chronos-2 ONNX

This is an ONNX export of the Chronos-2 time series forecasting model, optimized for use with transformers.js.

Model Details

  • Model Type: Time Series Forecasting
  • Architecture: T5-based encoder-decoder with patching
  • Context Length: 8192 timesteps
  • Output Patch Size: 16 timesteps
  • Quantile Levels: 21 levels (0.01, 0.05, ..., 0.95, 0.99)
  • Model Dimension: 768
  • Layers: 12
  • Attention Heads: 12

Files

  • model.onnx - FP32 ONNX model (456.3 MB)
  • model_quantized.onnx - INT8 quantized model (124.7 MB, ~70% size reduction). transformers.js symlinks point to this file by default.
  • config.json - Model configuration
  • generation_config.json - Generation parameters
  • onnx/ - transformers.js-compatible directory structure

Usage

JavaScript (transformers.js)

import { pipeline } from '@huggingface/transformers';

// Load the forecasting pipeline
const forecaster = await pipeline('time-series-forecasting', 'kashif/chronos-2-onnx');

// Your historical time series data
const timeSeries = [605, 586, 586, 559, 511, 487, 484, 458, ...];  // 100+ timesteps

// Generate 16-step forecast with quantiles
const output = await forecaster(timeSeries, {
    prediction_length: 16,
    quantile_levels: [0.1, 0.5, 0.9],  // 10th, 50th (median), 90th percentiles
});

// Output format: { forecast: [[t1_q1, t1_q2, t1_q3], ...], quantile_levels: [...] }
console.log('Median forecast:', output.forecast.map(row => row[1]));  // Extract median

// Clean up
await forecaster.dispose();

Batch Forecasting

const batch = [
    [100, 110, 105, 115, 120, ...],  // Series 1
    [50, 55, 52, 58, 60, ...],       // Series 2
];

const outputs = await forecaster(batch);
// Returns array of forecasts, one per input series

Forecast with Covariates

const result = await forecaster(
    {
        target: salesSeries,
        past_covariates: {
            temperature: pastTemps,
            promo: pastPromoFlags,
        },
        future_covariates: {
            temperature: futureTemps,
            promo: futurePromoFlags,
        },
    },
    {
        prediction_length: 24,
        quantile_levels: [0.1, 0.5, 0.9],
    },
);

console.log(result.forecast[0]);  // Quantile matrix for the target series

Performance

  • Inference Time: ~35-80ms per series (CPU, Node.js)
  • Speedup vs PyTorch: 3-8x faster
  • Accuracy: forecasts match the PyTorch reference implementation to within 1% error
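
The per-series latency above can be checked with a simple timing harness (a sketch; `forecaster` is the pipeline object from the Usage section, and the warm-up and run counts are arbitrary choices):

```javascript
// Measure mean inference time per series, excluding model load and warm-up.
async function timeForecast(forecaster, series, runs = 10) {
  await forecaster(series);  // warm-up call so one-time costs are not counted
  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    await forecaster(series);
  }
  return (performance.now() - start) / runs;  // mean milliseconds per call
}
```

`performance.now()` is available globally in modern Node.js, so no extra imports are needed.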

Technical Details

Preprocessing

Chronos-2 uses automatic preprocessing:

  1. Repeat-padding: Input is padded to be divisible by patch_size (16)
  2. Instance normalization: Per-series z-score normalization
  3. arcsinh transformation: Nonlinear compression of large magnitudes, so outliers and heavy-tailed series do not dominate the normalized values

All preprocessing is handled automatically by the pipeline.
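
The three steps above can be sketched in plain JavaScript. This is an illustrative approximation, not the pipeline's exact internals: the padding side (prepending copies of the first value) and the zero-variance fallback are assumptions.

```javascript
// Illustrative sketch of Chronos-2-style preprocessing.
function preprocess(series, patchSize = 16) {
  // 1. Repeat-padding: extend the series so its length is divisible by patchSize
  //    (here: prepend copies of the first value; the actual padding side may differ).
  const pad = (patchSize - (series.length % patchSize)) % patchSize;
  const padded = Array(pad).fill(series[0]).concat(series);

  // 2. Instance normalization: per-series z-score.
  const mean = padded.reduce((a, b) => a + b, 0) / padded.length;
  const variance = padded.reduce((a, b) => a + (b - mean) ** 2, 0) / padded.length;
  const std = Math.sqrt(variance) || 1;  // guard against constant series
  const normalized = padded.map((x) => (x - mean) / std);

  // 3. arcsinh transform: compresses extreme values nonlinearly.
  return normalized.map((x) => Math.asinh(x));
}
```

The pipeline performs the equivalent steps (and their inverses on the output) internally, so user code never needs to call anything like this directly.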

Output Format

The model outputs quantile forecasts:

interface Chronos2Output {
    forecast: number[][];        // [prediction_length, num_quantiles]
    quantile_levels: number[];   // The quantile levels for each column
}

Extract specific quantiles (assuming quantile_levels: [0.1, 0.5, 0.9], as in the Usage example):

const median = output.forecast.map(row => row[1]);    // 50th percentile
const lower = output.forecast.map(row => row[0]);     // 10th percentile (lower bound)
const upper = output.forecast.map(row => row[2]);     // 90th percentile (upper bound)

Limitations

  • Maximum context: 8192 timesteps
  • Fixed prediction length: 16 timesteps (for now; autoregressive unrolling coming soon)
  • Univariate only: Single time series per input (multivariate support coming)
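
Until autoregressive unrolling lands in the library, the fixed 16-step horizon can be extended manually by feeding the median forecast back into the context. This is a hypothetical workaround sketch, not a built-in API; `forecaster` is the pipeline from the Usage section, and treating the median path as ground truth ignores compounding uncertainty.

```javascript
// Hypothetical autoregressive unrolling: repeatedly forecast 16 steps and
// append the median path to the context until `horizon` steps are produced.
async function unroll(forecaster, series, horizon, patch = 16) {
  let context = series.slice();
  const medians = [];
  while (medians.length < horizon) {
    const { forecast } = await forecaster(context, {
      prediction_length: patch,
      quantile_levels: [0.5],
    });
    const step = forecast.map((row) => row[0]);  // single requested quantile: the median
    medians.push(...step);
    context = context.concat(step);  // feed predictions back as context
  }
  return medians.slice(0, horizon);
}
```

Note that quantile forecasts beyond the first 16 steps produced this way are conditioned on the median path, so the resulting intervals will be narrower than a true multi-step predictive distribution.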

Citation

@article{ansari2024chronos,
  title={Chronos: Learning the Language of Time Series},
  author={Ansari, Abdul Fatir and others},
  journal={arXiv preprint arXiv:2403.07815},
  year={2024}
}

License

Apache 2.0
