AI for Apple silicon devices.

Run your AI workflows locally on Apple silicon devices running iOS, macOS, tvOS, visionOS, or watchOS.

Start with the Swift SDK, the Dart SDK, or the Onde CLI if you want to test model downloads and GGUF output before wiring inference into app code.

Download on the App Store
main.rs
use onde::inference::{ChatEngine, GgufModelConfig};

// Assumes a Tokio async runtime for the `.await` calls below.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let engine = ChatEngine::new();
    engine
        .load_gguf_model(
            GgufModelConfig::platform_default(),
            Some("You are a helpful assistant.".into()),
            None,
        )
        .await?;
    let result = engine.send_message("Hello!").await?;
    println!("{}", result.text);
    // completed in 85ms — 100% on device
    Ok(())
}

Benchmark

The fastest API call is the one you never make.

Inference Layer    Latency (ms)    Server Cost    Privacy
Cloud API          1,200+          $$$$           Leaves device
Onde on-device     85              $0             Stays on device
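A quick back-of-the-envelope sketch using the latency figures from the table above (the speedup is derived from those two numbers, not separately measured):

```rust
fn main() {
    // Figures from the benchmark table: cloud round trip vs. on-device.
    let cloud_ms = 1200.0_f64; // lower bound for a cloud API call
    let onde_ms = 85.0_f64;    // Onde on-device inference

    // Per-call speedup from skipping the network entirely.
    let speedup = cloud_ms / onde_ms;
    println!("on-device is {:.0}x faster per call", speedup);
}
```

At these numbers, every call you keep on device is roughly an order of magnitude faster, before counting the server cost you never pay.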

First-class citizen

Developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers. - Steve

Reliable and efficient

Your app does not stall at one million users. Every new device brings its own compute, so inference capacity scales with your install base at the edge.

Ergonomic API

Two lines to load a model. One line to run it. The SDK gets out of your way so you can ship.

Hardware Optimized

Hand-tuned for Apple Neural Engine and Apple silicon. Onde speaks closer to the metal than generic wrappers.

In production

Action speaks (c)louder.

Onde ships in real apps from Splitfire AB. If you want to see the network side of the system, open Onde Inference Pulse.

One More Thing

The World's Intelligence. On Your Terms.