AI for Apple silicon devices.

Run your AI workflows locally on Apple silicon devices running iOS, macOS, tvOS, visionOS, or watchOS.

Start with the Swift SDK or the Dart SDK, or use the Onde CLI to test model downloads and GGUF output before you wire inference into app code.

main.rs

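use onde::inference::{ChatEngine, GgufModelConfig};

// Assumption: a Tokio runtime. The calls below only need some async context.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the engine, then load the default GGUF model for this platform.
    let engine = ChatEngine::new();
    engine
        .load_gguf_model(
            GgufModelConfig::platform_default(),
            Some("You are a helpful assistant.".into()), // system prompt
            None,
        )
        .await?;

    // Send one message and print the reply text.
    let result = engine.send_message("Hello!").await?;
    println!("{}", result.text);
    // completed in 85ms, 100% on device
    Ok(())
}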

Inference Layer	Latency (ms)	Server Cost	Privacy
Cloud API	1,200+	$$$$	Leaves device
Onde on-device	85	$0	Stays on device
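
Latency figures like these depend on the device and the model, so it is worth measuring on your own hardware. Below is a minimal sketch of one way to do that with only the Rust standard library. The helpers `timed` and `median_ms` are hypothetical names introduced for illustration, not part of the Onde API, and the closure is a stand-in for whatever call you want to time.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result with the wall-clock time it took.
/// Hypothetical helper, not part of any SDK.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Sorts the samples and returns the middle one in milliseconds.
/// For an even count this is the upper of the two middle samples.
fn median_ms(samples: &mut [Duration]) -> f64 {
    assert!(!samples.is_empty(), "need at least one sample");
    samples.sort();
    samples[samples.len() / 2].as_secs_f64() * 1000.0
}

fn main() {
    // Stand-in workload; replace the closure body with the call to measure.
    let mut samples: Vec<Duration> = (0..5)
        .map(|_| timed(|| (0..100_000u64).sum::<u64>()).1)
        .collect();
    println!("median latency: {:.3} ms", median_ms(&mut samples));
}
```

For an async call, the same idea applies: take `Instant::now()` before the call and read `elapsed()` after the `.await` completes. Reporting a median over several runs keeps a single slow first call (for example, one that includes model loading) from skewing the number.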

Developer developer developer developer developer developer developer developer developer developer developer developer developer developer - Steve

Reliable and efficient

Your app does not stall at one million users. More devices mean more distributed compute at the edge.

Ergonomic API

Two lines to load a model. One line to run it. The SDK gets out of your way so you can ship.

Hardware Optimized

Hand-tuned for the Apple Neural Engine and Apple silicon. Onde runs closer to the metal than generic wrappers.

The fastest API call is the one you never make.

Action speaks (c)louder.

The World's Intelligence. On Your Terms.