MOCHART
WebGPU + Rust WASM financial charting for the modern web.
Performance
Mochart keeps the visible OHLCV working set in a cache-friendly structure-of-arrays (SoA) layout, so hot scans stay inside the CPU cache as often as possible. Roughly one screen of data is about 24 KB in columnar form, small enough to fit a typical 32 KB or larger L1 data cache and to keep the cache hit rate high while panning, range scanning, and updating indicators.
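As a rough illustration of the columnar layout described above, here is a minimal sketch. The struct name, the choice of six 4-byte columns, and the 1000-bar window are assumptions made for the example; the page states only the SoA layout and the roughly 24 KB figure, and this is one combination that matches it.

```rust
/// Hypothetical columnar (SoA) storage for the visible OHLCV window.
/// Assumption: one u32 timestamp column plus five f32 columns, 4 bytes each.
pub struct OhlcvColumns {
    pub time: Vec<u32>,
    pub open: Vec<f32>,
    pub high: Vec<f32>,
    pub low: Vec<f32>,
    pub close: Vec<f32>,
    pub volume: Vec<f32>,
}

impl OhlcvColumns {
    /// Allocate zeroed columns for `bars` bars.
    pub fn with_len(bars: usize) -> Self {
        Self {
            time: vec![0; bars],
            open: vec![0.0; bars],
            high: vec![0.0; bars],
            low: vec![0.0; bars],
            close: vec![0.0; bars],
            volume: vec![0.0; bars],
        }
    }

    /// Number of bars held (all columns share the same length).
    pub fn bars(&self) -> usize {
        self.close.len()
    }

    /// Bytes of column data: 6 columns x 4 bytes x bars.
    pub fn byte_size(&self) -> usize {
        self.bars() * (std::mem::size_of::<u32>() + 5 * std::mem::size_of::<f32>())
    }

    /// Visible price range over [start, end). Only the `low` and `high`
    /// columns are read, so the scan touches two contiguous arrays.
    pub fn price_range(&self, start: usize, end: usize) -> Option<(f32, f32)> {
        let end = end.min(self.bars());
        if start >= end {
            return None;
        }
        let lo = self.low[start..end].iter().copied().fold(f32::INFINITY, f32::min);
        let hi = self.high[start..end].iter().copied().fold(f32::NEG_INFINITY, f32::max);
        Some((lo, hi))
    }
}
```

Under these assumptions, `OhlcvColumns::with_len(1000).byte_size()` is 24,000 bytes, and a range scan reads only 8 bytes per bar rather than a full 24-byte record.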
Architecture
Two threads collaborate via SharedArrayBuffer, with zero serialization overhead per frame. The main thread only dispatches viewport events; the Unified Worker owns the WASM data engine, WebGPU rendering, and the vsync rAF loop.
The Unified Worker co-locates the Rust WASM data engine and the WebGPU renderer in one thread, so WASM view data flows directly to queue.writeBuffer() with no inter-thread transfer. The main thread writes only a 64-byte SAB control block per frame.
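To make the 64-byte control block concrete, here is a minimal sketch of what such a block could look like from the Rust side. The field names, their order, and the sequence-counter protocol are all hypothetical; the page states only the size and that the main thread writes it once per frame. In the browser the writer would be JavaScript operating on the SharedArrayBuffer; both sides are modeled in Rust here to keep the example self-contained.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical layout: 3 live fields + 13 reserved words = 16 x 4 = 64 bytes.
#[repr(C, align(64))]
pub struct ControlBlock {
    /// Bumped by the writer after each viewport update.
    pub seq: AtomicU32,
    /// Index of the first visible bar.
    pub start_bar: AtomicU32,
    /// Number of visible bars.
    pub visible_bars: AtomicU32,
    _reserved: [u32; 13],
}

impl ControlBlock {
    pub const fn new() -> Self {
        Self {
            seq: AtomicU32::new(0),
            start_bar: AtomicU32::new(0),
            visible_bars: AtomicU32::new(0),
            _reserved: [0; 13],
        }
    }

    /// Writer side: store the viewport, then publish it by bumping `seq`.
    pub fn publish(&self, start_bar: u32, visible_bars: u32) {
        self.start_bar.store(start_bar, Ordering::Relaxed);
        self.visible_bars.store(visible_bars, Ordering::Relaxed);
        self.seq.fetch_add(1, Ordering::Release);
    }

    /// Reader side: returns the viewport only if `seq` changed since the
    /// last call. This is not a full seqlock: a read that races with a
    /// write may mix two updates, and the next poll then sees the newer one.
    pub fn poll(&self, last_seen: &mut u32) -> Option<(u32, u32)> {
        let seq = self.seq.load(Ordering::Acquire);
        if seq == *last_seen {
            return None;
        }
        *last_seen = seq;
        Some((
            self.start_bar.load(Ordering::Relaxed),
            self.visible_bars.load(Ordering::Relaxed),
        ))
    }
}
```

A fixed-size block of plain integers means nothing needs to be serialized or copied per frame: the reader checks one counter and loads a few words.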
Unified Worker: The earlier 3-thread split (Main → Data Worker → Render Worker) required a frameBuf intermediary for every frame, introduced a vsync double-gate, and scaled poorly at 2 workers per chart. Collapsing both workers into a single Unified Worker eliminated the copy, fixed the vsync phase lag, and let WASM memory flow directly to queue.writeBuffer().
No MSAA / No FSR / No FXAA: Candlestick charts are dominated by high-contrast vertical and horizontal edges. Blur-based anti-aliasing and upscaling techniques soften these edges and reduce perceived sharpness. Rendering at native DPR resolution delivers pixel-perfect lines at no extra cost, and it looks better. 4× MSAA alone would add ~132 MB of VRAM at 1080p DPR 2, far beyond our sub-10 KB VRAM budget.
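The ~132 MB figure can be reproduced with simple arithmetic. The sketch below assumes a single multisampled color target at 4 bytes per pixel (for example an 8-bit RGBA format) and no depth attachment; the function name and parameters are illustrative only.

```rust
/// Bytes needed for a multisampled color target.
/// Assumption: every sample stores `bytes_per_px` bytes, with no compression.
pub fn msaa_color_bytes(css_w: u64, css_h: u64, dpr: u64, samples: u64, bytes_per_px: u64) -> u64 {
    let physical_w = css_w * dpr; // 1920 * 2 = 3840
    let physical_h = css_h * dpr; // 1080 * 2 = 2160
    physical_w * physical_h * samples * bytes_per_px
}
```

For 1920×1080 at DPR 2 with 4 samples and 4 bytes per pixel, this gives 3840 × 2160 × 4 × 4 = 132,710,400 bytes, or about 132.7 MB.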
No Motion Blur: Seven shader phases were explored and rejected. Chart panning redraws bars rather than translating pixels, so screen-space blur produces no perceptible effect. This may be revisited in the future with a proper velocity-buffer approach.
CPU-first Indicators: EMA, RSI, MACD, and ATR are sequential IIR chains: each output depends on the previous one, so they do not parallelize naively across bars via SIMD or GPU compute. With a visible window of 200–2000 bars, scalar Rust loops complete in under 1 µs, well within the 16.6 ms frame budget. GPU compute dispatch overhead (50–200 µs) would make the GPU path slower for these workloads.
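The sequential dependency is easy to see in a scalar EMA loop. This is a minimal sketch, not Mochart's actual implementation: the function name is hypothetical, and seeding with the first close is just one common convention (others seed with a simple moving average).

```rust
/// Exponential moving average over `close` with smoothing 2 / (period + 1).
/// Each output depends on the previous output, so the loop is inherently
/// sequential across bars.
pub fn ema(close: &[f32], period: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(close.len());
    if close.is_empty() || period == 0 {
        return out;
    }
    let alpha = 2.0 / (period as f32 + 1.0);
    // Assumption: seed the recurrence with the first close.
    let mut prev = close[0];
    out.push(prev);
    for &c in &close[1..] {
        prev = alpha * c + (1.0 - alpha) * prev;
        out.push(prev);
    }
    out
}
```

For example, `ema(&[1.0, 2.0, 3.0], 3)` uses alpha = 0.5 and yields `[1.0, 1.5, 2.25]`.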
VSync-aligned rAF: Replacing Atomics.waitAsync-based waiting with requestAnimationFrame polling in the Worker eliminated tearing and redundant GPU submits, delivering the single largest UX improvement in the project.
What's next
A thin, zero-overhead wrapper for React 18+.
Strict Mode safe. SSR-ready. Hooks & declarative component API.
Also planned: @mochart/vue · @mochart/solid