Demucs music stem separator rewritten in Rust – runs in the browser
FrameworkFreeHi HN! I reimplemented HTDemucs v4 (Meta's music source separation model) in Rust, using Burn. It splits any song into individual stems — drums, bass, vocals, guitar, piano — with no Python runtime or server involved.Try it now: https://nikhilunni.github.io/demucs-rs/ (needs
Capabilities8 decomposed
browser-native audio stem separation with onnx inference
Medium confidenceExecutes the Demucs neural network model (vocals, drums, bass, other) directly in the browser using ONNX Runtime WebAssembly, eliminating server-side processing. The Rust codebase compiles to WebAssembly via wasm-bindgen, exposing a JavaScript API that loads pre-trained model weights and runs inference on client-side audio buffers without network latency or privacy concerns.
Rewrite of Demucs (originally Python/PyTorch) into Rust compiled to WebAssembly, enabling full stem separation inference in browsers without server dependency. Uses ONNX Runtime WebAssembly for cross-platform model execution, avoiding the need to bundle PyTorch or maintain Python backend infrastructure.
Faster and more private than cloud-based stem separation services (Splitter.ai, Lalal.ai) because processing happens locally; more accessible than native Demucs because no Python/GPU setup required; smaller bundle than full PyTorch-to-WASM ports because ONNX Runtime is optimized for inference-only workloads.
real-time audio buffer streaming and windowing
Medium confidenceHandles chunked audio input processing by managing sliding windows of audio frames, buffering partial chunks, and coordinating inference timing to avoid gaps or overlaps in stem output. The Rust implementation uses ring buffers or deque structures to queue incoming audio data and emit inference-ready chunks at the model's required sample rate and frame size, with overlap-add reconstruction for seamless stem reconstruction.
Implements overlap-add windowing in Rust with zero-copy buffer management, allowing seamless reconstruction of stems from overlapping inference windows without intermediate allocations. Uses WASM memory views to avoid copying audio data between JavaScript and Rust boundaries.
More memory-efficient than loading entire audio files before processing because windowing processes fixed-size chunks; lower latency than naive chunking because overlap-add prevents discontinuities at chunk boundaries.
onnx model weight loading and caching
Medium confidenceLoads pre-trained Demucs model weights from ONNX format files and caches them in browser memory or IndexedDB to avoid re-downloading on subsequent uses. The implementation handles model initialization, weight tensor mapping to the inference graph, and optional persistent storage using browser APIs, with fallback to re-download if cache is unavailable.
Implements dual-layer caching (in-memory + IndexedDB) for ONNX models in Rust/WASM, with automatic fallback to re-download if cache is stale or unavailable. Uses WASM memory views to avoid copying model weights between storage and inference engine.
Faster repeat loads than cloud-based services because models are cached locally; more efficient than naive re-download on every page load because IndexedDB persists across sessions; avoids server-side model serving costs.
multi-stem parallel inference orchestration
Medium confidenceCoordinates inference across multiple output stems (vocals, drums, bass, other) by running the Demucs model once per stem or using a multi-output model variant that produces all stems in a single forward pass. The Rust implementation manages tensor allocation, inference scheduling, and output collection to ensure all stems are computed and synchronized before returning results to the caller.
Orchestrates inference across multiple stems using ONNX Runtime's graph execution, potentially leveraging multi-output model variants to compute all stems in a single forward pass rather than sequential inference. Manages tensor lifecycle and memory to minimize allocations across stem computations.
More efficient than running separate models per stem because a single multi-output model reduces redundant computation; faster than sequential single-stem inference because overlapping computation can be parallelized on multi-core CPUs.
audio format conversion and resampling
Medium confidenceConverts input audio from various formats (MP3, WAV, WebM, OGG) to raw PCM buffers at the model's expected sample rate, handling codec decoding and sample rate conversion transparently. The implementation uses browser Web Audio API for decoding and Rust-based resampling (e.g., sinc interpolation or linear interpolation) to match the model's input requirements without requiring external libraries.
Implements resampling in Rust/WASM to avoid JavaScript overhead and enable high-quality sinc interpolation without external dependencies. Uses Web Audio API for codec decoding (browser-native, no transcoding overhead) and delegates resampling to Rust for performance and quality control.
More efficient than JavaScript-based resampling libraries because Rust/WASM is faster; avoids server-side transcoding because Web Audio API handles decoding; supports more formats than naive implementations because it leverages browser codec support.
stem output export to audio files
Medium confidenceEncodes separated stems from raw PCM buffers into downloadable audio files (WAV, MP3, or other formats) with metadata (sample rate, bit depth, channel count). The implementation uses browser APIs or Rust-based encoders to convert Float32Array buffers to file formats, handling byte ordering, header generation, and optional compression.
Implements WAV encoding directly in Rust/WASM to avoid JavaScript overhead and external encoder dependencies. Generates valid WAV headers with correct RIFF structure and PCM format specifications, enabling direct file download without server-side encoding.
Faster than JavaScript-based WAV encoding because Rust is compiled; avoids server-side encoding costs and latency; produces valid WAV files without external libraries or APIs.
progress reporting and cancellation for long-running inference
Medium confidenceExposes callbacks or event emitters that report inference progress (e.g., percentage complete, current stem being processed) and allow users to cancel ongoing inference. The implementation divides inference into checkpoints, emits progress events after each checkpoint, and checks for cancellation signals before proceeding to the next step.
Implements checkpoint-based progress reporting in Rust/WASM by dividing inference into discrete steps and emitting progress events via JavaScript callbacks. Uses atomic flags for cancellation signaling to avoid race conditions between WASM and JavaScript threads.
More responsive than blocking inference because progress is reported incrementally; allows cancellation without restarting the entire process; provides better UX than silent inference by keeping users informed.
error handling and graceful degradation for inference failures
Medium confidenceCatches inference errors (e.g., out-of-memory, invalid model, corrupted audio) and returns meaningful error messages to the caller, with optional fallback strategies (e.g., reduce audio quality, use smaller model variant). The implementation includes validation of input audio, model state checks, and error propagation through the JavaScript API.
Implements comprehensive error handling in Rust with custom error types that map to JavaScript exceptions, providing structured error information (code, message, recovery suggestions) rather than opaque WASM panics. Validates input audio and model state before inference to catch errors early.
More informative than raw WASM errors because custom error types provide context; better UX than silent failures because errors are reported with recovery suggestions; more robust than naive implementations because validation catches edge cases early.
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifactssharing capabilities
Artifacts that share capabilities with Demucs music stem separator rewritten in Rust – runs in the browser, ranked by overlap. Discovered automatically through the match graph.
wav2vec2-large-xlsr-53-russian
automatic-speech-recognition model by undefined. 45,90,191 downloads.
distil-large-v3
automatic-speech-recognition model by undefined. 13,05,832 downloads.
ruvector-onnx-embeddings-wasm
Portable WASM embedding generation with SIMD and parallel workers - run text embeddings in browsers, Cloudflare Workers, Deno, and Node.js
RMBG-1.4
image-segmentation model by undefined. 10,16,325 downloads.
Kokoro TTS
Lightweight 82M parameter open-source TTS with high-quality output.
OpenAI: GPT-4o Audio
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...
Best For
- ✓web developers building music production tools
- ✓teams building privacy-first audio applications
- ✓indie music producers wanting client-side stem separation
- ✓developers prototyping audio ML features without server costs
- ✓developers building real-time audio processing pipelines
- ✓applications that need to process long audio files without loading entirely into memory
- ✓live music production tools that process microphone input
- ✓web apps where users return multiple times and want fast startup
Known Limitations
- ⚠ONNX Runtime WebAssembly has higher memory overhead than native inference; typical 3-5 minute songs require 2-4GB RAM during processing
- ⚠inference speed depends on client CPU/GPU; no GPU acceleration in most browsers (WebGPU still experimental)
- ⚠model weights must be downloaded to client (typically 200-500MB for full Demucs model); no streaming model loading
- ⚠browser tab may become unresponsive during long inference; no built-in progress reporting or cancellation
- ⚠limited to browsers with WebAssembly support (IE11 not supported)
- ⚠overlap-add reconstruction introduces latency proportional to window size; typical 2-5 second delay before first stem output
Requirements
Input / Output
UnfragileRank
UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
About
Show HN: Demucs music stem separator rewritten in Rust – runs in the browser
Categories
Alternatives to Demucs music stem separator rewritten in Rust – runs in the browser
Search the Supabase docs for up-to-date guidance and troubleshoot errors quickly. Manage organizations, projects, databases, and Edge Functions, including migrations, SQL, logs, advisors, keys, and type generation, in one flow. Create and manage development branches to iterate safely, confirm costs
Compare →Are you the builder of Demucs music stem separator rewritten in Rust – runs in the browser?
Claim this artifact to get a verified badge, access match analytics, see which intents users search for, and manage your listing.
Get the weekly brief
New tools, rising stars, and what's actually worth your time. No spam.
Data Sources
Looking for something else?
Search →