MetaDJ Studio


Where music, imagination, and world building converge.

MetaDJ Studio is a real-time creative performance workspace built on Daydream Scope. It turns browser audio and text prompts into live AI-generated visuals — one shared session, two creative modes, transitions handled by the system so you never leave the flow. I built it for my live sets. It's how I see my music.

by MetaDJ / Z — Zuberant

The Experience

Press play. The visuals start breathing with the music.

Environment mode runs 16 calibrated visual worlds that react to energy, brightness, texture, and beat. High energy drives intense motion. Low energy lets the imagery settle. Beats pulse through the frame as controlled visual spikes, not jarring cuts. Switch themes mid-track and the world crossfades — SLERP (spherical linear interpolation) blending the mathematical state of one world into the next. Nothing flickers. Nothing hard-cuts.

Toggle to Avatar mode. Studio performs a fast reconnect behind the scenes, freezes the last frame during the handshake, and lands on a clean jump cut into the MetaDJ character scene — a cosmic stage, a bioluminescent forest, a neon city. The audio still modulates the generation underneath. The character breathes with the music the same way the environments do. Prompt accents apply as subtle overlays on top of the active preset, layering mood or detail onto the scene without replacing the character or the base environment.

Pop out the Stage view onto a second screen. The Ambilight edge glow pulses at 60fps, evolving from the live video feed. Drop it into OBS, stream it, record it. The compositor runs entirely in the browser — no additional GPU needed.

No music? Ambient mode keeps the visuals evolving on their own — sinusoidal drift maintains motion from prompt and latent state alone. Useful for installations, background loops, or letting a world breathe without audio input.

Same track, different themes — different visual journeys every time.

The Audio-Visual Connection

Audio analysis runs entirely in the browser at ~86Hz using Meyda — no server round-trips. Sending audio to a remote server would add 50-100ms of latency, enough to sever the rhythm-to-visual connection. By analyzing locally and streaming only lightweight extracted parameters over WebRTC, the sync between what you hear and what you see stays tight.
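A minimal sketch of that setup, assuming a standard Meyda analyzer wired into the Web Audio graph (the buffer size and the sendParams transport helper are illustrative, not Studio's exact configuration). At 44.1 kHz, a 512-sample buffer works out to roughly 86 analysis frames per second; beat detection is derived from the energy stream rather than extracted by Meyda directly.

```typescript
import Meyda from "meyda";

// Minimal sketch of browser-side feature extraction. Buffer size and the
// sendParams transport helper are assumptions, not Studio's exact config.
const audioContext = new AudioContext();

function startAnalysis(mediaElement: HTMLMediaElement): void {
  const source = audioContext.createMediaElementSource(mediaElement);
  source.connect(audioContext.destination); // keep the audio audible

  const analyzer = Meyda.createMeydaAnalyzer({
    audioContext,
    source,
    bufferSize: 512, // 44100 / 512 ≈ 86 analysis frames per second
    featureExtractors: ["energy", "spectralCentroid", "spectralFlatness"],
    callback: (features) => {
      // Only these lightweight numbers cross WebRTC -- never raw audio.
      sendParams({
        energy: features.energy ?? 0,
        centroid: features.spectralCentroid ?? 0,
        flatness: features.spectralFlatness ?? 0,
      });
    },
  });
  analyzer.start();
}

// Hypothetical helper standing in for the WebRTC data channel.
declare function sendParams(params: Record<string, number>): void;
```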

Four signals drive the synthesis:

Energy — Overall amplitude mapped to visual intensity. Louder music means more evolution between frames.

Spectral Centroid — A brightness descriptor. Bright, airy tones pull the palette one direction; dark, bass-heavy textures pull it another.

Spectral Flatness — The texture signal. Smooth synth pads produce different visual grain than noisy percussion or distorted guitars.

Beat Detection — Energy-based BPM tracking that triggers rhythmic noise pulses. The beat lands on screen without the visuals descending into chaos.

The mapping engine translates these raw signals into Scope's generation parameters. Each theme defines its own translation — how that specific world breathes with the music.
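As a sketch of what one per-theme translation could look like; the field names, ranges, and the normalized inputs are illustrative, not Studio's actual schema:

```typescript
// Hypothetical per-theme mapping; field names and ranges are illustrative.
// Inputs are assumed normalized to [0, 1] before they reach the theme.
interface ThemeMapping {
  name: string;
  // How far each frame evolves from the last, driven by audio energy.
  energyToStrength: (energy: number) => number;
  // Palette accent chosen from spectral brightness.
  centroidToAccent: (centroid: number) => string;
  // Extra noise injected on each detected beat.
  beatPulse: number;
}

const volcano: ThemeMapping = {
  name: "Volcano",
  energyToStrength: (e) => 0.3 + 0.5 * Math.min(e, 1), // settles when quiet
  centroidToAccent: (c) => (c > 0.5 ? "white-hot sparks" : "deep ember glow"),
  beatPulse: 0.15,
};
```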

The Visual Worlds

Environment Themes

Each world is a self-contained ecosystem with its own parameter mapping:

Astral · Forge · Forest · Synthwave · Sanctuary · Ocean · Cyber · Aurora · Arcade · Volcano · Quantum · Tokyo · Circuit · Amethyst · Matrix · Sakura

Themes carry multiple prompt variations that rotate across sessions — the same world never looks exactly the same twice. The custom theme builder lets you write your own prompt and tune reactivity and beat response from scratch. A built-in AI enhancement toggle refines prompts before they reach Scope — sharpening vague ideas into precise visual descriptions without leaving the workflow.

The Auto Theme Timeline handles live sets where manual switching creates too much overhead — rotating themes on beat boundaries (every 16, 32, or 64 beats), aligning visual scene changes with the music's structure.
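A sketch of that rotation logic, assuming the beat detector exposes a per-beat callback (the wiring here is illustrative):

```typescript
// Hypothetical auto-rotation: advance the theme every N detected beats.
function makeAutoTimeline(themes: string[], beatsPerTheme: 16 | 32 | 64) {
  let beatCount = 0;
  let themeIndex = 0;
  // The returned function is called once per detected beat; it reports
  // the theme that should be active after this beat.
  return function onBeat(): string {
    beatCount += 1;
    if (beatCount % beatsPerTheme === 0) {
      themeIndex = (themeIndex + 1) % themes.length;
    }
    return themes[themeIndex];
  };
}

const onBeat = makeAutoTimeline(["Astral", "Forge", "Synthwave"], 32);
```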

Avatar Scene Presets

Six prompt-driven character scenes for the MetaDJ avatar:

Studio Session · Main Stage · Adventure — Cosmic · Adventure — Forest · Adventure — City · Talkover

Each preset describes the character and environment through text. The avatar's identity stays coherent across scene changes because the character description is baked into the prompt — not derived from a reference image.
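A hypothetical preset shape illustrating the idea; the character description and scene text here are invented, not Studio's actual prompts:

```typescript
// Hypothetical preset: the character description travels inside every
// prompt, so identity survives scene changes without a reference image.
const CHARACTER = "MetaDJ, a luminous performer with a mirrored visor";

const mainStage = {
  name: "Main Stage",
  prompt: `${CHARACTER}, commanding a festival main stage, lasers and haze`,
};
```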

Under the Hood

Stack: Next.js 16, TypeScript, Tailwind v4, Meyda for audio analysis, WebRTC streaming to Daydream Scope, Canvas/WebGL client-side compositor.

Two modes, one pipeline: Environment and Avatar share the same Scope pipeline. Mode switching triggers a quick reconnect with the destination mode's parameters — the compositor holds the last frame while the new session starts, a one-shot cache break clears the previous mode's latent state, and the visuals jump to the new mode. No loading bar. No morph. SLERP handles transitions within a mode: theme and scene changes blend over a consistent 4-step cadence — fast enough to feel responsive, slow enough to avoid hard cuts.
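For reference, a minimal SLERP over latent vectors; the vector layout and the four-step schedule wiring are assumptions based on the cadence described above:

```typescript
// Spherical linear interpolation between two latent vectors a and b.
// Intermediate states stay on the hypersphere, so the blend passes through
// plausible latents instead of a washed-out linear average.
const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));

function slerp(a: number[], b: number[], t: number): number[] {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const cos = Math.min(1, Math.max(-1, dot / (norm(a) * norm(b))));
  const omega = Math.acos(cos);
  if (omega < 1e-6) return a.slice(); // nearly parallel: nothing to blend
  const sinOmega = Math.sin(omega);
  const wa = Math.sin((1 - t) * omega) / sinOmega;
  const wb = Math.sin(t * omega) / sinOmega;
  return a.map((ai, i) => wa * ai + wb * b[i]);
}

// Four-step cadence: blend over t = 0.25, 0.5, 0.75, 1.0 on successive frames.
declare const prevState: number[]; // outgoing theme's latent (assumed shape)
declare const nextState: number[]; // incoming theme's latent
const steps = [1, 2, 3, 4].map((k) => slerp(prevState, nextState, k / 4));
```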

Latent cache: Scope maintains the deep mathematical representation of the previous generation. Each new frame uses that cached state as its seed. This is why the world feels continuous rather than a flickering slideshow of disconnected images. A time-based sinusoidal drift prevents the cache from converging to a static frame — visuals keep evolving even during quiet passages.
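One way to express that drift, assuming a simple amplitude-and-period parameterization; the constants are illustrative:

```typescript
// Hypothetical drift: a slow sinusoid nudges the generation strength over
// time so quiet passages keep evolving instead of freezing on one frame.
// Amplitude and period are illustrative, not Studio's actual constants.
function driftOffset(timeMs: number, amplitude = 0.05, periodMs = 8000): number {
  return amplitude * Math.sin((2 * Math.PI * timeMs) / periodMs);
}

declare const baseStrength: number; // theme's resting evolution strength
const strength = baseStrength + driftOffset(performance.now());
```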

Session variety: Each session generates a random seed at pipeline load. Same theme, same track — different visual journey. Stable within one session, unique across sessions.

Recording: The compositor captures the Scope video stream into a Canvas element, applies overlay layers, and exposes the result as a capturable stream. Recording happens in the browser — WebM export, no external capture tools needed.
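The capture path maps onto standard browser APIs. A sketch, assuming a 30fps canvas stream and default WebM codecs:

```typescript
// Sketch of in-browser recording from the compositor's canvas.
function recordCanvas(canvas: HTMLCanvasElement): MediaRecorder {
  const stream = canvas.captureStream(30); // 30fps is an assumption
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    // Assemble the WebM file and hand it to the user as a download.
    const blob = new Blob(chunks, { type: "video/webm" });
    const url = URL.createObjectURL(blob);
    const a = document.createElement("a");
    a.href = url;
    a.download = "metadj-session.webm";
    a.click();
    URL.revokeObjectURL(url);
  };
  recorder.start();
  return recorder;
}
```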

Performance shortcuts: Space for play/pause, 1-9 for instant theme switching, F for fullscreen, arrows to cycle themes — all guarded so they don't fire when typing in a prompt field. Instrument-level control during live sets.
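The guard is the interesting part. A sketch of how shortcuts can be suppressed while a prompt field has focus (the handler names are stand-ins, not Studio's actual functions):

```typescript
// Ignore shortcuts while the user is typing in any text field.
function isTyping(target: EventTarget | null): boolean {
  return (
    target instanceof HTMLInputElement ||
    target instanceof HTMLTextAreaElement ||
    (target instanceof HTMLElement && target.isContentEditable)
  );
}

window.addEventListener("keydown", (e) => {
  if (isTyping(e.target)) return; // typing a prompt: let keys through
  if (e.code === "Space") {
    e.preventDefault(); // stop the page from scrolling
    togglePlayback();
  } else if (e.key >= "1" && e.key <= "9") {
    switchTheme(Number(e.key) - 1);
  } else if (e.key.toLowerCase() === "f") {
    toggleFullscreen();
  } else if (e.key === "ArrowRight" || e.key === "ArrowLeft") {
    cycleTheme(e.key === "ArrowRight" ? 1 : -1);
  }
});

// Hypothetical handlers standing in for Studio's actual controls.
declare function togglePlayback(): void;
declare function switchTheme(index: number): void;
declare function toggleFullscreen(): void;
declare function cycleTheme(direction: 1 | -1): void;
```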

Origin Story

Studio started as two separate apps — Soundscape for audio-reactive visuals, Avatar for AI character generation. Both ran on Scope. Both worked. But switching between them during a live set meant disconnecting from one session and connecting to another. The gap broke the flow.

What if both modes shared one session and I could crossfade between them the way I crossfade between tracks?

The 16 environment themes weren't generated from a template — each one was calibrated by ear through real performances. Energy thresholds, beat response curves, prompt color palettes — all tuned to how the music actually feels in each world, not how it measures on a graph. The themes that survived earned their place through repetition.

I built this because I wanted to see my music. Not a random slideshow. Not disconnected images refreshing on a timer. Visuals that breathe with the beat, settle when the music breathes, explode when the drop hits. Studio is the instrument panel for that kind of conducting.

The whole project was built through rapid AI-assisted iteration: concept, prototype, perform, refine. Tight loops between artistic direction and software systems. The approach is the same one I use for everything — teach AI my taste, then scale from there.

Links and Ecosystem

MetaDJ Studio is built on Daydream Scope — part of Daydream's open-source ecosystem for real-time AI video generation.

Follow the build: @metadjai on X
