
I picked up Hydra during the pandemic: a browser-based video synth where you write code and it becomes visuals in real time.

https://hydra.ojack.xyz/
What started as a lockdown hobby turned into a practice. Now I perform live visuals regularly, and I wanted to build one instrument that ties together everything I'm into.

Ruby City in San Antonio, Texas - Photo by Jo E. Norris
So I forked Hydra and rebuilt it as a VJ app.
https://github.com/diegochavez-io/hydra-synth_vj.git
The visual pipeline chains several systems: Hydra live coding, Daydream Scope real-time generative AI, GLSL shaders, and cellular automata, all feeding into TouchDesigner as part of a larger live visual network. It works standalone as a performance app, but also slots into this bigger routing setup.


Daydream Scope integration.
Daydream Scope's generative AI runs through a custom LoRA I trained on my own output. I added a record_batch function to my forked Hydra build that auto-captures 5-second clips formatted for Wan 2.1 14B fine-tuning. So the LoRA driving Scope's real-time generation was literally trained on Hydra's visual output, one tool feeding the next.

These are my Hydra presets: audio-reactive, tuned for ambient warmth. This is the raw material the AI learned from.


Wan is an open-source AI video model. You give it a text prompt, and it generates short video clips. The problem is that, out of the box, AI models can produce generic output if your prompt isn't clever enough or if you trigger the wrong word token. It doesn't know what my visuals look like. That's where a LoRA comes in. A LoRA (Low-Rank Adaptation) is a small, focused training layer you add on top of a base model.
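To make the "small, focused" part concrete, here's rough back-of-the-envelope arithmetic. The rank and layer dimensions are illustrative assumptions, not Wan 2.1's actual configuration:

```python
# LoRA replaces a full weight update dW (d_out x d_in) with two thin
# matrices: B (d_out x rank) @ A (rank x d_in). Training only touches B and A.
def lora_params(d_in, d_out, rank):
    return d_out * rank + rank * d_in

full = 5120 * 5120                      # one hypothetical attention projection
adapter = lora_params(5120, 5120, 16)   # rank 16 is an illustrative choice

print(full, adapter, full // adapter)   # the adapter is ~160x smaller
```

That ratio, repeated across every adapted layer, is why the whole thing fits in a 300 MB file instead of re-saving a 14B-parameter model.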
To build the dataset, I added a record_batch function to my forked Hydra build. One click, and it walks through every preset, auto-capturing a 5-second clip of each, already formatted for Wan 2.1 fine-tuning. No screen recording, no manual trimming. The browser renders the visuals and writes the training data in one pass.
23 clips, fed into the model. It learned my visual language: the color palettes and the textures. The result is a 300 MB .safetensors file that can be loaded into Daydream Scope or other Wan 2.1 workflows, such as ComfyUI.
Technical details: Wan 2.1 14B base model. Trained on RunPod A100 80GB via ai-toolkit by Ostris.
Here's the model learning.

Step 250, picking up the color palette.

Step 500, flowing textures, learning the aesthetic.
One LoRA wasn't enough. While the first training run was finishing on a rented A100 (after burning an hour on a 5090 that didn't have enough VRAM), I vibe-coded a set of datamosh scripts with Claude Code on my Mac: stripping I-frames from actual bitstreams, cross-moshing motion vectors between clips, pixel-level glitching. The corrupted clips teach the model my glitch aesthetic:
Datamosh tools: https://github.com/diegochavez-io/datamosh-tools
LoRA training output: https://huggingface.co/diegochavez/pixmo_h_v2
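As a rough sketch of the I-frame-stripping idea (not the actual datamosh-tools code), assume an MPEG-4 Part 2 stream in an AVI container, where video frames sit in `00dc` chunks and the two bits after the VOP start code encode the frame type. Dropping every I-frame after the first forces the decoder to smear P-frame motion over stale pixels. A real tool also has to patch the AVI `idx1` index after removing chunks; this sketch ignores that:

```python
import struct

VOP = b"\x00\x00\x01\xb6"   # MPEG-4 Part 2 VOP start code

def is_iframe(payload):
    # The two bits after the start code give the coding type: 00 = I-frame.
    i = payload.find(VOP)
    if i < 0 or i + 4 >= len(payload):
        return False
    return (payload[i + 4] >> 6) == 0

def strip_iframes(data):
    # Walk '00dc' (compressed video) chunks in an AVI byte stream and drop
    # every I-frame after the first. A robust version would parse the RIFF
    # tree instead of scanning for the chunk tag.
    out = bytearray()
    pos = 0
    kept_first = False
    while True:
        j = data.find(b"00dc", pos)
        if j < 0:
            out += data[pos:]
            break
        size = struct.unpack("<I", data[j + 4:j + 8])[0]
        end = j + 8 + size + (size & 1)          # chunks are word-aligned
        chunk = data[j:end]
        out += data[pos:j]
        if is_iframe(chunk[8:]):
            if not kept_first:                   # keep one I-frame to seed decode
                out += chunk
                kept_first = True
        else:
            out += chunk                         # P-frames pass through
        pos = end
    return bytes(out)
```

Cross-moshing works on the same principle, except the kept P-frames come from a different clip, so one clip's motion vectors drag another clip's pixels around.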

Each training clip gets a .txt file that tells the model what it's looking at. The trigger word goes at the front of every caption so the model learns to associate that token with your visual concept. For this LoRA the trigger is pixmo_h.
Captions don't need to be poetic. They need to be specific. Describe what's actually happening in the clip: the motion, the colors, the artifacts. Generic captions like "abstract colorful video" give the model nothing to latch onto. Here's one of mine:
pixmo_h P-frame drift on neon green base, hot pink and yellow macroblock shrapnel smearing along original motion paths, single-reference melt
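A minimal sketch of the caption pairing, assuming the clip-plus-same-named-.txt layout that ai-toolkit-style trainers expect. The clip name and description here are hypothetical:

```python
import tempfile
from pathlib import Path

TRIGGER = "pixmo_h"  # trigger word, always first in the caption

def write_captions(clip_dir, captions):
    # Pair each training clip with a same-named .txt caption file,
    # trigger word prefixed, so the token binds to the visual concept.
    written = []
    for clip_name, desc in captions.items():
        txt = Path(clip_dir) / (Path(clip_name).stem + ".txt")
        txt.write_text(f"{TRIGGER} {desc}\n")
        written.append(txt)
    return written

# Demo against a temp directory; real clips live next to their captions.
demo_dir = tempfile.mkdtemp()
files = write_captions(demo_dir, {
    "mosh_03.mp4": "P-frame drift on neon green base, hot pink macroblock "
                   "shrapnel smearing along original motion paths",
})
```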
Here's the second LoRA learning:

Step 0 — base model. Generic glitch attempt, no real codec knowledge

Step 500 — starting to learn. Darker, more structured

Step 1000 — real datamosh macroblock patterns emerging, saturated color tearing

Step 1500 — horizontal banding, compression artifacts, codec-native motion

Step 2000 — fully trained. This is the datamosh aesthetic I fed it
In live coding, you write the visuals while the audience watches. There's no timeline, no pre-rendered content. It's closer to playing a synthesizer than editing a video.
I forked Hydra and built the instrument I always wanted around it: presets, audio reactivity (audio drives color and hue), Ableton Link sync, and built-in projection mapping, all running in a browser.
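The audio-to-color mapping is roughly this shape; a Python sketch, since the real app does this inside Hydra in JavaScript, and the knob values here are made up:

```python
import colorsys

def level_to_rgb(level, base_hue=0.08, depth=0.25):
    # Map a normalized audio level (0..1) to an RGB tint by rotating hue.
    # base_hue and depth are illustrative knobs, not values from the app.
    level = min(max(level, 0.0), 1.0)       # clamp noisy input
    hue = (base_hue + depth * level) % 1.0  # louder = further around the wheel
    return colorsys.hsv_to_rgb(hue, 0.8, 1.0)
```

In the app this runs per frame against the FFT, so sustained bass holds a color while transients flick the hue.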

I built a prompt sequencer directly into the launcher. You write a list of prompts and the sequencer cycles through them on bar boundaries via Ableton Link. Every N bars, the next prompt fires to Scope over OSC. The AI's visual theme shifts with the music, hands-free.
The sequencer pre-loads the next prompt into a live edit field before it fires. I can rewrite it mid-performance, hit Send, and that's what Scope gets instead.

My last cohort project was a standalone cellular automata plugin for Scope (Lenia, SmoothLife, MNCA, and more). For this project, I built those engines directly into the Hydra launcher as another content source feeding into the pipeline.
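As a stand-in for those engines, here's the smallest possible automaton step: one classic Game of Life generation on a wrapped grid. The actual plugin implements continuous-state systems like Lenia and SmoothLife, which generalize exactly this neighborhood-sum-then-rule loop:

```python
def life_step(cells, w, h):
    # One Game of Life generation on a toroidal w x h grid of 0/1 cells.
    nxt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            n = sum(cells[(y + dy) % h][(x + dx) % w]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dx, dy) != (0, 0))
            # Birth on exactly 3 neighbors, survival on 2 or 3.
            nxt[y][x] = 1 if n == 3 or (cells[y][x] and n == 2) else 0
    return nxt

# A horizontal blinker: it oscillates with period 2.
grid = [[0] * 5 for _ in range(5)]
grid[2][1] = grid[2][2] = grid[2][3] = 1
```

Swap the binary states for floats, the 3x3 sum for a ring-shaped kernel, and the birth/survival rule for a smooth growth function, and you're most of the way to SmoothLife.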



The engine is Hydra, Olivia Jack's open-source video synth that got me into live coding. That world came with an ethos: share your screens, share your code, learn in public. Everything here is open: the app, the LoRAs, the workflow.