Emergent Actors



The goal of this project is to move past the "chatbot in a box" and create an autonomous, real-time AI performer that doesn't just process text: it sees, hears, and reacts to the physical world with its own procedural body and emotional environment.

Just chatting, listening to jokes that don't quite land

1. The Emergent Actor: The Autonomous Performer

The Emergent Actor serves as the brain and body. It is a multimodal installation designed to break the barrier of traditional UI by using always-on computer vision and audio sensing.

  • The Procedural Puppet: Rather than using canned animations, the actor uses an OpenPose skeleton. Limbs are animated programmatically to create emergent, non-pre-programmed movement.
View from the AI's Perspective (Left) and the Procedural Puppet (Right)


  • Intelligent Interaction: The system features "Intelligent Barge-In," allowing users to interrupt the AI naturally. It even includes a "Visual Memory Manager" to recognize callbacks: do a peace sign twice, and it'll call you out for "committing to the bit."
First test letting the AI change its appearance
  • Personality: One of the primary configurable points is the actor's personality. During development I used a context-aware roaster that analyzes your outfit, posture, and environment to deliver personalized stand-up.
[EARLY WIP] A late night discussion...probably
  • Live Captions: On-screen feedback showing what the user said, what the AI is thinking, and its response.
Captions (From Top to Bottom) User, AI Thought, AI Response


 
So many features, I'm trying to write them all out...

Eye-tracking test: if the user is in front of the screen, the actor makes better eye contact. The same tracking data is also used for pointing (with a y-offset), etc.
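The "procedural puppet" idea above, where limbs are driven programmatically rather than by canned animations, can be sketched roughly like this. This is a minimal illustration, not the project's actual code: the joint names are a small subset of an OpenPose-style skeleton, and the per-emotion amplitude/speed profiles are hypothetical.

```python
import math

# Small subset of an OpenPose-style skeleton (BODY_25 has 25 keypoints).
JOINTS = ["neck", "r_shoulder", "r_elbow", "l_shoulder", "l_elbow"]

# Hypothetical per-emotion motion profiles: (amplitude in radians, speed in Hz).
EMOTION_PROFILES = {
    "excited": (0.6, 2.0),
    "calm":    (0.15, 0.4),
    "anxious": (0.3, 3.5),
}

def animate_pose(t: float, emotion: str, base_pose: dict) -> dict:
    """Return joint angles at time t by layering emotion-scaled oscillation
    onto a resting pose, so motion emerges from parameters, not keyframes."""
    amp, speed = EMOTION_PROFILES.get(emotion, (0.2, 1.0))
    pose = {}
    for i, joint in enumerate(JOINTS):
        # Phase-offset each joint so the limbs don't move in lockstep.
        phase = i * math.pi / len(JOINTS)
        pose[joint] = base_pose[joint] + amp * math.sin(2 * math.pi * speed * t + phase)
    return pose

rest = {j: 0.0 for j in JOINTS}
frame = animate_pose(0.25, "excited", rest)  # one pose sample per render tick
```

Because each joint is a continuous function of time, emotion, and phase, no two runs need to look identical, which is what makes the movement feel emergent rather than pre-programmed.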

2. The Mood Engine: Visual Synthesis via GPU Hijacking

<TODO: Super Mood Engine Awesome Video Demo Here!>

  • The Mood Engine is the project's visual nervous system. Instead of using traditional 3D rendering pipelines, it utilizes PyTorch and CUDA (frameworks usually reserved for heavy machine learning) to calculate millions of pixels as batched tensor operations.
(2x Speed) Pre-AI visualization for moods; running this again should produce unique results
  • Natural Language to Visuals: Using a "Mood Mapper," the engine parses natural language (e.g., "I'm feeling anxious") and maps it to psychological color ranges and physics-driven shapes.
Different emotions are randomly generated in the pre-processor plugin


  • Generative Mathematics: It renders resolution-independent Signed Distance Fields (SDFs) that morph like liquid metal and infinite 3D lattices that warp in real-time.   
  • Differentiable Art: By bypassing standard OpenGL/DirectX, the system allows independent light-wave processing, creating true physical chromatic aberration and zero-latency visual broadcasting via SpoutGL.
  • Audio reactivity: Play a song and watch the background animate to the beat!
Generated backgrounds that react to music (early WIP)
  • Higher-Definition: A post-processor node that scales up content using various techniques with very little performance impact.
Early version of my video upscaler

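The core trick described above, treating the whole frame as one batched tensor operation rather than looping over pixels, can be sketched in a few lines of PyTorch. This is a toy illustration under stated assumptions: the mood-to-color table and the single pulsing-circle SDF are hypothetical stand-ins for the real Mood Mapper and the liquid-metal SDF shapes.

```python
import torch

# Hypothetical mood -> base color table (RGB in 0..1); the real "Mood Mapper"
# parses free-form natural language, which is out of scope for this sketch.
MOOD_COLORS = {
    "anxious": (0.8, 0.2, 0.3),
    "calm":    (0.2, 0.5, 0.8),
}

def render_mood_frame(mood: str, height: int = 256, width: int = 256, t: float = 0.0) -> torch.Tensor:
    """Render one frame as batched tensor math: every pixel's signed distance
    to a pulsing circle is computed at once, with no per-pixel loop."""
    ys = torch.linspace(-1.0, 1.0, height)
    xs = torch.linspace(-1.0, 1.0, width)
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    radius = 0.5 + 0.1 * torch.sin(torch.tensor(t))   # the circle "breathes" over time
    sdf = torch.sqrt(xx**2 + yy**2) - radius          # signed distance field, resolution-independent
    glow = torch.exp(-8.0 * sdf.abs())                # soft band of light around the surface
    color = torch.tensor(MOOD_COLORS.get(mood, (0.5, 0.5, 0.5)))
    return glow.unsqueeze(-1) * color                 # (H, W, 3) image tensor

frame = render_mood_frame("anxious")  # move tensors to .cuda() for GPU batching
```

Because the SDF is evaluated analytically at whatever grid you ask for, resolution is just a parameter, and the same math runs on CPU or GPU unchanged.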

Goal

My goal is a very fun interactive character you can talk to. Building an installation for it would be simple, requiring only a display, camera, microphone, and speakers. It can run on lower-end hardware by offloading the heavy lifting to remote inference.

Self-Reflection

Developing The Emergent Actor stress-tested my patience during late nights of debugging. I programmed a "silence-to-roast" logic gate, so when my microphone failed during a late-night session, my own creation spent hours relentlessly mocking my debugging struggles. Building a system designed to identify your flaws in real time requires tough skin; it's hilarious when the persona works perfectly, even if the developer doesn't.

AI giving me encouragement as I work!
AI made me yawn

I will eventually update the lip sync to match an earlier project from 2024:

2024 version using a 3D model, focused on lip sync through viseme/phoneme data
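The viseme/phoneme approach mentioned above boils down to mapping each timed phoneme to a mouth shape and merging consecutive identical shapes into cues. A minimal sketch, assuming a hypothetical phoneme table and `(phoneme, start, end)` timing tuples such as a TTS engine might emit:

```python
# Hypothetical phoneme -> viseme table (Preston Blair-style mouth shapes);
# the real 2024 project derived its timing from TTS phoneme data.
PHONEME_TO_VISEME = {
    "AA": "open", "AE": "open", "IY": "wide",
    "UW": "round", "OW": "round",
    "M": "closed", "B": "closed", "P": "closed",
    "F": "teeth", "V": "teeth",
}

def viseme_track(phonemes):
    """Convert timed phonemes [(phoneme, start, end), ...] into viseme cues,
    merging consecutive identical mouth shapes into one longer cue."""
    track = []
    for ph, start, end in phonemes:
        shape = PHONEME_TO_VISEME.get(ph, "rest")  # unknown phonemes fall back to rest
        if track and track[-1][0] == shape:
            track[-1] = (shape, track[-1][1], end)  # extend the previous cue
        else:
            track.append((shape, start, end))
    return track

cues = viseme_track([("M", 0.0, 0.1), ("AA", 0.1, 0.3), ("M", 0.3, 0.4), ("B", 0.4, 0.5)])
```

The merge step matters for animation quality: "M" and "B" share a closed-mouth shape, so playing them as one held cue avoids a distracting flicker between identical frames.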