29 Commits

Author SHA1 Message Date
gilesb
fc9597456f Add JAX typography, xector primitives, deferred effect chains, and GPU streaming
All checks were successful
Build and Deploy / build-and-deploy (push) Successful in 1m28s
- Add JAX text rendering with font atlas, styled text placement, and typography primitives
- Add xector (element-wise/reduction) operations library and sexp effects
- Add deferred effect chain fusion for JIT-compiled effect pipelines
- Expand drawing primitives with font management, alignment, shadow, and outline
- Add interpreter support for function-style define and require
- Add GPU persistence mode and hardware decode support to streaming
- Add new sexp effects: cell_pattern, halftone, mosaic, and derived definitions
- Add path registry for asset resolution
- Add integration, primitives, and xector tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 17:41:19 +00:00
gilesb
7411aa74c4 Add JAX backend with frame-varying random keys
Some checks failed
GPU Worker CI/CD / test (push) Has been cancelled
GPU Worker CI/CD / deploy (push) Has been cancelled
- Add sexp_to_jax.py: JAX compiler for S-expression effects
- Use jax.random.fold_in for deterministic but varying random values per frame
- Pass seed from recipe config through to JAX effects
- Fix NVENC detection to do actual encode test
- Add set_random_seed for deterministic Python random

The fold_in approach allows frame_num to be traced (not static)
while still producing different random patterns per frame,
fixing the interference pattern issue.
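
A minimal sketch of the fold_in idea described above. It assumes a base key built from the recipe seed; the function name `frame_noise`, the seed value, and the output shape are illustrative and not taken from sexp_to_jax.py.

```python
import jax
import jax.numpy as jnp


@jax.jit
def frame_noise(base_key, frame_num):
    # frame_num is a traced value here, so a new frame number
    # does not force a recompile of the jitted function.
    frame_key = jax.random.fold_in(base_key, frame_num)
    return jax.random.uniform(frame_key, (4,))


base_key = jax.random.PRNGKey(42)  # hypothetical seed from the recipe config
a = frame_noise(base_key, 0)
b = frame_noise(base_key, 1)
a_again = frame_noise(base_key, 0)  # same key and frame: same values
```

Folding the frame number into one base key keeps the output reproducible for a given seed while still giving each frame its own random stream.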

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:07:02 +00:00
giles
9a8a701492 Fix GPU encoding black frames and improve debug logging
- Add CUDA sync before encoding to ensure RGB->NV12 kernel completes
- Add debug logging for frame data validation (sum check)
- Handle GPUFrame objects in GPUHLSOutput.write()
- Fix cv2.resize for CuPy arrays (use cupyx.scipy.ndimage.zoom)
- Fix fused pipeline parameter ordering (geometric first, color second)
- Add raindrop-style ripple with random position/freq/decay/amp
- Generate final VOD playlist with #EXT-X-ENDLIST
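
The final-playlist step in the last bullet might look roughly like the sketch below. The function name `build_vod_playlist` and the segment file names are hypothetical; only the HLS tags themselves come from the HLS specification.

```python
import math


def build_vod_playlist(segments):
    """Build a VOD playlist from (uri, duration_seconds) tuples."""
    # The target duration must be an integer no smaller than any segment.
    target = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    # ENDLIST tells players that no further segments will be appended.
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


playlist = build_vod_playlist(
    [("segment_00000.ts", 4.0), ("segment_00001.ts", 2.5)]
)
```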

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 16:33:12 +00:00
giles
514ee89cca Add deterministic debug logging to fused pipeline
2026-02-04 12:17:17 +00:00
giles
0a6dc0099b Add debug logging to fused pipeline
2026-02-04 12:13:39 +00:00
giles
180d6a874e Fix fallback path to read ripple_amplitude from dynamic_params
The Python fallback path was reading amplitude directly from effect dict
instead of checking dynamic_params first like the CUDA kernel path does.
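
The lookup order described above can be sketched as follows. This assumes `dynamic_params` is a separate mapping passed alongside the effect dict; the helper name `resolve_param` is hypothetical.

```python
def resolve_param(effect, dynamic_params, name, default=None):
    """Prefer a per-frame dynamic value, then fall back to the static effect dict."""
    if dynamic_params and name in dynamic_params:
        return dynamic_params[name]
    return effect.get(name, default)


effect = {"ripple_amplitude": 10}
static_value = resolve_param(effect, {}, "ripple_amplitude")
dynamic_value = resolve_param(effect, {"ripple_amplitude": 25}, "ripple_amplitude")
```

Routing both the kernel path and the fallback path through one helper like this would keep the two from drifting apart again.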

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 12:11:27 +00:00
giles
4b0f1b0bcd Return raw array from fused-pipeline fallback
Downstream code expects arrays with .flags attribute, not GPUFrame.
Extract the underlying gpu/cpu array before returning.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 11:42:20 +00:00
giles
9583ecb81a Fix ripple parameter names in fused-pipeline fallback
Use cx/cy instead of center_x/center_y to match gpu_ripple signature.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 11:40:57 +00:00
giles
6ee8d72d24 Fix GPUFrame wrapping in fused-pipeline fallback
The fallback path was passing raw numpy/cupy arrays to GPU functions
that expect GPUFrame objects with .cpu property.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 11:39:34 +00:00
giles
ed617fcdd6 Fix lazy audio path resolution for GPU streaming
Audio playback path was being resolved during parsing when database
may not be ready, causing fallback to non-existent path. Now resolves
lazily when stream starts, matching how audio analyzer works.
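
A rough sketch of lazy resolution as described above. The class `LazyAudioPath`, the `lookup` function, and the in-memory `registry` standing in for the database are all hypothetical.

```python
class LazyAudioPath:
    """Defer path lookup until the first call to get(), then cache the result."""

    def __init__(self, resolver, name):
        self._resolver = resolver
        self._name = name
        self._path = None

    def get(self):
        if self._path is None:
            self._path = self._resolver(self._name)
        return self._path


registry = {}  # stand-in for a database that is not ready at parse time
calls = []


def lookup(name):
    calls.append(name)
    return registry[name]


audio = LazyAudioPath(lookup, "track")        # parse time: nothing is resolved yet
registry["track"] = "/data/audio/track.mp3"   # the database becomes ready later
path = audio.get()                            # stream start: resolved now
```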

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 11:32:04 +00:00
giles
70530e5c92 Add GPU image primitives (gpu-make-image, gpu-gradient)
2026-02-04 10:05:09 +00:00
giles
e4349ba501 Add autonomous-pipeline primitive for zero-Python hot path
2026-02-04 10:02:40 +00:00
giles
1442216a15 Handle Keyword dict keys in fused-pipeline primitive
2026-02-04 09:53:28 +00:00
giles
2d20a6f452 Add fused-pipeline primitive and test for compiled CUDA kernels
2026-02-04 09:51:56 +00:00
giles
ad1d7893f8 Integrate fast CUDA kernels for GPU effects pipeline
Replace slow scipy.ndimage operations with custom CUDA kernels:
- gpu_rotate: AFFINE_WARP_KERNEL (< 1ms vs 20ms for scipy)
- gpu_blend: BLEND_KERNEL for fast alpha blending
- gpu_brightness/contrast: BRIGHTNESS_CONTRAST_KERNEL
- Add gpu_zoom, gpu_hue_shift, gpu_invert, gpu_ripple

Preserve GPU arrays through pipeline:
- Updated _maybe_to_numpy() to keep CuPy arrays for GPU primitives
- Primitives detect CuPy arrays via __cuda_array_interface__
- No unnecessary CPU round-trips between operations
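
Detection via `__cuda_array_interface__` can be sketched without a GPU. `FakeDeviceArray` below is a stand-in that only mimics the attribute CuPy arrays expose, and `is_gpu_array` is a hypothetical helper name.

```python
def is_gpu_array(obj):
    """Return True for objects that advertise the CUDA array interface."""
    return hasattr(obj, "__cuda_array_interface__")


class FakeDeviceArray:
    # Stand-in for a CuPy array; the dict only mimics the protocol's shape.
    __cuda_array_interface__ = {
        "shape": (2, 2),
        "typestr": "<f4",
        "data": (0, False),
        "version": 3,
    }


on_gpu = is_gpu_array(FakeDeviceArray())
on_cpu = is_gpu_array([1, 2, 3])
```

Checking for the attribute rather than importing CuPy lets the same primitive module load on CPU-only workers.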

New jit_compiler.py contains all CUDA kernels with FastGPUOps
class using ping-pong buffer strategy for efficient in-place ops.
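
The ping-pong strategy mentioned above, sketched on plain CPU bytearrays for illustration. The class name `PingPong` and both operations are hypothetical; the real FastGPUOps class works on GPU buffers and may differ.

```python
class PingPong:
    """Two preallocated buffers; each op reads one, writes the other, then swaps."""

    def __init__(self, size):
        self._buffers = [bytearray(size), bytearray(size)]
        self._src = 0

    def load(self, data):
        self._buffers[self._src][:] = data

    def apply(self, op):
        src = self._buffers[self._src]
        dst = self._buffers[1 - self._src]
        op(src, dst)
        self._src = 1 - self._src  # the output becomes the next input

    @property
    def current(self):
        return self._buffers[self._src]


def invert(src, dst):
    for i, value in enumerate(src):
        dst[i] = 255 - value


def brighten(src, dst):
    for i, value in enumerate(src):
        dst[i] = min(255, value + 10)


pipeline = PingPong(3)
pipeline.load(bytes([0, 100, 250]))
pipeline.apply(invert)
pipeline.apply(brighten)
result = list(pipeline.current)
```

Because the two buffers are allocated once and reused, a chain of operations needs no allocation per step.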

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:53:46 +00:00
giles
9bdad268a5 Fix DLPack: use frame.to_dlpack() for decord→CuPy zero-copy
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:10:18 +00:00
giles
1cb9c3ac8a Add DLPack debug logging to diagnose zero-copy
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:06:19 +00:00
giles
41adf058bd Build decord from source with CUDA for GPU video decode
- Build decord with -DUSE_CUDA=ON for true NVDEC hardware decode
- Use DLPack for zero-copy transfer from decord to CuPy
- Frames stay on GPU throughout: decode -> process -> encode

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 01:50:14 +00:00
giles
b7e3827fa2 Use PyNvCodec for true zero-copy GPU video decode
Replace decord (CPU-only pip package) with PyNvCodec which provides
direct NVDEC access. Frames decode straight to GPU memory without
any CPU transfer, eliminating the memory bandwidth bottleneck.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 01:47:03 +00:00
giles
771fb8cebc Add decord for GPU-native video decode
- Install decord in GPU Dockerfile for hardware video decode
- Update GPUVideoSource to use decord with GPU context
- Decord decodes on GPU via NVDEC, avoiding CPU memory copies
- Falls back to FFmpeg pipe if decord unavailable
- Enable STREAMING_GPU_PERSIST=1 for full GPU pipeline

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 01:17:22 +00:00
giles
fe6730ce72 Add dev infrastructure improvements
- Central config with logging on startup
- Hot reload support for GPU worker (docker-compose.gpu-dev.yml)
- Quick deploy script (scripts/gpu-dev-deploy.sh)
- GPU/CPU frame compatibility tests
- CI/CD pipeline for GPU worker (.gitea/workflows/gpu-worker.yml)
- Standardize GPU_PERSIST default to 0 across all modules

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:56:40 +00:00
giles
92eeb58c71 Add GPU frame conversion in color_ops
All color_ops primitives now auto-convert GPU frames to numpy,
fixing compatibility with geometry_gpu primitives.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:38:10 +00:00
giles
2c1728c6ce Disable GPU persistence by default
GPU persistence returns CuPy arrays but most primitives expect numpy.
Disable until all primitives support GPU frames.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:24:45 +00:00
giles
6e0ee65e40 Fix streaming_gpu.py to include CPU primitives
streaming_gpu.py was being loaded on GPU nodes but had no PRIMITIVES dict,
causing audio-beat, audio-energy etc. to be missing. Now imports and
includes all primitives from the CPU streaming.py module.
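
One way the fix described above could look, as a sketch only. The placeholder functions and the `gpu-example` entry are hypothetical; the real modules define their own primitives.

```python
def audio_beat(state):
    return False  # placeholder body for illustration


def audio_energy(state):
    return 0.0  # placeholder body for illustration


def gpu_example(frame):
    return frame  # placeholder GPU-specific primitive


# Stand-in for the PRIMITIVES dict exported by the CPU streaming module.
CPU_PRIMITIVES = {"audio-beat": audio_beat, "audio-energy": audio_energy}

# GPU module: start from every CPU primitive, then add or override GPU ones.
PRIMITIVES = {**CPU_PRIMITIVES, "gpu-example": gpu_example}
```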

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:20:23 +00:00
giles
3116a70c3e Fix IPFS upload: sync instead of background task
The background IPFS upload task was running on workers that don't have
the file locally, causing uploads to fail silently. Now uploads go to
IPFS synchronously so the IPFS CID is available immediately.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:17:22 +00:00
giles
86830019ad Add IPFS HLS streaming and GPU optimizations
- Add IPFSHLSOutput class that uploads segments to IPFS as they're created
- Update streaming task to use IPFS HLS output for distributed streaming
- Add /ipfs-stream endpoint to get IPFS playlist URL
- Update /stream endpoint to redirect to IPFS when available
- Add GPU persistence mode (STREAMING_GPU_PERSIST=1) to keep frames on GPU
- Add hardware video decoding (NVDEC) support for faster video processing
- Add GPU-accelerated primitive libraries: blending_gpu, color_ops_gpu, geometry_gpu
- Add streaming_gpu module with GPUFrame class for tracking CPU/GPU data location
- Add Dockerfile.gpu for building GPU-enabled worker image

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 20:23:16 +00:00
giles
82599eff1e Add fallback to format duration and debug logging for VideoSource
2026-02-03 00:36:36 +00:00
giles
d20eef76ad Fix completed runs not appearing in list + add purge-failed endpoint
- Update save_run_cache to also update actor_id, recipe, inputs on conflict
- Add logging for actor_id when saving runs to run_cache
- Add admin endpoint DELETE /runs/admin/purge-failed to delete all failed runs
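
The on-conflict update in the first bullet can be illustrated with an upsert. This sketch uses an in-memory SQLite database and a hypothetical `run_cache` schema keyed on `run_id`; the project's actual database, schema, and function signature are not shown in the commit and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_cache ("
    "run_id TEXT PRIMARY KEY, actor_id TEXT, recipe TEXT, inputs TEXT)"
)


def save_run_cache(conn, run_id, actor_id, recipe, inputs):
    # On a duplicate run_id, refresh the fields instead of keeping stale values.
    conn.execute(
        """
        INSERT INTO run_cache (run_id, actor_id, recipe, inputs)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(run_id) DO UPDATE SET
            actor_id = excluded.actor_id,
            recipe = excluded.recipe,
            inputs = excluded.inputs
        """,
        (run_id, actor_id, recipe, inputs),
    )
    conn.commit()


save_run_cache(conn, "run-1", None, "recipe-a", "[]")
save_run_cache(conn, "run-1", "actor-42", "recipe-a", '["cid-1"]')
rows = conn.execute(
    "SELECT run_id, actor_id, recipe, inputs FROM run_cache"
).fetchall()
```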

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 23:24:39 +00:00
giles
bb458aa924 Replace batch DAG system with streaming architecture
- Remove legacy_tasks.py, hybrid_state.py, render.py
- Remove old task modules (analyze, execute, execute_sexp, orchestrate)
- Add streaming interpreter from test repo
- Add sexp_effects with primitives and video effects
- Add streaming Celery task with CID-based asset resolution
- Support both CID and friendly name references for assets
- Add .dockerignore to prevent local clones from conflicting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 19:10:11 +00:00