Commit Graph

22 Commits

Author SHA1 Message Date
giles
ed617fcdd6 Fix lazy audio path resolution for GPU streaming
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Audio playback path was being resolved during parsing when database
may not be ready, causing fallback to non-existent path. Now resolves
lazily when stream starts, matching how audio analyzer works.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 11:32:04 +00:00
giles
ef3638d3cf Make HLS segment uploads async to prevent stuttering
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
2026-02-04 10:25:56 +00:00
giles
6e20d19a23 Fix float literal syntax in autonomous kernel
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
2026-02-04 10:01:43 +00:00
giles
e64ca9fe3a Add autonomous CUDA kernel that computes all params on GPU
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
2026-02-04 10:01:08 +00:00
giles
234fbdbee2 Fix primitive_lib_dir path resolution for sexp files in app root
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
2026-02-04 09:54:07 +00:00
giles
8b9309a90b Fix f-string brace escaping in ripple effect CUDA code
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
2026-02-04 09:49:52 +00:00
giles
3b964ba18d Add sexp to CUDA kernel compiler for fused pipeline
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 09:46:40 +00:00
giles
4d95ec5a32 Add profiling to stream interpreter to find bottleneck
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 09:32:53 +00:00
giles
ad1d7893f8 Integrate fast CUDA kernels for GPU effects pipeline
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Replace slow scipy.ndimage operations with custom CUDA kernels:
- gpu_rotate: AFFINE_WARP_KERNEL (< 1ms vs 20ms for scipy)
- gpu_blend: BLEND_KERNEL for fast alpha blending
- gpu_brightness/contrast: BRIGHTNESS_CONTRAST_KERNEL
- Add gpu_zoom, gpu_hue_shift, gpu_invert, gpu_ripple

Preserve GPU arrays through pipeline:
- Updated _maybe_to_numpy() to keep CuPy arrays for GPU primitives
- Primitives detect CuPy arrays via __cuda_array_interface__
- No unnecessary CPU round-trips between operations

New jit_compiler.py contains all CUDA kernels with FastGPUOps
class using ping-pong buffer strategy for efficient in-place ops.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:53:46 +00:00
giles
75f9d8fb11 Configure GPU encoder for low-latency (no B-frames)
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:40:07 +00:00
giles
b96e8ca4d2 Add playlist_url property to GPUHLSOutput
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:37:21 +00:00
giles
8051ef9ba9 Fix GPUHLSOutput method name (write not write_frame)
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:34:51 +00:00
giles
3adf927ca1 Add zero-copy GPU encoding pipeline
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
- New GPUHLSOutput class for direct GPU-to-NVENC encoding
- RGB→NV12 conversion via CUDA kernel (no CPU transfer)
- Uses PyNvVideoCodec for zero-copy GPU encoding
- ~220fps vs ~4fps with CPU pipe approach
- Automatically used when PyNvVideoCodec is available

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 02:32:43 +00:00
giles
9151d2c2a8 Update IPFS playlist CID in database during streaming for live HLS
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
- Add on_playlist_update callback to IPFSHLSOutput
- Pass callback through StreamInterpreter to output
- Update database with playlist CID as segments are created
- Enables live HLS redirect to IPFS before rendering completes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 00:23:49 +00:00
giles
b49d109a51 Fix HLS audio duration to match video with -shortest flag
Some checks are pending
GPU Worker CI/CD / test (push) Waiting to run
GPU Worker CI/CD / deploy (push) Blocked by required conditions
HLS outputs were including full audio track instead of trimming
to match video duration, causing video to freeze while audio
continued playing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 22:13:08 +00:00
giles
7009840712 Add auto GPU->CPU conversion at interpreter level
Convert GPU frames/CuPy arrays to numpy before calling primitives.
This fixes all CPU primitives without modifying each one individually.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:40:37 +00:00
giles
3116a70c3e Fix IPFS upload: sync instead of background task
The background IPFS upload task was running on workers that don't have
the file locally, causing uploads to fail silently. Now uploads go to
IPFS synchronously so the IPFS CID is available immediately.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:17:22 +00:00
giles
86830019ad Add IPFS HLS streaming and GPU optimizations
- Add IPFSHLSOutput class that uploads segments to IPFS as they're created
- Update streaming task to use IPFS HLS output for distributed streaming
- Add /ipfs-stream endpoint to get IPFS playlist URL
- Update /stream endpoint to redirect to IPFS when available
- Add GPU persistence mode (STREAMING_GPU_PERSIST=1) to keep frames on GPU
- Add hardware video decoding (NVDEC) support for faster video processing
- Add GPU-accelerated primitive libraries: blending_gpu, color_ops_gpu, geometry_gpu
- Add streaming_gpu module with GPUFrame class for tracking CPU/GPU data location
- Add Dockerfile.gpu for building GPU-enabled worker image

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 20:23:16 +00:00
giles
8f4d6d12bc Use fragmented mp4 for streamable output while rendering 2026-02-03 00:41:07 +00:00
giles
3d5a08a7dc Don't overwrite pre-registered primitives in _load_primitives 2026-02-02 23:59:38 +00:00
giles
d20eef76ad Fix completed runs not appearing in list + add purge-failed endpoint
- Update save_run_cache to also update actor_id, recipe, inputs on conflict
- Add logging for actor_id when saving runs to run_cache
- Add admin endpoint DELETE /runs/admin/purge-failed to delete all failed runs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 23:24:39 +00:00
giles
bb458aa924 Replace batch DAG system with streaming architecture
- Remove legacy_tasks.py, hybrid_state.py, render.py
- Remove old task modules (analyze, execute, execute_sexp, orchestrate)
- Add streaming interpreter from test repo
- Add sexp_effects with primitives and video effects
- Add streaming Celery task with CID-based asset resolution
- Support both CID and friendly name references for assets
- Add .dockerignore to prevent local clones from conflicting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 19:10:11 +00:00