celery

Author	SHA1	Message	Date
giles	70530e5c92	Add GPU image primitives (gpu-make-image, gpu-gradient)	2026-02-04 10:05:09 +00:00
giles	e4349ba501	Add autonomous-pipeline primitive for zero-Python hot path	2026-02-04 10:02:40 +00:00
giles	1442216a15	Handle Keyword dict keys in fused-pipeline primitive	2026-02-04 09:53:28 +00:00
giles	2d20a6f452	Add fused-pipeline primitive and test for compiled CUDA kernels	2026-02-04 09:51:56 +00:00
giles	ad1d7893f8	Integrate fast CUDA kernels for GPU effects pipeline Replace slow scipy.ndimage operations with custom CUDA kernels: - gpu_rotate: AFFINE_WARP_KERNEL (< 1ms vs 20ms for scipy) - gpu_blend: BLEND_KERNEL for fast alpha blending - gpu_brightness/contrast: BRIGHTNESS_CONTRAST_KERNEL - Add gpu_zoom, gpu_hue_shift, gpu_invert, gpu_ripple Preserve GPU arrays through pipeline: - Updated _maybe_to_numpy() to keep CuPy arrays for GPU primitives - Primitives detect CuPy arrays via __cuda_array_interface__ - No unnecessary CPU round-trips between operations New jit_compiler.py contains all CUDA kernels with FastGPUOps class using ping-pong buffer strategy for efficient in-place ops. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 02:53:46 +00:00
giles	9bdad268a5	Fix DLPack: use frame.to_dlpack() for decord→CuPy zero-copy Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 02:10:18 +00:00
giles	1cb9c3ac8a	Add DLPack debug logging to diagnose zero-copy Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 02:06:19 +00:00
giles	41adf058bd	Build decord from source with CUDA for GPU video decode - Build decord with -DUSE_CUDA=ON for true NVDEC hardware decode - Use DLPack for zero-copy transfer from decord to CuPy - Frames stay on GPU throughout: decode -> process -> encode Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 01:50:14 +00:00
giles	b7e3827fa2	Use PyNvCodec for true zero-copy GPU video decode Replace decord (CPU-only pip package) with PyNvCodec which provides direct NVDEC access. Frames decode straight to GPU memory without any CPU transfer, eliminating the memory bandwidth bottleneck. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 01:47:03 +00:00
giles	771fb8cebc	Add decord for GPU-native video decode - Install decord in GPU Dockerfile for hardware video decode - Update GPUVideoSource to use decord with GPU context - Decord decodes on GPU via NVDEC, avoiding CPU memory copies - Falls back to FFmpeg pipe if decord unavailable - Enable STREAMING_GPU_PERSIST=1 for full GPU pipeline Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 01:17:22 +00:00
giles	6e0ee65e40	Fix streaming_gpu.py to include CPU primitives streaming_gpu.py was being loaded on GPU nodes but had no PRIMITIVES dict, causing audio-beat, audio-energy etc. to be missing. Now imports and includes all primitives from the CPU streaming.py module. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 21:20:23 +00:00
giles	86830019ad	Add IPFS HLS streaming and GPU optimizations - Add IPFSHLSOutput class that uploads segments to IPFS as they're created - Update streaming task to use IPFS HLS output for distributed streaming - Add /ipfs-stream endpoint to get IPFS playlist URL - Update /stream endpoint to redirect to IPFS when available - Add GPU persistence mode (STREAMING_GPU_PERSIST=1) to keep frames on GPU - Add hardware video decoding (NVDEC) support for faster video processing - Add GPU-accelerated primitive libraries: blending_gpu, color_ops_gpu, geometry_gpu - Add streaming_gpu module with GPUFrame class for tracking CPU/GPU data location - Add Dockerfile.gpu for building GPU-enabled worker image Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 20:23:16 +00:00

12 Commits
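
Commit ad1d7893f8 says primitives detect CuPy arrays via __cuda_array_interface__ and that _maybe_to_numpy() now keeps GPU arrays on the device for GPU primitives. The repository code is not shown here, so the following is only a minimal sketch of that idea. The names is_cuda_array, maybe_to_host, and FakeDeviceArray are hypothetical, and the stand-in class exists only so the sketch runs without a GPU. The one real-library assumption is that a CuPy array exposes get() for a device-to-host copy.

```python
from typing import Any


def is_cuda_array(obj: Any) -> bool:
    """Return True if obj advertises device memory via the CUDA Array Interface."""
    return hasattr(obj, "__cuda_array_interface__")


def maybe_to_host(value: Any, keep_on_gpu: bool) -> Any:
    """Hypothetical helper: copy a GPU array to the host only when required.

    Anything that is not a CUDA array is returned unchanged. A CUDA array is
    returned unchanged when the next primitive can consume it on the device.
    """
    if is_cuda_array(value) and not keep_on_gpu:
        # Assumption: the object follows CuPy's convention of get() -> host copy.
        return value.get()
    return value


class FakeDeviceArray:
    """Stand-in for a GPU array so the sketch runs without CUDA (illustration only)."""

    def __init__(self, data):
        self._data = list(data)

    @property
    def __cuda_array_interface__(self):
        # Placeholder dict; a real interface also describes shape, typestr, and data.
        return {"version": 3}

    def get(self):
        # Mimics a device-to-host copy by returning a fresh list.
        return list(self._data)


frame = FakeDeviceArray([1, 2, 3])
on_device = maybe_to_host(frame, keep_on_gpu=True)   # same object, no copy
on_host = maybe_to_host(frame, keep_on_gpu=False)    # host-side copy
```

Checking for the attribute rather than importing CuPy and using isinstance keeps the check usable on CPU-only nodes where CuPy may not be installed, which seems to fit the split between streaming.py and streaming_gpu.py described in the log.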