We would like to connect about 25 video streams from an Amazon S3 bucket containing video data to endpoints accessible to a Docker image which, when run, will process the input video streams and emit some JSON statistics.
The 25 video streams should be synchronized. Could people share their experiences with a similar scenario and perhaps offer advice about which is better (Wowza, Amazon Kinesis Video Streams) for this kind of problem, or why they chose one technology over the other?
The video streams will be quite long (about 8 hours for each of the 25 camera sources) and will have no audio component. If you have worked on a similar problem, what was your experience with scaling, latency, resource requirements, configuration, etc.?
I have a different kind of experience, with processing video files, which I'll describe below. It might be helpful, or at least make you think a bit differently about the problem.

What I did (part of it was a mistake): the time-consuming step was the video upload, so to increase parallelism there I used a custom command-line tool written in Python to split the input videos into much smaller chunks. The ordering was preserved by labeling each file name with a timestamp. The tool then uploaded the chunks to S3. That triggered a few Lambdas, each of which pulled a chunk, processed it with ffmpeg, then called Rekognition, and finally used AWS Elemental MediaConvert to rebuild the full-length video.

The Lambdas were the mistake. At the time, local Lambda storage was limited to 512 MB, so lots of chunks and lots of Lambdas had to be in place. Lambdas are also hell to debug.

Instead, I would use some sort of ECS deployment where processing is triggered by an S3 event, and scale the number of Fargate nodes with the number of chunks/videos. Each processor then pulls its video (not a stream) to its local storage (local EBS drive) and works on it.

I don't understand why you are trying to stream videos that are basically static files. Or is putting the files on S3 a current limitation that you are trying to remove, because your input videos are actually 'live' and streaming?
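To make the chunk-and-label idea above more concrete, here is a minimal Python sketch. It is not the original tool: the function names, the key layout (`camera/offset.mp4`), the 60-second chunk length, and the chunks-per-task and task-cap numbers are all assumptions chosen for illustration. It only builds the ffmpeg command line and the S3 object keys, and estimates how many processing tasks to start; the actual upload and the ECS/Fargate scaling calls are left out.

```python
import math
from pathlib import PurePosixPath


def build_segment_command(src, out_dir, segment_seconds=60):
    """Build an ffmpeg argv that splits `src` into fixed-length chunks.

    Uses stream copy (no re-encoding), so cuts land on keyframes and the
    real chunk lengths are only approximately `segment_seconds`.
    """
    stem = PurePosixPath(src).stem
    return [
        "ffmpeg", "-i", src,
        "-c", "copy",                    # copy the video stream as-is
        "-an",                           # the sources have no audio track
        "-f", "segment",
        "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1",        # each chunk starts at t=0
        f"{out_dir}/{stem}_%05d.mp4",
    ]


def chunk_key(camera_id, index, segment_seconds=60):
    """Hypothetical S3 key: zero-padded start offset in seconds, so that
    lexicographic order of the keys equals chronological order."""
    start = index * segment_seconds
    return f"{camera_id}/{start:08d}.mp4"


def tasks_needed(num_chunks, chunks_per_task=100, max_tasks=50):
    """Estimate how many processing tasks to run for a backlog of chunks,
    capped at `max_tasks` (both numbers are illustrative assumptions)."""
    return min(max_tasks, math.ceil(num_chunks / chunks_per_task))


if __name__ == "__main__":
    print(" ".join(build_segment_command("cam07.mp4", "/tmp/chunks")))
    print(chunk_key("cam07", 3))
    # 8 hours of 60-second chunks for each of 25 cameras
    print(tasks_needed(25 * 8 * 60))
```

Because the offset in each key is zero-padded, listing a camera's prefix in S3 returns the chunks already in time order, and chunks from different cameras with the same offset cover roughly the same time window, which may help with keeping the 25 sources aligned.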