Amit Mor

Software Architect at Payoneer
AWS Lambda

I have a different experience with processing video files, which I'll describe below. It might be helpful, or at least make you think a bit differently about the problem.

What I did (part of it was a mistake): to increase parallelism at the most time-consuming step, which was the video upload, I used a custom command-line tool written in Python to split the input videos into much smaller chunks (without losing their ordering, just by labeling the file names with timestamps). The tool then uploaded the chunks to S3. That triggered a few Lambdas, each of which first pulled a chunk and processed it with ffmpeg, later called Rekognition, and later used AWS Elemental MediaConvert to rebuild the full-length video. The Lambdas were the mistake: at that time local Lambda storage was limited to 512MB, so lots of chunks and lots of Lambdas had to be in place, and Lambdas are hell to debug.

Instead, I would use some sort of ECS deployment where processing is triggered by an S3 event, and scale the number of Fargate nodes according to the number of chunks/videos. Each processor then pulls its video (as a file, not a stream) to its local storage (local EBS drive) and works on it. I fail to understand why you are trying to stream videos that are basically static files. Or is it that putting the files on S3 is a current limitation you're trying to remove (because your input videos are 'live' and streaming)?
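
To make the chunk labeling and the scaling rule above more concrete, here is a minimal Python sketch. It is not the original tool: the key layout, the function names (`chunk_key`, `plan_chunks`, `desired_worker_count`) and the `chunks_per_worker` / `max_workers` parameters are all assumptions made for illustration, and the actual splitting (ffmpeg) and uploading (S3) are left out.

```python
import math
from typing import List


def chunk_key(video_id: str, start_seconds: float, prefix: str = "chunks") -> str:
    """Build an object key whose lexicographic order matches playback order.

    The key layout is a hypothetical convention, not the original tool's.
    """
    # Zero-padded milliseconds make plain string sorting agree with time order.
    start_ms = int(round(start_seconds * 1000))
    return f"{prefix}/{video_id}/{start_ms:010d}.mp4"


def plan_chunks(video_id: str, duration_seconds: float, chunk_seconds: float) -> List[str]:
    """Return the ordered keys for every chunk of one video."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    count = math.ceil(duration_seconds / chunk_seconds)
    return [chunk_key(video_id, i * chunk_seconds) for i in range(count)]


def desired_worker_count(pending_chunks: int, chunks_per_worker: int, max_workers: int) -> int:
    """Scale workers with the backlog, capped at max_workers (assumed policy)."""
    if pending_chunks <= 0:
        return 0
    return min(max_workers, math.ceil(pending_chunks / chunks_per_worker))
```

For example, `plan_chunks("v1", 25, 10)` yields three keys that sort in playback order, so whatever rebuilds the full-length video can recover the ordering from a plain sorted listing. `desired_worker_count` is one simple way to tie the number of processors to the backlog; a real deployment would feed a number like this into whatever scaling mechanism it uses.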

6 upvotes·10.8K views

Definitely Python. Lots of libraries, dead simple syntax, and lots of code examples and reference projects. Elixir is a functional language, and its concepts take time to grasp. Go is great, with simple syntax and a performant runtime, but stricter because it is statically typed. For quick coding, nothing beats Python. As you come from .NET, I'd also consider a similar approach: Java with Spring Boot, which makes coding web servers in Java faster and much more fun.

5 upvotes·1 comment·195.7K views
Vitor Bacelar
September 1st 2021 at 1:50PM

Thanks! I'll try Python a little, and I think the libraries and code examples will definitely help.

I think something is missing here, and you should consider answering it for yourself. You are building a couple of services. Why are you considering an event-sourcing architecture using message brokers such as the above? Won't a simple REST-based architecture suffice? Read about CQRS and the problems it entails (state vs command impedance, for example). Do you need Pub/Sub or Push/Pull? Is queuing of messages enough, or would you need querying or filtering of messages before consumption? Also, someone would have to manage these brokers (unless you use a managed, cloud-provider-based solution), automate their deployment, and take care of backups, clustering if needed, disaster recovery, etc.

In terms of manageability/devops, I have good past experience with Kafka and Redis, not so much with RabbitMQ. Both are very performant. But also note that Redis is not a pure message broker (at the time of writing) but more of a general-purpose in-memory key-value store. Kafka nowadays is much more than a distributed message broker. Long story short: to my taste, you should go with a minimalistic approach and try to avoid either of them if you can, especially if your architecture does not fall nicely into event sourcing. If you can't avoid them, I'd examine Kafka. If you need more capabilities, then I'd consider Redis and use it for all sorts of other things, such as a cache.
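
The Pub/Sub versus Push/Pull question above can be illustrated with a small in-memory Python sketch. It uses no real broker; `PubSubTopic`, `WorkQueue` and `demo` are hypothetical names, and the only point is the difference in delivery semantics: fan-out to every subscriber versus handing each message to exactly one consumer.

```python
from collections import deque


class PubSubTopic:
    """Fan-out: every subscriber receives every published message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, message):
        for handler in self._subscribers:
            handler(message)


class WorkQueue:
    """Push/pull: each message is handed to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def push(self, message):
        self._messages.append(message)

    def pull(self):
        # Returns None when there is nothing left to consume.
        return self._messages.popleft() if self._messages else None


def demo():
    # Two independent subscribers both see the same event.
    billing, audit = [], []
    topic = PubSubTopic()
    topic.subscribe(billing.append)
    topic.subscribe(audit.append)
    topic.publish("order-created")

    # Two workers compete for jobs; each job goes to only one of them.
    queue = WorkQueue()
    queue.push("job-1")
    queue.push("job-2")
    worker_a = queue.pull()
    worker_b = queue.pull()
    return billing, audit, worker_a, worker_b
```

If every service needs to react to every event, the first shape applies; if the goal is only to spread work across identical workers, the second is enough, and it is usually the simpler one to operate.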

3 upvotes·773.7K views

As others mentioned, the problem domain is centered on data. In my experience, data means strongly typed entities. It may still be good to start off with a dynamic language such as Python (with Django) just to build a prototype, but once the models have proved valid I'd move to a statically typed language such as Java or Go (I prefer Go, but that's a whole different conversation), since you get compile-time guarantees of type safety.

An alternative (or addition) to all of the above is the use of 'strong protocols', such as Protocol Buffers, Avro, Thrift and the like. In this case you get type safety and stability between communicating backend services, while remaining free to choose, and later change, the language of each backend service. That is to say, your problem is not really about choosing a programming language but about a much more profound understanding of what matters for the business to be created and to be valuable.
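
As a rough illustration of what a 'strong protocol' buys you at a service boundary, here is a hand-rolled Python stand-in. It is not Protocol Buffers, Avro or Thrift, which rely on schema files and generated or schema-driven code; the `OrderCreated` message, its fields and the `decode` helper are hypothetical and only show the idea of rejecting payloads that do not match a declared schema.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical message schema shared by two backend services."""
    order_id: str
    amount_cents: int
    currency: str


def decode(cls, raw: str):
    """Parse JSON and reject payloads that do not match the declared schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError("payload must be a JSON object")
    expected = {f.name: f.type for f in fields(cls)}
    if set(data) != set(expected):
        raise ValueError(f"fields {sorted(data)} do not match schema {sorted(expected)}")
    for name, expected_type in expected.items():
        # Exact type match, so that e.g. a JSON boolean is not accepted as an int.
        if type(data[name]) is not expected_type:
            raise TypeError(f"{name} must be {expected_type.__name__}")
    return cls(**data)
```

A consumer written in any language can apply the same contract, which is the property the schema-based tools provide in a more complete way (including schema evolution, which this sketch ignores).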

As a general note, if you've got commercial aspirations, I don't think you should go with any language for which you'd have a hard time recruiting people who actually know what they're doing. In Israel, that would mean taking Kotlin out of the equation.

3 upvotes·131.1K views