Relay42

Deliver personalized cross-channel marketing with Tag & Data Management.
@Relay42
relay42.com
Meeuwenlaan 98, 1021 JL Amsterdam, Netherlands

Decisions

Nikola Yovchev

Head of Engineering at Relay42

Since IntelliJ is the de facto standard for writing Java/Kotlin/Scala applications, and at Relay42 we are heavy Java users, every new engineer gets an Ultimate subscription from day one. The gains in productivity and pair-programming speed (especially with the Code With Me feature) from everyone using the same familiar editor are well worth the cost.

Nikola Yovchev

Head of Engineering at Relay42

We've been experiencing performance problems in our Hadoop cluster (8 c5.2xlarge nodes) when processing huge volumes of Cassandra data.

The performance problems affected both speed and reliability (sometimes the MapReduce split calculation would hang due to driver inefficiencies on huge datasets).

That's why, over the past 6 months, we've been making a major effort to migrate our analytics/big data workloads to Databricks with Delta Lake on S3.

So far the results are encouraging: our worst-case scenarios improved tremendously in terms of speed, from 8 hours for a 250 GB payload to just under 30 minutes. Stability is also good, but the real benefit is getting away from the Cassandra/Hadoop pair, which is simply not made for pain-free big data analytics out of the box. The number of projections of the data one would have to make to get Cassandra + Hadoop-friendly analytics was simply not worth the pain. At Relay42 our Cassandra nodes are backed by EBS volumes, which are damn expensive in themselves and also expensive to back up at the scale we run, with so many nodes and so many terabytes of data on each node. This just doesn't scale well financially.

Thus, we are very happy with the cost savings we are making by moving the data from EBS volumes to S3/Delta. Currently we are still running both in parallel, but soon those costly EBS volumes will be shrunk :)
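
For a rough sense of scale, the worst-case speed figures quoted above can be turned into a speedup factor and an average throughput. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 1000 MB) and treats "just under 30 minutes" as exactly 30 minutes, so the new-platform numbers are lower bounds. The helper name `throughput_mb_per_s` is made up for illustration.

```python
# Back-of-the-envelope check of the worst-case figures quoted in the post.
# Assumptions (not from the post): decimal units (1 GB = 1000 MB) and
# exactly 30 minutes for the new runtime, which the post gives as an upper bound.

PAYLOAD_GB = 250
OLD_RUNTIME_H = 8.0   # Hadoop + Cassandra worst case
NEW_RUNTIME_H = 0.5   # Databricks + Delta Lake worst case ("just under 30 minutes")


def throughput_mb_per_s(payload_gb: float, runtime_h: float) -> float:
    """Average throughput in MB/s for a payload processed in runtime_h hours."""
    return payload_gb * 1000 / (runtime_h * 3600)


speedup = OLD_RUNTIME_H / NEW_RUNTIME_H
old_rate = throughput_mb_per_s(PAYLOAD_GB, OLD_RUNTIME_H)
new_rate = throughput_mb_per_s(PAYLOAD_GB, NEW_RUNTIME_H)

print(f"speedup: at least {speedup:.0f}x")
print(f"old throughput: {old_rate:.1f} MB/s")
print(f"new throughput: at least {new_rate:.1f} MB/s")
```

These are averages over the whole job, so they say nothing about where the time went (split calculation, reads, shuffles); they only restate the quoted runtimes in different units.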
