What is HBase?
Who uses HBase?
Here are some stack decisions, common use cases and reviews by companies and developers who chose HBase in their tech stack.
Hi, I'm building a machine learning pipelines to store image bytes and image vectors in the backend.
So, when users query for the random access image data (key), we return the image bytes and perform machine learning model operations on it.
I'm currently considering going with Amazon S3 (in the future, maybe add Redis caching layer) as the backend system to store the information (s3 buckets with sharded prefixes).
As the latency of S3 is 100-200ms (get/put) and it has a high throughput of 3500 puts/sec and 5500 gets/sec for a given bucker/prefix. In the future I need to reduce the latency, I can add Redis cache.
Also, s3 costs are way fewer than HBase (on Amazon EC2 instances with 3x replication factor)
I have not personally used HBase before, so can someone help me if I'm making the right choice here? I'm not aware of Hbase latencies and I have learned that the MOB feature on Hbase has to be turned on if we have store image bytes on of the column families as the avg image bytes are 240Kb.
I am researching different querying solutions to handle ~1 trillion records of data (in the realm of a petabyte). The data is mostly textual. I have identified a few options: Milvus, HBase, RocksDB, and Elasticsearch. I was wondering if there is a good way to compare the performance of these options (or if anyone has already done something like this). I want to be able to compare the speed of ingesting and querying textual data from these tools. Does anyone have information on this or know where I can find some? Thanks in advance!