We are building a cloud-based analytics app. Most of the data for the UI flows from SQL Server to Delta Lake, and then from Delta Lake to Azure Cosmos DB as JSON using Databricks, so that the API can send it to the front end. Sometimes, when transforming table rows into JSON, we get larger documents that exceed Cosmos DB's 2 MB document size limit. What is the best solution for replacing Cosmos DB?
Thanks for the input, Ivan Reche. If we store the big documents in a blob container, how can the Python APIs query them and send them to the UI? And if any updates happen in the UI, the API has to write those changes back to the big documents as a copy.
Do you know what the maximum size of one of your documents might be? MongoDB (which you can also use on Azure) allows larger documents (16 MB, I believe). With that said, I ran into this issue when I was first using Cosmos DB, and I wound up rethinking the way I was storing documents. I don't know if this is an option for your scenario, but what I ended up doing was breaking my documents up into smaller subdocuments. A rule of thumb I have come to follow is that if any property is an array (or at least can be an array of length N), make that array simply a list of IDs that point to other documents.
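To make the idea concrete, here is a minimal sketch of that splitting step in plain Python. It is only an illustration under assumptions, not the code I actually used: the function names (`split_document`, `size_in_bytes`), the `parentId` and `type` fields, and the requirement that the parent has an `id` property are all hypothetical, and only arrays of objects are split out.

```python
import json
import uuid

# 2 MB document limit mentioned in the question.
MAX_BYTES = 2 * 1024 * 1024


def size_in_bytes(doc):
    """Size of the document once serialized as UTF-8 JSON."""
    return len(json.dumps(doc).encode("utf-8"))


def split_document(doc):
    """Return (parent, children).

    Every property that is a non-empty array of objects is replaced in the
    parent by a list of IDs; the objects themselves become child documents.
    Assumes the parent document has an "id" property (hypothetical schema).
    """
    parent = {}
    children = []
    for key, value in doc.items():
        is_object_array = (
            isinstance(value, list)
            and value
            and all(isinstance(item, dict) for item in value)
        )
        if not is_object_array:
            parent[key] = value  # scalars and simple arrays stay in place
            continue
        ids = []
        for item in value:
            child = dict(item)  # copy so the input is not mutated
            child.setdefault("id", str(uuid.uuid4()))
            child["parentId"] = doc["id"]  # back-reference to the parent
            child["type"] = key            # which property it came from
            children.append(child)
            ids.append(child["id"])
        parent[key] = ids
    return parent, children


# Example: one report with an array of rows becomes 1 parent + 2 children.
report = {
    "id": "report-1",
    "title": "Sales",
    "rows": [{"id": "r1", "amount": 10}, {"id": "r2", "amount": 20}],
}
parent, children = split_document(report)
```

Each resulting document can then be checked with `size_in_bytes(...) < MAX_BYTES` before it is written, and the API reassembles the full object by fetching the children whose IDs are listed in the parent.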
Aerospike might be one to check out. It can store objects of up to 8 MB and provides much better performance and cost-effectiveness compared with Cosmos DB and MongoDB.