Need advice about which tool to choose?Ask the StackShare community!
MongoDB vs Redis: What are the differences?
Introduction
In this article, we will examine the key differences between MongoDB and Redis. Both MongoDB and Redis are popular NoSQL databases, but they have distinct differences in terms of data storage, data model, and use cases. Let's explore these differences in detail.
Data Storage: MongoDB is a document-oriented database that stores data in a flexible JSON-like format called BSON (Binary JSON). It allows the storage of complex hierarchical data structures with nested arrays and documents. Redis, on the other hand, is an in-memory data structure store that primarily stores data in key-value pairs. It is optimized for extremely fast data access and can persist data to disk if required.
Data Model: MongoDB follows a flexible schema-less data model, which means that fields in a collection can vary from document to document. This allows for easy modifications and updates to the data structure without downtime. Redis, on the other hand, follows a schema-less but simpler data model where data is primarily stored as strings, lists, sets, hashes, and sorted sets. This simplicity makes Redis more suitable for use cases that require caching, session management, and real-time analytics.
Scalability and Performance: MongoDB is known for its horizontal scalability, allowing it to handle large amounts of data and high traffic loads across multiple servers or sharded clusters. It also provides built-in replication for data redundancy and fault tolerance. Redis, on the other hand, excels in performance and is designed to handle millions of small, simple data operations per second due to its in-memory storage mechanism. It can be used as a high-speed cache or as a message broker, making it ideal for use cases that require rapid data retrieval or real-time messaging.
Querying Capabilities: MongoDB supports a rich query language that includes filtering, sorting, and aggregation capabilities. It also allows for full-text search and geospatial queries, making it suitable for complex query requirements. Redis, being primarily a key-value store, has a more limited querying capability, mainly supporting simple operations such as retrieving values by key, sets operations, and basic string manipulation. It does not provide built-in support for complex queries or indexing.
Persistence: MongoDB provides a flexible persistence model where data can be written to disk for durability. It supports various write durability modes, allowing developers to prioritize performance or data safety depending on their needs. Redis, by default, stores data in memory for maximum performance, but it also provides options to persist data to disk using snapshots or append-only logs. However, persistence in Redis is not as durable as in MongoDB, making it more suitable for use cases where data loss is tolerable or can be easily recreated.
Use Cases: MongoDB is commonly used for applications that require complex data modeling, high scalability, and rich querying capabilities. It is a popular choice for content management systems, e-commerce platforms, and real-time analytics. Redis, on the other hand, is often used as a caching layer, session store, or message broker in systems that require high-speed data access and real-time data processing. It is suitable for use cases such as real-time leaderboards, real-time analytics pipelines, and pub/sub messaging systems.
In summary, MongoDB and Redis differ in terms of data storage, data model, scalability, querying capabilities, persistence, and use cases. MongoDB offers a flexible document-oriented data model suitable for complex data structures and advanced querying, while Redis excels in speed and simplicity, making it ideal for caching, session management, and real-time messaging.
I need urgent advice from you all! I am making a web-based food ordering platform which includes 3 different ordering methods (Dine-in using QR code scanning + Take away + Home Delivery) and a table reservation system. We are using React for the front-end, and I need your advice if I should use NestJS or ExpressJS for the backend. And regarding the database, which database should I use, MongoDB or PostgreSQL? Which combination will be better? PS. We want to follow the microservice architecture as scalability, reliability, and usability are the most important Non Functional requirements. Expert advice is needed, please. A load of thanks in advance. Kind Regards, Miqdad
I can't speak for the NestJS vs ExpressJS discussion, but I can given a viewpoint on databases.
The main thing to consider around database choice, is what "shape" the data will be in, and the kind of read/write patterns you expect of that data. The blog example shows up so much for DBMS like MongoDB, because it showcases what NoSQL / document storage is very scalable and performant in: mostly isolated documents with a few views / ways to order them and filter them. In your case, I can imagine a number of "relations" already, which suggest a more traditional SQL solution would work well: You have restaurants, they have maybe a few menus (regular, gluten-free etc), with menu items in, which have different prices over time (25% discount on christmas food just after christmas, 50% off pizzas on wednesdays). Then there's a whole different set of "relations" for people ordering, like showing them past orders, which need to refer to the restaurant etc, and credit card transaction information for refunds etc. That to me suggests PostgreSQL, which will scale quite well if you database design is okay.
PostgreSQL also offers you some extensions, which are just amazing for your use-case. https://postgis.net/ for example will let you query for restaurants based on location, without the big cost that comes from constantly using something like Google Maps API to work out which restaurants are near to someone ordering. Partitioning and window functions will be great for your own use internally too, like answering questions of "What types of takeways perform the best for us, Italian, Mexican?" or in combination with PostGIS, answering questions like "What kind of takeways do we need to market to, to improve our selection?".
While these things can all be implemented in MongoDB, you tend to lose some of the convenience of ACID or have to deal with things like eventual consistency, which requires more thinking on the part of your engineers. PostgreSQL offers decent (if more complex) scalablity and redundancy solutions, and is honestly very well proven and plenty of documentation exists on optimising queries.
Hello, i build microservice systems using Angular And Spring (Java) so i can't help with with ur back end choice, BUT, i definitely advice you to use a Nosql database, thus MongoDB of course or even Cassandra if your looking for extreme scalability with zero point of failure. Anyway, Nosql if much more faster then Sql (in your case Postresql DB). All you wanna do with sql can also be done by nosql (not the opposite of course).I also advice you to use docker containers + kubernetes to orchestrate them, if you need scalability and replication, that way your app can support auto scalability (in case ur users number goes high). Best of luck
About PostgreSQL vs MongoDB: short answer. Both are great. Choose what you like the most. Only if you expect millions of users, I‘ll incline with MongoDB.
Hello, I am developing a new project with an internal chat between users. Also, there are complex relationships between the other project entities but I wolud like to build something scalable and fast and right now I am designing the data model. What kind of database would you recommend me to manage all application data? relational like MySQL, no relational like MongoDB or a mixed one? Thank you
In MongoDB, a write operation is atomic on the level of a single document, so it's harder to deal with consistency without transactions.
MongoDB supports horizontal scaling through Sharding , distributing data across several machines and facilitating high throughput operations with large sets of data. ... Sharding allows you to add additional instances to increase capacity when required
If you are trying with "complex relationships", give a chance to learn ArangoDB and Graph databases. Its database structures allow doing this with faster and simpler queries. The database is not as strict as others and allows arbitrary data. The data model is really like a neural network and you will never need foreign keys tables anymore. In Udemy there is a free course about it to get started.
The most important question is where are you planning to host? On-premise, or in the cloud.
Particularly if you are planning to host in either AWS or Azure, then your first point of call should be the PaaS (Platform as a Service) databases supplied by these vendors, as you will find yourself requiring a lot less effort to support them, much easier Disaster Recovery options, and also, depending on how PAYG the database is that you use, potentially also much cheaper costs than having a dedicated database server.
Your question regards 'Relational or not' is obviously key, and you need to consider both your required data structure, as well as the ACID requirements of your application model, as well as the non-functional requirements in terms of scalability, resilience, whether you want security authorisation at the highest application tier, or right down to 'row' level in the database, etc. - however please don't fall into the trap of considering 'NoSQL' as being single category. MongoDB, with its document-store type solution is a very different model to key-value-pair stores (like AWS DynamoDB), or column stores (like AWS RedShift) or for more complex data relationships, Entity Graph Stores (like AWS Neptune), to stores designed for tokenisation and text search (ElasticSearch) etc.
Also critical in all this is how many items you believe you need to index by. RDBMS/SQL stores are great for having as many indexes as you want, other than the slow-down in write speed, whereas databases like Amazon DynamoDB provide blisteringly fast read/write performance, but are very limited on key indexing capabilities.
It feels like you have most experience with SQL/RDBMS technologies, so for the simplest learning curve, and if your application fits it, then I'd personally start by looking at AWS Aurora https://aws.amazon.com/rds/aurora/ .
FIrstly, it may help if you explain what you mean by "complex relationships between project entities". Secondly, you can build a fast and scalable solution using either. With that said however, the data sounds relational so I would recommend MySQL.
I think, Its depend of your project type and your skills. MySQL is good and simple for maintenance but MongoDB need more skills and knowledge. If you work on little project, use MySQL. For your project type, MySQL is enough after you can migrate with PostgreSQL
I am going to work on a real estate project and have to decide on a database. Now, SQL databases can be very efficient if appropriately designed. More relations between the data and less redundancy. But with a #NoSQL database, the development time is reduced, and it is easy to query. Since this is my first time working on the real estate domain, I would like to pick a database that would be efficient in the long run.
I recommend PostgreSQL as it’s the most powerful out of the 3 databases you mentioned. It supports JSON objects so you can mimic the MongoDB functionality, but I would also argue that SQL is actually quite powerful and in many cases significantly easier to work with than with NoSQL databases.
Stay away from foreign keys, keep it fast and simple. Define your data structures well in advance. Try to model your data structures based on your system’s vision; based on where it’s going and not based solely on what you currently need it to do. This will help you avoid drastic changes to your database after your system is launched. Populate the database with fake data and run tests. PostgreSQL allows you to create Views from multiple tables. Try to create those views and make sure you can easily create useful views from multiple tables. Run an Explain on those view queries to make sure you created your indexes correctly. Make sure it’s fast!
Any of those three databases are going to be efficient, scalable, and reliable in the long term if you configure and use them correctly. They all also have solid hosting solutions.
All things being equal, I would agree with other posters that Postgres is my preference among the three, but there are caveats.
MongoDB and MySQL have better support for mutli-region replication in your big three cloud environments. Azure recently bought Citus Data, which was a best-in-class Postgres replication solution, so they might be the only one I trust to provide cross-region replication at the moment.
If you have a single region deployment and are on AWS, I can't recommend Aurora Postgres highly enough. It's a very good implementation and extremely performant.
I'll second another piece of advice. Postgresql's JSON columns are a dream when it comes to productivity and I use them frequently with our Rails application. In these cases, no migration is required to change schema. We store payloads with dozens or hundreds of keys and performance has not been an issue. We also have a lot of relational tables, so the joins we get with SQL are very important to us and hard to replicate with a NoQL solution.
That really depends of where do you see you application in the long run. On any application, any of those choices are excellent. You could argue about good support on JSON binaries, but even MySQL has an excellent support for that on the latest versions.
On the long run, when your application gets hundreds of thousands of requests per second, you might start thinking about how many inputs you will have in the database compared to the outputs. PostgresSQL it’s excellent at giving you outputs, but table corruption can happen when you start receiving this massive number of inputs (Which was the reason Uber switched from Postgres to MySQL)
On our OPS Platform at CTO.ai , we decided to use Postgres, because we need a reliable and agile way to send the output to our users, so that was out best choice in the long run for our product.
Hi everybody,
We are building a prototype of our first Hardware system. It's a sensor that we expect to send data approx. every 5 secs. If we scale it to about 1k sensors, it would mean 173M records per month.
We will offer data-retention for 7 days or 1 month based on the plan, but if by any chance the project goes well I see that we can have a lot of pain if not choosing the persistence layer correctly from day 1. Which data storage would you use for that?
Cheers, Alberto.
We have seen really good performance with frequent writes in Redis (~10k writes per second). The retrievals are also fast. I see no reason why it should not be your first choice. I know Redis doesn't have the image of a persistence database but it is capable of persistence and high availability. Moreover, MongoDB or some other database can be used as a cold storage. Even switching to MongoDB or any other NoSQL database entirely wouldn't be a big problem since Redis is a simple key-value store.
Hasura is a full platform working with Postgres and GraphQL. With Hasura you get everything and it is fast and cheap to scale up. With a good infrastructure like Kubernetes or AWS ECS you can run whatever you want.
MongoDB is nice but with millions of rows tends to lose its charm and with MongoDB or Redis you must to develop some backend with Hasura you get a backend ready to use.
I used Cassandra in a similar situation while collecting data from a sensor, and it has worked very well so far. There has been literally no downtime and scaling is a breeze.
You might want to look into Vitess/MySQL as well. If you are interested in the inherent technology stack then https://planetscale.com might be worth exploring as an option.
Why do not you try a time series or IOT database? They usually offer very hight toughtput for ingesting and faster quering performance for this kind of data.
I am one of those who believes that MongoDB can be used for everything, this thanks to the advertising of MongoDB.
We are creating an e-commerce platform, we know that it has many relationships, but with MongoDB we can avoid some, but in the end, some relationships have to exist.
A single developer to create two native applications in Flutter, a web application with React, create the backend with multiple microservices hosted with Google Cloud Run. PostgreSQL can be heavy because it should be used with an ORM, on the contrary, with MongoDB you can avoid some relationships and avoid ORM / ODM.
We need advice from someone who has the experience and has had to choose between these two databases for an e-commerce site.
The real question here is not about the technology but rather your real needs and your data. Do you need to manage data that has core concepts and relations ? (such as a family, with parents and children) or do you need to manage a basic collection of similar data (such as blog entries)? PostgreSQL is definitely a relational database for managing entities and their relationships whereas MongoDB (I may be strongly opinionated here ;-) ) is more targeted at managing collection of entities (such as the blog entries). For an e-commerce site (with some products, products categories, user ratings and comments, prices, bundles...) I would go for PostgreSQL as it will support/guide you in creating a structured data set with all your products, organized in categories and with user ratings/comments attached to them. HTH
Had exactly the same question when selecting data storage for our new product. Not e-commerce though, rather interactive and content-focused HR SaaS for SME.
The key arguments for PostgreSQL
It gives you the opportunity to use relationships where you really need it and just go with key-value tables where you don't.
With Jsonb datatype you can store documents/objects/arrays as JSON then use JSON elements in queries and even indexes.
There are more tools/integrations working with PostgreSQL which you can use out of the box, e.g. Hasura
I am in your spot, exactly. A few months ago, I had decided to use Postgres because since its version 9 it showed a lot of progress for being a high-availability database. However, frankly, I didn't want to model statically all data, since I have several distinct schemas (like for different product types) and I wanted some flexibility to add or remove as I saw fit. One of the main challenges with analyzing a NoSQL database being familiar in the SQL ways, is that it's easy to look for "analogies" for what makes SQL useful, like relationship enforcing, transactions and the cascading effect on deletes, updates and inserts, and that limit your vision a lot when analyzing a tool like Mongo, especially in a micro-services pattern. Now-a-days, I really found my solution in Mongo. Not just because of it being NoSQL, but because all of the support I find in the NodeJS community through packages and utilities that make it dead easy to use it for several use-cases. Whatever Postgres offers, Mongo does it a little easier and better, like text search and geo-queries. What you need to see is to model your data in a way that makes sense with Mongo. For instance, I've got a User service that has all auth related information of a user. But then, I have the same user in the Profile service, with the same id, but totally different fields. You have two de facto ways to connect data, by reference and embedding, which in Ecommerce, both have big uses. Like using references to relate a User to a Profile, and an embed to relate a Product to an Order. There's even a third, albeit a little more "manual" implementation here, the graph relationship in which you can model data, in which you can easily model event-driven documents, like a Purchase that goes from "a customer" to "a store", which you can later use for much easier and deep analytics than with the classical SQL stance. MariaDB has it readily available, and also has many improvements over MySQL and Postgres, especially for NoSQL features and scalability. Sadly it is just seen as a MySQL clone, but it offers more than that (although its documentation could be improved). Using Mongo in a micro-service environment is even better because your models can be smaller, meaning less burden on relationships, although you do compensate with a bit of duplication, but a well-designed schema will have minimal impact on that. Whatever tool might do the job, but I want to cheer on the newer generation. Hope it helps.
Hello,
I am trying to design an online ordering app similar to Doordash or Uber Eats. I'm having a hard time trying to finalise on what database (or mixture of databases) to use. I'm leaning towards using a relational database like MySQL or PostgreSQL. But, when the application grows, I don't want to join on 20 tables to get a data. Any help would be greatly appreciated. Thank you for your time.
Hello Suhas , We build our product www.voilacabs.com which is in the same lines as yours but we have used a combination of Mysql and MongoDB. When using MySQL, i would recommend doing the following: 1. Use Mysql only for storage only and for realtime updates we recommend MongoDB. 2. Don't try to Join more than 3 tables. ( the moment you reach 3 join stop there and try to un-normalized database. 3. Never or very rarely use Auto-increments. ( we recommend using UUIDS ) . Use UUIDS always for Auto increments for MYSQL. If you using Postgre SQL then i would suggest you to please check this https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c There is a stored procedure that generated unique keys instead of auto-increment keys and that will help you sharding or clustering database without sync errors. 4. Also For MongoDB if you can put a layer of REDIS Cache then that will boost your api performance under large loads. 5. Use Node.js programing language as that function asynchronously .
Let me know if you still need any suggestion's . Thanks & Regards Rupen Makhecha CTO @ Voila Cab's www.voilacabs.com
I would recommend a mixture of MySQL and MongoDB. Using MongoDB for the Content Distribution Network (CDN) will make it easy to store high volume incoming data. MySQL is recommended to be used for business logic. PostgreSQL is not recommended since you will be faced with inefficient database replication features and constant migration from one PostgreSQL version to another.
I'm currently developing an app that ranks trending stuff ( such as games, memes or movies, etc. ) or events in a particular country or region. Here are the specs: My app does not require registration and requires cookies and localStorage to track users. Users can add new entries to each trending category provided that their country of origin is recorded in cookies. If each category contains more than 100 items then the oldest items get deleted. The question is: what kind of database should I use for managing this app? Thanks in advance
I think your best and cheapest choice is going to be MongoDB, Although Postgres is probably going to be the more scaleable approach, you likely have a good idea of how you want to present your data, and the app seems small enough that you shouldn't need to worry about scaling issues. It also sounds like your app can grow in a linear capacity based on the number of users, and the amount of data, which is the perfect use-case for noSQL databases (linear, predictable scaling).
Correct me if I have any of these assumptions wrong. 1. You're looking to have a relatively high-read with a lower write volume 2. Your app is essentially a list of objects that can belong to a category 3. users can create objects in this list.
I think Mongo is going to be what you're looking for on the following basis: 1. you absolutely need a database that is shared by all users of your app, therefor IndexedDB is out of the question. 2. You have semi-structured data 3. you probably want the cheapest solution.
I think Postgres is wrong for the following reasons: 1. your app is pretty simple in concept, SQL databases will add unnecessary complexity to your system, either through ORMs or SQL queries. (use an ORM if you go with SQL) 2. Hosting SQL databases for production is not cheap! the cheapest solution I know of for Postgres is ElephantSQL. It provides 20MB for free with 5 concurrent connections, you should be okay to manage these limitations if you decide to go Postgres in the end. Whereas mongoDB Atlas has some great free-tier options.
Although your data might be easier to model in Postgres, you can certainly model your data as a single list of items that have a category attached.
I don't want to officially recommend another tool, but you should really checkout prisma, firebase, amplify, or Azure App Services for this app! Just go completely backend-less [Firebase] https://firebase.google.com/ [Amplify] https://aws.amazon.com/amplify/ [Prisma] https://www.prisma.io/ [Azure App Services] https://azure.microsoft.com/en-us/services/app-service/?v=18.51
Hi everybody, I'm developing an application to be used in a gym setting where athletes fill out a health survey, and coaches can analyze the results. However, due to the dynamic nature of some aspects of the app and more static aspects of the other, I am wondering if/how I would integrate MongoDB with my existing PostgreSQL database. I would like to store things like registrations, license information, and club information in Postgres
, while I am thinking about moving things like user surveys, logging, and user settings information over to MongoDB
. Some fields on the survey are integers, some large blocks of text, and some are arrays. My thought is, if I moved that data to MongoDB
, it would give us greater flexibility in terms of adding and removing fields and data to them, and it would scale a lot easier than Postgres
. Not to mention it will be easier to organize that kind of data. Is that overkill or am I approaching this issue the right way? Thank you!
You can have your cake and eat it too. If you really need the flexibility of a document store, Postgresql's JSONB support allows you to mix and match relational data and document data within the same database/table. You can just as easily run analytical queries against JSONB data in Postgresql as you can against "normal" relational data. MongoDB comes with a significant operational overhead and cost (hello replica sets), so unless you really need MongoDB's sharding capabilities (which you shouldn't until you get to extreme scaling numbers), then just stick with Postgresql and use JSONB where you need it.
With PostgreSQL you could easily integrate JSON or array type columns and develope a simple interface to add columns on your application. Anyway handling all the data this way will require some intermediate skill with PostgreSQL dialect and a mix and match of syntaxes for your analitical queryes. Also you will need to have a good design for you backend to handle all this. MongoDB will handle all this in a more natural way and I believe will be more easily integrated with a Node.js backend.
How are you managing your PostgreSQL schema? It doesn't have to be hard to add or remove fields. We're working on a SQL database client at BaseDash that lets you add/remove columns in a couple clicks.
If you decide to migrate some of your data to MongoDB, you can definitely manage the two databases in parallel. For any records that need to be linked, you can treat it just like a foreign key by creating a column that points to an ID in the other database. For example, you might store user settings in MongoDB, and include a UserId
field that points to your User record in your Postgres database.
Those types of things should fit fine in a postgres json column. You'll actually have more flexibility with postgres because you can have a field as a normal column or in a json column, and you can have constraints and indexes on fields within a json column (or not).
We are building an IOT service with heavy write throughput and fewer reads (we need downsampling records). We prefer to have good reliability when comes to data and prefer to have data retention based on policies.
So, we are looking for what is the best underlying DB for ingesting a lot of data and do queries easily
We had a similar challenge. We started with DynamoDB, Timescale, and even InfluxDB and Mongo - to eventually settle with PostgreSQL. Assuming the inbound data pipeline in queued (for example, Kinesis/Kafka -> S3 -> and some Lambda functions), PostgreSQL gave us a We had a similar challenge. We started with DynamoDB, Timescale and even InfluxDB and Mongo - to eventually settle with PostgreSQL. Assuming the inbound data pipeline in queued (for example, Kinesis/Kafka -> S3 -> and some Lambda functions), PostgreSQL gave us better performance by far.
Druid is amazing for this use case and is a cloud-native solution that can be deployed on any cloud infrastructure or on Kubernetes. - Easy to scale horizontally - Column Oriented Database - SQL to query data - Streaming and Batch Ingestion - Native search indexes It has feature to work as TimeSeriesDB, Datawarehouse, and has Time-optimized partitioning.
if you want to find a serverless solution with capability of a lot of storage and SQL kind of capability then google bigquery is the best solution for that.
We Have thousands of .pdf docs generated from the same form but with lots of variability. We need to extract data from open text and more important - from tables inside the docs. The output of Couchbase/Mongo will be one row per document for backend processing. ADOBE renders the tables in an unusable form.
I prefer MongoDB due to own experience with migration of old archive of pdf and meta-data to a new “archive”. The biggest advantage is speed of filters output - a new archive is way faster and reliable then the old one - but also the the easy programming of MongoDB with many code snippets and examples available. I have no personal experience so far with Couchbase. From the architecture point of view both options are OK - go for the one you like.
I would like to suggest MongoDB or ArangoDB (can't choose both, so ArangoDB). MongoDB is more mature, but ArangoDB is more interesting if you will need to bring graph database ideas to solution. For example if some data or some documents are interlinked, then probably ArangoDB is a best solution.
To process tables we used Abbyy software stack. It's great on table extraction.
If you can select text with mouse drag in PDF. Use pdftotext it is fast! You can install it on server with command "apt-get install poppler-utils". Use it like "pdftotext -layout /path-to-your-file". In same folder it will make text file with line by line content. There is few classes on git stacks that you can use, also.
At Pushnami we were looking at several alternative databases that would support following architectural requirements: - very quick prototyping for an unknown domain - ability to support large amounts of data - native ability to replicate and fail over - full stack approach for Node.js development After careful consideration MongoDB came on top, and 3 years later we are still very happy with that decision. Currently we keep almost 2TB of data in our cluster, and start thinking about sharding.
After using couchbase for over 4 years, we migrated to MongoDB and that was the best decision ever! I'm very disappointed with Couchbase's technical performance. Even though we received enterprise support and were a listed Couchbase Partner, the experience was horrible. With every contact, the sales team was trying to get me on a $7k+ license for access to features all other open source NoSQL databases get for free.
Here's why you should not use Couchbase
Full-text search Queries The full-text search often returns a different number of results if you run the same query multiple types
N1QL queries Configuring the indexes correctly is next to impossible. It's poorly documented and nobody seems to know what to do, even the Couchbase support engineers have no clue what they are doing.
Community support I posted several problems on the forum and I never once received a useful answer
Enterprise support It's very expensive. $7k+. The team constantly tried to get me to buy even though the community edition wasn't working great
Autonomous Operator It's actually just a poorly configured Kubernetes role that no matter what I did, I couldn't get it to work. The support team was useless. Same lack of documentation. If you do get it to work, you need 6 servers at least to meet their minimum requirements.
Couchbase cloud Typical for Couchbase, the user experience is awful and I could never get it to work.
Minimum requirements
The minimum requirements in production are 6 servers. On AWS the calculated monthly cost would be ~$600
. We achieved better performance using a $16
MongoDB instance on the Mongo Atlas Cloud
writing queries is a nightmare While N1QL is similar to SQL and it's easier to write because of the familiarity, that isn't entirely true. The "smart index" that Couchbase advertises is not smart at all. Creating an index with 5 fields, and only using 4 of them won't result in Couchbase using the same index, so you have to create a new one.
Couchbase UI
The UI that comes with every database deployment is full of bugs, barely functional and the developer experience is poor. When I asked Couchbase about it, they basically said they don't care because real developers use SQL directly from code
Consumes too much RAM
Couchbase is shipped with a smaller Memcached instance to handle the in-memory cache. Memcached ends up using 8 GB of RAM for 5000 documents
! I'm not kidding! We had less than 5000 docs on a Couchbase instance and less than 20 indexes and RAM consumption was always over 8 GB
Memory allocations are useless I asked the Couchbase team a question: If a bucket has 1 GB allocated, what happens when I have more than 1GB stored? Does it overflow? Does it cache somewhere? Do I get an error? I always received the same answer: If you buy the Couchbase enterprise then we can guide you.
server side
Our ML model will be trained locally first. After the local stage, the model will be deployed in the cloud using Flask
and Amazon EC2
. The data will be stored using MongoDB
, and we will use MongoDB Atlas
to host it remotely. Below are the reasons why we choose above stacks:
MongoDB
and MongoDB Atlas
- Schema-less (Easy to add/remove attributes)
- It is easy to set up and run in Amazon EC2
Flask
:
- python library support
- easy and simple to use
Amazon EC2
:
- Easy to use/configure
- Secure
Client side
JavaScript will be our main language for front-end. We will use React
to build our web interfaces. Material-UI
will be used for designing the theme, because it has lots of pre-defined themes. Redis
is a great tool for caching due to its in-memory data structure, it can be used to improve user experience. In terms of visualizing data, we use Chart.js
. Because it is light weight and easy to learn.
Project management
We use Slack
for communication, because it is efficient, light weight and easy to use. GitHub
will be used as our primary version control, issue tracking, and roadmapping tool. Because it provides a all-in-one place for code writing and progress tracking.
Server side
We decided to use Python for our backend because it is one of the industry standard languages for data analysis and machine learning. It also has a lot of support due to its large user base.
Web Server: We chose Flask because we want to keep our machine learning / data analysis and the web server in the same language. Flask is easy to use and we all have experience with it. Postman will be used for creating and testing APIs due to its convenience.
Machine Learning: We decided to go with PyTorch for machine learning since it is one of the most popular libraries. It is also known to have an easier learning curve than other popular libraries such as Tensorflow. This is important because our team lacks ML experience and learning the tool as fast as possible would increase productivity.
Data Analysis: Some common Python libraries will be used to analyze our data. These include NumPy, Pandas , and matplotlib. These tools combined will help us learn the properties and characteristics of our data. Jupyter notebook will be used to help organize the data analysis process, and improve the code readability.
Client side
UI: We decided to use React for the UI because it helps organize the data and variables of the application into components, making it very convenient to maintain our dashboard. Since React is one of the most popular front end frameworks right now, there will be a lot of support for it as well as a lot of potential new hires that are familiar with the framework. CSS 3 and HTML5 will be used for the basic styling and structure of the web app, as they are the most widely used front end languages.
State Management: We decided to use Redux to manage the state of the application since it works naturally to React. Our team also already has experience working with Redux which gave it a slight edge over the other state management libraries.
Data Visualization: We decided to use the React-based library Victory to visualize the data. They have very user friendly documentation on their official website which we find easy to learn from.
Cache
- Caching: We decided between Redis and memcached because they are two of the most popular open-source cache engines. We ultimately decided to use Redis to improve our web app performance mainly due to the extra functionalities it provides such as fine-tuning cache contents and durability.
Database
- Database: We decided to use a NoSQL database over a relational database because of its flexibility from not having a predefined schema. The user behavior analytics has to be flexible since the data we plan to store may change frequently. We decided on MongoDB because it is lightweight and we can easily host the database with MongoDB Atlas . Everyone on our team also has experience working with MongoDB.
Infrastructure
- Deployment: We decided to use Heroku over AWS, Azure, Google Cloud because it is free. Although there are advantages to the other cloud services, Heroku makes the most sense to our team because our primary goal is to build an MVP.
Other Tools
Communication Slack will be used as the primary source of communication. It provides all the features needed for basic discussions. In terms of more interactive meetings, Zoom will be used for its video calls and screen sharing capabilities.
Source Control The project will be stored on GitHub and all code changes will be done though pull requests. This will help us keep the codebase clean and make it easy to revert changes when we need to.
We actually use both Mongo and SQL databases in production. Mongo excels in both speed and developer friendliness when it comes to geospatial data and queries on the geospatial data, but we also like ACID compliance hence most of our other data (except on-site logs) are stored in a SQL Database (MariaDB for now)
MySQL has a lot of strengths working for it. It's simple and easy to set up and use. It's JSON engine is also really good these days. Mongo is also simple to setup and use, and it's speed as a document-object storage engine is first class.
Where Postgres has both beat is in it's combining of all of the features that make both MySQL and Mongo great, while adding on enterprise grade level scalability and replication. It's Postgres' stability and robustness, while still fulfilling the roles of it's contemporaries extremely well that edge Postgre for me.
When I was new with web development, I was using PHP for backend and MySQL for database. But after improving my JS skills, I chosen Node.js. Because of too many reasons including npm, express, community, fast coding and etc. MongoDB is so good for using with Node.js. If your JS skills are enough good, I recommend to migrate to Node.js and MongoDB.
My data was inherently hierarchical, but there was not enough content in each level of the hierarchy to justify a relational DB (SQL) with a one-to-many approach. It was also far easier to share data between the frontend (Angular), backend (Node.js) and DB (MongoDB) as they all pass around JSON natively. This allowed me to skip the translation layer from relational to hierarchical. You do need to think about correct indexes in MongoDB, and make sure the objects have finite size. For instance, an object in your DB shouldn't have a property which is an array that grows over time, without limit. In addition, I did use MySQL for other types of data, such as a catalog of products which (a) has a lot of data, (b) flat and not hierarchical, (c) needed very fast queries.
We used Mongo for the first iterations of our app, but the relational nature of our data was an awkward fit for a database that is not relational. We sorely lacked relational database integrity features that needed to be done on the application side (poorly) and it was a huge relief when we managed to port our application over to Postgres, which performs great and never gives us trouble, while having very user friendly extensions like JSON and PubSub that made the transition easy.
We wanted a JSON datastore that could save the state of our bioinformatics visualizations without destructive normalization. As a leading NoSQL data storage technology, MongoDB has been a perfect fit for our needs. Plus it's open source, and has an enterprise SLA scale-out path, with support of hosted solutions like Atlas. Mongo has been an absolute champ. So much so that SQL and Oracle have begun shipping JSON column types as a new feature for their databases. And when Fast Healthcare Interoperability Resources (FHIR) announced support for JSON, we basically had our FHIR datalake technology.
Pros of MongoDB
- Document-oriented storage828
- No sql593
- Ease of use553
- Fast464
- High performance410
- Free255
- Open source218
- Flexible180
- Replication & high availability145
- Easy to maintain112
- Querying42
- Easy scalability39
- Auto-sharding38
- High availability37
- Map/reduce31
- Document database27
- Easy setup25
- Full index support25
- Reliable16
- Fast in-place updates15
- Agile programming, flexible, fast14
- No database migrations12
- Easy integration with Node.Js8
- Enterprise8
- Enterprise Support6
- Great NoSQL DB5
- Support for many languages through different drivers4
- Schemaless3
- Aggregation Framework3
- Drivers support is good3
- Fast2
- Managed service2
- Easy to Scale2
- Awesome2
- Consistent2
- Good GUI1
- Acid Compliant1
Pros of Redis
- Performance886
- Super fast542
- Ease of use513
- In-memory cache444
- Advanced key-value cache324
- Open source194
- Easy to deploy182
- Stable164
- Free155
- Fast121
- High-Performance42
- High Availability40
- Data Structures35
- Very Scalable32
- Replication24
- Great community22
- Pub/Sub22
- "NoSQL" key-value data store19
- Hashes16
- Sets13
- Sorted Sets11
- NoSQL10
- Lists10
- Async replication9
- BSD licensed9
- Bitmaps8
- Integrates super easy with Sidekiq for Rails background8
- Keys with a limited time-to-live7
- Open Source7
- Lua scripting6
- Strings6
- Awesomeness for Free5
- Hyperloglogs5
- Transactions4
- Outstanding performance4
- Runs server side LUA4
- LRU eviction of keys4
- Feature Rich4
- Written in ANSI C4
- Networked4
- Data structure server3
- Performance & ease of use3
- Dont save data if no subscribers are found2
- Automatic failover2
- Easy to use2
- Temporarily kept on disk2
- Scalable2
- Existing Laravel Integration2
- Channels concept2
- Object [key/value] size each 500 MB2
- Simple2
Sign up to add or upvote prosMake informed product decisions
Cons of MongoDB
- Very slowly for connected models that require joins6
- Not acid compliant3
- Proprietary query language2
Cons of Redis
- Cannot query objects directly15
- No secondary indexes for non-numeric data types3
- No WAL1