Hello, I am developing a new project with an internal chat between users. There are also complex relationships between the other project entities, but I would like to build something scalable and fast, and right now I am designing the data model. What kind of database would you recommend for managing all the application data? Relational like MySQL, non-relational like MongoDB, or a mix of both? Thank you
I'm starting to work on a Jira-like bug tracker web app. This is a hobby project that is mostly a way for me to learn about different technologies and development processes (CI/CD, etc.) so I can be better prepared when I start applying for programming jobs.
I'm debating between MySQL, which I'm less familiar with, and MongoDB which I have used in the past.
My two points of consideration are the following:
1) Which one is more likely to be relevant for web dev jobs? While I want to learn new technologies, I prefer learning ones that will make me more hireable in the future.
2) Which one is more flexible when it comes to changing the shape of the stored data? I expect to need to make some changes as the project goes on.
MySQL is still more popular than MongoDB if you look at Google Trends. I've also added MariaDB, which is a fork of MySQL with largely the same features, and PostgreSQL, which is also a popular relational database.
This is a very good article for comparing MySQL to MongoDB and which one you should use: MongoDB vs MySQL: A Comparative Study on Databases.
If you just want to learn and you have the time, I would opt for using both MySQL and MongoDB. For example, use MySQL for most of the site content and MongoDB for saving log messages. As you get more and more logs, you will start to see the benefits of MongoDB's faster document fetching.
Hi! Thanks for the answer.
I really like the idea of opting to use both for different scenarios, and I think that this is the one I will go for.
Thanks again :)
There's really not an awful lot of difference between the two. They have wildly different storage mechanisms, but their benefits are fairly similar. If you want to learn something that might be a requisite skill for a job, I would also look at alternatives such as time-series and column-oriented systems like InfluxDB and the unbelievably fast and flexible ClickHouse. While they may seem like an unlikely fit for a personal bug tracker app, there's no reason not to use them. Since I got into InfluxDB people have been requesting it a lot, and I'll be using ClickHouse for all large databases, probably forever. Expand your horizons beyond your competition's.
We are planning to migrate one of my applications from MSSQL to MySQL. Can someone help me choose a version? I have a strong inclination towards MySQL 5.7, but I see there are some standout features added in MySQL 8.0, like JSON_TABLE. I just wanted to know whether the newer version has compromised on speed in exchange for the added features.
MySQL 8.0 is significantly better than MySQL 5.7. For all InnoDB row operations, you'll see a great performance improvement. Also, the time taken to process transactions is lower in MySQL 8.0. Moreover, there has been an improvement in managing read and read/write workloads.
The MySQL developers don't implement anything in MySQL until they can find a way to do it efficiently and, often, more efficiently than other systems. So although I don't have experience with benchmarking JSON_TABLE or similar new features, their development philosophy alone suggests that moving to version 8 for the latest features would be a safe jump without sacrificing system performance.
I am trying to design an online ordering app similar to Doordash or Uber Eats. I'm having a hard time deciding which database (or mixture of databases) to use. I'm leaning towards a relational database like MySQL or PostgreSQL. But when the application grows, I don't want to join 20 tables to get a piece of data. Any help would be greatly appreciated. Thank you for your time.
Hello Suhas, we built our product www.voilacabs.com, which is along the same lines as yours, but we used a combination of MySQL and MongoDB. When using MySQL, I would recommend the following:
1. Use MySQL for storage only; for real-time updates we recommend MongoDB.
2. Don't try to join more than 3 tables. The moment you reach 3 joins, stop there and try to denormalize the database.
3. Never, or only very rarely, use auto-increment keys; with MySQL we recommend using UUIDs instead. If you are using PostgreSQL, then I would suggest you check https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c. It describes a stored procedure that generates unique keys instead of auto-increment keys, which will help you shard or cluster the database without sync errors.
4. For MongoDB, if you can put a Redis cache layer in front of it, that will boost your API performance under large loads.
5. Use Node.js, as it works asynchronously.
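For point 3, here is a rough sketch of the ID layout described in the linked Instagram article: 41 bits of milliseconds since a custom epoch, 13 bits of logical shard ID, and 10 bits of a per-shard sequence taken modulo 1024. The article implements this as a PostgreSQL stored procedure; the Python below is only an illustration, and the names `make_id`, `split_id` and the `EPOCH_MS` value are assumptions made for the example.

```python
import time

# Arbitrary custom epoch in milliseconds (an assumption for this sketch;
# any fixed instant in the past works as long as it never changes).
EPOCH_MS = 1_600_000_000_000
SHARD_BITS = 13  # logical shard ID
SEQ_BITS = 10    # per-shard sequence, taken modulo 1024


def make_id(shard_id, sequence, now_ms=None):
    """Pack time, shard and sequence into one sortable 64-bit integer."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in 13 bits")
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    elapsed = now_ms - EPOCH_MS
    return (
        (elapsed << (SHARD_BITS + SEQ_BITS))
        | (shard_id << SEQ_BITS)
        | (sequence % (1 << SEQ_BITS))
    )


def split_id(value):
    """Recover (timestamp_ms, shard_id, sequence) from a packed ID."""
    sequence = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    timestamp_ms = (value >> (SHARD_BITS + SEQ_BITS)) + EPOCH_MS
    return timestamp_ms, shard_id, sequence


new_id = make_id(shard_id=5, sequence=1025)
print(new_id, split_id(new_id))
```

Because the timestamp occupies the high bits, IDs sort roughly by creation time, and each shard can generate them without coordinating with the others.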
Let me know if you still need any suggestions. Thanks & Regards Rupen Makhecha CTO @ Voila Cab's www.voilacabs.com
If you use MySQL and want a better sharding solution on top of it, then I suggest looking into @Vitess. You can also explore https://planetscale.com as a dbaas alternative to managing those services yourself.
Great, I think Vitess sounds promising. Thank you for this nice recommendation.
Disclaimer: I work for PlanetScale where we maintain Vitess as well. We also have a beta feature currently in the works where someone can try out Vitess on top of their existing MySQL infra without making any changes. Please feel free to contact me if you'd be interested to test it out.
I would recommend a mixture of MySQL and MongoDB. Using MongoDB for the Content Distribution Network (CDN) will make it easy to store high volume incoming data. MySQL is recommended to be used for business logic. PostgreSQL is not recommended since you will be faced with inefficient database replication features and constant migration from one PostgreSQL version to another.
Hi all. I am an informatics student, and I need to build a simple website for my friend. I am planning to build the website using Node.js and Mongoose, since I have already done a project using these technologies. I also know SQL, and I have used PostgreSQL and MySQL previously.
The website will show possible travel destinations and local transportation. The database is used to store information about traveling, so only the admin will manage the content (especially photos), while clients will see the content uploaded by the admin. I am planning to use Mongoose because it is very simple and efficient for this project. Please give me your opinion about this choice.
The use case you are describing would benefit from a hosted headless CMS like Contentful. You can also go for Strapi with a database of your choice, but then you would have to host Strapi and the underlying database (if not using SQLite) yourself. If you want to use Strapi, you can ease your work by using something like PlanetScaleDB as the backing database for Strapi.
Your requirements don't seem to be anything special. On the other hand, MongoDB is commonly used with Node. You could use Mongo without defining a schema, but does that give you any benefits? Also, note that development speed matters. In most cases an RDBMS is the best choice. Learn and use Postgres for life!
Hi. We have an application that offers mobile apps to clients. The mobile apps are served through REST APIs provided by a big monolithic web frontend and backend built on PHP/MySQL, running on a conventional dedicated machine.
Now we have started rolling out our application across the globe. We want to serve each country at its own TLD like, myapp.us, myapp.pk etc.
Since each country site might have different features and localization challenges, we'll need to have several different master branches, one for each country. The backend application will be cloned on a separate machine for each country.
We'd need to geo-restrict the mobile apps as well. So a client from the US would be served via our ".us" TLD REST API, and a client in Pakistan would be served by the REST APIs on the ".pk" TLD.
We need advice on which AWS service we can use to provide a single authentication endpoint that authenticates apps against the relevant country server by automatically detecting the location.
We have an experiment management system built in-house. We produce samples as input to the next step, which can then produce one sample (1-1) or many samples (1-many). There are many steps like this. So far, we have been tracking genealogy (limited tracking) in a MySQL database, and it is becoming hard to trace back to the original material or sample (I can give more details if required). So, we are considering a graph database. I am requesting advice from the experts.
- Is a graph database the right choice, or can we manage with RDBMS?
- If RDBMS, which RDBMS, which feature, or which approach could make this manageable or sustainable?
- If a graph database (Neo4j, OrientDB, Azure Cosmos DB, Amazon Neptune, ArangoDB), which one is good, and what are the best practices?
I am sorry that this might be a loaded question.
You have not given much detail about the data generated, the depth of such a graph, and the access patterns (queries). However, it is very easy to track all samples and materials if you traverse this graph using a graph database. Here you can use any of the databases mentioned.
ArangoDB is also a multi-model database, where you can still query the data in a relational way using joins, so you retain full flexibility.
In SQL, you can use Common Table Expressions (CTEs) and use them to write a recursive query that reads all parent nodes of a tree.
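As a rough sketch of the recursive CTE approach (not a definitive schema): the example below uses Python's built-in `sqlite3` module only so that it is self-contained; PostgreSQL and MySQL 8.0 accept the same `WITH RECURSIVE` syntax. The table and column names (`sample`, `derived_from`, `child_id`, `parent_id`) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per parent/child edge, so a child may have several parents.
CREATE TABLE derived_from (child_id INTEGER NOT NULL, parent_id INTEGER NOT NULL);
INSERT INTO sample VALUES
    (1, 'raw-A'), (2, 'raw-B'), (3, 'mix-1'), (4, 'split-1'), (5, 'split-2');
INSERT INTO derived_from VALUES (3, 1), (3, 2), (4, 3), (5, 3);
""")


def ancestors(conn, sample_id):
    """Return (name, depth) for every upstream sample of sample_id."""
    return conn.execute(
        """
        WITH RECURSIVE lineage(id, depth) AS (
            -- Anchor: direct parents of the starting sample.
            SELECT parent_id, 1 FROM derived_from WHERE child_id = ?
            UNION
            -- Recursive step: parents of everything found so far.
            SELECT d.parent_id, l.depth + 1
            FROM derived_from AS d
            JOIN lineage AS l ON d.child_id = l.id
        )
        SELECT s.name, MIN(l.depth) AS min_depth
        FROM lineage AS l
        JOIN sample AS s ON s.id = l.id
        GROUP BY s.id, s.name
        ORDER BY min_depth, s.name
        """,
        (sample_id,),
    ).fetchall()


print(ancestors(conn, 4))  # [('mix-1', 1), ('raw-A', 2), ('raw-B', 2)]
```

Swapping `child_id` and `parent_id` in the CTE walks downstream instead of upstream.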
I would recommend ArangoDB if your samples also have disparate or nested attributes so that the document model (JSON) fits, and you have many complex graph queries that should be performed as efficiently as possible. If not, stay with an RDBMS.
Another reason I recommend ArangoDB is the fact that the storage engine does not limit your data model. You cannot create a geo-index on a 'user.location' field in any of the gremlin-compatible stores for example, as the JSON documents can only have one level of properties.
Thanks for your response. We will explore ArangoDB.
Here are some more details, if you are wondering:
An operation produces many samples (output) from other samples (input). We are tracking both operations and samples (two graphs, i.e. one for operations and another for samples). The typical depth is 10 to 20 for both operations and samples, but some are even deeper (> 20). So far, over the years, operations amount to 2-3 million records and samples to 10-20 million records. We are using the closure data model in the DBMS to represent the tree/graph data.
The API and some power users access the data directly via specific SQL (stored procedures and/or special SQL scripts). We are open to restricting or enhancing the access patterns further.
We are finding it hard to go upstream/downstream and also to merge the two tree structures (operations and samples) as the depth increases.
We are finding it hard to data mine based on sample or process attributes (some are nested).
Hard to represent multiple parents to one child.
I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:
- I need to use either MySQL or PostgreSQL on a Linux based OS. Which would be better for this application?
- I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.
Honestly both databases will do the job just fine. I personally prefer Postgres.
Much more important is how you store the audio. While you could technically use a blob type column, it's really not ideal to be storing audio files which are "several hours long" in a database row. Instead consider storing the audio files in an object store (hosted options include backblaze b2 or aws s3) and persisting the key (which references that object) in your database column.
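A minimal sketch of that pattern, under stated assumptions: `LocalObjectStore` is a hypothetical stand-in that writes to a local directory so the example runs without any cloud account (with a real object store you would call its upload API at that point), and the `recording` table, its columns and the sample values are made up for illustration. The database row holds only the metadata and the object key.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

SCHEMA = """
CREATE TABLE recording (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    recorded_at TEXT NOT NULL,   -- ISO 8601 date and time
    location    TEXT,
    object_key  TEXT NOT NULL    -- reference to the audio in the object store
)
"""


class LocalObjectStore:
    """Hypothetical stand-in for an object store: bytes stored under a key."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()


def add_recording(conn, store, title, recorded_at, location, audio):
    """Upload the audio first, then persist only its key and metadata."""
    key = "recordings/" + hashlib.sha256(audio).hexdigest()
    store.put(key, audio)
    conn.execute(
        "INSERT INTO recording (title, recorded_at, location, object_key) "
        "VALUES (?, ?, ?, ?)",
        (title, recorded_at, location, key),
    )
    conn.commit()
    return key


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store = LocalObjectStore(tempfile.mkdtemp())
key = add_recording(conn, store, "Evening set", "2021-06-01T19:30:00",
                    "Main hall", b"fake audio bytes")
```

Searches then run against the small metadata table, and the audio is fetched by key only when someone actually wants to listen to it.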
Hi Erin, chances are you would want to store the files in a blob type. Both MySQL and Postgres support this. Can you explain a little more about your need to store the files in the database? It may be more effective to store the files on a file system or something like S3. To answer your question: based on what you are describing, I would lean slightly towards PostgreSQL, since it tends to be a little better on the data warehousing side.
I would like to build a medium to large scale app, that has real-time operations and a good authentication system and a secure and fast API. Should I use Django with React only? Or maybe use Django for the API, Node.js for real-time operations and React for the frontend? Any suggestions? Which database should I use with those technologies? Should I use both MySQL / PostgreSQL and MongoDB together? Should I use only MongoDB or MySQL / PostgreSQL? Or is it better to go with both MySQL and PostgreSQL at the same time? Should I use also GraphQL?
TL;DR PostgreSQL + Django + React.js.
A few notes about Django:
* Django includes its own ORM, which is able to work with SQL databases. In this case, you're able to use any SQL storage like PostgreSQL, MySQL, etc., but you can't use MongoDB.
* Django is a synchronous web framework. If you want to use asynchronous operations in the database, you have to choose another tool (aiohttp for Python or fastify.js for Node.js).
* Django is stable. You don't need to worry about data consistency, etc.
* Django-Rest-Framework is a great library for handling REST API requests.
* django-channels is a library for handling WebSocket connections.
* GraphQL is a great thing, but using it requires additional knowledge (especially about performance).
A few notes about Node.js:
* You have to choose a Node.js web framework. Node.js has a lot of web frameworks, like express.js, hapi.js, fastify.js, etc.
* Node.js applications are asynchronous. This can give you additional performance.
* You have to take care of data consistency inside your own application.
* You're able to use MongoDB or any SQL database, because npm includes a lot of libraries that can work with databases.
* You're able to use GraphQL, and Node.js is a better choice for GraphQL.
* You don't need to use additional libraries for handling REST and WebSocket connections.
So, my conclusion is to use Django + PostgreSQL + React.js. With this stack, you get more stability. If you need more performance, you have to think about an asynchronous platform (like Node.js).
Take a look at Flask + SQLAlchemy + PostgreSQL + React.js. SQLAlchemy is a better ORM than Django-ORM.
I hope it's useful for you :)
Best regards, Max
Node.js is a great option for real-time applications, especially in conjunction with Socket.IO.
In terms of databases, I'd go with PostgreSQL. MongoDB has its benefits (schema-less, sharding, map-reduce), but for most CRUD-based apps, it makes sense to store the bulk of your data in a relational database (of which PostgreSQL is the best IMO). You can throw in MongoDB if you have a specific need for it. There's certainly no need to use both MySQL and PostgreSQL.
As for GraphQL, it can be nice to work with since you don't need to predefine specific data endpoints on your backend, instead shifting the power to your frontend in requesting the data it needs. It's also useful for public APIs, when you don't know what data users want (see GitHub's API). It can be useful at the early stage when you're prototyping and want to be able to fetch data quickly, but certainly isn't necessary.
At BaseDash we use Node.js, ExpressJS, Socket.IO, PostgreSQL, and Sequelize to fit our use case of database management and real-time operations.
I'm building a website where users can participate, like and dislike any given challenge.
Problem: If 10k or 1 million users join a given challenge at the same time, it can cause race conditions in my MySQL database and also in Redis.
What I want: To aggregate joined users, likes and dislikes.
Solution: I'm thinking about using Kafka as a message queue, then consuming the user events one by one, saving them into Redis and the database, and aggregating them.
One problem here is that saving and aggregating takes time; how can I show users that they have successfully joined the challenge?
One solution is that when a user joins the challenge, I send a request to the Kafka queue, then update the current user's UI and show a success message (without pushing other users' join messages to the current user, because I am not using WebSockets).
Another example: take https://stackshare.io posts. On posts, users can like, dislike and comment.
Estimated users: 1 million. Stack: Django, MySQL, Redis and Kafka.
- How can I manage these kinds of things?
- How do big tech companies handle this?
- Where am I right or wrong?
- Are there other tools that can help me in this situation?
- I am using locks in Redis when total like, dislike and joined users increment or decrement. Should I be doing this? Is it the same for transactions in MySQL?
I need the best approach to handle this situation that can also be scalable.
Thanks in advance for reading my post and giving me suggestions on this. ☺️
Consider using the SQL support in Apache Pinot, an online analytical processing (OLAP) datastore that can run complex SQL queries and also join tables in Pinot with those in other datastores. Pinot enables you to build dashboards for quick analysis and reporting on aggregated data.