What is Amazon Neptune and what are its top alternatives?
Amazon Neptune is a fully managed graph database service that enables customers to build and run applications that work with highly connected datasets. It supports popular graph models Property Graph and W3C's RDF, and provides features like ACID transactions, TinkerPop Gremlin, SPARQL, and more. However, customers have reported limitations such as the high cost associated with large datasets and the complexity of managing the service.
Neo4j: Neo4j is a popular graph database management system that offers high performance, ACID transactions, and a flexible data model with powerful querying capabilities using the Cypher query language. Pros include strong community support, ease of use, and scalability. Cons may include high cost for enterprise features.
OrientDB: OrientDB is a multi-model database management system that combines the power of graphs with document, key/value, object-oriented, and reactive models. Key features include distributed architecture, SQL support, indexing, and multi-master replication. Pros include flexibility in data models, while cons may include a learning curve for newcomers.
ArangoDB: ArangoDB is a multi-model database management system that supports graphs, documents, and key/value models. It offers powerful graph traversal, ACID transactions, and a flexible query language (AQL). Pros include multi-model support and performance, while cons may include a less mature ecosystem compared to other options.
TigerGraph: TigerGraph is a distributed graph database platform that delivers real-time deep link analytics. It offers parallel loading, querying, and computation for fast data processing. Pros include speed and scalability, while cons may include limited community support.
JanusGraph: JanusGraph is an open-source, distributed graph database that supports various storage backend options such as Apache Cassandra, HBase, Google Cloud Bigtable, and more. It offers features like ACID transactions, global secondary indexing, and geo-spatial search. Pros include scalability and flexibility, while cons may include complexity in setup and configuration.
Top Alternatives to Amazon Neptune
- Neo4j
Neo4j stores data in nodes connected by directed, typed relationships with properties on both, also known as a Property Graph. It is a high performance graph store with all the features expected of a mature and robust database, like a friendly query language and ACID transactions. ...
- GraphQL
GraphQL is a data query language and runtime designed and used at Facebook to request and deliver data to mobile and web apps since 2012. ...
- OrientDB
It is an open source NoSQL database management system written in Java. It is a Multi-model database, supporting graph, document, key/value, and object models, but the relationships are managed as in graph databases with direct connections between records. ...
- JanusGraph
It is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. It is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time. ...
- Dgraph
Dgraph's goal is to provide Google production level scale and throughput, with low enough latency to be serving real time user queries, over terabytes of structured data. Dgraph supports GraphQL-like query syntax, and responds in JSON and Protocol Buffers over GRPC and HTTP. ...
- MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software. ...
- PostgreSQL
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. ...
- MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding. ...
Amazon Neptune alternatives & related posts
- Cypher – graph query language69
- Great graphdb61
- Open source33
- Rest api31
- High-Performance Native API27
- ACID23
- Easy setup21
- Great support17
- Clustering11
- Hot Backups9
- Great Web Admin UI8
- Powerful, flexible data model7
- Mature7
- Embeddable6
- Easy to Use and Model5
- Highly-available4
- Best Graphdb4
- It's awesome, I wanted to try it2
- Great onboarding process2
- Great query language and built in data browser2
- Used by Crunchbase2
- Comparably slow9
- Can't store a vertex as JSON4
- Doesn't have a managed cloud service at low cost1
related Neo4j posts
Hello Stackshare. I'm currently doing some research on real-time reporting and analytics architectures. We have a use case where 1million+ records of users, 4million+ activities, and messages that we want to report against. The start was to present it directly from MySQL, which didn't go well and puts a heavy load on the database. Anybody can suggest something where we feed the data and can report in realtime? Read some articles about ElasticSearch and Kafka https://medium.com/@D11Engg/building-scalable-real-time-analytics-alerting-and-anomaly-detection-architecture-at-dream11-e20edec91d33 EDIT: also considering Neo4j
Google Maps lets "property owners and their authorized representatives" upload indoor maps, but this appears to lack navigation ("wayfinding").
MappedIn is a platform and has SDKs for building indoor mapping experiences (https://www.mappedin.com/) and ESRI ArcGIS also offers some indoor mapping tools (https://www.esri.com/en-us/arcgis/indoor-gis/overview). Finally, there used to be a company called LocusLabs that is now a part of Atrius and they were often integrated into airlines' apps to provide airport maps with wayfinding (https://atrius.com/solutions/personal-experiences/personal-wayfinder/).
I previously worked at Mapbox and while I believe that it's a great platform for building map-based experiences, they don't have any simple solutions for indoor wayfinding. If I were doing this for fun as a side-project and prioritized saving money over saving time, here is what I would do:
Create a graph-based dataset representing the walking paths around your university, where nodes/vertexes represent the intersections of paths, and edges represent paths (literally paths outside, hallways, short path segments that represent entering rooms). You could store this in a hosted graph-based database like Neo4j, Amazon Neptune , or Azure Cosmos DB (with its Gremlin API) and use built-in "shortest path" queries, or deploy a PostgreSQL service with pgRouting.
Add two properties to each edge: one property for the distance between its nodes (libraries like @turf/helpers will have a distance function if you have the latitude & longitude of each node), and another property estimating the walking time (based on the distance). Once you have these values saved in a graph-based format, you should be able to easily query and find the data representation of paths between two points.
At this point, you'd have the routing problem solved and it would come down to building a UI. Mapbox arguably leads the industry in developer tools for custom map experiences. You could convert your nodes/edges to GeoJSON, then either upload to Mapbox and create a Tileset to visualize the paths, or add the GeoJSON to the map on the fly.
*You might be able to use open source routing tools like OSRM (https://github.com/Project-OSRM/osrm-backend/issues/6257) or Graphhopper (instead of a custom graph database implementation), but it would likely be more involved to maintain these services.
- Schemas defined by the requests made by the user75
- Will replace RESTful interfaces63
- The future of API's62
- The future of databases49
- Self-documenting13
- Get many resources in a single request12
- Query Language6
- Ask for what you need, get exactly that6
- Fetch different resources in one request3
- Type system3
- Evolve your API without versions3
- Ease of client creation2
- GraphiQL2
- Easy setup2
- "Open" document1
- Fast prototyping1
- Supports subscription1
- Standard1
- Good for apps that query at build time. (SSR/Gatsby)1
- 1. Describe your data1
- Better versioning1
- Backed by Facebook1
- Easy to learn1
- Hard to migrate from GraphQL to another technology4
- More code to type.4
- Takes longer to build compared to schemaless.2
- No support for caching1
- All the pros sound like NFT pitches1
- No support for streaming1
- Works just like any other API at runtime1
- N+1 fetch problem1
- No built in security1
related GraphQL posts
I just finished the very first version of my new hobby project: #MovieGeeks. It is a minimalist online movie catalog for you to save the movies you want to see and for rating the movies you already saw. This is just the beginning as I am planning to add more features on the lines of sharing and discovery
For the #BackEnd I decided to use Node.js , GraphQL and MongoDB:
Node.js has a huge community so it will always be a safe choice in terms of libraries and finding solutions to problems you may have
GraphQL because I needed to improve my skills with it and because I was never comfortable with the usual REST approach. I believe GraphQL is a better option as it feels more natural to write apis, it improves the development velocity, by definition it fixes the over-fetching and under-fetching problem that is so common on REST apis, and on top of that, the community is getting bigger and bigger.
MongoDB was my choice for the database as I already have a lot of experience working on it and because, despite of some bad reputation it has acquired in the last months, I still believe it is a powerful database for at least a very long list of use cases such as the one I needed for my website
When I joined NYT there was already broad dissatisfaction with the LAMP (Linux Apache HTTP Server MySQL PHP) Stack and the front end framework, in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember from being outside the company when that was called MIT FIVE when it had launched. And been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?
So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.
React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.
Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.
- Great graphdb4
- Great support2
- Open source2
- Multi-Model/Paradigm1
- ACID1
- Highly-available1
- Performance1
- Embeddable1
- Rest api1
- Unstable4
related OrientDB posts
We have an in-house build experiment management system. We produce samples as input to the next step, which then could produce 1 sample(1-1) and many samples (1 - many). There are many steps like this. So far, we are tracking genealogy (limited tracking) in the MySQL database, which is becoming hard to trace back to the original material or sample(I can give more details if required). So, we are considering a Graph database. I am requesting advice from the experts.
- Is a graph database the right choice, or can we manage with RDBMS?
- If RDBMS, which RDMS, which feature, or which approach could make this manageable or sustainable
- If Graph database(Neo4j, OrientDB, Azure Cosmos DB, Amazon Neptune, ArangoDB), which one is good, and what are the best practices?
I am sorry that this might be a loaded question.
related JanusGraph posts
- Graphql as a query language is nice if you like apollo3
- Easy set up2
- Low learning curve2
- Open Source1
- High Performance1
related Dgraph posts
- Sql800
- Free679
- Easy562
- Widely used528
- Open source490
- High availability180
- Cross-platform support160
- Great community104
- Secure79
- Full-text indexing and searching75
- Fast, open, available26
- Reliable16
- SSL support16
- Robust15
- Enterprise Version9
- Easy to set up on all platforms7
- NoSQL access to JSON data type3
- Relational database1
- Easy, light, scalable1
- Sequel Pro (best SQL GUI)1
- Replica Support1
- Owned by a company with their own agenda16
- Can't roll back schema changes3
related MySQL posts
When I joined NYT there was already broad dissatisfaction with the LAMP (Linux Apache HTTP Server MySQL PHP) Stack and the front end framework, in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember from being outside the company when that was called MIT FIVE when it had launched. And been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?
So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.
React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.
Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.
We've been using PostgreSQL since the very early days of Zulip, but we actually didn't use it from the beginning. Zulip started out as a MySQL project back in 2012, because we'd heard it was a good choice for a startup with a wide community. However, we found that even though we were using the Django ORM for most of our database access, we spent a lot of time fighting with MySQL. Issues ranged from bad collation defaults, to bad query plans which required a lot of manual query tweaks.
We ended up getting so frustrated that we tried out PostgresQL, and the results were fantastic. We didn't have to do any real customization (just some tuning settings for how big a server we had), and all of our most important queries were faster out of the box. As a result, we were able to delete a bunch of custom queries escaping the ORM that we'd written to make the MySQL query planner happy (because postgres just did the right thing automatically).
And then after that, we've just gotten a ton of value out of postgres. We use its excellent built-in full-text search, which has helped us avoid needing to bring in a tool like Elasticsearch, and we've really enjoyed features like its partial indexes, which saved us a lot of work adding unnecessary extra tables to get good performance for things like our "unread messages" and "starred messages" indexes.
I can't recommend it highly enough.
- Relational database764
- High availability510
- Enterprise class database439
- Sql383
- Sql + nosql304
- Great community173
- Easy to setup147
- Heroku131
- Secure by default130
- Postgis113
- Supports Key-Value50
- Great JSON support48
- Cross platform34
- Extensible33
- Replication28
- Triggers26
- Multiversion concurrency control23
- Rollback23
- Open source21
- Heroku Add-on18
- Stable, Simple and Good Performance17
- Powerful15
- Lets be serious, what other SQL DB would you go for?13
- Good documentation11
- Scalable9
- Free8
- Reliable8
- Intelligent optimizer8
- Transactional DDL7
- Modern7
- One stop solution for all things sql no matter the os6
- Relational database with MVCC5
- Faster Development5
- Full-Text Search4
- Developer friendly4
- Excellent source code3
- Free version3
- Great DB for Transactional system or Application3
- Relational datanbase3
- search3
- Open-source3
- Text2
- Full-text2
- Can handle up to petabytes worth of size1
- Composability1
- Multiple procedural languages supported1
- Native0
- Table/index bloatings10
related PostgreSQL posts
Our whole DevOps stack consists of the following tools:
- GitHub (incl. GitHub Pages/Markdown for Documentation, GettingStarted and HowTo's) for collaborative review and code management tool
- Respectively Git as revision control system
- SourceTree as Git GUI
- Visual Studio Code as IDE
- CircleCI for continuous integration (automatize development process)
- Prettier / TSLint / ESLint as code linter
- SonarQube as quality gate
- Docker as container management (incl. Docker Compose for multi-container application management)
- VirtualBox for operating system simulation tests
- Kubernetes as cluster management for docker containers
- Heroku for deploying in test environments
- nginx as web server (preferably used as facade server in production environment)
- SSLMate (using OpenSSL) for certificate management
- Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
- PostgreSQL as preferred database system
- Redis as preferred in-memory database/store (great for caching)
The main reason we have chosen Kubernetes over Docker Swarm is related to the following artifacts:
- Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
- Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
- Functionality: Kubernetes as a complex installation and setup process, but it not as limited as Docker Swarm.
- Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
- Scalability: All-in-one framework for distributed systems.
- Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), huge community among container orchestration tools, it is an open source and modular tool that works with any OS.
Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.
We set ourselves the following criteria for the optimal tool that would do this job: - The data replication must be near real-time, yet it should NOT impact the production database - The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient
Based on the above criteria, we selected the following tools to perform the end to end data replication:
We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using stitch triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.
We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB stitch offers integration with AWS services.
In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.
Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pickup the data payload & mirror the DB changes on to the target Data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy . We deployed this micro-service as AWS Lambda with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda service is dumb-easy.
In the end, we got to implement a highly scalable near realtime Change Data Replication service that "works" and deployed to production in a matter of few days!
- Document-oriented storage828
- No sql593
- Ease of use553
- Fast464
- High performance410
- Free255
- Open source218
- Flexible180
- Replication & high availability145
- Easy to maintain112
- Querying42
- Easy scalability39
- Auto-sharding38
- High availability37
- Map/reduce31
- Document database27
- Easy setup25
- Full index support25
- Reliable16
- Fast in-place updates15
- Agile programming, flexible, fast14
- No database migrations12
- Easy integration with Node.Js8
- Enterprise8
- Enterprise Support6
- Great NoSQL DB5
- Support for many languages through different drivers4
- Schemaless3
- Aggregation Framework3
- Drivers support is good3
- Fast2
- Managed service2
- Easy to Scale2
- Awesome2
- Consistent2
- Good GUI1
- Acid Compliant1
- Very slowly for connected models that require joins6
- Not acid compliant3
- Proprietary query language2
related MongoDB posts
Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.
We set ourselves the following criteria for the optimal tool that would do this job: - The data replication must be near real-time, yet it should NOT impact the production database - The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient
Based on the above criteria, we selected the following tools to perform the end to end data replication:
We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using stitch triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.
We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB stitch offers integration with AWS services.
In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.
Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pickup the data payload & mirror the DB changes on to the target Data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy . We deployed this micro-service as AWS Lambda with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda service is dumb-easy.
In the end, we got to implement a highly scalable near realtime Change Data Replication service that "works" and deployed to production in a matter of few days!
We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.
As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).
When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.