What is OpsGenie and what are its top alternatives?
Top Alternatives to OpsGenie
- PagerDuty
PagerDuty is an alarm aggregation and dispatching service for system administrators and support teams. It collects alerts from your monitoring tools, gives you an overall view of all of your monitoring alarms, and alerts an on-duty engineer if there's a problem. ...
- VictorOps
VictorOps is a real-time incident management platform that combines the power of people and data to embolden DevOps teams so they can handle incidents as they occur and prepare for the next one. ...
- Jira Service Desk
It lets you receive, track, manage, and resolve requests from your team's customers. Built for IT, support, and internal business teams, it empowers them to track, prioritize, and resolve service requests, all in one place. ...
- Sentry
Sentry’s Application Monitoring platform helps developers see performance issues, fix errors faster, and optimize their code health. ...
- Healthchecks.io
Healthchecks.io is a monitoring service for your cron jobs, background services and scheduled tasks. It works by listening for HTTP "pings" from your services. You can set up various alert methods: email, Slack, Telegram, PagerDuty, etc. ...
- Bigpanda
Bigpanda helps you manage and respond to ops incidents faster. All your alerts: organized, assignable, trackable, snoozeable, and updated in real-time. ...
- Cronitor
Monitoring systems are often complex and require a strong sysadmin background to properly configure and maintain. Cronitor replaces all this with a simple service that anyone can set up. Receive email/sms notifications if your jobs don't run, run too slow, or finish too quickly. ...
- Squadcast
It is an end-to-end incident response platform that helps tech teams adopt SRE best practices to maximize service reliability, accelerate innovation velocity and deliver outstanding customer experiences. ...
OpsGenie alternatives & related posts
PagerDuty
- Just works (54)
- Easy configuration (23)
- Awesome alerting hub (14)
- Fantastic alert aggregation and on-call management (11)
- User-customizable alerting modes (9)
- Awesome tool for alerting and monitoring. Love it (4)
- Most reliable out of the three and it isn't even close (3)
- Expensive (7)
- Ugly UI (3)
related PagerDuty posts
Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage the routing of any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had time to invest in using features like Anomaly Detection. We’ve started using Honeycomb for some targeted debugging of complex production issues, and we are liking what we’ve seen. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible.
We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running PostgreSQL; this is available for analytics and dashboard creation through Looker.
Data science and engineering teams at Lyft maintain several big data pipelines that serve as the foundation for various types of analysis throughout the business.
Apache Airflow sits at the center of this big data infrastructure, allowing users to “programmatically author, schedule, and monitor data pipelines.” Airflow is an open source tool, and “Lyft is the very first Airflow adopter in production since the project was open sourced around three years ago.”
There are several key components of the architecture. A web UI allows users to view the status of their queries, along with an audit trail of any modifications to the query. A metadata database stores things like job status and task instance status. A multi-process scheduler handles job requests and triggers the executor to execute those tasks.
Airflow supports several executors, though Lyft uses CeleryExecutor to scale task execution in production. Airflow is deployed to three Amazon Auto Scaling Groups, each associated with a Celery queue.
Audit logs supplied to the web UI are powered by the existing Airflow audit logs as well as Flask signal.
Datadog, StatsD, Grafana, and PagerDuty are all used to monitor the Airflow system.
- The transmogrifier is a game changer (7)
- Great Team, Great Product (6)
- Free tier (5)
- Much better than ANY of the alternatives. Todd is GREAT (3)
- Great tiered escalation management (3)
- Android app with Wear integration (2)
- On-call routing and the timeline is brilliant (2)
- Awesome Team always updating (1)
- Nice UI (1)
related VictorOps posts
- Integration with Jira and Confluence (1)
related Jira Service Desk posts
Sentry
- Consolidates similar errors and makes resolution easy (235)
- Email Notifications (121)
- Open source (108)
- Slack integration (84)
- GitHub integration (71)
- Easy (48)
- User-friendly interface (44)
- The most important tool we use in production (28)
- HipChat integration (18)
- Heroku Integration (17)
- Good documentation (15)
- Free tier (14)
- Easy setup (9)
- Self-hosted (9)
- Reliable (7)
- Provides context, and great stack trace (6)
- Feedback form on error pages (4)
- Love it baby (4)
- Easy Integration (3)
- GitLab integration (3)
- Filter by custom tags (3)
- Super user friendly (3)
- Captures local variables at each frame in backtraces (3)
- Performance measurements (1)
- Confusing UI (12)
- Bundle size (2)
related Sentry posts
For my portfolio websites and my personal open-source projects, I had started exclusively using React and JavaScript, so I needed a way to track any errors that were happening for my users that I didn't uncover during my personal UAT.
I had narrowed it down to two tools, LogRocket and Sentry (I also tried Bugsnag, but it did not make the final two). Before I get into this, I want to say that both of these tools are amazing, and whichever you choose will suit your needs well.
I first decided to go with LogRocket. The fact that they had a recorded screen capture of what the user was doing when the bug happened was amazing: I could go back and rewatch what the user did to replicate that error, which was fantastic. It was also very easy to set up and get going. They had options for React and Redux.js, so you can track all your Redux.js actions. I had a fairly large Redux.js store, and this ended up being an issue: it killed the processing power on my machine, and Chrome ended up using 2-4 GB of RAM, so I quickly disabled the Redux.js option.
After using LogRocket for a month or so, I decided to switch to Sentry. I noticed that Sentry was open source and everyone was talking about it, so I thought I might as well give it a test drive. Setting it up was so easy; I had everything up and running within seconds. It also gives you the option to wrap your React components in an ErrorBoundary to get more specific errors. The simplicity of Sentry was a breath of fresh air; it allowed me to find the bug that was shown to the user and fix it very simply. The UI for Sentry is beautiful and really clean to look at, and their emails are also just perfect.
I have decided to stick with Sentry for the long run. I tested pretty much all the JS error loggers, and I find Sentry the best.
- Can be self-hosted (3)
- Great value (2)
- Free tier (2)
- Easy to understand (2)
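As noted in the Healthchecks.io description, the service works by listening for HTTP "pings" from your jobs. A minimal sketch of what such a ping might look like from a Python job follows. The check URL is a placeholder (a real one is issued per check), the helper names `ping_url` and `run_with_ping` are hypothetical, and the `/fail` suffix for reporting a failure is an assumption about the ping URL scheme that should be confirmed against the service's documentation.

```python
import urllib.request

# Placeholder check URL (assumption): replace with the URL issued for your check.
PING_URL = "https://hc-ping.com/your-check-uuid"


def ping_url(base, status="success"):
    """Return the URL to request for a given job outcome."""
    base = base.rstrip("/")
    if status == "success":
        return base
    if status == "fail":
        # Assumed convention: appending /fail reports a failed run.
        return f"{base}/fail"
    raise ValueError(f"unknown status: {status!r}")


def _http_get(url):
    """Send a plain HTTP GET and discard the response body."""
    with urllib.request.urlopen(url, timeout=10):
        pass


def run_with_ping(job, base=PING_URL, send=_http_get):
    """Run job(), then ping success or failure depending on the outcome."""
    try:
        result = job()
    except Exception:
        send(ping_url(base, "fail"))
        raise
    send(ping_url(base, "success"))
    return result
```

A scheduled task would call something like `run_with_ping(nightly_backup)`, where `nightly_backup` is your own function. The `send` parameter exists only so the network call can be swapped out in tests.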
related Healthchecks.io posts
- User interface, easy setup, analytics, integrations (7)
- Consolidates many systems into one (6)
- Correlation engine (2)
- Quick setup (1)
related Bigpanda posts
- Quick and helpful support (2)
- Simple and direct (1)
- Pricey (0)
related Cronitor posts
- Easy Configuration (2)
- Intuitive UI / UX (2)
- Lots of Integrations (2)
related Squadcast posts
I'm currently on PagerDuty, but I'm about to add enough users to go out of the starter tier, which will dramatically increase my license cost. PagerDuty is, in my experience, quite clunky, and I'm looking for alternatives. Squadcast is one I've found, and another is xMatters. Between the three, I'm currently leaning towards xMatters, but I'd like to know what people suggest.