AWS CloudTrail vs ELK: What are the differences?
AWS CloudTrail is a service that logs and monitors AWS API activity, providing visibility into user actions and changes made in the AWS environment. ELK, on the other hand, is an open-source stack that combines Elasticsearch, Logstash, and Kibana for log management, analysis, and visualization. Here are the key differences between AWS CloudTrail and ELK:
Purpose and Scope: AWS CloudTrail is a fully managed service by AWS that focuses on auditing and tracking activities within an AWS environment. It records API calls made by users, services, or other AWS resources and delivers log files to Amazon S3 or CloudWatch Logs for further analysis. In contrast, ELK is an open-source stack that combines Elasticsearch, Logstash, and Kibana to form a robust log management and analytics platform. ELK is more general-purpose and can be used to collect, process, and analyze logs from various sources, not limited to AWS.
Deployment and Infrastructure: AWS CloudTrail is a managed service, which means users don't need to set up and maintain the infrastructure for log collection and storage. It is seamlessly integrated with other AWS services, simplifying setup and configuration. On the other hand, ELK is a self-hosted solution, requiring users to deploy and manage their Elasticsearch, Logstash, and Kibana instances. While ELK offers more control over the infrastructure and data, it also entails additional maintenance efforts and operational responsibilities.
Data Collection and Integration: CloudTrail is tightly integrated with AWS services and automatically records API activities for supported services within the AWS ecosystem. It provides detailed information about changes to resources, security events, and user actions in the AWS environment. ELK, being a general-purpose log management solution, can collect logs from a wide range of sources, including AWS, on-premises systems, applications, and third-party services.
Log Analysis and Visualization: AWS CloudTrail primarily focuses on auditing and tracking AWS API activity. While it provides basic log search and filtering capabilities, its primary use case is for compliance and governance purposes. On the other hand, ELK is a versatile log analysis and visualization platform. Elasticsearch, as the core component, offers powerful full-text search and indexing capabilities, enabling fast and efficient log querying. Kibana provides a user-friendly interface for visualizing log data through customizable dashboards and charts, making it easier for users to gain insights and detect patterns in the data.
In summary, AWS CloudTrail is a managed service specialized in auditing and monitoring AWS API activity, whereas ELK is an open-source stack used for general log management and analytics across various data sources.
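As an illustration of the kind of querying Elasticsearch enables once logs are ingested, here is a hypothetical query DSL body that finds security-group changes in CloudTrail-style documents (the field names `eventSource` and `eventName` assume CloudTrail events have been indexed as-is; adjust to your mapping):

```json
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "eventSource": "ec2.amazonaws.com" } },
        { "wildcard": { "eventName": "*SecurityGroup*" } }
      ]
    }
  }
}
```

A saved search like this can back a Kibana visualization or alert, which is the kind of flexibility CloudTrail's built-in search does not aim to provide.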
We would like to detect unusual config changes that can potentially cause a production outage.
For example: a new SecurityGroup allow/deny rule, an AuthZ policy change, a secret key/certificate rotation, or an IP subnet add/drop. The problem is that the source of each of these activities is different, i.e., AWS IAM, Amazon EC2, internal prod services, the Envoy sidecar, etc.
Which technology would be best suited to detect only important (IMP) events (not all activity) from various sources across all workloads running on AWS, and also Splunk Cloud?
For continuous monitoring and detecting unusual configuration changes, I would suggest you look into AWS Config.
AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. Here is a list of supported AWS resource types and resource relationships with AWS Config: https://docs.aws.amazon.com/config/latest/developerguide/resource-config-reference.html
Also, as of November 2019, AWS Config launches support for third-party resources. You can now publish the configuration of third-party resources, such as GitHub repositories, Microsoft Active Directory resources, or any on-premises server, into AWS Config using the new API. Here is more detail: https://docs.aws.amazon.com/config/latest/developerguide/customresources.html
If you have multiple AWS accounts in your organization and want to detect changes across them: https://docs.aws.amazon.com/config/latest/developerguide/aggregate-data.html
Lastly, if you already use Splunk Cloud in your enterprise and are looking for a consolidated view, AWS Config is supported by Splunk Cloud as per their documentation too: https://aws.amazon.com/marketplace/pp/Splunk-Inc-Splunk-Cloud/B06XK299KV
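To act on only the important changes rather than all activity, change events can be filtered down before they reach your alerting pipeline (with EventBridge, this filtering is done by an event pattern; here the matching logic is sketched in plain Python). The event shapes and the `IMPORTANT_EVENTS` list below are simplified illustrations, not full AWS payloads:

```python
# Sketch: filter CloudTrail-style events down to "important" config changes.
# The event dicts are simplified stand-ins for real AWS event payloads.

IMPORTANT_EVENTS = {
    "AuthorizeSecurityGroupIngress",
    "RevokeSecurityGroupIngress",
    "PutRolePolicy",
    "RotateSecret",
}

def is_important(event: dict) -> bool:
    """Return True if the event names an operation we consider risky."""
    return event.get("detail", {}).get("eventName") in IMPORTANT_EVENTS

sample = {
    "source": "aws.ec2",
    "detail": {"eventName": "AuthorizeSecurityGroupIngress"},
}
boring = {"source": "aws.ec2", "detail": {"eventName": "DescribeInstances"}}

print(is_important(sample))  # True
print(is_important(boring))  # False
```

In a real deployment the equivalent filter would live in an EventBridge rule or a Config rule, so only matching events fan out to SNS/Slack/Splunk.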
While it won't detect events as they happen, a good stopgap would be to define your infrastructure configuration using Terraform. You can then periodically run the Terraform config against your environment and alert if there are any changes.
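Terraform makes this periodic check easy to script: `terraform plan -detailed-exitcode` documents three exit codes (0 = no changes, 1 = error, 2 = changes present). A rough Python sketch, assuming Terraform is installed and initialised in `workdir` (`check_drift` is not invoked here, and the alerting hook is left to you):

```python
import subprocess

def interpret_exit_code(code: int) -> str:
    """Map `terraform plan -detailed-exitcode` exit codes to a drift status."""
    return {0: "no-drift", 1: "error", 2: "drift-detected"}.get(code, "unknown")

def check_drift(workdir: str) -> str:
    # Requires a real, initialised Terraform working directory.
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
    )
    return interpret_exit_code(result.returncode)

print(interpret_exit_code(2))  # drift-detected
```

Run on a schedule (cron, CI, or Lambda), anything other than "no-drift" becomes an alert.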
Consider using a combination of Netflix Security Monkey and Amazon GuardDuty.
You can achieve automated detection and alerting, as well as automated recovery based on policies with these tools.
For instance, you could detect SecurityGroup rule changes that allow unrestricted egress from EC2 instances and then revert those changes automatically.
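The detection half of that example can be sketched in pure Python; the rule dictionaries below are simplified stand-ins for the full EC2 `IpPermissions` structure, and the automated revert step (a boto3 `revoke_security_group_egress` call) is omitted:

```python
# Sketch: flag security-group rules that allow unrestricted (0.0.0.0/0) traffic.
# Rule dicts are simplified versions of the EC2 IpPermissions shape.

def unrestricted_rules(rules: list[dict]) -> list[dict]:
    """Return rules whose CIDR ranges include the whole internet."""
    return [
        r for r in rules
        if any(ip.get("CidrIp") == "0.0.0.0/0" for ip in r.get("IpRanges", []))
    ]

rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
    {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]
flagged = unrestricted_rules(rules)
print(len(flagged))  # 1
```

Tools like Security Monkey run this kind of policy check continuously and can feed the flagged rules into an auto-remediation step.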
It's unclear from your post whether you want to detect events within the Splunk Cloud infrastructure or if you want to detect events indicated in data going to Splunk using the Splunk capabilities. If the latter, then Splunk has extremely rich capabilities in their query language and integrated alerting functions. With Splunk you can also run arbitrary Python scripts in response to certain events, so what you can't analyze and alert on with native functionality or plugins, you could write code to achieve.
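For the latter case, a Splunk alert over ingested CloudTrail data can be as simple as a scheduled search. A hypothetical SPL sketch (the index and sourcetype names are assumptions; adjust them to your ingestion setup):

```
index=aws sourcetype=aws:cloudtrail
    eventName IN ("AuthorizeSecurityGroupIngress", "PutRolePolicy", "RotateSecret")
| stats count by eventName, userIdentity.arn
```

Saved as an alert, this fires only on the handful of event names you care about rather than on all activity.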
Well, there are clear advantages to using either tool; it all boils down to what exactly you are trying to achieve, i.e., do you want proactive monitoring or do you want to debug an incident/issue. Splunk is definitely superior in terms of proactively monitoring your logs for unusual events, but getting the CloudTrail logs across to Splunk requires some not-so-straightforward setup (Splunk has a blueprint for this setup which uses AWS Kinesis/Firehose). CloudTrail, on the other hand, is available out of the box from AWS, and the setup is quite simple and straightforward. But analysing the logs could require you to set up Glue crawlers, and you might have to use AWS Athena to run SQL-like queries.
Refer: https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
In my personal experience, the cost/effort involved in setting up Splunk is not worth it for smaller workloads, whereas an AWS CloudTrail/Glue/Athena setup would be less expensive (comparatively).
Alternatively, you could look at something like Sumo Logic, which has better integration with CloudTrail as opposed to Splunk. Hope that helps.
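Once the CloudTrail table is defined in Athena (the linked guide walks through the `CREATE TABLE` step), picking out just the important events is a short query. A sketch, assuming a table named `cloudtrail_logs` that follows that guide's schema:

```sql
SELECT eventtime, eventname, useridentity.arn
FROM cloudtrail_logs
WHERE eventname IN ('AuthorizeSecurityGroupIngress',
                    'PutRolePolicy',
                    'RotateSecret')
ORDER BY eventtime DESC
LIMIT 100;
```

You pay per query against S3, so for occasional audits this is usually far cheaper than keeping a Splunk pipeline warm.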
I'd recommend using CloudTrail; it helped me a lot. But depending on your situation, I'd recommend building a custom solution (like the AWS amazon-ssm-agent) that makes an API call on configuration change and logs the events in Grafana or Kibana.
Pros of AWS CloudTrail
- Very easy setup
- Good integrations with 3rd party tools
- Very powerful
- Backup to S3
Pros of ELK
- Open source
- Can run locally
- Good for startups with monetary limitations
- If the external network goes down, you aren't left without logging
- Easy to set up
- JSON log support
- Live logging
Cons of AWS CloudTrail
Cons of ELK
- Elasticsearch is a resource hog
- Logstash configuration is a pain
- Bad for startups with personal limitations