This article is a quick guide to using Cloud Alert the first time you access the system.
Cloud Alert is a powerful automation platform that consolidates all of your apps, services and workflows in a single place. It’s also extensible, flexible, and built with love for DevOps and ChatOps.
1. How does it work?
The core components of Cloud Alert are the event trigger, event collector, rule engine, ChatOps and Runbook.
Triggers are Cloud Alert constructs that identify incoming events. Rules are written to work with triggers. For example, a generic webhook trigger is registered with Cloud Alert.
The event collector works as a listener that receives events from alerting sources (the event triggers) such as Amazon CloudWatch, Azure Monitor, Datadog and so on.
The rule engine gives users a flexible way to define their own rules via criteria and to generate the corresponding Runbook actions.
When a rule is triggered, Cloud Alert can act as Tier 1 support: it troubleshoots, fixes known problems, and escalates to humans when needed. Whether it is a simple yet common case such as “when the disk is out of space, clean up the logs”, recovering from a RabbitMQ split-brain, migrating a MySQL master, or automating troubleshooting guides for OpenStack or Cassandra, the lesson from Facebook, LinkedIn and others is: if you don’t automate, you die.
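To make the collector's role concrete, here is a minimal sketch of a listener that normalizes payloads from different sources into one common event shape. It is an illustration only: the Event class, the collect function, and the payload keys are hypothetical and are not the real formats sent by Amazon CloudWatch or Datadog, nor Cloud Alert's actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Event:
    """Common shape handed on to the rule engine (hypothetical)."""
    source: str
    name: str
    severity: str
    raw: Dict[str, Any]


# Hypothetical per-source parsers. The payload keys are simplified
# stand-ins, not the real formats sent by these services.
def parse_cloudwatch(payload: Dict[str, Any]) -> Event:
    return Event("cloudwatch", payload["alarm"], payload.get("level", "info"), payload)


def parse_datadog(payload: Dict[str, Any]) -> Event:
    return Event("datadog", payload["title"], payload.get("priority", "info"), payload)


PARSERS: Dict[str, Callable[[Dict[str, Any]], Event]] = {
    "cloudwatch": parse_cloudwatch,
    "datadog": parse_datadog,
}


def collect(source: str, payload: Dict[str, Any]) -> Event:
    """Normalize one incoming payload into an Event."""
    parser = PARSERS.get(source)
    if parser is None:
        # Assumed fallback, in the spirit of a generic webhook trigger:
        # keep the raw payload and fill the fields on a best-effort basis.
        return Event(source, str(payload.get("name", "unknown")), "info", payload)
    return parser(payload)
```

With one common event shape, rules can be written once instead of once per alerting source.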
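As an illustration of criteria-based rules, the sketch below matches an event against a list of rules and returns the first one whose criteria all hold. The Criterion and Rule classes, the operator names, and the event fields are assumptions made for this example; Cloud Alert's own rule format may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Hypothetical comparison operators a criterion may use.
OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in str(actual),
    "gt": lambda actual, expected: float(actual) > float(expected),
}


@dataclass
class Criterion:
    key: str    # event field to inspect
    op: str     # one of the OPERATORS keys
    value: Any  # value to compare against


@dataclass
class Rule:
    name: str
    criteria: List[Criterion]
    action: str  # e.g. "runbook", "chatops" or "incident"

    def matches(self, event: Dict[str, Any]) -> bool:
        """A rule matches only if every criterion holds for the event."""
        for c in self.criteria:
            if c.key not in event:
                return False
            if not OPERATORS[c.op](event[c.key], c.value):
                return False
        return True


def first_match(rules: List[Rule], event: Dict[str, Any]) -> Optional[Rule]:
    """Return the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.matches(event):
            return rule
    return None
```

For example, a rule with the criteria metric eq "disk_used_pct" and value gt 90 would match an event reporting 95% disk usage and could be tied to a log clean-up Runbook.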
Cloud Alert supports three action types that can be assigned to a rule in the Rule settings:
(1) Runbook
For common issues, you can define Runbook actions for automated remediation. A Runbook is a workflow made up of multiple actions.
In Cloud Alert, Apache Airflow is used as the Runbook module. For more information, please refer to here.
(2) ChatOps (Collaboration Tools)
For critical issues, users can choose to send a message immediately via a ChatOps channel such as Slack or Skype.
ChatOps brings automation and collaboration together, transforming DevOps teams to get things done better, faster, and with style. Its benefits include:
- Expose actions via human-friendly aliases
- Get notifications from rules and workflows
(3) Other issues
For other issues, you can log a request in a ticket system such as ServiceNow.
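The three action types above can be pictured as a simple dispatch table. The following is a sketch under stated assumptions: the handler names and the alert dictionary are hypothetical, and the handlers only return a description instead of calling Airflow, Slack, or ServiceNow.

```python
from typing import Any, Callable, Dict

Alert = Dict[str, Any]


def run_runbook(alert: Alert) -> str:
    # Placeholder: a real handler would start a Runbook workflow.
    return f"runbook: started workflow for {alert['name']}"


def send_chatops(alert: Alert) -> str:
    # Placeholder: a real handler would post to a collaboration channel.
    return f"chatops: posted '{alert['name']}' to channel"


def open_ticket(alert: Alert) -> str:
    # Placeholder: a real handler would log a request in a ticket system.
    return f"incident: logged ticket for {alert['name']}"


HANDLERS: Dict[str, Callable[[Alert], str]] = {
    "runbook": run_runbook,
    "chatops": send_chatops,
    "incident": open_ticket,
}


def dispatch(action_type: str, alert: Alert) -> str:
    """Route an alert to its handler.

    Falling back to a ticket for unknown action types is an assumption
    of this sketch, not documented Cloud Alert behavior.
    """
    handler = HANDLERS.get(action_type, open_ticket)
    return handler(alert)
```

Keeping the mapping in one table means a new action type only needs a new handler and one new entry.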
2. How to get started?
To get started with Cloud Alert:
- Prepare use cases & scenarios
- Configure the event sources (Amazon EventBridge, Datadog)
- Set up the ChatOps channel (MS Teams, Slack) in the Integration settings
- Define rules and set the associated actions
- View the Action log and respond to critical issues
3. Key Features
- Display the total number of alerts matched by rules, classified into three categories: Incident, ChatOps, and Runbook
- Show the 10 most recent actions
- Event Source settings: configure how the event source is displayed, by primary field or matching field.
- Alert Source: configure how specific sources trigger alerts in Cloud Alert.
- How to set up a rule: set up the rule criteria and the associated trigger action
- How to configure the rule criteria: define the criteria used to match rules
- Action Logs: show action logs by period (1h, 1 day, 1 week, 1 month)
- ChatOps: send messages to specific collaboration channels
- Runbook: set up and configure Runbooks, and match them with Airflow
- Invite users
- Azure AD integration (SSO SAML)
- Set up roles
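The Action Logs period filter and the dashboard figures listed above (counts per category, 10 most recent actions) can be sketched as follows. The log record fields ("time", "category") are hypothetical, and treating one month as 30 days is an assumption of this example.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple

LogEntry = Dict[str, Any]

# Period labels as they appear in the feature list. "1 month" is
# approximated as 30 days here, which is an assumption.
PERIODS = {
    "1h": timedelta(hours=1),
    "1 day": timedelta(days=1),
    "1 week": timedelta(weeks=1),
    "1 month": timedelta(days=30),
}


def logs_in_period(logs: List[LogEntry], period: str, now: datetime) -> List[LogEntry]:
    """Keep only the entries that fall within the chosen period before `now`."""
    cutoff = now - PERIODS[period]
    return [entry for entry in logs if entry["time"] >= cutoff]


def summarize(logs: List[LogEntry], top: int = 10) -> Tuple[Counter, List[LogEntry]]:
    """Count entries per category and return the `top` most recent entries."""
    counts = Counter(entry["category"] for entry in logs)
    recent = sorted(logs, key=lambda entry: entry["time"], reverse=True)[:top]
    return counts, recent
```

A dashboard could call logs_in_period first and pass the result to summarize, so the counts and the recent-actions list always reflect the same time window.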