Getting started with Cloud Alert

This article is a quick guide to using Cloud Alert the first time you access the system.

Cloud Alert is a powerful automation platform that consolidates all of your apps, services, and workflows in a single place. It is also extensible, flexible, and built with love for DevOps and ChatOps.

1. How does it work?

The core components of Cloud Alert are the event trigger, event collector, rule engine, ChatOps and Runbook.

Event trigger

Triggers are Cloud Alert constructs that identify incoming events. Rules are written to work with triggers. For example, a generic webhook trigger is registered with Cloud Alert.

Event collector

The event collector works as a listener that receives events from alerting sources (or event triggers) such as Amazon CloudWatch, Azure Monitor, and Datadog.
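As a rough illustration of what a collector does, the sketch below turns a raw webhook payload into one normalized event shape. The `Event` class, the `collect` function, and the payload keys (`name`, `severity`, `message`) are all hypothetical; they are not Cloud Alert's actual data model, nor the real payload formats of any alerting source.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Event:
    """Normalized event shape (hypothetical, for illustration only)."""
    source: str    # which alerting source sent the event
    name: str      # alert or alarm name
    severity: str  # e.g. "critical", "warning", "info"
    message: str   # human-readable description


def collect(source: str, payload: Dict[str, Any]) -> Event:
    """Turn a raw webhook payload into a normalized Event.

    The payload keys read here are assumptions, not the real
    formats of any alerting source.
    """
    return Event(
        source=source,
        name=str(payload.get("name", "unknown")),
        severity=str(payload.get("severity", "info")).lower(),
        message=str(payload.get("message", "")),
    )


# Example: an event arriving through a generic webhook trigger.
event = collect(
    "generic-webhook",
    {"name": "disk-usage", "severity": "CRITICAL", "message": "Disk is 95% full"},
)
print(event)
```

Normalizing early means that rules only need to understand one event shape, regardless of which source produced the event.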

Rule engine

The rule engine provides a flexible way for users to define their own rules via criteria and to generate the corresponding Runbook actions.

When a rule is triggered, Cloud Alert can act as Tier 1 support: it troubleshoots, fixes known problems, and escalates to humans when needed. Whether it is the mundane yet common “when the disk is out of space, clean up the logs”, recovering from a RabbitMQ split-brain, migrating a MySQL master, or automating troubleshooting guides for OpenStack or Cassandra, the lesson from Facebook, LinkedIn, and others is: if you don’t automate, you die.

Cloud Alert supports three action types that can be defined in a rule's settings:
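To make the idea of matching criteria against an event concrete, here is a minimal sketch. It assumes exact-match criteria and first-match-wins ordering; the `Rule` class, the field names, and the action identifiers are hypothetical and do not reflect how Cloud Alert stores or evaluates rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    name: str
    criteria: Dict[str, str]  # event field -> required value (exact match)
    action_type: str          # "runbook", "chatops", or "ticket"
    action: str               # hypothetical action identifier


def matches(rule: Rule, event: Dict[str, str]) -> bool:
    """A rule matches when every criterion equals the event's value."""
    return all(event.get(field) == value for field, value in rule.criteria.items())


def evaluate(rules: List[Rule], event: Dict[str, str]) -> Optional[Rule]:
    """Return the first matching rule, or None if nothing matches."""
    for rule in rules:
        if matches(rule, event):
            return rule
    return None


rules = [
    Rule("disk full", {"name": "disk_full"}, "runbook", "clean_up_logs"),
    Rule("critical alert", {"severity": "critical"}, "chatops", "notify_ops_channel"),
    Rule("everything else", {}, "ticket", "open_ticket"),  # empty criteria match any event
]

matched = evaluate(rules, {"name": "disk_full", "severity": "warning"})
print(matched.action_type, matched.action)
```

Placing the catch-all rule last gives known problems a chance at automated remediation before anything is escalated to people.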

(1) Runbook

For common issues, you can define Runbook actions to remediate them automatically. A Runbook is a workflow made up of multiple actions.

In Cloud Alert, Apache Airflow is used as the Runbook module. For more information, please refer to here.
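The notion of a Runbook as a workflow of multiple actions can be sketched in plain Python. This is deliberately not Airflow code: the step functions, the `run_runbook` helper, and the simulated disk values are hypothetical. In Airflow, each step would roughly correspond to a task in a DAG.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, int]
Step = Tuple[str, Callable[[Context], None]]


def check_disk(ctx: Context) -> None:
    # Simulated reading; a real step would query the host.
    ctx["disk_used_pct"] = 95


def clean_up_logs(ctx: Context) -> None:
    # Simulated cleanup; a real step would remove old log files.
    ctx["disk_used_pct"] = 40


def verify_disk(ctx: Context) -> None:
    # Fail the runbook if the cleanup did not free enough space.
    if ctx["disk_used_pct"] >= 90:
        raise RuntimeError("disk is still almost full, escalate to a human")


def run_runbook(steps: List[Step], ctx: Context) -> List[str]:
    """Run steps in order; an exception stops the workflow."""
    completed: List[str] = []
    for name, action in steps:
        action(ctx)
        completed.append(name)
    return completed


ctx: Context = {}
completed = run_runbook(
    [("check_disk", check_disk), ("clean_up_logs", clean_up_logs), ("verify_disk", verify_disk)],
    ctx,
)
print(completed, ctx)
```

Ending with a verification step lets a failed remediation surface as an error that can be escalated instead of passing silently.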

(2) ChatOps (Collaboration Tools)

For critical issues, users can choose to send a message immediately via a ChatOps channel such as Slack or Skype.

ChatOps brings automation and collaboration together, transforming DevOps teams to get things done better, faster, and with style. Its benefits include:

  • Expose actions via human-friendly aliases
  • Get notifications from rules and workflows
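As a hedged example of a rule notification, the sketch below builds (but does not send) an HTTP request for a chat webhook. Slack incoming webhooks accept a JSON body with a `text` field; the URL, the `build_notification` helper, and the message format are placeholders, not Cloud Alert's actual integration.

```python
import json
import urllib.request

# Placeholder URL (assumption): replace with the webhook URL of your own channel.
WEBHOOK_URL = "https://example.com/hypothetical-chat-webhook"


def build_notification(rule_name: str, message: str, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build a POST request carrying a JSON body with a "text" field."""
    body = json.dumps({"text": f"[{rule_name}] {message}"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_notification("critical alert", "Database is unreachable")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually deliver the message, pass the request to urllib.request.urlopen(request).
```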

(3) Other issues

For other issues, you can log a request in a ticket system such as ServiceNow.

2. How to get started?

To get started with Cloud Alert:

3. Key Features

Dashboard

  • Display the total number of alerts by rule match, classified into 3 categories: Incident, ChatOps, and Runbook
  • Show the 10 most recent actions

Event Source

Rule

Action Logs: show action logs by period (1 hour, 1 day, 1 week, 1 month)

ChatOps: send messages to specific collaboration channels

Runbook: set up and configure runbooks, and match runbooks with Airflow

User

  • Invite users
  • Azure AD integration (SSO SAML)

Role

  • Set up roles
