
Overview

The primary task of Fault Management (FM) is to collect and process events. Events are created as a result of network activity, user operations, client activity, equipment faults, and so on. A working network can generate thousands of events every minute. The FM module collects and classifies them, assigns priorities, correlates events, and automatically determines the root cause of a failure. The system supports the full life cycle of events, ensuring that no important event is left unnoticed or unhandled.

Terminology

  • Event Class - The meaning of an event.
  • Event Category - A zone of interest.
  • Event Classification - The process of analyzing an event and assigning it an Event Class.
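  • Event Correlation - The process of relating an event to other known events in the Event Window.
  • Root-cause analysis - The process of determining which event caused other events.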

Stages of event processing

[Figure: event_processing]

Collection

Event collection is performed by the Activator. Events are generated by a Managed Object and transmitted to the Activator encapsulated in a Message. Supported message protocols are:

  • SNMP Trap
  • Syslog

The Activator checks that the message arrived from a valid Managed Object, collects the message, converts it to a protocol-independent message format, and transmits it to the SAE via the SAE RPC stream.

Events are represented as two-column tables. The left column holds the key (parameter name) and the right column holds the value. Each parameter occupies one row.

Examples:

source   | syslog
message  | -- MARK --
severity | 6
facility | 5

or

source                | SNMP Trap
1.3.6.1.6.3.1.1.4.1.0 | 1.3.6.1.6.3.1.1.5.1
1.3.6.1.2.1.1.3.0     | 7
1.3.6.1.6.3.1.1.4.3.0 | 1.3.6.1.4.1.4413.2.10
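
The conversion to a protocol-independent key/value form might look roughly like the sketch below. It is an illustration only and not the Activator's actual code: the function names `normalize_syslog` and `normalize_snmp_trap` are hypothetical, and the sketch assumes the syslog message starts with a standard `<PRI>` header, where PRI = facility * 8 + severity.

```python
import re

# Hypothetical helpers for illustration; these names are assumptions,
# not the Activator's real API.
_PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def normalize_syslog(raw):
    """Turn a raw syslog line into a flat key/value event."""
    match = _PRI_RE.match(raw)
    if match is None:
        raise ValueError("missing syslog PRI header")
    pri = int(match.group(1))
    if pri > 191:
        raise ValueError("PRI out of range")
    # Syslog encodes PRI as facility * 8 + severity.
    facility, severity = divmod(pri, 8)
    return {
        "source": "syslog",
        "message": match.group(2).strip(),
        "severity": str(severity),
        "facility": str(facility),
    }


def normalize_snmp_trap(varbinds):
    """Turn (oid, value) pairs from a trap into a flat key/value event."""
    event = {"source": "SNMP Trap"}
    for oid, value in varbinds:
        event[oid] = str(value)
    return event
```

Under these assumptions, `normalize_syslog("<46>-- MARK --")` yields the first example table above (severity 6, facility 5), since 46 = 5 * 8 + 6.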
 

Storing

Protocol-independent messages are received by the SAE. The SAE performs no additional message processing and stores messages in the database as soon as possible.

Classification

The task of Event Classification is to determine the Event Class and retrieve all Class Variables from the given message. Event Classification establishes the meaning of the event. All further event processing is based on the Event Class and the fetched Class Variables.

Events are classified by passing them through a set of Event Classification Rules. An Event Classification Rule consists of a set of pairs of regular expressions. Rules are evaluated in order of preference.
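
As a rough illustration of rule evaluation, the sketch below assumes that each pair holds one regular expression for the key and one for the value, that a rule matches when every pair matches some row of the event, and that Class Variables come from named groups in the value expression. The rule set, the class names, and the functions `match_rule` and `classify` are hypothetical and are not taken from noc-classifier.

```python
import re

# Hypothetical rules, listed in order of preference:
# (event class name, [(key_regex, value_regex), ...])
RULES = [
    ("Syslog Mark", [
        (r"^source$", r"^syslog$"),
        (r"^message$", r"^-- MARK --$"),
    ]),
    ("Cold Start", [
        (r"^source$", r"^SNMP Trap$"),
        (r"^1\.3\.6\.1\.6\.3\.1\.1\.4\.1\.0$", r"^1\.3\.6\.1\.6\.3\.1\.1\.5\.1$"),
    ]),
    ("Link Down", [
        (r"^message$", r"Interface (?P<interface>\S+), changed state to down"),
    ]),
]


def match_rule(patterns, event):
    """Return extracted variables if every pair matches a row, else None."""
    variables = {}
    for key_re, value_re in patterns:
        for key, value in event.items():
            if re.search(key_re, key):
                found = re.search(value_re, value)
                if found:
                    variables.update(found.groupdict())
                    break
        else:
            # No row of the event satisfied this pair.
            return None
    return variables


def classify(event, rules=RULES):
    """Return (event class, class variables) for the first matching rule."""
    for event_class, patterns in rules:
        variables = match_rule(patterns, event)
        if variables is not None:
            return event_class, variables
    return None, {}
```

With these assumed rules, an event whose message is "Interface Gi0/1, changed state to down" would be classified as "Link Down" with the variable `interface` set to "Gi0/1", while an event matching no rule stays unclassified.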

Event classification is performed by noc-classifier (the Classifier daemon). The Classifier periodically queries the database for unclassified events and classifies each event in a separate transaction.

Classification results are written back to the database.

An overview of the event classification process is given in the flowchart below:

[Figure: event_classification]

Correlation

Correlation is performed by the noc-correlator (Correlator) daemon. The Correlator uses the Event Window (the last 10 minutes by default) and Event Correlation Rules to perform correlation.

During the correlation process, the following decisions are made:

  • Should the event be silently discarded?
  • Should the event be automatically closed or dropped?
  • Does the event automatically close a known event in the Event Window?
  • Does the event repeat an already known event in the Event Window?
  • Is the event caused by an already known event in the Event Window (it has a root cause)?
  • Does the event cause an already known event in the Event Window (it is the root cause)?

Decisions are made considering the Event Variables extracted by the Correlator. Root-cause decisions are not permanent: an event may become a root cause, or be assigned a root cause, as new events arrive.
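
Two of these decisions, repeat detection and root-cause lookup within the Event Window, can be sketched as follows. This is a simplified illustration under stated assumptions, not the Correlator's implementation: the `Event` and `EventWindow` classes, the matching on (event class, managed object), and the `caused_by` parameter are all hypothetical stand-ins for real Event Correlation Rules.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Event:
    timestamp: datetime
    event_class: str
    managed_object: str
    repeats: int = 1
    root_cause: Optional["Event"] = None


class EventWindow:
    """Hypothetical sliding window over recent events (10 minutes by default)."""

    def __init__(self, size=timedelta(minutes=10)):
        self.size = size
        self.events = deque()

    def expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and now - self.events[0].timestamp > self.size:
            self.events.popleft()

    def find(self, event_class, managed_object):
        # Search from the newest event to the oldest.
        for known in reversed(self.events):
            if (known.event_class, known.managed_object) == (event_class, managed_object):
                return known
        return None

    def add(self, event, caused_by=None):
        """Register an event; return the known event it repeats, if any."""
        self.expire(event.timestamp)
        known = self.find(event.event_class, event.managed_object)
        if known is not None:
            known.repeats += 1  # repeat of an already known event
            return known
        if caused_by is not None:
            # Root cause is looked up in the window; it may be None.
            event.root_cause = self.find(caused_by, event.managed_object)
        self.events.append(event)
        return event
```

Because `root_cause` is an ordinary attribute, it can be reassigned when later events arrive, which mirrors the non-permanent nature of root-cause decisions described above.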

Message Status

  • Unclassified
  • Active
  • Closed