2 Services
EVA provides two management-protocol-independent services: the basic Event and Alarm service and the Log Control service. The basic Event and Alarm service provides clients with an API for registering and sending events and alarms. The Log Control service provides a mechanism for controlling generic logs. Also included is a specialization of the generic log function for logging events and alarms.
Each service provides client functions that can be used from applications in the system to, for example, send alarms. There is also an API that management applications can use to monitor and control the system. This API can be extended for specific management protocols, such as SNMP or CORBA.
2.1 Basic Event and Alarm service
This service contains functions for the client API to EVA. EVA is a distributed global application, which means that clients can access the EVA functionality from any node.
Clients can register and send events and alarms. Management applications can subscribe to events and alarms, and control how they are treated.
An event is a notification sent from the NE (network element) to a management application. An event is uniquely identified by its name. A special form of an event is an alarm. An alarm represents a fault in the system that needs to be reported to the manager. An example of an alarm could be equipment_on_fire. When an alarm is sent, it becomes active, and is stored in an active alarm list. When the application that sent the alarm notices that the fault that caused the alarm is no longer present, it clears the alarm. When an alarm is cleared, it is deleted from the active alarm list, and a clear_alarm event is generated by EVA. Each fault may give rise to several alarms, possibly with different severities. However, there can be only one active alarm for each fault at any time. For example, two alarms may be associated with disk space usage, disk_80_percent_filled and disk_90_percent_filled. These two alarms represent the same fault, but only one of them can be active at a time. An active alarm is identified by its fault_id. In contrast to alarms, ordinary events do not represent faults, and are not stored in the active alarm list.
The basic EVA server is a global server to which all events and alarms are sent. The server updates its tables (e.g. the active alarm list), and sends the event or alarm to the alarm_handler process that runs on the same node as the global server. alarm_handler is a gen_event process defined in the SASL application.
Before a client can send an event or alarm, the name of the event must be registered in EVA. To register an event, a client calls register_event/2. The parameters of this function are the name of the event and whether the event should be logged by default or not. A manager can decide to change this value later. To register an alarm, a client calls register_alarm/4. The parameters of this function are the name and logging parameters as for events, and the class and default severity of the alarm.
EVA stores the definitions of events and alarms in the Mnesia tables eventTable and alarmTable respectively. Since an alarm is a special form of an event, each alarm is present in both of these tables. The active alarm list is stored in the Mnesia table alarm. The records for all these tables are defined in the header file eva.hrl, available in the include directory in the distribution.
2.1.1 Event Definition Table
All registered events are stored in the eventTable. It has the following attributes:
name
log
generated
The event is uniquely identified by its name, which is an atom.
The log attribute is a boolean flag that tells whether this event should be stored in some log when it is generated. This attribute is writable.
The generated attribute is a counter that counts how many times the event has been generated.
2.1.2 Alarm Definition Table
The alarmTable extends the eventTable, and has the following attributes:
name
class
severity
The alarm is uniquely identified by its name, which is an atom. Note that each alarm is present in the eventTable as well.
The class attribute categorizes the alarm, and is defined when the alarm is registered. The classes are as defined in X.733, ITU Alarm Reporting Function:
communications. An alarm of this class is principally associated with the procedures or processes required to convey information from one point to another.
qos. An alarm of this class is principally associated with a degradation in the quality of service.
processing. An alarm of this class is principally associated with a software or processing fault.
equipment. An alarm of this class is principally associated with an equipment fault.
environmental. An alarm of this class is principally associated with a condition relating to an enclosure in which equipment resides.
The severity parameter defines five severity levels, which provide an indication of how it is perceived that the capability of the managed object has been affected. The severity levels that represent service affecting conditions are, ordered from most severe to least severe, critical, major, minor and warning. The levels used are as defined in X.733, ITU Alarm Reporting Function:
indeterminate. The Indeterminate severity level indicates that the severity level cannot be determined.
critical. The Critical severity level indicates that a service affecting condition has occurred and an immediate corrective action is required. Such a severity can be reported, for example, when a managed object becomes totally out of service and its capability must be restored.
major. The Major severity level indicates that a service affecting condition has developed and an urgent corrective action is required. Such a severity can be reported, for example, when there is a severe degradation in the capability of the managed object and its full capability must be restored.
minor. The Minor severity level indicates the existence of a non-service affecting fault condition and that corrective action should be taken in order to prevent a more serious (for example, service affecting) fault. Such a severity can be reported, for example, when the detected alarm condition is not currently degrading the capacity of the managed object.
warning. The Warning severity level indicates the detection of a potential or impending service affecting fault, before any significant effects have been felt. Action should be taken to further diagnose (if necessary) and correct the problem in order to prevent it from becoming a more serious service affecting fault.
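The ordering of the four service affecting severities can be expressed as a small comparison helper. The following is a minimal sketch only: the module severity_sketch and its functions rank/1 and most_severe/1 are hypothetical names introduced for illustration and are not part of EVA. The indeterminate level is left unranked on purpose, since its severity cannot be determined.

```erlang
-module(severity_sketch).
-export([rank/1, most_severe/1]).

%% Hypothetical helper, not part of EVA.
%% Maps each service affecting severity to an integer, so that a
%% higher rank means a more severe level. 'indeterminate' has no
%% clause: calling rank(indeterminate) fails with function_clause.
rank(critical) -> 4;
rank(major)    -> 3;
rank(minor)    -> 2;
rank(warning)  -> 1.

%% Returns the most severe level in a non-empty list of ranked
%% severities.
most_severe([First | Rest]) ->
    lists:foldl(fun(S, Acc) ->
                        case rank(S) > rank(Acc) of
                            true  -> S;
                            false -> Acc
                        end
                end, First, Rest).
```

A helper of this kind could be used, for example, by a management application that wants to display the worst severity among a set of alarms.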
When an alarm is cleared, a clear_alarm event is generated. This event clears the alarm with the fault_id contained in the event. It is not required that the clearing of previously reported alarms is reported. Therefore, a managing system cannot assume that the absence of a clear_alarm event for a fault means that the condition that caused the generation of previous alarms is still present. Managed object definers shall state if, and under which conditions, the clear_alarm event is used.
2.1.3 Active Alarm List
The active alarm list is stored in the ordered Mnesia table alarm. The corresponding record is sent to the alarm_handler when an alarm is sent. It has the following read-only attributes:
index
fault_id
name
sender
cause
severity
time
extra
A row in the active alarm list is uniquely identified by its fault_id. However, to make the table ordered, the alarms use the integer index as a key into the table. For each new alarm, EVA allocates a new index that is greater than the index of all other active alarms.
The name is the name of the corresponding alarm type, defined in alarmTable.
sender is a term that uniquely identifies the resource that generated the alarm.
cause describes the probable cause of the alarm.
severity is the perceived severity of the alarm.
time is the UTC time the alarm was generated.
extra is any extra information describing the alarm.
2.1.4 Event
When an event is generated, the event record is sent to alarm_handler. It has the following attributes:
name
sender
time
extra
The name is the name of the corresponding event type, defined in eventTable.
sender is a term that uniquely identifies the resource that generated the event.
time is the UTC time the event was generated.
extra is any extra information describing the event.
2.1.5 Example
As an example of how to register and send events and alarms, consider the following code:
    %%%-----------------------------------------------------------------
    %%% Resource code
    %%%-----------------------------------------------------------------
    reg() ->
        eva:register_event(boardRemoved, true),
        eva:register_event(boardInserted, false),
        eva:register_alarm(boardFailure, true, equipment, minor).

    remove_board(No) ->
        eva:send_event(boardRemoved, {board, No}, []).

    insert_board(No, BoardName, BoardType) ->
        eva:send_event(boardInserted, {board, No}, {BoardName, BoardType}).

    board_on_fire(No) ->
        FaultId = eva:get_fault_id(),
        %% Cause = fire, ExtraParams = []
        eva:send_alarm(boardFailure, FaultId, {board, No}, fire, []),
        FaultId.

Two events and one alarm are defined. Board removal is an event that is logged by default, and board insertion is an event that is not logged by default. The alarm boardFailure is a minor alarm of class equipment that is logged by default.
When the application detects that board N is on fire, board_on_fire(N) is called. This function is responsible for sending the alarm. It gets a new fault identifier for the fault, and calls eva:send_alarm/5, pointing out the faulty board (N), and suggesting that the probable cause for the equipment trouble is fire.
The board_on_fire function returns the fault identifier for the new alarm. This fault identifier can be used at a later time in a call to eva:clear_alarm(FaultId) to clear the alarm.
2.2 Log Control service
The Log Control service contains functions for monitoring logs, and functions for transferring logs to remote hosts, e.g. management stations. The main purpose of the Log Control service is to provide one entity through which all logs in the system can be controlled by a management station. Regardless of the type of log, all logs are controlled in a similar fashion.
Clients can register their logs in the log server. Management applications can control the logs, and transfer the logs to a remote host.
2.2.1 Log monitoring
This service uses a log server that monitors all logs in the system. Each log uses the standard module disk_log for the actual logging.
Each log has an administrative and an operational status, each of which can be either up or down. If the operational status is up, the log is working, and if it is down, the log does not work. The administrative status is writable, and reflects the desired operational status. Normally they are both the same. If the administrative status is set to up, the operational status will be up as well. However, if the log for some reason does not work, e.g. if the disk partition is full, the operational status will be down. When the operational status is down, no events are logged in the log.
2.2.1.1 Alarms
The Tlog service defines two EVA alarms: log_file_error and log_wrap_too_often.
log_file_error. This alarm is generated if a file error occurs when an item is logged. Default severity is critical. The cause for this alarm can be any Reason as returned from file:write in case of error. The alarm is cleared if the file system starts working again. For example, the alarm can be generated if the partition is full, and cleared when space is available.
log_wrap_too_often. This alarm is generated when the log wraps more often than the wrap time. Default severity is major. The cause for this alarm is undefined. The alarm is cleared if the log wraps within the wrap time, the next time it wraps.
2.2.1.2 Example
The following is an example of code that creates a log to be controlled by the generic Log Control function:
    start() ->
        disk_log:open([{name, "ex_log"}, {file, "ex_log/ex_log.LOG"},
                       {type, wrap}, {size, {10000, 4}}]),
        log:open("ex_log", ex_log_type, 3600).

    test() ->
        %% Log an item
        disk_log:log("ex_log", {1, "log this"}),
        %% Set the administrative status of the log to 'down'
        log:set_admin_status("ex_log", down),
        %% Try to log - this one won't be logged
        disk_log:log("ex_log", {2, "won't be logged"}),
        Logs1 = log:get_logs(),
        %% Set the administrative status of the log to 'up'
        log:set_admin_status("ex_log", up),
        %% Log an item
        disk_log:log("ex_log", {3, "log this"}),
        Logged = disk_log:chunk("ex_log", start),
        {Logs1, Logged}.

2.2.2 Log transfer
It is possible to transfer a log to a remote host. When the log is transferred, the log may be filtered, and the log records may be formatted.
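The filter and format steps applied during a transfer can be sketched as a single pass over the log records. This is an illustration only: the module transfer_sketch and the function prepare/3 are hypothetical names, and the sketch does not show how the Log Control service reads the log files or sends the result to the remote host.

```erlang
-module(transfer_sketch).
-export([prepare/3]).

%% Hypothetical helper, not part of the Log Control service.
%% prepare(Records, Filter, Format) -> iolist()
%%   Records - log records as read from the log
%%   Filter  - fun(Record) -> boolean(), selects records to transfer
%%   Format  - fun(Record) -> iolist(), formats one selected record
%% Keeps only the records accepted by Filter and formats each of
%% them, preserving the original order.
prepare(Records, Filter, Format) ->
    [Format(R) || R <- Records, Filter(R)].
```

For example, with records of the form {N, Text}, a filter that accepts odd N, and a formatter built on io_lib:format/2, the result is an iolist that can be written to one single file. Passing a formatter that returns the record unchanged corresponds to transferring the log as is.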
As the logs are implemented as disk_log logs, each log consists of several log files. When the log is transferred, it is written to one single file on the remote host. When disk_log is used, the log records are normally not formatted when they are stored in the log, in order to increase log performance. However, a manager will probably need the log in a human-readable format. Thus, when the log is being transferred, each log record may be formatted in a log-specific way. Of course, to further increase performance, the log can be transferred as is, leaving it to the manager to format the log off-line.
2.3 EVA log service
The EVA log service uses the generic Log Control service to implement log functionality for events and alarms defined in EVA.
In the rest of this description, the term event refers to both events and alarms as defined in EVA.
This log functionality supports logging of events from EVA. It uses the module disk_log for logging of events. There can be several event logs active at the same time. It is possible to create new event logs dynamically, either from within an application, or from a management system. Each log uses a filter function to decide whether an event should be stored in the log or not.
There is a concept of a default log. The default log is used to log any event that has the log flag in eventTable set to true, but that no log is currently able to store (or for which no other log is defined). The usage of the default log is optional.
For example, suppose that we want to define an alarm log that logs all alarms in the system. We can do this with the following code:
    -module(alarm_log).
    -export([alarm_filter/1, make_alarm_log/0]).

    %% The alarm record is defined in eva.hrl (include path assumed).
    -include_lib("eva/include/eva.hrl").

    %% Accept alarm records only; all other items are rejected.
    alarm_filter(Item) when is_record(Item, alarm) -> true;
    alarm_filter(_) -> false.

    make_alarm_log() ->
        disk_log:open([{name, "alarm_log"}, {format, internal},
                       {type, wrap}, {size, {10000, 10}}]),
        eva_log:open("alarm_log", {alarm_log, alarm_filter, []}, 36000).

If we set the administrative status of this log to down, and an alarm that should be logged according to its definition in the eventTable is then generated, the alarm is stored in the default log instead of "alarm_log" (provided there are no other logs that are defined to log the alarm).
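The fallback to the default log described above can be modelled as a small routing function. This is a sketch under stated assumptions, not the EVA log implementation: the module default_log_sketch, the function targets/3, the atom default_log, and the representation of a log as a {Name, OperStatus, FilterFun} tuple are all hypothetical.

```erlang
-module(default_log_sketch).
-export([targets/3]).

%% Hypothetical model, not part of EVA.
%% targets(Event, LogFlag, Logs) -> [LogName]
%%   Event   - the event or alarm to be logged
%%   LogFlag - the value of the event's log flag (true | false)
%%   Logs    - list of {Name, OperStatus, FilterFun}, where
%%             OperStatus is up | down and FilterFun returns a
%%             boolean for a given event
%% An event whose log flag is false is not logged at all. Otherwise
%% it goes to every working log whose filter accepts it, or to the
%% default log if there is no such log.
targets(_Event, false, _Logs) ->
    [];
targets(Event, true, Logs) ->
    case [Name || {Name, up, Filter} <- Logs, Filter(Event)] of
        []    -> [default_log];
        Names -> Names
    end.
```

In this model, taking "alarm_log" down while it is the only log that accepts alarms causes an alarm with the log flag set to be routed to default_log.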