Open NerveCenter: Downstream Alarm Suppression - Understanding the Technical Details - Perl Subroutines
The new downstream alarm suppression behavior model uses several Perl subroutines to store parent-child relationships and maintain node statuses. This section describes those subroutines: the ICMP error handler, the subroutines that set a node's status, TestParentStatus, and TestParentSetNode.
The ICMP_ERROR transition calls this Perl subroutine to evaluate the error and determine whether it indicates that the node is unreachable. If the ICMP error is Port Unreachable (type 3, code 3), the node is up and reachable. All other ICMP errors are assumed to indicate an unreachable node. This assumption may be incorrect depending on the behavior of your network. To exclude other ICMP errors that do not indicate an unreachable node, modify this Perl subroutine.
my $Type = VbValue( 0 );
my $Code = VbValue( 1 );
if( $Type == 3 && $Code == 3 )
{
    FireTrigger( "SS_PortUnreach" );
}
else
{
    # Modify this else to eliminate other types of
    # ICMP errors that are not indicative of an
    # unreachable node. The assumption is that if
    # SS_IcmpError is fired, we are being told, by
    # the network, that the node is unreachable.
    FireTrigger( "SS_IcmpError" );
}
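As a minimal sketch of the modification described above, the type and code test can be moved into a lookup table so that more ICMP errors can be treated as evidence that the node is reachable. The helper name ClassifyIcmpError and the %ReachableErrors table are hypothetical. The Protocol Unreachable entry (type 3, code 2) is only an illustrative assumption, so confirm which errors your devices actually return before adding entries. The function returns the trigger name instead of firing it, so it can be run outside NerveCenter.

```perl
use strict;
use warnings;

# Hypothetical table of ICMP type/code pairs that are generated by the
# destination host itself, which implies that the node is reachable.
my %ReachableErrors = (
    "3/3" => 1,    # Destination Unreachable: Port Unreachable (from the source)
    "3/2" => 1,    # Destination Unreachable: Protocol Unreachable (assumed addition)
);

# Return the name of the trigger to fire for an ICMP error.
sub ClassifyIcmpError {
    my ( $Type, $Code ) = @_;
    return $ReachableErrors{"$Type/$Code"} ? "SS_PortUnreach" : "SS_IcmpError";
}

# Inside the NerveCenter subroutine, this call would replace the if/else:
#   FireTrigger( ClassifyIcmpError( VbValue( 0 ), VbValue( 1 ) ) );
print ClassifyIcmpError( 3, 1 ), "\n";    # type 3, code 1 is Host Unreachable
```

Keeping the pairs in one table means that a new exception is a one-line change and the branch logic stays untouched.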
For the DwnStrmSnmpStatus and DwnStrmIcmpStatus alarms, all state transitions, except transitions from Error to Testing, call one of four Perl subroutines. Each subroutine sets the node's status to Testing, Down, Unreachable, or Up.
These Perl subroutines keep the node status current so that the node's children can accurately update their own statuses based on it.
# Set the node status to Testing
my $Return;
$Return = NC::SetNodeStatus($NodeName,"Testing");
# If $Return == 0, the operation failed

# Set the node status to Down
my $Return;
$Return = NC::SetNodeStatus($NodeName,"Down");
# If $Return == 0, the operation failed

# Set the node status to Unreachable
my $Return;
$Return = NC::SetNodeStatus($NodeName,"Unreachable");
# If $Return == 0, the operation failed

# Set the node status to Up
my $Return;
$Return = NC::SetNodeStatus($NodeName,"Up");
# If $Return == 0, the operation failed
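The subroutines above store the return value but do not act on it. The following sketch shows one way the failure case could be reported. The wrapper SetStatusChecked is hypothetical. The NC::SetNodeStatus defined here is only a stand-in that mimics the convention stated above (0 means the operation failed), so that the example runs outside NerveCenter; it is not the product's implementation.

```perl
use strict;
use warnings;

# Stand-in status store and NC::SetNodeStatus, for illustration only.
my %StatusTable;
{
    package NC;
    sub SetNodeStatus {
        my ( $Node, $Status ) = @_;
        return 0 unless defined $Node && length $Node;    # simulate a failure
        $StatusTable{$Node} = $Status;
        return 1;
    }
}

# Hypothetical wrapper: set the status and warn if the call failed.
sub SetStatusChecked {
    my ( $NodeName, $Status ) = @_;
    my $Return = NC::SetNodeStatus( $NodeName, $Status );
    warn "Could not set node status to $Status\n" if $Return == 0;
    return $Return;
}

SetStatusChecked( "router1", "Down" );
```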
For the DwnStrmSnmpStatus and DwnStrmIcmpStatus alarms, if a node is in a Testing state, the ERROR trigger is fired every time the node is polled and doesn't respond. Each resulting ERROR transition calls the TestParentStatus Perl subroutine.
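The TestParentStatus Perl subroutine tests the status of each parent node and records its decision in TriggerFlag. Only parents whose status was updated at or after the node's own last update are treated as current. If any current parent is up, TriggerFlag is set to Down. If every parent is current and is down or unreachable, TriggerFlag is set to Unreachable. If no parent is up and at least one parent is still testing or has not been updated, TriggerFlag is set to Testing. A node with no parents is assumed to be down.
If TriggerFlag is set to Testing, TestParentStatus does nothing, because it needs more information to make an accurate decision. Otherwise, if the alarm should be in another state, TestParentStatus fires the appropriate trigger to transition the node into that state.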
The code for this subroutine is:
# The purpose of this subroutine is to test the parent
# node status and fire the appropriate trigger to take the
# alarm to either down or unreachable. You must make sure
# that all parents are being monitored with the status alarms.
my $NodeUpdateTime; # Last time node status was updated
my $LastNodeStatus; # Last node status
my @Parents = (); # Array of parents
my $ParentUpdateTime; # Last time parent node status was updated
my $ParentStatus; # Last parent status
my $ParentNotUpdated = 0; # Remember if we have any parents not updated
#Define all triggers that can be fired
# Get the last node status and update time for this node
($LastNodeStatus,$NodeUpdateTime) = NC::GetNodeStatus($NodeName);
# Get the array of parents for this node
@Parents = NC::GetParents($NodeName);
# Test each parent, if ANY are ok, we assume the node
# is reachable. Parents update time must be past the
# last time the node was updated or we can't assume the
($ParentStatus,$ParentUpdateTime) = NC::GetNodeStatus($Parent);
if( $ParentUpdateTime >= $NodeUpdateTime )
# Using TriggerFlag to store name of trigger to be fired. If any
# parent is found to be up, then the flag will be set to down. If
# all parents are down or unreachable, then the flag will be set
# to unreachable. If no parents are down and at least one parent
# is testing, set flag to testing. Otherwise, it will remain not
# set and we will update the node's current status and time. Testing
# handles the case where one parent is testing and another is
# unreachable. We need to make sure we do not mark the node as
# unreachable until the parent node in testing goes to some final
# state because that state could be agent down which is treated
if( ($ParentStatus eq "Down" || $ParentStatus eq "Unreachable") && $TriggerFlag eq "NotSet" )
elsif( $ParentStatus eq "Up" )
elsif( $ParentStatus eq "Testing" && $TriggerFlag ne "Down" )
# Remember that we have at least one parent that hasn't been updated.
# If no parents, assume node is down.
# If I have at least one parent not updated and I do not have
# any Up parents, Set TriggerFlag to testing.
if( $ParentNotUpdated && $TriggerFlag ne "Down" )
if( $TriggerFlag ne "Testing" )
# Fire trigger if node's status should change.
if( $TriggerFlag ne $LastNodeStatus )
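The decision rules described in the comments above can be sketched as a standalone function. This is a reconstruction under stated assumptions, not the shipped subroutine. The name DecideTriggerFlag is hypothetical, parents are passed in as [ status, update time ] pairs instead of being read with NC::GetParents and NC::GetNodeStatus, and the status is compared against "Unreachable", the spelling passed to NC::SetNodeStatus.

```perl
use strict;
use warnings;

# Decide the TriggerFlag value from the parents' statuses.
#   $NodeUpdateTime - last time the node's own status was updated
#   @Parents        - one [ $Status, $UpdateTime ] pair per parent
sub DecideTriggerFlag {
    my ( $NodeUpdateTime, @Parents ) = @_;
    my $TriggerFlag      = "NotSet";
    my $ParentNotUpdated = 0;

    # A node without parents is assumed to be down.
    return "Down" unless @Parents;

    foreach my $Parent (@Parents) {
        my ( $ParentStatus, $ParentUpdateTime ) = @$Parent;
        if ( $ParentUpdateTime >= $NodeUpdateTime ) {
            if ( ( $ParentStatus eq "Down" || $ParentStatus eq "Unreachable" )
                && $TriggerFlag eq "NotSet" )
            {
                $TriggerFlag = "Unreachable";
            }
            elsif ( $ParentStatus eq "Up" ) {
                $TriggerFlag = "Down";    # parent is up, so the node itself is down
            }
            elsif ( $ParentStatus eq "Testing" && $TriggerFlag ne "Down" ) {
                $TriggerFlag = "Testing";
            }
        }
        else {
            # This parent's status is older than the node's status.
            $ParentNotUpdated = 1;
        }
    }

    # A stale parent and no up parent: more information is needed.
    $TriggerFlag = "Testing" if $ParentNotUpdated && $TriggerFlag ne "Down";
    return $TriggerFlag;
}

# One parent unreachable and one still testing: wait for a final state.
print DecideTriggerFlag( 100, [ "Unreachable", 120 ], [ "Testing", 130 ] ), "\n";    # prints "Testing"
```

Returning the flag instead of firing a trigger keeps the sketch independent of the alarm's trigger names, which are not shown in this section.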
For the DwnStrmSnmpStatus and DwnStrmIcmpStatus alarms, if an alarm is in a DeviceDown or Unreachable state, the ERROR trigger is fired every time the node is polled and doesn't respond. Each resulting ERROR transition calls the TestParentSetNode Perl subroutine.
The TestParentSetNode Perl subroutine tests the status of each parent node and records its decision in TriggerFlag: Down if any parent with a current status is up, Unreachable if every parent is current and is down or unreachable, and Testing if no parent is up and at least one parent is still testing or has not been updated since the node's last update. A node with no parents is assumed to be down.
If TriggerFlag is set to Testing, TestParentSetNode does nothing, because it needs more information to make an accurate decision. If the alarm should be in another state, TestParentSetNode fires the appropriate trigger to transition the alarm into that state. If the alarm is already in the correct state, TestParentSetNode just refreshes the node update time so that the node's children can accurately update their statuses based on the node's status.
The code for this subroutine is:
# The purpose of this subroutine is to test the parent
# node status and, if the node is not in a terminal state
# but should be, fire a trigger to make it so. If the node
# is already in the correct state, just refresh the node
# update time. You must make sure that all parents are
# being monitored with the status alarms.
my $NodeUpdateTime; # Last time node status was updated
my $LastNodeStatus; # Last node status
my @Parents = (); # Array of parents
my $ParentUpdateTime; # Last time parent node status was updated
my $ParentStatus; # Last parent status
my $ParentNotUpdated = 0; # Remember if we have any parents not updated
#Define all triggers that can be fired
# Get the last node status and update time for this node
($LastNodeStatus,$NodeUpdateTime) = NC::GetNodeStatus($NodeName);
# Get the array of parents for this node
@Parents = NC::GetParents($NodeName);
# Test each parent, if any are ok, we assume the node
# is reachable. Parents update time must be past the
# last time the node was updated or we can't assume the
($ParentStatus,$ParentUpdateTime) = NC::GetNodeStatus($Parent);
if( $ParentUpdateTime >= $NodeUpdateTime )
# Using TriggerFlag to store name of trigger to be fired. If any
# parent is found to be up, then the flag will be set to down. If
# all parents are down or unreachable, then the flag will be set
# to unreachable. If no parents are down and at least one parent
# is testing, set flag to testing. Otherwise, it will remain not
# set and we will update the node's current status and time. Testing
# handles the case where one parent is testing and another is
# unreachable. We need to make sure we do not mark the node as
# unreachable until the parent node in testing goes to some final
# state because that state could be agent down which is treated
if( ($ParentStatus eq "Down" || $ParentStatus eq "Unreachable") && $TriggerFlag eq "NotSet" )
elsif( $ParentStatus eq "Up" )
elsif( $ParentStatus eq "Testing" && $TriggerFlag ne "Down" )
# Remember that we have at least one parent that hasn't been updated.
# Node does not have parents so assume down
# If I have at least one parent not updated and I do not have
# any up parents, Set TriggerFlag to testing.
if( $ParentNotUpdated && $TriggerFlag ne "Down" )
if( $TriggerFlag ne "Testing" )
# Fire trigger if node's status should change. Otherwise
# refresh the time for the node's current state.
if( $TriggerFlag ne $LastNodeStatus )
NC::SetNodeStatus($NodeName,$LastNodeStatus);
}
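The final step of TestParentSetNode, choosing between doing nothing, firing a trigger, and refreshing the update time, can be sketched as a small pure function. The name ChooseAction and the returned labels are hypothetical. In the real subroutine, the "fire" case fires a trigger and the "refresh" case calls NC::SetNodeStatus($NodeName,$LastNodeStatus).

```perl
use strict;
use warnings;

# Hypothetical helper: decide what to do once TriggerFlag is known.
# Returns "wait", "fire", or "refresh".
sub ChooseAction {
    my ( $TriggerFlag, $LastNodeStatus ) = @_;

    # Testing means more information is needed, so do nothing.
    return "wait" if $TriggerFlag eq "Testing";

    # The alarm should be in a different state: fire a trigger.
    return "fire" if $TriggerFlag ne $LastNodeStatus;

    # The alarm is already in the correct state: refresh the update time.
    return "refresh";
}

print ChooseAction( "Down", "Down" ), "\n";    # prints "refresh"
```

The refresh matters because children only trust a parent status whose update time is at or after their own, so a node that stays down must keep its timestamp current.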
This file was last updated on 10 October © 2002