Open-Source Network Management Tools

Alexei Roudnev alex at relcom.net
Fri Sep 17 17:53:06 UTC 2004


There is another problem with TRAPS:
- when I write monitoring code, I always need 2 messages:
  - CRITICAL
  - REPAIRED

(We have a few monitoring scripts, and each one started out sending only the
CRITICAL message and ended up sending both messages - it is impossible to
work without knowing _whether the condition still exists or not_.)
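
To show why the second message matters, here is a minimal sketch (Python;
the function name and the tuple layout are my own assumptions, not taken
from any of our scripts) that pairs CRITICAL with REPAIRED messages. Only
with both messages can you tell which conditions are still open and how
long the closed ones lasted:

```python
def pair_events(events):
    """Pair CRITICAL and REPAIRED messages per check.

    events: iterable of (timestamp, check_name, kind) sorted by time,
            where kind is "CRITICAL" or "REPAIRED" (hypothetical layout).
    Returns (closed, still_open):
      closed     - list of (check_name, failed_at, repaired_at)
      still_open - dict of check_name -> failed_at
    """
    still_open = {}
    closed = []
    for ts, check, kind in events:
        if kind == "CRITICAL":
            # Keep the time of the first CRITICAL; repeats do not reset it.
            still_open.setdefault(check, ts)
        elif kind == "REPAIRED" and check in still_open:
            closed.append((check, still_open.pop(check), ts))
    return closed, still_open
```

If a source never sends REPAIRED, everything it reports stays in still_open
forever, which is exactly the situation with most traps.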

Unfortunately, neither SYSLOG nor SNMPTRAP has such positive notifications,
which makes them very difficult to use and limits them to a very small set
of really CRITICAL events.

I have no such problem with POLL:
- poll the parameter, draw a chart;
- if the parameter crosses its threshold, a 'SHORT FAILURE' event is raised
(no paging, it just shows the problem);
- if the 'SHORT FAILURE' persists for some time, it is converted into
CRITICAL and an alert is sent;
- when the problem is fixed, a RESTORED message is sent.
(See: the ProactiveNetwork system; many open-source systems. Do not look at
CA! It is a good example of terrible design. BMC is about average.)
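
The poll cycle above can be sketched roughly like this (Python; the class
name, the "value above threshold means failure" rule and the grace period
parameter are assumptions for illustration, not how any particular product
does it):

```python
OK, SHORT_FAILURE, CRITICAL = "OK", "SHORT FAILURE", "CRITICAL"


class PolledCheck:
    """One polled parameter with a threshold and a grace period."""

    def __init__(self, threshold, grace_seconds):
        self.threshold = threshold
        self.grace_seconds = grace_seconds
        self.state = OK
        self.failed_since = None
        self.history = []  # (timestamp, value) samples for the chart

    def poll(self, now, value):
        """Record one sample; return the alert to send, or None."""
        self.history.append((now, value))
        if value <= self.threshold:
            was_critical = self.state == CRITICAL
            self.state, self.failed_since = OK, None
            # RESTORED is sent only if a CRITICAL alert went out before.
            return "RESTORED" if was_critical else None
        if self.state == OK:
            # First bad sample: show the problem, but do not page anyone.
            self.state, self.failed_since = SHORT_FAILURE, now
            return None
        if (self.state == SHORT_FAILURE
                and now - self.failed_since >= self.grace_seconds):
            self.state = CRITICAL
            return "CRITICAL"
        return None
```

A short spike goes OK -> SHORT FAILURE -> OK and pages nobody; only a
failure that outlives the grace period produces CRITICAL, and that one is
always followed by RESTORED.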

As a result, you can always see:
- the history of the parameter (so, if it is disk space, it is easy to see
how much time you have left, for example);
- the history of events (when it failed and when it was restored);
- whether someone else has already worked the problem out.
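
For the disk space case, the "how much time do you have" estimate is simple
arithmetic on the polled history. A rough sketch (Python; the function name
is made up, and it assumes plain linear growth between the first and the
last sample, which is only an approximation):

```python
def seconds_until_full(samples, capacity):
    """Estimate the time left before a disk fills up.

    samples:  list of (timestamp_seconds, used) sorted by time
    capacity: total size, in the same units as 'used'
    Returns seconds counted from the last sample, or None if usage is
    not growing or there are too few samples.
    """
    if len(samples) < 2:
        return None
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0 or u1 <= u0:
        return None
    # remaining space divided by the growth rate (u1 - u0) / (t1 - t0)
    return (capacity - u1) * (t1 - t0) / (u1 - u0)
```

With 50 units used at t=0, 60 units used an hour later and a capacity of
100, the remaining 40 units at 10 units per hour give 4 hours. A trap gives
you nothing to make this calculation from.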

Without it... I receive a message

  ALERT, CRITICAL, server XXX, oid 1.2.3.4.5.6.DELL.RAID.blabla

I do not know where to look (it is impossible to tell) - there is no
parameter associated with this message.
I do not know whether it was a short-lived condition (maybe a disk was being
replaced in the RAID) or whether it still exists (a DISK has failed right
now).
In retrospect, a manager cannot see how fast it was fixed.

All of this makes SNMP TRAPS very inconvenient (not to mention the possible
loss of events).




----- Original Message ----- 
From: "Michael Smith" <mksmith at noanet.net>
To: "Alexei Roudnev" <alex at relcom.net>; <nanog at merit.edu>
Sent: Friday, September 17, 2004 10:11 AM
Subject: RE: Open-Source Network Management Tools



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> -----Original Message-----
> From: Alexei Roudnev [mailto:alex at relcom.net]
> Sent: Friday, September 17, 2004 12:53 AM
> To: Michael Smith; nanog at merit.edu
> Subject: Re: Open-Source Network Management Tools
>
> I have always tried to avoid dealing with SNMP TRAPS, as they are the
> most unreliable and inconvenient way of alerting (unfortunately, they
> cannot be avoided entirely).
> We use 'syslog' (syslog-ng + home-written syslog analyzers +
> commercial software, sometimes) when possible.
>

Unfortunately, SNMP TRAPS are what is available on the SONET
transport side of the network.  There is no useful data to be gotten
from polling.  In addition, the fact that TRAPS are proactive instead
of reactive means I am immediately aware of network events
whereas I might miss something with a poll.

In addition, we have dry contact closures on these devices that TRAP
only, no polling.  Thankfully, the number of these events is small
enough that syslog functions quite well.

Syslog has not been up to the task of working with the sheer volume
of TRAPS generated when there is a significant event on the optical
network.  Sometimes we see the notification but not the resolution,
sometimes we see all but the last line of a TRAP message, and
sometimes we get nothing.

Thanks,

Mike

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0.3

iQA/AwUBQUscOZzgx7Y34AxGEQK3oQCgg6bP3O4Pt5GyOPXsi+1tSvLrt2AAnjqs
BeYnYocvvNjP1RqqfH2dq+HT
=JrJP
-----END PGP SIGNATURE-----



