Thursday, August 12, 2010

A Mini-MRTG On Router

Chalk Talk July
Helping to solve the age-old problem "The network is slow"
During my time as a network administrator, I cannot tell you how often someone has told me “the network is slow.” To most users the network is a big magic box: traffic is injected at one end and mysteriously appears at the other. So whenever an application or service has a performance issue, the network gets the blame. As the network administrator, you are then obligated to prove or disprove that the infrastructure is performing properly. Anyone who has been unfortunate enough to be in this situation understands what a painstaking process it is to capture, correlate, analyze, scrutinize, and justify your findings.

What if you could direct them to a site that had real-time network/application performance statistics? Using Embedded Event Manager (EEM) and Tcl (Tool Command Language) you have the capability to create a Web page that can display the current statistical information gathered from IP Service Level Agreement (IPSLA) on how a specific application is behaving. This may not alleviate all the troubleshooting situations, but it can certainly be a good start.

Let’s consider the following situation, where a user at a remote location is experiencing application slowness. In this example, the client is accessing a TFTP server at the central data center and is noticing slow performance. Although we are using a TFTP application, this could be any application or service running on your network.

The following network diagram will be used:

We begin by configuring the “IP-SLA Responder” router as follows from configuration mode:

Configure the IP-SLA responder:
ip sla responder

To provide adequate Quality of Service (QoS) for this application, we’ll create a policy and apply it to the serial interface. This policy is very rudimentary and is used as an example only:
class-map match-any APP
match access-group name APP
policy-map POL-MAP
class APP
bandwidth percent 20
class class-default
ip access-list extended APP
permit udp any eq tftp any
interface Serial0/0/0
service-policy output POL-MAP

Next, we’ll configure the “IP-SLA Sender” router as follows from configuration mode:

Configure the IP-SLA sender to match the traffic type of the application. Using a udp-echo on port 69, directed to the loopback address of the “IP-SLA Responder” router will simulate application traffic:

ip sla 10
udp-echo 69
ip sla schedule 10 life forever start-time now

Create a QoS policy and apply it to the outgoing interface:
class-map match-any APP
match access-group name APP
policy-map POL-MAP
class APP
bandwidth percent 20
class class-default
ip access-list extended APP
permit udp any any eq tftp
interface Serial0/0/0
service-policy output POL-MAP

We’ll use the following script to collect IP-SLA information every 60 seconds and store that information in a file on the router.

Using CRON, the script will run once every minute.
::cisco::eem::event_register_timer cron name add-dr cron_entry "* * * * *" maxrun 59
namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

Prepare to collect IP-SLA information from the router.
if {[catch {cli_open} RESULT]} {
    error $RESULT $errorInfo
} else {
    array set cli1 $RESULT
}
if {[catch {cli_exec $cli1(fd) "en"} RESULT]} {
    error $RESULT $errorInfo
}

Collect the SLA statistics and store the result in the variable “RTT”.
if {[catch {cli_exec $cli1(fd) "show ip sla statistics 10"} RESULT]} {
    error $RESULT $errorInfo
} else {
    set RTT $RESULT
}
if {[catch {cli_close $cli1(fd) $cli1(tty_id)} RESULT]} {
    error $RESULT $errorInfo
}

If the string length is less than 10 characters, it's not a valid sample.
if {[string length $RTT] > 10} {
# locate the RTT value
set BEGIN [expr [string first "Latest RTT: " $RTT] + 12]
set END [expr [string first "milliseconds" $RTT $BEGIN] - 2]

If the END value is less than 0, the RTT is unknown; in that case, set the value to 0.
if {$END < 0} {
    set RTT_Data 0
} else {
    # store the valid data to the variable
    set RTT_Data [string range $RTT $BEGIN $END]
}

Locate the time of the last sample.
set BEGIN [expr [string first "start time: " $RTT] + 12]
set END [expr [string first "." $RTT $BEGIN] - 1]

If the END value is less than 0, the time is unknown; in that case, set the value to 0. Otherwise, store the valid data in the variable “CLOCK_Data”.
if {$END < 0} {
    set CLOCK_Data 0
} else {
    # store the valid data to the variable
    set CLOCK_Data [string range $RTT $BEGIN $END]
}

Format the string with name and value.
set XML_Data " \n"

If the file doesn't exist, create it, then open it for reading.
if {[catch {set FILE [open flash:/TCL/Data.xml r]} err]} {
    # "w" creates the file if it is missing
    set FILE [open flash:/TCL/Data.xml w]
    close $FILE
    set FILE [open flash:/TCL/Data.xml r]
}

Write the data to a variable called TEMP_Data.
set TEMP_Data [read $FILE]
close $FILE
set TEMP_Data [concat $TEMP_Data $XML_Data]

Delete the file.
file delete -force flash:/TCL/Data.xml

Open the file for writing.
set FILE [open flash:/TCL/Data.xml w]

Each sample occupies 4 list elements, and we are only interested in one hour's worth of information (60 samples), or 240 elements.
if {[llength $TEMP_Data] > 240} {
    # keep only the newest 240 elements
    set TEMP_Data [lrange $TEMP_Data [expr {[llength $TEMP_Data] - 240}] end]
}

Store the information to the file and close it.
puts $FILE $TEMP_Data
close $FILE
# close the valid-sample check (string length > 10) opened earlier
}
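
The string-index arithmetic in the collector can also be expressed with regular expressions. The following is a minimal sketch of that idea, not the script used in this post: the proc names parse_sla_sample and append_window are hypothetical, and the sample text is copied from the “show ip sla statistics 10” output shown later in this article. It avoids newer Tcl features on the assumption that the router's Tcl interpreter is an older release.

```tcl
# Parse the two fields the collector needs from "show ip sla statistics"
# output. Returns a list {rtt time}; each value stays 0 when not found.
proc parse_sla_sample {output} {
    set rtt 0
    set time 0
    # "Latest RTT: 99 milliseconds" -> 99
    regexp {Latest RTT: ([0-9]+) milliseconds} $output match rtt
    # "start time: 22:58:17.031 ..." -> 22:58:17 (fraction dropped)
    regexp {start time: ([0-9:]+)\.} $output match time
    return [list $rtt $time]
}

# Append one entry (a list of elements) to the stored data and keep only
# the newest $max elements, mirroring the 240-element window.
proc append_window {data entry max} {
    set data [concat $data $entry]
    if {[llength $data] > $max} {
        set data [lrange $data [expr {[llength $data] - $max}] end]
    }
    return $data
}

# Sample text taken from the troubleshooting output in this article.
set sample "Latest RTT: 99 milliseconds
Latest operation start time: 22:58:17.031 SUMMER Tue Jun 22 2010"

set fields [parse_sla_sample $sample]
puts "RTT=[lindex $fields 0] time=[lindex $fields 1]"

# A line without a millisecond value leaves both fields at 0.
puts [parse_sla_sample "Latest RTT: NoConnection/Busy/Timeout"]

# With a window of 4 elements, the four oldest elements are dropped.
puts [append_window {a b c d} {e f g h} 4]
```

Because regexp leaves its match variables untouched when nothing matches, the defaults of 0 play the same role as the “END less than 0” checks in the original script.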

The information has been collected and stored to a file on the router. We now need to create another script that will allow a user to easily view the information. To leverage some of the previous work, we’ll use the web server script from the book “Tcl Scripting in Cisco IOS” as the foundation, but we’ll have to write some code to display the information that we are interested in seeing:

Open the Data.xml file for reading and store the information in a variable called “TEMP_Data”.
set FILE [open flash:/TCL/Data.xml RDONLY]
set TEMP_Data [read $FILE]
close $FILE

In order to display the information correctly, we must add several parameters to the beginning and end of the file.
set TEMP_Data [concat "decimalPrecision='0' formatNumberScale='0'>" $TEMP_Data]
set TEMP_Data [concat $TEMP_Data ""]

Delete the old file.
file delete -force flash:/TCL/Graph.xml

Open the file and write the information.
set FILE [open flash:/TCL/Graph.xml w]
puts $FILE $TEMP_Data
close $FILE
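
The wrapping step can be pictured with a small stand-alone sketch. It assumes the chart data uses a `<graph>` root element with `<set>` children; that layout, the proc name wrap_graph, the caption text, and the sample entry are all assumptions made for illustration and are not taken from the original script. Only the decimalPrecision and formatNumberScale attributes come from the lines above.

```tcl
# Wrap collected entries in a single root element so a charting
# component can load them as one XML document.
# ASSUMPTION: the <graph> root element and the caption attribute are
# illustrative; only decimalPrecision and formatNumberScale appear in
# the original listing.
proc wrap_graph {entries} {
    set head "<graph caption='RTT (ms)' decimalPrecision='0' formatNumberScale='0'>"
    set tail "</graph>"
    return "$head\n$entries\n$tail"
}

# Hypothetical entry: one sample with its time as the name and its RTT
# as the value.
set entries "<set name='22:58:17' value='99' />"
puts [wrap_graph $entries]
```

Building the complete document in a variable first, and writing it to the file in a single puts, keeps the file from ever containing an opening element without its matching close.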

The section below uses the file “Graph.xml” to display the line chart.
########## TO put original

set httpheader "HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: binary

"

Display the information.
puts $httpsock $httpheader$header$middle

Prepare the router to run the Tcl scripts.

Create a directory on a local storage device; in this case, we are using the local flash. Copy the Tcl files and the FCF_Line.swf file to the directory.
Make the directory:
IPSLA-Sender#mkdir flash:/TCL
Create directory filename [TCL]?
Created dir flash:/TCL

Copy the files using your preferred method. The results should be as follows:

IPSLA-Sender#dir flash:/TCL
Directory of flash:/TCL/

30 -rw- 1522 Jun 22 2010 22:12:08 -07:00 IPSLA_Viewer.tcl
31 -rw- 5860 Jun 17 2010 15:32:52 -07:00 Web_Server.tcl
32 -rw- 19933 Jun 17 2010 16:06:10 -07:00 FCF_Line.swf
33 -rw- 2526 Jun 23 2010 21:38:46 -07:00 IPSLA_Collector.tcl

Using configuration mode, add the following commands to the IPSLA-Sender router:

Specify the location of the scripts:
event manager directory user policy "flash:/TCL/"

The next line is critical to allow multiple scripts to run simultaneously.
event manager scheduler script thread class default number 10
event manager policy Web_Server.tcl type user
event manager policy IPSLA_Collector.tcl type user

The IPSLA_Collector.tcl script will run using CRON, but we will manually start the Web_Server script as follows: (Note: This is not entered in configuration mode.)
IPSLA-Sender#event manager run Web_Server.tcl

The client can now view the IP-SLA information by accessing the following link: and should see a similar display:

• Note: The Web_Server.tcl script was modified to use port 8082.

Below are some helpful troubleshooting commands:

IPSLA-Sender#show event manager policy registered
No. Class Type Event Type Trap Time Registered Name
1 script user none Off Tue Jun 22 21:27:18 2010 Web_Server.tcl
policyname {Web_Server.tcl} sync {yes}
nice 0 queue-priority normal maxrun 31536000.000 scheduler rp_primary

2 script user timer cron Off Tue Jun 22 22:56:18 2010 IPSLA_Collector.tcl
name {add-dr} cron entry {* * * * *}
nice 0 queue-priority normal maxrun 59.000 scheduler rp_primary

IPSLA-Sender#show ip sla statistics 10
IPSLAs Latest Operation Statistics

IPSLA operation id: 10
Type of operation: udp-echo
Latest RTT: 99 milliseconds
Latest operation start time: 22:58:17.031 SUMMER Tue Jun 22 2010
Latest operation return code: OK
Number of successes: 32
Number of failures: 0
Operation time to live: Forever

IPSLA-Sender#show policy-map interface s0/0/0

Service-policy output: POL-MAP

Class-map: APP (match-any)
93 packets, 1488 bytes
30 second offered rate 0 bps, drop rate 0 bps
Match: access-group name APP
93 packets, 1488 bytes
30 second rate 0 bps
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 93/4464
bandwidth 20% (51 kbps)

Class-map: class-default (match-any)
68818 packets, 60267721 bytes
30 second offered rate 4810000 bps, drop rate 4644000 bps
Match: any

queue limit 64 packets
(queue depth/total drops/no-buffer drops) 44/22966/0
(pkts output/bytes output) 45852/30323948

Any user can now access a web page on a router to determine the current status of an application at a particular location. The remote router (IP-SLA Sender) is configured to send UDP probes using IP-SLA. The IPSLA_Collector.tcl script runs every 60 seconds, gathers critical information from the “show ip sla statistics 10” command, and stores that information in a local file. The Web_Server.tcl script acts as a web server for the router. The IPSLA_Viewer.tcl script reads the collected information from the Data.xml file created by the IPSLA_Collector.tcl script, reformats the data, saves it as Graph.xml, and displays the information using the FusionCharts charting software. The responsibility of the IP-SLA Responder router is simply to respond to requests from the sender.

This example gives you the basics of how to provide information to the general population on what is happening on the network, including specific application behavior. You have the capability to gather statistical information from IP-SLA or any other method of your choosing, store it and save it for future viewing. Although this may not completely eliminate the questions regarding the slowness of the network, it is a great start.

