Friday, 15 March 2013

Junos/SRX troubleshooting

Some useful commands for troubleshooting traffic flow issues on a Juniper SRX (most of them probably apply to other Junos devices too, but YMMV):

First, a look at some general system information:

SRX> show chassis routing-engine
Routing Engine status:
    Temperature                 36 degrees C / 96 degrees F
    CPU temperature             36 degrees C / 96 degrees F
    Total memory              2048 MB Max  1167 MB used ( 57 percent)
      Control plane memory    1104 MB Max   475 MB used ( 43 percent)
      Data plane memory        944 MB Max   689 MB used ( 73 percent)
    CPU utilization:
      User                       5 percent
      Background                 0 percent
      Kernel                     2 percent
      Interrupt                  0 percent
      Idle                      93 percent
    Model                          RE-SRXSME-SRE6
    Serial ID                      NOTFORYOU
    Start time                     2013-02-21 14:27:54 UTC
    Uptime                         18 days, 4 minutes, 21 seconds
    Last reboot reason             0x200:normal shutdown
    Load averages:                 1 minute   5 minute  15 minute
                                       0.22       0.10       0.08

One thing I usually look at is the interface statistics screen. Your PPS (Packets Per Second) can give you an indication of how busy an interface is:

SRX> monitor interface traffic

Here are some useful global traffic stats. This firewall is often under maximum throughput load, hence the large number of dropped packets!

SRX> show security flow statistics
    Current sessions: 937
    Packets forwarded: 1988275
    Packets dropped: 19483010
    Fragment packets: 29
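
If you save that output to a file, a few lines of Python can turn the counters into a drop percentage. This is only a rough sketch: the function names are my own, and it assumes the forwarded and dropped counters together account for every packet the firewall has seen, which may not hold on every platform or release.

```python
import re

def parse_flow_statistics(text):
    """Return {counter name: int} from 'show security flow statistics' output."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z ]+):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters

def drop_percentage(counters):
    """Dropped packets as a percentage of forwarded + dropped (an assumption)."""
    forwarded = counters["Packets forwarded"]
    dropped = counters["Packets dropped"]
    total = forwarded + dropped
    return 100.0 * dropped / total if total else 0.0

sample = """\
    Current sessions: 937
    Packets forwarded: 1988275
    Packets dropped: 19483010
    Fragment packets: 29
"""
stats = parse_flow_statistics(sample)
print(f"{drop_percentage(stats):.1f}% of packets dropped")
```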


This command shows a breakdown of your session usage (and the maximum number of supported sessions):

SRX> show security flow session summary
Unicast-sessions: 950
Multicast-sessions: 0
Failed-sessions: 0
Sessions-in-use: 1017
  Valid sessions: 948
  Pending sessions: 0
  Invalidated sessions: 69
  Sessions in other states: 0
Maximum-sessions: 524288
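
To see how close you are to the session limit, you can divide Sessions-in-use by Maximum-sessions. A minimal sketch in Python, assuming the output has been pasted into a string; the helper names are hypothetical, not part of any Juniper tooling.

```python
import re

def parse_session_summary(text):
    """Return {field name: int} from 'show security flow session summary' output."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z][A-Za-z\- ]*):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters

def session_table_usage(counters):
    """Sessions in use as a percentage of the platform maximum."""
    return 100.0 * counters["Sessions-in-use"] / counters["Maximum-sessions"]

summary = """\
Unicast-sessions: 950
Multicast-sessions: 0
Failed-sessions: 0
Sessions-in-use: 1017
  Valid sessions: 948
  Pending sessions: 0
  Invalidated sessions: 69
  Sessions in other states: 0
Maximum-sessions: 524288
"""
fields = parse_session_summary(summary)
print(f"Session table usage: {session_table_usage(fields):.3f}%")
```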

The next command shows the number of active sessions for each of the last 60 seconds, e.g. 0 seconds ago there were 580 active sessions. You can use this to look for spikes in usage.

SRX> show security monitoring performance session
fpc  0  pic  0
Last 60 seconds:
 0:     580   1:     617   2:     580   3:     555   4:     529   5:     606
 6:     602   7:     661   8:     626   9:     610  10:     570  11:     625
12:     594  13:     613  14:     576  15:     601  16:     581  17:     626
18:     588  19:     626  20:     582  21:     625  22:     596  23:     603
24:     568  25:     633  26:     608  27:     703  28:     642  29:     633
30:     592  31:     617  32:     597  33:     623  34:     593  35:     645
36:     608  37:     677  38:     640  39:     630  40:     587  41:     608
42:     597  43:     612  44:     598  45:     621  46:     586  47:     634
48:     616  49:     633  50:     602  51:     653  52:     622  53:     638
54:     607  55:     619  56:     601  57:     653  58:     615  59:     623
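
Spotting a spike by eye in that table gets tedious, so here is a small Python sketch that flags any second sitting well above the average. The 15% margin and the function names are my own arbitrary choices, and the parser only assumes the "index: value" layout shown above.

```python
import re
from statistics import mean

def parse_history(text):
    """Map 'seconds ago' -> value from a 'Last 60 seconds' table."""
    samples = {}
    for line in text.splitlines():
        # Data lines start with an index and a colon; header lines do not.
        if not re.match(r"\s*\d+:", line):
            continue
        for idx, val in re.findall(r"(\d+):\s*(\d+)", line):
            samples[int(idx)] = int(val)
    return samples

def find_spikes(samples, margin=0.15):
    """Return (seconds_ago, value) pairs more than `margin` above the mean."""
    limit = mean(samples.values()) * (1 + margin)
    return [(idx, val) for idx, val in sorted(samples.items()) if val > limit]

# Two rows taken from the output above, to keep the example short.
history = """\
fpc  0  pic  0
Last 60 seconds:
 0:     580   1:     617   2:     580   3:     555   4:     529   5:     606
24:     568  25:     633  26:     608  27:     703  28:     642  29:     633
"""
sessions = parse_history(history)
print(find_spikes(sessions))
```

With these two rows the mean is 604.5, so only the 703 sessions seen 27 seconds ago clears the 15% margin.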

You can do the same for SPU utilisation (Service Processing Units are the processors that perform the majority of the packet processing on the firewall), e.g. 43 seconds ago the SPU was at 86% utilisation:

SRX> show security monitoring performance spu
fpc  0  pic  0
Last 60 seconds:
 0:  68   1:  69   2:  70   3:  67   4:  66   5:  67
 6:  69   7:  68   8:  64   9:  70  10:  66  11:  68
12:  70  13:  69  14:  71  15:  65  16:  76  17:  79
18:  70  19:  73  20:  70  21:  75  22:  74  23:  75
24:  72  25:  68  26:  77  27:  70  28:  80  29:  80
30:  87  31:  85  32:  86  33:  85  34:  83  35:  85
36:  87  37:  85  38:  81  39:  76  40:  79  41:  82
42:  82  43:  86  44:  85  45:  84  46:  82  47:  79
48:  70  49:  70  50:  69  51:  71  52:  68  53:  68
54:  69  55:  68  56:  71  57:  68  58:  68  59:  70
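
For SPU load, a sustained busy period usually matters more than a single high reading. The sketch below, again in Python with hypothetical helper names, finds the longest run of consecutive seconds at or above a threshold; the 80% figure is just an example, not a Juniper recommendation.

```python
import re

def parse_history(text):
    """Map 'seconds ago' -> value from a 'Last 60 seconds' table."""
    samples = {}
    for line in text.splitlines():
        if not re.match(r"\s*\d+:", line):
            continue  # skip the 'fpc/pic' and 'Last 60 seconds' headers
        for idx, val in re.findall(r"(\d+):\s*(\d+)", line):
            samples[int(idx)] = int(val)
    return samples

def longest_run_at_or_above(samples, threshold):
    """Return (first, last) index of the longest consecutive run >= threshold."""
    best = None
    start = None
    prev = None
    for idx in sorted(samples):
        if samples[idx] >= threshold:
            if start is None or idx != prev + 1:
                start = idx  # a new run begins here
            prev = idx
            if best is None or idx - start > best[1] - best[0]:
                best = (start, idx)
        else:
            start = None
    return best

# Rows 24 to 47 from the output above.
spu_output = """\
fpc  0  pic  0
Last 60 seconds:
24:  72  25:  68  26:  77  27:  70  28:  80  29:  80
30:  87  31:  85  32:  86  33:  85  34:  83  35:  85
36:  87  37:  85  38:  81  39:  76  40:  79  41:  82
42:  82  43:  86  44:  85  45:  84  46:  82  47:  79
"""
spu = parse_history(spu_output)
print(longest_run_at_or_above(spu, 80))
```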

This command shows utilisation stats for the forwarding process on the firewall (FWDD):

SRX> show chassis forwarding 
FWDD status:
  State                                 Online    
  Microkernel CPU utilization         7 percent
  Real-time threads CPU utilization  34 percent
  Heap utilization                   73 percent
  Buffer utilization                 88 percent
  Uptime:                               14 days, 20 hours, 28 minutes, 50 seconds

ALGs (Application Layer Gateways) can have a significant performance impact and can introduce unexpected behaviour in the protocols they inspect. Check whether they are enabled using the command below. More information on ALGs here:

http://www.juniper.net/techpubs/en_US/junos12.1/topics/concept/alg-security-overview.html

SRX> show security alg status 
ALG Status :
  DNS      : Enabled
  FTP      : Disabled
  H323     : Enabled
  MGCP     : Enabled
  MSRPC    : Enabled
  PPTP     : Enabled
  RSH      : Enabled
  RTSP     : Enabled
  SCCP     : Enabled
  SIP      : Enabled
  SQL      : Enabled
  SUNRPC   : Enabled
  TALK     : Enabled
  TFTP     : Disabled
  IKE-ESP  : Enabled
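
If you look after several firewalls it can be handy to pull out just the enabled ALGs from that output and compare them between boxes. A quick Python sketch, assuming the "NAME : Enabled/Disabled" layout shown above; the function name is my own.

```python
import re

def enabled_algs(text):
    """Return the names of ALGs reported as Enabled, in output order."""
    enabled = []
    for line in text.splitlines():
        match = re.match(r"\s*([A-Z0-9-]+)\s*:\s*(Enabled|Disabled)\s*$", line)
        if match and match.group(2) == "Enabled":
            enabled.append(match.group(1))
    return enabled

# A shortened copy of the output above.
alg_status = """\
ALG Status :
  DNS      : Enabled
  FTP      : Disabled
  SIP      : Enabled
  TFTP     : Disabled
  IKE-ESP  : Enabled
"""
print(enabled_algs(alg_status))
```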

Sometimes you need a packet capture to really be able to see what's going on.
Most of the time you'll want to capture the traffic in a file:

SRX# set security flow traceoptions file file-name

Set a maximum file size according to your needs with this command:

SRX# set security flow traceoptions file size ?
Possible completions:
  <size>               Maximum trace file size (10240..1073741824)

You can choose which types of packet to capture with this command; "basic-datapath" is recommended for most flow captures.

SRX# set security flow traceoptions flag [all/basic-datapath/packet-drops]

Use a filter to reduce the amount of traffic captured:

SRX# set security flow traceoptions packet-filter f1 source-prefix 172.16.0.0/12


Once you have issued a 'commit' the traffic capture will begin and your output file can be found in /var/log (you can use the command "show log filename" to view it), with output similar to that below:

Feb 26 11:47:02 11:46:58.483266:CID-0:RT:<10.140.150.160/24387->10.2.2.221/49535;6> matched filter matchfilter:

Feb 26 11:47:02 11:46:58.483290:CID-0:RT:packet [40] ipid = 42687, @41038f9c

Feb 26 11:47:02 11:46:58.483307:CID-0:RT:---- flow_process_pkt: (thd 9): flow_ctxt type 15, common flag 0x0, mbuf 0x41038d80, rtbl_idx = 0
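
These trace files get big quickly, so it helps to pull out just the flows that matched your filter. The Python sketch below extracts source, destination and protocol number from the "matched filter" lines. The regular expression is based only on the sample line above, so treat the format as an assumption; other Junos releases may log it differently.

```python
import re

# <src-ip/src-port->dst-ip/dst-port;protocol> as seen in the sample trace line.
FLOW_RE = re.compile(
    r"<([0-9a-fA-F:.]+)/(\d+)->([0-9a-fA-F:.]+)/(\d+);(\d+)>"
)

def parse_flow(line):
    """Return a dict describing the flow in a trace line, or None if absent."""
    match = FLOW_RE.search(line)
    if not match:
        return None
    src, sport, dst, dport, proto = match.groups()
    return {
        "src": src,
        "sport": int(sport),
        "dst": dst,
        "dport": int(dport),
        "protocol": int(proto),  # IP protocol number, e.g. 6 is TCP
    }

trace_lines = [
    "Feb 26 11:47:02 11:46:58.483266:CID-0:RT:<10.140.150.160/24387->10.2.2.221/49535;6> matched filter matchfilter:",
    "Feb 26 11:47:02 11:46:58.483290:CID-0:RT:packet [40] ipid = 42687, @41038f9c",
]
flows = [flow for flow in map(parse_flow, trace_lines) if flow]
print(flows)
```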


If you would like to view traffic in real time on the console you can use the monitor command as below. However, this will only show traffic destined for or originating from the firewall itself, i.e. you cannot use it to see traffic passing through the firewall.

SRX> monitor traffic interface ge-2/0/0
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is ON. Use <no-resolve> to avoid any reverse lookup delay.
Address resolution timeout is 4s.
Listening on ge-2/0/0, capture size 96 bytes

10:36:05.383718 Out LACPv1, length 60
10:36:05.583595  In LACPv1, length 60
10:36:06.386069 Out LACPv1, length 60




