Tuesday 8 October 2013

IPSec VPNs and IKE

One to watch out for when you're setting up a site-to-site VPN: IKE versions. You may get very little debug output from your VPN gateways/endpoints if the IKE versions don't match; it may be nothing more than a NO-PROPOSAL-CHOSEN notify message from one of the gateways. The sure way to check for a mismatch is (as always) with a packet capture:

Check the IKE version in the INIT packet:

And then the IKE version in any corresponding reply:
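If you would rather read the version from raw capture bytes than from a packet analyser, the check can be sketched as below. This is a minimal sketch, assuming you already have the UDP payload of the IKE message; the function name `ike_version` and the sample header are made up for illustration. The version sits in the fixed 28-byte IKE header, straight after the two 8-byte SPIs and the next-payload byte, with the major version in the high 4 bits and the minor version in the low 4 bits.

```python
import struct
from typing import Tuple

# Fixed 28-byte IKE header: initiator SPI (8), responder SPI (8),
# next payload (1), version (1), exchange type (1), flags (1),
# message ID (4), length (4).
IKE_HEADER = struct.Struct("!8s8sBBBBII")


def ike_version(udp_payload: bytes, udp_port: int = 500) -> Tuple[int, int]:
    """Return (major, minor) IKE version from a UDP payload (illustrative helper)."""
    # On UDP 4500 (NAT traversal) IKE messages are preceded by a 4-byte
    # all-zero non-ESP marker; on UDP 500 the header starts at offset 0.
    offset = 4 if udp_port == 4500 else 0
    header = udp_payload[offset:offset + IKE_HEADER.size]
    if len(header) < IKE_HEADER.size:
        raise ValueError("payload too short for an IKE header")
    version = IKE_HEADER.unpack(header)[3]
    return version >> 4, version & 0x0F


# Hypothetical IKEv2 IKE_SA_INIT header: SA payload next (33), version 0x20,
# exchange type 34, initiator flag set, message ID 0, length 28.
sample_init = IKE_HEADER.pack(b"\x01" * 8, bytes(8), 33, 0x20, 34, 0x08, 0, 28)
print(ike_version(sample_init))  # (2, 0)
```

Run it against the INIT packet and the reply: if one side reports (2, 0) and the other (1, 0), you have found your mismatch.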

This particular VPN was a site-to-site VPN from an Azure Virtual Network to a Palo Alto firewall, and the mistake we'd made was to choose the "Dynamic Routing" option when creating the VPN Gateway within Azure.
With dynamic routing set, Azure defaults to IKEv2, but with static routing it switches to IKEv1 and your Phase 1 and Phase 2 SAs will come alive as if by magic!
Further info on Azure VPN parameters here:
http://msdn.microsoft.com/en-us/library/windowsazure/jj156075.aspx

Update:

Somewhere between the previous version of PAN-OS and the one we are currently running (v5.0.8), the log messages on our Palo Alto were updated to be clearer about IKE version mismatches. We now see the following output in the system logs:

- 0:10.11.12.1[500] - 13.14.15.16[500]:0xf34075b8:unknown ikev2 peer

Further Update:

After recently revisiting Azure I hit the same problem again, but in a different manner. Azure also supports Point-to-Site VPNs, which you can set up at the same time as a Site-to-Site VPN when creating a new Virtual Network.

However, if you choose to create both at the same time, you will not get the option of choosing between Static and Dynamic routing when you come to create a Gateway. This is because Point-to-Site VPNs on Azure only support Dynamic routing, so the Gateway will default to Dynamic routing and therefore, as described above, to IKEv2.

Friday 15 March 2013

Junos/SRX troubleshooting

Some useful commands for diagnosing traffic flow issues on a Juniper SRX (most commands probably apply to other Junos devices, but YMMV):

First, a look at some general system information:

SRX> show chassis routing-engine
Routing Engine status:
    Temperature                 36 degrees C / 96 degrees F
    CPU temperature             36 degrees C / 96 degrees F
    Total memory              2048 MB Max  1167 MB used ( 57 percent)
      Control plane memory    1104 MB Max   475 MB used ( 43 percent)
      Data plane memory        944 MB Max   689 MB used ( 73 percent)
    CPU utilization:
      User                       5 percent
      Background                 0 percent
      Kernel                     2 percent
      Interrupt                  0 percent
      Idle                      93 percent
    Model                          RE-SRXSME-SRE6
    Serial ID                      NOTFORYOU
    Start time                     2013-02-21 14:27:54 UTC
    Uptime                         18 days, 4 minutes, 21 seconds
    Last reboot reason             0x200:normal shutdown
    Load averages:                 1 minute   5 minute  15 minute
                                       0.22       0.10       0.08

One thing I usually look at is the interface statistics screen. Your PPS (Packets Per Second) can give you an indication of how busy an interface is:

SRX> monitor interface traffic

Here are some useful global traffic stats. This firewall is often under maximum throughput load, hence the large number of dropped packets!

SRX> show security flow statistics
    Current sessions: 937
    Packets forwarded: 1988275
    Packets dropped: 19483010
    Fragment packets: 29
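To put a number on "a large number of dropped packets", you can turn the counters into a drop ratio. The following is a rough sketch, assuming the output has been saved as text and that dropped packets are not also counted as forwarded; `drop_ratio` is a name invented here, not a Junos facility.

```python
import re

SAMPLE_FLOW_STATS = """\
    Current sessions: 937
    Packets forwarded: 1988275
    Packets dropped: 19483010
    Fragment packets: 29
"""


def drop_ratio(flow_stats: str) -> float:
    """Fraction of packets dropped, from 'show security flow statistics' text."""
    # Collect every "Name: number" line into a dictionary of counters.
    counters = {
        name.strip(): int(value)
        for name, value in re.findall(
            r"^\s*([A-Za-z ]+):\s*(\d+)\s*$", flow_stats, re.MULTILINE
        )
    }
    forwarded = counters["Packets forwarded"]
    dropped = counters["Packets dropped"]
    total = forwarded + dropped
    return dropped / total if total else 0.0


print(f"{drop_ratio(SAMPLE_FLOW_STATS):.1%} of packets dropped")
```

The counters are cumulative, so comparing two readings taken a few minutes apart will tell you more about current behaviour than a single snapshot.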


This command allows you to see a breakdown of your session usage (and the maximum number of supported sessions):

SRX> show security flow session summary
Unicast-sessions: 950
Multicast-sessions: 0
Failed-sessions: 0
Sessions-in-use: 1017
  Valid sessions: 948
  Pending sessions: 0
  Invalidated sessions: 69
  Sessions in other states: 0
Maximum-sessions: 524288

The next command allows you to see your active session numbers over the last 60 seconds, e.g. 0 seconds ago there were 580 active sessions. You can use this to look for any spikes in usage.

SRX> show security monitoring performance session
fpc  0  pic  0
Last 60 seconds:
 0:     580   1:     617   2:     580   3:     555   4:     529   5:     606
 6:     602   7:     661   8:     626   9:     610  10:     570  11:     625
12:     594  13:     613  14:     576  15:     601  16:     581  17:     626
18:     588  19:     626  20:     582  21:     625  22:     596  23:     603
24:     568  25:     633  26:     608  27:     703  28:     642  29:     633
30:     592  31:     617  32:     597  33:     623  34:     593  35:     645
36:     608  37:     677  38:     640  39:     630  40:     587  41:     608
42:     597  43:     612  44:     598  45:     621  46:     586  47:     634
48:     616  49:     633  50:     602  51:     653  52:     622  53:     638
54:     607  55:     619  56:     601  57:     653  58:     615  59:     623

You can do the same for firewall SPU utilisation (SPUs, or Services Processing Units, perform the majority of the packet processing on the firewall), e.g. 43 seconds ago the SPU was at 86% utilisation:

SRX> show security monitoring performance spu
fpc  0  pic  0
Last 60 seconds:
 0:  68   1:  69   2:  70   3:  67   4:  66   5:  67
 6:  69   7:  68   8:  64   9:  70  10:  66  11:  68
12:  70  13:  69  14:  71  15:  65  16:  76  17:  79
18:  70  19:  73  20:  70  21:  75  22:  74  23:  75
24:  72  25:  68  26:  77  27:  70  28:  80  29:  80
30:  87  31:  85  32:  86  33:  85  34:  83  35:  85
36:  87  37:  85  38:  81  39:  76  40:  79  41:  82
42:  82  43:  86  44:  85  45:  84  46:  82  47:  79
48:  70  49:  70  50:  69  51:  71  52:  68  53:  68
54:  69  55:  68  56:  71  57:  68  58:  68  59:  70
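Both "show security monitoring performance" outputs share the same "seconds ago: value" grid, so spotting spikes by eye can be replaced with a few lines of parsing. This is a sketch only, assuming the output has been pasted in as text; `parse_history` and `seconds_above` are illustrative names, and the sample is an excerpt of the SPU output above.

```python
import re
from typing import Dict, List

SAMPLE_SPU = """\
fpc  0  pic  0
Last 60 seconds:
30:  87  31:  85  32:  86  33:  85  34:  83  35:  85
36:  87  37:  85  38:  81  39:  76  40:  79  41:  82
42:  82  43:  86  44:  85  45:  84  46:  82  47:  79
"""


def parse_history(output: str) -> Dict[int, int]:
    """Map 'seconds ago' to the value sampled at that second."""
    samples: Dict[int, int] = {}
    for line in output.splitlines():
        # Skip the "fpc 0 pic 0" and "Last 60 seconds:" header lines.
        if not re.match(r"^\s*\d+:", line):
            continue
        for seconds_ago, value in re.findall(r"(\d+):\s+(\d+)", line):
            samples[int(seconds_ago)] = int(value)
    return samples


def seconds_above(samples: Dict[int, int], threshold: int) -> List[int]:
    """Return the 'seconds ago' marks whose value is at or above threshold."""
    return sorted(s for s, v in samples.items() if v >= threshold)


history = parse_history(SAMPLE_SPU)
print(history[43])                 # 86
print(seconds_above(history, 85))  # [30, 31, 32, 33, 35, 36, 37, 43, 44]
```

The same functions work on the session history output; only the threshold you care about changes.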

This command shows some utilisation stats for the forwarding process on the firewall (FWDD):

SRX> show chassis forwarding 
FWDD status:
  State                                 Online    
  Microkernel CPU utilization         7 percent
  Real-time threads CPU utilization  34 percent
  Heap utilization                   73 percent
  Buffer utilization                 88 percent
  Uptime:                               14 days, 20 hours, 28 minutes, 50 seconds

ALGs (Application Layer Gateways) can have a significant performance impact and may introduce unexpected behaviour in the protocols they inspect. Check whether they are enabled using the command below. More information on ALGs here:

http://www.juniper.net/techpubs/en_US/junos12.1/topics/concept/alg-security-overview.html

SRX> show security alg status 
ALG Status :
  DNS      : Enabled
  FTP      : Disabled
  H323     : Enabled
  MGCP     : Enabled
  MSRPC    : Enabled
  PPTP     : Enabled
  RSH      : Enabled
  RTSP     : Enabled
  SCCP     : Enabled
  SIP      : Enabled
  SQL      : Enabled
  SUNRPC   : Enabled
  TALK     : Enabled
  TFTP     : Disabled
  IKE-ESP  : Enabled

Sometimes you need a packet capture to really be able to see what's going on.
Most of the time you'll want to capture the traffic in a file:

SRX# set security flow traceoptions file file-name

Set a maximum file size according to your needs with this command:

SRX# set security flow traceoptions file size ?
Possible completions:
  <size>               Maximum trace file size (10240..1073741824)

You can choose which types of packet you want to capture with this command; "basic-datapath" is recommended for most flow captures:

SRX# set security flow traceoptions flag [all/basic-datapath/packet-drops]

Use a filter to reduce the amount of traffic captured:

SRX# set security flow traceoptions packet-filter f1 source-prefix 172.16.0.0/12


Once you have issued a 'commit', the traffic capture will begin and your output file can be found in /var/log (you can use the command "show log file-name" to view it), with output similar to that below:

Feb 26 11:47:02 11:46:58.483266:CID-0:RT:<10.140.150.160/24387->10.2.2.221/49535;6> matched filter matchfilter:

Feb 26 11:47:02 11:46:58.483290:CID-0:RT:packet [40] ipid = 42687, @41038f9c

Feb 26 11:47:02 11:46:58.483307:CID-0:RT:---- flow_process_pkt: (thd 9): flow_ctxt type 15, common flag 0x0, mbuf 0x41038d80, rtbl_idx = 0


If you would like to view traffic in real time on the console you can use the monitor command as below. However, this will only show you traffic destined for or originating from the firewall itself, i.e. you cannot use this command to see traffic passing through the firewall.

SRX> monitor traffic interface ge-2/0/0
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is ON. Use <no-resolve> to avoid any reverse lookup delay.
Address resolution timeout is 4s.
Listening on ge-2/0/0, capture size 96 bytes

10:36:05.383718 Out LACPv1, length 60
10:36:05.583595  In LACPv1, length 60
10:36:06.386069 Out LACPv1, length 60


Juniper dual partitions

To show the current active and booted partitions:


my_ex4200> show system storage partitions
fpc0:
----------------------------------------------------------------------
Boot Media: internal (da0)
Active Partition: da0s2a                <--------- Active
Backup Partition: da0s1a
Currently booted from: backup (da0s1a)  <--------- Booted

Partitions information:
  Partition  Size   Mountpoint
  s1a        184M   /
  s2a        183M   altroot
  s3d        369M   /var/tmp
  s3e        123M   /var
  s4d        62M    /config
  s4e               unused (backup config)

fpc1:
----------------------------------------------------------------------
Boot Media: internal (da0)
Active Partition: da0s2a
Backup Partition: da0s1a
Currently booted from: backup (da0s1a)

Partitions information:
  Partition  Size   Mountpoint
  s1a        184M   /
  s2a        183M   altroot
  s3d        369M   /var/tmp
  s3e        123M   /var
  s4d        62M    /config
  s4e               unused (backup config)
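On a larger Virtual Chassis it is easy to miss a member that has quietly booted from its backup partition, as both members have in the output above. The check can be sketched like this, assuming the command output has been saved as text and each member section starts with an "fpcN:" line as shown; `boot_sources` is a hypothetical helper name.

```python
import re
from typing import Dict

SAMPLE_PARTITIONS = """\
fpc0:
----------------------------------------------------------------------
Boot Media: internal (da0)
Active Partition: da0s2a
Backup Partition: da0s1a
Currently booted from: backup (da0s1a)

fpc1:
----------------------------------------------------------------------
Boot Media: internal (da0)
Active Partition: da0s2a
Backup Partition: da0s1a
Currently booted from: backup (da0s1a)
"""


def boot_sources(output: str) -> Dict[str, str]:
    """Map each member (e.g. 'fpc0') to the word after 'Currently booted from:'."""
    sources: Dict[str, str] = {}
    member = None
    for line in output.splitlines():
        header = re.match(r"^(fpc\d+):", line)
        if header:
            member = header.group(1)
            continue
        booted = re.match(r"^Currently booted from:\s+(\w+)", line)
        if booted and member is not None:
            sources[member] = booted.group(1)
    return sources


for member, source in boot_sources(SAMPLE_PARTITIONS).items():
    if source == "backup":
        print(f"{member} is running from its backup partition")
```

Any member flagged here is running whatever software version sits on its backup slice, which may not be the one you expect.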

To show the software versions installed on the respective partitions:


my_ex4200> show system snapshot media internal
Information for snapshot on internal (/dev/da0s1a) (backup)
Creation date: Feb 2 08:43:02 2012
JUNOS version on snapshot:
  jbase  : 10.4R9.2
  jcrypto-ex: 10.4R9.2
  jdocs-ex: 10.4R9.2
  jkernel-ex: 10.4R9.2
  jroute-ex: 10.4R9.2
  jswitch-ex: 10.4R9.2
  jweb-ex: 10.4R9.2
  jpfe-ex42x: 10.4R9.2
Information for snapshot on internal (/dev/da0s2a) (primary)
Creation date: Jan 5 16:25:17 2013
JUNOS version on snapshot:
  jbase  : ex-11.4R5.7
  jcrypto-ex: 11.4R5.7
  jdocs-ex: 11.4R5.7
  jkernel-ex: 11.4R5.7
  jroute-ex: 11.4R5.7
  jswitch-ex: 11.4R5.7
  jweb-ex: 11.4R5.7
  jpfe-ex42x: 11.4R5.7

To boot from the alternate partition (for a quick fail-back of software version):

my_ex4200> request system reboot slice alternate

Friday 22 February 2013

Dig on Ubuntu 12.04

Is dig not installed on your Ubuntu box? Ask the internets and they'll say: just install dnsutils, easy!

# apt-get install dnsutils
Package dnsutils is not available, but is referred to by another package....

Not so easy...
Anyway, the magic fix is to update your package index:

# apt-get update


You should then be able to run the below as normal:

# apt-get install dnsutils



Now that your package index is up to date you can also upgrade all your packages with:

# apt-get upgrade

Also useful to remember: if you end up with a broken package, just run:

# apt-get -f install