
I have been working on a script in my spare time. It is meant to help troubleshoot nonspecific connectivity issues where some kind of Check Point device is involved. The idea is that when you experience a connectivity problem, you run this script, do whatever you need to do to get things back up and running, and then provide the output file to Check Point TAC so they can begin work to determine what went wrong.

It's the middle of production, and you get notified that services are being impacted; there's no connectivity. You quickly check things out and find nothing amiss. You reboot the internet modem, switch, and both firewalls. Everything comes back up without issue, and services resume like nothing happened.

Then upper management demands a root cause: "We NEED to know what happened." But there is no real indication pointing to anything specific. You start opening cases with the equipment vendors, the ISP, and Check Point. Let's be honest here, no one is likely to find anything at this point, and perhaps no one does. Maybe there are a few vague suggestions, and likely much more finger-pointing, but no one knows for sure. Since it was a one-off situation, the heat dies down, root cause isn't so necessary, and eventually the whole thing fades from memory.

Until it happens again.

Services are impacted, and you don't have time to call support, open a case, and start troubleshooting. What now?

We've all been here, either in this exact scenario or something similar. Perhaps you do a bit more troubleshooting. Maybe you gather more information and get better details on the case. At this point, any work with any vendor will require data from the problem state. You still don't have time to wait for support, yet you still need to figure out what's really going on and why.

Enter my connCheck script. The idea here is that you pre-load this script onto all your Check Point gateways, and should a similar connectivity issue arise, you run the script, then go do what you need to do to restore services. Once everything's back up and running, you provide the output file to Check Point TAC for further analysis.

Let's take a quick overview of what the script does:

  1. Before we really get things started, we gather a few general statistics and details about the device: the routing table, top output, ARP stats, and cluster stats, to name a few.
  2. One of the key things this script does is run a couple of different packet captures. As SecureXL can (and does) alter the packet captures, we need to turn it off. The script checks whether it is on and, if so, turns it off. Once the captures are completed, the script will re-activate SecureXL.
  3. Once the packet captures have been started, we start gathering a CPInfo. This also serves as a bit of a timing mechanism.
  4. While we are waiting on the CPInfo, we gather some general *.elg debug files and continue the packet captures.
  5. Once the CPInfo finishes, stop the packet captures.
  6. Turn SecureXL back on, if it was turned off.
  7. Gather a second, identical set of details as in Step 1.
  8. Compress everything into 1 nice package, ready for upload to Check Point TAC.
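
To make the ordering in Steps 2 through 6 concrete, here is a minimal sketch of two of the patterns involved: remembering whether SecureXL was on so that it is only re-enabled if the script disabled it, and using a long-running command as the "timer" for a background capture. This is not the actual connCheck.sh. The function names are hypothetical, the use of tcpdump for the capture is an assumption, and the pattern matched in the `fwaccel stat` output is an assumption that may need adjusting for your Gaia version.

```shell
#!/bin/bash
# Illustrative sketch only, NOT the real connCheck.sh.
# All function and variable names here are hypothetical.

SXL_WAS_ON=0   # set to 1 only if this script is the one that disabled SecureXL

# Assumption: "fwaccel stat" prints a line like "Accelerator Status : on".
# Newer Gaia releases print a table instead, so adjust the pattern to match.
securexl_is_on() {
    fwaccel stat 2>/dev/null | grep -qi 'Accelerator Status *: *on'
}

# Turn SecureXL off only if it is currently on, and remember that we did so.
disable_securexl() {
    if securexl_is_on; then
        fwaccel off && SXL_WAS_ON=1
    fi
}

# Turn SecureXL back on only if disable_securexl turned it off earlier.
restore_securexl() {
    if [ "$SXL_WAS_ON" -eq 1 ]; then
        fwaccel on && SXL_WAS_ON=0
    fi
}

# Start a background capture, run the given command in the foreground as the
# timer, then stop the capture as soon as that command returns.
# Assumption: tcpdump is the capture tool; the real script may use others.
capture_during() {
    cap_file=$1
    shift
    tcpdump -nn -i any -w "$cap_file" >/dev/null 2>&1 &
    cap_pid=$!
    "$@"
    timer_rc=$?
    kill "$cap_pid" 2>/dev/null
    wait "$cap_pid" 2>/dev/null
    return "$timer_rc"
}

# Example wiring (commented out so that loading this file changes nothing):
#   trap restore_securexl EXIT      # re-enable even if the script exits early
#   disable_securexl
#   capture_during capture.pcap nice -n 19 cpinfo -z -o cpinfo_out
#   restore_securexl
```

The `trap` line matters most here: if the run is aborted halfway, SecureXL should not be left disabled on a production gateway. The `cpinfo` flags shown in the commented example are assumptions, so check them against the cpinfo version installed on your gateway before relying on them.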
WARNINGS
  • This script is provided for debugging purposes only with no warranty, implied or otherwise.
  • This script is for a Check Point Gaia Gateway ONLY. It has not been tested on anything else.
  • SecureXL will need to be turned off. This script will turn it off and back on again.
  • This script will gather a CPInfo at a low priority. It will use all available CPU, but because of the low priority this may cause 100% CPU usage warnings and should not affect traffic.
Installation

Note: Please be sure to install this script before any issues arise.

  1. Copy the connCheck.tgz compressed file to the gateway
  2. Decompress with command:
    • [Expert@Host:0]# tar -zxvf connCheck.tgz
  3. Ensure it is executable:
    • [Expert@Host:0]# chmod +x connCheck.sh
  4. You are ready to go!
Usage

Please run this while the issue is occurring. If the problem is not present when the script runs, the collected data will not show it.

  1. While the problem is occurring, you will need CLI access to the device. SSH or Direct Console are both fine.
  2. Change (cd) to the folder containing the script.
  3. Run the script:
    • [Expert@Host:0]# ./connCheck.sh
    • Note: The output file will be placed into the present working directory. Ensure you run it from a folder with ample free space.
  4. Provide the compressed output package to Check Point TAC.
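
Since the output lands in the present working directory, it can help to check free space before starting. The following is a small sketch of such a check and is not part of connCheck.sh. The function names are hypothetical, and the 2 GiB threshold is an arbitrary example value, so size it for your own CPInfo and captures.

```shell
#!/bin/bash
# Sketch: warn if the current directory has too little free space.
# MIN_FREE_KB is a hypothetical threshold, not a value taken from connCheck.sh.
MIN_FREE_KB="${MIN_FREE_KB:-2097152}"   # 2 GiB expressed in KiB (example value)

# Print the available space, in KiB, of the filesystem holding the given path.
free_kb() {
    # With "df -Pk", the 4th column of the 2nd line is the available KiB.
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Succeed if the path in $1 has at least $2 KiB available.
has_enough_space() {
    avail=$(free_kb "${1:-.}")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

if has_enough_space . "$MIN_FREE_KB"; then
    echo "OK: $(free_kb .) KiB free in $(pwd)"
else
    echo "WARNING: less than $MIN_FREE_KB KiB free in $(pwd)" >&2
fi
```

If the warning appears, cd to a partition with more room before running the script.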
Download from GitHub

Have any feedback you wish to offer? Comments/queries/concerns? Let me know in the comments section below!
