(I wrote this almost a year ago and it’s been sitting in my drafts folder since then. It’s still an outstanding issue and I haven’t figured it out yet)
I know a fair bit about networks, networking and how things work. My knowledge runs a few inches deep and a half-mile long. I don’t pretend to know the inner workings of networking protocols, but I know what they’re supposed to do and how to use them.
What started off as something as innocuous as a bounced email spiraled into a level of hell that I haven’t been to in a long time. I received an email from someone internal telling me that email from one specific firm we work with now and again always bounces back to the sender. The sender gets delay messages and then finally an undeliverable email error message. Because we had a big project starting with them, it would be great if we could actually communicate with them via email, so could I look into it?
I started off by contacting the person who was getting the bounced emails using Gmail, as they wouldn’t be able to respond if I used my work email. I asked him if he could forward me the error messages so I could determine where the error was coming from. He forwarded them to me and I saw immediately that the error messages were being generated by his mail server, not mine.
I looked up their MX record and determined that their mail server was using a Telus Business DSL address. Knowing that the Telus and Shaw fixed IP addresses were all in the same networks as their DHCP addresses, I knew that 99% of them were marked as spam bots on the various realtime blacklists. I added their domain name to the whitelist on our spam server to make sure that their mail wasn’t getting dropped as suspected spam.
I had them try again, and it still didn’t work. Next I went through the raw SMTP logs on my email server (MDaemon, not Exchange so I’m feeling my way through a new interface already) and could not find any record of their mail server even attempting to contact mine, so I had nothing to go on. If they had tried to send an email and it was dropped for whatever reason, there should have been a record of the attempt made! With nothing to go on, I put the ball back in their court to have their tech people scan through their mail server logs and see what was going on.
The next day I got a phone call from their tech team. It was actually a local consultant who they had outsourced their IT to and we tried a few things. From his office, he was able to send me email no problem, and after allowing it on my firewall he was able to ping me successfully and then did a traceroute to see which way the packets were travelling. After that, he tried the same thing from his client’s server and it failed, failed, failed. He emailed me a screenshot of the traceroute and I forwarded it to my ISP’s support department.
Metrobridge support got back to me the next day and confirmed that the packets were entering their network and they were able to send and receive email with the client site. They also confirmed that the last hop on the traceroute was the router that I was connected to, so that put the ball back in my court again as the stopping point for traffic.
I forgot about it for a few days and over the weekend until this morning. The VPN tunnel to one of my remote sites was down (thanks to Shaw this time, changing my supposedly static IP address at the remote site, which broke the tunnel), so I went to the log screen on the firewall and saw a whole page full of yellow highlighted ALERT priority messages in the category Intrusion Prevention: IP Spoof Dropped. As I was scanning it, my eyes paused on the IP address and I thought “Telus IP address”, so I stopped and compared it with my notes. Son of a gun, it was a match.
I launched myself back into SonicWALL mode and started reading the admin guide and hitting up Experts Exchange and some other go-to sites but could not really find anything that related to the errors I was getting.
My buddy Todd was online and I bounced a few things off him. Eventually the conclusion we came to was that there was a configuration error on my firewall.
My firewall has two WAN devices plugged into it: Metrobridge wireless and Telus DSL. The way it’s configured is that our mail and VPN tunnels terminate at the Metrobridge IP Address and use the Telus DSL for failover/backup. All internal internet traffic goes out over the Telus DSL and uses the Metrobridge for failover/backup. Metrobridge’s wireless connection is metered and Telus’ DSL isn’t really (it is, but the cap is very high) so we want to make sure that we minimize costs and send traffic out over the DSL.
The problem arises because of the way the SonicWALL is configured by default, and the way Telus hands out their addresses: my Telus fixed IP address has a /22 netmask, or 255.255.252.0, which means that everything from 220.127.116.11 through 18.104.22.168 is on the same network. That’s 1022 usable addresses. If any packets come in to the Metrobridge WAN port on the SonicWALL from any of those addresses, the SonicWALL considers it traffic coming from a protected network on a different interface.
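The arithmetic behind that netmask can be checked with Python’s standard ipaddress module. This is only a sketch: the 203.0.112.0/22 network below is a placeholder I picked for illustration, not my actual Telus range.

```python
import ipaddress

# Placeholder /22 network (hypothetical, NOT the real Telus range).
wan_network = ipaddress.ip_network("203.0.112.0/22")

netmask = str(wan_network.netmask)           # dotted-quad form of the /22 prefix
total_addresses = wan_network.num_addresses  # includes the network and broadcast addresses
usable_hosts = total_addresses - 2           # addresses that can be assigned to hosts

print(netmask, total_addresses, usable_hosts)
```

Any /22 gives the same counts, so the placeholder does not change the result.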
That means it must be someone trying to impersonate a computer on the protected network. That means someone is trying to hack into the network. That means it’s a spoofed IP, therefore drop the packet and do not let it through the firewall. This is a good thing, but coupled with the enormous range of addresses in the network specified by Telus, it means that the thousand or so other IP addresses in that range will never be able to send any packets, email or otherwise, to any of my servers.
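Roughly, the decision the firewall seems to be making looks like the sketch below. This is my guess at the logic, not SonicWALL’s actual code: the interface names, the is_ip_spoof function and every address in it are made up for illustration.

```python
import ipaddress

# Hypothetical table of which subnet the firewall believes sits on each WAN port.
# Names and ranges are placeholders, not my real configuration.
INTERFACE_SUBNETS = {
    "wan_metrobridge": ipaddress.ip_network("198.51.100.0/29"),
    "wan_telus": ipaddress.ip_network("203.0.112.0/22"),
}

def is_ip_spoof(source_ip, arrival_interface):
    """Return True if source_ip belongs to a subnet attached to a
    different interface than the one the packet arrived on."""
    src = ipaddress.ip_address(source_ip)
    for name, subnet in INTERFACE_SUBNETS.items():
        if name != arrival_interface and src in subnet:
            return True
    return False

# A neighbour inside the placeholder /22, arriving via the other WAN port.
print(is_ip_spoof("203.0.113.25", "wan_metrobridge"))
```

The wider the subnet attached to one port, the more legitimate outside senders get caught by the check when their traffic arrives on the other port.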