If you’ve started keeping a closer eye on security for your web site, then you’ve probably at some point started looking at the actual logs of users visiting it. These can go by a lot of different names; we’ll stick with the blunt “web logs” for this article. These are files that show you exactly who is doing, or trying to do, what with your web site, where they’re doing it from, and how they’re doing it (what browser they are using, often what operating system, and more information depending on your host’s web server software).
If you have a busy website, it quickly becomes prohibitive to go through every line every day, and you’ll want to be able to scan the logs for problems instead. This is a good idea, but to do so you need some idea of what kinds of problems you’re looking for. Here is a summary of what to keep an eye out for:
Hits to non-existent pages
To scan your logs effectively, you need to know the names of the actual pages on your web site without looking them up, so that you can tell immediately when you are looking at an attempted hit to a non-existent page. There are a few common page names that you’ll see from someone trying to infiltrate. “index.php” is one, and no, this won’t accomplish anything on their end if all you have is “index.html”: the different extension makes it a completely separate page.
OK, well then, what’s the danger? Nothing, immediately. The reason you’ll see these attempts is that some web design software packages have built-in bugs: they create pages with predictable names that contain vulnerabilities. These hits are attempts to find and access those pages.
One important note about this, though, is that this isn’t always bad news. Search engine spiders often do the same auto-browsing, but in this case they are looking for pages that contain instructions for the search engine, like “robots.txt.”
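As a rough illustration of scanning for these hits, here is a minimal Python sketch. It assumes your logs use the Apache/NGINX “combined” layout, which may not match what your host’s web server writes, and the sample lines, IP addresses, and the function name missing_page_hits are made up for the example. It simply counts requests that the server answered with a 404 (“not found”) status.

```python
import re
from collections import Counter

# Assumption: Apache/NGINX "combined" log layout. Adjust the pattern
# if your host's web server writes its logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def missing_page_hits(lines):
    """Count requests answered with 404, grouped by (ip, url)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            hits[(match.group("ip"), match.group("url"))] += 1
    return hits

# Made-up sample lines using documentation-only IP addresses.
sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.php HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:13:56:01 +0000] "GET /robots.txt HTTP/1.1" 404 209 "-" "ExampleBot/1.0"',
]

for (ip, url), count in missing_page_hits(sample).most_common():
    print(ip, url, count)
```

Note that the “robots.txt” request shows up in the results too. That is the search engine case described above, so a hit in this list is a reason to look closer, not proof of an attack.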
Funny-looking URLs
Not “ha ha” funny, either. There are two things you’re looking for here:
Lots of non-ASCII characters
These can be either control characters or characters from further down the character set. You’ll recognize them by their percent-encoded syntax: a “%” followed by two hexadecimal digits, like “%05”. Again, these need some script on your end to do something with them (they are meant to pass unauthorized instructions to said script), but it’s a sign that someone’s trying.
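To make that check concrete, here is a small Python sketch that pulls the percent-escapes out of a URL and flags the ones that decode to a control character or to a byte outside plain ASCII. The function name suspicious_escapes and the sample URLs are invented for illustration.

```python
import re

# A percent-escape is "%" followed by exactly two hexadecimal digits.
PERCENT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")

def suspicious_escapes(url):
    """Return the escapes in url that decode to control or non-ASCII bytes."""
    flagged = []
    for match in PERCENT_ESCAPE.finditer(url):
        value = int(match.group(1), 16)
        # Values below 0x20 and the value 0x7F are control characters;
        # 0x80 and above fall outside the ASCII range.
        if value < 0x20 or value >= 0x7F:
            flagged.append(match.group(0))
    return flagged

print(suspicious_escapes("/search.php?q=hello%20world"))      # []
print(suspicious_escapes("/view.cgi?file=report%00.txt%0A"))  # ['%00', '%0A']
```

An encoded space (“%20”) is left alone because it is ordinary. Keep in mind that perfectly legitimate URLs containing accented or non-Latin characters are also percent-encoded, so they will be flagged as well; treat the output as a list of lines worth reading, not a verdict.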
Attempted login information
Password protection is common. So are people who don’t realize that you need a password other than “password”. If you see a long URL sent to a .cgi, .php or other executable page, and the URL contains what looks like a username/password combo, then that is probably what it is: someone guessing at login details.
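A check along these lines can be sketched in Python as follows. The parameter names in USER_KEYS and PASS_KEYS are hypothetical guesses at what a login script might use, the list of executable extensions covers only the two named above, and the function name looks_like_login_attempt is made up, so adjust all of them to your own site.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; replace with the ones your scripts use.
USER_KEYS = {"user", "username", "login", "email"}
PASS_KEYS = {"pass", "password", "passwd", "pwd"}
# Only the two extensions named in the article; extend as needed.
EXECUTABLE_SUFFIXES = (".cgi", ".php")

def looks_like_login_attempt(url):
    """True if url targets an executable page with both a username-like
    and a password-like parameter in its query string."""
    parts = urlsplit(url)
    if not parts.path.lower().endswith(EXECUTABLE_SUFFIXES):
        return False
    keys = {key.lower() for key in parse_qs(parts.query, keep_blank_values=True)}
    return bool(keys & USER_KEYS) and bool(keys & PASS_KEYS)

print(looks_like_login_attempt("/admin/login.php?username=admin&password=password"))  # True
print(looks_like_login_attempt("/search.php?q=password"))                             # False
```

Requiring both kinds of parameter keeps an innocent search for the word “password” from being flagged.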
What to do when you see these things?
We’ve discussed many times what to do when you see these things. The quick solution:
- Block the IP addresses you need to, and don’t block any more than that, lest you risk filtering out legitimate traffic.
- Also don’t be afraid to ask your web host for an extra set of eyes if there’s something you’re suspicious of. Not only do they have more experience, but if there’s an attack affecting multiple users, then they might recognize something about its footprint that you wouldn’t be able to.
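One way to keep the block list short is to block only addresses that show up repeatedly among the suspicious hits. The Python sketch below assumes you have already collected the IP address of each suspicious request into a list; the function name block_candidates and the threshold of five hits are arbitrary choices for illustration, not a recommendation.

```python
from collections import Counter

def block_candidates(suspicious_ips, min_hits=5):
    """Return the IPs that appear at least min_hits times, sorted."""
    counts = Counter(suspicious_ips)
    return sorted(ip for ip, n in counts.items() if n >= min_hits)

# Made-up example: one address probing repeatedly, one seen only once.
events = ["203.0.113.7"] * 6 + ["198.51.100.4"]
print(block_candidates(events))  # ['203.0.113.7']
```

The address seen only once stays off the list, which is the point: a single odd request could easily be a legitimate visitor or a search engine.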
This is your website, your livelihood. There’s nothing wrong with being as secure about it as you want to be. Keep an eye on your log files and stay safe!