The importance of monitoring your server access logs
We recently had a client come to us with intermittent performance problems on their site. The site is a low-traffic brochure site on a shared hosting account. Most of the time it performs reasonably well, but at seemingly random times it slows to a crawl. The problem came to a head Friday afternoon, when their hosting provider shut down the site for using too many server resources.
The apparent randomness of the issue suggested that it might be related to spikes in traffic to the site. However, Google Analytics didn't show any corresponding traffic spikes.
Our next step was to look at the raw Apache access logs. There we found large spikes in requests to the site's user login page, apparently caused by bots. We identified this Friday evening and worked on a solution over the weekend to get the site back in order.
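As a rough illustration of this kind of log analysis, here is a minimal Python sketch that counts requests per hour and per client IP for a given path. It assumes the log uses Apache's common or combined log format. The function name `hourly_hits` and the `/user/login` path are placeholders chosen for the example, not details from the client's site.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# host ident user [timestamp] "METHOD path protocol" status ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def hourly_hits(lines, path_prefix):
    """Count requests per hour and per client IP for paths starting with path_prefix.

    path_prefix is a hypothetical example value such as "/user/login".
    """
    per_hour = Counter()
    per_ip = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip lines that do not parse or that target other pages.
        if not m or not m.group("path").startswith(path_prefix):
            continue
        # Timestamps look like 10/Oct/2014:13:55:36 -0700;
        # the first 14 characters give the date and hour.
        per_hour[m.group("ts")[:14]] += 1
        per_ip[m.group("ip")] += 1
    return per_hour, per_ip
```

To use it, pass an open log file and the path of interest, for example `hourly_hits(open("access.log"), "/user/login")`, then look at `per_hour.most_common(10)` for the busiest hours and `per_ip.most_common(10)` for the noisiest clients. A handful of hours or addresses with counts far above the rest is the kind of spike described above.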
The solution in this case was to restrict access to the site's login and administration pages by IP address. This was possible because the site has no public-facing authentication requirement: only site admins and content editors log in, and they do so from a few known locations. This simple change reduced the site's traffic from over 5,000 page requests per day to about 1,000 page requests per day, which corresponds with their Google Analytics traffic reports.
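The decision logic behind an IP restriction like this can be sketched in a few lines of Python. This is only an illustration of the idea, not the configuration we deployed. In practice the rule would usually live in the web server or application configuration. The path prefixes and networks below are hypothetical placeholders (the addresses come from ranges reserved for documentation).

```python
import ipaddress

# Hypothetical values for illustration only, not the client's real paths or addresses.
PROTECTED_PREFIXES = ("/user", "/admin")
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # e.g. an office network
    ipaddress.ip_network("198.51.100.42/32"),  # e.g. a single editor's address
]


def is_allowed(client_ip, path):
    """Return True if the request may proceed, False if it should be denied."""
    # Public pages stay open to everyone.
    if not path.startswith(PROTECTED_PREFIXES):
        return True
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # An unparseable address is treated as not allowed.
        return False
    # Login and admin pages are reachable only from known networks.
    return any(addr in net for net in ALLOWED_NETWORKS)
```

With these placeholder values, `is_allowed("192.0.2.1", "/user/login")` is denied while the same address can still reach `/about`. Bot requests to the login page are then rejected cheaply instead of being processed as full page requests.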
The takeaway from this is a reminder that while Google Analytics is a fantastic tool for seeing what the legitimate traffic to your website is doing, it relies on JavaScript running in the visitor's browser, which most bots never execute. You need to dig down into your server access logs to see what's really going on and to ensure the health of your web infrastructure.