Web Hosting Performance Anxiety
We have pretty high standards for web hosting around here, and I gotta say, performance on our shared hosting system has really stunk over the last few weeks, particularly on weekday mornings. (Our managed servers and VPS accounts are running fine.) This is especially bad because the shared system is where most of our customers host their sites, and weekday mornings are when their visitor traffic matters most; we've been getting a lot of feedback about the problem.
One of our first clues that something was going wrong was the (recently described) misbehavior of MIVA stores.
Additionally, backups started taking longer to complete; backing up 1,079,139,162,282 bytes in bazillions of files takes a long time, and since the backup process itself is relatively resource intensive, it cannot be allowed to run once traffic starts picking up around 7-8AM. But even with the manual termination of backups, we continue to experience intermittent performance slowdowns at other times during the business day.
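To put that byte count in perspective, here is a rough back-of-the-envelope calculation in Python. The only figure taken from above is the backup size; the 6-hour window is purely an assumed number for illustration (the real window isn't stated here), and the helper names are made up.

```python
BACKUP_BYTES = 1_079_139_162_282  # backup size quoted above


def to_gib(num_bytes):
    """Convert a byte count to binary gigabytes (GiB, 2**30 bytes)."""
    return num_bytes / 2**30


def required_mb_per_s(num_bytes, window_hours):
    """Sustained throughput (decimal MB/s) needed to copy num_bytes in the window."""
    return num_bytes / (window_hours * 3600) / 1_000_000


# ASSUMPTION: a 6-hour overnight window, chosen only as an example.
ASSUMED_WINDOW_HOURS = 6

print(f"Backup size: {to_gib(BACKUP_BYTES):.1f} GiB")
print(
    f"Needed to finish in {ASSUMED_WINDOW_HOURS}h: "
    f"{required_mb_per_s(BACKUP_BYTES, ASSUMED_WINDOW_HOURS):.1f} MB/s sustained"
)
```

Raw throughput is only part of the story: with that many small files, per-file overhead on the file server tends to matter as much as the number of bytes moved.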
Our server monitoring software (Nagios), which checks on the health of everything in the datacenter every couple of minutes, has even detected connection timeouts from the main web hosting cluster. That means that at least one of the servers in the cluster was unable to accept a connection, let alone deliver a simple PHP page, within 10 seconds -- from inside our network. Not good.
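If you want to run a similar check by hand, here is a minimal Python sketch of the kind of test described above: fetch one page and call it a failure if it doesn't arrive within 10 seconds. This is not Nagios's own plugin code; the name `check_page` is made up for illustration, and only the 0/2 return codes follow the usual Nagios plugin convention.

```python
import http.client
import time
import urllib.request

OK, CRITICAL = 0, 2  # conventional Nagios plugin codes for healthy / failing

# Ignore any proxy settings in the environment so the request goes straight
# to the web server, as a check run from inside the network would.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check_page(url, timeout=10.0):
    """Fetch url and return (code, message) in the style of a Nagios plugin."""
    start = time.monotonic()
    try:
        # The timeout applies to each blocking socket operation
        # (connect, read), not to the request as a whole.
        with _opener.open(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except (OSError, http.client.HTTPException) as exc:
        # Covers refused connections, timeouts, and HTTP 4xx/5xx responses.
        return CRITICAL, f"CRITICAL - {url} failed: {exc}"
    elapsed = time.monotonic() - start
    if elapsed > timeout:
        # Each socket operation was fast enough, but the total was not.
        return CRITICAL, f"CRITICAL - {url} took {elapsed:.2f}s"
    return OK, f"OK - HTTP {status} in {elapsed:.2f}s"
```

Calling `check_page("http://www.example.com/test.php")` (a placeholder URL) returns a code and a one-line message you can print or log.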
Additionally, most of us use the lori extension for Firefox, which tells you how long it takes for your browser's request to get the very first byte of information back, and then how long it takes to render all the HTML and load all the images. If the network and the server are working well, you'd expect that first byte to reach the browser in well under one second. When we're experiencing performance troubles (as we have several times last week and this week), we've seen the first byte delayed by more than 10 seconds, again, from inside our network. Not good!
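You can get a rough version of the same first-byte number without a browser. The sketch below is not how lori works internally; it's a small Python approximation that times how long the response headers take to arrive and how long the full HTML takes to download. It doesn't render anything or load images, and the name `time_request` is made up for illustration.

```python
import http.client
import time
from urllib.parse import urlsplit


def time_request(url, timeout=10.0):
    """Return (seconds_to_first_byte, seconds_total, body_length) for one GET."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    start = time.monotonic()
    try:
        conn.request("GET", path)  # connects, then sends the request
        # getresponse() returns once the status line and headers have been
        # read, which is a close stand-in for "first byte received".
        resp = conn.getresponse()
        first_byte = time.monotonic() - start
        body = resp.read()  # download the rest of the page
        total = time.monotonic() - start
    finally:
        conn.close()
    return first_byte, total, len(body)
```

On a healthy server the first value should be a fraction of a second; a reading anywhere near the 10-second mark matches the bad mornings described above.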
All of these phenomena appear to be tied to an increasing load on our central file server, as the trend in the graph below shows:
So file server load has increased rather dramatically in the last month or so, coinciding with MIVA problems, late-finishing backups, and generalized shared hosting performance issues. But is that the cause or the effect?
A theory we're pursuing is that one site, or a small handful of sites, has begun exhibiting outside-the-norm behavior in terms of reads and writes to the file system. Unfortunately, great tools don't exist (at least as far as we can find) for this sort of NFS traffic analysis, so it's been slow going.
There's a long-term solution that we've begun working on, but we need to identify the source of our trouble and restore our standard speedy service as soon as we can.
So, if you're hosting on the Modwest shared system and you think you might have recently started doing a whole lot of filesystem I/O, let's talk. Examples would be CMS/blogging software (such as Movable Type) which can periodically (or far too often) generate thousands of cache files, or an MLS system which frequently downloads and processes thousands of images, or any software which requires (or leaves behind) thousands of files.
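If you're not sure whether your site fits that description, counting files is a quick first check. The Python sketch below is only a suggestion, not a tool we run: it walks a directory you point it at and reports how many files sit under each top-level folder. The name `files_per_folder` and the example path are made up. Keep in mind that walking a very large tree generates file system traffic of its own, so run it sparingly and off-peak.

```python
import os
from collections import Counter


def files_per_folder(root):
    """Count files under each immediate subdirectory of root (recursively)."""
    counts = Counter()
    for entry in os.scandir(root):
        # Skip plain files and symlinks at the top level.
        if not entry.is_dir(follow_symlinks=False):
            continue
        # os.walk does not follow symlinked directories by default.
        for _dirpath, _dirnames, filenames in os.walk(entry.path):
            counts[entry.name] += len(filenames)
    return counts


def report(root, top=10):
    """Print the folders holding the most files, largest first."""
    for name, count in files_per_folder(root).most_common(top):
        print(f"{count:>10}  {name}")
```

For example, `report("/path/to/your/site")` (a placeholder path) lists the ten busiest folders; a cache or image directory holding many thousands of files is worth a closer look.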
In summary -- we know about the problem, we want to fix it, and we're doing everything we can. If you have advice or questions, please add a comment below or shoot us an email.
Who is your shared server host?
You should try out HostGator.
Posted by: Web Hosting Provider | December 21, 2007 at 01:28 AM