halo.bungie.org

They're Random, Baby!

HBO's Web Issues: Where We Stand

If you're not into reading complicated explanations, skip to the bottom of this page for the relevant info, in nice, easy-to-digest bullet format. Otherwise, read on.


Recently, there have been a number of glitches in the day-to-day running of HBO; some are obvious (such as when the front page shows no news at all), others are a little more subtle but still noticed by many (such as issues with the spoiler system). It's time, I think, to explain a little bit about the actual running of this site.

When HBO started, it was run by a bunch of Mac fanatics, and hosted on a series of Macintosh computers; we upgraded periodically, but we stayed with Macs for a few reasons:

  • We knew 'em pretty well
  • They were pretty solid at the traffic levels we got in the early days
  • We built a lot of dynamic content using a Mac-only backend system

Time went on, and we started to see problems. Macs are great - but the networking code built into the version of the operating system we needed to use (because of this backend we used) is really not robust enough for serious webserving. In addition, some of the databases we were using had gotten too complicated for the simple database system built into this backend program. In the fall of 2002, we were seeing enough traffic that HBO was being brought down regularly simply from too many hits. Our solution was to spread the load; we moved subsections of the site to new machines (halosm and halosn). This helped - a LOT - and for almost a year we were able to keep up as we'd been before, growing in popularity without crashing too often.

Eventually, however, we even outgrew these THREE machines. At this point, we brought a unix box online (nikon) and moved all static content over to it - everything that didn't need our dynamic backend code. Again, that helped - a lot. We realized, however, that again, this was just a temporary solution; the only real long-term answer was to move away from this backend system that was locked into a platform that simply can't handle the sort of traffic HBO sees. [That traffic is considerable, by almost any standard; these days, not including the movie traffic we do (which, with the help of Brian Towne and Mythica.org, tops 200 GB/day), we're seeing more than 600,000 pageviews per day, with raw hits in excess of 4 times that value. Even split between multiple machines, that's an industrial-strength number... and requires industrial-strength webserving, something Macs are simply not up to.]

Update, 16 November 2004: A bunch of Mac lovers have written to us, wondering why I say we have to leave the Mac as a hosting platform. Doesn't Apple make some of the best servers out there (Xserves)? Yes, they do. They also cost 3-4 times as much as a comparably equipped P4-based server; I simply don't have the cash for a rack full of them. (We can't use them right now for the Mac-based content anyway - our backend stuff runs only on OS9.)
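To put the traffic figures quoted above in perspective, here is a rough back-of-the-envelope calculation. It only uses the numbers given in the text (600,000 pageviews per day, raw hits at 4 times that, 200 GB/day of movie traffic); since those are all stated as lower bounds, the results are floors on the daily averages, and real peaks would be well above them. Treating 1 GB as 10^9 bytes is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures quoted in the text (all stated as "more than" / "tops").
pageviews_per_day = 600_000
hits_per_pageview = 4
movie_gb_per_day = 200  # assumption: 1 GB = 10**9 bytes

hits_per_day = pageviews_per_day * hits_per_pageview
avg_pageviews_per_sec = pageviews_per_day / SECONDS_PER_DAY
avg_hits_per_sec = hits_per_day / SECONDS_PER_DAY
avg_movie_mbit_per_sec = movie_gb_per_day * 1e9 * 8 / SECONDS_PER_DAY / 1e6

print(f"raw hits per day:       {hits_per_day:,}")                   # 2,400,000
print(f"avg pageviews / second: {avg_pageviews_per_sec:.1f}")        # 6.9
print(f"avg raw hits / second:  {avg_hits_per_sec:.1f}")             # 27.8
print(f"avg movie bandwidth:    {avg_movie_mbit_per_sec:.1f} Mbit/s")  # 18.5
```

These are round-the-clock averages; they say nothing about how the load is distributed over the day.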

We've therefore been porting all this custom code over to the more standard combination of PHP and MySQL, and as pieces move, sections become faster, more robust, more available. There is still an awful lot of code running on the Macs - and as long as incoming content continues to pour in the way it's doing, that's not going to change immediately. However, in the meantime, it means some areas of the website are hybrids, running on more than one machine at a time. And THIS fact explains some of the glitches people have been seeing recently.

The news section, for example, is running entirely on nikon - but since we don't want to confuse people, you still see the day's news on halo.bungie.org. This is done by HBO making a request to nikon each time you hit the page, and including the results of the query in the page returned to you. (This may SOUND less efficient than storing the news on HBO directly... but trust me, it's not.) The News Archives and the News By Day sections are completely housed on nikon, so the only issue there is cookies... but for the main page, your cookie setting (read on HBO) has to be transferred to nikon in order that you're only shown the news you've asked for. Sometimes, HBO glitches, and gets no response back from nikon after this request. If this happens, it caches the empty response... and from then until the server cache is flushed out, every single visitor to the website is served an empty page. (It's supposed to flush itself every 30 minutes, but clearly, it doesn't always work that way. We've had at least two extended runs - more than 4 hours each - of blank front page display.)
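The failure mode described above (an empty response gets cached, then served to everyone until the cache is flushed) can be illustrated with a short sketch. This is not HBO's actual code, which runs on the Mac-only backend; it is a minimal Python model of one possible guard, in which an empty or failed fetch is never allowed to overwrite the cache. The class name `FragmentCache` and the `fetch` callable are hypothetical; the 30-minute lifetime is the flush interval mentioned in the text.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 30 * 60  # the 30-minute flush interval mentioned in the text


class FragmentCache:
    """Caches one fetched page fragment, but never an empty one."""

    def __init__(self, fetch: Callable[[], str],
                 ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # hypothetical callable that queries the other machine
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[str] = None
        self._stored_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is not None and (now - self._stored_at) < self._ttl:
            return self._value  # cached copy is still fresh
        try:
            body = self._fetch()
        except OSError:
            body = ""  # treat a network failure like an empty response
        if body.strip():
            # Only a non-empty response may replace the cached copy.
            self._value = body
            self._stored_at = now
            return body
        # Empty or failed fetch: serve the last good copy if there is one,
        # and leave the cache untouched so the next request retries.
        return self._value if self._value is not None else ""
```

With a guard like this, a single missed response costs one visitor an empty (or slightly stale) page instead of blanking the front page for every visitor until the next flush.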

This also explains the fact that there are two separate cookies to set - one on HBO, the other on nikon. This has been a cause of confusion for people; they think they've set their cookie, they go to 'news by day', and the setting is different. (Or it's set right for 'news by day', but the front page is wrong.) Yes, you have to set the cookie twice; once on nikon, once on the main machine. This isn't a bug; it's a limitation of running the site split between multiple machines.
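The two-cookie behaviour can be pictured with a toy model of a browser's cookie store. It assumes the spoiler cookies are host-only (sent back only to the exact host that set them), which matches the behaviour described above; the cookie name `spoilers` and its value are made up for illustration.

```python
from collections import defaultdict
from typing import Dict


class HostOnlyCookieJar:
    """Toy browser cookie store: a cookie is visible only to the host that set it."""

    def __init__(self) -> None:
        self._by_host: Dict[str, Dict[str, str]] = defaultdict(dict)

    def set_cookie(self, host: str, name: str, value: str) -> None:
        self._by_host[host][name] = value

    def cookies_for(self, host: str) -> Dict[str, str]:
        # Only cookies set by this exact host are sent back to it.
        return dict(self._by_host.get(host, {}))


jar = HostOnlyCookieJar()
jar.set_cookie("halo.bungie.org", "spoilers", "hide")  # hypothetical cookie

print(jar.cookies_for("halo.bungie.org"))   # {'spoilers': 'hide'}
print(jar.cookies_for("nikon.bungie.org"))  # {}
```

Setting the preference on one machine leaves the other machine's view empty, which is why the setting has to be made once on each.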

The 'blank page' syndrome has shown up in one other place, as well - the Links page. If you visit the page, and it seems really short... it probably is, and that's because it's showing you an empty middle section (where the results of the query to nikon's links database are supposed to go). Again - this isn't supposed to last very long, but it can.

All of these issues are transient, in the sense that as soon as we get fully moved over to nikon, the problems will vanish. (Well, these problems, anyway... I'm sure there'll be other ones.) In the meantime, please be patient; if you run across an empty page, wait a few minutes and try again. If it's been empty for a long time (more than 30 minutes), you can send us an email, if you choose... but realize that all we need to know is that it's empty; you don't need to demand we fix it, or give us your browsing history, or yell at us for running a crappy half-professional site. Just point out that it's empty, and we'll fix it as soon as we hear about it. Thanks!

So... a summary of this whole long-winded mess, for those who found this too technical for their liking:

The Important Stuff

  • There are two spoiler cookies to set (halo.bungie.org and nikon.bungie.org)
  • An empty news page means a corrupted server cache - we'll clear it soon
  • All of these problems will disappear when this transition period is over

Sometimes the quantity of traffic we see floors me - it seems huge compared to any other website I run. Then I remember how many people play this game, and how many complete sources of Halo information there are on the net right now... and I wonder why it's not bigger.

We're doing the best we can, and the kind words you send our way warm our hearts... but for those who see only the annoyances, please, please, PLEASE try to understand that we're doing the best we can with what we have. Thanks!
