Hacking Days at Wikimedia

It’s 100+ degrees here in Boston, and I spent the morning walking around MIT trying to find “Wikimania: Hacking Days,” a small pre-conference to the main Wikimania conference, which starts on Friday.

The event is for hardcore tech folks, and hardcore tech folks only. This is where folks debate database structure, server configurations, caching, and text editors–the details, as we say in the business. This, of course, will lead some folks to ask me why I’m here and not working on some Excel model of what our business will look like in 2018, going on a sales call, or eating sushi lunches with other CEOs.

The answer to why I attend events like this is that many of the business issues and opportunities in our industry come out of the tech discussions–maybe even most of them. The tech folks typically don’t get listened to enough in corporate environments, from what I’ve experienced. People would rather hang out with the Harvard MBAs for some reason. In my experience the MBAs are great at regurgitating the past (a function of the case-study methodology) but bad at finding the opportunities in the future. Tech folks do the future better, but sometimes get trapped in details that really don’t matter. Everyone’s different, and you have to understand how your team thinks if you’re gonna get anywhere in life.

While on the subject of MBAs, I have to say I find many of them very inauthentic. It’s maddening to hear them say stuff like “this worked for EBAY so we should do that,” as opposed to “why did this work for EBAY at that moment in time, and what can we learn from it given how much things have changed since then?” That is what the tech folks are capable of: those deep, deep logical debates backed up by technical facts. Of course, those debates can result in products and plans that are so wildly complex that you can’t bring them to a mass market. So, the truth is that you have to triangulate between the MBAs, the tech folks, and the sales people if you want to find a hit–that’s my strategy (of course, all of these things answer to the user at the end of the day as well, so don’t forget the user overlay).

First up is Brion Vibber, the CTO of Wikimedia, to speak about the state of the wiki’s tech. He reports that Wikipedia has been stable for the last year, after years of performance issues. He says they add servers proactively now. Wikipedia runs on 200 servers, and there are many issues when you have that many servers–like updating all of them with changes (I’ve heard that one before).

He said the process of managing the server clusters is basically hanging out in IRC and waiting for someone to say something is broken.

Building software is easy today, but scaling software and servers is hard. We’ve seen this over and over again. Anyone can make blog software, but can you build software that scales to 20 servers, with routers that don’t melt (we had one melt–literally), when 50,000 people a minute try to load a post?

Found a podcast with Brion, and here is a photo of him from the MySQL conference.

Mark Bergsma is up next, talking about the Squid caching architecture, geographical DNS, and purging (the cache).

The Wikimedia clusters started in Florida (pmtpa) and Amsterdam (knams), with the Amsterdam machines not housing any data but simply caching HTTP. The third center came from a Yahoo donation in Seoul, South Korea (yaseo).

The Squid setup is two-tiered, with a line of cache servers in front of the content servers running Apache. If a Squid doesn’t find a document in its cache, it asks its local neighbors (sibling caches), then a parent cache, and then the Apaches. The Squids are diskless.
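Here’s my own toy Python sketch of that fallback chain–just to illustrate the idea, not Wikimedia’s actual setup, and the server names are invented:

```python
# Toy sketch of the lookup order Mark described: local cache, then sibling
# Squids, then a parent cache, then the Apache origin servers. Everything
# here (names, the simulated origin) is made up for illustration.

LOCAL_CACHE = {}                      # the real Squids are diskless, so RAM only
SIBLINGS = {"sq2": {}, "sq3": {}}     # neighboring caches in the same tier
PARENT_CACHE = {}                     # upstream cache tier


def render_at_origin(url):
    """Stand-in for an Apache box actually rendering the page."""
    return f"<html>rendered {url}</html>"


def lookup(url):
    if url in LOCAL_CACHE:                        # 1. local hit
        return LOCAL_CACHE[url]
    for cache in SIBLINGS.values():               # 2. ask the neighbors
        if url in cache:
            return cache[url]
    if url in PARENT_CACHE:                       # 3. ask the parent cache
        doc = PARENT_CACHE[url]
    else:
        doc = render_at_origin(url)               # 4. last resort: Apache
    LOCAL_CACHE[url] = doc                        # remember it for next time
    return doc


print(lookup("/wiki/Boston"))   # first request falls through to the origin
print(lookup("/wiki/Boston"))   # second one is a local cache hit
```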

The Squids have been split into two groups: one for content and one for images–that’s smart. Load balancing is done by Linux Virtual Server (LVS), and Mark says that is working OK.
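To make that split concrete, here’s a made-up Python sketch of sending image requests to one pool and everything else to the content pool, round-robin within each pool (LVS does the real balancing down at the IP layer; these host names are invented):

```python
from itertools import cycle

# Invented illustration of the content/image split: image requests go to one
# pool of Squids, everything else to the content pool, with simple round-robin
# inside each pool.

POOLS = {
    "images": cycle(["img-sq1", "img-sq2"]),
    "content": cycle(["text-sq1", "text-sq2", "text-sq3"]),
}


def pick_server(url):
    pool = "images" if url.startswith("/images/") else "content"
    return next(POOLS[pool])


for url in ["/wiki/Squid", "/images/wiki-logo.png", "/wiki/LVS"]:
    print(url, "->", pick_server(url))
```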

Mark went into a long discussion of DNS routing, something Brian has been educating me on. It’s complex and important, and it is basically the process by which users are routed to servers. So, if you have 200 servers and someone connects to you from France, which server do you send them to? Google and Yahoo do this really well.
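To make the geographic DNS idea concrete, here’s a toy sketch (mine, not their setup) that answers the question of where to send that user from France by mapping a region to the nearest cluster–the region table is invented, and real setups work off IP/GeoIP data rather than a hand-typed dictionary:

```python
# Toy version of geographic DNS: answer the query with the address of the
# closest cluster for the user's region.

CLUSTERS = {
    "pmtpa": "Tampa, Florida",
    "knams": "Amsterdam",
    "yaseo": "Seoul",
}

REGION_TO_CLUSTER = {
    "North America": "pmtpa",
    "South America": "pmtpa",
    "Europe": "knams",
    "Africa": "knams",
    "Asia": "yaseo",
    "Oceania": "yaseo",
}


def resolve(region):
    """Which cluster should a user in this region be sent to?"""
    cluster = REGION_TO_CLUSTER.get(region, "pmtpa")   # Florida as the default
    return cluster, CLUSTERS[cluster]


print(resolve("Europe"))   # the user in France lands on the Amsterdam caches
```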

Another important discussion Mark had with the group was object purging. This is the process by which objects are removed from a server’s cache so that a fresh copy can take their place (for example, when a page is edited).
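One common way to purge a single object from Squid is an HTTP PURGE request, which Squid only honors if its access rules allow it. A rough sketch (the host name is made up):

```python
import http.client

# Rough sketch of invalidating one object after a page edit: send an HTTP
# PURGE request for that URL to the cache box.


def purge(cache_host, path, port=80):
    conn = http.client.HTTPConnection(cache_host, port, timeout=5)
    conn.request("PURGE", path)       # non-standard method that Squid understands
    response = conn.getresponse()
    conn.close()
    return response.status           # 200 = purged, 404 = wasn't in the cache


# e.g. after someone edits the Boston article:
# purge("sq1.example.org", "/wiki/Boston")
```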

At the end Mark showed off his Foundry BigIron RX-8, a $60,000 router.
