Economics of Wikipedia Policy Debates (did you know Wikipedia doesn’t keep log files? How cool is that!)

An economist named Jeremy is taking on the policy debates at Wikipedia (of which there are many) right now at the Wikipedia Hacking Days conference I’m attending in Boston.

For example, users who are not logged in are not allowed to create new pages on Wikipedia. Some folks think non-logged-in people (the most anonymous people) should be allowed to.

Jeremy says we need to think about objectives first. He says the key metrics are:

  1. Number of edits
  2. Quality of edits
  3. The readership’s comprehension of, and learning from, the data in Wikipedia.

You can gauge how effective edits by non-logged-in people are by seeing how often they get reverted (i.e. canceled). Very interesting idea. If non-logged-in users have their changes reverted most of the time, there really is no reason to let them create new pages, because their effectiveness is so low. If there is no difference in revert rates between logged-in and non-logged-in users, then you might as well let them.
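To make the revert idea concrete, here is a minimal sketch of how you might compare the two groups. Everything in it is my own assumption for illustration: the function name `revert_rates`, the (logged_in, reverted) record shape, and the sample data are all made up, and none of it comes from Jeremy’s talk or from real Wikipedia data.

```python
from collections import defaultdict


def revert_rates(edits):
    """Return the share of reverted edits per user group.

    `edits` is a hypothetical iterable of (logged_in, reverted) boolean pairs,
    one pair per edit. This is an illustrative format, not a Wikipedia export.
    """
    totals = defaultdict(int)
    reverted = defaultdict(int)
    for logged_in, was_reverted in edits:
        group = "logged_in" if logged_in else "anonymous"
        totals[group] += 1
        if was_reverted:
            reverted[group] += 1
    # Only groups that actually appear in the data get a rate.
    return {group: reverted[group] / totals[group] for group in totals}


# Made-up sample: four edits per group.
sample = [
    (True, False), (True, False), (True, True), (True, False),
    (False, True), (False, False), (False, True), (False, True),
]
rates = revert_rates(sample)
print(rates)
```

With numbers like these, a big gap between the two rates would argue for keeping the restriction, and similar rates would argue for dropping it. A real analysis would also need a sample large enough to tell an actual difference from noise.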

Jeremy is now talking about readership data. Turns out the Wikipedia folks are very privacy-centric and don’t like to keep their logs much (of course, they record IP addresses for changes, so that’s the end of privacy right there if you want to edit stuff!).

Wikipedia turned their log files off a while ago because they can’t store them, given the 5-7 billion pages they serve up every month. The result is that it’s really hard to track how users are flowing through the site. The tech folks love the fact that the site doesn’t have log files, since they are very concerned about privacy. The German version of the site is doing JavaScript tracking, so they have an idea of what is happening there.

So, the discussion has gotten into the privacy issues of knowing what people are searching for on Wikipedia.

… more details on the talk coming.
