Wikipedia: An "Open Source" Encyclopedia Gains Respect
Wikipedia (www.wikipedia.org) is a multilingual "copyleft" encyclopedia designed to be read and edited by anyone. It is collaboratively edited and maintained by thousands of users via wiki software, and is hosted and supported by the non-profit Wikimedia Foundation. In addition to typical encyclopedia entries, Wikipedia includes information more often associated with almanacs, gazetteers, and specialist magazines; and coverage of current events. -Wikipedia
That's the self-definition of Wikipedia, an ambitious project to chronicle the world's knowledge by creating an online encyclopedia written entirely by volunteers. In a world where Utopian dreams often go sour, Wikipedia, like Linux, is a success story, a testament to the organizing power of the Web. Wikipedia applies the idea of open source to a reference work. Articles are written, edited, and reviewed entirely by volunteers. The website carries no advertising (the domain name changed from .com to .org in 2002). And while article quality varies, the results are often surprisingly good, sometimes better than "closed source" reference works like the Encyclopedia Britannica.
Wiki, Hawaiian for "fast," refers to a hypertext document that anyone can edit. The idea began with the Portland Pattern Repository, created by Ward Cunningham, which covers some of the underlying principles of extreme programming. Wikipedia claims to be the largest Wiki project ever created.
Wikipedia could be described as a benevolent dictatorship, with co-founder Jimmy Wales serving as the site's Linus Torvalds. But as reference works go, Wikipedia is remarkably democratic. Anyone can initiate a new article, and if you search for a topic not already covered, you are encouraged to write it yourself. Anyone can also edit, correct or augment most current articles. Readers are invited to add more details about their favorite city, summarize a recently read book, profile an actor, or just hit the random link and see if they have anything to add. Version control between edits is straightforward: each article carries with it a list of every revision ever made, with the ability to compare changes between any two of them. Every article also carries with it a separate discussion section. The discussion surrounding how to handle some topics-like abortion or Microsoft-is extensive.
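The revision model described above, in which every article keeps its full list of revisions and any two can be compared, can be illustrated with a small sketch. This is not MediaWiki's actual implementation; the Article class and its edit and compare methods are hypothetical names invented for illustration, and the comparison relies on Python's standard difflib module.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article that never discards an old revision."""
    title: str
    revisions: list = field(default_factory=list)

    def edit(self, new_text):
        """Store the new text as a fresh revision and return its number."""
        self.revisions.append(new_text)
        return len(self.revisions) - 1

    def compare(self, old, new):
        """Return a line-by-line unified diff between any two revisions."""
        return list(difflib.unified_diff(
            self.revisions[old].splitlines(),
            self.revisions[new].splitlines(),
            fromfile=f"revision {old}",
            tofile=f"revision {new}",
            lineterm="",
        ))
```

Calling edit twice and then compare(0, 1) would list the lines added or removed between the first and second revisions; comparing a revision with itself yields an empty list.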
The guidelines for article-writing are simple. Wikipedia has a few suggestions for layout: a clear introduction, short paragraphs, appropriate images, interesting quotations on the topic, cross-links to related subjects within Wikipedia and external links beyond. Wikipedia has an online style manual to resolve questions of punctuation, capitalization, and naming conventions. In keeping with the wiki format, hyperlinks are plentiful. Articles in their early stages may feature little of this, but are slowly "wikified" toward the preferred format over time.
Many Wikipedia articles began as simple entries, then evolved into more complete treatments, sometimes with many associated internal links. Back in October 2001, for example, the Wikipedia article on Sony read in its entirety:
Tokyo, Japan-based consumer electronics giant. Its very first consumer product, in the late 1940s, was a rice boiler.
2000 Sales: 63 billion USD
2000 Employees: 189,700
Name coined by founder Akio Morita to convey "sound" and "sonny" for youth and energy. See http://www.snopes2.com/business/names/sony.htm
Also see PlayStation and Walkman.
Thirty revisions later, the current entry gives a brief history of the company, including the origin of the name and its September 2004 purchase of MGM, together with a long list of links to other Wikipedia articles covering Sony's film and production sites, its related music publishers, and its many products, including rumors on the forthcoming PlayStation 3.
In this age of functional illiteracy (especially in America), this conception of reader as contributor and editor sounds too good to be true and one can't help marveling at the quality of the work. I sent a friend who is an authority on stereoscopy Wikipedia's article on the same. He thought it thorough and accurate. Another friend who has taught college-level educational psychology gave Wikipedia's article on educational theorist Jean Piaget a B+. Not every article comes up to this standard, but enough do that Wikipedia has become, over time, as useful in its own realm as Google is in Web searches. Indeed, Google has drawn people to Wikipedia through its search results.
From Nupedia to Wikipedia
Before co-founding Wikipedia, Wales tried his hand at an earlier online encyclopedia called Nupedia. That project didn't go very far-just 24 completed articles-but it did establish some of Wikipedia's operating philosophy. Like Wikipedia, it used the GNU Free Documentation License, looked for contributors who were experts in their field, and established a review and editing process that tried to produce a professional result.
Wikipedia began life as an internal collaborative medium for Nupedia articles in development, then became an online encyclopedia in its own right-going live on January 15, 2001 (referred to by Wikipedians as "Wikipedia Day"). Growth was dramatic. The encyclopedia featured some 20,000 articles during its first year of operation. Last September, Wikipedia reached 1 million articles in 100 languages-growing at the rate of 2,500 articles a day. "We're doing over 300 million page views a month now, and we have doubled in the last eight weeks," says Wales. He thinks the number of hits Wikipedia gets is accelerating, possibly because searches from Google and other engines are linking to it. "I expect that by next March, we should be doing about a billion pages a month-at least that's the target."
The first article in a nascent Japanese version appeared in April 2001. It now has 75,000 articles, making it the third-largest Wikipedia version (after English and German), mostly contributed by native Japanese speakers. Other multi-language projects sponsored by the Wikimedia Foundation include a dictionary and thesaurus (Wiktionary), a compendium of quotes (Wikiquote), a set of books and manuals (Wikibooks), and public domain documents (Wikisource).
Wikipedia and its siblings run on 25 servers using the LAMP (Linux, Apache, MySQL, Perl/PHP) platform. When a query comes in, a Wiki server fetches needed information out of the MySQL database. Another server running Apache assembles the data into the completed page using the default format or customized to user preference. At the front end are five servers running the Squid Web Proxy Cache, which feed the page to the end-user. Wales estimates that for any given page, about 75 percent comes out of the cache, the remainder from the database. Associated servers hold image files, handle email and other auxiliary tasks.
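The division of labor described above, with cache servers answering most requests and the database-backed servers handling the rest, can be sketched as a simple read-through cache. This is only a toy model under stated assumptions: it is not how Squid works internally, and the CachingFrontEnd class, its method names, and the render_from_database stub are all hypothetical.

```python
def render_from_database(title):
    """Hypothetical stand-in for the database query plus page assembly."""
    return f"<html><h1>{title}</h1></html>"


class CachingFrontEnd:
    """Toy read-through cache placed in front of a slower page renderer."""

    def __init__(self, render_page):
        self._render_page = render_page
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, title):
        """Serve from the cache when possible; otherwise render and store."""
        if title in self._cache:
            self.hits += 1
            return self._cache[title]
        self.misses += 1
        page = self._render_page(title)
        self._cache[title] = page
        return page

    def invalidate(self, title):
        """Drop a cached page, for example after the article is edited."""
        self._cache.pop(title, None)

    def hit_ratio(self):
        """Fraction of requests answered without touching the renderer."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this model, four consecutive requests for the same page produce one miss and three hits, a ratio in line with the roughly 75 percent figure Wales cites. An edit would call invalidate so that the next reader sees the new revision.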
The project developed its own open source Wiki server, called MediaWiki, which is built atop MySQL. "It's very easy to install on a Linux server, and yet it scales to run this huge website," says Wales.
Pinning down the truth
Encyclopedia: An encyclopedia (alternatively encyclopaedia) is a written compendium of human knowledge. -Wikipedia
Wikipedia's growing popularity reflects its increased stature as a serious resource. But compared to a commercial encyclopedia, Wikipedia is not yet an authoritative source: a guaranteed accurate, balanced compendium of human knowledge. The mix of articles is heavy on technology, lighter on the arts, and is tilted more toward Western thought. But as more people come to Wikipedia, they will fill in the missing pieces and make the whole enterprise better. As with open source, getting more people involved results in ever more stable, more feature-rich versions. At least that's what has happened so far. The criteria for improving Linux are pretty clear: add features the community desires, and if there's a bug, fix it. Building an online encyclopedia is similar: if an article is missing, add one; if an article needs improving, improve it. The stickier problem can be figuring out what is a feature, and what is a bug. In a contentious world, what is truth?
One Wikipedia operating principle is that articles must have a neutral point of view-that is, they state the facts in a way that people on both sides of an issue can still agree. Wales says that one of the best techniques for achieving neutrality is to report on the controversy. "You can imagine on an issue like abortion: you could never get the two sides together. But reasonable people can at least agree on what the other side's position is-if they can come to an agreement and say we'll present the issue and let people decide for themselves. It works reasonably well." In the Wikipedia guidelines, Wales writes with some irony that "the key is to write about what people believe, rather than what is so. If this strikes you as somehow subjectivist or collectivist or imperialist, then ask me about it, because I think that you are just mistaken."
Most articles seem to have come to some agreement, but not all. Consider the article on MP3. Does 128 kbit/s compression produce unacceptably low quality or is that just an opinion? Does MP3 rival Apple's AAC, or not? Should piracy be discussed? The article's authors are still debating these points, and in the meantime, the prose is somewhat choppy. Some subjects are even more controversial. The article on Yasser Arafat is write-protected until disputes are resolved. Doing so may take at least as long as resolving the Middle East crisis itself.
Getting both sides of a controversy is one challenge. Accuracy is another. One professional researcher called Wikipedia a great third source of information. It is not a primary source because, unlike those in a commercial encyclopedia, its assertions may be false, whether through error or mischief. Wales says that long term, the answer will be to "select stable versions of some articles, those that have gone through a review process so that we feel confident in them. These articles could be flagged as achieving an authoritative status. Nothing is ever perfect, but we do want to be as good as Britannica."
In some articles, Wikipedia is already better than Britannica-with clear writing, great illustrations, and timely discussion that reflects late-breaking events. "When people first encounter Wikipedia and see that anyone can come in and edit pages, they imagine that a million different people each added a sentence, and it somehow turned into this cohesive work," Wales says. "But the project isn't really like that. Really there are 200-300 people that I would consider the core of the community, who are organizing, reviewing, supervising, and checking things. It ends up being those people who are writing the bulk of Wikipedia."
But what protects Wikipedia from the evil in the world: virus writers, website vandals, and malicious pranksters? What about misguided amateurs, true incompetence, and bias? Wikipedians have heard all of the objections and maintain a list of replies.
- What about cranks who argue the world is flat or describe perpetual motion machines?
- All new entries appear on a recent changes page-and the silly ones get quickly deleted.
- What about trolls and flamers who have already poisoned the waters of Usenet?
- Wikipedia differs from Usenet in two significant ways: Users can edit other users, and while Usenet encourages debates, encyclopedias encourage collective agreement-even if that agreement is about what people disagree on.
- What about amateurs writing on topics they know little about?
- Amateurs are encouraged to at least start an article, with room for experts to edit later. And while more articles are written by undergraduates than their professors, the emphasis remains on clear writing and good research. Professors are not always known for the former.
- What about the lack of a formal review process to weed out mistakes?
- Wales says that Wikipedia does have a review process. It is more freeform than formal, but it still works.
- What about vandals and spammers?
- Wikipedia has suffered from both and has an ad hoc method for fixing them. Any user can revert a page to an earlier version and look for further damage by tracking other edits made from the same IP address. Wikipedia system administrators can also block IP addresses for a time.
- What about the bias that comes from a self-selected group of contributors?
- As more readers participate, more topics in more languages will be covered. Wikipedia is still slanted toward the English-speaking world, and has more coverage depth on technical subjects than the arts.
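The ad hoc remedy for vandalism mentioned in the replies above (revert the page to an earlier version, then look for other edits made from the same IP address) can be sketched as follows. This is a minimal illustration, not Wikipedia's actual tooling: the Edit record, the function names, and the IP addresses used with them are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical record of one saved revision."""
    page: str
    ip: str
    text: str


def pages_touched_by(log, ip):
    """List, in order and without repeats, the pages edited from one IP."""
    pages = []
    for entry in log:
        if entry.ip == ip and entry.page not in pages:
            pages.append(entry.page)
    return pages


def revert_page(log, page, bad_ip, reverter_ip):
    """Restore the latest text of `page` that did not come from `bad_ip`.

    The revert is itself appended as a new edit, so the vandalized
    revision stays visible in the history. Returns the new edit, or
    None if no earlier good revision exists.
    """
    for entry in reversed(log):
        if entry.page == page and entry.ip != bad_ip:
            restored = Edit(page, reverter_ip, entry.text)
            log.append(restored)
            return restored
    return None
```

A volunteer who spots a defaced page would call revert_page for it, then use pages_touched_by to find what else the same address had changed. Blocking the address for a time, the step reserved for administrators, is left out of the sketch.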
Wikipedia amounts to a bet: that there is more good will in the world than bad actors, and that gradual but persistent improvement will ultimately triumph. The idea worked for Darwin. So far, it seems to be working for Wikipedia, as well.
Sidebar: Instant Authority: Contributing to Wikipedia
How easy is it to contribute to Wikipedia? To find out, I tried my hand at augmenting the entry on one of my favorite non-fiction writers, John McPhee. The article as I found it had a two-paragraph biography on the author, together with a list of his books. So far so good, but McPhee is also noted for the broad range of subjects he's covered. So I added a third paragraph, composing it on a word processor where I could check it for spelling errors, then pasting it into Wikipedia's editing page. A few edit tools allowed me to italicize the book names and to add hyperlinks where I thought appropriate. I previewed the page a few times, caught a few typos, clicked on the "save page" button-and instantly, my paragraph was added to the sum of world knowledge.
If anyone wants to see what I added, they need only click on the history tab and my revision comes up, along with the date I made it. Subsequent readers who think I did a bad job can modify my paragraph, or even remove it. There's also room for discussion, if anyone wants to argue a point. But the revision as I wrote it remains accessible via the page's "history" tab.
Knowing that made me a more careful writer than I might be in an informal email exchange. There's something about contributing to an encyclopedia, even a democratic one, that makes you want to be authoritative. At the same time, the ability to instantly revise also gives you license to make an inadvertent mistake, knowing it can be easily corrected. But as careful as a contributor might be, Wikipedia is not a place to go if you have pride of authorship. As the website puts it: "If you do not want your writing to be edited mercilessly and redistributed at will, do not submit it."
Sidebar: A conversation with Jimmy Wales, co-founder of Wikipedia
Jimmy Donal "Jimbo" Wales (born August 7, 1966) is an Internet entrepreneur and a wiki enthusiast.
Wales was born in Huntsville, Alabama, and is a graduate of Auburn University and the University of Alabama. He worked as a professional futures and options trader in Chicago. He also admires the Objectivist philosophy of Ayn Rand, and while in graduate school owned and moderated an Internet mailing list known as the "Moderated Discussion of Objectivist Philosophy." In the mid-1990s Wales started Bomis, a search portal focusing on aspects of pop culture, which also sells original content. More recently, he founded Wikia, a wiki-style search engine.
With Larry Sanger, Wales founded Wikipedia, a wiki-based online encyclopedia derived from the free software model. He and Sanger had previously worked on the now-defunct Nupedia encyclopedia project. Wales is currently the director of the Wikimedia Foundation, a Tampa-based non-profit organization that encompasses Wikipedia and its younger sister projects. Time magazine reported that Wales had spent around US$500,000 on the establishment and operations of his Wiki projects.
Wales lives in St. Petersburg, Florida, with his wife Christine and his daughter Kira.
- How is Wikipedia different from your first encyclopedia project, Nupedia?
- The key is that the barrier to entry is very low. People who are interested in getting involved can jump right in, start learning what to do, meet people who are working in the project, and become part of the community. Whereas the old system with Nupedia made it difficult to get involved.
- When we first put Wikipedia up, we were concerned that the academics we had gathered were not going to like it. We presented it as here's a tool for us to play with and think about. But the reason we threw it up was that we knew Nupedia was not working: with volunteers you have to make it fun, and Nupedia wasn't very much fun to work on. It was an onerous process that we had set up. A lot of volunteer time was spent fighting the system, rather than the system enabling them to do things.
- Where did you figure out that the Wiki format would work?
- We didn't know it would work before we tried it. One of my employees showed me Ward Cunningham's wiki. We had played around with that. Then Larry Sanger, who was the main person I had hired to organize the Nupedia project, showed me the wiki, so we decided to put it up and play with it. If you're familiar with wikis and the culture of wikis, our way of doing things is very different from a traditional wiki. Most wikis emphasize community over content, and tend to be like a free-form type of chat. We split the article space and the discussion space so that if you're working on an article and want to discuss it, you do so on a separate discussion page.
- How does Wikipedia get its singular voice?
- There are people who make a hobby of going around and making all the articles on a particular type of topic consistent. All of the articles about birds will have a similar format because somebody's gone through and made sure that that is the case.
- So it's not as self-organizing as it appears because there's a structure for organization on the front page. It's self-organizing in the sense that you need a community of people that is self-organizing-that forms committees, creates plans and follows through. The word "community" is abused on the Internet. But this really is a community of people who know each other. We've got about 40 IRC channels in different languages and topics. There's usually over 100 people on the main Wikipedia channel 24 hours a day, discussing things and getting to know each other. And people meet in person.
- What lessons are to be learned in how you structured it?
- A big part of it is people. It's a question of leadership and inspiring and encouraging people to be thoughtful and loving and caring towards each other. The Internet is full of dysfunctional communities where all people do is argue. There's been this value from the beginning that we're doing something we think is important, and therefore, we should try not to have flame wars but should try to work together for constructive solutions to problems that come up. Of course people are people and sometimes we have big arguments. But the overall structure makes it very possible for new people to come in. It's a very welcoming community.
- You've pointed out that there is an advantage in trying to be an authoritative source, as opposed to being a debating society.
- Exactly. And that makes a big difference, too, in terms of people getting along. We have a pretty diverse group of contributors, politically, and there are many divisive issues in our times that need to have encyclopedia articles. But we do a pretty good job of attracting and retaining people who are willing to discuss things in a sensible way, without getting personal, while striving to present the whole story.
- It seems to me that you are particularly good at covering very recent events.
- Definitely. Sometimes, it's because a place may have never been important before. For example, the article on the Abu Ghraib prison describes its history, right up to and including the scandal, but also including the time when Saddam ran it as a torture center. You won't necessarily find that in Britannica because it wasn't that important before - it was just a prison in Iraq. Then it suddenly became important.
- One of the neat things about what we do is complement the news-giving it an historical context and background information. When 9/11 happened, Wikipedia featured an enormous number of articles on things that people were wondering about, but weren't reported in the news. Like the World Trade Center's architect-his career and history-and articles on the history of the airlines whose planes went down. That's something we do very well.
- Are you trying to attract scholars-authorities in their field?
- We try to make sure that we are welcoming to scholars. At the same time, though, there is a feeling that credentials aren't what matters, but rather, the quality of your work. A big name intellectual might feel more comfortable staying in the ivory tower. If they come to Wikipedia, they are liable to get their work edited by somebody, and they may not be accustomed to that.
- There are some parallels with open source development.
- Absolutely. First of all, we use a free license and we run all free software on our servers. The software that we create in house is all released under the GNU GPL. We're definitely from the free software movement.
- Does an organization like this need a benevolent dictator?
- Something like that. I always joke that I don't want to be a benevolent dictator, but I'll be the constitutional monarch. But there are times in a project like this when half the people think one thing, and half the people think the other thing, but both sides agree we have to pick one. In cases like that, some leadership is necessary.
- Your online persona is a mix of humor.
- Yes, though I'm a complete and total geek.
- Japanese is the third largest Wikipedia language version. How is it going?
- Japanese was one of the first languages that we added. From a software point of view it was a little tricky at first because of kanji and romaji. But that's gotten a lot easier because of Unicode.
- My wife is half Japanese and grew up in Tokyo, so I have a cultural interest in the Japanese version. I am told that as compared to the English Wikipedia, Japan has more of a tradition of discussing everything on the discussion page until a consensus is reached-then changing the article. In the English Wikipedia, you change the article-then argue about it.
- Is there anything else you want to say to Software Design readers?
- We need developers. From the technical point of view, Wikipedia is a really cool challenge. How do you, on a very minimal budget, serve this huge growing website that's doubling every three months? There's a lot of work that needs to go on in terms of optimizing the code. [Most online discussions are in English or German.]
- If anybody wants to participate, how do they go about it?
- The best thing is the Wikitech-L website. Just type it into Google and it will be the first link. There's also the IRC channel, which is where a lot of the action takes place. A lot of newcomers come in there and start asking questions about the software, learning, downloading, testing, and then they get involved in the development.