Most important person on English-language Wikipedia? Frank Sinatra


The most important person in the English-language world is Frank Sinatra, according to an analysis of Wikipedia articles led by the University of Toulouse's Young-Ho Eom.

Number two is Michael Jackson, and number three is Pope Pius XII.

Eom arrived at this somewhat surprising conclusion by using methods borrowed from Google to analyse Wikipedia pages and determine which individuals have the most important articles linking to them.

Taking all 24 major language editions of Wikipedia into account, Eom's team carried out the same study and came to a more plausible conclusion: Adolf Hitler. (Even then, Michael Jackson takes the second slot, and Madonna is in third place.)

The point of the study wasn't strictly to determine who the most important people are on Wikipedia, but instead to discover if the online encyclopaedia was skewed in the level of attention it gives to various figures, either by gender, time, or location.

"Our analysis shows that most important historical figures across Wikipedia language editions are born in western countries after the 17th century, and are male," the authors write. But, they add, "each Wikipedia edition highlights local figures, so that most of its own historical figures are born in the countries which use the language of the edition. The emergence of such pronounced accent to local figures seems to be natural since there are more links and interactions within one culture."

The study takes as its starting point Google's PageRank algorithm. This is still the core method the search engine uses to judge the importance of webpages in its index, although the system has received an almost total overhaul since it was first introduced in 1998.

PageRank says that a page is important if a lot of important pages link to it. While the definition sounds self-referential, in practice, it recreates a lot of features which we intuitively understand: the more sites link to a newspaper's website, for instance, the more important that website is; and being linked to from a national newspaper probably says more about a site's importance than being linked to from a nondescript blog.
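To make that definition concrete, here is a minimal sketch of the idea in Python. It is not the study's code, nor Google's. The tiny link graph, the page names, the damping factor of 0.85 and the fixed number of iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative defaults (assumptions).
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three blogs link to a national paper,
# which links to site_x; only one blog links to site_y.
links = {
    "blog_a": ["national_paper"],
    "blog_b": ["national_paper"],
    "blog_c": ["national_paper", "site_y"],
    "national_paper": ["site_x"],
    "site_x": [],
    "site_y": [],
}

rank = pagerank(links)
for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{page:15s} {score:.3f}")
```

In this toy graph, site_x ends up ranked above site_y even though each has exactly one incoming link, because site_x's link comes from the heavily linked national_paper rather than from a lone blog.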

But when the study applied PageRank to figures on Wikipedia, the results were odd. The most important person in the world comes out as Carl Linnaeus, the 18th-century Swedish naturalist – certainly an important figure in natural history, but possibly not the most important in the world.

It seems the algorithm was thrown by a quirk of Wikipedia: the site contains an almost complete collection of named species – as well as who named them. Since Linnaeus' life work was coming up with a system to classify organisms, and applying that system as widely as possible, he is linked to from a lot of pages, and important ones at that. From the Domestic Cat to the Red Fox, through the mighty Asian elephant and the lowly moss gall, Linnaeus named, and is linked from, them all.

To get around that issue, the researchers applied a second way of measuring importance: CheiRank. Simply put, "the PageRank… of an article is proportional to the number of incoming links, while the CheiRank… of an article is proportional to the number of outgoing links. Thus a top PageRank article is important since other articles refer to it, while a top CheiRank article is highly connected because it refers to other articles." In other words, an important person is likely to have a lot of other important people and things involved in their lives.
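One way to picture CheiRank is as PageRank run on the same graph with every link reversed, so that rank flows towards articles that link out a lot. The sketch below, in Python, assumes that reading. The graph, the article names and the damping factor of 0.85 are hypothetical, and this is not the researchers' own code.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain PageRank by repeated redistribution (illustrative defaults)."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # Pages without outgoing links spread their rank over all pages.
            targets = links.get(page, []) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def reverse_links(links):
    """Flip every link so that 'A links to B' becomes 'B links to A'."""
    pages = set(links) | {t for ts in links.values() for t in ts}
    reversed_links = {p: [] for p in pages}
    for source, targets in links.items():
        for target in targets:
            reversed_links[target].append(source)
    return reversed_links


def cheirank(links, damping=0.85, iterations=100):
    """CheiRank sketched as PageRank on the reversed link graph."""
    return pagerank(reverse_links(links), damping, iterations)


# Hypothetical articles: "hub" links out to everything,
# while "c" is linked to by everything else.
links = {
    "hub": ["a", "b", "c"],
    "a": ["c"],
    "b": ["c"],
    "c": [],
}

pr = pagerank(links)
cr = cheirank(links)
top_pagerank = max(pr, key=pr.get)
top_cheirank = max(cr, key=cr.get)
print("Top PageRank article:", top_pagerank)
print("Top CheiRank article:", top_cheirank)
```

Here the article that everything points to tops the PageRank list, while the article that points to everything tops the CheiRank list, which mirrors the distinction the authors draw.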

Mix PageRank and CheiRank together, and you get the final measure: 2DRank. That's how the authors arrived at the top 100 for all 24 editions of Wikipedia – and how Frank Sinatra took pole position in the English-language edition.
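The article does not spell out how the two rankings are mixed. One common description of 2DRank, assumed here rather than taken from the article, is to plot each article by its PageRank position and its CheiRank position, then list articles in the order they fall inside a square that grows from the top-left corner. The Python sketch below follows that reading. The rank positions are made up, and the tie-breaking rule is an assumption.

```python
def two_d_rank(pagerank_position, cheirank_position):
    """Order articles by when they enter a growing k-by-k square.

    Both arguments map an article name to its position (1 = best)
    in the respective ranking. An article enters the square once the
    side length reaches the worse of its two positions.
    """
    def entry_key(name):
        k = pagerank_position[name]
        k_star = cheirank_position[name]
        # Tie-break by PageRank position: an assumption for this sketch.
        return (max(k, k_star), k)

    return sorted(pagerank_position, key=entry_key)


# Hypothetical positions for four articles (not data from the study).
pagerank_position = {"a": 1, "b": 2, "c": 3, "d": 4}
cheirank_position = {"a": 4, "b": 1, "c": 2, "d": 3}

order = two_d_rank(pagerank_position, cheirank_position)
print(order)
```

Under this scheme an article that tops one list but sits near the bottom of the other, like "a" above, is overtaken by articles that do reasonably well on both.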


This article was written by Alex Hern for theguardian.com on Wednesday 11th June 2014 15.25 Europe/London

guardian.co.uk © Guardian News and Media Limited 2010