Google Trends, the problem of closed data
One of the first things that comes to mind when thinking about measuring eminence is to ask Google. They have a tool that could be used for that, Google Trends.
Given one search term, Trends returns a curve showing how the number of queries for that term evolves over time. It does not give absolute values (the number of queries), only the evolution.
Given two search terms, Trends returns a chart with two curves, one for each term, showing the evolution of the number of queries. It still does not give the number of queries, but the two curves are rendered at the same scale, so it is possible to compare the importance of the two terms.
This gives a way to compare the sportsmen, for example, which may be possible to automate with the Google Trends API.
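As a rough illustration of how such two-term comparisons could be chained into a ranking: because the two curves of one chart share a scale, comparing every name against the same reference term puts all names on that reference's scale. The sketch below assumes the average height of each curve has already been read off by hand; the names, the numbers and the `rank_against_reference` helper are hypothetical, and the code does not call any Google API.

```python
def rank_against_reference(comparisons):
    """Rank terms from two-term charts that all share one reference term.

    `comparisons` maps a term to (term_mean, reference_mean): the average
    heights of the two curves in a single chart. Heights taken from
    different charts are not directly comparable, but the ratios are.
    """
    ratios = {}
    for term, (term_mean, reference_mean) in comparisons.items():
        if reference_mean <= 0:
            raise ValueError(f"reference curve is flat in the chart for {term!r}")
        ratios[term] = term_mean / reference_mean
    # Highest ratio first: most searched relative to the reference term.
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values, invented for illustration only.
comparisons = {
    "athlete A": (40.0, 80.0),  # half as searched as the reference
    "athlete B": (90.0, 30.0),  # three times as searched
    "athlete C": (50.0, 50.0),  # same level as the reference
}
ranking = rank_against_reference(comparisons)
for term, ratio in ranking:
    print(f"{term}: {ratio:.2f}")
```

The result is only as trustworthy as the curves themselves, which is the issue raised in the rest of this section.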
Or, more simply, ask Google whether they can provide ranked lists of famous people.
But this presents a major drawback: an independent researcher cannot verify the data. The only ones who could do that are Google engineers. There is no guarantee that the data are not manipulated.
Using reference sites
Another way to build eminence criteria could be to use reference sites on a given subject. At least two indications could be used:
- The presence of a person on a site indicates a certain level of eminence.
- On some sites, each person has a dedicated page, and these pages sometimes contain links to other persons (these links are sometimes called "inter-links"). Counting these inter-links could be used to build a hierarchy between the persons listed on a site.
See tig12/mactutor-by-links, a program generating a list of mathematicians sorted using citation count.
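The inter-link counting idea can be sketched in a few lines. This is a minimal illustration and not the code of tig12/mactutor-by-links: it assumes the pages have already been crawled into a mapping from each person to the persons their page links to, and the `rank_by_inbound_links` helper and the toy data are hypothetical.

```python
from collections import Counter


def rank_by_inbound_links(pages):
    """Sort persons by the number of other pages that link to them.

    `pages` maps a person to the list of persons their page links to.
    """
    inbound = Counter({person: 0 for person in pages})
    for source, targets in pages.items():
        # set(): repeated links from one page to the same person count once.
        for target in set(targets):
            # Ignore self-links and links to persons without a page on the site.
            if target != source and target in pages:
                inbound[target] += 1
    # Most linked first; ties are broken alphabetically for a stable order.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for illustration only.
pages = {
    "A": ["B", "C"],
    "B": ["C", "C"],
    "C": ["A"],
    "D": ["C", "D", "E"],
}
for person, count in rank_by_inbound_links(pages):
    print(person, count)
```

Counting each linking page once is a design choice made here so that a single long page cannot inflate a score; counting every link occurrence would be an equally defensible variant.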
Wikipedia
Wikipedia (or Wikidata) is a potential source of information about persons; it also contains "inter-links". But there is one particular danger with Wikipedia: since we know that some parts of Wikipedia are flawed and strongly biased (like everything related to "scientific astrology"), one must be suspicious of this data source.
A priori, one would think that Wikipedia is neutral when it comes to computing the eminence of sportsmen or mathematicians. But maybe everything is political. If groups led by ideology, like skeptics, are able to infiltrate Wikipedia editors for some topics, nothing guarantees that a given topic is unbiased.
This kind of bias may also be introduced by editors who do not necessarily manipulate the content for ideological reasons. For example, if a group of editors from a given country is motivated to document the athletes of their country well, these athletes will be over-represented.
Note that this bias is not specific to Wikipedia.
Problems
Using the web does not solve all the problems of constructing an eminence criterion.
For some groups, like mathematicians, it looks quite easy. But for groups like sportsmen, the sources (the reference sites) should be carefully chosen to avoid favouring a given sport or a given country.
One way to mitigate this problem is to build several indicators from different sources and compare the results to identify problematic sources.
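One possible way to carry out such a comparison is to measure how well each source's ranking agrees with the others, for instance with Spearman's rank correlation, and to look more closely at the source that agrees least. The sketch below assumes every source ranks the same persons without ties; the source names, the rankings and both helper functions are hypothetical.

```python
from itertools import combinations


def spearman(rank_a, rank_b):
    """Spearman correlation between two tie-free rankings of the same items."""
    n = len(rank_a)
    if n < 2 or len(set(rank_a)) != n or sorted(rank_a) != sorted(rank_b):
        raise ValueError("rankings must list the same distinct items")
    position_b = {item: i for i, item in enumerate(rank_b)}
    d2 = sum((i - position_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


def mean_agreement(rankings):
    """Average correlation of each source with all the other sources."""
    if len(rankings) < 2:
        raise ValueError("at least two sources are needed")
    totals = {name: 0.0 for name in rankings}
    for a, b in combinations(rankings, 2):
        rho = spearman(rankings[a], rankings[b])
        totals[a] += rho
        totals[b] += rho
    return {name: total / (len(rankings) - 1) for name, total in totals.items()}


# Rankings invented for illustration only, most eminent first.
rankings = {
    "source 1": ["A", "B", "C", "D"],
    "source 2": ["A", "C", "B", "D"],
    "source 3": ["D", "C", "B", "A"],
}
agreement = mean_agreement(rankings)
suspect = min(agreement, key=agreement.get)
print(suspect, round(agreement[suspect], 2))
```

A low score only marks a source as worth inspecting: if most sources share the same bias, the outlier may well be the sound one.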