The Digital Methods Initiative at the University of Amsterdam (incidentally my new employer) has an ever-growing list of very useful tools that help with studying online phenomena. The Wikipedia Network Analysis tool (like most DMI software, written by Erik Borra) is particularly interesting given the place Wikipedia occupies in our contemporary knowledge configurations. The tool crawls Wikipedia from a starting URL (by default at a +2 radius) and, amongst other things, spits out a source node / target node list of links between the different pages.

To visualize the data, you can use Many Eyes, but there are significant limits to working with online tools. This little script will take the source/target data and create a GDF file you can explore with Gephi or GUESS. This is a Wikipedia network surrounding the page on data visualization:
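For readers who want to see what such a conversion involves, here is a minimal sketch in Python. It is not the script linked above, just an illustration under a few assumptions: the link list is taken to be tab-separated with one source/target pair per line (the tool's actual export format may differ), the function name `edges_to_gdf` and the `n0`, `n1`, ... node ids are made up for this example, and labels are wrapped in double quotes so that page titles containing commas do not break the file.

```python
def edges_to_gdf(lines, delimiter="\t"):
    """Turn source/target lines into the text of a GDF file.

    Assumption: each line holds one link as "source<delimiter>target".
    """
    node_ids = {}  # page title -> short id such as "n0"
    edges = []

    for line in lines:
        parts = line.rstrip("\r\n").split(delimiter)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        source, target = parts[0].strip(), parts[1].strip()
        for title in (source, target):
            # Assign ids in order of first appearance.
            if title not in node_ids:
                node_ids[title] = f"n{len(node_ids)}"
        edges.append((node_ids[source], node_ids[target]))

    out = ["nodedef>name VARCHAR,label VARCHAR"]
    for title, node_id in node_ids.items():
        # Double quotes inside a title would end the quoted label early.
        label = title.replace('"', "'")
        out.append(f'{node_id},"{label}"')

    out.append("edgedef>node1 VARCHAR,node2 VARCHAR")
    for source_id, target_id in edges:
        out.append(f"{source_id},{target_id}")

    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [
        "Data visualization\tEdward Tufte\n",
        "Data visualization\tCharles Joseph Minard\n",
        "Edward Tufte\tCharles Joseph Minard\n",
    ]
    print(edges_to_gdf(sample), end="")
```

Writing the returned string to a file with a `.gdf` extension should give something Gephi or GUESS can open, with page titles showing up as node labels.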

What is rather incredible is that I actually filtered out the nodes with only one connection, taking the graph from 4995 nodes to 690. Wikipedia has become big. Very, very big.
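The same kind of filtering can also be done on the link list before it ever reaches a graph tool. The sketch below is one possible reading of "only one connection", not necessarily how the graph above was filtered: it counts every link a page takes part in, incoming or outgoing, and drops pages that appear in just one link. It makes a single pass, so a page whose neighbours were all removed may end up with fewer links afterwards. The function name `drop_single_link_nodes` is made up for this example.

```python
from collections import Counter


def drop_single_link_nodes(edges):
    """Keep only links whose two endpoints each appear in more than one link.

    `edges` is a list of (source, target) pairs. Degree is counted
    without regard to direction (an assumption of this sketch).
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1

    keep = {node for node, count in degree.items() if count > 1}
    return [(s, t) for s, t in edges if s in keep and t in keep]


if __name__ == "__main__":
    links = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
    # "D" takes part in a single link, so that link is dropped.
    print(drop_single_link_nodes(links))
```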

An interesting insight to take from this graph is that many of the data visualization pioneers are placed at the center of the network, indicating that the field has grown and diversified from a limited set of initial concepts and experiments – something that can be easily confirmed by looking at the literature of the field where the same examples pop up regularly.

A visualization approach may be interesting for studying Wikipedia as a knowledge platform instead of a social experiment. While the attention given to forms of governance, contribution, etc. is certainly justified, we may want to take a closer look at the actual organization of knowledge on Wikipedia and how this compares to other forms of collecting knowledge.