
Edit: the slides are now online.

On Thursday, I will be giving a talk at “The Lived Logics of Database Machinery”, a workshop organized by Computational Culture, which will take place at the Wellcome Collection Conference Centre in London from 10h to 17h30. I am very much looking forward to it, although I will be missing a couple of days of the currently ongoing DMI summer school. This is what I will be talking about:

ORDER BY column_name. The Relational Database as Pervasive Cultural Form

This contribution starts from the observation that, in a way similar to the computational equivalence of programming languages, the major types of database models (network, relational, object-oriented, etc.) and their implementations are all able to store and manage a very large variety of data structures. This means that most data structures could be modeled, in one way or another, in almost any existing database system. So why have there been so many intense debates about how to conceive and build database systems? Just as with programming languages, the specific way a database system embeds an abstract concept in a set of concrete methods and mechanisms for specifying, accessing, and manipulating datasets is significant. Different database models and implementations imply different ways of “thinking” data organization; they vary in performance, robustness, and “logistics” (one of the reasons why Oracle’s product succeeded in the enterprise sector in the 1980s, despite lacking certain features, was its ability to back up a running database); and they provide different modes of interaction with both the data and the system.

The central vector of differentiation, however, is the question of how users “see” the data: during the “database debates” of the 1970s and 1980s, the idea of the database as a set of tables (relational model) was put in opposition to the vision of the database as a network of records (network model). The difference between the two concerned not only performance, flexibility, and complexity, but also the crucial question of who the users of these systems would be in the first place. The supporters of the network model clearly saw the programmer as the target audience for database systems, while the promoters of the much simpler relational model and its variants imagined “accountants, engineers, architects, and urban planners” (Chamberlin and Boyce 1974) interacting directly with data by means of a simple query language. While this vision has not played out, SQL (the most popular, albeit impure, implementation of Codd’s relational ideas) has indeed become what Michael Stonebraker famously called “intergalactic data-speak” (most packages on the market provide SQL interfaces), and this standardization has strongly facilitated the penetration of database systems into all corners of society and contributed to a widespread “relational view” of data organization and manipulation, even if data modeling remains mostly in expert hands.
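To give a sense of what that vision amounted to in practice, here is a minimal, purely illustrative query against a hypothetical city-planning table (the table and column names are my own invention, not drawn from Chamberlin and Boyce): the request reads almost like structured English and requires no retrieval routine written in a general-purpose programming language.

    -- Hypothetical table of building permits that an urban planner might query directly.
    SELECT district, COUNT(*) AS permits_issued
    FROM building_permits
    WHERE year_issued = 1974
    GROUP BY district
    ORDER BY permits_issued DESC;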

The goal of this contribution is to examine this “relational view” in terms of what Jack Goody called the “modes of thought” associated with writing, and in particular with the list form, which “encourages the ordering of the items, by number, by initial sound, by category, etc.” (Goody 1977). As with most modern technologies, the relational model implies a complex set of constraining and enabling elements. The basic structural unit, the “relation” (what most people would simply call a table), disciplines data modeling practices into logical consistency (a table only accepts tuples/rows with the same attributes) while remaining “semantically impoverished” (Stonebraker 1993). Heterogeneity is purged from the relational model at the level of modeling, especially compared to navigational approaches (e.g. XPath or DOM), but the “set-at-a-time” retrieval concept, combined with a declarative query language, affords remarkable flexibility and expressiveness at the level of data selection. The relational view thus implies an “ontology” consisting of regular, uniform, and only loosely connected objects that can be ordered in a potentially unlimited number of ways at the time of retrieval (by means of the query language, i.e. without having to program explicit retrieval routines). In this sense, the relational model fits perfectly the qualities that Callon and Muniesa (2005) attribute to “powerful” calculative agency: handling a long list of diverse entities, keeping the space of possible classifications and reclassifications largely open, and multiplying possible hierarchies and classifications. What database systems then do is bridge the gap between these calculative capacities and other forms of agency by relating them to different forms of performativity (in SQL speak, to SELECT, TRIGGER, and VIEW).
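A short sketch makes this tangible (the schema and values are hypothetical, chosen only for illustration): the relation enforces uniformity at the moment of modeling, while the declarative query language multiplies orderings and classifications at the moment of retrieval, and a VIEW turns one such classification into a durable way of “seeing” the data.

    -- A relation accepts only rows with the same attributes (hypothetical schema):
    CREATE TABLE citizens (
        id         INTEGER PRIMARY KEY,
        name       VARCHAR(100),
        district   VARCHAR(50),
        birth_year INTEGER,
        income     DECIMAL(10,2)
    );

    -- The same uniform rows can be reordered and regrouped at retrieval time,
    -- without writing explicit retrieval routines:
    SELECT * FROM citizens ORDER BY birth_year;
    SELECT district, AVG(income) AS avg_income
    FROM citizens GROUP BY district ORDER BY avg_income DESC;

    -- A VIEW makes one such classification durable:
    CREATE VIEW high_income_citizens AS
        SELECT * FROM citizens WHERE income > 100000;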
While the relational model’s simplicity has led to many efforts to extend or replace it in certain application areas, its near-universal uptake in business and government means that the logistics of knowledge and ordering implied by the relational ontology resonate through the technological layers and database schemas into the domains of management, governance, and everyday practices.

I will argue that the vision of the “programmer as navigator” through a database (Bachman 1973) has, in fact, given way to a setting where database consultants, analysts, and modelers sit between software engineering on the one side and management on the other, (re)defining procedures and practices in terms of the relational model. Especially in the business and government sectors, central forms of management and evaluation (reporting, different forms of data analysis, but also reasoning in terms of key performance indicators and, more generally, “evidence-based” management) are directly related to the technological and cognitive standardization effects derived from the pervasiveness of relational databases. At the risk of overstretching my argument, I would like to propose that Thrift’s (2005) “knowing capitalism” indeed knows (largely) in terms of the relational model.

My colleague Theo Röhle and I went to the Computational Turn conference this week. While I would have preferred to hear a bit more on truly digital research methodology (in the fully scientific sense of the word “method”), the day was really quite interesting and the weather unexpectedly gorgeous. Most of the papers are available on the conference site; make sure to have a look. The text I wrote with Theo tried to structure some of the epistemological challenges and problems to take into account when working with digital methods. Here’s a tidbit:

…digital technology is set to change the way scholars work with their material, how they “see” it and interact with it. The question is, now, how well the humanities are prepared for these transformations. If there truly is a paradigm shift on the horizon, we will have to dig deeper into the methodological assumptions that are folded into the new tools. We will need to uncover the concepts and models that have carried over from different disciplines into the programs we employ today…

I have no idea whether it is going to be accepted, but here is my proposal for the Internet Research 9.0: Rethinking Community, Rethinking Place conference. The title is: Algorithmic Proximity – Association and the “Social Web”

How to observe, describe, and conceptualize social structure has been a central question in the social sciences since their beginnings in the 19th century. From Durkheim’s opposition between organic and mechanical solidarity and Tönnies’ distinction between Gemeinschaft and Gesellschaft to modern Social Network Analysis (Burt, Granovetter, Wellman, etc.), the problem of how individuals and groups relate to each other has been at the core of most attempts to conceive of the “social”. The state of “community”, even in the loose understanding that has become prevalent when talking about sociability online, is already the end result of a permanent process of proto-social interaction, the plasma (Latour) from which association and aggregation may arise. In order to understand how the sites and services (blogs, social networking services, online dating, etc.) that make up what has become known as the “Social Web” allow for the emergence of higher-order social forms (communities, networks, crowds, etc.), we need to look at the lower levels of social interaction, where sociability is still a relatively open field.
One way of approaching this very basic level of analysis is through the notion of “probability of communication”. In his famous work on the diffusion of innovations, Everett Rogers notes that the absence of social structure would mean that all communication between members of a population had the same probability of occurring. In any real setting, of course, this is never the case: people talk (interact, exchange, associate, etc.) with certain individuals more than with others. Beyond the limiting aspects of physical space, the social sciences have identified numerous parameters – age, class, ethnicity, gender, dress, modes of expression, etc. – that make communication and interaction between some people far more probable than between others. Higher-order social aggregates emerge from this background of attraction and repulsion; sociology has largely concluded that, for all practical purposes, opposites do not attract.
Digital technology largely obliterates the barriers of physical space: instead of being confined to his or her immediate surroundings, an individual can now potentially communicate and interact with the millions of people registered on the different services of the Social Web. In order to reduce “social overload”, many services allow their users to aggregate around physical or institutional landmarks (cities, universities, etc.) and encourage association through network proximity (the friend of a friend might become my friend too). Many of the social parameters mentioned above are also translated onto the Web, in the sense that a person’s informational representations (profile, blog, avatar, etc.) become markers of distinction (Bourdieu) that strongly influence the probability of communication with other members of the service. Especially in youth culture, opposing cultural interests effectively function as social barriers. These barriers are, in principle, not new; their (partial) digitization, however, is.
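To make the friend-of-a-friend mechanism mentioned above concrete, here is a minimal sketch in SQL (the friendships table, its columns, and the user id 42 are invented for the example, not a description of any actual service): a contact suggestion is little more than a self-join ordered by the number of mutual contacts.

    -- Hypothetical friendships(user_id, friend_id) table, one row per friendship link.
    -- Suggest people who are friends of user 42's friends but not yet friends of 42.
    SELECT f2.friend_id AS suggested_contact, COUNT(*) AS mutual_friends
    FROM friendships f1
    JOIN friendships f2 ON f2.user_id = f1.friend_id
    WHERE f1.user_id = 42
      AND f2.friend_id <> 42
      AND f2.friend_id NOT IN (SELECT friend_id FROM friendships WHERE user_id = 42)
    GROUP BY f2.friend_id
    ORDER BY mutual_friends DESC;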
Most social services online see themselves as facilitators of association and constantly produce “contact trails” that lead to other people, through category browsing, search technology, or automated path-building via backlinking. Informational representations like member profiles are read and interpreted not only by people but also by algorithms that make use of this data whenever contact trails are laid. The most obvious example can be found on dating sites: when searching for a potential partner, most services will rank the list of results based on compatibility calculations that take into account all the pieces of information members provide. The goal is to compensate for the very large population of potential candidates and to reduce the failure rate of social interaction. Without the randomness that, despite spatial segregation, still marks life offline, the principle of homophily is pushed to the extreme: confrontation with the other as other, i.e. as someone with different opinions, values, tastes, etc., is reduced to a minimum, and the technical nature of this process ensures that it passes without being noticed.
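Reduced to its simplest form, such a compatibility ranking can again be read as an ORDER BY; the sketch below is a deliberately naive illustration (the interests table, the scoring by counting shared interests, and the user id are my own assumptions, not the method of any particular site).

    -- Hypothetical interests(user_id, interest) table, one row per declared interest.
    -- Rank other members by the number of interests they share with user 42.
    SELECT i_other.user_id AS candidate, COUNT(*) AS shared_interests
    FROM interests i_self
    JOIN interests i_other ON i_other.interest = i_self.interest
                          AND i_other.user_id <> i_self.user_id
    WHERE i_self.user_id = 42
    GROUP BY i_other.user_id
    ORDER BY shared_interests DESC;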
In this paper, we will attempt to conceptualize the notion of “algorithmic proximity”, which we understand as the shaping of the probability of association by technological means. We do not, however, intend to argue that algorithms are direct producers of social structure. Rather, they intervene at the level of proto-social interaction and introduce biases whose subtlety makes them difficult to study and to theorize conceptually. Their political and cultural significance must therefore be approached with the necessary caution.