
In August 2010, Edinburgh sociologist Donald MacKenzie (whose book An Engine, not a Camera is an outstanding piece of scholarship) wrote an article in the Financial Times titled Unlocking the Language of Structured Securities, in which he discusses a software suite for financial analysis called Intex and compares it to a language that allows one to see and interact with the world in certain ways rather than others. MacKenzie describes his first encounter with Intex as a moment of revelation that quickly turned into doubt:

The psychological effect was striking: for the first time, I felt I could understand mortgage-backed securities. Of course, my new-found confidence was spurious. The reliability of Intex’s output depends entirely on the validity of the user’s assumptions about prepayment, default and severity. Nevertheless, it is interesting to speculate whether some of the pre-crisis vogue for mortgage-backed securities resulted from having a system that enabled neophytes such as myself to feel they understood them.

While MacKenzie does not go as far as imputing the recent financial crisis to a piece of software, he points out that Intex is not recursive in its mode of analysis: when evaluating a complex financial asset, for example one of the now (in)famous CDOs that are made up of other assets, themselves combining further values, and so on, Intex does not follow the trail down to the basic entities (the individual mortgage) but calculates risk only from the rating of the asset in question. MacKenzie argues that Goldman Sachs’ 2006 decision to basically get out of mortgage-based securities may well be a result of their commitment to go beyond available tools by implementing a (very costly) “bottom-up” approach that builds its evaluation of an asset by calculating up from the basic units of value. The house-of-cards character of these financial instruments could become visible by changing tools and thereby changing perspective or language. Software makes it possible to implement very different practices or languages and to make them pervasive; but how does a company choose one strategy over another? What are the organizational and “cultural” factors that led Goldman Sachs to change its approach? These may be the truly challenging questions here, although they may never get answered. But they lead to a methodological lesson.

The particular strength of systems like Intex lies in their capacity to black-box evaluation strategies behind a neat interface that allows users to immediately operate on the underlying models, weaving these models into their decisions and practices. Conceptually, we understand the ways in which software shapes action better and better, but the empirical complexity of concrete settings is positively daunting even outside of the realm of financial markets. What I take from MacKenzie’s work is that in order to understand the role of software, we have to be very familiar with the specific terrain a system is embedded in, instead of bringing overarching assumptions to the table. Software is a means for building structure and this building is always happening in particular organizational settings that are certainly caught up in larger trends but also full of local challenges, politics, and knowledge. Programs are at the same time a structuring backdrop to practice and part of a strategic repertoire that actors have at their disposal.

The case of financial software indicates that market behavior standardizes around available tools, which leads to the systemic delegation of certain decision processes to software makers. This may result in a particular type of herd behavior and potentially in imbalance and crisis. Somewhat ironically, it is Goldman Sachs that showed the potential of going against the grain by questioning programmed wisdom. That the company recently paid $550M in fines for abusing their analytical advantage by betting against a CDO they were selling to customers as an investment indicates that ethics and cunning are unfortunately two different pairs of shoes…

What is a link? From a methodology standpoint, there is no answer to that question but only the recognition that when using graph theory and associated software tools, we project certain aspects of a dataset as nodes and others as links. In my last post, I “projected” authors from the air-l list as nodes and mail-reply relationships as links. In the example below, I still use authors as nodes but links are derived from a similarity measure of a statistical analysis of each poster’s mails. Here are two Gephi graphs:

If you are interested in the technique, it’s a simple similarity measure based on the vector-space model and my amateur computer scientist’s PHP implementation can be found here. The fact that the two posters who changed their “from:” text have both of their accounts close together (can you find them?) is a good indication that the algorithm is not completely botched. The word floating on each link in the right graph is the one that confers the highest value to the similarity calculation, which means that it is a word that is relatively often used by both of the linked authors while being generally rare in the whole corpus. Elis Godard and Dana Boyd for example have both written on air-l about Ron Vietti, a pastor who (rightfully?) thinks the Internet is the devil and because very few other people mentioned the holy warrior, the word “vietti” is the highest value “binder” between the two.
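To make the idea concrete, here is a minimal Python sketch of a vector-space similarity with a "binder" word. It is not the PHP implementation mentioned above, which is not reproduced here. The tf-idf weighting, the cosine measure, the function names, and the three toy "posters" are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each author to a {word: weight} vector (tf-idf is an assumed weighting)."""
    n = len(docs)
    # Document frequency: in how many authors' texts does each word occur?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    vectors = {}
    for author, tokens in docs.items():
        tf = Counter(tokens)
        # Words used by everybody get log(1) = 0 and drop out of the comparison.
        vectors[author] = {
            w: (c / len(tokens)) * math.log(n / df[w]) for w, c in tf.items()
        }
    return vectors


def cosine_with_binder(a, b):
    """Return (cosine similarity, shared word contributing most to it)."""
    products = {w: a[w] * b[w] for w in set(a) & set(b)}
    total = sum(products.values())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    if norm == 0 or total == 0:
        return 0.0, None
    return total / norm, max(products, key=products.get)


# Hypothetical toy corpus, one "mail archive" per poster.
docs = {
    "alice": "the pastor vietti thinks the internet is evil says vietti".split(),
    "bob": "vietti said the internet is the devil".split(),
    "carol": "the conference deadline is on the list".split(),
}
vectors = tfidf_vectors(docs)
similarity, binder = cosine_with_binder(vectors["alice"], vectors["bob"])
print(round(similarity, 3), binder)
```

In this toy corpus "alice" and "bob" are bound by "vietti", while "alice" and "carol" only share words that everybody uses and therefore end up with a similarity of zero.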

What is important in networks that are the result of heavily iterative processing is that the algorithms used to create them are full of parameters and changing one of these parameters just a little bit may (!) have larger repercussions. In the example above I actually calculate a similarity measure between each two nodes (roughly 60^2 / 2 results) but in order to make the graph somewhat readable I inserted a threshold that boils it down to 637 links. The missing measures are not taken into account in the physics simulation that produces the layout – although they may (!) be significant. I changed the parameter a couple of times to get the graph “right”, i.e. to find a good compromise between link density for simulation and readability. But look at what happens when I raise the threshold so that only the 100 strongest similarity measures survive:

First, a couple of nodes disconnect, two binary stars form around the “from:” changers and the large component becomes a lot looser. Second, Jeremy Hunsinger loses the highest PageRank to Chris Heidelberg. Hunsinger had more links when lower similarity scores were taken into account, but when things get rough in the network world, bonding is better than bridging. What is result and what is artifact?
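The effect is easy to reproduce on a toy example. The following Python sketch keeps only the strongest similarity scores and then runs a plain power-iteration PageRank on the surviving links. Everything in it is an assumption made for illustration: the six nodes, the invented scores, the damping factor of 0.85, and the function names. It does not reproduce Gephi's implementation or the air-l data.

```python
import math


def strongest_links(similarities, keep):
    """Keep only the `keep` highest-scoring pairs (the threshold parameter)."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:keep])


def pagerank(nodes, links, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected, unweighted graph."""
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Disconnected nodes spread their rank evenly over the whole graph.
        dangling = sum(rank[n] for n in nodes if not neighbours[n])
        rank = {
            n: (1 - damping) / len(nodes)
            + damping
            * (
                sum(rank[m] / len(neighbours[m]) for m in neighbours[n])
                + dangling / len(nodes)
            )
            for n in nodes
        }
    return rank


# 60 authors give 60 * 59 / 2 unordered pairs to compare.
print(math.comb(60, 2))

# Hypothetical scores: "hub" has many weak ties, "a" has a few strong ones.
nodes = ["hub", "a", "b", "c", "d", "e"]
similarities = {
    ("a", "b"): 0.90, ("a", "c"): 0.80, ("a", "d"): 0.75,
    ("hub", "b"): 0.50, ("hub", "c"): 0.45, ("hub", "d"): 0.40,
    ("hub", "e"): 0.35, ("hub", "a"): 0.30,
}

for keep in (8, 3):
    rank = pagerank(nodes, strongest_links(similarities, keep))
    print(keep, "links, top node:", max(rank, key=rank.get))
```

With all eight links the bridging "hub" comes out on top; with only the three strongest links it is disconnected and the tightly bonded "a" takes over. The ranking is as much a product of the threshold as of the data.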

Most advanced algorithmic techniques are riddled with such parameters and getting a “good” result not only implies fiddling around a lot (how do I clean the text corpus, what algorithms to look for what kind of structures or dynamics, what parameters, what type of representation, here again, what parameters, and so on…) but also having implicit ideas about what kind of result would be “plausible”. The back and forth with the “algorithmic microscope” is always floating against a backdrop of “domain knowledge” and this is one of the reasons why the idea of a science based purely on data analysis is positively absurd. I believe that the key challenge is to stay clear of methodological monoculture and to articulate different approaches together whenever possible.

If we want to understand the plethora of very specific roles computers play in today’s world, the question “What is software?” is inevitable. Many different answers have been articulated from different viewpoints and different positions – creator, user, enterprise, etc. – in the networks of practices that surround digital objects. From a scholarly perspective, the question is often tied to another one, “Where does software come from?”, and is connected to a history of mathematical thought and the will/pressure/need to mechanize calculation. There we learn for example that the term “algorithm” is derived from the name of the Persian mathematician al-Khwārizmī and that in mathematical textbooks from the Middle Ages, the term algorism is used to denote the basic arithmetic techniques – that we now learn in grammar school – which break down e.g. the calculation of a multiplication with large numbers into a series of smaller operations. We learn first about Pascal, Babbage, and Lady Lovelace and then about Hilbert, Gödel, and Turing, about the calculation of projectile trajectories, about cryptography, the halting problem, and the lambda calculus. The heroic history of bold pioneers driven by an uncompromising vision continues into the PC (Engelbart, Kay, the Steves, etc.) and Network (Engelbart again, Cerf, Berners-Lee, etc.) eras. These trajectories of successive invention (mixed with a sometimes exaggerated emphasis on elements from the arsenal of “identity politics”, counter-culture, hacker ethos, etc.) are an integral part of answering our twin question, but they are not enough.

A second strand of inquiry has developed in the slipstream of the monumental work by economic historian Alfred Chandler Jr. (The Visible Hand) who placed the birth of computers and software in the flux of larger developments like industrialization (and particularly the emergence of the large scale enterprise in the late 19th century), bureaucratization, (systems) management, and the general history of modern capitalism. The books by James Beniger (The Control Revolution), JoAnne Yates (Control through Communication and more recently Structuring the Information Age), James W. Cortada (most notably The Digital Hand in three volumes), and others deepened the economic perspective while Paul N. Edwards’ Closed World or Jon Agar’s The Government Machine look more closely at the entanglements between computers and government (bureaucracy). While these works supply a much needed corrective to the heroic accounts mentioned above, they rarely go beyond the 1960s and do not aim at understanding the specifics of computer technology and software beyond their capacity to increase efficiency and control in information-rich settings (I have not yet read Martin Campbell-Kelly’s From Airline Reservations to Sonic the Hedgehog, the title is a downer but I’m really curious about the book).

Lev Manovich’s Language of New Media is perhaps the most visible work of a third “school”, where computers (equipped with GUIs) are seen as media born from cinema and other analogue technologies of representation (remember Computers as Theatre?). Clustering around an illustrious theoretical neighborhood populated by McLuhan, Metz, Barthes, and many others, these works used to dominate the “XY studies” landscape of the 90s and early 00s before all the excitement went to Web 2.0, participation, amateur culture, and so on. This last group could be seen as a fourth strand but people like Clay Shirky and Yochai Benkler focus so strongly on discontinuity that the question of historical filiation is simply not relevant to their intellectual project. History is there to be baffled by both present and future.

This list could go on, but I do not want to simply inventory work on computers and software but to make the following point: there is a pronounced difference between the questions “What is software?” and “What is today’s software?”. While the first one is relevant to computational theory, software engineering, analytical philosophy, and (curiously) cognitive science, there is no direct line from universal Turing machines to our particular landscape with the millions of specific programs written every year. Digital technology is so ubiquitous that the history of computing is caught up with nearly every aspect of the development of western societies over the last 150 years. Bureaucratization, mass-communication, globalization, artistic avant-garde movements, transformations in the organization of labor, expert movements in public administrations, big science, library classifications, the emergence of statistics, minority struggles, two world wars and too many smaller conflicts to count, accounting procedures, stock markets and the financial crisis, politics from fascism to participatory democracy,… – all of these elements can be examined in connection with computing, shaping the tools and being shaped by them in return. I am starting to believe that for the humanities scholar or the social scientist the question “What is software?” is only slightly less daunting than “What is culture?” or “What is society?”. One thing seems sure: we can no longer pretend to answer the latter two questions without bumping into the first one. The problem for the author, then, becomes to choose the relevant strands, to untangle the mess.

In my view, there is a case to be made for a closer look at the role the library and information sciences played in the development of contemporary software techniques, most obviously on the Internet, but not exclusively. While Bush’s Memex has perhaps been commented on somewhat beyond its actual relevance, the work done by people such as Eugene Garfield (citation analysis), Calvin M. Mooers (information retrieval), Hans-Peter Luhn (KWIC), Edgar Codd (relational database) or Gerard Salton (the vector space model) from the 1950s on has not been worked on much outside of specialist circles – despite the fact that our current ways of working with information (yes, this includes your Facebook profile, everything Google is doing, cloud computing, mobile applications and all the other cool stuff Wired writes about) have left behind the logic of the library catalog quite some time ago. This is also where today’s software comes from.

Over the last year, I have been reading loads of books in and on Information Science, paying special attention to key texts in the (pre)history of the discipline. Fritz Machlup and Una Mansfield’s monumental anthology The Study of Information (Wiley & Sons, 1983) has been a pleasure to read and there are several passages in the foreword that merit a little commentary. I have always wondered why Shannon’s Mathematical Theory of Communication from 1948 has been such a reference point in the discipline I started out in, communication science. Talking about purely technological problems and packed with formulas that very, very few social science scholars could make sense of, the whole thing seems like a misunderstanding. The simplicity and clearness of the schema on page two – which has been built into the canonical sender-receiver model – cannot be the only reason for the exceptional (mostly second or third hand) reception the text has enjoyed. In Machlup & Mansfield’s foreword one can find some strong words on the question of why a work on engineering problems that excludes even the slightest reference to matters of human understanding came to be cited in probably every single introduction to communication science:

“When scholars were chiefly interested in cognitive information, why did they accept a supposedly scientific definition of ‘information apart from meaning’? One possible explanation is the fact that they were impressed by a definition that provided for measurement. To be sure, measurement was needed for the engineering purposes at hand; but how could anybody believe that Shannon’s formula would also measure information in the sense of what one person tells another by word of mouth, in writing, or in print?
We suspect that the failure to find, and perhaps impossibility of finding, any ways of measuring information in this ordinary sense has induced many to accept measurable signal transmission, channel capacity, or selection rate, misnamed amount of information, as a substitute or proxy for information. The impressive slogan, coined by Lord Kelvin, that ‘science is measurement’ has persuaded many researchers who were anxious to qualify as scientists to start measuring things that cannot be measured. As if under a compulsion, they looked for an operational definition of some aspect of communication or information that stipulated quantifiable operations. Shannon’s formula did exactly that; here was something related to information that was objectively measurable. Many users of the definition were smart enough to realize that the proposed measure – perfectly suited for electrical engineering and telecommunication – did not really fit their purposes; but the compulsion to measure was stronger than their courage to admit that they were not operating sensibly.” (p. 52)

For Machlup & Mansfield – who, as trained (neoclassical) economists, should not be deemed closet postmodernists – this compulsion to measure is connected to implicit hierarchies in academia where mathematical rationality reigns supreme. A couple of pages further, the authors’ judgment becomes particularly harsh:

“This extension of information theory, as developed for communication engineering, to other quite different fields has been a methodological disaster – though the overenthusiastic extenders did not see it, and some of them, who now know that it was an aberration, still believe that they have learned a great deal from it. In actual fact, the theory of signal transmission or activating impulses has little or nothing to teach that could be extended or applied to human communication, social behavior, or psychology, theoretical or experimental.” (p. 56)

Shannon himself avoided the term “information theory” and his conception of communication obviously had nothing to do with what the term has come to mean in the social sciences and general discourse. But the need to show that the social sciences could be “real” sciences in search of laws formulated in mathematical terms proved stronger than the somewhat obvious epistemological mismatch.
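What “information apart from meaning” amounts to can be seen in a few lines of Python. The sketch below computes Shannon's entropy, H = -Σ p·log2(p), over the characters of a message. Treating single characters as the symbols, the function name, and the example sentences are my own assumptions for illustration, not anything taken from Shannon or from Machlup & Mansfield.

```python
import math
from collections import Counter


def entropy_bits(message):
    """Average bits per symbol: H = -sum(p * log2(p)) over symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


sentence = "the dog bit the man"
reversed_roles = "the man bit the dog"   # same letters, opposite meaning
scrambled = "".join(sorted(sentence))    # same letters, no meaning at all

for text in (sentence, reversed_roles, scrambled):
    print(repr(text), round(entropy_bits(text), 4))
```

All three strings contain exactly the same characters in the same proportions, so the measure gives them the same value (up to floating-point rounding). Who bit whom, or whether anything is being said at all, simply does not register.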

Like many classic texts, Machlup & Mansfield’s work offers a critique that is not based on dismissal or handbag relativism but on deep engagement with the complexities of the subject matter and long experience with interdisciplinary work, which, necessarily, makes one bump into unfamiliar concepts, methods, ontological preconceptions, modes of reasoning, vectors of explanation and epistemological urges (what is your knowledge itch? how do you want to scratch it?). The Study of Information is a pleasure to read because it brings together very different fields without proposing some kind of unifying meta-concept or imperialist definition of what science – the quest for knowledge – should look like.

The question of how mathematics could lay the foundation for a machine that sustains such a wide variety of practices is really quite well understood from the point of view of the mathematical theory of computation. From a humanities standpoint however, despite the number of texts commenting on the genius of key figures such as Gödel, Turing, Shannon, and Church, there is still a certain awkwardness when it comes to situating the key steps in mathematical reasoning that led up to the birth of the computer in the larger context of mathematics itself. One of the questions I find really quite interesting is the role of the formalist stance in mathematics.

In the philosophy of mathematics, there are many different positions. The realist stance for example holds that mathematical objects exist. For the platonist, they exist in some kind of extra spatio-temporal realm of ideas. For the physicalist, they are intrinsically connected to material existence, even if that relationship is not necessarily simple. Then there is formalism and this is where things get interesting. In a tale we can read in many social sciences and humanities books on the computer, there is the young Kurt Gödel who smashes the coherent world of the “establishment” mathematician David Hilbert, inventing the metamathematical tools that will later prove essential for the practical realization of computing machinery in the process. What is most often overlooked in that story is that Hilbert’s formalist position is already an extremely important step in the preparation for what is to come. For Hilbert, the question of the ontological status of mathematical objects is already a no-go – truth is no longer defined via any kind of correspondence to an external system but as a function of the internal coherence of the symbolic system. As Bettina Heintz says, Hilbert’s work rendered mathematical concepts “self-sufficient” (autark) by liberating them from any kind of external benchmark and opening a purely mechanical world where symbolic machinery can be built at will, like in a game.

If we want to think about computing today, I think we should remember this break from an ontological concept of truth to a purely formalistic one (even if that mean Gödel put a pretty big crack in it later on). Because in a way, programming is like a “game” with formulas and if the algorithm works, that means it is “true”. In this sense, Google’s PageRank algorithm is true. But without the reference to an external system, this “truth” is purely mechanical, internal. In a similar way, an algorithm’s claim to objectivity, impartiality, or neutrality should be seen as internal only. The moment we apply mathematics to the description of some external mechanism (gravity, for example), there is a second truth criterion that intervenes, which refers to the establishment of correspondence between the formal system and the external reality. In the same way, if an algorithm is applied to, let’s say, the filtering of information, the formal world of the game is mapped onto another world. There is an important difference however. When mathematics is applied to physical phenomena, the gesture is descriptive and epistemological (verb: is). When an algorithm is applied to tasks such as information filtering, the gesture is prescriptive and political (verb: ought).

The fact that an automatic procedure works makes it true in a formal sense. The moment we apply it to a certain task, other criteria intervene. Hilbert’s formalism pulled mathematics from the empirical world and if we bring the two together again by writing software, the criteria by which we judge the quality of that action should be seen as political because there are no mathematical criteria to judge the mapping of one world onto the other. No Hilbert to hold our hand…