Monthly Archives: February 2012
The emerging field of software studies (and micro-annexes like “code studies”) shows a remarkable interest in code obfuscation (e.g. here, here, and here), a fun practice for creative programmers that plays on the fact that source code is text and can therefore be endlessly transformed (there are also more serious uses for obfuscation, generally in situations where source code is visible by design, e.g. JavaScript on the Web). While the practice of making a program’s source code unreadable without breaking functionality is indeed a way of approaching software from a potentially revelatory angle, I am somewhat astounded by how much attention humanities scholars pay to an exercise that is diametrically opposed to what 99% of all programmers spend considerable blood, sweat, and tears on every day, namely to make their code readable.
Code obfuscation as creative and playful practice for expert programmers speaks to the humanities’ interest in the original, the artistic, the deviant, and the critical but there is a real danger of losing connection with the mundane practice of writing software, where considerable energy is spent on writing code in a way that other people can easily understand it and, perhaps even more importantly, that a programmer can understand it quickly herself when coming back to a script or module weeks or months after it was written.
As most programmers will attest, the considerable difficulty of programming lies not so much in the “programming” part but in the managing of large amounts of stuff: complex architectures that span over many modules, huge APIs and libraries that provide highly specialized functionality, programming languages with always growing numbers of comfort functions (just look at how many array functions there are now in PHP), pages and pages of (sometimes badly written) documentation, different versions of basically everything, and – of course – the large amounts of code we ourselves and the people we work with have written, not so rarely under considerable time constraints, which leads of course to less than stellar code. The logistical dimension of programming is considerable.
SVN systems, powerful IDE’s (for somebody like me who only programs a couple of hours per week, autocomplete and integrated documentation are simply a godsend), and better development methodology obviously make the task of negotiating this massive environment a lot more bearable, but these tools are not eliminating the need to read code all the time to understand what’s going on. That’s why we try to make it readable as we write it and good refactoring (going over one’s code after the functionality is implemented) treats readability as a priority. But still, every programmer I know has, at one point in time, decided to write a library or a program herself simply because she didn’t want to experience the excruciating pain of reading somebody else’s poorly written code. This is how bad things can get.
Computer Science literature (like Steve McConnell’s classic Code Complete) and the Web are full of guidelines on how to write readable code and recommendations are intensely discussed and can be extremely detailed. I would like to argue here that one can learn as much – or more – about software by looking at strategies for readability than by looking at obfuscation. Some things are rather obvious, like choosing good names for modules, classes, functions, and variables; or like code indentation, which some programming languages have even made a requirement. Good commenting seems to be rather evident as well but there are many different schools of thought on that and automated comment generation in certain programming editors has not lead to real standardization. In general, while there is certainly wide agreement on the need for readability, the persistence of differences in style makes it clear that this is largely a question of convention and therefore depends on normative agreement rather than on simply finding the “best” technique.
But what I find most interesting about the question of readability is that beyond the cited elements lurk even more difficult questions that concern the borders between readability and architecture and between readability and complexity. Ed Lippert for example writes: “Don’t write ‘clever’ code; the maintenance programmers don’t have time to figure out your cleverness when it turns out to be broken.” This points to some of the basic tensions in modern software design and engineering: while programmers learn to value elegance, efficiency, and compact code, the requirements of large teams with a high degree of division of labor and the general speed-up of hardware can make readability a higher priority than execution speed or compactness. This can also mean to not use certain obscure functions or syntactical conventions. Consider these two examples in JavaScript:
variable1 = (variable2 == 10) ? 20 : false; and if(variable2 == 10) { variable1 = 20; } else { variable1 = false; }
These two elements are functionally equivalent; the first one however is much shorter and, especially for less experienced programmers, more difficult to read and understand.
Another question concerns when and how to divide code into functions, objects, modules, etc. Dustin Boswell and and Trevor Foucher’s Art of Readable Code for example recommends to “extract unrelated subproblems” by moving the code into a subroutine. While this may be straightforward in many cases, what the “reader” needs to know to understand the code can vary a lot from one case to another. Creating subroutines can certainly help with readability (and make code more easily reusable), but it a) means that the reader has to track down the subroutines and b) may make the code more complex simply because the subroutine may take into account different use cases that have to be distinguished. While redundancy is often considered a crime, it can have benefits when it comes to readability.
The subject of readability can be (and is) discussed infinitely but what is significant from a software studies’ perspective is that the problem points to the incursion of a social and economic context into the practice of programming. Not only do we ask “what is my code supposed to do?”, but also “who is going to read my code?”, “will other people work with my code?”, “is this something I will reuse?”, “how important is execution speed?”, and so on. While studying obfuscation points to the duality of computer code as text and machine, the readability question reveals it as caught up in various contexts that have to be negotiated in the practice of programming itself. That code is executable is the technical condition for software. That code is readable is not a requirement on the same level; but it has become a major aspect to a program’s capacity to become part of an increasingly structured professional practice.