Cohesion and redundancy


    We’re doing some static analysis with NDepend as part of our daily build and the cohesion measurement caught my interest.

    When googling for cohesion I stumbled upon some articles discussing reading and writing normal text (normal in this case meaning not source code).

    The concepts that I fell for are cohesion and redundancy.

    Here is what Wikipedia has to say about these words regarding programming:

    • Cohesion: a measure of how well the lines of source code within a module work together to provide a specific piece of functionality.

    • Redundancy: the notion of unnecessary, dead or duplicate code.

    … and here are the same words regarding linguistics:

    • Cohesion: the linguistic elements that make a discourse semantically coherent.

    • Redundancy: In language, redundancy is the use of duplicative, unnecessary or useless wording.

    Linguistics also has the notion of coherence: what makes a text semantically meaningful.

    In this article Alice Horning discusses research about cohesion and redundancy in writing. It made me draw some parallels between writing good literature and writing good source code.

    Alice states that novice writers lack the ability to distance themselves from their writing and see the text from a reader’s point of view. I see the same pattern again and again when reading, and trying to understand, source code. It is mostly not the programmers’ lack of technical competence that makes their intent hard to understand. It is their style of naming and grouping things that creates a total lack of linguistic cohesion and coherence.
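
    To make the point about naming and grouping concrete, here is a small sketch in Python. Everything in it (proc, Customer, names_of_active_customers_above, the sample fields) is made up for illustration and not taken from any real code base. Both versions do the same thing; only the second one tells the reader what that thing is.

```python
from dataclasses import dataclass


# Version 1: technically fine, but the names and the anonymous tuples
# force the reader to reverse-engineer the intent.
def proc(d, f):
    r = []
    for x in d:
        if x[2] and x[1] > f:
            r.append(x[0])
    return r


# Version 2: same behaviour, written with the reader in mind.
# The related values are grouped under one name, and every name
# repeats a little of the domain vocabulary.
@dataclass
class Customer:
    name: str
    balance: float
    is_active: bool


def names_of_active_customers_above(customers, minimum_balance):
    """Return the names of active customers whose balance exceeds the minimum."""
    return [
        customer.name
        for customer in customers
        if customer.is_active and customer.balance > minimum_balance
    ]
```

    The second version is longer and in a sense more redundant, but that redundancy is exactly what lets a reader predict what the code does before reading every line.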

    I guess Domain-Driven Design would attend to some of this, but all I know about it is what Jimmy Nilsson said on DotNetRocks. At the very least, a common way of naming, abstracting and grouping things within an application would use redundancy in the linguistic sense to make code more understandable. Design patterns also use redundancy to help the reader understand code, since the reader will probably have a good idea of what’s coming if the code corresponds to a known pattern.

    Code reviews also help a lot, but they are done far less often than one would like.

    I think the cure for this is to take time to read more code. Just as you need to read lots of literature to be a good author, you need to read other people’s source code to enhance your own skills. You need to read code that is more than just samples and proofs of concept, and I think it should be code that you are not emotionally attached to. So get your favorite open source application and read its code. While you’re at it you can contribute to it and make the world an even better place.