Do Not Write Vacuum-Sealed Code

There seems to be a tendency for code writers to vacuum-seal their code. By “vacuum-sealed”, I mean that the code is inconvenient to add to, edit, and modify. When you read a section of code, it should be as convenient as possible to add a comment, add and remove lines, rearrange the lines, and generally modify the code in any way.

Which of these two bags of coffee is easier to add more coffee to?

Vacuum-Sealed Coffee bag

Tin Tied Coffee Bag

It is easier to add coffee to the tin-tied bag than to the vacuum-sealed bag. Don’t vacuum-seal your code. Write code that is ‘airy’, uncompressed, and convenient to edit.

Vacuum seals in code come in many forms. Here are a few examples:

Placing all the code in a return statement

...
public int someFunction(int input) {
    return (int) ((input + someOtherFunction(input)) % 2.718);
}

This is a fairly short and simple example, but the pattern gets really ugly once the function grows to 10 lines or more. If you wanted to write a comment on the reason for calling ‘someOtherFunction()’, it would be a bit of a hassle; it would be ‘inconvenient’. The refactored code below conveniently allows a comment.

...
public int someFunction(int input) {
    // call someOtherFunction() to spackle over the lower north regime error.
    int result = input + someOtherFunction(input);
    result = (int) (result % 2.718);
    return result;
}

It is now much more convenient to add a code comment and to edit the code. As a by-product, the code is easier to debug and step through.

Anonymous Functions

Anonymous functions are one of the biggest offenders. I continue to be amazed by functional programmers who seem to go to great extremes to avoid writing a named function, but that is a topic for another post.

...
...
iCollection myCollection = theData.Filter(x -> { if (x <= 1) return false;
            for (int i = 2; i * i <= x; i++)
                if (x % i == 0) return false;
            return true; });

Source of the anonymous function part: https://rosettacode.org/wiki/Primality_by_trial_division#C#

This is also a simple example. In the wild, anonymous function code gets very long, and programmers tend to cram anonymous functions onto one line. The anonymous function vacuum-seals this code. It is not convenient to fix a bug, add a comment, or write a test.

Also, a first-time reader of this code cannot quickly determine what the anonymous function does. Plus, debugging and stepping through the anonymous function is difficult. The code is also not reusable: how would a programmer use the anonymous function in another filter?

The solution is simply to remove the anonymous function and write a named function.

...
...
iCollection myCollection = theData.Filter(x -> isPrime(x)); 

... 

/// returns true if the input is prime.  
public Boolean isPrime(Integer input) {
    if (input <= 1) return false;
    for (int i = 2; i * i <= input; i++) {
        if (input % i == 0) return false;
    }           
    return true;
}

In this version the code has room to breathe. It is more convenient to edit, modify, reuse, debug, and test.
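As a minimal sketch of the reuse and testing benefit, the named function can be handed to more than one filter and checked directly. This assumes plain Java streams as a stand-in for the `theData.Filter(...)` call above; the class name `PrimeFilterSketch`, the helper methods, and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name; java.util.stream stands in for the post's Filter() call.
class PrimeFilterSketch {

    /** Returns true if the input is prime (trial division). */
    static boolean isPrime(int input) {
        if (input <= 1) return false;
        // The long cast keeps i * i from overflowing for inputs near Integer.MAX_VALUE.
        for (int i = 2; (long) i * i <= input; i++) {
            if (input % i == 0) return false;
        }
        return true;
    }

    /** Reuse #1: keep only the primes. */
    static List<Integer> primesOf(List<Integer> data) {
        return data.stream()
                   .filter(PrimeFilterSketch::isPrime)
                   .collect(Collectors.toList());
    }

    /** Reuse #2: the same function, negated, in a different filter. */
    static List<Integer> nonPrimesOf(List<Integer> data) {
        return data.stream()
                   .filter(x -> !isPrime(x))
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> theData = Arrays.asList(1, 2, 3, 4, 5, 9, 11); // sample data
        System.out.println(primesOf(theData));    // [2, 3, 5, 11]
        System.out.println(nonPrimesOf(theData)); // [1, 4, 9]
    }
}
```

Because `isPrime` has a name, a unit test can call it on its own without building a collection first.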

Conclusion

When writing code, try to get a feeling for when the code is vacuum-sealed and rewrite it to avoid the seal. This will make your code easier to edit, modify, debug, reuse, and unit test. It will save you time and make future edits to the code convenient.

Do we avoid breaking changes too much?

In the software industry do we avoid breaking changes too much? I am talking about the big stuff. I am talking about browsers and programming languages. Breaking changes are difficult in these systems as there are multiple groups that need to coordinate to implement the change.

What is a breaking change? A breaking change is an upgrade that sacrifices compatibility with previous versions. For example, Python 3 was a breaking change: Python 2 code will not, in general, run on the Python 3 interpreter. This may be thought of as a branch. Breaking changes are signaled by a change in the ‘major’ version number. For example, going from 2.0.0 to 3.0.0 is a breaking change. See: Semantic versioning
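The major-number rule can be sketched in a few lines of Java. This is a simplified illustration, not a full semantic versioning parser: the class and method names are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build tags.

```java
// Hypothetical helper; assumes simple "MAJOR.MINOR.PATCH" version strings.
class SemVerSketch {

    /** Extracts the major number from a "MAJOR.MINOR.PATCH" string. */
    static int majorOf(String version) {
        String[] parts = version.trim().split("\\.");
        return Integer.parseInt(parts[0]);
    }

    /** True when the major number increases, which signals a breaking change. */
    static boolean isBreakingUpgrade(String from, String to) {
        return majorOf(to) > majorOf(from);
    }

    public static void main(String[] args) {
        System.out.println(isBreakingUpgrade("2.0.0", "3.0.0")); // true
        System.out.println(isBreakingUpgrade("2.7.1", "2.8.0")); // false
    }
}
```

A tool that upgrades dependencies automatically could use a check like this to stop and ask before crossing a major version.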

Breaking changes are a big deal. I don’t want to under-emphasize this fact. There are people still upset about Python 3. Breaking changes impose costs on the users of the code, and if there are a lot of users, then there are a ton of costs. But there are benefits: Python 3 has risen to the top of the charts. I think the change to 3.0 allowed Python to reach king-of-the-hill status. Without that breaking change, I believe Python would have faded away.

A technology we have held on to for too long is a ‘Turtle’ technology (I just made that up, so don’t go Googling it). Developers who use a technology too long are called ‘Turtles’. They lock onto it like a snapping turtle and won’t let go. They growl at you and doubt your sanity if you suggest an alternative or question the rightness of it.

For some perspective, let’s look at something that a group of developers held on to for too long. Let’s talk about IBM’s green screen, known as IBM 3270 and IBM 5250 and called simply 3270 in the rest of this essay. The 3270 was developed in 1971 and was a very good solution for communicating with a mainframe from a remote terminal. You could do anything on a 3270, including programming and debugging. But the 3270 was character-based and could not display graphics or bar codes. I worked on the AS/400 and IBM 5250 in 1995, and the developers were in full turtle mode. Their forceful obstinacy seemed, to me, a bit irrational.

In the end, IBM wrote Java drivers for the database and all the systems for the AS/400, and we wrote a web page interface in short order. IBM even added Java to the AS/400. After that, green screen development declined to a natural level, limited to applications where it makes sense, such as data entry.

Another turtle technology is the ‘C’ programming language. C turtles are not the irrational kind. They acknowledge the problems of the language, such as buffer overruns and race conditions. C was created in the 1972-1973 time frame. It remains a very popular language and is the ‘go-to’ language for systems, low-level, and embedded programming. C has several descendants, such as C++ and C#. C++ adds object-oriented programming but still suffers from C’s problems. C# is just Java misspoken and is more of a Java descendant than a C descendant. Both are very good, but neither one was designed to be a ‘systems’ language. I don’t know why it has taken 48 years to find something better. It may be that C is just that great. The good news is that it looks like Rust may be the successor to C. Rust fixes most of the problems with C and still remains a systems-level language.

Try not to be a turtle. Remember that all technologies will lose popularity. If your favorite tech is currently falling and being ridiculed, then recognize it and accept it. Don’t take it personally. Don’t disparage your fellows on the other side. You can take steps to reverse the demise, move to something new, or just live with it and ride the bubble down the drain.

The many levels of software anomalies

This paper has been brewing in my head for a long time. It started when I heard someone say “All software has bugs.” I was logically forced to say that the statement was true, but it didn’t feel completely true to me. This statement needed some clarification and adjustment to make the truth ring in a harmonious fashion.

My first step was to recognize that not all bugs are the same. They occur at different times. After you write a line of code, the bug could appear the 1st time you run it, or the 10th, or the 1,000,000th. This led me to create a scale similar to Big O notation for algorithms. I call this “Big B” notation. The Big B of a bug is Log10(N), where N is the number of times the line of code is run before the bug occurs.

For Example:

// Divide by zero
result =  input / 0;

This will throw a divide by zero error the first time it is executed. It has a Big B of zero – written as B(0).

// Integer overflow 
int i = 1;
while (true) {
   i = i + 1;
}

In Java this will overflow the value of i when it is incremented the 2,147,483,647th time, which makes it a B(9.3) bug.
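The overflow itself is silent in Java, which is part of why this kind of bug hides for so long. The sketch below shows the wraparound and one way to detect it with `Math.addExact`, which throws instead of wrapping. The class and method names are hypothetical.

```java
// Hypothetical class illustrating int overflow in Java.
class OverflowSketch {

    /** Plain int addition wraps around silently on overflow. */
    static int wrappingIncrement(int i) {
        return i + 1;
    }

    /** Math.addExact throws ArithmeticException instead of wrapping. */
    static boolean incrementOverflows(int i) {
        try {
            Math.addExact(i, 1);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Starting from 1, the 2,147,483,647th increment is applied to Integer.MAX_VALUE.
        System.out.println(wrappingIncrement(Integer.MAX_VALUE) == Integer.MIN_VALUE); // true
        System.out.println(incrementOverflows(Integer.MAX_VALUE));                     // true
        System.out.println(incrementOverflows(41));                                    // false
    }
}
```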

Big B is a handy notation. If you run a program and it has errors in the B(0) to B(2) range, it is a beta program. A released program should not contain bugs of B(7) or lower and should not have known B(9.3) bugs like integer overflow. So now we can say, “All programs have software bugs, but they should be rarer than B(7) bugs.”

Big B is pretty good, but it only covers one category of bugs: the bugs that occur on repeated execution. A lot of bugs don’t fall into this category, such as resource exhaustion, malicious user input, and a long list of others. Don’t use Big B on these categories; it does not work very well.
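The Big B arithmetic can be sketched in Java as follows. The class name, the method names, and the release gate are hypothetical; the gate encodes only the rule of thumb that a released program should have nothing at B(7) or lower.

```java
// Hypothetical helper for the Big B scale: B = log10(N).
class BigBSketch {

    /** Big B = log10(N), where N is the execution count at which the bug first appears. */
    static double bigB(long executionsUntilBug) {
        if (executionsUntilBug < 1) {
            throw new IllegalArgumentException("N must be at least 1");
        }
        return Math.log10(executionsUntilBug);
    }

    /** Assumed release gate: a bug is tolerable only if it is rarer than B(7). */
    static boolean acceptableForRelease(long executionsUntilBug) {
        return bigB(executionsUntilBug) > 7.0;
    }

    /** Rounds to one decimal place for display, e.g. 9.33 becomes 9.3. */
    static double roundedBigB(long executionsUntilBug) {
        return Math.round(bigB(executionsUntilBug) * 10) / 10.0;
    }

    public static void main(String[] args) {
        System.out.println(roundedBigB(1));                  // 0.0  (divide by zero)
        System.out.println(roundedBigB(Integer.MAX_VALUE));  // 9.3  (integer overflow)
        System.out.println(acceptableForRelease(100));       // false, a B(2) bug
    }
}
```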

There needs to be some other classification of bugs. I thought maybe something like IP Code would be a good framework to work from. But this line of thought has not offered many rewards. Your comments and suggestions are very welcome.

There is one other level of anomalies. This is a category of bugs found in code that has been running for decades. The code has full unit testing and has been reviewed several times. No reasonable level of testing would discover this level of bug. This level of bug should be called a “Knuth”. The Knuth is named after Donald Knuth, who writes checks to anyone who finds a bug in his books. See: Knuth reward check

There may be bugs in all software. Hopefully, they are only “Knuth Level” bugs. If your boss or users find one of these you should be proud knowing you did your best, no shame shall fall upon your family. Now you should fix the bug, and then write them a check for $2.56 or perhaps offer to buy them a beer.