Tuesday, May 9, 2017

Ski rental, and when should you refactor your code?

A codebase often accumulates technical debt:  bad code that builds up over time due to poor design or to shortcuts taken to meet a deadline.

Sometimes, you need to take a break from implementing features or fixing bugs, and temporarily focus on repaying technical debt via activities such as refactoring.  If you don't do so, then the technical debt will accumulate until you find it impossible to make any changes to your software, and even refactoring to reduce the technical debt may become impractical.  But you don't want to spend all your time refactoring, nor do you want to refactor prematurely:  you want to spend most of your effort on delivering improvements that are visible to users, and you cannot be sure of your future development plans and needs.

Typically, the decision about when to refactor or clean up code is based primarily on gut feel, emotions, or external deadlines.  The ski rental algorithm offers a better solution.

Ski rental is a canonical rent/buy problem.  When you are learning to ski and unsure of whether you will stick with the sport, each day you have the choice of renting or buying skis.  Buying skis prematurely would be a waste of money if you stop skiing.  Renting skis for too long would be a waste of money if you ski for many days.

Here is an algorithm that guarantees you never spend more than twice as much as you would have if you had known in advance how many days you would ski.  Rent until you have spent as much money as it would cost to buy the skis, and then buy the skis.  If you quit skiing during the rental period, you have spent the minimum possible amount.  If you continue skiing until you purchase skis, then you have spent twice as much as you would have, were you omniscient:  you paid the purchase price once in rental fees and once more to buy, whereas an omniscient skier would have bought on the first day.  (Randomized algorithms exist that give a better expected amount of money lost, but their improved bound holds only in expectation, not on every run.)
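As a rough illustration, here is a minimal Python sketch of the break-even rule described above.  The function names and the whole-number prices are my own assumptions for the example.  When the rental price does not divide the purchase price evenly, this sketch buys as soon as one more rental would push total rental spending past the purchase price, which keeps the factor-of-two bound intact.

```python
def break_even_cost(days_skied, rent_per_day, buy_price):
    """Total spent by the break-even strategy over `days_skied` days.

    Rent each day while cumulative rental spending stays within
    `buy_price`; on the first day that another rental would exceed it,
    buy instead.  (Hypothetical helper, not from any library.)
    """
    spent = 0
    for _ in range(days_skied):
        if spent + rent_per_day > buy_price:
            return spent + buy_price  # buy today; no further costs
        spent += rent_per_day
    return spent  # quit skiing before ever buying


def omniscient_cost(days_skied, rent_per_day, buy_price):
    """Cheapest possible spending if the number of days were known in advance."""
    return min(days_skied * rent_per_day, buy_price)


if __name__ == "__main__":
    # Assumed prices: renting costs 1 per day, buying costs 10.
    for days in (5, 10, 11, 100):
        print(days, break_even_cost(days, 1, 10), omniscient_cost(days, 1, 10))
```

With these assumed prices, a skier who quits within the first 10 days pays exactly the omniscient amount, and one who keeps skiing pays 20 instead of 10.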

You can apply the same approach to refactoring.  For any proposed refactoring, estimate how much time it would take to perform, and estimate the ongoing costs of not refactoring (such as extra time required to add features, fix bugs, or test).  Use the ski rental algorithm to decide when to perform the refactoring.
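One way this bookkeeping might look in practice is sketched below in Python.  The function name, the weekly granularity, and the hour figures are all hypothetical; the only idea taken from the text is to keep "renting" (living with the debt) until the accumulated overhead matches the estimated cost of the refactoring.

```python
def week_to_refactor(weekly_overhead_hours, refactor_estimate_hours):
    """Return the 1-based week in which cumulative overhead first reaches
    the refactoring estimate, or None if it has not happened yet.

    `weekly_overhead_hours` is an assumed log of extra hours lost each
    week to the un-refactored code (slower features, fixes, and tests).
    """
    total = 0
    for week, hours in enumerate(weekly_overhead_hours, start=1):
        total += hours
        if total >= refactor_estimate_hours:
            return week  # overhead paid so far now matches the "purchase" price
    return None  # keep "renting": the debt has not yet cost enough


if __name__ == "__main__":
    # Made-up numbers: an 8-hour refactoring and a log of weekly overhead.
    print(week_to_refactor([2, 3, 1, 4, 2], 8))
    print(week_to_refactor([1, 1], 8))
```

Both inputs are estimates, so the answer is only as good as the estimates behind it.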

The problem with the ski-rental approach is that programmers are notoriously bad at cost estimation, and ski rental requires you to compare two different estimates.  However, the alternative is to continue to make your decisions based on how much the code smells bother you -- an approach that is likely to waste your time, even if it satisfies your emotions.

Monday, May 1, 2017

Plotters, pantsers, and software development

Last week, I gave two talks at TU Delft and also had the privilege to hear a talk by Dr. Felienne Hermans that analogized programming to story-writing.  One source of inspiration for her was the observation that when kids program, their programs might not contain any user interaction, but only show a story, somewhat like a movie.

There are two general types of fiction writers:  plotters and pantsers.  A plotter outlines the story and plans its structure and characters before beginning to write and filling in the details.  By contrast, a pantser prefers to write by the seat of the pants, discovering the story as they write and later revising to achieve consistency.  There are great writers who are plotters and great writers who are pantsers (and most writers are probably some combination of the two personalities).  Each approach requires heavy work.  Plotters do their heavy work during the planning stage.  Pantsers do their heavy work during rewriting stages.

Most recommendations about software development come from a plotter mentality.  The developer should determine user requirements and decide on an architecture and specifications of components before writing the code.  Extreme Programming can be viewed as a reaction to this "Big Design Up Front" attitude.  Extreme Programming discourages extensive up-front design:  it encourages taking on one small task at a time and doing only enough work to complete that task.  It advocates refactoring during development -- similar to rewriting a text -- as the developers discover new requirements or learn the limitations of their design.

Perhaps Extreme Programming is a reaction from pantsers who feel alienated by the dominant software development approach.  Perhaps Extreme Programming is their attempt to express and legitimize their own style of thinking.  Perhaps by respecting those mental differences, we can improve education to attract more students and make them all feel welcome.  And perhaps both plotters and pantsers can understand the other in order to avoid needless religious wars over the right approach to software development.

Felienne wasn't able to answer my questions about plotters and pantsers, such as the following.  Can we look at a finished piece of writing and tell whether it was created by a plotter or a pantser?  How should we teach writing differently depending on the learner's preferred style?  Is one's personal style innate or learned?  Can people be trained to work in the other style, and what is the effect on their output?  For novices, which approach produces more successfully completed manuscripts and fewer abandoned efforts?  Are different styles more appropriate for different genres, or for series rather than individual books?

Analogies can be useful, especially in sparking ideas, but they should not be taken too far.  For example, the frequent analogies between civil engineering and software engineering have led to unproductive "bridge envy" and incorrect comparisons.  Although many bridges are built each year, most of them do not require imaginative design because they are similar or identical to previously built bridges.  By contrast, every new program is fundamentally different from what exists -- otherwise, we would just reuse or modify existing code.  Therefore, the design challenges are much greater for software.

I also have questions about the analogy between fiction writing and programming.  (I want to admit to you and to myself that I am a plotter, so these questions may reflect my personal bias.)  Although plotting and pantsing may both produce great novels when practiced by great writers, would they both produce great nonfiction -- or, for that matter, great bridges?  Extreme Programming has been shown to work in certain circumstances, but few people practice it in its pure form, and it does not scale up to large development efforts.  It is commonly said that you can't refactor a program to make it secure or to give it certain other desirable properties; is this actually true, and if so what does it say about the utility of pantsing in software engineering?  Can pantsing work well within the confines of a well-understood domain -- such as writing a period romance novel or building a website based on a framework you have used before?

Whatever the benefits of the writing analogy for software engineering, it is a thought-provoking alternative to the civil engineering analogy.  It reminds us to be aware of the many ways of thinking, not just our own.