Agile Chronicles #12: Technical Debt

I know it may sound like I’m painting a rosy, infallible picture of Scrum.  It’s the truth, though, and I feel like it solved most of my project problems.  There is, however, one main problem I saw that made Scrum, and Iterative Development in general, fall flat on its face.  It’s called Technical Debt, and it’s a problem with programming in general, not Scrum/Agile.  When it rears its head in Scrum, the effects are devastating, and I believe it’s one of the main reasons Scrum fails for a lot of people.

This article will discuss what Technical Debt is from a Flash/Flex developer perspective, how it negatively affects my Scrum projects, and some of the prescribed ways to prevent it.  Nothing groundbreaking here folks, just corroboration that TD IS a major problem, and not even Scrum is immune.

What is Technical Debt?

Martin Fowler has a good summary about what Technical Debt is.  Here are 2 quotes from his wiki:

You have a piece of functionality that you need to add to your system. You see two ways to do it, one is quick to do but is messy – you are sure that it will make further changes harder in the future. The other results in a cleaner design, but will take longer to put in place.

Git-r-done vs. doing it right.

Another quote which describes the metaphor:

…doing things the quick and dirty way sets us up with a technical debt, which is similar to a financial debt. Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design. Although it costs to pay down the principal, we gain by reduced interest payments in the future.

The metaphor also explains why it may be sensible to do the quick and dirty approach. Just as a business incurs some debt to take advantage of a market opportunity developers may incur technical debt to hit an important deadline. The all too common problem is that development organizations let their debt get out of control and spend most of their future development effort paying crippling interest payments.

He goes on to mention that unlike money, you cannot effectively measure technical debt. Additionally, you only get a return on your loan once you deliver.

Signs of Technical Debt

There are various signs of Technical Debt: system fragility, tons of recurring bugs, or code that “sometimes” works.

System fragility examples include moving a login button and finding the entire login form no longer works.  Another sign is when you change a ValueObject’s property name slightly, yet get no new compiler errors.  Yet another, usually a violation of DRY (Don’t Repeat Yourself), is when you fix the parsing of a server’s response, yet still get the bug after recompiling because a duplicate of that parsing code lives somewhere else.
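To make the ValueObject sign concrete, here is a minimal sketch.  It’s written in TypeScript as a stand-in for ActionScript 3 (both are typed ECMAScript dialects), and the UserVO class, its fields, and the sample values are all made up for illustration.  It assumes the fragile code reaches into the object by string key, which is one common way a rename slips past the compiler.

```typescript
// Hypothetical ValueObject; the class and field names are illustrative only.
class UserVO {
    constructor(public firstName: string, public lastName: string) {}
}

// Fragile: the property is looked up by string on an untyped object, so
// renaming UserVO.firstName produces no compiler error here. The lookup
// simply starts returning undefined at runtime.
function fragileLabel(user: any): string {
    return String(user["firstName"]) + " " + String(user["lastName"]);
}

// Safer: typed access. Renaming the field breaks compilation on this line,
// which is the early warning you want.
function typedLabel(user: UserVO): string {
    return user.firstName + " " + user.lastName;
}

const typedUser = new UserVO("Ada", "Lovelace");
// Simulates the object after someone renamed firstName to first.
const renamedUser = { first: "Ada", lastName: "Lovelace" };

const typedResult = typedLabel(typedUser);       // "Ada Lovelace"
const fragileResult = fragileLabel(renamedUser); // "undefined Lovelace"
```

The fragile version still compiles and still runs; it just shows the wrong thing, which is why this kind of debt is so easy to miss.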

Another example is the statement “the application doesn’t seem to be working”.  If there had been diagnostic tools in place such as unit tests, a debug/log window, or other testing tools, that question could be quickly answered.  Even if it can’t, if you can’t quickly isolate problems after asking some follow up questions, that’s another indicator.
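As a rough idea of what the cheapest of those diagnostic tools can look like, here is a sketch of an in-memory logger that a debug/log window could render.  It’s TypeScript standing in for ActionScript 3, and every name in it (AppLogger, the level names, the LoginService source, the messages) is an assumption of mine, not a real library.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogEntry {
    level: LogLevel;
    source: string;  // which part of the app wrote the entry
    message: string;
}

// Minimal in-memory logger; a debug/log window would render `entries`.
class AppLogger {
    private static readonly ORDER: LogLevel[] = ["debug", "info", "warn", "error"];
    readonly entries: LogEntry[] = [];

    constructor(private minLevel: LogLevel = "debug") {}

    log(level: LogLevel, source: string, message: string): void {
        if (AppLogger.ORDER.indexOf(level) < AppLogger.ORDER.indexOf(this.minLevel)) {
            return; // below the configured threshold, so drop it
        }
        this.entries.push({ level: level, source: source, message: message });
    }

    // Helps answer "whose fault is it?" by filtering on the source.
    from(source: string): LogEntry[] {
        return this.entries.filter((entry) => entry.source === source);
    }
}

const appLog = new AppLogger("info");
appLog.log("debug", "LoginForm", "render pass"); // dropped: below "info"
appLog.log("info", "LoginService", "login request sent");
appLog.log("error", "LoginService", "response was not valid XML");

const serviceEntries = appLog.from("LoginService"); // the two LoginService entries
```

With something like this in place, “the application doesn’t seem to be working” turns into a question you can answer by reading the last few entries.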

Another is when you fix a bug. 3 times. Over 3 Sprints.  Then it comes back in some other unrelated area.  Another is when you’re presented with one or more error dialogs before the user even does anything.

The best are when you see the right data.  Sometimes.  You have no idea if it’s your fault, the server’s fault, or… what.

Bottom line: the code has growing pains, and is challenging to change.  When you change one thing it breaks something else unrelated, your change is a pain to implement, or your new code duplicates a lot of functionality because the existing code isn’t encapsulated enough to share.

The one that took me a LONG time to get over when I first started dealing with extremely large code bases was “fatigue”. I’d get tired of dealing with all the problems, and feeling like I wasn’t making any progress. That, and I had no confidence in my changes because I wasn’t sure what the ramifications were, what would break, etc.  Effectively, I didn’t want to work on the project anymore. This has happened twice to me.  It’s not from my attention span, but rather just pure loathing at not being able to move forward.  Part of this is remedied by experience, part by “learning the story” of the code base by spending a lot of time with it, and part by general growth as a programmer. If you’re grown, and still have this fatigue, that’s another clue.

All these and more are signs of Technical Debt.  If not dealt with, they only get worse and more numerous.

Technical Debt’s Effects on Scrum

In Scrum, your user stories need to pass User Acceptance Testing (UAT).  At the end of every Sprint, your client validates that each user story your team completed is in fact done.  If you have technical debt, this’ll eat away at the quality and/or dependability of your user stories.  This means one or more stories won’t pass UAT.  This then slows down your, or your team’s, velocity.

This often will negatively affect your client’s view of the team as your earlier velocity will have been higher.  Usually your team will start off slow or fast, but eventually 4 Sprints in, a common velocity will emerge.  With a lot of Technical Debt, the reverse will happen; your initial speed will slowly decrease as your team struggles to cope with the increasing Technical Debt.  If not tackled, more debt is created, and your velocity stays slow, or decreases.

Technical Debt will also often exhibit itself as “bugs”.  In Scrum “there are no bugs”.  Benevolent Scrum Masters will assign 1 point to bugs.  This is often done to give a sense of accomplishment when the bug is completed, but more importantly so that it’s trackable.  Its low point value can unfortunately misrepresent its business value.  Perhaps the bug is what prevented your original user story, worth the maximum of 5 points, from passing UAT.  Worse, if your team is working on fixing bugs, they aren’t working on user stories.

Again, negatively affecting your velocity.

Eventually your team will either have to pull back and pay off their technical debt, continually pay off a little to show some payoff, or just pray the client focuses on more pressing user stories in other areas and “forgets” the glaring issues.  I’ve seen all 3 happen.

Pullback & Reassess

Scrum doesn’t really have a methodology for pulling back and fixing core architectural problems.  It’s generally assumed that 80% of your 2 week Sprint is spent refactoring.  I’ll let that sink in.

Done?  Worse, Technical Debt, being unmeasurable, wreaks havoc on a Product Manager’s ability to give transparency into the project for stakeholders.  You don’t often have a stable velocity; some Sprints your team could simply deliver 0 story points.  This makes it challenging for the PM to project when the Backlog will be completed and when certain milestones will be reached.  It’s even worse when managing budgets and getting contracts signed.  Scrum is hard enough to sell to clients who are used to fixed bid; suddenly Scrum sounds like a bait and switch, or just a money pit.  Assuming your team is actually working on the user stories they were assigned and aren’t trying to sabotage the project, then that situation is usually irrefutable proof that your team has some serious Technical Debt.
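The projection problem is easy to see with a little arithmetic.  The sketch below is TypeScript, and everything in it is hypothetical: the Sprint numbers are invented, and averaging the last 3 Sprints is just one simple way a PM might estimate velocity, not something Scrum prescribes.

```typescript
// Story points accepted at UAT per Sprint. These numbers are made up.
const healthyTeam = [18, 22, 20, 21, 20];
const indebtedTeam = [22, 18, 12, 6, 0];

// Average of the most recent `sprintCount` Sprints.
function recentVelocity(accepted: number[], sprintCount: number = 3): number {
    const recent = accepted.slice(-sprintCount);
    if (recent.length === 0) {
        return 0;
    }
    const total = recent.reduce((sum, points) => sum + points, 0);
    return total / recent.length;
}

// Sprints left to finish the Backlog, or null when no projection is possible.
function sprintsRemaining(backlogPoints: number, accepted: number[]): number | null {
    const velocity = recentVelocity(accepted);
    if (velocity <= 0) {
        return null; // the team delivered nothing recently; any date is a guess
    }
    return Math.ceil(backlogPoints / velocity);
}

const healthyEstimate = sprintsRemaining(120, healthyTeam);   // 6
const indebtedEstimate = sprintsRemaining(120, indebtedTeam); // 20
```

Same Backlog, same team size, and the estimate more than triples.  Worse, the indebted number keeps moving every Sprint, so the PM can’t commit to either one.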

Software done with Scrum is like a plane flying from A to B.  While the flight path is continually corrected, eventually it’ll get there.  Planes usually can’t fly with ice on the wings, hence why you set up good architectural bases in the first 2 Sprints.  Ice on the wings during flight can be disastrous.

For some projects, sometimes you have no choice.  Your team has to take the hit, pay off the debt, and rebuild trust with your client.  Yes, these can be bloody, but hey… anything is better than Waterfall, which in my experience ends like Lock, Stock and Two Smoking Barrels: tons of death, theft, and people getting bling who don’t deserve it.  While this is effective with Capitalism, life is too short to produce shitty software.

American Credit

While not an American only phenomenon, it’s certainly indicative of our culture, hence why Fight Club had such an interesting ending.  If you’re quick, good, and lucky you can get away with paying off as much Technical Debt as you can per Sprint while knocking out a few new user stories as well.  Sometimes this is just how things are; other times, you’re merely postponing the inevitable day of reckoning.  Consultants are famous for this.  They get away with masking the massive Technical Debt and moving on, leaving some other poor W2 (employee) sap, or consultants like me, to actually fix the Technical Debt.

Eventually, you have to fix the broken windows… or maybe “you” don’t, but someone does.  It’s easier to do this sooner than later.  I’ve even seen a team where one guy would fix windows 90% of the Sprint, and another would unintentionally break them again.

Pray for Priority Shift

The third one I’ve seen happen in larger companies who don’t adhere 100% to Scrum.  The priorities change for the user stories, and other parts of the application which are basically broken are ignored, buried, and eventually assumed to work.  Flowers don’t grow as well over a dead robot’s grave.

Incidentally, when higher ups find out, this is where I get some of my work.

Preventing & Fixing Technical Debt

As you can see, Technical Debt has devastating effects on Scrum, and Iterative Software in general.  The best way to ensure Technical Debt doesn’t negatively affect your project is to prevent it.  It’s like insect infestations in your house; if you don’t attack the problem early, it can be very challenging to remove.  The best way is to prevent the infestation in the first place by not leaving out food, water, and generally ensuring you have sealed entryways and windows.  Same with software.  Start with a tight, simple architecture, and continue to maintain it.

That sounds great from a high-level perspective, but what does that really mean?

Here are a few I know & have experience with:

  • good design
  • good developers
  • fixing broken windows ASAP
  • OOP, specifically encapsulation
  • unit tests with a decent coverage
  • an agreed upon framework
  • throwing optimization out the window
  • using old technology
  • Continuous Integration: frequent checkins & religious tagging

Here are a few I know of, but don’t have a lot, or any, experience with:

  • Test Driven Development
  • Automated Build Systems
  • Automated Testing (Hudson, Cruise Control, etc)
  • Functional/Automation Testing (QuickTest and others)

Let’s go over these in short detail.

Good Design

The #1 thing that saves you the most time, reduces the most risk, and directly contributes to the efficiency of your team is a good design.  That means wireframes you hate since you’ve re-worked them so many times with gallons of dead trees at your feet, and design comps you love.  5 minutes of a designer’s time can save a week of a developer’s.  When you hit a brick wall during Sprint 5, a designer can put your team back on track quicker and more accurately than a developer can.

Hire, and keep around, a good User Experience/Interaction Designer/Information Architect/Designer today!  If you’re made of epic win, hire a Usability Engineer as well.  They are the ONLY ones who can provide real facts to your team.

Good Developers

A common complaint of Scrum is that it only works with senior to mid-level developers.  I agree.  While it takes a village to build great software, you need to ensure your development team is top notch, or mostly top notch with a rockstar to guide the rest.  I realize this isn’t a reality for a lot of people.  Even if you have a senior dev, the bad developers can actually detract from the senior’s ability to deliver as he/she struggles to re-factor their code while ignoring their own user stories.  That or the bad/mediocre developer continually creates Technical Debt.  While developers not focusing on their user stories is a management problem, not a Scrum specific one, if they can’t even be professional and do what they are told, they are effectively sabotaging the project, and thus need to be removed.

Remember, small agile teams are more effective than larger ones.  If it ends up only being 2 heavy hitters, so be it.  Or you can just hire my team.

Fixing Broken Windows

Not much to add here other than they are usually one of the few things that can clearly indicate Technical Debt.  If you have to prioritize, choose to ignore the ones you can easily predict and reproduce.  They are less dangerous and take a lot less time to annoy your clients.  It’s when you go, “Whoa… never seen that happen before” that you worry your client and PM.

OOP Encapsulation

One of the good concepts of Object Oriented Programming is Encapsulation.  It may seem silly to bring up something so basic, but it’s a basic, core concept for a reason.  Without good encapsulation, your code cannot grow, nor be flexible; something that’s VERY integral in Iterative development.

It’s ok if your API is good and the code underneath it sucks.  This is where Test Driven Development can shine.  Instead of “creating a component” from the inside out you actually write like you’re going to use the component yourself.  A better API usually results.  Then, you can focus on writing the guts to make that API happen… or not (see throwing optimization out the window).  It’s easier to fix bad code inside encapsulated, and thus easily isolated, components than it is to fix spaghetti code.  Spaghetti monsters are waaaayyyyy more scary to fix than WW2 tanks that just need a new engine.
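Here is a small sketch of the “good API, bad guts” idea, again in TypeScript as a stand-in for ActionScript 3.  The ContactLookup interface and both implementations are hypothetical; the point is only that the caller never sees which engine is in the tank.

```typescript
// The public API callers depend on. All names here are hypothetical.
interface ContactLookup {
    add(email: string, displayName: string): void;
    findName(email: string): string | undefined;
}

// First pass: quick and dirty guts behind a clean API.
// Every lookup scans the whole list; the latest entry for an email wins.
class QuickContactLookup implements ContactLookup {
    private contacts: { email: string; displayName: string }[] = [];

    add(email: string, displayName: string): void {
        this.contacts.push({ email: email.toLowerCase(), displayName: displayName });
    }

    findName(email: string): string | undefined {
        const key = email.toLowerCase();
        let found: string | undefined = undefined;
        for (const contact of this.contacts) {
            if (contact.email === key) {
                found = contact.displayName;
            }
        }
        return found;
    }
}

// Later refactor: a plain object used as a dictionary. Same API, new engine.
class IndexedContactLookup implements ContactLookup {
    private byEmail: { [email: string]: string } = {};

    add(email: string, displayName: string): void {
        this.byEmail[email.toLowerCase()] = displayName;
    }

    findName(email: string): string | undefined {
        const key = email.toLowerCase();
        const known = Object.prototype.hasOwnProperty.call(this.byEmail, key);
        return known ? this.byEmail[key] : undefined;
    }
}

// Caller code only knows about the interface, so swapping guts costs nothing here.
function greeting(lookup: ContactLookup, email: string): string {
    const found = lookup.findName(email);
    return found === undefined ? "Hello, stranger" : "Hello, " + found;
}

const quickLookup = new QuickContactLookup();
const indexedLookup = new IndexedContactLookup();
quickLookup.add("Pat@Example.com", "Pat");
indexedLookup.add("Pat@Example.com", "Pat");
```

Calling greeting with either lookup gives the same answer, which is what lets you ship the quick version now and replace it later without touching the callers.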

Unit Tests with Decent Coverage

Even if you don’t practice TDD, it’s still a good practice to implement unit tests on problem areas of the code.  TDD purists will state you should write tests first.  If you don’t, there is still a ton of value in writing unit tests, or getting good “coverage” of problematic areas.  Key areas include your Service layer or any code that accesses non AS3 data and injects it into the system.  This includes XML, text, JSON, AMF, binary data, and anything that “parses” something.

Strong-typing is awesome, but it assumes you started with strong-typing.  Parsing XML to ValueObjects is great, but if “That PersonVO is a Spy!”, good unit test coverage can usually point that stuff out early and save the team days of debugging and lost time.
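A sketch of what that coverage might look like on a parser.  This is TypeScript standing in for ActionScript 3, parsing JSON rather than XML to keep it short.  The PersonVO fields, the payload shape, and the error messages are assumptions of mine, and the checks are hand-rolled so the example needs no test framework.

```typescript
// Hypothetical ValueObject and payload shape, for illustration only.
class PersonVO {
    constructor(public id: number, public fullName: string) {}
}

// Parses a raw server response into a PersonVO, failing loudly when the
// payload is not what the client expects.
function parsePerson(rawJson: string): PersonVO {
    let payload: any;
    try {
        payload = JSON.parse(rawJson);
    } catch {
        throw new Error("PersonVO: response is not valid JSON");
    }
    if (payload === null || typeof payload !== "object") {
        throw new Error("PersonVO: expected an object");
    }
    if (typeof payload.id !== "number") {
        throw new Error("PersonVO: 'id' must be a number, got " + typeof payload.id);
    }
    if (typeof payload.fullName !== "string") {
        throw new Error("PersonVO: 'fullName' must be a string, got " + typeof payload.fullName);
    }
    return new PersonVO(payload.id, payload.fullName);
}

// Runs an action and returns the error message it threw, or "" if it passed.
function errorMessageOf(action: () => void): string {
    try {
        action();
    } catch (caught) {
        return caught instanceof Error ? caught.message : String(caught);
    }
    return "";
}

// The "unit tests": one good payload, one spy, one response that isn't JSON at all.
const goodPerson = parsePerson('{"id": 7, "fullName": "Pat Example"}');
const spyMessage = errorMessageOf(() => parsePerson('{"id": "7", "fullName": "Pat Example"}'));
const htmlMessage = errorMessageOf(() => parsePerson("<html>502 Bad Gateway</html>"));
// spyMessage  is "PersonVO: 'id' must be a number, got string"
// htmlMessage is "PersonVO: response is not valid JSON"
```

The spy here is an id that arrives as a string.  Without the check it would sail through an untyped parse and blow up somewhere far away from the Service layer.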

Plus, it’s really awesome to blame with 100% confidence the back-end Java team for your woes in 7 seconds vs. 2 hours of frustrating searching for a needle in a haystack whilst being verbally bashed by the back-end devs wondering why you didn’t use HTML5 instead of Flex.  Suddenly Outlook does a cliche freeze before you can write a scathing, emotionally charged, career destroying email back at those over-architecting ($*&Us.

While they have less value on GUI components, some of the ways in which you architect Flex 4 components can actually facilitate easier testing.  Examples include really complicated components that NEED to be in a specific state when data is injected.  While a really complicated chart that has 100 unit tests around it still has to be visually verified, those 100 unit tests can help dramatically.

Finally, in those areas where you have good coverage, you can do minor refactoring with a ton more confidence.

Agreed Upon Framework

Doesn’t matter what framework your team chooses, as long as you all are in agreement.  This could even include using no framework.  I prefer Robotlegs, my partners prefer Swiz, and my clients often still use Cairngorm 2.x.  Regardless of your thoughts on frameworks, in my experience they help you manage, scale, and work with a team on larger code bases.

Premature Optimization

Knuth once said:

…premature optimization is the root of all evil.

After a while, there are common optimizations you do by default that, to me, aren’t even considered intentional optimizations.  Some are small, low hanging fruit that you can do without even thinking and make into habits.  Examples include caching an Array’s length in a local variable instead of reading it inside the loop condition, using Object Pooling, and using visible vs. removeChild.
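Two of those habits, sketched in TypeScript as a stand-in for ActionScript 3.  The Particle and ParticlePool classes are made up for the example; in Flash the pooled object would more likely be a display object.  The visible vs. removeChild habit is Flash display-list specific, so it isn’t shown.

```typescript
// Habit 1: read the length once instead of on every pass through the loop.
function sumWidths(widths: number[]): number {
    let total = 0;
    const count = widths.length; // cached outside the loop condition
    for (let i = 0; i < count; i++) {
        total += widths[i];
    }
    return total;
}

// Habit 2: object pooling. Hypothetical pooled object.
class Particle {
    x = 0;
    y = 0;
}

class ParticlePool {
    private free: Particle[] = [];
    created = 0; // how many real allocations the pool has made

    // Hands back a recycled particle when one is available.
    acquire(): Particle {
        const reused = this.free.pop();
        if (reused !== undefined) {
            return reused;
        }
        this.created++;
        return new Particle();
    }

    // Resets the particle and keeps it around for the next acquire().
    release(particle: Particle): void {
        particle.x = 0;
        particle.y = 0;
        this.free.push(particle);
    }
}

const particlePool = new ParticlePool();
const firstParticle = particlePool.acquire();
particlePool.release(firstParticle);
const secondParticle = particlePool.acquire(); // same instance, no new allocation
```

Neither of these needs a profiler or a design meeting, which is why I don’t count them as the kind of optimization you should be throwing out the window.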

Good programmers constantly want to make their code better.  Sometimes this means faster or more responsive.  The really good ones know when to leave a TODO, and move on.  Most users are happy with a reasonably responsive GUI.  A programmer’s expectations of “responsive” are often way higher than what a user will accept.

Code it and refactor later.  You can’t refactor vaporware.  Focus on pimp API’s.  Besides, this is ActionScript, not C.  We have certain limitations (no threads, a VM vs. all code being machine code, and a lot of missing optimizations).  Brandon Hall once said if you have a performance problem in Flash, you’re probably doing it wrong.  Instead of re-factoring code, step back and re-assess your approach.  Recognize those limitations early, and yell like crazy at the designer, “WE CAN’T GO TO THE MOON IN A TRICYCLE YOU FRAKIN CRAYON PUSHER!!!?”

If you have something to prove, do it at night and post to your blog.  If the GUI isn’t responsive, well duh, that’s not a bad optimization, that’s a required one.  The point here is if you need to prove through test cases that the app will be unresponsive without your optimization(s), you’re either in dangerous time sink territory, or doing something amazing.

Use Old Technology

While I love using Flex 4, a lot of the apps I still work on, and clients have in development and/or production are Flex 3.x.  Flex 3.5 & Flex Builder are solid, 4.0 SDK and IDE, not so much.  You want your problems to be programming challenges, not unexpected scrollRect or IDE magically disappearing code hinting bugs.  These things can be major time sinks and devastating to your productivity, sometimes killing your entire day as you attempt to reinstall & reconfigure various things.  Mitigate those risks by using old software that works and has a proven track record.  Most won’t listen to this paragraph and I don’t blame them.  Flex 4 skinning does help a lot in the “make this design work” department, although, no one talks about the overhead costs of putting an icon in a Spark button, and all the other simple things that are now more complicated.

Software traditionally outlives its original expected lifespan by 3 times or more.  One of the reasons is that it works.

Continuous Integration

Continuous Integration is a gigantic topic, and actually encapsulates a lot of other sub-topics I’ve already mentioned.  I just want to focus on the parts of it that I found extremely helpful and harmful if not followed.

The first is checking in your code often.  If you don’t know how to merge code, learn.  If you don’t know what tool to use, use BeyondCompare.  If you’re on a Mac like me, use CrossOver to make BC work (I’ve been using that setup for 4 years).

Even on a 2 man team, merging and testing your changes can take a lot of time, even if you and the other developer are on completely different sections of the code base.  On one project, we actually set a specific merge day on the 2nd week of our 2 week Sprint to ensure we planned for this disruptive day.  We’d both merge frequently, but we’d still ensure we didn’t merge much more after that since it can wreak havoc on UAT day.

Regardless of what source control system you use, you NEED to implement tagging.  This doesn’t need to be a team thing at all; just give 1 person responsibility with this important task.  Every UAT day, they need to tag the build.  Developers in general should make their own tags and branches, but sometimes the code isn’t in a risky spot and you don’t end up doing it.  That’s ok.  You should still have Sprint tagged builds so you have SOME saved point to confirm/deny major bugs/changes against.

Test Driven Development

I won’t go into detail here, but I’ve seen the positive benefits on my Service layer, or when I’m writing a library for others to use.  For both, you actually create the API first and it feels immediately relevant.  Additionally, since the Service layer is often the weakest link in your application, it’s great to proactively find, early on, bugs that would have usually cost a lot of time.  That, and you have a dependable suite to test for changes that may negatively affect things later on.  Finally, you can test if the entire application’s communication system is working within seconds, quickly killing assumptions that add to debugging time.
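Roughly what “API first” can look like on a Service layer, sketched in TypeScript as a stand-in for ActionScript 3.  InventoryService, the Transport type, and the paths are all hypothetical, and the transport is synchronous to keep the example short; real Flash service calls are asynchronous.

```typescript
// Step 1 (written first): the call we wish we could make.
//   new InventoryService(fakeTransport).countInStock("widget") should be 3

// Transport abstraction so the test never touches a real server. Hypothetical.
type Transport = (path: string) => string;

// Step 3 (written last): just enough guts to satisfy the wished-for call.
class InventoryService {
    constructor(private send: Transport) {}

    countInStock(sku: string): number {
        const raw = this.send("/stock/" + sku);
        const count = Number(raw);
        if (raw === "" || isNaN(count)) {
            throw new Error("InventoryService: bad stock count '" + raw + "'");
        }
        return count;
    }
}

// Step 2: canned responses standing in for the back end.
const requestedPaths: string[] = [];
const fakeTransport: Transport = (path) => {
    requestedPaths.push(path);
    return path === "/stock/widget" ? "3" : "oops";
};

const inventory = new InventoryService(fakeTransport);
const widgetCount = inventory.countInStock("widget"); // 3
```

Because the transport is injected, the same service can be pointed at the real server later, and the fake stays around as the quick “is the communication layer even wired up?” check.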

What I haven’t resolved yet, and this is probably due to my inexperience with it, is how to manage the amount of code I have to refactor.  No longer just the components, but the tests often as well.  This is especially true with some of my more complicated GUI composition components where I test the controller layers.  Maybe it’s my inexperience with TDD & unit tests, maybe my tests just aren’t very good… I don’t know.  Like any problems with the GUI, I just blame the Designer.  That doesn’t make the re-factoring amount & time go away though.

Automated Build Systems

Automated build systems include things like ANT and Maven or even Rake.  This is the #1 question that sabotages my sales call when talking to Enterprise clients.  They want to hear me say, “Why yes, of course I’m a major proponent of automated build systems.”

I don’t, though.  ActionScript & Flex Dev isn’t Java, and is a lot simpler.  I’ve seen more team productivity lost to setting up & maintaining build systems.

… except for deployment.  When you’re deploying to various servers, and doing integration testing with different systems, I’ve seen TONS of value here.  I’ve only seen this work if you have a dedicated build person who doesn’t mind having a lower story point contribution.

If I can’t hit the run and debug buttons in Flex/Flash Builder, I perceive that as a serious problem.  I’ve seen many teams who can’t do this, and fire drills rage.  Once you solve that problem, you can start tackling other problems.  There are many who disagree with me, but they are nowhere to be found on my consulting gigs where I’m saving a project.

Automated Testing

The King James Bible says in the 10 Commandments that you shouldn’t covet what your neighbor has.  I’m an uber-sinner then because I’ve never worked on a team that had the chops, client permission, and/or resources to set up and maintain an automated testing server, yet I continually wish I had, or soon will.  I’ve been on teams where we talked about it, but it was never executed because we were too busy TRYING NOT TO DIE AND ACTUALLY SHIP SOFTWARE.

Hudson and Cruise Control look awesome.  That said, a lot of the people I’ve worked with don’t even write unit tests, so……..

Functional Testing Tools

As I stated, a lot of things in Flex are insanely visual; you can’t write unit tests to test visual accuracy.  That, and some of the states they get in are unwieldy to write tests around.  You can utilize functional testing tools to automate a lot of that… and man, seeing them in action is HOT!  They are wicked expensive as a result, and only the larger Enterprise companies employ them.  Here, too, is another thing that sabotages my sales calls with clients:

“You ever use Functional Testing tools?”

“No, but I’d love to, they look HOT!”

Conclusions

In my experience with Scrum, and Iterative Development in general, Technical Debt is the only thing I’ve seen that can really destroy the promises that Scrum is supposed to deliver on.  Thus, it’s been the one thing I’ve been focused on better understanding: how to identify, prevent, and fix it.  If you’re using Scrum, beware.  Twice I’ve seen Technical Debt make a huge negative impact on my projects.

There are a bunch of things you can do.

Think heavily upon the decision of coding something quick vs. planned out well architecturally.  There are pros and cons to each (OOP Purists will tell you nothing should be quick; they are wrong).  Sometimes you have zero clue if a feature is even relevant to your users, and it makes huge business sense to only make a quick foray vs. a huge, heavy architecture investment in case the user quickly scoffs at the functionality.  Patton called this a “calculated risk”.  Software IS War.  Other times, that quick implementation without any good architecture can have major negative consequences later down the line.  That’s ok, though, just get really good at refactoring.  Yes, it’s a skill you can learn.  Yes, refactor is a good word; it’s not the same thing as rewrite, and is integral to developing software well.

The other ones are ensuring you have a competent team, ensuring they follow commonly known practices, and tackling problems early.  Check your code in often and early.  If you don’t employ TDD, at least think about getting some unit test coverage on areas that continually are causing problems.

Remember, part of this recession was caused by people having massive debt they didn’t deal with, and ignored.  Debt is a valid way of doing business, and so is software, but you need to see some eventual value on that investment, and not let your debt get out of control.

If you ignore everything I said above, just remember 3 things: encapsulation, refactor, and use a damn good logger.

3 Responses

  1. Thank you for this post. I recently went through a project the aftermath of which essentially shook me to my core, and this article establishes a language I can use to analyze that situation and better understand why it happened.

    One of the best dev-related articles I’ve read all year. Brilliant.

  2. Great post. This is the main reason why I don’t do website development anymore and more Software driven. I don’t know if its just me, but I have been called in to work on , what I would call, dead end projects. Meaning that the previous developer didn’t follow some of the basic practices, error checking,encapsulation, or basic structure in the code, basically following the mythology of “Throw it and see if it sticks”. I understand that some cases OOP is out of scope, or you may not have the time, but I have yet to see any cases like that with any project I’ve done and if you are packaging this off as a production ready application some thought needs to be made.

    In regards to the bulk of this post, Technical Debt, through the years I have learned when its good to do a quick and dirty solution ( just to see if it works), then to spend a great deal of time to just have the client remove the functionality/use case. Its frustrating, but I have to agree with you :
    “it’s ok if your API is good and the code underneath it sucks”

    If your API is great or half decent, when the need to refactor comes into play, its not as painful. This is one thing I’ve learn over the years, and boy it has saved me alot of headaches.

    Your post has me looking more into Scum development.

  3. Ted

    Been dipping into these articles over time and it is interesting to see some common scenarios and a vocabulary for them too that is not often covered in the places I read so thanks for that. Scrum clearly has many appealing concepts and I can see why it is valued it. One observation I would stress though is the importance of leadership in teams. My experience is that quality, strong, PMs and senior devs willing to show leadership are the most important key to a sane work environment and successful project. In the end there are great people I would follow whatever their plan and doofuses I would run away from even if their plan was perfect.