Sunday, 29 September 2013

Eliminate before automating

I've encouraged in my earlier texts to push for more automation, because it typically helps to create products faster and with better quality. Therefore, it's beneficial for the business to encourage automation. However, there's one more thing that has even more value for the business - eliminating tasks that don't provide enough value for the business.

Traditionally, in SW development we focus on improving the way we develop, test and distribute SW. And with automation we help the process to be faster. With automation we are working harder, but not smarter.

When we develop SW too hastily, we create technical debt into our system. To keep our system in good shape, we should be cleaning it up every now and then, refactoring the code and dropping out features that are not used anymore. We need to do the same also for our product development process, refactoring the existing steps and above all, eliminating steps, tasks and items that don't provide additional value to the business but are slowing down the product development.

As an example, if we have been developing our products for years, our bug handling process has collected a lot of things into itself. Typically, some stakeholder makes noise that a certain piece of information would be vital to have in the bug report form, and that piece of information is added to the process without evaluating the real cost of adding it. This scheme is repeated over the years and we end up with a bug reporting process nobody is happy about. Still, none of the parties is ready to give up any of the bits and pieces each has been able to include in process.

Calculating costs for each item in the process might be difficult. Therefore, better approach to keep the process lean is to start from clean table and include only those steps and items that can be shown to have significant value. The team that succeeds in doing that well will have foundation for making innovative products, saving time by eliminating meaningless tasks.

Saturday, 21 September 2013

Creating a decent submission gate

In continuous integration (CI), a well-functioning submission gate is crucial (and it's important also in non-CI process) . Submission gate is the criteria which has to be passed in order for a change to be accepted to the common codebase. Note the difference: this is about single changes to be accepted to a baseline, while definition of done is about acceptance of a feature. Basically, all features are composed of several changes.

Key characteristics for a submission gate criteria:

  1.  It should prevent the breakage of the common codebase
  2.  It should be fluent and swift to use
  3.  It should be reliable


For preventing the breakage, we need to have good enough coverage. However, coverage is limited by the need to make the entries through the gate quick. Therefore, we need to pick an optimal set of activities. These should include:

  • Static code analysis for detecting e.g. possible memory leaks
  • Tests for frequently used features, breakage of which would prevent the use of the baseline for many
  • Tests for areas where the breakage could prevent a major part of further and more expensive testing
  • Tests for easily broken features
  • Unit testing done beforehand
  • Code review
  • Creation of change metadata


Quite a set and all these need to be fast! For static code analysis there are good tools available, which usually provide reports that enable quick fixing of upcoming problems - so those are very useful! Code review is very important and, if organized properly, an inexpensive way to discover problems that testing can't typically find.

Change metadata refers to all formalities that are related to change management, e.g. creating a ticket to the change management tool. This is often a heavy part of the process and should be optimized to make the creation of changes fluent while serving the management with enough information.

Tests need to be selected based on the above criteria, need to be automated (for fluent use) and quick. But we need to remember that the tests need to be also reliable! That's a major challenge as there are many things that could fail, e.g. bugs in test scripts, or failures in SW/HW environment. We will never have ~100% reliable tests (unless we test only very simple things), and therefore we need to be prepared for random failures. What should we do when a test fails?

  • Discard the change which seemed to break the test, if we rely on the test and our analysis on the results support the view.
  • Run the test again a few times to check if it is a random failure. How many times is enough? Do we have time and resources for retesting? Random failure may be caused also by the change at hand, so we need to run further tests also for older SW stacks in our codebase. We may also classify the failure as random if it has already appeared in earlier test runs.
  • Same failure has appeared occasionally already before - let's report an error and get a high priority for fixing it. Perhaps we should even drop the test until we receive a fix? Running a failing test is not sensible, it just grows irritation in everybody. On the other hand, problem should be fixed quickly as while the test is out of use, or random failures present, new errors causing additional failures in the test may enter our codebase without us noticing it.


A lot of things that we should be doing in designing and operating a submission gate. It will never be perfect, we will suffer in either speed, coverage or reliability. So we need to aim to make it decent. However, the most important aspect for a submission gate is always fast feedback, because good coverage is more a requirement for further testing.

Friday, 13 September 2013

Why continuous integration won't succeed?

Continuous integration (CI) as a principle is key part of agile SW development. It is in practice mandatory in order to keep the asset in shape for potential shippable product at the end of the sprint. But there are many hurdles for CI to succeed. Here are some I have faced.

First, there might cultural reasons why developers may not be used to make small changes. This will happen if integration is provided for developers as a service. Thereby, it's no problem for the developer to have a long-living branch with no updates from other development during coding the change, and then hand it over to integrator that will then feel the pain of merging it to SW stack. In addition, developer will in this scheme avoid taking the "bad quality" changes of others into his/her development environment.

Second, there might be also practical reasons, not just culture. How easy it is to merge latest changes to own branch? We need to have good tooling for merges, and new baselines provided at least daily. How much effort there is to deliver a change? Unit testing, reviews, builds, integration testing and bureaucracy may all require such an effort that delivering small changes is not efficient SW development. What is the quality of baselines? Those need to be trustworthy, testing and analysis should be quick but have enough coverage to enable the capturing of most common failures, and baselines not fulfilling the tight release criteria shouldn't be published.

So we need to carefully look into our processes and think how well those support CI culture. Building and testing of changes need to be automated and swift, reviews need to have enough priority within the team and bureaucracy minimized. Ideally, developer's effort before submitting the change shouldn't take more than half an hour, and the complete SW build portfolio and acceptance testing in integration another half an hour. This is not the optimum, but bare minimum that would keep the CI process fluent. Tooling for the process needs to be intuitive to use. If our builds are done nightly or getting results from tests take hours, we have still a lot to do for getting into continuous integration.

Friday, 6 September 2013

Version numbering means trouble for continuous integration

In continuous integration (CI), the aim is to push new changes quickly to the SW stack, pre- and postevaluate the changes through analysis and testing, making each of the changes a release candidate. For a release, we would need some simple identifiers. The most typical identifier is a version number. However, in CI incrementing version number can be painful, especially if it is defined compile-time.

First, let's look at the reasons why we need releases. Ultimately, a release is needed to deliver the SW to customer. However, releasing is beneficial also for internal purposes:

  • Release points out a baseline on which next changes are built on. This is important if testing in the CI process is not happening promptly and we want to avoid the situation that developers are using a bad candidate as a base for their changes.
  • Release simplifies the build configuration, in the form of baseline.
  • Release points out our latest and greatest SW to stakeholders outside SW development, e.g. verification.


Typical identifiers used for a release are:

  • baseline (label)
  • some identifier for the candidate
  • version number

Baseline identification is defined when the baseline is created, and can be thus freely selected based on scheme we have defined for the purpose. The scheme should be simple using a short body and a running number. Candidate will get its identifier when it's submitted, also based on a predefined scheme which should be simple.

But version number is difficult in case it is defined compile-time, i.e. when the candidate is submitted, and we would need to know already then what version number is allocated. Knowing it at that time would be difficult if release frequently (daily) and can't be sure which content ends up in the previous release.

If we redefine it when baseline is created, we need to recompile the SW and test it again, taking a lot of time if we have an extensive set of compilations and tests for baseline selection. Without testing we'll have a risk that something gets broken. Sometimes there's an alternative to hack the version number in the binaries and thus avoid recompilation, but the risk that the SW gets broken is still there.

Ok, so could we then make a meaningful selection for the version number already when the candidate is built? Yes we can, at least sometimes or even most of the time, but not always. And every time we have wrong version number, our CI process gets into trouble and SW distribution delayed. Version numbering should enter the game only when we are getting near the time that a release will be provided also to customer.

Best alternatives would be to have a version numbering which is not done compile-time but later, or meaningful identifiers for the candidates. The latter would mean a completely new SW philosophy for any mature organization that has been using version numbering as the primary means of identification for years, and the resistance will be furious by some people. However, open-minded developers familiar with agile ways of working will understand that by avoiding compile-time version numbering we'll keep our CI process fluent.