Manuel Aldana | Software Engineering: blog & .lessons_learned (Continuous Integration posts)

Unit-Testing: Situations when NOT to do it

Sun, 06 Feb 2011, by manuel aldana
http://www.aldana-online.de/2011/02/06/major-unit-testing-pitfalls-and-anti-patterns/

I am a big fan and practitioner of automated unit-testing, but over the years I have learned my lessons. Having started with the attitude that "everything has to be automatically tested", I ran into situations where unit-testing is not the optimal approach.

The following sections revolve around my two favorite test-smells:

  1. Brittle tests: The test fails although the functionality hasn't changed. The test should show green but in fact shows red (a false positive).
  2. Inefficient tests: The effort of writing automated tests doesn't pay off at all. The benefit/cost ratio (short + long term) is extremely low.

Unit-Test little scripts/tools

It often makes no sense to write unit-tests for little scripts or tools that are one- or two-liners. The script content is already so "declarative", short and compact that the code is too simple to break. Furthermore, stubbing or mocking the dependencies is often tough (e.g. writing to stdout or a file, shutting down a machine, doing an HTTP call). You can end up writing an external-system emulator, which is overkill in this situation. Testing is still important, of course, but here I go the manual way (executing the script and smoke-testing the outcome with a sanity check).

Unit-Test high level orchestration services

Orchestration services have many dependencies and chain-call lower services. The effort of writing unit-tests for them is very high: stubbing/mocking all these outgoing dependencies is tough, and the test setup logic can get so complex that your test-code becomes hard to read and understand. Furthermore these tests tend to be very brittle, e.g. minor refactorings of the production code will break them. The main reason is that you have to put a lot of implementation-detail knowledge into the test-code to make stubbing/mocking work. You can argue that having many fan-out/outgoing dependencies is a bad smell and you should refactor from the start. This is true in some cases, but higher-level services by their nature orchestrate lower ones, so refactoring won't change or simplify much and can even make the design more complicated. In the end, for such high-level services I much prefer to cover them with automated or non-automated acceptance tests.

Test-first during unclear Macro-Design

When implementing a feature from scratch the macro-design is often blurry; I like to call this phase "diving-in". During diving-in development or quick prototyping you get a feeling for which design fits and which doesn't. In this phase class structures and interactions change a lot, and sometimes big chunks of code are thrown away so you can start again. Such wide-ranging code changes and deletions will often break your tests, and you have to adapt or even delete them. In these situations the test-first approach doesn't work for me; writing test-code even distracts me and slows me down. Yes, unit-tests and the test-first approach can and should guide your design, but in my experience this counts more once the bigger design decisions have been settled.

100% Code-Coverage

I can't overstate this: Code-Coverage != Test-Coverage. The code-coverage of unit-tests is a nice metric to spot untested areas, but it is by far not enough. It only tells you that the code has been executed and completely ignores the assert part of your test. Without proper asserts, which check the side-effects of your production code against the expected behaviour, the test gives zero value. You can reach 100% code-coverage without having tested anything at all. In the end this false sense of security is much worse than having no test at all! Furthermore, 100% code-coverage is inefficient because you end up testing a lot of code which is "too simple to break" (e.g. getters/setters, simple constructors and factory-methods).
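To make this concrete, here is a minimal sketch (the PriceCalculator class and its method are invented for illustration): the first test executes every line of applyDiscount() and therefore yields full coverage for it, yet it can never fail; only the second one actually verifies behaviour.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorTest {

  // hypothetical production class, only here to keep the sketch self-contained
  static class PriceCalculator {
    double applyDiscount(double price, double rate) {
      return price - price * rate;
    }
  }

  @Test
  public void coverage_without_verification() {
    // walks through all lines of applyDiscount() -> 100% code-coverage,
    // but without an assert this test stays green no matter what is returned
    new PriceCalculator().applyDiscount(100.0, 0.2);
  }

  @Test
  public void discount_is_subtracted_from_price() {
    // the assert is what turns code-coverage into test-coverage
    assertEquals(80.0, new PriceCalculator().applyDiscount(100.0, 0.2), 0.001);
  }
}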

Summary

The above points shouldn't give you the impression that I speak against automated unit-tests; I think they are great: they can guide you to make incremental changes and help you to focus, you develop an affinity for green colors ;), they are cheap to execute, and regression testing gives you the security of not breaking things and more courage to refactor. Still, the attitude that you have to reach 100% code-coverage and test every code snippet will kill the testing culture and end up in red-green color blindness.

From Java Easymock to Mockito

Sun, 27 Jun 2010, by manuel aldana
http://www.aldana-online.de/2010/06/27/from-java-easymock-to-mockito/

While browsing through the test-code of the open-source project Sonar I noticed package imports from the Mockito namespace. The mocking test-code looked similar to EasyMock but less cluttered and more readable. So I gave Mockito (version 1.8.3 back then) a try when implementing new test-cases and did not regret it :).

Easymock before

Around 2005 there were several mocking frameworks available. The main reason I chose to work with EasyMock was that it was both powerful and refactoring-friendly. It supports safe automatic refactorings well because expectations on method calls aren't set up as loose string-snippets but on statically typed information (method-call expectations are bound directly to the object type).

Though I found EasyMock great and it made stubbing and mocking much easier than before, it had some drawbacks (speaking of version 2.5; a sketch of the typical ceremony follows after this list):

  • The mocking/stubbing of interfaces vs. classes is not transparent. It is done through different main classes (EasyMock + classextension.EasyMock). Mixing mocks of interfaces and classes inside one test-class therefore led to cluttered code and import hell.
  • The error messages of EasyMock are sometimes confusing. Often it is not clear whether the test-case has failed or EasyMock was used incorrectly (e.g. forgetting to call replay()).
  • The mandatory call of replay() after setting up the mocked object always felt redundant and made test-cases longer.
  • There is no clear separation between setting up a mock and verifying it. Setting up a mock also added a verification on all expectations as soon as you called verify(). When writing and reading test-code this always confused me, because you already have to cope with verification logic in the setup part of the test-case.
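For illustration, a typical EasyMock test skeleton looked roughly like this (the TaxRepository/PriceService names are invented, this is just a sketch of the ceremony): note the replay() call and the fact that the expectation recorded during setup doubles as the later verification.

import static org.easymock.EasyMock.*;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class PriceServiceEasymockTest {

  public interface TaxRepository {
    double taxRateFor(String country);
  }

  public static class PriceService {
    private final TaxRepository taxes;
    public PriceService(TaxRepository taxes) { this.taxes = taxes; }
    public double gross(double net, String country) {
      return net * (1 + taxes.taxRateFor(country));
    }
  }

  @Test
  public void adds_tax_to_net_price() {
    TaxRepository taxes = createMock(TaxRepository.class);

    // setup: the recorded expectation is at the same time the later verification
    expect(taxes.taxRateFor("DE")).andReturn(0.19);
    replay(taxes); // easy to forget, which leads to confusing error messages

    double gross = new PriceService(taxes).gross(100.0, "DE");

    assertEquals(119.0, gross, 0.001);
    verify(taxes); // verifies all recorded expectations in one go
  }
}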

Mockito after

The Mockito guys say they were inspired by EasyMock, and indeed you can see its heritage. After having used it for about 3 months, my hands-on impressions are great and I now use Mockito exclusively for writing unit-tests.

My positive experiences were (a Mockito version of the sketch above follows after this list):

  • Test-code is still safe with regard to statically-typed automatic refactorings.
  • Transparency of classes vs. interfaces: in both cases you call Mockito.mock(MyInterface.class) or Mockito.mock(MyClass.class).
  • Clear separation between setting up a mock and verifying it. This feels more intuitive, and the clear setup/exercise/verify order of the test-code is preserved.
  • Helpful error messages when an assertion wasn't met or the tool detected a framework usage error.
  • The naming of methods is intuitive (like when(), thenReturn()).
  • Where I earlier used real domain-objects as test-data (i.e. filling data through setters/constructors), I now use Mockito to stub them (i.e. stubbing the getters). Domain code logic now has much less impact on test-runs.
  • Nice, short, straightforward documentation.
  • A great name + logo ;)
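For comparison, here is the same invented TaxRepository/PriceService example written with Mockito: no replay() step, stubbing happens with when()/thenReturn(), and interaction verification is a separate, explicit call at the end.

import static org.mockito.Mockito.*;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class PriceServiceMockitoTest {

  public interface TaxRepository {
    double taxRateFor(String country);
  }

  public static class PriceService {
    private final TaxRepository taxes;
    public PriceService(TaxRepository taxes) { this.taxes = taxes; }
    public double gross(double net, String country) {
      return net * (1 + taxes.taxRateFor(country));
    }
  }

  @Test
  public void adds_tax_to_net_price() {
    TaxRepository taxes = mock(TaxRepository.class); // same call for interfaces and classes

    when(taxes.taxRateFor("DE")).thenReturn(0.19);   // setup

    double gross = new PriceService(taxes).gross(100.0, "DE"); // exercise

    assertEquals(119.0, gross, 0.001);               // verify the state ...
    verify(taxes).taxRateFor("DE");                  // ... and the interaction, explicitly
  }
}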

In summary: the Mockito folks did a great job (they took the nice ideas from the EasyMock creators and fixed its drawbacks). Looking at old test-code that uses EasyMock, I subjectively need much more time to grasp the intent of the test. With Mockito the test-cases read more like a clear, sequential "requirements story", the way test-cases always should.

Migration of test-code

If you are already using EasyMock the tool switch is amazingly quick. The following migration path helped me:

  1. Give yourself and your colleagues around two weeks to invest time working with the tool and getting comfortable with it. Write all your new test-classes with Mockito.
  2. If you like it, make the switch: explicitly communicate that using the old mocking framework is deprecated (if possible use static code-analysis tools where you can mark the respective packages (org.easymock.*) as deprecated). From now on using Mockito for new test-classes should be mandatory.
  3. If you already have a big test-codebase I do NOT recommend a big-bang test-code migration. Such migration work is time consuming and boring. Taking the incremental approach is better: only migrate EasyMock code to Mockito when you touch the class anyway, i.e. when modifying or adding test-cases.

Looking at the test-migrations I have done so far, migrating EasyMock code to Mockito is quite straightforward: get rid of all replay() and verify() calls and adjust to the slight API changes. The only thing you have to watch out for is the explicit verification of mocked calls. EasyMock implicitly verified all expectations when you called verify() on the mock-object; on the Mockito side you have to call a verification explicitly for each method. The same goes for strict mocks: you have to add the respective verifications.

Parameterized test-methods with TestNG

Sun, 19 Apr 2009, by manuel aldana
http://www.aldana-online.de/2009/04/19/parameterized-test-methods-with-testng/

TestNG offers many great features and is definitely more capable than JUnit for building a strong automated test-suite. When writing test-cases one important factor is the handling of test-data. With JUnit it is cumbersome to feed different test-data to the same test-code; TestNG solves this much better.

Let's look at a very simple example. When trying to test an exception with different test-data in JUnit 4, I would need to write something like:

class MyTest{

  @Test
  public void throw_exception_if_wrong_input()
  {
    try
    {
      new Foo("test-data1");
      fail();
    }catch(IllegalArgumentException iae){}
   
    try
    {
      new Foo("test-data2");
      fail();
    }catch(IllegalArgumentException iae){}
   
    try
    {
      new Foo("test-data3");
      fail();
    }catch(IllegalArgumentException iae){}   
  }

}

This code has essential problems:

  1. Because I want to avoid writing two more test-methods (essentially the same production code is triggered), I am putting three test-cases into one test-method, which is bad practice: the test cases don't use the setUp and tearDown facilities and isolation cannot be guaranteed. That is also the reason I could not use the expected parameter of the JUnit 4 @Test annotation. The alternative is to really use three test-methods (a sketch of that alternative follows after this list), but code readability then suffers.
  2. Even though the above example is very simplified (only new Foo() is called), the test-code is not expressive. Surely you could improve this by extracting a method and giving it a good name, but it is still a bit blurry why we do so and that it is only about using different test-data.
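For reference, the "three separate test-methods" alternative mentioned in point 1 would look roughly like this in JUnit 4 (the Foo stub below only mimics the article's hypothetical class under test): each case runs isolated and the expected parameter replaces the try/catch, at the price of obvious duplication.

import org.junit.Test;

public class WrongInputTest {

  // stand-in for the hypothetical Foo from the example above
  static class Foo {
    Foo(String input) {
      throw new IllegalArgumentException("wrong input: " + input);
    }
  }

  @Test(expected = IllegalArgumentException.class)
  public void throw_exception_for_test_data1() {
    new Foo("test-data1");
  }

  @Test(expected = IllegalArgumentException.class)
  public void throw_exception_for_test_data2() {
    new Foo("test-data2");
  }

  @Test(expected = IllegalArgumentException.class)
  public void throw_exception_for_test_data3() {
    new Foo("test-data3");
  }
}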

TestNG parameterized test-methods

TestNG does this better and builds the "test-case differs only in test-data" situation into the framework. This is done by creating a DataProvider and passing parameters to the test-method. This way the code gets more expressive and each test-data set is executed as an isolated test-case. Here is the TestNG version of the above test-cases (I followed the code-centric way; you can also configure your DataProvider through an XML file):

class MyTest{

  @DataProvider(name = "wrong_input")
  public Object[][] createData()
  {
    //2 dimensions
    // x: data-set for one test-case
    // y: set of parameters (a test-method can take multiple parameters)
    return new Object[][]{
        {"test-data-1"},
        {"test-data-2"},
        {"test-data-3"}
    };
  }

  @Test(expectedExceptions = IllegalArgumentException.class,
        dataProvider = "wrong_input")
  public void throw_exception_if_bad_input(String input)
  {
    new Foo(input);
  }

}

Avoiding xUnit test-errors (false positives, false negatives)

Mon, 16 Jun 2008, by manuel aldana
http://www.aldana-online.de/2008/06/16/xunit-test-errors-false-positives-false-negative/

You use unit-tests to ensure that production code works as defined or specified from the class-level view. This way you get feedback that your implementation either works as wanted (green bar = success) or not (red bar = fail). Unfortunately tests are man-made work too and can contain bugs. The following article shows what kinds of test-errors exist and what preventive actions can be taken.

Annoyance of test-errors

Test-errors are very annoying because your tests should be the impartial authority that says whether your production code works or not. If you cannot rely on them and they are constantly lying to you, you quickly get the impression that tests don't help you but merely slow down your work. In fact I have noticed that many developers new to unit-tests get frustrated by test-errors and stop writing them. This way they lose the advantages of a good test-suite and of Test-Driven Development, which is a shame. Test-errors roughly come in two forms: false negatives and false positives.

False positive test

A false positive test gives you a failure although the feature behaves correctly. You change or create some code, make everything compile, but the corresponding test gives you a failure. After debugging for a while you see that the test made wrong expectations about the feature, so that your test is inconsistent with the specified behaviour.

Very simplified example:

public static int sum(int x, int y){
  return x+y;
}

public void testSum(){
  assertEquals(3, sum(1,1)); //fails though class under test is alright
}

I sometimes experience false positive tests when using mock-frameworks (like EasyMock), because when injecting mocks into the class under test you expose some bits of implementation detail to the test by recording behaviour on the mock. After changing the implementation of the class under test slightly, the calls to your mocks may change as well. Without adjusting your mocks correspondingly your test will most likely fail.

Another frequent cause for a false positive is a wrong test setup, e.g. you pass wrong instances (or none at all) to the class under test, so it behaves differently than expected from the test case's view. Example: you pass an implementation to the class under test which connects to a file-database and wants to read data. The database is not available and the test fails. Here you made a test-setup mistake: you actually had to pass a persistence stub which returns appropriate values for the test.

Furthermore your design could be too tightly coupled, so that your class under test has many dependencies on other classes. Many other instances are then called indirectly and it is difficult to isolate your "real" test case. A change in the dependent classes makes the test fail, although the class under test itself did not change and still behaves fine.

False negative test

A false negative test reports a success message that everything is alright, but in fact it is broken. All your test expectations (=asserts) pass, but the production code actually has a bug.

Same simple example:

public static int sum(int x, int y){
  return x+1; //bug: should be x+y
}

public void testSum(){
  assertEquals(2, sum(1,1)); //succeeds though class under test has a bug
}

What I often see in false negative tests is that the assert statements are formulated too weakly and don't claim enough from the class under test. In many test cases you find a non-saying assertNotNull(returnObject), and the properties of returnObject are not checked in any more detail.

Preference of test-error

The lesser evil is definitely the false positive. Here I get instantly notified that there IS a test-error. With a false negative test I don't get notified at all, and the overall 'good' feeling of having tests in place is very deceptive: you attribute more quality to your production code than is actually there.

What to do about test-errors

Awareness of both test-error categories is a good first step, but how can you avoid test-errors generally?

Suggestions to avoid false positives

An obvious cause for a false positive can be your assert statements: maybe they are just plain wrong because requirements have changed and the class under test has been adjusted accordingly. To avoid such assert-statement mistakes you should develop test-driven: when a requirement has changed, first check whether the feature has a corresponding test case and adjust it; after that change your production code. This way you avoid specification inconsistencies inside your test cases.

Looking at your production code, maybe your design is too tightly coupled and your tests cover just too many classes. In that case you should consider introducing dependency-injection to make it possible to pass in alternative test-implementations (see the sketch below). Or maybe you have a monster method and "feature isolation" is not possible either? Here you should consider extract refactorings (method, interface, class) to achieve isolation and loose coupling and to make stub/mock injection possible.
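As a minimal sketch of what such a refactoring enables (all names are invented): once the persistence dependency is injected through the constructor, the test can pass a hand-written stub instead of the real file-database implementation from the setup example above.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class ReportServiceTest {

  public interface CustomerRepository {   // invented port to the persistence layer
    int activeCustomerCount();
  }

  public static class ReportService {     // invented class under test
    private final CustomerRepository repository;
    public ReportService(CustomerRepository repository) { this.repository = repository; }
    public String headline() {
      return "Active customers: " + repository.activeCustomerCount();
    }
  }

  @Test
  public void headline_contains_active_customer_count() {
    // test-only stub, no database involved, so an unavailable DB cannot cause a false positive
    CustomerRepository stub = new CustomerRepository() {
      public int activeCustomerCount() { return 42; }
    };
    assertEquals("Active customers: 42", new ReportService(stub).headline());
  }
}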

Suggestions to avoid false negatives

Most likely your assert statements are too weak and you don't use enough test-data. Consider adding asserts and invoking the class under test with alternative input values (see the sketch below). The assertNotNull(returnObject) assert is often a good start to check whether the returnObject has been initialized at all, but if so, more and "stronger" asserts should follow. Always remember: apart from your application throwing an unexpected exception during the test-run, your test case will only fail if one of your asserts evaluates to false!
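A small sketch of what "stronger" asserts mean in practice (the Order type and the mapping method are invented): the first test only proves that an object exists, the second pins the behaviour down.

import static org.junit.Assert.*;
import org.junit.Test;

public class OrderMappingTest {

  static class Order {                     // invented return type
    private final String customer;
    private final int itemCount;
    Order(String customer, int itemCount) { this.customer = customer; this.itemCount = itemCount; }
    String getCustomer() { return customer; }
    int getItemCount() { return itemCount; }
  }

  private Order mapOrder() {               // stands in for the real call to the class under test
    return new Order("ACME", 3);
  }

  @Test
  public void weak_assert_only_checks_existence() {
    assertNotNull(mapOrder());             // passes for almost any implementation
  }

  @Test
  public void strong_asserts_pin_down_the_mapping() {
    Order order = mapOrder();
    assertNotNull(order);
    assertEquals("ACME", order.getCustomer()); // claims concrete expected values
    assertEquals(3, order.getItemCount());
  }
}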

Improving weak automatic test-suites incrementally

Mon, 12 May 2008, by manuel aldana
http://www.aldana-online.de/2008/05/13/improving-weak-automatic-test-suites-incrementally/

A key element of good-quality software is a good automatically-run test-suite, which contains both unit and integration tests. As Frederick P. Brooks already mentions in his book 'The Mythical Man-Month', developers (like other humans) are far from perfect. Since software often needs to be written 100% correctly from the outside view (a single missing or wrong statement in the execution path can make a feature buggy), we need a reliable way to get quick feedback on whether something works or has been broken. A powerful way to accomplish this is to check behaviour with automatically running test-cases. Though this practice is already state of the art, you often find systems with a weak test-suite. This article shows why the feedback loop is so important and how to improve your test-suite by building a testing culture and introducing test-cases step by step.

Tests give you invaluable feedback

In the following situations the feedback of tests is very helpful or even essential:

  • Existing tests tell you that you didn't break existing behaviour after making adaptations. These regression tests show you a green bar if a code change did not affect working features.
  • Tests help you to introduce new features. Writing new tests is done in parallel with touching production code. This way you can concentrate on the single feature, and you also get a quick response on whether the feature is implemented correctly (green bar) or not (red bar).
  • Tests help you to reproduce a bug. You write a new test case which shows you the bug as a red bar. With this you have a very good starting point to debug and fix it. When it is fixed you get a positive response (the green bar shows up).

The dilemma of weak test-suites

Of course one single test-case won't do the trick. Your aim should be to have many test-cases which together form a strong test-suite. Unfortunately test-suites are often weak; especially when changing code they don't give you a red bar if something has been broken. The reasons why a test-suite is weak can be manifold:

No practice of testing, too few tests

The view of tests as central "development drivers" is not practiced. Things are implemented and afterwards roughly tested in a manual way. There is no coding of automated tests which prove that bug-fixes are done correctly or that features work as specified. Test-code is seen as not-so-important stuff, because it does not show up in the production code and therefore appears to give no real additional value to the customer.

Tests aren’t executed regularly

Tests which are part of the test-suite exist, but their execution is not part of the build. Thus they are hardly ever executed and are only triggered manually. Often this is the case because the tests need human intervention and cannot be run automatically by a build server or similar. It can also be that you have a completely automatic test-suite but it is simply not common to execute it often. If tests aren't executed regularly (e.g. once a day, after each commit), you lose the advantage of quick feedback: running the tests after a week or a month, you don't know which code change caused the test failures or errors. After a while developers become reluctant about tests, get used to red bars and at some point simply ignore them. In the end the tests won't be maintained and your test-code becomes more of a burden than a help. Finally your test-code decays into dead code.

Missing or very expensive tools

For test automation the proper tools are very important. Nowadays you get a lot of good open-source test tools (Fit, xUnit frameworks). Nevertheless, for some special areas it is not trivial to automate certain testing steps, especially in integration testing (e.g. setting up and validating smart-card contents, or testing message-oriented/asynchronous systems in general). Some vendors provide more advanced testing tools, but these are either very expensive and/or cannot be extended for special problems.

The dilemma solution

The above-mentioned obstacles to a good test-suite are often hard but can be tackled. First of all everyone in the team needs to understand that tests aren't delivered to the customer but are inherently important to the development process. You don't ship your favorite IDE to the customer either, yet you couldn't create software without it, could you? Continuous testing needs to become a new habit, so give the team some time to dig into this, e.g. by planning trainings for JUnit or test automation in general. If the team starts to value tests, there may still be the problem that there are hardly any automated tests yet, which can be very demotivating (the 'where to start...?' issue). Nevertheless you can start by writing tests for all new features you implement or for areas you want to refactor. When you know about a bug, always reproduce it with a test case first. After doing this for a while you will see your backing safety-net grow bigger and bigger.

Of course the test cases of the test division won't be replaced by the newly introduced test cases of the developers. There are still many tests which are difficult or impossible to automate (like user-acceptance or GUI tests). So see developer testing as a supplement to, not a replacement for, the "traditional" tests. The testers could also help to improve the automated integration tests, because they are more black-box focused. In a very mature, test-focused process a tester would for instance attach a reproducing integration test to a bug report.

As mentioned, tests that aren't executed will some day end up as dead code. To avoid this, introduce triggers (e.g. nightly, after each commit) to run your test-suite. See the test runs as part of the software build itself. This view is very powerful and is also recommended by the Continuous Integration concept. If they are run regularly, developers will soon start to appreciate the tests and will have "fun" extending the test-suite. Looking at the visible green bar, or getting rid of a red one, can be very motivating.

In former times test tools were often something for the people in the test division. That picture has changed a lot. Automated tests created directly by the developers are state of the art and are supported by many useful open-source tools (FitNesse, xUnit, DbUnit, mock frameworks, soapUI etc.). They are extensible and can be combined into an even more powerful testing toolset. Furthermore, Continuous Integration servers for triggering and running tests are mature, free and extensible (CruiseControl, Continuum, Hudson etc.).

Summary

Generally, the developers' view of tests has changed a lot. Especially since the rise of unit-testing, new integration-test tools and the concept of Continuous Integration, test-creation by developers has become much more common. Even if your organization has not introduced this practice, you can do so by building up your test-suite incrementally and executing it regularly. The harder part will be getting the developers into a testing routine/culture, but this effort will be far outweighed by the benefits.

Tests: Why code coverage is not enough

Tue, 18 Mar 2008, by manuel aldana
http://www.aldana-online.de/2008/03/18/ensure-test-suite-quality-and-beware-of-the-green-bar/

To benefit from the principle of Continuous Integration you need a good test-suite which includes both regression and new-feature tests. Regression tests focus on features which are already implemented and shouldn't be changed or influenced by the latest committed changes. As software evolves, such regression tests play a major role, because the set of "old" features is far bigger than the set of new ones. One of Continuous Integration's key features is to run tests automatically and report the results to us, so we are notified when something has been broken and can react quickly. This article discusses the dangers of relying too much on the green bar of your test-suite results.

Common Continuous Integration tools (e.g. CruiseControl, Continuum) run unit and integration tests and report the outcome (error, failure, success) with a push mechanism like e-mail or a messenger client. As a developer you usually only prick up your ears when you see a red bar. When a green bar shows up, or the reporting tool even filters out success outcomes, we can get into a severe problem: we think everything is alright because all tests passed, but in fact we stepped into a false-negative trap and part of the system is broken. The test-suite quality isn't as good as we thought and software quality decreases invisibly. Usually an organisation tries to prove the test-suite's quality by using a coverage tool like Cobertura or Clover. These are quite a help for seeking out new test cases, but they expose a big problem.

Regarding the Four-Phase Test pattern (explained in the superb book xUnit Test Patterns by Gerard Meszaros), code coverage shows us that certain parts of the production code are executed in the Setup, Exercise and Tear-Down phases of a test. Code coverage thus clearly fails to address the "quality" of the Verify part of the test. When using code coverage as the only test-suite quality metric, we can still have bad test quality: though our tests walk through wide areas of the production code, they don't tell us what we really expect from it, which is usually expressed with assertions.

[Figure: the Four-Phase Test pattern: Setup, Exercise, Verify, Tear-Down]
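As a quick generic sketch of the four phases (not tied to any concrete class): a coverage tool only registers that the Exercise step walked through production code; whether the Verify step claims anything meaningful is invisible to it.

import static org.junit.Assert.assertEquals;
import org.junit.After;
import org.junit.Test;

public class FourPhaseExampleTest {

  private StringBuilder fixture;           // placeholder fixture

  @Test
  public void appends_text_to_fixture() {
    fixture = new StringBuilder("hello");  // 1) Setup
    fixture.append(" world");              // 2) Exercise (this is all that coverage measures)
    assertEquals("hello world",
                 fixture.toString());      // 3) Verify (coverage says nothing about this part)
  }

  @After
  public void tearDown() {
    fixture = null;                        // 4) Tear-Down
  }
}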

Bad Test Smells

To get a better feeling for what can go wrong within the Verify part of a test case, have a look at the following three common test-smell examples:

Exception ignorance:
Especially in Java you are often forced to catch checked exceptions. This can be quite annoying, especially in tests. Now, instead of attaching the exception one isn't interested in to the method signature (like void testSomething() throws Exception{…}), many people insert try/catch blocks inside the test method and leave the catch-block empty. Not only does this make reading tests more difficult, it can also cause severe false negatives, because the assertions you defined can be skipped. There we go: we can see a green bar although the test failed.

void testSomething(){
  try{
    // 1) do loads of stuff
    // 2)some assertions at the end
  }catch(Exception e){
    // 3) do Nothing ->  unexpected exception does not get reported to test runner
    //defined assertions from above could be skipped
  }
}

Uninterested debugging test:
Production code is called, but no assertions are used at all. Often you find logging output spread around (logger, stdout, stderr). Such test cases are usually not meaningful because they don't notify the test runner when something goes wrong; in fact they almost always show a green bar. The test run is automated but the reporting is not, which taken together makes a non-automatic test. Instead, human interference comes into play and the analysis of test results is done by staring at the logging output. In many cases such test cases are used to debug production code in the first place, instead of writing a "real" automatic test case in the spirit of test-driven development.

void testSomething() {
  // 1) long List of statements
  // 2) logging output spread around (stdout/stderr)
  // 3) no assertions at all
}

Imprecise assertion:
Assertions are used, but they are very general and don't ask more from the production code than that values are not null. Such unspecific assertions don't claim much and the verification is weak. This way real bugs which should have been caught by the test can still result in a green bar.

void testSomething(){
  // 1) long List of statements
  // 2) only and single assertion
  assertNotNull(returnValue); //very imprecise assertion
}

Approaches for test quality assurance

It is quite a tough job to find out whether your test-suite is strong or not, especially when it comes to writing the correct verifications/asserts. The right approach always has to be found between the forces of efficiency and effectiveness. Manual, i.e. human, interaction like reviewing test cases is very powerful because you are reasoning directly as a developer. You can point out semantics: which assertions make sense, which ones are missing, or whether custom assertions could be used. Of course this takes time (review meetings, manual code browsing etc.) and thus cannot be done for the whole test code-base. Still, it is not only a good way to get to know the test code-base and investigate problematic test cases, but also a way to spread knowledge about good test-verification habits.

The less expensive and less time-consuming (i.e. more efficient) variant could be a code-structure analyzer which scans test code. As an example you can extend Checkstyle (http://checkstyle.sourceforge.net/) with custom checks. Working with regular expressions could, for instance, make it possible to check that at least one assert statement is used inside a test, or that an assertNotNull() is followed by another, more specific assertion. Furthermore you could set up a rule that try/catch blocks need to contain a fail() assertion to verify proper exception-handling in production code. A rough standalone sketch of the idea follows below.
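A very rough standalone sketch of that idea (this is not an actual Checkstyle check, just the regex heuristic applied to test sources; the directory layout and the *Test.java naming convention are assumptions):

import java.io.IOException;
import java.nio.file.*;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Flags test classes that never call an assert/fail/verify method at all.
public class MissingAssertScanner {

  private static final Pattern ASSERTION =
      Pattern.compile("\\b(assert\\w*|fail|verify)\\s*\\(");

  public static void main(String[] args) throws IOException {
    Path testSources = Paths.get(args.length > 0 ? args[0] : "src/test/java"); // assumed layout
    try (Stream<Path> files = Files.walk(testSources)) {
      files.filter(p -> p.toString().endsWith("Test.java"))
           .forEach(MissingAssertScanner::check);
    }
  }

  private static void check(Path file) {
    try {
      String source = new String(Files.readAllBytes(file));
      if (!ASSERTION.matcher(source).find()) {
        System.out.println("No assertion found in " + file);
      }
    } catch (IOException e) {
      System.err.println("Could not read " + file + ": " + e.getMessage());
    }
  }
}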

A third option I can think of is to test your test-suite's strength using mutation testing. It works by injecting errors (mutants) into the production code-base and checking whether the tests catch them by reporting a failure or error. I remember having used this approach with Jester in a seminar back in my university days. We perceived it as quite powerful when injecting mutation operators for conditional logic, arithmetic expressions or numeric values in little programs. But when it came down to many other error-prone bits it often fell short (examples abbreviated): calling different instance methods of input parameters, changing exception handling, string operations etc. Furthermore we did not succeed in setting up Jester for integration tests; especially setting up and tearing down things to a defined state was tricky. Looking around these days for an alternative mutation-testing framework for Java, I only found MuJava, whose core (apart from the Eclipse plugin) was last updated back in 2005. Nevertheless it looks promising, as it offers more mutation operators, especially for object-oriented programs. Unfortunately it does not offer a Maven report plugin for decent reports. I guess I'll give it a try soon anyway.

Summary

Testing is inevitably important when developing software, either for proving that new features work or that old features did not get broken by code changes. Looking at the green bar of the test outcome and relying on code coverage is a very good first step, but it is often not enough, and other actions have to be taken (test-code review, static code analysis, mutation testing). Mutation testing as a concept looks very promising but does not yet seem to be common practice.

Extending the idea of Continuous Integration

Thu, 07 Feb 2008, by manuel aldana
http://www.aldana-online.de/2008/02/07/extending-idea-of-continous-integration/

With Continuous Integration you send your work to a central place where checks are run which tell you whether your stuff still works on its own and together with the other parts of the system. This kind of check should be done quickly and often (to keep the feedback loop short), so it is mostly done automatically by tools. Martin Fowler already gives a good introduction to this practice, and this book gives even more detailed information, so I won't cover already-discussed topics. I found the idea of Continuous Integration so universal that it can even be applied to other pieces of work which aren't directly connected to source code. This way the quality of other work artifacts can be improved as well.

As an example, just think of the important documentation snippets you are working on (I hope you do…): wouldn't it be nice if there was a kind of checker which helped you to keep spelling or consistency right without you manually scrolling back and forth through the whole document? This way you would save an often unexpected (and difficult to estimate) pile of work which usually appears at the release deadline when the document needs to be finished. You could argue that if the tool doesn't offer such a check you could extend it with a self-written plugin to get a very close integration. On the other hand this is often not feasible: either the tool is not meant to be extended, it is closed source, or the effort of writing a plugin is just too high to justify the cost. You could argue again that, faced with such problems, you should switch to a better tool, but this is even more doubtful: in very many cases switching tools comes with high effort (total cost of ownership: training, migration of data, evaluation etc.), and besides you will hardly (if ever) find a tool which fulfills 100% of your needs anyway. So a more lightweight approach can be a better alternative: integrating little self-written tools which connect to the interfaces of your main one. In the case of Continuous Integration this means creating a little checker/reporting tool which scans the user's input for lurking errors and runs on a regular basis. The feedback loop stays short (iteration is one of the keys of software engineering), many errors and misbehaviours are prevented from staying in the system too long, and quality is ensured from the beginning.

[Figure: workflow of the Continuous Integration helper: trigger, check, report]

Let's head to another example: I recently used the concept of Continuous Integration to improve the quality of my homepage. When you write content for the web you write HTML bits and use hyperlinks quite often. In WordPress' case you usually do that with the not very edit-friendly built-in editor, which does not check the HTML bits you fill in. So quite a few broken links crept in almost automatically. This happened to me quite often (forgetting opening/closing tags, accidentally omitting http://, completely mistyped links, etc.). The discovery and correction of these links was annoying, boring and time consuming. Furthermore, the discovery was more by coincidence than by structured search.

While being bored searching for these sneaky links I realized that this kind of scan could be performed easily, much better and more quickly by a script. So I changed direction and headed towards an automatic scan. First of all, an enabling point was necessary to easily get access to all pages of my homepage. For this the sitemap feature of WordPress came in very handy: you get all pages of your site in a structured, easily parseable XML document. Armed with this, all pages and their hyperlinks could be retrieved easily. Having all existing hyperlinks of my homepage, asserting the HTTP GET response code of each link was the obvious next step. For this tooling task I used Groovy for its short, easy-to-read syntax and its lack of compile and deployment requirements (just guess how long and verbose the source code would have been in Java…). Of course I favor Java (static type checks) when it comes to bigger systems, for maintenance reasons, but for everyday scripting tasks Groovy or other alternatives (Python, Ruby etc.) are really the best way to go.

Just have a look at the following code to see how straightforward it is to build a Continuous Integration helper tool. You will notice that only the check part is implemented in the following script (see the middle item of the picture above). What is still missing is a proper trigger (like a daily cron-job) and better report functionality (like mail notification instead of printing to stdout).

//MAIN WORKFLOW
def HOMEPAGE = "YOUR_WORDPRESS_HOMEPAGE"
//for performance, only visit each link once
def linksVisited = []
def index = 0

println "Starting link check for $HOMEPAGE"
def urlset = new XmlSlurper().parseText(new URL("http://$HOMEPAGE/sitemap.xml").text)
urlset.url.each { sitemapEntry ->
  def linkFrom = sitemapEntry.loc.text()
  retrieveLinks(new URL(linkFrom).text).each {
    print "${index++} "
    if(!linksVisited.contains(it)){
      checkLink(it, linkFrom)
      linksVisited << it
    }
  }
}

def retrieveLinks(html){
  def links = []
  (html =~ /href="(.*?)"/).each { whole, link -> links << link }
  return links
}

def isBrokenLink(url){
  def responseCode = new URL(url).openConnection().getResponseCode()
  responseCode.toString() =~ "^(4|5)"
}

def urlAlright(url, closure){
  try{
    new URL(url)
  }catch(Exception e){
    closure(url)
    return false
  }
  return true
}

def checkLink(linkTo, linkFrom){
  if(urlAlright(linkTo, {println "Wrong URL Pattern: $linkTo inside $linkFrom"}))
    if(isBrokenLink(linkTo))
      println "ERROR: BROKEN LINK $linkTo. Referenced from $linkFrom"
}

Of course this kind of broken-link check is not restricted to an XML sitemap. Another enabling point for querying your HTML content could be direct database access, although then you would need to know the WordPress database schema in more detail. I run this kind of check every week and have already found quite a few broken links.

Surely these automatic checks cannot reveal all errors inside a system (who the hell checks whether my posts make sense at all…), but they definitely help people to concentrate on the things that cannot yet be achieved by computers (like semantics, comprehensibility, etc.).

What about you, what kind of Continuous Integration use cases are you employing?
