2012-02-05

Just because you can rewrite your codebase doesn't mean you should have to



3DOM on Richmond Road: Remember the future

The strength of automated driven test suites is that you can verify that all the testable/covered parts of your system still work, even after major changes.

The strength of modern IDEs is at a click of mouse you can find wherever a class or method is used, so go to those places and edit them.

Even so: availability should not imply necessity: life is best when you don't have to use these features unless you really, really want to. Everything should work.

And usually it does. But not last week. A small bug surfaced on tuesday: you got an error marshalling strings with square brackets around, "[]". At first I did the obvious tactic: deny that this could be happening, but after repeated evidence to the contrary I sat down and had a look.

It turns out that the library, json-lib, with the same interface as the java Map interfaces, has an extra "feature" in that whenever you add a string attribute, that string value gets parsed as JSON if it starts with "{" or "[". Which means that parentnode.put("request","4,5") would add the string attribute "request":"4,5" to the parent node, the slight variation parentnode.put("request","[4,5]") would generate the attribute "request":[ 4 , 5]. This is fundamentally different, and challenges any assumption the recipient had about types in marshalled data, as the types varies depending on the contents of the strings being marshalled.

Needless to say, I was unhappy. At least by the point that unhappiness was reached, there were now some tests to work out what was going wrong. Which made fixing it possible, the fix being to always single quote strings being added, such as parentnode.put("request","'[4,5]'"). When the put() operation is invoked, the single quotes are stripped, and the unparsed inner values become the attribute. With careful wrapping of the put operations -and ensuring that only one set of double quotes is placed around every string attribute, the tests passed, everything worked, and a new nightly release went out with the issue marked as closed.

Except the next day, the issues came back. Because it wasn't fixed. Not once you took that node, parentnode, and added it under another node: messagenode.put("payload", parentnode).

Doing that appears to trigger a reparse of every string value. Which means all safely planted array strings end up being reparsed, and converted from strings to JSON arrays.

At this point, the unhappiness level changed from "medium" to excessive. With the new tests replicating this behaviour, and no obvious in-source switch to say "be less helpful", the only solution that seemed 100% likely to work with all possible payloads and orderings of payload construction was to rip out the entire library and replace it with one that did not exhibit the same behaviour.

Which is what I spent Thursday and Friday doing: a complete removal of all uses of json-lib and replacement with Jackson. Which, I not only regret in time wasted, but in the way that the json-lib Java-friendly object model is better than the Jackson "look like the DOM" world view, because the DOM, ubiquitous as it is, is pretty painful. Yes, with experience of the XML parsing world, I can certainly use it -but that did not mean I enjoy it. I've had to rip out and place server side and client side code, other things that are visible to other bits of the system that used the same JSON library. I haven't fixed all that code -but add enough downconvert/upconvert that the code compiles and appears to work.

Four hours over two days writing tests to show this problem existed -then
two full days of repair: switching libraries, backtracking on method calls, changing types and seeing won't compile, running tests. Some extra tests this weekend and then only the merge with two days worth of other peoples changes and some reruns and it's ready to commit. That will leave Jenkins to do the final retest and email, then everyone downstream will have to fix their side of the down- and up- converted code (the conversion methods marked as @Deprecated to make them easy to spot), which will take 1+ hour on monday for a couple of people.

When we look back at this, what will we be able to say?

We can now reliably send strings with square brackets around.

This is one of those weeks that I would never use in a motivational talk for anyone interested in taking software engineering.

[Artwork: 3Dom, "Remember the Future", Richmond Road, Montpelier]

2 comments:

  1. Did you try the ObjectMapper interface of Jackson that's described at http://wiki.fasterxml.com/JacksonInFiveMinutes#Simple_Data_Binding_Example ?

    ReplyDelete
  2. no, I'd had enough of an experience with things trying to be overhelpful I went low level. What I did do was write a set of helper methods to make the low level DOM easier to work with

    ReplyDelete

Comments are usually moderated -sorry.