Monday, December 6, 2010

Personal recollections of the YOW Melbourne Conference

I attended the YOW!2010 Australia Software Developer Conference in Melbourne last week, and here are some recollections of my favourite talks. Please note this is all from memory, so I might have some parts wrong.

Main themes

From the sessions I attended, some of the main recurring technical themes in this conference were:
  • Consistency should be traded for Availability in typical distributed systems
  • Eventual consistency over ACID
  • Always expect network failures, i.e. systems must be able to tolerate partitioning
  • NoSQL

Most informative talks

The Rich Get Richer: Rails 3 :- Obie Fernandez

Obie gave an enlightening summary of the events leading to, and the people involved in, the merging of the Merb and Rails codebases, resulting in the newly released Rails 3.0. The main drivers behind the new release were decoupling, modularity and prevention of code bloat. Many components of Rails 2 had been abstracted out and managed as third-party plug-in projects. He described Rack, Bundler, Arel and the upgrade process from Rails 2.x to 3.0. I look forward to the day when I get the chance to put all these into practice.

Integrated Tests Are A Scam :- J. B. Rainsberger

Having written myriads of tests over the years, I found it easy to lose sight of the different roles of tests in different parts of the system. JB gave an excellent refresher on two fundamentals of unit tests: interaction testing and contract testing.

Interaction tests were the "normal" unit tests that I've been writing, where the external dependencies were replaced with stubs/mocks. In the example, he described interaction tests for a client accessing an external supplier. These tests were concerned with the pattern in which the system-under-test (SUT) called out to the external system (mocked out), and how the SUT responded to various return values from the external system (stubbed).

Contract tests referred to tests that verified that an external system satisfied a certain API and expected behaviour, i.e. the contract. In a strongly-typed language like Java, this contract could be abstracted out from the actual implementation of the external system into a public interface, that our code could depend on. In fact, our code would only depend on this public interface, with no knowledge of the implementation. In a strongly-typed language, any implementation of this interface would then satisfy the minimum requirements of the contract, therefore, the real implementation could be replaced with a stub. As long as the stub passed the contract tests, it would in theory make no difference to our code. Hence, we could then write integrated tests for our SUT that used the stub instead of the real implementation, which would make such tests extremely fast.
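To make the idea concrete for myself, here is a minimal sketch in plain Java (no test framework). It is not JB's example: the PriceSupplier interface, the stub and the prices are all names and values I made up for illustration. The point is only that the same contract check can be run against any implementation, stub or real, while the SUT depends on the interface alone.

```java
import java.util.HashMap;
import java.util.Map;

class ContractTestSketch {

    // Hypothetical contract that our code depends on (not from the talk).
    interface PriceSupplier {
        /** Returns the price in cents, or -1 if the product is unknown. */
        int priceInCents(String productCode);
    }

    // In-memory stub standing in for the real external supplier.
    static class StubPriceSupplier implements PriceSupplier {
        private final Map<String, Integer> prices = new HashMap<>();

        StubPriceSupplier() {
            prices.put("BEANS", 1250); // made-up sample data
        }

        @Override
        public int priceInCents(String productCode) {
            Integer price = prices.get(productCode);
            return price == null ? -1 : price;
        }
    }

    // Contract check: meant to be run against EVERY implementation,
    // including the real one, so that the stub and the real system
    // are held to the same expectations.
    static boolean satisfiesContract(PriceSupplier supplier, String knownCode) {
        boolean knownHasPositivePrice = supplier.priceInCents(knownCode) > 0;
        boolean unknownIsMinusOne = supplier.priceInCents("NO-SUCH-PRODUCT") == -1;
        return knownHasPositivePrice && unknownIsMinusOne;
    }

    // System under test: knows only the interface, never the implementation.
    static String quote(PriceSupplier supplier, String productCode) {
        int cents = supplier.priceInCents(productCode);
        return cents < 0 ? "unavailable" : cents + " cents";
    }

    public static void main(String[] args) {
        PriceSupplier stub = new StubPriceSupplier();
        System.out.println("stub satisfies contract: " + satisfiesContract(stub, "BEANS"));
        System.out.println("quote for BEANS: " + quote(stub, "BEANS"));
    }
}
```

A real supplier client would be passed through satisfiesContract() in exactly the same way, which is what would justify substituting the stub in the fast tests.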

Personally, I was not too convinced that interaction and contract tests could fully replace the need for integrated tests against real external systems. There would always be unexpected surprises lurking in system configuration, and some behaviour could not be adequately captured by the abstracted interface. Furthermore, in dynamically typed languages, it would be much harder to enforce that the stub fully implemented the public interface.

Designing and Implementing RESTful Application Protocols :- Ian Robinson

Ian walked us through an example from his new book, REST in Practice, to illustrate the interactions of a client program ordering coffee beans from a coffee supplier via a RESTful interface.

The client first asked the server for a list of allowed operations. After that, it requested a quote of available product prices. It then placed an order by resubmitting all the pricing data it had previously received back to the server. In this way, the client could maintain conversation state over a stateless protocol. A checksum of the data was used to prevent price tampering by the client. Finally, the server immediately returned an HTTP 202 response, before kicking off the long-running order fulfillment process asynchronously.
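I don't remember how the checksum was computed in Ian's example. One common way to get this kind of tamper detection is a keyed hash (HMAC) that only the server can produce, so the sketch below assumes that. The class name, the secret and the quote format are all my own inventions, not taken from the talk or the book.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SignedQuoteSketch {
    private static final String ALGORITHM = "HmacSHA256";
    private final byte[] serverSecret; // known only to the server

    SignedQuoteSketch(String secret) {
        this.serverSecret = secret.getBytes(StandardCharsets.UTF_8);
    }

    /** Server side: compute the checksum that is sent out along with the quote. */
    String sign(String quoteData) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(serverSecret, ALGORITHM));
            byte[] digest = mac.doFinal(quoteData.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC not available", e);
        }
    }

    /** Server side: check the quote data that the client resubmitted with its order. */
    boolean isUntampered(String quoteData, String checksum) {
        // MessageDigest.isEqual compares the two byte arrays in constant time
        return MessageDigest.isEqual(
                sign(quoteData).getBytes(StandardCharsets.UTF_8),
                checksum.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        SignedQuoteSketch server = new SignedQuoteSketch("demo-secret");
        String quote = "product=beans;priceCents=1250";
        String checksum = server.sign(quote);
        System.out.println("original accepted: " + server.isUntampered(quote, checksum));
        System.out.println("edited accepted:   "
                + server.isUntampered("product=beans;priceCents=1", checksum));
    }
}
```

Because the client never sees the secret, it can carry the quote around as conversation state but cannot produce a valid checksum for a price it has edited.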

Ian explained several basic concepts such as the use of media types, link relations and XForms in driving the process flow.

I found Ian's talk to be clear and concise, therefore I'm putting his book on my wish-list.

Most interesting talks

Exploring NoSQL :- Erik Meijer

This talk started a bit slowly with a (mostly unnecessary) explanation of how an object graph was represented in memory using pointers. This type of representation was embodied by noSQL, whereas SQL embodied a fully normalized relational view of data.

Things became interesting when Erik started describing Category Theory. Erik reasoned that because the direction of parent-child relationships was reversed between SQL and noSQL, they were duals of each other according to Category Theory. In other words, noSQL was actually "coSQL", i.e. for any statement that was true of SQL, the dual-statement (opposite meaning) would hold true for noSQL. Some examples of this duality were:

  • SQL: The identity is embodied in the row, i.e. primary key (extensional) / noSQL: Identity of an object depended on other objects in the environment (intensional)
  • SQL: Closed-world view of data / noSQL: Open-world view of data
  • SQL: Synchronous operations / noSQL: Asynchronous operations

A very interesting concept indeed. Furthermore, this talk had the most obscure slide ever...

Computational Information Design :- Ben Fry

This talk was worth attending just for the visual candy itself. Ben talked about data visualization using his Processing programming language (IMO really a DSL over Java 2D/3D). Some of the examples shown were really amazing, such as:
  • Mapping the program flow in old cartridge-based video games
  • Tracking changes to six editions of Charles Darwin's work
  • Comparing the human genome with other mammals

Ben then introduced Processing.js, which allowed code written in Processing to be run on any HTML5 compatible browser. People could simply go online to write and run Processing code all within a Web browser. What was even more exciting was a port of the Processing runtime to Android, which meant I could run Processing programs on my smartphone. Super cool!


For me, the best part of the conference was the iPhone workshop. Having done some Android development, I was eager to see how things worked on the iPhone side. In this half-day workshop, I learned some basic syntax of Objective-C, basic usage of the Xcode IDE, some iPhone frameworks, and finally put together a simple iPhone application. My verdict on Objective-C :- dynamic typing and late binding were nice, but the extremely verbose syntax really sucked. I guess I'll stick with Android for now.

Wednesday, October 27, 2010

Tethering HTC Desire to Mandriva Linux 2010.1

The other day, I found myself in a situation where I needed to go online but there was no Wifi around. So I plugged my HTC Desire (Android 2.1) smartphone into my laptop running Mandriva Linux 2010.1 Spring and got tethering working. Here are the steps.


While connected to a wired or Wifi Internet connection, install the dhcpcd package. As root user (or using sudo):
$ urpmi dhcpcd

It is also useful to install usbview:
$ urpmi usbview

Connecting the phone

First, connect to Mobile Internet from the phone (this must be done before plugging into laptop). Then plug the phone into the laptop using the USB cable and select the Internet sharing mode.

On the laptop, start usbview and confirm that Android Phone is shown.

As root user, start the usb0 interface:
$ ifconfig usb0 up
Check that the usb0 interface has really started:
$ ifconfig usb0

Getting an address

Finally, on the laptop, obtain an address from the phone using:
$ dhcpcd usb0

When this command has completed, confirm that an IP address has been assigned using:
$ ifconfig usb0

After this, the laptop can get online using the HTC Desire phone as a gateway.

Wednesday, September 15, 2010

Agile Australia 2010: Personal Recollections

I attended the Agile Australia 2010 Conference in Crown Conference Centre, Melbourne, on September 15 and 16, 2010. Here are some of my personal recollections, remarks and photos.


First impression when I got to the venue :- lots of people. The expected Agile consulting/training firms were there with their promotional booths (some free iPads to give away), with Thoughtworks being prominently at the centre. Personally, I was more interested in training courses, and so grabbed some brochures from Renewtek and Agile Academy.

Promotional booths and the crowd

The conference was kicked off with a keynote talk by Jim Highsmith, who presented various pieces of data, statistics and graphs to tell everyone how to measure Agile successes.

Jim Highsmith's keynote talk

This was immediately followed by an excellent talk by Jeff Smith of Suncorp. His point that trust and respect were the crucial values necessary to foster effective teams resonated with my own recent experience with several teams in my own company. He also introduced the Agile Academy, which I didn't know of before.

Next, the conference was split up into three parallel streams. I attended the talks by Ben Hogan (Using Agile techniques to adopt Agile) and Neal Ford (Implementing Emergent Design).

Ben introduced us to John Kotter's 8 steps to successfully implement change, which he contended were not adaptive due to the lack of any feedback mechanism. Therefore, he proposed using Scrum's feedback cycle to introduce adaptability to some of these steps. In order to introduce changes such as Agile adoption to people and organizations, he suggested using a longer iteration/cycle time (1 month) than in software development.

I was initially more interested in Neal's topic due to its developer-centric focus. He spent quite a lot of time explaining software design concepts such as:
  • inherent and accidental complexity
  • technical debt and negotiating payment
  • over-engineering for genericness

He presented enablers of emergent designs such as:
  • test-driven design
  • refactoring to harvest idiomatic patterns
He even went through some calculations of code complexity and class coupling metrics. However, given that Neal's talk was marked as for those thoroughly experienced with Agile, I was actually a bit disappointed to find it rather lacking in novel in-depth content. Surely, experienced agile practitioners would already know about complexity and technical debt. I doubt any experienced developer would have learned much from the refactoring examples. I did learn something useful though:- In Pacman, a ghost could not detect Pacman running right up next to it (his example of anti-object).

Neal Ford on idiomatic patterns and software design

By lunch time, I was eager to find out what delicacies might be served in the buffet. Alas, it was just beef rolls and sandwiches. Nothing to brag about. How I missed the comparatively luxurious offerings from the bygone Sun Developer Days conferences.

Sandwiches, rolls and some salad for lunch

After lunch, I watched the panel session, where five top agile practitioners sat together to discuss their preferred flavours of Agile. It was nice to be able to ask questions using Twitter. I was particularly interested in Kanban, and managed to tweet a Kanban-related question for Bruce Taylor from the comfort of my phone. By the way, my boss was also one of the panel members.

This was followed by the lightning talks, my favourite of which was Ben Arnott's Agile@Home. A light-hearted, entertaining contrast to the other serious topics of the day.

One of the lightning talks

The last full talk I attended on Day 1 was Craig Smith's Building An A-Team. The content itself was nothing new, mainly around how to identify and lead good teams. However it was concisely and brilliantly delivered with good humor.

Finally I decided to pop into the Agile Games session next door midway. This was what I saw :

Not too sure what was going on here

Groups of people throwing balls around. After 30 minutes and still failing to make sense of this, I switched to another room and caught the end of Jay Jenkins' talk on how ANZ did Agile.

Thus ended Day 1.


The second day started with an awesome talk by Martin Fowler. He reminded us of the reasons why we did Agile, the advantages of adapting to late changes, the importance of continuous integration and delivery, and finally the different types of technical debt. Of course, the best part was his impersonation of Uncle Bob Martin.

Martin Fowler giving the first talk of Day 2

Next, I attended a couple of more practical talks. First was Jason Yip and Marina Chiovetti on how to properly measure and assess Agile adoption and success in different organizations. They pointed out that external assessments were misleading because results could be faked. Only when an organization defined a vision ("true north"), was transparent and internally reflected on how it was progressing towards this vision could it have a true understanding of its level of Agile adoption.

The second talk was very relevant to me as a developer. Chris Mountford described how he solved monster build problems at Atlassian. His approach involved the following:
  • Always measure first
  • Be selective about what to test
  • Parallelize tests
  • Canary tests with a good proven version to detect environment/external problems
  • Choosing the right tools

At this point, the #agileaus Twitter feed was flooded with reports of how fantastic talks in the other rooms were. All of the talks during this time slot looked interesting, but alas I could only be in one place at a time.

I felt lunch on Day 2 was a slight improvement over the day before. The salad, rolls and sandwiches were still there, but they tasted a bit better, or maybe I'd lowered my expectations.

More salad, rolls and sandwiches for lunch

In my opinion, the talk with the best analogy (Death Star) and slides would be Nigel Dalton's talk after lunch. Nigel described how various non-IT teams (lawyers, finance, publishers etc) in Lonely Planet adopted Agile. Photos of their various Kanban boards managed to convey how these teams tracked their work in a highly effective and visible way. It was reassuring to know that Agile was not just restricted to us IT folks. Having personally spent a few months working at the Lonely Planet office in Footscray a few years ago, I was happy to see such a drastic transformation throughout the company.

Nigel Dalton described the Death Star as the perfect Agile project : "Build the weapon first!"

Nigel's entertaining presentation was followed by a more serious, enterprise-level talk by John Sullivan from Sensis, on how best to do "Enterprise Architecture" in an Agile company. This was a topic that interested me as a developer. Some of John's main points were:
  • Traditional enterprise architectural groups were unnecessary barriers between developers and business
  • Architects produced elegant but unworkable solutions
  • Better to form a temporary group of senior developers to sit with business in a requirements gathering session to identify key business and technical requirements
  • Implement an initial version of the architecture in code, or sometimes spikes, to gain technical insights to be able to provide estimates
  • "Architect" as a temporary role on an as-needed basis, not a job title
  • Get rid of people who get in the way
  • Split the backlog into
    • in-build : Only build this, but make the system extensible
    • in-plan : Something to keep in mind, but expect this to change
    • future-release : (I would guess don't worry about this)

John Sullivan declaring the end of the Enterprise Architect job title

Up to this point, DevOps people might have felt a bit left-out. Luckily, Jez Humble changed that with his informative presentation on DevOps and Agile Release Management. This was another area I had a special interest in. Jez stressed the importance of the definition of DONE, i.e. DONE when a feature got to production. The problems he presented were that DevOps people:
  • used to see changes as too risky
  • resisted new changes because they favoured stability
  • spent too much time firefighting rather than doing strategic work
The solutions he presented were:
  • Culture : Have cross-functional delivery teams. Adopt Kanban to schedule work.
  • Automation : Environments could be procured using cloud computing and configuration managed by Puppet or Chef. He even mentioned cucumber-nagios which interested me a lot.
  • Measurement : Collate business and technical metrics. Perform root cause analysis.
  • Sharing : With big visible displays
  • Overcoming objections : By showing visibility, control, automated scripts and auditing
  • Deployment pipeline : Automatic deployment to various environments including production.
That was a lot of new tools for me to read up on.

Jez Humble talking about DevOps

Finally, I attended the Open Space session, first with the lightning talks group, then with a group discussing automated performance tests. The idea sounded good, but I felt there wasn't enough time to flesh out many technical details.

One of the Open Space lightning talks

At 4:30 pm, snacks and drinks came out for the final networking session that concluded the conference.


Overall a worthwhile conference to gain an overview of the current state of Agile. I was glad to see the previously neglected area of DevOps getting some attention. As for the technically focussed talks, I would have preferred:
  • more technical depth
  • updates on the latest development, testing or monitoring tools
  • some discussion on the latest Agile practices when developing on mobile devices

Maybe next year...

Friday, August 6, 2010

Detect broken links on a Web site using wget

The other day, I started thinking about writing a simple Web site validator that detects broken links on a Web site, similar to W3C Link Checker. After looking at various code samples in Java, Ruby, etc, I figured out that GNU wget 1.12 on my Linux machine could do the job just fine, with no programming required. It even detected broken resource links in CSS, not just broken <a> links.

Here is how to write a simple script to check the site. First, pretend to be a Mozilla-based browser and spider the site to the depth of one level:

wget --spider -r -l 1 --header='User-Agent: Mozilla/5.0' \
-o wget_errors.txt http://the_site_i_want_to_validate

Then, simply look at the return code to determine if there is any error. If the code is larger than zero, there is an error.

# Capture the exit status immediately after the wget command above
EXIT_CODE=$?
if [ $EXIT_CODE -gt 0 ]; then
echo "ERROR: Found broken link(s)"
exit 1
fi

To find out the actual links in question, just grep for 404 in the wget error log.

BROKEN_LINKS=`grep -B 2 '404' wget_errors.txt`

The -B 2 outputs the 2 lines above any matching line, which in this case contains the broken link in question.

Tuesday, August 3, 2010

Loading large reference database in Android

Reference data normally refers to (mostly) read-only data that is used to validate or resolve other pieces of data. For example, a list of postcodes and suburbs, that can be used to provide auto-suggestions in a UI or validate a user's address.

I recently developed an Android (version 2.1 / Eclair) application that used a large set of reference data, stored in a SQLite database. The fact that this was targeted for Android smartphones introduced some constraints in the loading of this data that would not be present for a desktop or Web application, primarily processing time and storage space.

The initial design was based on my prior experience working on Web applications. It involved packaging a largish (~10MB) text file into the apk archive. The application performed a one-time initialization step that loaded this file, parsed it and executed SQL statements to insert rows into the database. The application then accessed the reference data via a subclass of SQLiteOpenHelper. It soon became apparent that this would not work in the Android world, due to the following reasons.
  • Loading the database, even if it's a once-off, took too long. People are not likely to want to wait for many more minutes, especially after waiting for the 10MB download.

  • The database storage took up 10+MB, and when combined with the input text file, totalled 20+MB. Android 2.1 required the application to be installed onto the phone's limited internal memory, where 20MB was a relatively massive chunk.

  • The input text file in the apk archive could not be removed after initialization, so it just uselessly consumed precious phone memory.
After a few tries, I arrived at the following final design that seemed to solve these problems.

Preload database

The database loading code had to be removed from the Android code and rewritten as a separate program. This program read the text input file and inserted the rows into a SQLite database, with the database file stored on my desktop. Note that the primary key column of the reference table should be named _id and the following must be added to the database for it to be usable by Android:

CREATE TABLE android_metadata (locale TEXT);
INSERT INTO android_metadata VALUES('en_US');

This preloaded database file was then uploaded to a Web site so that it could be downloaded by the Android application.
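As an illustration only, here is a minimal sketch of what such a desktop loader could look like. My actual program wrote to the SQLite file directly; this simplified variant instead prints a SQL script that could be fed to the sqlite3 command-line tool, so that it needs nothing beyond the Java standard library. The "postcode,suburb" input format and the suburb table and column names are assumptions made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PreloadSqlSketch {

    /** Quotes a value as a SQL string literal, doubling any single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Converts "postcode,suburb" lines (assumed format) into a SQL script. */
    static List<String> toSql(List<String> inputLines) {
        List<String> sql = new ArrayList<>();
        // Metadata table that Android expects to find in the database
        sql.add("CREATE TABLE android_metadata (locale TEXT);");
        sql.add("INSERT INTO android_metadata VALUES('en_US');");
        // Hypothetical reference table; note the _id primary key column
        sql.add("CREATE TABLE suburb (_id INTEGER PRIMARY KEY, postcode TEXT, name TEXT);");
        sql.add("BEGIN TRANSACTION;"); // one transaction keeps bulk inserts fast
        for (String line : inputLines) {
            String[] fields = line.split(",", 2);
            if (fields.length < 2) {
                continue; // skip malformed lines
            }
            sql.add("INSERT INTO suburb (postcode, name) VALUES("
                    + quote(fields[0].trim()) + ", " + quote(fields[1].trim()) + ");");
        }
        sql.add("COMMIT;");
        return sql;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PreloadSqlSketch.java input.txt > load.sql
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String statement : toSql(lines)) {
            System.out.println(statement);
        }
    }
}
```

Either way, the expensive parsing and inserting happens once on the desktop rather than on every phone.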

Download database

The Android application didn't need to be packaged with the text input file any more, hence trimming the apk archive from 10+MB to a few hundred KB. The application's starting Activity would check and download the database file if it didn't already exist on the SD-Card. According to the Android Dev Guide on Data Storage, the proper place to save the database file would be:

  • for Android 2.2 and above :- the directory returned by getExternalFilesDir()

  • for Android 2.1 :- the directory starting with getExternalStorageDirectory(), then appended with /Android/data/<package_name>/<file_type>, which resolved to something like /sdcard/Android/data/com.mycompany.myapp/db

I had to use the latter option as I was targeting 2.1.

It was important to ensure that the UI was not frozen during the few minutes it took to download the 10+MB database file. That meant that the downloading code must be executed in a separate thread to the main UI thread. The best way to achieve this was to create a subclass of AsyncTask and put the downloading code in its doInBackground() method. This class also displayed a ProgressDialog to keep the user informed, and acquired a WakeLock (SCREEN_DIM_WAKE_LOCK) to prevent the phone from going to sleep. Another important thing to remember was to prevent the screen orientation from changing when the phone was flipped over while the ProgressDialog was showing, otherwise the application would crash when the dialog was dismissed. This was achieved by temporarily setting the requestedOrientation property of the activity to "no-sensor". Here is the skeleton code of the AsyncTask:

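// FileDownloader is my own delegate class that performs the
// actual downloading and is initialized with the source URL.
public class InitializeDatabaseTask extends
        AsyncTask<FileDownloader, Integer, Object> {
    private ProgressDialog progressDialog;
    private File dbFile;
    private PowerManager.WakeLock wakeLock;
    private Activity activity;
    private transient int originalRequestedOrientation;

    public InitializeDatabaseTask(Activity activity, File dbFile) {
        this.dbFile = dbFile;
        this.activity = activity;

        wakeLock = ...; // Obtain a wakelock for SCREEN_DIM_WAKE_LOCK
        progressDialog = ...; // Create a ProgressDialog instance with title, message, etc
    }

    @Override
    protected void onPreExecute() {
        // Runs on the UI thread: lock the orientation, keep the phone awake, show the dialog
        originalRequestedOrientation = activity.getRequestedOrientation();
        activity.setRequestedOrientation(ActivityInfo.SCREEN_ORIENTATION_NOSENSOR);
        wakeLock.acquire();
        progressDialog.show();
    }

    @Override
    protected Object doInBackground(FileDownloader... params) {
        FileDownloader downloader = params[0];
        try {
            // Runs off the UI thread. download(File) is a placeholder name;
            // the real FileDownloader method is not shown in this skeleton.
            downloader.download(dbFile);
        } catch (IOException e) {
            throw new AndroidRuntimeException(e);
        }
        return null;
    }

    @Override
    protected void onPostExecute(Object result) {
        // Back on the UI thread: undo everything done in onPreExecute()
        progressDialog.dismiss();
        wakeLock.release();
        activity.setRequestedOrientation(originalRequestedOrientation);
    }
}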

The main Activity used InitializeDatabaseTask like so:

File dbFile = ...; // File pointing to /sdcard/Android/data/com.mycompany.myapp/db
new InitializeDatabaseTask(this, dbFile).execute(new FileDownloader(DOWNLOAD_DB_URL));
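The FileDownloader itself is not shown above, so here is a rough sketch of the kind of copy loop such a class needs, written in plain Java so that it can be tried on a desktop. The class name, the ProgressListener interface and the ".part" temporary file are my own illustrative choices, not the original code. It also uses try-with-resources, which was not available on Android 2.1, where a try/finally would be needed instead.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

class DownloadCopySketch {

    /** Hypothetical callback; on Android it could forward to publishProgress(). */
    interface ProgressListener {
        void onProgress(int percent);
    }

    /** Copies the stream in chunks, reporting progress whenever the percentage changes. */
    static long copy(InputStream in, OutputStream out, long expectedBytes,
                     ProgressListener listener) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int lastPercent = -1;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
            if (expectedBytes > 0) { // the size may be unknown
                int percent = (int) Math.min(100, total * 100 / expectedBytes);
                if (percent != lastPercent) {
                    lastPercent = percent;
                    listener.onProgress(percent);
                }
            }
        }
        out.flush();
        return total;
    }

    /** Downloads to a temporary file first, then renames it into place. */
    static void download(String sourceUrl, File target, ProgressListener listener)
            throws IOException {
        URLConnection connection = new URL(sourceUrl).openConnection();
        long expectedBytes = connection.getContentLength(); // -1 when unknown
        File partial = new File(target.getPath() + ".part");
        try (InputStream in = connection.getInputStream();
             OutputStream out = new FileOutputStream(partial)) {
            copy(in, out, expectedBytes, listener);
        }
        if (!partial.renameTo(target)) {
            throw new IOException("Could not move " + partial + " to " + target);
        }
    }
}
```

Writing to a temporary name and renaming at the end means that a check for the existence of the database file never sees a half-downloaded file if the download is interrupted.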

Mount SD-Card in emulator

I developed the application mainly using the Android SDK emulator in Eclipse. The emulator did not mount any SD-Card by default. In order to test the downloading code, the emulator must be set up with a SD-Card image file, like so (on my Linux system):

$ android-sdk/tools/mksdcard 64M ~/.android/avd/Android.2.1.avd/sdcard.img

This created a 64MB image file (more than enough for the data) in a special Android avd directory, so that it would be automatically mounted by the emulator.

Implement custom SQLite helper

Finally, I had to write my own custom version of the SQLiteOpenHelper to access the database file from the SD-Card, because the standard one would only read from the default phone internal memory. Given that the reference data would be read-only, this custom class was a lot simpler than SQLiteOpenHelper and only needed to open a read-only database connection. The code:

public abstract class ExternalStorageReadOnlyOpenHelper {
    private SQLiteDatabase database;
    private File dbFile;
    private SQLiteDatabase.CursorFactory factory;

    public ExternalStorageReadOnlyOpenHelper(
            String dbFileName, SQLiteDatabase.CursorFactory factory) {
        this.factory = factory;

        if (!Environment.getExternalStorageState().equals(Environment.MEDIA_MOUNTED)) {
            throw new AndroidRuntimeException(
                    "External storage (SD-Card) not mounted");
        }
        // Application-specific directory on the SD-Card, as described above
        File appDbDir = new File(
                Environment.getExternalStorageDirectory(),
                "Android/data/com.mycompany.myapp/db");
        if (!appDbDir.exists()) {
            appDbDir.mkdirs();
        }
        this.dbFile = new File(appDbDir, dbFileName);
    }

    public boolean databaseFileExists() {
        return dbFile.exists();
    }

    private void open() {
        if (dbFile.exists()) {
            // Read-only: the reference data is never modified on the phone
            database = SQLiteDatabase.openDatabase(
                    dbFile.getAbsolutePath(), factory, SQLiteDatabase.OPEN_READONLY);
        }
    }

    public synchronized void close() {
        if (database != null) {
            database.close();
            database = null;
        }
    }

    public synchronized SQLiteDatabase getReadableDatabase() {
        return getDatabase();
    }

    // Opens the database lazily on first use; returns null if the file is missing
    private SQLiteDatabase getDatabase() {
        if (database == null) {
            open();
        }
        return database;
    }
}

A concrete subclass of ExternalStorageReadOnlyOpenHelper was created to query the reference data via the SQLiteDatabase object returned by the getReadableDatabase() method.

The databaseFileExists() method allowed the main Activity to check if the database file already existed to decide whether to initiate download.


This final design had greatly improved the application in the following ways :
  • The user only had to download a small apk file initially, greatly reducing the barrier to installation.

  • The separate download step for the database file presented an opportunity to tell the user what was happening and to use Wi-Fi if available.

  • Most of the data resided on the SD-Card, where space was much more abundant.

  • There was only ever one copy of the reference data on the phone, with no redundant duplication.

  • The main application code could be updated without having to download the reference data.

Thursday, May 27, 2010

Large User Story vs Human Psychology

Much has been written about the problems of oversized user stories. They are commonly indicative of too much uncertainty, usually over business requirements, but sometimes over certain technical aspects. While large stories (or epics) are acceptable during a high-level planning stage, they should be either simplified or broken down by the time they are scheduled within an iteration. Otherwise, they usually end up greatly exceeding their (very inaccurate) estimates, or even becoming blocked.

As a software developer, I've been involved with a few such stories recently. In addition to the above-mentioned problems, I noticed another adverse side-effect. With each passing day on the same story, I became increasingly less motivated to observe good coding practices such as refactoring, cleaning up messy code and proper OO design. As the days dragged on, I became more tolerant of sloppy practices. I eventually finished the story and handed over a piece of spaghetti code that I wasn't proud of. In hindsight, I could have done a lot more to improve code quality.

Thinking back, I realized that this same behaviour had manifested itself many times before. Perhaps I might just assume that I was a lazy bum, if not for the fact that I was practising pair-programming, and the same tendency was also apparent in my pairing partners (some more resistant than others). Maybe it was just the long hours and looming deadlines. However, my experience tells me that doing multiple smaller consecutive stories, while they may add up to the same volume of work as a giant story, does not lead to this behaviour.

This leads me to speculate that there is a psychological explanation. Research has shown that the human brain has two regions that compete to form a decision when presented with a choice between a small short-term reward and a longer-term goal (larger reward). The more emotionally driven region of the brain seeks the short-term reward while the rational region tells us to hold-out for the overall better outcome. Our brain battles itself to a decision. As professional software developers, we constantly have to make a similar decision :- should we take a short-cut to deliver a piece of work now, or refactor/write more tests/clean up so that the overall code quality improves in the long run?

It is likely that by breaking up a large body of work into many small discrete stories, the completion of each story provides an emotionally satisfying reward in a short period of time, even if extra effort/time has been added to ensure good quality. This extra effort/time is usually at least proportional to the original estimated size of the story, but may cost even more if the story gets so large that it touches many interdependent parts of a complex system.

In other words, a small story usually touches a small area of the code base, hence its parts are easier to visualize in one's head, and thus it presents only a small mental barrier to keeping it clean. Conversely, a massive story that encroaches on a myriad of different moving parts becomes hard to visualize. Soon, problem areas stack up beyond our ability to keep track of them in our heads. On top of that, the uncertainties inherent in a large story may reveal themselves to be time-sinks. Before long, the extra effort required to deliver quality becomes a mountain of hurdles. Having spent most of our time and energy just getting the code to work, with no psychological reward from completing anything yet, it is no wonder that we become disinclined to continue.

In conclusion, large user stories work against human psychology for the desired long-term goal of good quality code. Small stories are more satisfying due to frequent emotional rewards, at the same time contributing towards the greater goal.

It is not always possible to foresee how big a story can become at the start, therefore, developers should be given the freedom to split up stories while they are being developed. Personally, I think a story should not take more than 2 to 3 days. If it looks bigger than this, split it. This I endeavor to do more of in the future, with the hope of always delivering code I can be proud of.