Nathan Hoad

The value of automated data generation and processing

July 13, 2011

Recently, a friend gave me a job: build a new web form in PHP. The form is pretty big, with around 160 fields (for those who must know, it’s a very detailed form for requesting a quote from a removalist company). Total tangent: these kinds of forms are in need of some serious web design love. From the perspective of a programmer, forms of this magnitude are almost impossible to get looking nice. They’re huge, and horrific to work with. Anyway, back on topic.

I went through the task of building the form, which in and of itself took me about two hours, plus some basic JavaScript to make it look a bit tidier, parse integers where necessary, etc.

When it came time to write the PHP, it meant rewriting those same ~160 fields as variables. I got about 10 into it, and realised this simple task would take me about an hour. Using the following code and the output from submitting the form as a GET request, I turned it into a two-minute job using Python:

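# fields.txt is assumed to hold the GET output, one "name=value" pair per line
with open('fields.txt', 'r') as f:
    lines = f.readlines()

# Keep only the field name (everything before the first '='), skipping blank lines
names = [l.split('=')[0].strip() for l in lines if l.strip()]

# Emit one PHP assignment per field
for name in names:
    print("${0} = array( $_GET['{0}'], 0. );".format(name))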
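If the saved GET output is a single raw query string (everything after the ? in the URL) rather than one field per line, the same idea still works. Here is a rough sketch, assuming that input format; the function name php_assignments is made up for illustration, and it relies on Python 3's standard urllib.parse module to do the splitting and decoding.

```python
from urllib.parse import parse_qsl


def php_assignments(query_string):
    """Return one PHP assignment line per unique field name in a query string.

    Hypothetical helper: assumes the input looks like "name=Bob&rooms=3".
    """
    seen = []
    # parse_qsl splits on '&' and '=' and decodes percent-escapes;
    # keep_blank_values=True keeps fields that were submitted empty.
    for name, _value in parse_qsl(query_string, keep_blank_values=True):
        if name not in seen:  # preserve form order, drop repeated names
            seen.append(name)
    return ["${0} = array( $_GET['{0}'], 0. );".format(name) for name in seen]


if __name__ == "__main__":
    for line in php_assignments("name=Bob&rooms=3&piano="):
        print(line)
```

Field names that aren't valid PHP variable names (for example ones containing brackets) would still need fixing by hand.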

I could have done it the hard way, typing all that code myself (sadly, I know a LOT of programmers who do this!) and I would have spent at least another hour doing it. I can never stress enough to other programmers, especially those new to the area, that computers were invented to make our lives easier. That goes for programmers’ lives too; there’s no reason to let them make ours harder. I’m still relatively new to programming (about three years officially) yet I find myself schooling programmers on how important automation is.

I was working with a guy a year or two ago on a project that required a lot of test data, which we would have to generate ourselves. Luckily, he agreed that we should just write a few scripts to generate our data, and we ended up with a few thousand records. Out of a class of 30 people, we were the only ones who wrote scripts to generate our data. Everyone else sat there, mindlessly typing, for hours.
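To give a feel for how little code that takes, here is a minimal sketch of a test data generator. It is not the script we wrote back then; the record fields (id, name, age) and the name lists are invented for illustration, and a fixed seed is used so the same data comes out on every run.

```python
import csv
import io
import random

# Made-up sample values; swap in whatever suits the project.
FIRST_NAMES = ["Alice", "Bob", "Carol", "Dave", "Erin"]
LAST_NAMES = ["Nguyen", "Smith", "Jones", "Brown", "Wilson"]


def generate_records(count, seed=0):
    """Build `count` fake records as dictionaries (hypothetical schema)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    records = []
    for record_id in range(1, count + 1):
        records.append({
            "id": record_id,
            "name": rng.choice(FIRST_NAMES) + " " + rng.choice(LAST_NAMES),
            "age": rng.randint(18, 90),
        })
    return records


def to_csv(records):
    """Serialise the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name", "age"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(generate_records(3000)), end="")
```

Redirect the output to a file and you have a few thousand records in well under a second.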

The lesson here is that if you value your time, learn a scripting language, or at least learn the value of writing scripts that generate/process data for you. You can work a lot more efficiently, and get back to doing the fun stuff.

Sadly, as I’m still not quite a “veteran” programmer, I still do things the hard way sometimes. Looking back, after spending hours on this job, the original form that took me two hours could have been another two-minute job if I’d just written a script to create it for me based on a few inputs. Hopefully I remember next time!
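For next time, something like the following would probably have done it. This is only a sketch under a few assumptions: every field is a plain text input, the field list is a handful of made-up examples rather than the real 160, and quote.php is a placeholder for wherever the form actually submits.

```python
from html import escape


def build_form(fields, action="quote.php"):
    """Return HTML for a GET form with one labelled text input per field.

    `fields` is a list of (name, label) pairs; `action` is a placeholder URL.
    """
    rows = []
    for name, label in fields:
        safe_name = escape(name, quote=True)  # safe inside attribute values
        rows.append('<label for="{0}">{1}</label>'.format(safe_name, escape(label)))
        rows.append('<input type="text" id="{0}" name="{0}">'.format(safe_name))
    return '<form method="get" action="{0}">\n{1}\n</form>'.format(
        escape(action, quote=True), "\n".join(rows)
    )


if __name__ == "__main__":
    # Hypothetical fields; the real list would come from a text file.
    example_fields = [
        ("customer_name", "Your name"),
        ("num_beds", "Number of beds"),
        ("num_boxes", "Boxes & cartons"),
    ]
    print(build_form(example_fields))
```

Since the field names live in one list, the same list could feed both the form and the PHP variables.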

Also, this is my first time setting up Alex Gorbatchev’s SyntaxHighlighter. It was really simple to set up, even though the documentation Alex provides is a little out of date on some pages. Looks great too!