Python Rocks! and other rants
Weblog of Kent S Johnson

#36 2004-04-30 23:18:24

Simplicity Rules

One of the qualities that distinguishes code-and-fix hacking from software craftsmanship is a different idea of what "done" means. Read more in this essay.


#35 2004-04-30 09:54:08

Preaching to the Choir

My two "Why I love Python" articles have been wildly popular. Much of their popularity is due to being mentioned in the Daily Python-URL. But I have to wonder, why do Pythonistas so enjoy reading about why Python is great? And how can I reach the Java programmers where I work and convince them to try Python?

Categories: Java, Python


#34 2004-04-29 20:10:40

Why I love Python 2

Python makes it very easy to build complex data structures. One place this is handy is with data-driven programming.

For Meccano I wrote a simple walk-by-rule engine. It walks the tree of domain data and applies callbacks at indicated points. The walk is driven from a tree structure that can be quite large and deeply nested. (I have written about the rule engine before.)

Here is a simple example using some of the same techniques. As you read the example, imagine that the list of rules might be hundreds of lines long and deeply nested. Later on I will indicate some other ways the example might be extended.

Assume we are given a dictionary and we are to print its contents in a nested form where the nesting and order of keys in the output are given by the location of dictionary keys in a structure built from nested lists.

The essential idea of the problem is to use a statically defined nested list to drive the formatting.

Python version

The Python version is short and sweet (14 lines, 349 chars). The data structures are defined easily and the output generation is simple:

data = { 'a':1, 'b':2, 'c':3, 'd':4, 'e':5 }

formatData = [ 'a', [ 'b', [ 'd', 'e' ], 'c' ] ]

def output(format, indent):
  for item in format:
      if type(item) == type([]):
          output(item, indent+2)
      else:
          val = data[item]
          print '%*s%s: %s' % (indent, ' ', item, val )

output(formatData, 0)

The output is:

 a: 1
  b: 2
    d: 4
    e: 5
  c: 3

Java version

The Java version is long and ugly. It is 44 lines and 1127 chars - over three times the size of the Python version! The Map is defined in code. The nested list needs extra (Object[]) casts that greatly reduce readability. The code is much more verbose; this is always the case with Java collection code vs Python:

import java.util.HashMap;
import java.util.Map;

public class Structure {

   static Map data = new HashMap();

   static {
       data.put("a", new Integer(1));
       data.put("b", new Integer(2));
       data.put("c", new Integer(3));
       data.put("d", new Integer(4));
       data.put("e", new Integer(5));
   }

   static Object[] struct = {
       "a",
       new Object[]{
           "b", new Object[]{
               "d", "e"
           },
           "c"
       }
   };

   public static void output(Object[] format, int indent) {
       for (int i = 0; i < format.length; i++) {
           Object item = format[i];
           if (item instanceof Object[]) {
               output((Object[])item, indent+2);
           }
           else {
               Integer val = (Integer)data.get(item);
               for (int j=0; j<indent; j++)
                   System.out.print(' ');
               System.out.println(item + ": " + val);
           }
       }
   }

   public static void main(String[] argv) {
       output(struct, 0);
   }
}

Reading data from a file

Now suppose you want to put the configuration data in a file that can be changed at runtime and reloaded as needed.

In Python, all you have to do is move the data structure definitions into a separate Python module. Client code imports the data module and reloads it before each use to re-read the source if it has changed.

In Java, typically the configuration data will be put into a text (non-code) format. XML works well for storing hierarchical data so it would be an obvious choice. Now, you have to define an XML format to hold the data and write code to load and parse the data.

So a hidden benefit of Python is that it includes a parser with the runtime. The parser can read text files and create native collections. This is a huge plus for Python!
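Here is a minimal sketch of the reload idea. It is an illustration, not the Meccano code: the module name format_config and the temporary directory are made up for the example, and it uses importlib.reload from current Python where older versions would call the builtin reload().

```python
import importlib
import os
import sys
import tempfile

# Hypothetical location for the configuration module.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "format_config.py")


def write_config(source):
    """Stand-in for someone editing the configuration file by hand."""
    with open(config_path, "w") as f:
        f.write(source)


# Skip bytecode caching so every reload compiles the current source text.
sys.dont_write_bytecode = True

write_config("formatData = ['a', ['b', 'c']]\n")
sys.path.insert(0, config_dir)
importlib.invalidate_caches()  # the file was created after startup

import format_config
first = format_config.formatData

# The configuration changes on disk; the client reloads before the next use.
write_config("formatData = ['a', ['b', ['d', 'e'], 'c']]\n")
importlib.reload(format_config)
second = format_config.formatData
```

After the reload, first still holds the old list and second holds the new one, so client code should look the structure up through the module each time rather than keeping its own copy.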

More Python benefits

Imagine that part of the nested structure is a class or function name. In Python, the configuration module is code so it can define classes and functions that are referenced directly in the data. Or you can import the module that defines the class or function, then put a reference to it in the data.

In Java, you would typically use separate compiled modules to define the classes and reflection to reference them. The use of reflection further complicates the configuration parser or the client code.

What if parts of the data are repeated? In Python, it's no problem! A repeated section of the configuration can be defined separately and included into the main structure by reference. With an XML representation, the data would likely be repeated in multiple locations in the file.
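Both ideas fit in a few lines. This is a sketch under assumptions, not my production rules: the formatter functions plain and shout, the (key, formatter) tuple layout and the name shared are invented for the example, and it is written for current Python rather than the version used in the listing above.

```python
data = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}


def plain(key, val):
    return '%s: %s' % (key, val)


def shout(key, val):
    return '%s: %s!' % (key.upper(), val)


# A section that appears twice is defined once and included by reference.
shared = [('d', plain), ('e', shout)]

# Each leaf is (key, formatter); the formatter is a direct function reference.
formatData = [('a', plain), [('b', plain), shared, ('c', plain), shared]]


def render(fmt, indent=0):
    """Walk the nested structure and return the formatted lines."""
    lines = []
    for item in fmt:
        if isinstance(item, list):
            lines.extend(render(item, indent + 2))
        else:
            key, formatter = item
            lines.append(' ' * indent + formatter(key, data[key]))
    return lines


lines = render(formatData)
```

No reflection and no name lookup is needed: the structure holds the functions themselves, and the shared list object is simply walked twice.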

This all works

I'm not just making this up for the sake of argument - these are all techniques I have used in production code. Python data structures are wonderfully flexible and easy to use!

Categories: Java, Python


#33 2004-04-26 08:24:32

Wired News: Diebold May Face Criminal Charges

If you think development of voting machines is somehow different from other software/hardware projects you should read this article.

Diebold, manufacturer of voting machines used in California, has been decertified by the state and referred to the state attorney general for possible civil and criminal charges under state election law.

The details sound very familiar to projects I have worked on that were struggling to meet a deadline. For example, a new peripheral was installed days before the election that was still being debugged. The peripheral failed in two counties causing the polls to open late. In addition, "Diebold...installed uncertified software on its voting machines in 17 counties without notifying state officials or, in some cases, even county officials who were affected by the changes."

This behavior is appropriate for a trade show, not a federal election. It's no way to run a democracy.


#32 2004-04-24 22:27:12

Agile Prophecies of Dr Seuss

Everything I need to know I learned from The Cat in the Hat.

Just read it :-)

Agile Prophecies of Dr Seuss

Categories: Agile


#31 2004-04-24 14:18:40

Don't Repeat Yourself

Don't Repeat Yourself and its special case Once and Only Once are two of the most important principles of good development. Read this essay for more.

Categories: Agile


#30 2004-04-23 08:30:56

Why I love Python

The code I wrote last night to build a Map of Maps shows one reason why I like Python so much - it is so easy to work with collections!

I wrote a sample app that shows the same thing in Java and Python. The requirement is to take a list of triples of strings in the form

[ language code, item id, localized string ]

and build a two level Map from language code => item id => data triple. Both examples include a simple test driver which prints:

This is a test
C'est un essai
no data
Another test
no data

Java version

Here is the Java version abstracted from the code I wrote last night. It is 56 lines and 1797 characters. The functional part of the code (excluding main()) is 38 lines.

import java.util.*;

public class Test {

  private Map _map = new HashMap();

  public Test(String[][] data) {
      // Convert the input data to a two-level Map from language code => course ID => locale data
      for (int i = 0; i < data.length; i++) {
          String[] itemData = data[i];

          String lang = itemData[0];

          Map langMap = (Map)_map.get(lang);
          if (langMap == null) {
              langMap = new HashMap();
              _map.put(lang, langMap);
          }

          String id = itemData[1];
          langMap.put(id, itemData);
      }
  }

  public String lookup(String lang, String id, String defaultData) {
      Map langMap = (Map)_map.get(lang);
      if (langMap == null) return defaultData;

      String[] itemData = (String[])langMap.get(id);
      if (itemData == null) return defaultData;

      String title = itemData[2];
      if (title == null || title.length() == 0)
          return defaultData;

      return title;
  }

  public static void main(String[] args) {
      String[][] data = {
          { "en", "123", "This is a test" },
          { "fr", "123", "C'est un essai" },
          { "es", "123", "" },
          { "en", "345", "Another test" }
      };

      Test test = new Test(data);

      System.out.println(test.lookup("en", "123", "no data"));
      System.out.println(test.lookup("fr", "123", "no data"));
      System.out.println(test.lookup("es", "123", "no data"));
      System.out.println(test.lookup("en", "345", "no data"));
      System.out.println(test.lookup("fr", "345", "no data"));
  }
}

Python version

And here is the Python version. It is 34 lines and 1036 characters. The functional part of the code (excluding main) is 17 lines. That is roughly 40% shorter than the Java version.

class Test:

  def __init__(self, data):
      # Convert the input data to a two-level Map from language code => course ID => locale data
      self._map = {}

      for itemData in data:
          lang, id = itemData[:2]
          self._map.setdefault(lang, {})[id] = itemData

  def lookup(self, lang, id, defaultData):
      itemData = self._map.get(lang, {}).get(id)
      if not itemData:
          return defaultData

      return itemData[2] or defaultData

if __name__ == '__main__':
  data = [
          [ "en", "123", "This is a test" ],
          [ "fr", "123", "C'est un essai" ],
          [ "es", "123", "" ],
          [ "en", "345", "Another test" ]
  ]

  test = Test(data)

  print test.lookup("en", "123", "no data")
  print test.lookup("fr", "123", "no data")
  print test.lookup("es", "123", "no data")
  print test.lookup("en", "345", "no data")
  print test.lookup("fr", "345", "no data")

I know which version I prefer!

So how come I'm using Java?

Sigh. I chickened out.

I work in a predominantly Java shop. The project I am working on could grow to a GUI app or a web app or both. I'm familiar with Java Swing, Jetty web server and servlets. I know that there is a great variety of mature tools available for writing Java web apps. I have support available at work if I need help.

On the Python side, I would have to learn wxPython and/or one of the Python web app frameworks like WebWork or Quixote. I don't get such warm fuzzy feelings about the completeness of these frameworks, both in features and release quality. I would be working on my own and out on a limb if I had any problems.

In the end, I decided it was too great a risk so I went with the safer solution.


Categories: Java, Python


#29 2004-04-22 23:07:44

What happened to the Python part?

Not much Python here recently, mostly Java. I am learning about J2EE and writing some code that will probably turn into a Java webapp at some point. I'm getting familiar with Eclipse again. This week I made some changes to my Jython app. So I haven't gone completely over to the dark side :-)

Categories: Python


#28 2004-04-22 22:50:40

10x speedup ain't bad!

I am working on some of the worst code I have seen in years. About the best thing I can say for it is, it's mercifully short - 800 lines of Java with eight comments. (The second best thing is that it isn't in VB. The last truly horrible code I worked on was a VB monster with a 4000-line loop and case statement at its heart.)

The main loop is 250 lines. I don't think the author understood recursion; there are four separate stacks that maintain state in the loop! I'm not sure yet but I think they will all be replaced by recursive calls.

The loop builds a 5 MB XML structure in a StringBuffer, then writes it out to a file! Um, why not just write it to a file directly? Well, the StringBuffers are on one of the stacks so I have to sort that out first.
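To show what I mean by recursion plus direct output, here is a small sketch. It is in Python rather than Java for brevity, and the (tag, text, children) node shape and the sample tree are invented for the example; the real program's data looks nothing like this.

```python
import io
from xml.sax.saxutils import escape


def write_node(out, node, depth=0):
    """Write one (tag, text, children) node and its subtree straight to out."""
    tag, text, children = node
    pad = '  ' * depth
    out.write('%s<%s>' % (pad, tag))
    if children:
        out.write('\n')
        for child in children:
            # The call stack tracks the nesting; no hand-made stacks needed.
            write_node(out, child, depth + 1)
        out.write('%s</%s>\n' % (pad, tag))
    else:
        out.write('%s</%s>\n' % (escape(text), tag))


tree = ('course', '', [
    ('title', 'Tom & Jerry', []),
    ('unit', '', [('page', 'Intro', [])]),
])

# StringIO stands in for an open file; any object with write() works.
out = io.StringIO()
write_node(out, tree)
xml_text = out.getvalue()
```

Because each node is written as soon as it is visited, nothing larger than one element is ever held in memory.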

It has wonderfully readable conditional code like this:

if (isNoLocale() == false) ...

And the clincher - the program uses 16,285 localizations. They are keyed by language code and id. So how do you think the program was accessing this data? It put it all in a List and searched sequentially for a match!!! Yikes!

When I first ran the program it took 191 seconds to create the file.

So today's wins were to

  • Refactor the CSV reader part of the code into a separate class and clean that up, including changing the line items from Maps to Lists. Time to generate the file: 119 sec.
  • Build a two-level Map so the localizations can be looked up directly instead of by exhaustive search. Time to generate the file: 19 seconds!

And what does this have to do with anything, anyway? Well, I have to rant every once in a while or I will have to rename my blog :-)

And what is the way out of this mess? Refactoring and unit testing, of course! I have three test files containing all the live data used by the program. Every time I make a change I regenerate them and test with XmlUnit. Now I can refactor without fear to get the program to a point where I can understand it.
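The idea behind this kind of regression test is simple: keep a known-good output file and compare every new run against it, ignoring differences in layout. The sketch below is not XmlUnit, which is a Java library; it is a rough Python equivalent using the standard ElementTree module, and the function names and sample documents are made up.

```python
import xml.etree.ElementTree as ET


def same_xml(a, b):
    """Compare tag, attributes, stripped text and children, in order.

    Text that follows a closing tag (the element's tail) is ignored.
    """
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or '').strip() != (b.text or '').strip():
        return False
    if len(a) != len(b):
        return False
    return all(same_xml(x, y) for x, y in zip(a, b))


def matches_reference(generated_xml, reference_xml):
    return same_xml(ET.fromstring(generated_xml), ET.fromstring(reference_xml))


reference = '<course id="1"><title>Intro</title></course>'
reformatted = '<course id="1">\n  <title>Intro</title>\n</course>'
changed = '<course id="1"><title>Outro</title></course>'

unchanged_ok = matches_reference(reformatted, reference)
regression_caught = not matches_reference(changed, reference)
```

A refactoring that only changes indentation still passes, while one that changes the data fails.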

Poor code structure is a performance issue! Because if you can't understand it, you can't find the bottlenecks. You can't even profile code that is all in one method.

I saw the same thing with the 4000-line VB loop. I started factoring out common code and eventually I could see where the performance problems were and do something about them.

Categories: Java


#27 2004-04-22 09:24:16

J2EE Design and Development, again

I really like this book. It is full of sensible, practical advice based on real experience.

For example Chapter 3 is about testing J2EE applications. The chapter starts with an introduction to test-driven development, JUnit and best practices for testing in general. Then the author reviews several freely available testing tools including Cactus, ServletUnit, HttpUnit and Web Application Stress Tool.

Each of these tools is presented in the context of testing one aspect of a J2EE application. Short code snippets illustrate the use of the tool.

I have a hard time putting this book down. Really! OK, so I'm a geek. I admit it. And he's not Michael Crichton. But I enjoy reading the author's opinions and advice and I am learning a lot.

Categories: Java


#26 2004-04-21 08:05:20

Alternative Enterprise Architectures

Here are some thought-provoking articles about enterprise application architecture:

PetShop.NET: An Anti-Pattern Architecture

This one is interesting because it contrasts an (IMO) overweight J2EE architecture (Sun's Java PetStore) with an under-architected solution (Microsoft's PetShop.NET). The author has an extreme bias towards very heavy architecture - there are as many packages in Java PetStore as there are classes in PetShop.NET! I think the PetStore architecture is an example of architecture for the sake of architecture, and the author approves. In his conclusion he says, "DotNetGuru is completely rewriting the PetShop.NET, with the intent of implementing a true N-tier architecture based on an agile design. This means that we will provide an implementation that would let the user choosing his best architecture by using Abstract Factory pattern between all the layers. It will be possible to use Remoting/WebService/Local calls in the service layer and real O/R Mapping tool or DAO in the Data tier just by changing configuration file." Yikes! Agile development anyone?

James Turner writes Why Do Java Developers Like to Make Things So Hard? The above article is a great example of what he is talking about.

I actually agree with many of the criticisms in the PetStore article, but I think his cure is as bad as the disease. There is a middle way, and agile methods will take you there!

The Way You Develop Web Applications May Be Wrong

In this article Anthony Eden argues for an extremely lightweight style of web application development. The comments debate various alternatives.

Simplifying Web Development: Jython, Spring and Velocity

Two of my favorites plus a new interest...what's not to like? :-) There is a link to a good presentation about Spring.

Categories: Java


#25 2004-04-20 10:28:16


I've just discovered the GraphViz package from AT&T Research. I had heard of the package before, but it was a release of pydot, a Python wrapper, that got me to look at it.

GraphViz makes it astonishingly simple to create nice-looking graphics. For example, to make a package dependency graph of the Java packages in Meccano, I created this graph definition:

digraph G {
  size = "3,3";
  node [shape=box,fontname="Helvetica"];
  blasterui -> blaster;
  blasterui -> swing;
  blaster -> converter;
  blaster -> writer;
  writer -> mockcourse;
  devscript -> coursedata;
  devscript -> word;
  converter -> devscript;
  converter -> mockcourse;
  mockcourse -> coursedata;
  coursedata -> util;
  server -> editor;
  server -> writer;
  editor -> util;
}

I saved the definition in a file. Next I ran dot from the command line with

dot -Tpng > dep.png

Here is the resulting image:


There is much, much more you can do with this package but this shows how easy it is to get a useful result.

pydot puts a Python wrapper around GraphViz. This could be useful for creating graphs from a program. For hand-created graphs it might be easier just to write the data file by hand as I did for the example.
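For simple cases a program can also write the graph definition itself. The sketch below deliberately does not use pydot, whose API I am not showing here; it just builds the same kind of text as my hand-written example from a dictionary. The function name to_dot and the sample dictionary are invented, and it assumes package names are plain identifiers that need no quoting.

```python
def to_dot(dependencies, size="3,3"):
    """Return DOT text for a {package: [packages it depends on]} mapping."""
    lines = [
        'digraph G {',
        '  size = "%s";' % size,
        '  node [shape=box,fontname="Helvetica"];',
    ]
    for package in sorted(dependencies):
        for target in sorted(dependencies[package]):
            lines.append('  %s -> %s;' % (package, target))
    lines.append('}')
    return '\n'.join(lines)


deps = {
    'blasterui': ['blaster', 'swing'],
    'blaster': ['converter', 'writer'],
    'writer': ['mockcourse'],
}
dot_text = to_dot(deps)
```

The resulting text can be saved to a file and passed to dot like any hand-written definition.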

I had to make a few changes to pydot to get it to work under Windows. There were problems with the PATH separator, the name of the .exe files, and text vs binary files. Here is a diff from version 0.9 that shows the changes I made:

Compare: (<)C:\Downloads\Python\pydot-0.9\ (26989 bytes)
   with: (>)C:\Python23\Lib\site-packages\ (27121 bytes)

<     for path in os.environ['PATH'].split(':'):
>     for path in os.environ['PATH'].split(os.pathsep):
>             elif os.path.exists(path+os.path.sep+prg + '.exe'):
>                 progs[prg]=path+os.path.sep+prg + '.exe'
<         dot_fd=open(path, "w+")
>         dot_fd=open(path, "w+b")
<         out=os.popen(self.progs[prog]+' -T'+format+' '+tmp_name, 'r')
>         out=os.popen(self.progs[prog]+' -T'+format+' '+tmp_name, 'rb')

Categories: Python


#24 2004-04-19 08:41:36

Continuous Design

I have written before about Growing a design. Key to the success of this technique is keeping your code clean using principles such as Don't Repeat Yourself and You Aren't Going to Need It.

In this article, Jim Shore chronicles his experience with this process. I particularly like the sidebar Design Goals in Continuous Design which summarizes much of what makes this technique work.

Categories: Agile


#23 2004-04-19 08:28:48

Inversion of Control Frameworks

Whenever you hide an implementation class behind an interface, you have the problem of instantiating the concrete instances of the interface and giving them to the client code.

There are several ways to do this:

  • The client code can instantiate the instance directly
  • The instance can be stored in a global resource such as a singleton or registry
  • The code that instantiates the client can also create the instance and pass it to the client
  • An instance can be created from a global property using reflection

Each of these techniques has disadvantages:

  • If the client creates the instance then you can't substitute a different implementation without changing the client, and the benefit of using the interface is reduced.
  • Using a global registry, singleton or property makes the client depend on the global facility which makes testing and reuse more difficult.
  • Reflection is complicated when the instance has dependencies of its own, for example it needs configuration data or depends on other interfaces.

A solution to this problem that is gaining popularity is to use a framework with support for Inversion of Control (or, as Martin Fowler calls it, Dependency Injection). With this technique, client code can be written with no dependencies on global resources. The framework takes care of initializing the required instances and providing them on demand.
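Stripped of any framework, the core of the technique is just passing the dependency in. The sketch below is in Python and shows plain constructor injection; it is not the API of Spring or PicoContainer, and every class name in it is invented for the example.

```python
class SmtpNotifier:
    """Production implementation (left unimplemented in this sketch)."""

    def notify(self, message):
        raise NotImplementedError('a real system would send mail here')


class RecordingNotifier:
    """Test double that just remembers what it was asked to send."""

    def __init__(self):
        self.messages = []

    def notify(self, message):
        self.messages.append(message)


class Monitor:
    """Client code: it is handed a notifier and never creates or looks one up."""

    def __init__(self, notifier):
        self._notifier = notifier

    def report_failure(self, server):
        self._notifier.notify('%s is not responding' % server)


def build_monitor(notifier_factory=SmtpNotifier):
    """Stand-in for the framework: the one place that picks the implementation."""
    return Monitor(notifier_factory())


# In a test, a different implementation is substituted without touching Monitor.
test_notifier = RecordingNotifier()
monitor = Monitor(test_notifier)
monitor.report_failure('www.example.com')
```

Monitor has no dependency on a registry, singleton or global property, which is what makes it easy to test and reuse.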

Martin Fowler has an article that explains the technique. Two frameworks that use this technique are Spring and PicoContainer.

Categories: Agile


#22 2004-04-14 20:42:40

XPath and dom4j

One of my favorite features of dom4j is its integrated XPath support. This essay has details.

Categories: Java


#21 2004-04-12 11:36:32

J2EE Design and Development

I just picked up a copy of J2EE Design and Development by Rod Johnson.

This looks like a good companion to Martin Fowler's book Patterns of Enterprise Application Architecture, which I am also reading. Johnson's book gives very specific and practical advice about how to use (or avoid) J2EE technologies to build enterprise applications. It is a nice contrast to Fowler, which is pretty abstract.

I stumbled over the book while looking at the Spring framework which also looks interesting. It uses Inversion of Control to stitch together pieces of an enterprise app.

J2EE Design and Development


#20 2004-04-07 15:35:28

Build Integrity In

Chapter 6 of Lean Software Development is Build Integrity In. This is a subject near and dear to me because I am passionate about quality.

Quality is free!

If you keep your codebase clean and expressive it will be supple, it will support your need for change as the project moves forward.

If you let the codebase get crufty and brittle, your progress will slow to a crawl as change becomes harder and bugs keep cropping up. It won't happen in the first release, maybe not even the second, but it will happen.

I have seen both kinds of projects and the clean ones are a whole lot more fun after they have been through a few release cycles.

Categories: Agile


#19 2004-04-07 08:58:40

Python dom4j?

I make no secret of my affection for Python and dom4j. I would love to find a Python XML package that is as powerful and easy to use as dom4j. The main requirements are

  • Simple API (no W3C DOM, please!)
  • Integrated XPath support
  • Good support for serialization to text

I like ElementTree but it has limited support for XPath. Any other suggestions?
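To give a feel for what path-based lookup in ElementTree looks like, here is a small sketch. It uses the xml.etree.ElementTree module from the current Python standard library, which may support more path syntax than the ElementTree release I have been using; the sample document is made up. It handles simple paths and attribute predicates, not the full XPath language.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<catalog>
  <item id="123">
    <title lang="en">This is a test</title>
    <title lang="fr">C'est un essai</title>
  </item>
  <item id="345">
    <title lang="en">Another test</title>
  </item>
</catalog>
""")

# Child steps with an attribute predicate on the last step.
english = [t.text for t in doc.findall("./item/title[@lang='en']")]

# A predicate in the middle of the path.
item_345 = doc.find("./item[@id='345']/title").text

# Descendant search from the root element.
french_anywhere = [t.text for t in doc.findall(".//title[@lang='fr']")]
```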

Categories: Python


#18 2004-04-06 19:49:20

Python to the rescue!

Today my boss came to me needing an application to ping our company web servers and send email to the admin if they don't respond. And he needed it fast - in a few hours. Python to the rescue! I told him no problem and whipped together a script from httplib and smtplib. Tonight it's in production! :-)
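The script is not reproduced here, but the shape of it is easy to sketch. What follows is an assumption-laden outline, not the production code: it is written for current Python, where httplib has become http.client, and the function names, host names and addresses are all placeholders.

```python
import http.client
import smtplib
from email.message import EmailMessage


def server_responds(host, port=80, timeout=10):
    """Return True if the server answers HEAD / with a non-5xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request('HEAD', '/')
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure or a garbled reply.
        return False
    finally:
        conn.close()


def build_alert(host, sender, admin):
    """Compose the warning message for one unresponsive server."""
    msg = EmailMessage()
    msg['Subject'] = 'Web server %s is not responding' % host
    msg['From'] = sender
    msg['To'] = admin
    msg.set_content('The monitor could not get a response from %s.' % host)
    return msg


def check_and_alert(hosts, sender, admin, mail_host='localhost'):
    """Ping every host and mail the admin about each one that is down."""
    for host in hosts:
        if not server_responds(host):
            with smtplib.SMTP(mail_host) as smtp:
                smtp.send_message(build_alert(host, sender, admin))


# Building a message touches neither the network nor the mail server.
alert = build_alert('www.example.com', 'monitor@example.com', 'admin@example.com')
```

Run check_and_alert from a scheduler every few minutes and you have the whole application.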


#17 2004-04-06 19:42:56

Jython wins for now

I recently wrote about whether to use Jython or Python for my next project. For pragmatic reasons I have decided on Jython. Mostly this is because I want to leverage the work I did on the last project:

  • There are several pieces I might want to reuse including a styled editor and a nice toolkit for putting together editing panels.
  • Java Web Start is working pretty well as a deployment strategy.
  • I'm pretty good with Swing now and I would have to learn a new GUI toolkit to use Python.
  • I really do like dom4j; I don't think any of the Python XML toolkits compare. ElementTree is the closest I know of, but its XPath support is limited.
  • And Velocity, and Jetty :-)

Next time, who knows?


#16 2004-04-05 19:49:20

Deliver as Fast as Possible

Chapter 4 of Lean Software Development is Deliver as Fast as Possible.

Pull systems

The authors talk about pull systems where production is driven by customer demand trickling up the supply chain. This is fundamental to modern low-cost manufacturing; for example it is part of Michael Dell's special sauce.

In software development you can do this too. Focus on delivering what the customer wants as soon as possible. If you can find out what the customer wants and give it to them quickly, you will have a happy customer!

In agile development, iterations are short and focused on delivering value to the customer. Each iteration should deliver the most important remaining feature of the project.

Queuing Theory

The authors give a quick summary of queuing theory. What I take away from it is that you get best throughput when you process in small batches and don't run your plant at capacity. The apparent optimization of running your equipment hard actually results in lower overall value for the process.

Again this points to short iterations. It also makes a good case for not driving your developers too hard :-) "Just as a highway cannot provide acceptable service without some slack in capacity; so you probably are not providing your customers with the highest level of service if you have no slack in your organization." (p. 81)

This topic comes up again in Chapter 7 in the discussion of optimization.
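The effect of running at capacity is easy to see with a little arithmetic. The book's own treatment is not reproduced here; this sketch assumes the simplest textbook model, a single server with random arrivals and service times (M/M/1), where the average time a job spends in the system is the service time divided by (1 - utilization). It illustrates only the utilization point, not batch size.

```python
def time_in_system(service_time, utilization):
    """Average time in an M/M/1 queue: service_time / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError('utilization must be at least 0 and less than 1')
    return service_time / (1.0 - utilization)


# With a service time of 1, the result is the slowdown factor a job sees.
slowdown = {u: time_in_system(1.0, u) for u in (0.5, 0.8, 0.9, 0.99)}
```

At 50% utilization a job takes about twice its service time, at 90% about ten times, and at 99% about a hundred times. That is the case for slack in a nutshell.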

Categories: Agile


#15 2004-04-05 08:37:20

Mozilla Amazon Browser

Here is an amazing demonstration of what you can do with Mozilla and XUL. It is a fast, full-featured Amazon search app implemented in XUL and JavaScript.

Mozilla Amazon Browser


#14 2004-04-05 08:28:48

The Secret Source of Google's Power

A fascinating glimpse into the infrastructure that makes Google work.

The Secret Source of Google's Power


#13 2004-04-04 14:37:52

Decide as late as possible

Chapter 3 of Lean Software Development is Decide as Late as Possible. This principle is based on two simple ideas:

  • Change is expensive
  • The later you make a decision, the more information you have, and the more likely it is to be correct

So if you keep your options open as long as possible, you will make better decisions and you will be less likely to have to change your decision later.

The XP maxims You Aren't Going to Need It and Do The Simplest Thing that could Possibly Work can both be seen as techniques for delaying decisions. If you wait until you know you need something, you have waited until the last possible moment to make a decision. Similarly if you leave something out you have avoided making decisions about the right way to do it.

Categories: Agile


#12 2004-04-03 21:29:20

Essays on agile development

I have started writing about my experience with agile software development. Click /stories/ to read them.

Categories: Agile


#11 2004-04-02 16:30:40

The importance of feedback

Feedback is a consistent theme of Chapter 2 of Lean Software Development. Feedback is a key element of any control system. By adding feedback and shortening feedback cycles you improve the responsiveness of the system.

Many agile practices can be seen as increasing feedback. Unit and acceptance tests, frequent builds, frequent release and close contact with the customer all improve feedback and improve your chance of a successful outcome.

Categories: Agile


#10 2004-04-02 08:00:48

Groovy JSR approved

This is old news, but in case you haven't noticed, the Groovy JSR was approved unanimously.

Groovy JSR approved

Categories: Java


#9 2004-04-02 00:45:36

Deploying Python with wxPython

I've been playing around with Python and wxPython. I wrote a little application that shows a tree view of some data. I packaged it up with py2exe. The resulting distro is 8MB!! For a 100-line program! The entire Meccano2 distro is 6 MB including over a megabyte of help files.

It really makes me think twice about wxPython...

OK I'm not really being fair in the comparison as the Meccano distro doesn't include Java while the Python one does include both Python and wxWindows. But where I work if I'm going to distribute a Python app I probably have to include Python with it.

Categories: Python


#8 2004-04-01 23:26:40

Jython + dom4j = High octane development

Jython and dom4j make a great combination. This article tells why.

Jython + dom4j = High octane development

Categories: Python, Java


#7 2004-04-01 19:46:56

Deliver value to the Customer

Ultimately software development is about delivering value to the customer. If you don't give the customer something that solves a real problem then you have failed. And anything you are doing that doesn't help to deliver value is probably a waste of time and effort.

Lean Software Development talks about several kinds of waste including

  • Partially done work. At first this doesn't seem like waste; after all, you expect it to be useful. But it is like inventory on the shelf - you have paid for it but you haven't received anything in return. Until you deliver the work to the customer, it has no value. This is one reason to favor short iterations - they keep you from building up inventory.
  • Extra features. You pay for these coming and going. You pay for them up front by developing them. You pay for them downstream by testing, documenting and maintaining them. But if the feature isn't valued by the customer you never get paid for it.

Categories: Agile


#6 2004-04-01 19:44:48

Lean Software Development

I've been reading Lean Software Development by Mary and Tom Poppendieck. I like it a lot. Of course part of why I like it is I agree with much they have to say :-)

The book talks about agile development from an interesting perspective. The authors have experience in several areas of product research and development including new car design and product development at 3M. They show how the lessons learned in these areas can be applied to software development.

Categories: Agile


#5 2004-04-01 16:52:00

Code faster with Python

Bruce Eckel weighs in with his estimate that he codes 5-10 times faster in Python than in Java. A few quotes:

"I have regular experience in both languages, and the result is always: if you want to get a lot done, use Python."

While learning Python, "the experience of being dramatically more productive in Python repeats itself over and over."

This matches my experience exactly. You just don't get it until you try it, then you don't want to go back.

Code faster with Python

Categories: Java, Python

© Kent S Johnson Creative Commons License

Comments about life, the universe and Python, from the imagination of Kent S Johnson.
