Thursday, January 18, 2007

Manual Reverse Geocoding

So this guy's letter was delivered without an address, just a map. Reverse geocoding at work. Humans rock: it's dead hard to get a computer to do this. Actually, maybe I should just say the UK Post rocks; I can't see our USA civil servants doing anything with this besides sending it to the DLO or trying to arrest the person who sent it for subversive activities.

One of the worst experiences in my life was spending three days geocoding a dataset of points for Wien (aka Vienna, Austria) with ArcView 2.1. (ArcView would try to suggest a point, and I would try to move it to the right place.) House numbers there don't follow a very organized pattern, and I hope the errors I inevitably must have made didn't get anyone killed. Or didn't get the wrong person killed, or anything like that.

link via: /usr/bin/girl, which incidentally is one of the first blogs I ever read.


Jeremy said...

That is awesome, Matt. Good stuff.

What is your take on the JSON data interchange format? Not just for passing data to and from JavaScript, but for passing data from server A to server B in an efficient way.


Matt M said...

I think JSON is most excellent when dealing with JavaScript serialization. I haven't had the urge to use it outside of that context, but it seems like it would be simpler and more transparent than, say, XML, CORBA, JMS, or EJB.

In terms of efficiency, it's always one of those "it depends" kind of questions. If you are passing a huge amount of data, XML is going to be less efficient simply because of its inherent redundancy and verbosity. However, you would probably be using some sort of compression on the data interchange, so would the redundancy of XML lend itself to compressing down to a smaller size?
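One way to poke at that question is to measure it. Here is a rough sketch, assuming made-up point records and an element-per-field XML layout; the helper names (make_records, to_json_bytes, to_xml_bytes, compare_sizes) are mine, not from any library. It only reports sizes, and the numbers will shift with the shape of the data, so treat it as an experiment rather than an answer.

```python
import json
import zlib
import xml.etree.ElementTree as ET


def make_records(n):
    # Hypothetical sample data: n fake address points.
    return [
        {
            "id": i,
            "name": f"point-{i}",
            "lat": round(48.2 + i * 0.001, 6),
            "lon": round(16.37 + i * 0.001, 6),
        }
        for i in range(n)
    ]


def to_json_bytes(records):
    return json.dumps(records).encode("utf-8")


def to_xml_bytes(records):
    # One <record> element per row, one child element per field.
    root = ET.Element("records")
    for rec in records:
        node = ET.SubElement(root, "record")
        for key, value in rec.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def compare_sizes(n=1000):
    # Byte counts for the same records, before and after zlib compression.
    records = make_records(n)
    raw_json = to_json_bytes(records)
    raw_xml = to_xml_bytes(records)
    return {
        "json_raw": len(raw_json),
        "xml_raw": len(raw_xml),
        "json_deflated": len(zlib.compress(raw_json)),
        "xml_deflated": len(zlib.compress(raw_xml)),
    }


if __name__ == "__main__":
    for label, size in compare_sizes().items():
        print(f"{label}: {size} bytes")
```

The raw XML comes out larger here because every field name is written twice, as an opening and a closing tag. Whether compression closes the gap is exactly what the script lets you check against your own data.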

One thing XML has that is nice is a number of different schema languages (DTD, XML Schema, RELAX NG) to describe its content with varying degrees of precision.

The one thing that always seemed quite limiting to me was the non-relational character of the data. For example, if I wanted to send a database record that was chock full of foreign keys, most formats don't seem to handle that very well, having no concept of a "pointer". I'd like to see a good way of handling that in JSON, some kind of object identity. Something like the Correlation Identifier pattern in Enterprise Integration Patterns, but for the internal contents of the message.
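To make the object identity idea concrete, here is a minimal sketch. It assumes a made-up convention in which each object gets a "$id" key the first time it is written and later occurrences are written as {"$ref": n}. Neither key is part of JSON itself, and encode_with_refs and decode_with_refs are hypothetical helper names.

```python
import json


def encode_with_refs(obj):
    # Serialize nested dicts and lists; a dict seen before becomes {"$ref": n}.
    # "$id" and "$ref" are an invented convention, not part of JSON.
    seen = {}  # id(dict) -> assigned identity number

    def walk(node):
        if isinstance(node, dict):
            if id(node) in seen:
                return {"$ref": seen[id(node)]}
            seen[id(node)] = len(seen) + 1
            out = {"$id": seen[id(node)]}
            for key, value in node.items():
                out[key] = walk(value)
            return out
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return json.dumps(walk(obj))


def decode_with_refs(text):
    # Rebuild the structure so that each "$ref" points at the same dict object.
    by_id = {}

    def walk(node):
        if isinstance(node, dict):
            if set(node) == {"$ref"}:
                return by_id[node["$ref"]]
            out = {}
            by_id[node["$id"]] = out  # register before recursing so cycles resolve
            for key, value in node.items():
                if key != "$id":
                    out[key] = walk(value)
            return out
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(json.loads(text))


if __name__ == "__main__":
    city = {"name": "Wien"}  # the shared "foreign key" target
    points = [{"house": "12", "city": city}, {"house": "14", "city": city}]
    restored = decode_with_refs(encode_with_refs(points))
    print(restored[0]["city"] is restored[1]["city"])
```

The shared record is written once and referenced afterwards, so the receiver gets back one object rather than two copies. Both sides have to agree on the convention, which is the price of JSON having no built-in notion of a pointer.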

Matt M said...

I also meant to say that YAML provides this pointer concept quite well, through its anchors and aliases.