Monday, August 07, 2006

Data Monopoly, but open n-grams

I know people in the GIS community have been talking about the DigitalGlobe data deal with Google for a while, but it is interesting to see it begin to intersect with Tim O'Reilly's Open Data concept... One thought: at least when Google licenses data, they show it to you, which is more than you can say when the govt gets its hands on things. Still, an exclusive license on data does sound somewhat monopolistic. Will a future Google antitrust case be fought over data?

"....portions of DigitalGlobe's imagery data has been licensed by Google for exclusive use in Google Earth/Maps.

...reminding us of Tim's argument that Data is the next Intel Inside -- a source of competitive advantage. The question is of course whether this competitive advantage should be granted for exclusive use or whether the data itself will eventually be regulated by anti-competition laws."

The killer advantage of Google Earth is the data. It's a great application, but the cheap data is key.

Google has lots of data. In one of my previous jobs, in computational linguistics, we would have had a lot of fun with their release of n-gram data (really, it's 1,146,580,664 5-grams). I have been quite impressed with Google's statistical machine translation... I am looking forward to what they do with information extraction. Hey, at least the data is free. (And it's not the personal data AOL is giving out. Poor old AOL... trying so hard to get "with it". Bye!)