Friday, April 28, 2006

World Wind and the changing nature of databases

Tim O'Reilly (the computer book guy) has been doing a series of posts over on Radar about the changing nature of databases. In this one, he looks at NASA World Wind. (Which he calls "an open source program that does many of the same things as Google Earth"...) It points to some of the problems with flat file storage- "using file stores, especially when a large number of files are present (millions) has proven to be fairly inconsistent across multiple OS and hardware platforms."

One of the themes of the series is looking at places where flat file storage is more efficient than databases. This goes against the default decision of "throw everything into the RDBMS." I ran into this a while ago when I was having discussions with another architect (Steve) and mentioned that my application could get to all kinds of employee data by using the LDAP server, whereas the enterprise architects were telling him he had to pull the data from their excuse for a data warehouse. [Yes, pulling the data directly from the warehouse for OLAP systems...] Steve got all excited about this, and I didn't know why. It turns out that accessing the data via LDAP took only 15% of the time that it took to pull from the highly normalized "data warehouse".


Other posts are on Flickr, Bloglines and Memeorandum, and Second Life.

Thursday, April 27, 2006

"Rogue" SaaS users



Ed Sims on DIY in the Enterprise got me thinking about this services thing from the other angles again.

Your IT department can probably block the installation of client applications, but they're going to have a harder time taking away people's web browsers. Subscribing to Software as a Service (SaaS) is one way of dealing with slow/unresponsive/restrictive CIO orgs (at least for browser based apps). I have been several places where local installs were prohibited- even for developers. It makes it tough to try out new tools. Which makes it tough to improve things. Which makes it tough to innovate, et cetera.

Obviously, there are good security and stability reasons for not letting everyone install whatever they want. But I have found that organizations who make that restriction universally lack a way to get software X quickly approved on a trial basis. I realize getting a corporate lawyer to read an EULA is probably a good idea, but it shouldn't be a pre-req to updating software to fix bugs.

One thing I don't get is why so many SaaS offerings have to be ONLY services. A lot of enterprises I have worked with would like to take the applications that are being run in SaaS fashion and bring them inside the firewall for secure/managed use. A lot of people are developing applications where something would be better off as a component than a service. This is particularly an issue with depending on SaaS startups- you don't want to wake up and find they've run out of cash and you can't run your business. I still support an old application I wrote using a Lotus Notes software component from a company that was acquired by a competitor who discontinued the product in 1998. If this had been offered as a service- my app would have been dead for years.

I like the idea of corporate/government users using SaaS in a non-central-committee fashion (rogue user alert), but the SaaS vendors should then make a behind-the-firewall product available so that it can be transitioned to a controlled environment.

I think there is a definite analogy to mapping services. Sure, for 99.999% of situations, it's probably cool to depend on Google to provide continuous service for Google Earth Pro. However, if you have your own massive proprietary dataset, need some local control over things, etc- you are going to want to go to Enterprise and bring it behind the firewall. They even offer a hybrid solution- to layer your vectors on top of their imagery/terrain. Just offering any one of these three things probably would be a non-starter, but they've definitely garnered their reputation from the awesome quality of their SaaS.

Sketch Up, free

Beyond the general awesomeness of the announcement of freeness- it's a great place to practice Ruby (if not Ruby on Rails). The features are mostly the same between the pro and free versions, but the $495 pro is required for commercial/government use.

Free SketchUp.

If you haven't checked out the 8 hour trial- there's no reason not to now...I can't wait to see what starts showing up on the 3D Warehouse!

[comments disabled- spammers- f u.]

Wednesday, April 26, 2006

Lucene and GData


Doug Cutting- the genius behind Lucene- put a proposal out for Google's Summer of Code for someone to build a GData server on top of Lucene. Interesting to see how this could turn into a pretty cool interface for Lucene. It's also a picture of how GData could become a usable generic standard for search interfaces. I guess it would be comparable to JSR-170, if they added versioning...

Now...we just need GData support on the Google Search Appliance.

SummerOfCode2006 - General Wiki
The Google Data API extends Atom to present a simple, searchable database. A Lucene-based implementation could be provided as a Java .war file containing a few servlets.

VC to portfolio: "Get some REST"

Brad Feld has been encouraging his portfolio companies (Rally, Feedburner, NewsGator) to support REST and SOAP. It's a really good idea- it lowers the bar for integration dramatically.

Some of the portfolio companies I work with in my consulting gig for a VC firm should see this same light.

Monday, April 24, 2006

SOA- meaning less every day


I keep linking to Fowler's "bliki" on Service Oriented Ambiguity because it is one of the best analyses of the subject. And it's kinda funny:

"I've heard people say the nice thing about SOA is that it separates data from process, that it combines data and process, that it uses web standards, that it's independent of web standards, that it's asynchronous, that it's synchronous, that the synchronicity doesn't matter...."

Fowler links to a guy called David Ing. I don't know much about Mr. Ing (I don't know much about Fowler either, except that he is an undeniable genius and the guy who got me actually thinking about software), but he has some sharp thoughts. I clicked on his link from Fowler's blog today and it took me to his latest collection-of-thoughts type post. It reminds me of Wittgenstein, in the style of sequential ideas.

A sample:
"- WS-* Specs feel 'top-down'. They take abstract concepts that you *may* need at some point and unify them from a few different levels. This unification is beneficial if you use all the concepts, but conversely expensive in terms of complexity if you don't."

I whine about the WS-* specs a lot. I sometimes understand the point of a spec- to define a common operation across systems. They just go too far. And XML Schema doesn't help...where's the RELAX-NG?

Here's my simple "Web Service" spec:
Make it so that every application in your organization returns an html view of an object if you send a GET to:
http://site/view?id=n
And make them return the same thing in XML if you send a GET to:
http://site/view?id=n&format=xml
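
For the curious, here is a minimal Python sketch of that two-URL convention. Everything in it is my own illustration and not any particular framework's API: the RECORDS dictionary is a hypothetical stand-in for your application's real objects, and render_view is a made-up name. It only shows the shape of the idea- same object, HTML by default, XML when format=xml is on the query string.

```python
from urllib.parse import urlparse, parse_qs
from xml.sax.saxutils import escape, quoteattr

# Hypothetical in-memory store standing in for the application's real objects.
RECORDS = {"42": {"name": "Example widget", "owner": "jdoe"}}


def render_view(url):
    """Return (content_type, body) for a GET to /view?id=n[&format=xml]."""
    query = parse_qs(urlparse(url).query)
    object_id = query.get("id", [""])[0]
    record = RECORDS.get(object_id)
    if record is None:
        return ("text/plain", "not found")

    # format=xml switches the representation; anything else falls back to HTML.
    if query.get("format", ["html"])[0] == "xml":
        fields = "".join(f"<{k}>{escape(v)}</{k}>" for k, v in record.items())
        return ("application/xml", f"<object id={quoteattr(object_id)}>{fields}</object>")

    rows = "".join(f"<dt>{escape(k)}</dt><dd>{escape(v)}</dd>" for k, v in record.items())
    return ("text/html", f"<html><body><dl>{rows}</dl></body></html>")
```

Hooking render_view up to a real web server is left out on purpose- the point is that both representations come from the same lookup, so supporting the XML view costs almost nothing once the HTML view exists.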

Every place that I have seen where they do that- people actually use the API. Is it a service?

Who cares?

Saturday, April 22, 2006

IE Tab


Well, the name almost says it all. Open IE in a Firefox tab. For those persistent and ever so incompatible sites. It gives you a little button for your toolbar to reload the current page in IE, then the button changes to a little one that lets you bring it back to Firefox. You can middle click to open the site in the other renderer in a new tab.

I realize this has been around for a while, but it's new to me and solved a problem for one of the more important users that I serve. I was using a little bookmarklet to launch IE, but this is far superior. Plugins are a wonderful thing.

Friday, April 21, 2006

Rename published method refactoring...


Jason Yip points to a cool new refactoring feature in Eclipse 3.2 M5. For reference, the refactoring support in Eclipse already handles all of the messiness associated with renaming a method- updating references to the method, updating your comments, etc. However, that was all for the references inside your own codebase. It now can keep the old method name around as a pointer to the new method name. This gives you the ability to broadcast the change to any external consumers of that method via a deprecation warning- letting them know not to depend on the existence of that method for much longer, while still allowing you to keep the code in just one place. Don't repeat yourself!
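
The pattern itself is not tied to Eclipse or Java. Here is a rough sketch of the same idea in Python (not what Eclipse generates- just the shape of it). ReportService, fetch_report and get_report are hypothetical names I made up for illustration: the old published name stays around as a thin pointer to the new one and warns callers on the way through.

```python
import warnings


class ReportService:
    def fetch_report(self, report_id):
        """New name: the only place the real logic lives."""
        return f"report-{report_id}"

    def get_report(self, report_id):
        """Old published name, kept only as a pointer to fetch_report."""
        warnings.warn(
            "get_report is deprecated; use fetch_report instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the external caller's line
        )
        return self.fetch_report(report_id)
```

External callers keep working, they get told to move, and the implementation still lives in exactly one method.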

Now, am I going to mess around with upgrading to a development version of Eclipse to get this? Might be too risky at the current juncture. I am going to have to check out the buglist.

Anticipating Customer Wants



Carl has a good little post referencing Mark Cuban's concept of building the product your customers are going to want versus what they tell you they want now.
Carl said-
"A friend and I were walking down this path earlier this month discussing how all of the applications that we have written over the years were successful if the user's did not have that much input. This totally goes against the grain of Agile methodologies."

There are a couple of things going on here. One is that a big difference between commercial product development and enterprise custom development is that in the enterprise your customers are already identified and assigned to you. Still, I have always found it best to make enterprise custom development as similar to product development as is reasonable. The reasons for this are numerous, but the unanticipated expansion of your customer base and supporting new functions of the existing customer base are two prominent ones. In this analysis, it does make sense to do some generalized product style development in the enterprise versus just building what the customers you have access to say they want.

In my mind- agile software development methodologies, where you only really do what is most important in the current iteration, are actually better than up front requirements definition methodologies where you plot out precisely what you are going to do for the next year or two. In the up front case, you basically guarantee that you aren't going to be able to adjust to what the customer is going to want when you figure it out halfway through the budgeted schedule.

It does point to a fundamental problem with the whole concept of requirements gathering in the enterprise custom development context. Who is representing the customers that aren't at the table? This is why treating the project as a product- and having someone function in a product management role is really important. If you have someone that is an advocate for the product itself- you can allocate some of the budgeted time/money to making it a better product.

So, my solution to the problem of representing the interests of users that aren't at the table (including the "future users") is to have someone appointed to serve that role, and give them a "point budget" in each iteration or release. In my current project, we aren't really doing that, but we have a certain number of points reserved for the developers' own choice of what to work on (could be reducing technical debt). In our agile development process we are using points to represent the amount of work we are capable of completing in a certain time period, and allocating some of those points to various stakeholders. It's only logical to think of the future of the product as a stakeholder in its current inception.
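
To make the arithmetic concrete, here is a small Python sketch of splitting an iteration's points among stakeholders. The function name, the stakeholder names and the percentages are all hypothetical- they are not our actual numbers, just one way the split could be computed.

```python
def allocate_points(capacity, shares):
    """Split an iteration's point capacity among stakeholders by percentage share."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")

    allocation = {name: capacity * pct // 100 for name, pct in shares.items()}

    # Integer division can leave a few points unassigned; hand them to the largest share.
    leftover = capacity - sum(allocation.values())
    largest = max(shares, key=shares.get)
    allocation[largest] += leftover
    return allocation


# Hypothetical split: most points to the customers at the table, the rest reserved.
iteration = allocate_points(
    40,
    {"current customers": 70, "developers' choice": 15, "product advocate": 15},
)
```

The "product advocate" slice is the one that speaks for the users who aren't in the room yet.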

This differs a bit from the whole Getting Real thing (where the product manager has total control to select from the list of desired features), but it seems more workable in a typical office politics environment where everyone wants their influence to be recognized.

Thursday, April 20, 2006

Calendar API

Now there's a calendar API [code blog], [general blog]. Should I write my own Outlook sync program that does what I want or wait for someone else? Have to balance having it act like I want it to vs. the time investment.

One thing that is making it look like something worth bothering with is the emerging concept of GData. When I first heard of it, I thought it was going to be something like RDF (Resource Description Framework). It's more of a combination of REST and Atom, with some query and update capabilities. [the API] The thing that Ray Ozzie is working on over at Microsoft is going in this same direction, with more focus on sync, which is something he has down cold. I think Google are making an effort to have the growing array of APIs be somewhat consistent. If all of the APIs move to this kind of model, it could have a simplifying effect on development.

In any case, the common kinds schema is a pretty basic model of the world from the Google perspective. It has a couple of location specific concepts:

A place (such as an event location) associated with the containing entity. The type of the association is determined by the rel attribute; the details of the location are contained in an embedded or linked-to Contact entry. A gd:where element is more general than a gd:geoPt element. The former identifies a place using a text description and/or a Contact entry, while the latter identifies a place using a specific geographic location.

gd:geoPt

A geographical location (latitude, longitude, elevation).
Examples



Schema
start = geoPt
geoPt =
  element gd:geoPt {
    attribute label { xs:string },
    attribute lat { xs:float },
    attribute lon { xs:float },
    attribute elev { xs:float }?,
    attribute time { xs:dateTime }?
  }
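
Here is a quick Python sketch of building an element that follows the schema as quoted above, using only the standard library. This is my own illustration, not one of Google's examples: make_geo_pt is a made-up helper, the coordinates are just illustrative values, and I am assuming the gd: prefix is bound to the http://schemas.google.com/g/2005 namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: this is the namespace URI the gd: prefix is bound to.
GD_NS = "http://schemas.google.com/g/2005"
ET.register_namespace("gd", GD_NS)


def make_geo_pt(label, lat, lon, elev=None, time=None):
    """Build a gd:geoPt element; elev and time are optional in the schema above."""
    attrs = {"label": label, "lat": repr(float(lat)), "lon": repr(float(lon))}
    if elev is not None:
        attrs["elev"] = repr(float(elev))
    if time is not None:
        attrs["time"] = time  # caller supplies an xs:dateTime formatted string
    return ET.Element(f"{{{GD_NS}}}geoPt", attrs)


# Illustrative values only.
point = make_geo_pt("Denver", 39.7392, -104.9903, elev=1609.0)
xml_text = ET.tostring(point, encoding="unicode")
```

Nothing here validates against the schema- it just shows how little there is to the element: a label, a lat/lon pair, and two optional extras.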