Sun’s Tim Bray has kicked off an interesting cross-blog conversation recently. He calls it the Wide Finder Project, and the basic issue is this: we’re moving toward a future of many CPUs with many cores, each running at (relatively) low clock rates. What are the interesting computer science and software development challenges this raises? How can we take advantage of architectures like this when dealing with parallelism is just so … painful using today’s paradigms and tools?
One thing I find fascinating about the discussion is how it’s coming from a strategic and futures-based motivation, but it’s taking place with a real roll-up-your-sleeves hacking ethic. Tim postulated a simple, almost a toy, problem — parsing Apache log files. Tim and others are exploring this simple problem and how currently-available languages and language features affect how easy it is to take advantage of multi-CPU, multi-core architectures to rip through the file like a chainsaw through wet cardboard.
Tim started the ball rolling with Erlang (conclusion: wicked cool, but the I/O and regexp libraries aren’t up to snuff — likely a solvable problem) and others have run with it from there.
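To make the shape of the problem concrete, here’s a minimal sketch in Python (not one of the actual Wide Finder entries): it splits a log file’s lines across worker processes and merges the per-worker tallies. The regex and the file name are illustrative placeholders, not Tim’s actual pattern, and the whole-file read is the naive version; the serious entries do cleverer things like splitting on byte offsets.

```python
import re
from collections import Counter
from multiprocessing import Pool

# Illustrative pattern: tally GET requests by URI in an Apache access log.
# Tim's real benchmark matched his blog's article URI scheme; this is a stand-in.
PATTERN = re.compile(r'GET (/\S+) HTTP')

def count_chunk(lines):
    """Tally matching request URIs in one chunk of log lines."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def wide_find(path, workers=4):
    # Naive approach: slurp the whole file, hand one contiguous
    # slice of lines to each worker process.
    with open(path) as f:
        lines = f.readlines()
    size = max(1, (len(lines) + workers - 1) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with Pool(workers) as pool:
        partials = pool.map(count_chunk, chunks)
    # Merge the per-worker counters and report the top ten.
    total = Counter()
    for c in partials:
        total.update(c)
    return total.most_common(10)

if __name__ == '__main__':
    for uri, hits in wide_find('access.log'):
        print(hits, uri)
```

The interesting part isn’t the counting, it’s the split-then-merge structure: that’s the bit today’s mainstream languages make you wire up by hand, and it’s exactly what Tim is probing with Erlang and friends.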
So why bring up the Wide Finder problem on a virtualization blog? Because I ran across Kevin Johnson’s blog entry A Pile of Lamps.
He starts off in an earlier entry by scaring himself:
At the risk of sounding like a pessimist, I think we’ll end up with thousands of little SOA web services engines. Each one handling a single piece. Each one with its own HTTP stack. Each one using PHP/Perl/Ruby/etc to implement the service functions. Each one sitting on top of a tiny little mysql database. Eeeep! I just scared myself – better drop this line of thought. I’ll have nightmares for weeks.
Kevin points to Andrew Clifford’s The Dismantling of IT, which is not talking about v12n per se, but certainly fits into the picture we’re drawing here:
The most obvious change is that the new architecture would remove technical layers, such as databases and middleware. These capabilities would of course still exist, but they could be standardised and hidden inside the systems. They would not need so much management, and we would need fewer specialists.
Mark Masterson urges him to reconcile with our future world of cooperating tiny little machines, all busy message-passing and presumably acting somewhat autonomously to avoid the nightmare management burden. Sounds a bit like a job for … virtualization and resource pools? Or as Kevin puts it:
Is the answer a combination of LAMP, embedded computing, cluster management, and virtualization?