May 15, 2008
CouchDB Incremental Reduce
I just checked in the first cut of incremental reduce for CouchDB into the Apache repository. Right now it allows you to do a full reduction of all view values between the optional start key and end key. Both view updates and queries have logarithmic cost.
May 9, 2008
Catapult Operator
If you lived in the Dark Ages, and you were a catapult operator, I bet the most common question people would ask is, 'Can't you make it shoot farther?' No. I'm sorry. That's as far as it shoots. - Jack Handey
May 8, 2008
HDR
This is my first ever try at HDR photography.
April 27, 2008
CouchDB Roundup
Jan summarizes recent happenings: Another week (or two) in CouchDB
April 25, 2008
Gwendolyn
Me and my oldest. I'm under blankie.
Roseanna
Me and my littlest, about 2 minutes ago.
April 21, 2008
RubyFringe
So I've been meaning to write about a conference I'll be going to, RubyFringe in Toronto, July 18-20. I'm looking forward to it, partially because it's described as "Deep nerd tech with punk rock spirit" and the speakers sound pretty interesting. Most conferences are about as fun as a trip to the bank.
At ETech '08 I met Pete Forde and some other people involved with RubyFringe. We hung out a bit and had some crappy barbeque with people like Anil Dash and Dan Grigsby. It was a really good time and I'm looking forward to being around the types of geeks who would come to RubyFringe.
RubyFringe will be pretty informal. So I'm planning on wearing pajama pants the whole conference, and I'm even going to give my talks in pajama pants. Unless it's too hot, in which case I'll wear my cut-off pajama pants. I might have to buy some new ones, gotta look fresh yo.
I'll be talking about CouchDB, and maybe about what it's like to move your family somewhere cheap and live off savings to build the next great open source database. If you are going, send me feedback on what you want to hear about and I'll make sure to cover it.
April 18, 2008
Forgive me, El Guapo
I know that I, Jefe, do not have your superior intellect and education. But could it be that once again, you are angry at something else, and are looking to take it out on me?
April 14, 2008
Lisp as Blub
There's a problem in the server software. When the load gets high, it fails catastrophically instead of gradually. Robert and Patrick Collison are investigating, but they're still not sure what the problem is. My guess from the external evidence is that it's related to garbage collection. Killing the server process fixes the problem, at least for a day or two.
And there's the problem with Lisp for writing server software. Long-lived processes, shared-state threading, and garbage collection make it extremely difficult to fail gracefully. Even if your code is completely correct and bug free, it can still crash, hang, or just run unacceptably slowly, and there is nothing you can do to correct it without completely restarting.
There is no macro or meta programming technique to fix this problem. There are things you can do to mitigate it (mostly by generating less garbage), but once you reach a certain level of activity in the system where the garbage collector can no longer keep up (and it will happen), then every line of code in your system is now a potential failure point that can leave the whole program in a bad state. Lisp has this problem. Java has this problem. Erlang does not.
April 7, 2008
Compaction
File compaction is now checked into the Apache CouchDB SVN repos.
CouchDB databases grow with every document update, even if the update is a document deletion. So file compaction needs to be run occasionally to recover wasted disk space.
The compaction process will purge all old revisions and pack together the documents on disk to make sequential document access faster, like during view rebuilds and replication. Compaction is copy style and happens live, or hot, while the database server is actively running. Normal database operations (reads, view index refreshes, document updates, replication, etc.) can all happen while the database is actively being compacted.
Also it is incremental and restartable, so if the server is shut down or there is a power failure in the middle of compaction, the next time you restart the compaction it will start back near the last spot where it left off.
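To make the idea concrete, here is a rough sketch of copy-style, restartable compaction over a toy append-only log. This is not CouchDB's actual file format or code. The names `append_update`, `compact`, and `checkpoint` are made up for illustration, and the sketch assumes no writes arrive while compaction is running, which the real thing does have to handle.

```python
# Toy model only: an in-memory list stands in for the append-only database file.

def append_update(log, doc_id, body):
    """Every update appends an entry; a deletion is an entry with body=None."""
    log.append((doc_id, body))


def compact(old_log, new_log, checkpoint, batch_size=2):
    """Copy the latest live revision of each document from old_log to new_log.

    Processes at most `batch_size` entries per call. `checkpoint` is a dict
    that remembers the position reached, so an interrupted compaction can be
    resumed by calling compact() again with the same checkpoint.
    Returns True once the whole old log has been processed.
    """
    # Index of the most recent entry for each document id.
    latest = {doc_id: i for i, (doc_id, _) in enumerate(old_log)}
    start = checkpoint.get("pos", 0)
    end = min(start + batch_size, len(old_log))
    for i in range(start, end):
        doc_id, body = old_log[i]
        # Skip superseded revisions and deleted documents.
        if latest[doc_id] == i and body is not None:
            new_log.append((doc_id, body))
    checkpoint["pos"] = end
    return end == len(old_log)
```

Calling `compact` repeatedly with the same `checkpoint` until it returns True plays the role of restarting after a shutdown: each call picks up at the saved position instead of starting over.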
So that's now checked in with a unit test working, though like most of the code it needs more testing, etc.
And there are still some other enhancements I'd like to see to storage compaction. One is compaction queuing, to make sure only one database at a time is compacting since it's a very disk IO heavy operation. That's fairly easy to implement.
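A compaction queue could look something like the sketch below: a single worker thread drains a queue, so at most one compaction runs at any moment. The class name `CompactionQueue` and the `compact_fn` callback are hypothetical and only illustrate the idea; this is not how CouchDB implements it.

```python
import queue
import threading


class CompactionQueue:
    """Run requested compactions strictly one at a time, in request order."""

    def __init__(self, compact_fn):
        # compact_fn is a hypothetical callable that compacts one database by name.
        self._compact_fn = compact_fn
        self._pending = queue.Queue()
        self.failures = []  # (db_name, exception) for compactions that raised
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def request(self, db_name):
        """Queue a database for compaction and return immediately."""
        self._pending.put(db_name)

    def wait(self):
        """Block until every queued compaction has finished."""
        self._pending.join()

    def _run(self):
        # Single worker loop: the next compaction starts only after the
        # previous one returns, so disk IO is never shared between two.
        while True:
            db_name = self._pending.get()
            try:
                self._compact_fn(db_name)
            except Exception as exc:
                self.failures.append((db_name, exc))
            finally:
                self._pending.task_done()
```

A single worker is the simplest way to get the "one at a time" guarantee; a failed compaction is recorded rather than allowed to kill the worker, so later requests still run.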
Another enhancement is better handling of long transactions that overlap the compaction file transition. Currently when a compaction completes, any read or write that started before the compaction completed will have at least 5 seconds to finish before it is forcibly terminated with an error. That can be fixed to allow an unlimited amount of time for transactions to complete, and doing so actually ties into some larger changes I'd like to see in the code. But until then clients can just retry the operation and things will be fine, so for now these are low priority and I'm OK with it if they don't get done before 1.0.
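On the client side, the retry can be as simple as the sketch below. The exception `OperationAborted` and the helper `with_retry` are invented for this example; a real client would catch whatever error its HTTP library reports when a request gets terminated.

```python
import time


class OperationAborted(Exception):
    """Hypothetical error: the request was cut off at the compaction switch-over."""


def with_retry(operation, attempts=3, delay=0.1):
    """Call operation(); if it is aborted, wait briefly and try again.

    Re-raises the last error once `attempts` tries have all failed.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OperationAborted:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

Since the switch-over only happens once per compaction, a retried read or write should go through against the new file; that is an expectation from the description above, not something this sketch checks.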
Right now view indexes still do not compact, but that will be fixed later. For now as a stopgap, just delete the index files and the views will rebuild from scratch and hence be "compacted". But views definitely will have proper compaction before 1.0.
Right now we are working on getting the Mochiweb branch finished and integrated back into trunk (Mochiweb is a replacement library for the current Inets HTTP library). Christopher Lenz has been doing most of the work, and I'm now going to help out finishing it up and hopefully get it checked in this week.
Once that's done plus a few more small tweaks, we might consider CouchDB 0.8 done. Then the release after that I think we'll target the incremental reduce and security, and then CouchDB will be in beta.