software blog

Irregular journal of my tinkering with software, chiefly under Debian GNU/Linux

Florian Cramer

Desktop search engines

Fri Nov 4 15:22:22 2005

Google-like full-text search over one's home directory remains an unsolved problem under free operating systems. Updatedb/[s]locate provides only indexed file name search, and "grep -r" through several gigabytes isn't great either. In the past, I struggled with glimpse (slow, non-free) and swish++/swish-e (totally broken usability) and pretty much gave up on the issue.

Today, I gave estraier (from Debian unstable) a shot, but didn't find it usable either, because it offers only a web CGI and no command-line tool for search queries. Then I finally gave up my resistance and installed Beagle, the well-known Mono-based Gnome search backend which, fortunately, can be used independently of Gnome via a set of command-line tools. The beagle daemon - and with it the Mono runtime - has now been running for several hours on my PIII/1GHz, indexing ca. 50 GB of data in my home directory. My impressions so far:

- Indexing speed is subjectively slow, perhaps because the program is written for .NET/Mono.

- It spits out all kinds of error messages in the process. Several dependencies (the evolution libraries libedataserver, libebook, libecal) were missing after the Debian package installation.

- "beagle-query" works, even during indexing, but offers neither regular expressions nor wildcards. This sucks big time. It also spits out matching files as URIs (e.g. "file:///home/foo/bar"). While this can be fixed with a wrapper script, it sucks, too.

- The software is clearly beta. I keep getting error messages such as "inotify_add_watch: no space left on device" - 25 GB are free according to df - "Maximum watch limit hit. Try adjusting /proc/sys/fs/inotify/max_user_watches."
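A note on that last error: "no space left on device" from inotify_add_watch means the per-user watch limit has been reached, not that the disk is full, which is why df still reports free space. Since inotify needs one watch per directory, a rough way to judge the limit is to compare it with the number of directories in the indexed tree. The following is only a sketch under assumptions: that Beagle places a watch on every directory it indexes (not confirmed here), and with hypothetical function names that are not part of Beagle.

```python
import os
from pathlib import Path

# Path quoted in the error message above.
WATCH_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def read_watch_limit(path=WATCH_LIMIT_FILE):
    """Return the per-user inotify watch limit stored in the given file."""
    return int(Path(path).read_text().strip())


def count_directories(root):
    """Count the directories below root, including root itself.

    os.walk yields one entry per directory and does not follow
    symlinked directories by default.
    """
    return sum(1 for _ in os.walk(root))


# Possible use on a Linux box (assumption: one watch per directory):
#     limit = read_watch_limit()
#     needed = count_directories(os.path.expanduser("~"))
#     print(f"{needed} directories, limit is {limit}")
```

If the directory count comes out above the limit, writing a larger number to that file as root (as the error message suggests) is the obvious thing to try.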

Despite the flaws, the program is clearly useful until someone implements something better (hopefully in C, with regex searches...), so I will keep it around.
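For the URI output complained about above, a wrapper could look roughly like the following. It is a minimal sketch, assuming beagle-query prints one URI per line (not verified here); the function names are made up for illustration and are not part of Beagle.

```python
from urllib.parse import unquote, urlsplit


def uri_to_path(line):
    """Return the local path for a file:// URI, or the line unchanged."""
    line = line.strip()
    parts = urlsplit(line)
    if parts.scheme != "file":
        # Not a file URI (e.g. a mail or web hit): pass it through as-is.
        return line
    # Decode percent-escapes such as %20 back into plain characters.
    return unquote(parts.path)


def uris_to_paths(lines):
    """Convert an iterable of lines, skipping empty ones."""
    return [uri_to_path(line) for line in lines if line.strip()]


# To use this as a filter behind a pipe, one could add:
#     import sys
#     for path in uris_to_paths(sys.stdin):
#         print(path)
```

With those last lines added and the file saved under some name of one's choosing, the output of beagle-query can be piped through it to get plain paths that other command-line tools accept.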