Taking a peek at Apache Solr
We’ve been talking for a while about upgrading DrupalCenter, the #1 German Drupal community, from Drupal 5 to Drupal 6. We’re also thinking about new features and a new layout once we get this done. Our content-rich site would definitely become even more attractive with fast, advanced faceted search capabilities. Right now Drupal’s integrated search isn’t much fun, and it’s so slow…
Because I also serve as admin of our dedicated server and have a copy of our site running for tests, I spontaneously gave Apache Solr integration a shot. All I had to do was install the Java 5 SDK, install the Apache Solr integration module, and download, configure and start Apache Lucene Solr. Even without a fully-fledged servlet container like Tomcat, just using the integrated Jetty engine, it was a lot of fun.
Most of the time was spent on indexing the content and configuring the Solr facet blocks. Solr search is amazingly fast compared to core search, but it also leads to more page impressions because you narrow down your searches one step at a time. It’s a way of searching mostly known from big sites like Amazon and eBay.
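To give an idea of what happens behind those facet blocks, here is a minimal sketch in Python of how a faceted request to Solr's standard select handler could be put together. The parameters `q`, `fq`, `wt`, `facet`, `facet.field` and `facet.mincount` are regular Solr request parameters; the function name, the base URL and the field names `type` and `forum` are assumptions made up for illustration and are not taken from our setup or from the Drupal module.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, keywords, facet_fields, filters=()):
    """Build a Solr select URL that asks for facet counts.

    base_url, the field names and this helper itself are hypothetical;
    only the request parameter names are standard Solr parameters.
    """
    params = [
        ("q", keywords),          # the user's search terms
        ("wt", "json"),           # ask for a JSON response
        ("facet", "true"),        # switch faceting on
        ("facet.mincount", "1"),  # hide facet values with zero hits
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    # Each click on a facet link adds one more filter query (fq).
    params += [("fq", f'{field}:"{value}"') for field, value in filters]
    return f"{base_url}/select?{urlencode(params)}"


# Assumed example: a local test instance and made-up field names.
url = build_facet_query(
    "http://localhost:8983/solr",
    "views module",
    ["type", "forum"],
    filters=[("type", "forum")],
)
print(url)
```

Every facet link a visitor clicks adds another `fq` entry and triggers a new request, which is where the additional page impressions come from.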
But we also ran into some trouble, as Solr indexed everything, even content that is protected from regular users. We have internal forums for our moderators and admins that are protected by the Forum Access module (using Node Access), but even those nodes get indexed. OK, normal users cannot view those protected nodes, but they still appear, along with some of the nodes’ content, on the search result pages (SERPs).
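One conceivable way around this, sketched here purely as an illustration and not as what the module or Acquia actually does, would be to store each node's access grants in the index and to add a matching filter query for the current user at search time. The field name `access_grants`, the `realm:gid` value format and the helper below are all assumptions.

```python
def access_filter(grants):
    """Return a Solr filter query (fq) limiting hits to what a user may view.

    grants is an iterable of (realm, gid) pairs held by the current user.
    The access_grants field and the "realm:gid" format are hypothetical:
    they assume every node was indexed together with the grants that
    allow viewing it.
    """
    terms = [f'"{realm}:{gid}"' for realm, gid in sorted(set(grants))]
    if not terms:
        # No grants at all: filter on a sentinel value that no node carries.
        return 'access_grants:"__none__"'
    return "access_grants:(" + " OR ".join(terms) + ")"


# Assumed example: a moderator holding a forum grant plus a general one.
print(access_filter([("forum_access", 3), ("all", 0)]))
```

Filtering inside Solr rather than hiding hits afterwards would also keep the facet counts in line with what the user is allowed to see.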
A quick look at the issue queue made clear that this is a non-trivial problem that causes some headaches across the community. It will be interesting to see how Acquia addresses these issues with their Solr service called Acquia Search, which is currently in beta.


