Search is a feature request we get frequently at Pivotal Labs. It’s easy to understand why: if your users can search your app, they can navigate by thinking about what they want, instead of trying to remember where they put it.
There are a number of tools for implementing search: dedicated full-text datastores such as ElasticSearch and Apache Solr, plus conventional relational databases like PostgreSQL that have full-text search support built in. Of these, Solr is one of the more established options, and it’s the one I’ve seen used most here at Pivotal Labs. For Ruby projects, Solr has gem support in the form of Sunspot, which provides simple declarative DSLs for indexing and searching your data, and gets you out the door with working search in a very modest amount of code. To show just how simple TDD-ing out Sunspot search can be, let’s implement search.
Setting up
Let’s pretend we’re working on a video sharing app. Anybody can comment on any video, and we want comments to be searchable.
Gemfile
gem 'sunspot_rails'
group :development, :test do
  gem 'sunspot_matchers'
  gem 'sunspot_solr'
  gem 'rspec-rails'
end
spec/support/sunspot_matchers.rb
RSpec.configure do |c|
  c.include SunspotMatchers
  c.before do
    Sunspot.session = SunspotMatchers::SunspotSessionSpy.new(Sunspot.session)
  end
end
Let’s use scaffolds so we’ll have a functional app running right away:
% bundle install
% rails g rspec:install
% rails g scaffold video url:string --no-view-specs
% rails g scaffold comment video_id:integer text:text --no-view-specs
% rake db:migrate test:prepare
% rake
You should see a few dozen passing tests, and you can try creating a video and a comment or two.
The first commit: Making comments searchable
We’re storing the comment text typed by our users in the ‘text’ column, and that’s what we want to make searchable. The sunspot_matchers gem we added to our Gemfile makes this easy to express in our spec:
spec/models/comment_spec.rb
require 'spec_helper'
describe Comment do
  it { should have_searchable_field(:text) }
end
The test fails:
% bundle exec rspec spec/models/comment_spec.rb
F
Failures:
1) Comment should should have searchable field text
Failure/Error: it { should have_searchable_field(:text) }
expected class: Comment to have searchable field: text, but Sunspot was not configured on Comment
# ./spec/models/comment_spec.rb:4:in `block (2 levels) in <top (required)>'
We can make the test green by telling Sunspot which fields to make searchable on our Comment model.
app/models/comment.rb
class Comment < ActiveRecord::Base
  belongs_to :video
  attr_accessible :text, :video_id
  searchable do
    text :text
  end
end
% bundle exec rspec spec/models/comment_spec.rb
.
Finished in 0.02179 seconds
1 example, 0 failures
Perfect. That’s our first commit.
% git add -A
% git commit -m "make Comment model searchable"
We’ve now got our Rails app sending updates to Solr every time we create, update, or delete a comment. You may already get the sense that quite a bit more must be going on than the 3 lines of implementation code let on, but we’ll come back to that in a moment. In the meantime, let’s do the other half of the search implementation: querying the index.
The Second Commit: Making a Controller Perform a Search
To add search to our app, we’re going to modify the #index method on our CommentsController so that when we provide a search query to the controller, we get back a filtered list of comments. This test is almost as easy as the last one we wrote:
spec/controllers/comments_controller_spec.rb
describe "GET index" do
  context "with a search term" do
    it "performs a search for matching comment text" do
      get :index, {search: "sandwiches"}, valid_session
      Sunspot.session.should be_a_search_for(Comment)
      Sunspot.session.should have_search_params(:fulltext, "sandwiches")
    end
  end
end
The test should fail because we’re not yet doing a search:
% bundle exec rspec spec/controllers/comments_controller_spec.rb
1) CommentsController GET index with a search term performs a search for matching comment text
Failure/Error: Sunspot.session.should be_a_search_for(Comment)
RuntimeError:
no search found
# ./spec/controllers/comments_controller_spec.rb:65:in `block (5 levels) in <top (required)>'
To pass that spec, we’ll tell the controller to perform a Sunspot search of comments if it receives a search term:
app/controllers/comments_controller.rb
class CommentsController < ApplicationController
  # GET /comments
  # GET /comments.json
  def index
    if params[:search]
      @comments = Sunspot.search(Comment) do
        fulltext params[:search]
      end.results
    else
      @comments = Comment.all
    end
    respond_to do |format|
      format.html # index.html.erb
      format.json { render json: @comments }
    end
  end
  # ... more controller methods
end
% bundle exec rspec spec/controllers/comments_controller_spec.rb
..................
Finished in 0.83291 seconds
18 examples, 0 failures
Hooray! That’s our second commit right there.
% git add -A
% git commit -m "teach the CommentsController how to do searches."
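It may help to see why both specs pass without a Solr server running. The session spy installed in spec/support stands in for the real session and records searches instead of executing them. The sketch below shows that general idea in plain Ruby. It is not the sunspot_matchers source: `ToySessionSpy`, `ToyQuery`, `ToyComment`, and `searched_for?` are hypothetical names made up for illustration.

```ruby
# Hypothetical query object: collects what the search block asks for.
class ToyQuery
  attr_reader :keywords

  def fulltext(keywords)
    @keywords = keywords
  end
end

# Hypothetical spy: remembers each search and returns no results,
# so nothing is ever sent to a search server.
class ToySessionSpy
  Recorded = Struct.new(:klass, :keywords)

  attr_reader :searches

  def initialize
    @searches = []
  end

  # Evaluate the block against a query object and record the outcome.
  def search(klass, &block)
    query = ToyQuery.new
    query.instance_eval(&block) if block
    @searches << Recorded.new(klass, query.keywords)
    [] # a spy has no index to consult
  end

  # The kind of question a matcher could ask of the spy afterwards.
  def searched_for?(klass, keywords)
    @searches.any? { |s| s.klass == klass && s.keywords == keywords }
  end
end

# Placeholder model class for the example.
class ToyComment; end

spy = ToySessionSpy.new
term = "sandwiches"
results = spy.search(ToyComment) { fulltext term }

puts spy.searched_for?(ToyComment, "sandwiches")
puts results.empty?
```

A spy like this only proves that the application asked for the right search. It says nothing about whether a real server would return anything.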
Meanwhile, back in Reality
Ok, so we’ve written some application code, and we’ve written some tests. But does it actually work?
In fact, it’s pretty easy to verify. Open up a terminal window, cd to your app’s directory, and start up the test Solr server that we sneakily bundled while updating the Gemfile.
% rails generate sunspot_rails:install
% bundle exec rake sunspot:solr:start
Then start your Rails server if it’s not running yet.
% rails s
You can type the URL by hand if you want: visiting localhost:3000/comments?search=omg will return all the comments matching that search term. You’ve now got a rudimentary search feature, waiting to be fleshed out with proper pagination and sorting[1].
Your API is very nice but what did you just do with my data?
Sunspot’s big contribution is minimizing the amount of code you have to write to integrate search into your app’s business domain. With Sunspot, you only write code for the behavior that’s specific to your application: which of your domain models you want to be able to search and how you want your users to initiate those searches.
To make use of the minimal code you add to your models, sunspot_rails also injects a lot of code of its own that handles communication with the Solr server. It adds callbacks to the ActiveRecord models you’ve configured for indexing, and it sends Solr a crucial go-ahead “commit” message at the end of any controller action that modifies an indexed model, telling Solr to apply all of the changes you’ve made.
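To make that division of labor concrete, here is a minimal plain-Ruby sketch of the pattern just described. It is not how sunspot_rails is actually implemented: `ToySession`, `ToySearchable`, `ToyComment`, and `with_request` are hypothetical names, and the "index" is just an array. The sketch shows a declarative `searchable` block recording field names, a save hook queueing the record for indexing, and a wrapper that commits once at the end of a "request", but only if something changed.

```ruby
# Hypothetical stand-in for a Solr connection: documents are queued by
# index and only become searchable after commit_if_dirty runs.
class ToySession
  attr_reader :commits

  def initialize
    @pending = []
    @visible = []
    @commits = 0
  end

  def index(doc)
    @pending << doc
  end

  def commit_if_dirty
    return if @pending.empty?
    @visible.concat(@pending)
    @pending.clear
    @commits += 1
  end

  def search(term)
    @visible.select { |doc| doc.values.any? { |v| v.to_s.include?(term) } }
  end
end

# Hypothetical mixin imitating a declarative `searchable` block.
module ToySearchable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_reader :search_fields

    def searchable(&block)
      @search_fields = []
      instance_eval(&block)
    end

    # Called inside the searchable block, e.g. `text :text`.
    def text(name)
      @search_fields << name
    end
  end

  # Stands in for the after-save callback added to indexed models.
  def save(session)
    doc = self.class.search_fields.map { |f| [f, public_send(f)] }.to_h
    session.index(doc)
    true
  end
end

class ToyComment
  include ToySearchable
  attr_reader :text

  searchable do
    text :text
  end

  def initialize(text)
    @text = text
  end
end

# Stands in for the end-of-action hook: run the block, then commit once.
def with_request(session)
  yield
ensure
  session.commit_if_dirty
end

session = ToySession.new
with_request(session) do
  ToyComment.new("I love sandwiches").save(session)
  ToyComment.new("nice video").save(session)
end
with_request(session) { session.search("video") } # read-only: nothing to commit

puts session.search("sandwiches").length
puts session.commits
```

Two saves in one request produce a single commit, and the read-only request produces none, which is the behavior that keeps commit traffic proportional to writes rather than to requests.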
Suspect the commit messages
These commit messages are very important: if Solr’s not doing what you expect, there’s a good chance that commit messages are involved. The tradeoff for Sunspot’s convenience is that it hides the complexity of commits from you, so it’s easy not to realize that there’s more you need to know about. Once you do know a bit about how the Solr server works, however, you’ll benefit from Sunspot’s simplicity on all your future projects.
Here are the important things to know about commits: they’re slow, they’re blocking calls, and a commit has to happen before documents actually show up in the index[2]. But Solr can also be configured to perform commits automatically in the background[3][4]. If you’re using a third-party host for your Solr servers (such as the WebSolr add-on for Heroku), you should experiment or simply ask how the Solr servers are configured: they may already be set up to commit automatically.
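The visibility rule and the autocommit option can be illustrated with a small in-memory model. This is a sketch in plain Ruby, not Solr’s implementation: `ToyIndex`, `max_docs`, `max_age`, and the `now` arguments are hypothetical names, loosely modeled on the idea of committing after a number of pending documents or after an elapsed interval. Time is passed in explicitly so the example runs deterministically.

```ruby
# Hypothetical in-memory index: added documents stay invisible to
# searches until a commit moves them into the visible set.
class ToyIndex
  def initialize(max_docs: nil, max_age: nil)
    @max_docs = max_docs # autocommit after this many pending documents
    @max_age = max_age   # autocommit once the oldest pending document is this old (seconds)
    @pending = []
    @visible = []
    @oldest_pending_at = nil
  end

  def add(text, now:)
    @oldest_pending_at ||= now
    @pending << text
    autocommit(now)
  end

  # Explicit commit: everything pending becomes searchable.
  def commit
    @visible.concat(@pending)
    @pending.clear
    @oldest_pending_at = nil
  end

  # Searches also give the timer a chance to fire, mimicking a
  # background commit that happened before the query arrived.
  def search(term, now:)
    autocommit(now)
    @visible.select { |text| text.include?(term) }
  end

  private

  def autocommit(now)
    return if @pending.empty?
    too_many = @max_docs && @pending.size >= @max_docs
    too_old = @max_age && (now - @oldest_pending_at) >= @max_age
    commit if too_many || too_old
  end
end

# No autocommit configured: the application must commit explicitly.
manual = ToyIndex.new
manual.add("omg sandwiches", now: 0)
before_commit = manual.search("omg", now: 1) # nothing committed yet
manual.commit
after_commit = manual.search("omg", now: 2)

# Autocommit after 15 seconds: documents appear with some lag.
auto = ToyIndex.new(max_age: 15)
auto.add("omg sandwiches", now: 0)
too_early = auto.search("omg", now: 5)     # still inside the interval
late_enough = auto.search("omg", now: 20)  # interval elapsed, committed

p before_commit
p after_commit
p too_early
p late_enough
```

In this model the first and third searches come back empty even though the document was added successfully, which is the symptom to look for when a real index seems to be "missing" recent records.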
You’ll need to find a configuration that works for your application’s needs. You may want a guarantee that documents are indexed transactionally with changes to your primary database, in which case you need to be committing explicitly in your Rails app or background jobs. If some lag time is OK, you can rely on autocommit instead, but make sure that the autocommit interval used by your server matches the performance you want to get out of your app.
TL;DR
It’s extremely easy to integrate search functionality into your Rails app with a test-first approach. But Solr’s server-side configuration affects its behavior a lot, so make sure you understand the configuration of your Solr servers too.
[1] https://github.com/sunspot/sunspot/wiki/Ordering-and-pagination
[2] http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
[3] http://wiki.apache.org/solr/SolrConfigXml#indexConfig_Section
[4] http://wiki.apache.org/solr/NearRealtimeSearch