278: Search With Sunspot 

(view original Railscast)

Sunspot is a solution for adding full-text searching to Ruby applications. It uses Solr in the background and has many great features. In this episode we’ll use it to add full-text searching to a Rails application, using the simple blogging app from previous episodes.

Our blogging application.

This application has a page that displays a number of articles and we want to implement the ability to search across them. Using SQL to do this can quickly become difficult and is often not the best approach. A dedicated full-text solution such as Sunspot is a much better way to implement this feature.

Installing Sunspot

Sunspot comes as a gem and is installed in the usual way by adding it to the Gemfile and running bundle.

/Gemfile

source 'http://rubygems.org'
gem 'rails', '3.0.9'
gem 'sqlite3'
gem 'nifty-generators'
gem 'sunspot_rails'

Once the gem and its dependencies have installed we’ll need to generate Sunspot’s configuration file which we can do by running

$ rails g sunspot_rails:install

This command creates a YAML file at /config/sunspot.yml. We don’t need to make any changes to the default settings in this file.

Sunspot embeds Solr inside the gem so there’s no need to install it separately. This means that it works straight out of the box which makes it far more convenient to use in development. To get it up and running we run

$ rake sunspot:solr:start

If you’re running OS X Lion and you haven’t installed a Java runtime you’ll be prompted to do so when you run this command. You may also see a deprecation warning but this can be safely ignored. The command will also create some more configuration files for advanced configuration. We won’t cover them here but there are details in the documentation on how to modify these.

Using Sunspot

Now that we have Sunspot installed we can use it in our Article model. To add full text searching we use the searchable method.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :content
  end
end

This method takes a block and inside it we define the attributes that we want to search against so that Sunspot knows what data to index. We can use the text method to define the attributes that will have full-text searches run against them. For our articles we’ll do this for the name and content fields.

Sunspot automatically indexes any new records but not existing ones. We can tell Sunspot to reindex the existing records by running

$ rake sunspot:reindex

All of the articles are now in our Solr database and can be searched so we’ll add a search field at the top of the index page.

/app/views/articles/index.html.erb

<% title "Articles" %>
<%= form_tag articles_path, :method => :get do %>
  <p>
    <%= text_field_tag :search, params[:search] %>
    <%= submit_tag "Search", :name => nil %>
  </p>
<% end %>
<!-- rest of view omitted -->

This form is submitted to the index action using GET, so any search parameters added will be added to the query string. We’ll modify the controller next so that it fetches the articles using that search parameter. To perform a search with Sunspot we call search on the model and pass in a block. Inside the block we can call various methods to handle complex searches. We’ll use the fulltext method and pass it the search parameters from the form. Finally we’ll assign the result of all of this to @search. We can call results on this to get a list of the matching articles.

/app/controllers/articles_controller.rb

def index
  @search = Article.search do
    fulltext params[:search]
  end
  @articles = @search.results
end

We can test this now by reloading the articles page and searching for a keyword. When we do so we’ll get a list of matching articles returned.

The filtered list of articles.

The search returns a list of the articles that contain the search term whether it’s in the article’s name or its content.

There’s a lot more that we can do inside the searchable block in the Article model. For example we can use boost to weight the results so that matches in the article’s name are considered more important than those in the content.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :boost => 5
    text :content
  end
end

This is important when we want to sort results by relevance. In this case articles whose name contains the search term will appear higher up in the results than articles where the search term only appears in the content.
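
To make the effect of a boost concrete, here is a toy scorer in plain Ruby. It is only an illustration of weighting one field more heavily than another: the toy_score method, the BOOSTS hash and the sample documents are all hypothetical, and Solr's real relevance calculation is considerably more involved.

```ruby
# Toy scorer: counts how often the term occurs in each field and multiplies
# the count by that field's boost. This is NOT Solr's relevance formula.
BOOSTS = { name: 5, content: 1 }.freeze

def toy_score(doc, term)
  BOOSTS.sum do |field, boost|
    boost * doc.fetch(field).downcase.scan(term.downcase).size
  end
end

docs = [
  { name: "Batman", content: "A hero who fights alongside Superman." },
  { name: "Superman", content: "An alien raised in Kansas." }
]

# Highest score first: the match in the name outranks the match in the content.
ranked = docs.sort_by { |doc| -toy_score(doc, "superman") }
puts ranked.map { |doc| doc[:name] }.inspect # prints ["Superman", "Batman"]
```

Both documents mention the term once, but the one that mentions it in the boosted field scores 5 against 1, so it sorts first.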

The attributes listed in the searchable block don’t have to be actual database columns; we can use any method that we define in the model. We’ll create a publish_month method that returns a string containing the name of the month and the year in which the article was published, then search against that method just as if it were a database column.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :boost => 5
    text :content, :publish_month
  end
  def publish_month
    published_at.strftime("%B %Y")
  end
end
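
To see what publish_month returns without starting Rails, here is a minimal plain-Ruby sketch. The MonthArticle struct is a hypothetical stand-in for the model, and the nil guard is an addition that is not in the episode's code; it is there because calling strftime on a missing published_at would raise an error.

```ruby
# Hypothetical stand-in for the ActiveRecord model, for illustration only.
MonthArticle = Struct.new(:name, :published_at) do
  # Returns the full month name and year, or nil when there is no date.
  def publish_month
    published_at && published_at.strftime("%B %Y")
  end
end

article = MonthArticle.new("Superman", Time.utc(2011, 1, 15))
puts article.publish_month # prints January 2011
```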

We’ll need to reindex the records by running rake sunspot:reindex again before we can search against this new field, but once we’ve done so we can search for articles by the month in which they were published.

The articles filtered by their publication month.

As an alternative to creating a method we can pass in a block and search against whatever the block returns. An article has many comments so we’ll add the ability to search for the comments’ content by using a block.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
  end
  def publish_month
    published_at.strftime("%B %Y")
  end
end

The context inside the block is an instance of an Article so inside it we can get the comments for an article and map them to the content of each comment. Even though this returns an array Sunspot will handle this and index all of the comments so that they’re searchable.

Searching Against Attributes

What if we want to add some search capabilities that go beyond simple full-text searching, maybe searching on a specific attribute? For this we can pass in the type of attribute we want to search, whether it’s a string, an integer, a float or even a timestamp. To add the published_at attribute to the search fields we can use the time method.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
    time :published_at
  end
  def publish_month
    published_at.strftime("%B %Y")
  end
end

We can make use of this in the ArticlesController to restrict the searches to articles with a published_at date earlier than the current time. We use the with method to do this.

/app/controllers/articles_controller.rb

def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
  end
  @articles = @search.results
end

With this in place the search won’t return articles that haven’t yet been published. The Sunspot wiki has some great documentation on the attributes you can pass in.

Faceted Searching

Faceted searching allows us to filter the search results based on certain attributes such as the month in which the article was published. Let’s say that we want to add a list of links showing the months for which there are published articles. When we click one of the links it will filter the list of articles so that only those published in that month are shown.

To do this we’ll first add a string attribute to the searchable block for our publish_month method.

/app/models/article.rb

class Article < ActiveRecord::Base
  attr_accessible :name, :content, :published_at
  has_many :comments
  searchable do
    text :name, :boost => 5
    text :content, :publish_month
    text :comments do
      comments.map(&:content)
    end
    time :published_at
    string :publish_month
  end
  def publish_month
    published_at.strftime("%B %Y")
  end
end

We can turn this into a facet by calling facet in the search block in the ArticlesController.

/app/controllers/articles_controller.rb

def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
    facet(:publish_month)
  end
  @articles = @search.results
end

Now we can list those facets on the index page by adding the following code between the search box and list of articles.

/app/views/articles/index.html.erb

<div id="facets">
  <h3>Published</h3>
  <ul>
    <% for row in @search.facet(:publish_month).rows %>
      <li>
        <% if params[:month].blank? %>
          <%= link_to row.value, :month => row.value %> (<%= row.count %>)
        <% else %>
          <strong><%= row.value %></strong> (<%= link_to "remove", :month => nil %>)
        <% end %>
      </li>
    <% end %>
  </ul>
</div>

In this code we loop through each of the publish_month facet items and display them. Calling .facet on our @search object with the attribute that we want to list the facets by, in this case :publish_month, and then calling .rows on the result returns every facet option for that attribute.

When we call row.value it returns the value for that attribute, e.g. “January 2011”. We can also call row.count to return the number of articles that match that value. If there’s a month parameter in the query string we’ll display the value along with a “remove” link that will remove the parameter. This gives us some nice functionality for selecting a given facet and passing it in through a month parameter.
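
The shape of the data the view loops over can be sketched in plain Ruby. This is only an in-memory approximation: SketchFacetRow and sketch_facet_rows are hypothetical names, the real rows are built from Solr's response, and ordering by count is an assumption made here for the example.

```ruby
# Hypothetical stand-in for a facet row; real rows come from Solr's response.
SketchFacetRow = Struct.new(:value, :count)

# Counts how often each month string occurs and returns one row per month,
# most frequent first (ties broken alphabetically so the order is stable).
def sketch_facet_rows(months)
  months.group_by { |month| month }
        .map { |value, group| SketchFacetRow.new(value, group.size) }
        .sort_by { |row| [-row.count, row.value] }
end

months = ["January 2011", "February 2011", "January 2011"]
sketch_facet_rows(months).each do |row|
  puts "#{row.value} (#{row.count})"
end
```

Each row pairs a distinct value with the number of records that share it, which is what the view prints as the link text and the count in brackets.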

Once we’ve reindexed the records and reloaded the page we’ll see a list of facets in a panel, each of which shows a month and the number of articles published in that month. If we select a month we’ll see it as a month parameter in the query string but the articles aren’t filtered. To fix this we need to add another with call to the search in the controller so that it filters by the month if the month parameter is present.

/app/controllers/articles_controller.rb

def index
  @search = Article.search do
    fulltext params[:search]
    with(:published_at).less_than(Time.zone.now)
    facet(:publish_month)
    with(:publish_month, params[:month]) if params[:month].present?
  end
  @articles = @search.results
end

Now when we select a month we’ll see the list correctly filtered to the articles that were published in that month.
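
The way the two restrictions combine can be sketched on an in-memory list. This is not how Sunspot runs the query (Solr does the filtering), and FilterArticle and sketch_filter are hypothetical names used only for this example.

```ruby
# Hypothetical in-memory stand-in for the Article model.
FilterArticle = Struct.new(:name, :published_at) do
  def publish_month
    published_at.strftime("%B %Y")
  end
end

# Applies the same two restrictions as the controller: always hide unpublished
# articles, and narrow by month only when a month was supplied.
def sketch_filter(articles, month, now)
  visible = articles.select { |article| article.published_at < now }
  return visible if month.to_s.strip.empty? # plain-Ruby blank check
  visible.select { |article| article.publish_month == month }
end

now = Time.utc(2011, 3, 1)
articles = [
  FilterArticle.new("Superman", Time.utc(2011, 1, 15)),
  FilterArticle.new("Batman", Time.utc(2011, 2, 10)),
  FilterArticle.new("Unreleased", Time.utc(2011, 4, 1))
]

puts sketch_filter(articles, nil, now).map(&:name).inspect
puts sketch_filter(articles, "January 2011", now).map(&:name).inspect
```

With no month the list contains every published article; with a month it narrows to that month only, which matches the behaviour of the "remove" link and the facet links.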

Articles filtered by month using facets.

Clicking the “remove” link will return us to the complete list. This works in conjunction with search results too. If we enter a search term the list will show the months that have articles that match.

The matching months for the filtered articles are shown in the sidebar.

Facets are a great feature to have alongside searching.

That’s it for this episode on Sunspot. It’s a great way to add full-text searching to Rails applications and has many extra features that we’ve not covered here. Be sure to take a look at the wiki for more information.