Full-Text Search in Rails Using Elasticsearch

In this article I'm going to show you how to implement full-text search using Ruby on Rails and Elasticsearch. Nowadays everyone is used to entering a search term and getting suggestions, as well as results with the search term highlighted. Auto-correction of misspelled search terms is also a nice feature, as seen on websites such as Google or Facebook.

Implementing all these features using only a relational database like MySQL or Postgres is not straightforward. For this reason, we are using Elasticsearch, which you can think of as a database specifically built and optimised for search. It is open source and built on top of Apache Lucene.

One of the nicest features of Elasticsearch is that it exposes its functionality through a REST API, so there are libraries wrapping that functionality for most programming languages.

Introducing Elasticsearch

Earlier, I mentioned that Elasticsearch is like a database for search. It helps to be familiar with some of the terminology around it.

  • Field: A field is like a key-value pair. The value can be a simple value (string, integer, date), or a nested structure like an array or an object. A field is similar to a column in a table in a relational database.
  • Document: A document is a list of fields. It is a JSON document which is stored in Elasticsearch. It is like a row in a table in a relational database. Each document is stored in an index and has a type and a unique id.  
  • Type: A type is like a table in a relational database. Each type has a list of fields that can be specified for documents of that type.
  • Index: An index is the equivalent of a relational database. It contains the definition for multiple types and stores multiple documents.
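
To make the terminology concrete, here is a minimal sketch of a document expressed as a Ruby hash. The field values, the index and type names and the id are hypothetical and chosen only for illustration.

```ruby
require 'json'

# A hypothetical document: every key/value pair is a field.
ARTICLE_DOCUMENT = {
  title: 'Full-Text Search in Rails',               # simple string value
  text: 'Elasticsearch is built on Apache Lucene.', # simple string value
  tags: %w[rails elasticsearch],                    # array value
  author: { name: 'Jane Doe' }                      # nested object
}.freeze

# Where the document would live: index, type and unique id (assumed names).
ARTICLE_ADDRESS = { index: 'articles', type: 'article', id: 1 }.freeze

# Elasticsearch stores the document as JSON.
puts JSON.pretty_generate(ARTICLE_DOCUMENT)
```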

One thing to note here is that in Elasticsearch, when you write a document to an index, the document fields are analysed, word by word, to make search easy and fast. Elasticsearch also supports geolocation, so you can search for documents that are located within a certain distance of a given location. That's exactly how Foursquare implements search.

Elasticsearch was built with high scalability in mind, so it's easy to build a cluster with multiple servers that stays highly available even if some servers go down. I am not going to cover the specifics of how to plan and deploy different types of clusters in this article.

Installing Elasticsearch

If you're using Linux, you can probably install Elasticsearch from a package repository. Packages are available for both APT and YUM.

If you use a Mac, you can install it using Homebrew: brew install elasticsearch. After Elasticsearch is installed, you will see the list of relevant folders in your terminal:

[Screenshot: Elasticsearch folders]

To verify that the installation is working, type elasticsearch in your terminal to start it. Then run curl localhost:9200 in your terminal, and you should see something like:

[Screenshot: Elasticsearch running]

Install Elastic HQ

Elastic HQ is a monitoring plugin that we can use to manage Elasticsearch from the browser, similar to phpMyAdmin for MySQL. To install it, just run in your terminal:

/usr/local/Cellar/elasticsearch/2.2.0_1/libexec/bin/plugin install royrusso/elasticsearch-HQ

Once it's installed, navigate to http://localhost:9200/_plugin/hq in your browser:

[Screenshot: Elastic HQ Plugin]

Click on Connect and you will see a screen showing the status of the cluster:

[Screenshot: Elastic HQ Cluster Overview]

At this point, as you might expect, no indexes or documents have been created yet, but we have our local instance of Elasticsearch installed and running.

Creating a Rails Application

I'm going to create a very simple Rails application, where you can add Articles to the database so we can perform a full-text search on them using Elasticsearch. Start by creating a new Rails application:

rails new elasticsearch-rails

Next we generate a new Article resource with scaffolding:

rails generate scaffold Article title:string text:text

Now we need to add a root route, so that the list of articles is shown by default. Edit config/routes.rb:
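
The original snippet is not reproduced here, so the following is a minimal sketch of what the root route could look like, assuming the scaffold generator has already added the articles resource.

```ruby
# config/routes.rb
Rails.application.routes.draw do
  resources :articles # added by the scaffold generator

  # Show the list of articles at the site root.
  root to: 'articles#index'
end
```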

Create the database by running the command rake db:migrate. Then start rails server, open your browser, navigate to localhost:3000 and add a few articles to the database. Alternatively, download the file db/seeds.rb with dummy data that I have created, so you don't have to spend a lot of time filling in forms.

Adding Search

Now that we have our little Rails app with articles in the database, we are ready to add our search functionality. We are going to start by adding both official Elasticsearch gems, elasticsearch-model and elasticsearch-rails, to the Gemfile:
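
A minimal sketch of the Gemfile entries, without version constraints (pinning versions that match your Elasticsearch release is left as an assumption for the reader to fill in):

```ruby
# Gemfile
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
```

Run bundle install afterwards so both gems are available to the application.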

On many websites, it is very common to have a search box in the top menu of every page. For that reason, I'm going to create a form partial in app/views/search/_form.html.erb. As you can see, I'm sending the form using GET, so it's easy to copy and paste the URL for a specific search.

Add a reference to the form partial in the main website layout. Edit app/views/layouts/application.html.erb.

Now we also need a controller to perform the actual search and display the results, so we generate it by running the command rails g controller Search.
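
The generated controller could be filled in along these lines. This is a sketch rather than the original listing: the action name search and the parameter name term are assumptions (term matches the id of the search field used later in the article).

```ruby
# app/controllers/search_controller.rb
class SearchController < ApplicationController
  def search
    # With no term submitted, show an empty result list instead of querying.
    @articles = params[:term].present? ? Article.search(params[:term]) : []
  end
end
```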

As you can see, I'm calling the method search on the Article model. We haven't defined that yet, so if we try to perform a search at this point, we get an error. Also, we haven't added a route for the SearchController in the config/routes.rb file, so let's do so:
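
A possible route, assuming the search form submits to /search and the controller action is named search (both names are assumptions):

```ruby
# config/routes.rb (inside the Rails.application.routes.draw block)
get 'search', to: 'search#search'
```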

If we look at the documentation for the gem 'elasticsearch-rails', we see that we need to include two modules in the models that we want to be indexed in Elasticsearch, in our case app/models/article.rb.
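
A sketch of the model with both modules included, assuming a Rails version where models inherit directly from ActiveRecord::Base:

```ruby
# app/models/article.rb
require 'elasticsearch/model'

class Article < ActiveRecord::Base
  include Elasticsearch::Model            # adds Article.search, Article.import and others
  include Elasticsearch::Model::Callbacks # keeps the index in sync on save and destroy
end
```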

The first module injects the search method that we were using in our previous controller, among others. The second module integrates with ActiveRecord callbacks to index each instance of an article that we save to the database, and it also updates the index if we modify or delete the article from the database. So it's all transparent to us.

If you imported the data to the database earlier, those articles are still not in the Elasticsearch index; only the new ones are indexed automatically. For this reason, we have to index them manually, and it's easy if we start rails console. Then we only have to run irb(main) > Article.import.

[Screenshot: Article.import]

Now we're ready to try the search functionality. If I type 'ruby' and click search, here are the results:

[Screenshot: Search result]

Search Highlighting

On many websites, you can see on the search results page how the term that you searched for is highlighted. This is very easy to do using Elasticsearch.

Edit app/models/article.rb and modify the default search method:

By default, the search method is defined by the gem 'elasticsearch-model', and the proxy object __elasticsearch__ is provided to access the wrapper class for the Elasticsearch API. So we can modify the default query using the standard JSON options as provided by the documentation.
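
As a rough illustration, the request body for a highlighted search can be built as a plain Ruby hash. The module name ArticleQueries is hypothetical, and the choice of a multi_match query over the title and text fields with em tags is an assumption, not necessarily the query used in the original listing.

```ruby
# Hypothetical helper that builds the Elasticsearch request body.
module ArticleQueries
  module_function

  def highlighted(term)
    {
      query: {
        # Match the term against both fields of the Article model.
        multi_match: { query: term, fields: %w[title text] }
      },
      highlight: {
        # Wrap each matching fragment in the given HTML tags.
        pre_tags: ['<em>'],
        post_tags: ['</em>'],
        fields: { title: {}, text: {} }
      }
    }
  end
end

# In app/models/article.rb the hash could then be passed to the proxy object:
#
#   def self.search(term)
#     __elasticsearch__.search(ArticleQueries.highlighted(term))
#   end
```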

Now the search method will wrap the results that match the query with the specified HTML tags. For this reason, we also need to update the search result page so that we can render HTML tags safely. To do so, edit app/views/search/search.html.erb.

Add a CSS style to app/assets/stylesheets/search.scss, for the highlighted tag:

Try to search for 'ruby' again:

[Screenshot: Search result with highlighting]

As you can see, it's easy to highlight the search term, but not ideal, as we need to send a JSON query as specified by the Elasticsearch documentation, and we don't have any kind of abstraction.

Searchkick Gem

The Searchkick gem is provided by Instacart, and it's an abstraction on top of the official Elasticsearch gems. I'm going to refactor the highlight functionality, so we start by adding gem 'searchkick' to the Gemfile. The first class that we need to change is the model in app/models/article.rb:

As you can see, it's much simpler. We need to reindex the articles by running the command rake searchkick:reindex CLASS=Article. To highlight the search term, we need to pass an additional parameter to the search method from our search_controller.rb.
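
Under Searchkick, the model and the search call might look roughly like this. Treat it as a sketch: the exact options depend on the Searchkick version in use, and the parameter name term is an assumption.

```ruby
# app/models/article.rb
class Article < ActiveRecord::Base
  # Index both fields with highlighting support.
  searchkick highlight: [:title, :text]
end

# app/controllers/search_controller.rb (inside the search action)
@articles = Article.search(params[:term], fields: [:title, :text], highlight: true)
```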

The last file that we need to modify is views/search/search.html.erb as the results are returned in a different format by searchkick now:

Now it's time to run the application again and test the search functionality:

[Screenshot: Search result using Searchkick]

Notice that I entered 'dato' as the search term. I did this on purpose to show you that by default searchkick is set up to analyse the text indexed and be more permissive with misspellings.

Autosuggest

Autosuggest or typeahead predicts what a user will type, making the search experience faster and easier. Bear in mind that unless you have thousands of records, it might be best to filter on the client side.

Let's start by adding the typeahead plugin, which is available through the gem 'bootstrap-typeahead-rails'. Add it to your Gemfile. Next, we need to add some JavaScript to app/assets/javascripts/application.js so that when you start typing in the search box, some suggestions appear.

A few comments about the previous snippet. In the last two lines, because I have not disabled turbolinks, that's the way to hook up the code that I want to run on page load. On the first part of the script, you can see that I'm using Bloodhound. It is the typeahead.js suggestion engine, and I am also setting up the JSON endpoint to make the AJAX requests to get the suggestions. After that, I call initialize() on the engine, and I set up typeahead on the search text field using its id "term".

Now we need to do the backend implementation for the suggestions. Let's start by adding the route. Edit config/routes.rb.

Next, I'm going to add the implementation in app/controllers/search_controller.rb.

This method returns the search results for the entered term as JSON. I'm only searching by title, but I could include the body of the article too. I am also limiting the number of search results to a maximum of 10.
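
The action described above might be sketched as follows, using Searchkick's fields and limit options. The action name typeahead and the parameter name term are assumptions, and returning only the titles is one possible response shape.

```ruby
# app/controllers/search_controller.rb
class SearchController < ApplicationController
  # Returns up to 10 matching article titles as a JSON array.
  def typeahead
    articles = Article.search(params[:term], fields: [:title], limit: 10)
    render json: articles.map(&:title)
  end
end
```

Matching on partial words as the user types would need additional Searchkick configuration, which is not shown here.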

Now we are ready to try the typeahead implementation:

[Screenshot: Typeahead]

Conclusion

As you can see, using Elasticsearch with Rails makes searching our data really easy and very fast. Here I showed you how to use the low-level gems provided by Elasticsearch, as well as the Searchkick gem, which is an abstraction that hides some of the details of how Elasticsearch works. 

Depending on your specific needs, you might be happy to use Searchkick and get your full-text search implemented quickly and easily. On the other hand, if you have more complex queries including filters or groups, you might need to learn more about the details of the Elasticsearch query language and end up using the lower-level gems 'elasticsearch-model' and 'elasticsearch-rails'.
