database labs performance rails

Using database views for performance wins in Rails

Sometimes you come across a problem where your go-to approaches start to fail you. I came across one of these examples recently where a Rails app required a report to be built which consumed data from almost every model within the application. The page took a long time to render and so we began to discuss how we would deal with this performance problem. There was also the secondary performance and aesthetic problem of the view for the page containing nested loops to render the page.

There are several approaches that can be taken when using ActiveRecord including eager loading, database table joins using the ActiveRecord DSL, a custom query, caching or using a database view. Our application used each one of these approaches depending on what was best for the problem at hand, and here I’ll discuss one of the lesser used approaches with database views.

Database views

A database view is “the result set of a stored query which users can query just as they would in a persistent database collection.”)

One of the side affects of the above statement with respect to ActiveRecord is that when an application model extends ActiveRecord::Base it doesn’t care if the collection it’s modelling is a table or a view. This means an application can put the query logic into the database, and keep the application source code lighter weight.

class Foo < ActiveRecord::Base
    # table 'foos'
end

class Bar < ActiveRecord::Base
    # view 'bars'
end

In doing this, there is now an implicit dependency on how the database constructs that query logic, but fortunately it’s easy to test. Just as you may write tests for an application model using the database, the same can be applied for views. I tend to try and have applications not hitting the database for unit tests for test performance and isolation reasons, but here we are testing the integration between the application the ORM and the database, so we definitely want to hit the database.

Let’s take a simple example of a Twitter like message service. On the users signed in homepage they see a list of messages from people they follow with the message content, the username and avatar for the user. This may be spread across two or more tables depending on the design, for now let’s assume messages are in the messages table , and the user details are in the users table.

class User < ActiveRecord::Base
    has_many :messages
End

class Follow < ActiveRecord::Base
    belongs_to :follower, class_name: 'User'
    belongs_to :target, class_name: 'User'
end

class Message < ActiveRecord::Base
    belongs_to :user
end

When we load the messages to display to the logged in user, we would load the set of users whom the logged user follows, then the collection of messages and then for each message we would load the associated user model to display the username and avatar. We might create a new model called UserMessage to house the data for each message and user detail to be displayed. In the test we can create the data we need, a user and a message that the user has sent. We can then create an assertion that the UserMessage model can retrieve and expose the details needed for that page.

it 'uses the view' do
    user = User.create(username: 'robb1e')
    message = Message.create(user: user, content: 'Hello, world')

    user_message = UserMessage.first
    expect(user_message).to be
    expect(user_message.content).to eq('Hello, world')
    expect(user_message.username).to eq('robb1e')
    expect(user_message.user_id).to eq(user.id)
    expect(user_message.message_id).to eq(message.id)
end

In true, red, green, refactor mode, the above fails as the model UserMessage does not exist.

class UserMessage < ActiveRecord::Base
  self.primary_key = 'message_id'
end

We also need a migration which creates the view

CREATE VIEW user_messages AS
  SELECT u.username, u.id AS user_id, m.content, m.id AS message_id
  FROM messages m
  INNER JOIN users u ON u.id = m.user_id

There is an argument to be made about the reasoning behind now having logic in more than one place, the application source code and the database, but this is for me is about using the best tool for the job, and putting the application logic in the optimal position. In this case databases can be extremely efficient at performing unions, intersections and calculations rather than mapping over in memory objects to perform the same operations. We can also make the most of some nice Rails features such as how the application can render models. In this case, the application could render the collection of user_messages implicitly using the file app/views/user_messages/_user_message.html.erb.

We’ve explored how database views can move data centric logic from the application to the database, and how performance benefits can be brought due to that change. We’ve discovered how we an test that logic within the application and how some Rails mechanisms can be used to clean up the code further.