tl;dr - You can write 632 rock-solid UI tests with Capybara and RSpec, too.
We have exactly 231 integration tests and 401 view tests out of a total of 3086 in our core application today. That adds up to 632 tests that exercise the UI. The vast majority use RSpec with Capybara and Selenium. This means that every time the suite runs, we set up real data in a local MongoDB and use a real browser to hit a fully running local application, 632 times. The suite currently takes 45 minutes to run headless on a slow Linode, with the UI tests taking more than half of that time.
Keeping UI tests reliable is notoriously difficult. For the longest time we felt depressed under the Pacific Northwest-like weather of our Jenkins CI and blamed every possible combination of code and infrastructure for the many intermittent failures. We've also gone on sprees of marking such tests "pending".
We've learned a lot and stabilized our test suite. This is how we do UI testing.
An Asynchronous Application
The splash page on Artsy is a Backbone.js application where views fade in and out depending on user actions. It also implements a responsive layout, because some elements cannot render on mobile devices and others shouldn't render at certain browser sizes.
The application is initialized in the usual Backbone way.
From here, everything is asynchronous. The router will wire up the events and the different views that make up the page will render themselves.
Testing a Login Form
When a user clicks the "Log In" link, they see the Splash.Views.Login Backbone view. There's no page reload or server roundtrip: the current view is swapped out for the incoming Backbone view, and some CSS animates the transition.
The log-in view has two input fields: an e-mail address and a password. We can write a Capybara test that enters valid values and verifies that the user is logged in by checking for a specific header.
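As a rough sketch, the steps of such a test can be written as a plain Ruby helper that drives any object exposing Capybara's session methods (click_link, fill_in, click_button, has_css?). Only the "Log In" link and the user[email] field come from this post; the helper name, the user[password] field, the button label and the h1 selector are assumptions made for illustration.

```ruby
# Hypothetical helper: walks through the log-in flow on a Capybara-style session.
# In a real spec `page` is Capybara's session object.
def log_in_and_check_header(page, email:, password:, header_text:)
  page.click_link "Log In"                       # the login view is swapped in
  page.fill_in "user[email]", with: email        # Capybara retries until the field exists
  page.fill_in "user[password]", with: password  # assumed field name
  page.click_button "Log In"                     # assumed button label
  # Assumed selector; has_css? waits for a match and returns true or false.
  page.has_css?("h1", text: header_text)
end
```

Inside an RSpec example this would be called after visiting the splash page, with an expectation that the return value is true.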
This test works well with Capybara because Capybara waits for elements to appear on the page. For example, fill_in repeatedly tries to locate a field whose id, name, or label matches user[email], until the field is on the page or the wait times out.
Waiting for Explicit Events
The above test is "reliable" within some limits. It works as long as all the necessary asynchronous events run within a timeout period. But what if they don't? What if the test hardware is taking a break from flushing to disk? Or waiting on Google Analytics when the network cable is unplugged, which shouldn't affect the outcome of the test? These external issues make this code very brittle, so everyone keeps increasing the default timeout values.
A winning strategy to avoid this is to introduce explicit wait controls inside the tests. These wait up to Capybara.default_wait_time for a condition to become true, so you no longer have to know which Capybara methods wait for a timeout and which don't. This effectively breaks up a single wait into multiple waits.
Consider a widget that needs to be saved by making a postback.
When the widget is saved, its element gets a .saved CSS class. The test can wait for it.
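A minimal sketch of such an explicit wait, written in plain Ruby so that it runs without a browser. The name wait_for, the polling interval and the keyword arguments are assumptions; in a spec the timeout would typically be Capybara.default_wait_time and the block would query the page.

```ruby
# Hypothetical helper: polls the block until it returns a truthy value,
# or raises once `timeout` seconds have elapsed.
def wait_for(timeout: 2, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end
```

For the widget, the block could be something like page.evaluate_script("$('#widget').hasClass('saved')"), where #widget is a placeholder selector. evaluate_script does not wait on its own, so the explicit wait is what gives the asynchronous save time to finish.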
There's Just Too Much Going On
- How can we wait on all remaining AJAX requests to finish?
- How can we wait on all remaining DOM events to finish?
Remaining AJAX Requests
If you're using jQuery, you can check the number of active connections to a server. The number is zero when all pending AJAX requests have finished. The idea originally came from Pivotal.
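A sketch of this check, assuming the page loads jQuery, whose jQuery.active counter holds the number of AJAX requests in flight. The helper takes the session as an argument so that it can be exercised without a browser; the timeout and interval defaults are illustrative, not values from our suite.

```ruby
# Waits until jQuery reports no in-flight AJAX requests.
# `page` is Capybara's session in a real spec; it only needs evaluate_script.
def wait_for_ajax(page, timeout: 2, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  # jQuery.active counts requests that have started but not yet completed.
  until page.evaluate_script("jQuery.active").to_i.zero?
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "AJAX requests still pending after #{timeout} seconds"
    end
    sleep interval
  end
  true
end
```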
Remaining DOM Events
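One way to wait for queued DOM work, sketched here as an assumption rather than a record of our exact helper: schedule a callback behind the tasks the browser already has queued, have it append a marker element with a unique id, and let Capybara's find wait for that marker. The use of setTimeout with a zero delay and the marker id format are illustrative choices.

```ruby
require 'securerandom'

# Hypothetical helper: waits for a marker element that the browser appends
# only after the tasks queued ahead of the callback have run.
# `page` is Capybara's session in a real spec (needs find and execute_script).
def wait_for_dom(page)
  marker_id = "dom-marker-#{SecureRandom.hex(8)}"
  page.find("body") # make sure the body element exists first
  page.execute_script(<<~JS)
    setTimeout(function () {
      var marker = document.createElement("div");
      marker.id = "#{marker_id}";
      document.body.appendChild(marker);
    }, 0);
  JS
  # The marker is an empty div, so ask Capybara to match non-visible elements too.
  page.find("##{marker_id}", visible: :all)
end
```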
We do have to make sure that the body element is loaded first. This lets us call wait_for_dom right after navigating to a page that executes AJAX queries on load.
With enough attention, we were able to explain and fix most spec failures. When implementing Capybara tests, we favor explicit waits, and we combine the two wait functions above when we just want to make sure that everything on the page has loaded and is ready for more action.
Finally, integration tests are essential for continuous deployment. They are very much worth the extra development effort.