Capybara, Cuprite and a slow-scrolling Chrome ARM

It sounds like the start to a bad joke...

Capybara, Cuprite and a slow-scrolling Chrome ARM walk into a bar....

...but the joke was definitely on me with a bunch of randomly failing tests, all with a variation of the same message.

The Error

A lot of system tests were randomly failing with the same error, with only the coordinates varying.

Capybara::Cuprite::MouseEventFailed: 
    Firing a click at coordinates [600.0,4548.3984375] failed. 
    Cuprite detected another element with CSS selector "none" 
    at this position. It may be overlapping the element you 
    are trying to interact with.

The Problem

Since I'm using Capybara and Cuprite, it was easy enough to switch the browser to headful mode, enable :slowmo and watch what happened.

When I tested all the failing tests, the one thing they all had in common was the element to be clicked was below the fold and not in the viewport.

What became very clear was that the ARM version of Chrome was scrolling the page at around one line per second - ridiculously slow.

The Solution

The solution was to disable smooth-scrolling in the Chrome browser. This was done in the Cuprite driver configuration by adding 'disable-smooth-scrolling' => true to the browser options.

Capybara::Cuprite::Driver.new(
  app,
  **{
    # set the window size
    window_size: [1200, 800],
    # set browser options (disable-smooth-scrolling required for ARM chips)
    browser_options: { 'disable-smooth-scrolling' => true },
  }
)

(see the full configuration file below)

BONUS: Turning off smooth scrolling improved the performance of all the other tests too, not just the failing ones, reducing my total system test time by 13% - WIN!

The Process

I love tests that fail consistently. They let me pin-point a specific problem and I know it's fixed when they pass. Randomly failing tests though are a real pain, but often much more interesting :)

In this case, each individual test would fail in roughly four out of every five attempts.
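Putting a number on "roughly four out of five" makes it easier to tell whether a change actually helps. A minimal sketch of a helper that reruns something and reports the failure rate might look like this; the helper name and the test path in the usage comment are placeholders for illustration, not part of the original setup.

```ruby
# Runs the given block `runs` times and returns the fraction of runs that
# failed. The block should return a truthy value on success and a falsy
# value (false or nil) on failure.
def failure_rate(runs)
  failures = runs.times.count { !yield }
  failures.fdiv(runs)
end

# Example usage (the test path is a placeholder):
#   rate = failure_rate(5) { system('bin/rails', 'test', 'test/system/some_test.rb') }
#   puts "failed #{(rate * 100).round}% of runs"
```

Because `system` returns true, false or nil, a crashed or missing command is counted as a failure too.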

The errors all state there is something with CSS selector "none" covering the element I want to click. That's a confusing message because every element in the DOM has a CSS selector of some sort. On the other hand, because the tests randomly pass it can't be the Capybara selector that's the problem.

With Capybara and Cuprite it was easy enough to change to a headful state with :slowmo enabled and watch what happens, pausing the test just before the failure so I could see what was covering the element.
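For anyone wanting to reproduce that setup, here is a rough sketch of toggling headful mode and slow motion from the environment. WITH_BROWSER is the variable used in the configuration in this post; SLOWMO and the helper name are assumptions made up for this example. The `slowmo` driver option itself comes from Ferrum and is a delay in seconds before each browser command.

```ruby
# Builds the debugging-related Cuprite driver options from the environment.
# WITH_BROWSER comes from the post's configuration; SLOWMO is a hypothetical
# variable introduced here for illustration.
def debug_driver_options(env = ENV)
  # Stay headless unless WITH_BROWSER is set to something non-empty
  return { headless: true } if env['WITH_BROWSER'].to_s.empty?

  {
    headless: false,
    # Delay in seconds before each command is sent to the browser
    slowmo: Float(env.fetch('SLOWMO', '0.5'))
  }
end

# Intended use inside the driver registration (requires the cuprite gem):
#   Capybara::Cuprite::Driver.new(app, window_size: [1200, 800], **debug_driver_options)
```

To stop just before the failing step, Cuprite's README describes `page.driver.debug(binding)`, assuming your Cuprite version provides it and `inspector: true` is set.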

On the first test, I scrolled down to the button and could see nothing covering it. I inspected it with developer tools and it was completely clear, on display and ready for clicking, so I let the test continue - aaaand it passed!

Well, that was no good, I needed to catch a failing test so I tried again, with the same successful result!

OK, I'll try some of the others - and in each case they successfully passed.

Perhaps I'd changed something by accident? So I individually re-ran the failing tests in headless mode and they all failed.

It's getting interesting now!

The first breakthrough was, as is often the case, by accident. I ran an individual test again to watch it in the browser but I forgot to add the debug command, so the test ran all the way through and failed - YES!

Was it pausing for the debug that allowed the test to pass? I added a sleep(1) just before the failing command and it made no difference. So it wasn't a timing issue.

After faffing around for a while, I realised that the tests would pass if I scrolled to the element before continuing - but would fail if I didn't scroll.

It's now clear the problem is scroll-related, and that gave me a few options to try:

When Capybara scrolls the element before clicking, does it put it at the bottom of the page under any cookie notice?
When Capybara scrolls the element before clicking, does it put it at the top of the page under the fixed nav-bar?
What if I scroll the element into view before clicking?

The first two were easy to dismiss, as the error would have given their CSS selectors rather than stating "none".

The third option, to pre-scroll the element, was worth trying so I added appropriate code...

scroll_to('#submit_btn', align: :center)
find('#submit_btn').click

...and it passed! I quickly ran it again, and it failed, and failed, and failed!

This reduced my failure rate on this specific test from 4/5 to 3/5. Watching in the browser, I could see that sometimes the scroll was almost instantaneous (passing the test) and at other times still incredibly slow (failing the test), but it was never consistent.

I wrote a quick macro to bypass Capybara and ask the driver to scroll the window to specific coordinates that would bring the element to mid-viewport.

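# Scrolls the window so the first element matching the CSS selector
# sits roughly mid-viewport, bypassing Capybara's own scrolling.
def scroll_into_view(selector)
  window_height = page.driver.browser.viewport_size.last
  scroll_offset = window_height / 2

  element_node = page.driver.browser.css(selector).first
  element_top  = Integer(element_node.find_position.last)

  page.driver.scroll_to(0, element_top - scroll_offset)
  sleep(1) # crude pause to give the scroll time to finish
end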

This again reduced my failure rate, to around 2/5. An improvement that clearly highlighted the issue was slow scrolling in the browser.
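A fixed sleep(1) is a guess at how long the scroll takes. One alternative, shown here only as a sketch and not something used in the final fix, is to poll the scroll position until it stops changing. The helper name and its defaults are assumptions for illustration.

```ruby
# Polls the block until it returns the same value twice in a row, or until
# `timeout` seconds have passed. Returns true if the value settled and
# false if the timeout was reached first.
def wait_until_stable(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  previous = yield
  loop do
    sleep(interval)
    current = yield
    return true if current == previous
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline

    previous = current
  end
end

# In a system test (assumes Capybara's `page` is available and that
# `target_y` has already been calculated):
#   page.driver.scroll_to(0, target_y)
#   wait_until_stable { page.evaluate_script('window.scrollY') }
```

The interval has to be longer than the gap between scroll steps, otherwise a very slow scroll could look settled when it has only paused between frames.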

Because I was using Cuprite, I couldn't easily test with another browser, so I needed to check what options I had with Chrome.

Cuprite is based on Ferrum, whose documentation pointed me to a great List of Chromium Command Line Switches by Peter Beverloo.

Trying out all the switches related to scrolling led to the answer: disable-smooth-scrolling.
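Trying switches one at a time is less tedious if they can be passed in from the command line instead of editing the driver configuration for each attempt. A small sketch, assuming a made-up CHROME_SWITCHES environment variable and a hypothetical helper name, that produces the same hash shape as the `browser_options` setting used in this post:

```ruby
# Turns a comma-separated list of Chrome switch names into a hash of
# { 'switch-name' => true } pairs, the shape used for `browser_options`
# in this post. CHROME_SWITCHES is a hypothetical variable name.
def browser_options_from_env(env = ENV)
  env.fetch('CHROME_SWITCHES', '')
     .split(',')
     .map { |name| name.strip.delete_prefix('--') } # accept "--flag" or "flag"
     .reject(&:empty?)
     .to_h { |name| [name, true] }
end

# Intended use in the driver configuration:
#   browser_options: browser_options_from_env
# and on the command line:
#   CHROME_SWITCHES=disable-smooth-scrolling bin/rails test:system
```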

I added it to the browser options in my driver configuration...

Capybara.register_driver(:cuprite) do |app|
  Capybara::Cuprite::Driver.new(
    app,
    **{
      # set the window size
      window_size: [1200, 800],
      # set browser options (disable-smooth-scrolling required for ARM chips)
      browser_options: {
        'disable-smooth-scrolling' => true
      },
      # Increase Chrome startup wait time (required for stable CI builds)
      process_timeout: 10,
      # standard timeout: default is 5 - must match Capybara.default_max_wait_time
      timeout: 5,
      # Don't re-raise JS errors in Ruby (set to true to re-raise them)
      js_errors: false,
      # Enable debugging capabilities
      inspector: true,
      # Run Chrome in headful mode when WITH_BROWSER is set to a non-blank value
      headless: !ENV['WITH_BROWSER'].present?,
      # blacklist external sites to stop TimeoutErrors
      url_blacklist: [
        "script.crazyegg.com",
        "p.typekit.net",
        "fonts.googleapis.com",
        "www.googletagmanager",
        "www.facebook.com",
        "facebook.com",
        "connect.facebook.net",
        "facebook.net"
      ]
    }
  )
end

... and every test passed without any changes - WIN!

Chrome Version 103.0.5060.53 (Official Build) (arm64)

The crazy thing is, I looked at the Chrome flags much earlier in the process, but it clearly stated that 'smooth scrolling' was "Not available on your platform". Looks like a bug in Chrome somewhere!

Conclusion

Although it took a while to fix this, the reduction in test run time will easily repay the time lost, and the lesson can be applied to all the other projects I'm involved with, hopefully with a comparable reduction in test-run time.

There were no other blogs or questions on Stack Overflow about this, so I thought it was time to bite the bullet and write my first blog post! I hope it saves you some time, and that you enjoyed the approach.

CodeMeister