Improve Googlebot Rendering by Dynamically Omitting Non-Essential Scripts

Dec 13, 2018 Last updated Dec 13, 2018

Improve rendering for bot clients by omitting bloated third-party scripts.

I just re-watched this year's Google I/O talk, "Deliver search-friendly JavaScript-powered websites", which goes behind the curtain on how Google indexes sites that build their HTML in the client via JavaScript.

Dynamic Rendering

In the talk, John Mueller describes Dynamic Rendering, where the server detects the User-Agent and dynamically renders a more digestible variant of the content if the User-Agent indicates the request is coming from a bot.

This is not to be confused with cloaking, which is showing different or modified content based on the User-Agent. Dynamic Rendering should not result in different content, but rather in pre-rendered HTML that bots can parse without having to run JavaScript. It's not intended to be deceptive, but to be more helpful.
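
To make the idea concrete, here's a minimal sketch of Dynamic Rendering as an Express middleware. This isn't taken from the talk or from Google's docs: the bot pattern is abbreviated, and prerender is a hypothetical stand-in for whatever headless-browser service (Rendertron, a Puppeteer pool, etc.) would actually produce the static HTML.

```typescript
import express from "express";

// Abbreviated bot check for illustration; a real setup would use a maintained list.
const BOT_UA = /googlebot|bingbot|yandexbot|duckduckbot|baiduspider/i;

// Placeholder: in a real setup this would call out to a headless-browser
// service and return the fully rendered HTML for the requested URL.
async function prerender(url: string): Promise<string> {
  return `<!doctype html><html><body>Pre-rendered ${url}</body></html>`;
}

const app = express();

app.get("*", async (req, res, next) => {
  const ua = req.get("user-agent") ?? "";
  if (BOT_UA.test(ua)) {
    // Same content as the client-side app, just pre-rendered so the crawler
    // never has to execute JavaScript.
    res.send(await prerender(req.originalUrl));
  } else {
    next(); // fall through to the regular client-rendered app
  }
});

app.listen(3000);
```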

A few months after the talk, Google published documentation about Dynamic Rendering, which indicates there are now exceptions to Google's previous cloaking policy (namely, that there was no such thing as white hat cloaking).

Is it Wise?

Managing two different versions of content, one for bots and another for humans, doesn't seem maintainable, and Google themselves say it's only a temporary solution:

Currently, it's difficult to process JavaScript and not all search engine crawlers are able to process it successfully or immediately. In the future, we hope that this problem can be fixed, but in the meantime, we recommend dynamic rendering as a workaround solution to this problem.

At work I've been looking into a specific page that gets terrible results in Google's simulated bot rendering tools. The page loads and renders very slowly, which causes the Googlebot renderer to stop prematurely, resulting in a partial render that dings our SEO score.

We are already rendering our pages on the server, so Dynamic Rendering wouldn't be useful, but the concept made me think of another solution.

Dynamic Omission

I'm thinking it would be good to dynamically omit supplemental scripts that don't directly affect page behavior. Example candidates would be third-party scripts, such as Segment or Optimizely, and first-party scripts used for analytics. Without these scripts, the renderer would quickly finish loading and executing all of the JavaScript and conclude that it's finished rendering.

All of these third-party scripts can be omitted for bot traffic.
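
As a rough sketch of Dynamic Omission (not our actual templates), the server-side render could take an isBot flag and simply leave the supplemental script tags out of the HTML sent to crawlers. The script URLs below are illustrative placeholders, not real keys or project IDs.

```typescript
// Server-side page render that conditionally omits supplemental scripts.
// None of these scripts affect what the page displays, so omitting them for
// bots shouldn't change the rendered content.
function renderPage(body: string, isBot: boolean): string {
  const supplementalScripts = isBot
    ? ""
    : `
    <script src="https://cdn.segment.com/analytics.js/v1/XXXX/analytics.min.js" async></script>
    <script src="https://cdn.optimizely.com/js/XXXX.js" async></script>
    <script src="/js/first-party-analytics.js" async></script>`;

  return `<!doctype html>
<html>
  <head>
    <title>Example page</title>${supplementalScripts}
  </head>
  <body>${body}</body>
</html>`;
}
```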

To do this, the list of bot user agents needs to be centralized; nobody should maintain a separate list of bots for each page. There is a pretty comprehensive list of bot user agents available as an NPM module.
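
That module isn't named here, so as a stand-in, the centralized check could be a single shared helper that every page (or middleware) imports; in practice the patterns would come from that maintained NPM list rather than the abbreviated hard-coded one below.

```typescript
// shared/isBotUserAgent.ts: the one place the bot list lives.
// Abbreviated for illustration; swap in a maintained list in practice.
const BOT_PATTERNS: RegExp[] = [
  /googlebot/i,
  /bingbot/i,
  /yandex/i,
  /duckduckbot/i,
  /baiduspider/i,
  /slurp/i, // Yahoo
];

export function isBotUserAgent(userAgent: string | undefined): boolean {
  if (!userAgent) return false;
  return BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
}
```

The render sketch above would then be driven by something like renderPage(body, isBotUserAgent(req.get("user-agent"))).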

But for maintainability's sake, should you check against a list of user agents in the code for each page, or should the tag injection be decoupled and live entirely inside some kind of tag manager, like Google Tag Manager? I'm leaning toward the latter.

I'll follow up about the results of Dynamic Omission in a later post.