Can Google index pages made with JavaScript?


It is true that Google announced in 2015 that it can render and index content that requires JavaScript, so in theory this shouldn't be a problem. However, theory and practice aren't always the same. Most SEO professionals agree that Google has a harder time reading and indexing pages that require JavaScript than pages that don't.

In autumn 2015, Google announced that its bots would also inspect the JavaScript and CSS of web pages, allowing them to render and understand pages much like a browser does. That was great news, but SEO problems remain:

Googlebot can easily scan and understand plain HTML pages, and it performs these operations very quickly. But when a page relies on JavaScript, indexing becomes a multi-step process. Let's take a look:

[Diagram: Googlebot's pipeline for indexing a JavaScript page: download the HTML, CSS, and JavaScript; queue the page for rendering; execute the JavaScript and fetch data from APIs; index the rendered content]
Only when all of these steps are complete can the bot discover new links and add them to the crawling queue. This process is sequential and significantly slower than indexing a plain HTML page.

The good news is that Google now indexes JavaScript quite quickly: as much as 60% of JavaScript content is indexed within the first 24 hours after the HTML is indexed.

But there is also bad news: as much as 32% of tested pages still have unindexed JavaScript content after a full month, for a variety of reasons.



Issues with indexing React sites: Long delays

If the content on a page updates frequently, crawlers should regularly revisit the page. This can cause problems, since reindexing may happen as much as a week after the content is updated, as Google Chrome developer Paul Kinlan reported on Twitter.

This happens because the Google Web Rendering Service (WRS) comes into play. After a bot has downloaded the HTML, CSS, and JavaScript files, the WRS runs the JavaScript code, fetches data from APIs, and only then sends the data to Google's servers.


Issues with indexing React sites: Limited crawl budget

The crawl budget is the maximum number of pages on your website that a crawler can process in a certain period of time. Once that time is up, the bot leaves the site no matter how many pages it has downloaded (whether that's 26, 3, or 0). If each page takes too long to load because of running scripts, the bot will simply leave your website before indexing it.

As for other search engines, Yahoo's and Bing's crawlers will still see an empty page instead of dynamically loaded content, so getting your React-based SPA to rank at the top on those engines is a lost cause. You should plan for this problem at the architecture design stage.



Making React SEO-friendly


There are a couple of ways to make a React app SEO-friendly: by creating an isomorphic React app or by using prerendering.

Isomorphic React apps

In plain English, an isomorphic JavaScript application (or in our case, an isomorphic React application) can run on both the client side and the server side. 

Thanks to isomorphic JavaScript, you can run the React app on the server and capture the rendered HTML that would normally be generated in the browser. This HTML can then be served to everyone who requests the site, including Googlebot.

On the client side, the app uses this HTML as a base and continues operating on it in the browser as if it had been rendered there. Additional data is loaded with JavaScript when needed, since an isomorphic app is still dynamic.
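To make this concrete, here is a minimal sketch of the server half, using Express and react-dom/server. The file name, port, and the App component are illustrative assumptions, not a fixed recipe:

```tsx
// server.tsx: a minimal server-side rendering sketch with Express.
// App is assumed to be your root React component.
import express from "express";
import React from "react";
import { renderToString } from "react-dom/server";
import App from "./App";

const app = express();

app.get("*", (_req, res) => {
  // Render the React tree to an HTML string on the server, so every
  // client (including Googlebot) receives fully populated markup.
  const html = renderToString(<App />);
  res.send(`<!DOCTYPE html>
<html>
  <head><title>My app</title></head>
  <body>
    <div id="root">${html}</div>
    <script src="/bundle.js"></script>
  </body>
</html>`);
});

app.listen(3000);
```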

An isomorphic app detects whether the client is able to run scripts. When JavaScript is turned off, everything is rendered on the server, so a browser or bot gets all meta tags and content in HTML and CSS.

When JavaScript is on, only the first page is rendered on the server, so the browser gets HTML, CSS, and JavaScript files. Then the JavaScript starts running and the rest of the content is loaded dynamically. Thanks to this, the first screen is displayed faster, the app is compatible with older browsers, and user interactions are smoother than on purely client-rendered websites.
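The counterpart in the browser is hydration: React attaches to the server-rendered markup instead of rebuilding it from scratch. A minimal sketch using the React 18 API (the file name is again illustrative):

```tsx
// client.tsx: a sketch of client-side hydration with React 18.
// hydrateRoot attaches event handlers to the existing server-rendered
// markup rather than re-creating the DOM.
import React from "react";
import { hydrateRoot } from "react-dom/client";
import App from "./App";

hydrateRoot(document.getElementById("root")!, <App />);
```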

Building an isomorphic app can be really time-consuming. Luckily, there are frameworks that facilitate this process. The two most popular solutions for SEO are Next.js and Gatsby.

Gatsby vs Next.js

The Gatsby framework solves the SEO challenge by generating static websites. All HTML pages are generated in advance, during the development or build phase, and are then simply served to the browser. They contain static data and can be hosted on any hosting service or in the cloud. Such websites are very fast, since they aren't generated at runtime and don't wait for data from a database or API.

But data is fetched only during the build phase, so if your web app has any new content, it won't be shown until another build is run.
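As a sketch of what this looks like, here is a hypothetical Gatsby page whose data is fetched with a GraphQL query at build time (the metadata fields are illustrative):

```tsx
// src/pages/index.tsx: a sketch of a Gatsby page. The GraphQL query
// runs once, during `gatsby build`, so the page ships as static HTML.
import * as React from "react";
import { graphql } from "gatsby";
import type { PageProps } from "gatsby";

type DataProps = {
  site: { siteMetadata: { title: string } };
};

// Executed at build time, not at request time.
export const query = graphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
  }
`;

export default function IndexPage({ data }: PageProps<DataProps>) {
  return <h1>{data.site.siteMetadata.title}</h1>;
}
```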

This approach works well for apps that don't update data too frequently, e.g., blogs. But if you want to build a web app that loads hundreds of comments and posts (like a forum or social network), it's better to opt for another technique.

The second approach is server-side rendering (SSR), which is offered by Next.js. In contrast to traditional client-side rendering, with SSR the HTML is generated on the server, and the server sends the already generated HTML and CSS files to the browser. With Next.js, HTML is generated each time a client sends a request to the server. This is vital when a web app contains dynamic data (as forums and social networks do). For SSR to work with React, developers also need a Node.js server that can process requests at runtime.
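For example, here is a minimal sketch of per-request rendering with Next.js's getServerSideProps; the API endpoint and the Post type are placeholders:

```tsx
// pages/forum.tsx: a sketch of per-request SSR in Next.js.
// getServerSideProps runs on the server for every incoming request,
// so crawlers always receive freshly rendered HTML.
import type { GetServerSideProps } from "next";

type Post = { id: number; title: string };
type Props = { posts: Post[] };

export const getServerSideProps: GetServerSideProps<Props> = async () => {
  // https://example.com/api/posts is a placeholder endpoint.
  const res = await fetch("https://example.com/api/posts");
  const posts: Post[] = await res.json();
  return { props: { posts } };
};

export default function Forum({ posts }: Props) {
  return (
    <ul>
      {posts.map((post) => (
        <li key={post.id}>{post.title}</li>
      ))}
    </ul>
  );
}
```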

Prerendering

The idea of prerendering is to render and cache HTML versions of all SPA pages on the server with the help of Headless Chrome. One popular way to do this is to use a prerendering service such as prerender.io. A prerendering service intercepts all requests to your website and, based on the user-agent header, determines whether a bot or a human is viewing the site.

If the viewer is a bot, it gets the cached HTML version of the page. If it’s a user, the single-page app loads normally.
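As a rough sketch of that user-agent check, implemented as Express middleware (the bot pattern, port, and render endpoint are hypothetical; real services like prerender.io ship their own middleware):

```ts
// A sketch of user-agent-based routing for prerendering.
// Assumes Node 18+ for the global fetch API.
import express from "express";

const BOTS = /googlebot|bingbot|yandexbot|duckduckbot|slurp/i;
const RENDERER = "http://localhost:3001"; // hypothetical Headless Chrome service

const app = express();

app.use(async (req, res, next) => {
  const userAgent = req.headers["user-agent"] ?? "";
  if (BOTS.test(userAgent)) {
    // Bots receive the cached, fully rendered HTML snapshot.
    const snapshot = await fetch(
      `${RENDERER}/render?url=${encodeURIComponent(req.originalUrl)}`
    );
    res.send(await snapshot.text());
  } else {
    // Human visitors get the normal single-page app.
    next();
  }
});

app.use(express.static("build")); // serve the SPA bundle
app.listen(3000);
```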