How To Make JavaScript Apps SEO-Friendly

New JavaScript-based frameworks keep appearing day by day, since JavaScript is one of the most popular and widely known languages. Angular.js, Ember.js, and the server-side runtime Node.js are a few that many web developers use. The main topic here is how to make these client-side applications SEO-friendly: Google's bots request pages from the server, but these frameworks render pages dynamically on the client, and that is the core problem. The goal is to serve the rendered result from the server, so that bots see the pages as if they were static pages coming from the server. A typical Angular app's HTML head section contains a title and meta tags such as keywords and description, which are the most important parts of any HTML page for SEO purposes.

<head>
<meta charset="utf-8">
<title>
{{ title }}
</title>
<meta name="keywords" content="{{ keywords }}">
<meta name="description" content="{{ description }}">
<link type="image/x-icon" data-ng-href="https://website.com/favicon.ico" rel="shortcut icon">
<link rel="stylesheet" href="https://website.com/style.css">
</head>

This is the template for JS-based apps: the variables inside the curly braces are replaced by JavaScript with a meaningful title, keywords, and description. The problem is that a search engine like Google can't execute that JavaScript, so the crawler just reads the raw placeholders such as {{ title }}, {{ keywords }} and {{ description }} as the page's text. This is very bad, because the Google bot never gets our actual title, keywords, or description, and for every page it gets the same result.
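For illustration, here is a minimal sketch of how those placeholders might be filled on the client side, assuming an AngularJS app; the module name, controller name, and values are hypothetical.

// a minimal sketch, assuming AngularJS; names and values are illustrative only
angular.module('blogApp', [])
  .controller('PageCtrl', function ($scope) {
    // these values only exist after the browser runs the script,
    // which is exactly what a non-JS-executing crawler never does
    $scope.title = 'How To Make JavaScript Apps SEO-Friendly';
    $scope.keywords = 'javascript, seo, angularjs, phantomjs';
    $scope.description = 'Making client-side JavaScript apps readable to search engines.';
  });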

So what is the solution that serves search engines the post-processed HTML from the server instead of the raw templates? There are numerous techniques and tricks for this, but three popular, feasible, and affordable solutions are briefly explained below.

Cloaking

Cloaking is the technique of serving different content to the Google robot than is served to actual users. In the past it was mostly used by spammers to get their web pages indexed, so cloaking earned a bad reputation, and Google once banned lots of sites that used it to game SEO. But don't worry: once JavaScript-based apps became popular, Google endorsed this kind of pre-rendering. Just remember that the content sent to the robot must be the same content the browser actually renders, otherwise there is a high chance of being banned.

Doing this is fairly simple: using a headless browser like PhantomJS, which can execute JavaScript on the server, every single page can be rendered on the web server and handed to the Google robot. The actual process is as follows:

  1. The bot requests some web page, say website.com/some-content.html.
  2. The web server detects that the request comes from a bot and sends a special request (seo-app.website.com/?page=/some-content.html) to an SEO-specific app (a sketch of such a bot-detecting proxy follows this list).
  3. The app (i.e. a server-side, headless PhantomJS app) handles the request, renders the actual content of the URL "website.com/some-content.html", and passes it to the Google bot.
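As a rough idea of step 2, here is a minimal sketch of a bot-detecting proxy, assuming an Express server and a hypothetical SEO app at seo-app.website.com; the user-agent list and URLs are illustrative only, not a production-ready setup.

var express = require('express');
var request = require('request');
var app = express();

// very naive bot detection by user-agent; real setups usually match more crawlers
var BOT_UA = /googlebot|bingbot|yandexbot|baiduspider/i;

app.use(function (req, res, next) {
  if (BOT_UA.test(req.headers['user-agent'] || '')) {
    // hand the original URL to the rendering app and stream its HTML back to the bot
    request('http://seo-app.website.com/?page=' + encodeURIComponent(req.url)).pipe(res);
    return;
  }
  next(); // normal visitors get the regular client-side app
});

app.listen(80);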

The figure below illustrates this flow.

(Figure: JavaScript SEO using PhantomJS)
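The SEO app itself can be as simple as a PhantomJS script that loads the page, waits for the client-side rendering to finish, and prints the resulting HTML. Below is a minimal sketch; the fixed timeout and the script name are assumptions, not a complete implementation.

// render.js - run with: phantomjs render.js http://website.com/some-content.html
var page = require('webpage').create();
var system = require('system');
var url = system.args[1];

page.open(url, function (status) {
  if (status !== 'success') {
    console.log('Failed to load ' + url);
    phantom.exit(1);
  } else {
    // crude approach: give the client-side framework some time to render,
    // then dump the post-processed HTML to stdout
    setTimeout(function () {
      console.log(page.content);
      phantom.exit();
    }, 2000);
  }
});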

There are also some online services that can handle this type of job for you without deploying anything on your own server.

BromBone

Another option is an online service called BromBone, which crawls your sitemap, renders all the pages to static HTML, and stores them on Amazon (presumably S3). Using it makes your life a lot easier and removes the hassle of setting up your own middleware SEO app on your server.

This is similar to the approach above, with one difference: BromBone renders the pages from the sitemap beforehand instead of rendering them on the fly. When the Google robot crawls the website, you have to proxy the request to the BromBone page, from which it gets an exact copy of the page a user would see in the browser. This is instant, and the bot doesn't have to wait for the page, which is an extra benefit SEO-wise.

It can be problematic, though, because it relies on the sitemap. What if the sitemap is not updated frequently? Then it can't give the Google bots new content. To solve this problem there is another service, called Prerender, which uses a few extra tricks.

Prerender

Prerender is an open-source library and can be deployed on your own server for free. It also uses PhantomJS to prerender pages on the fly, with some additional tricks to set the correct status codes and HTTP headers. Because it renders content dynamically, it has one disadvantage: a performance penalty, since the Google bot has to wait while Prerender works on generating the required page. There is a simple solution for this: you can send requests manually to warm up Prerender's cache, i.e. cache all the pages of the website that the Google bot may hit beforehand, so there is no extra waiting for the bot because pages are served directly from the web server's cache.

For example, a GET request to http://prerender.website.com/http://website.com/some-content.html will prerender the page on the fly and store it in the web server's cache, so real bots can get the page instantly from there. Similarly, if the content is updated, a POST request to http://prerender.website.com/http://website.com/some-content.html will refresh the prerendered copy. This is the extra manual work that is necessary if you host this SEO app on your own server.
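For illustration, the cache warm-up and refresh could be scripted roughly like this, assuming a self-hosted Prerender instance at prerender.website.com and the request module; the URLs are examples only.

var request = require('request');

var prerendered = 'http://prerender.website.com/http://website.com/some-content.html';

// GET: render the page now and store the result in the cache
request.get(prerendered, function (err, res) {
  console.log('warm-up status:', err || res.statusCode);
});

// POST: force the cached copy to be regenerated after the content changes
request.post(prerendered, function (err, res) {
  console.log('refresh status:', err || res.statusCode);
});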

Alternatively, you can use Prerender's hosted SaaS offering, which saves you from doing any of this manually, and its pricing is also very affordable.

Which technique you prefer is your choice. Either way, in 2014 we can build SEO-friendly JavaScript apps much more easily.