Technical SEO: Additional Factors

Learn some additional technical SEO tips: using crawl budget, robots.txt, meta robots, hreflang, and structured data effectively to optimize your website for search engines and users.

There are some additional factors that can also help technically optimize a website. Not all of them will apply to every website, so we only have to deal with the ones that apply to our project.

Crawl budget

Crawl budget refers to the number of pages on our site that search bots will crawl and index within a given timeframe. Since crawl budget is limited, we need to make sure that all of our important content is being crawled and indexed efficiently.

Since Googlebot’s main priority is crawling and indexing, most webmasters won’t need to worry about their crawl budget [40]. Google will handle crawling their pages on its own. We only need to think about the crawl budget in the following cases:

  • If we have a new website with lots of pages (1,000+), as is often the case with e-commerce websites.

  • If we have a large website with millions of pages.

  • If our website is updated frequently and it matters that the indexed version is fresh, as is the case with news websites.

We can monitor our website’s crawling activity by using Google Search Console’s Crawl Stats Report. It gives us stats on Googlebot’s crawling history of our website. Google offers a guide on how to use the Crawl Stats Report, and explicitly states that we shouldn’t need to use the report if our website has fewer than a thousand pages [41].
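
We can also get a rough, do-it-yourself view of the same activity by counting Googlebot requests in our own server access logs. The sketch below is only illustrative and is not part of Google's tooling: it assumes Python, a combined-format access log at a hypothetical path, and it skips the reverse-DNS lookup that would confirm a request really came from Googlebot.

import re
from collections import Counter

LOG_PATH = "access.log"  # hypothetical path to the web server's access log

# Assumes the common "combined" log format:
# IP - - [date] "METHOD /path HTTP/1.1" status size "referrer" "user-agent"
LINE_RE = re.compile(r'"(?P<method>\w+) (?P<path>\S+)[^"]*" \d+ \S+ "[^"]*" "(?P<agent>[^"]*)"')

crawled = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = LINE_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            crawled[match.group("path")] += 1

# The most frequently crawled URLs give a rough picture of where the crawl budget goes
for path, hits in crawled.most_common(20):
    print(f"{hits:5d}  {path}")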

We can maximize our crawl budget by:

  • Improving our site’s speed, as advised by Google itself [42].

  • Removing duplicate content, or consolidating it by applying 301 redirects.

  • Keeping our sitemap updated and resubmitting the latest version to Google.

  • Fixing broken links.

  • Fixing long redirect chains (more than one redirect between the starting URL and the destination URL), since they negatively impact crawling and indexing. A quick way to check for both of these issues is sketched after this list.

  • Getting rid of outdated or low-quality content.

  • Using internal links, because Google prioritizes pages with more links pointing to them. Since backlinks aren’t completely within our control, we can fill the gaps with internal links.
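
Because broken links and redirect chains are easy to check programmatically, here is a minimal, illustrative sketch using Python’s requests library. The URLs listed are hypothetical placeholders; in practice the list would come from our sitemap or a crawl of the site.

import requests

# Hypothetical URLs to audit; in practice, read these from the sitemap
urls = [
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/old-page",
]

for url in urls:
    try:
        response = requests.get(url, allow_redirects=True, timeout=10)
    except requests.RequestException as error:
        print(f"{url}: request failed ({error})")
        continue

    hops = len(response.history)  # each entry in history is one redirect hop
    if response.status_code >= 400:
        print(f"{url}: broken ({response.status_code})")
    elif hops > 1:
        chain = " -> ".join(r.url for r in response.history) + f" -> {response.url}"
        print(f"{url}: redirect chain with {hops} redirects: {chain}")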

The robots.txt file

Setting up the robots.txt file correctly is another plus for our website’s technical SEO. The robots.txt file, which implements the Robots Exclusion Protocol (and is often referred to by that name), is the first thing Googlebot will try to retrieve to check which of the website’s pages it has permission to crawl.

A robots.txt file must be placed in the topmost directory of the website and has a URL similar to the one below:

http://www.yourdomain.com/robots.txt

The contents of a typical robots.txt file look similar to the example below.
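
The directives shown here are only an illustrative sketch; the blocked paths are hypothetical, and the actual rules depend entirely on which parts of the site we want crawled:

# Rules for all crawlers
User-agent: *
# Hypothetical sections we don't want crawled
Disallow: /admin/
Disallow: /cart/

# Optional: tell crawlers where the sitemap lives
Sitemap: http://www.yourdomain.com/sitemap.xml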
