Google to Crawl and Index First 15 MB of Web Content

Written on 25 July, 2022 by Jay Salter
Categories Search Engine Optimisation
  1. What does this mean for my website’s SEO?
  2. How much is 15 MB worth of HTML content?
  3. Can I still use images and videos on my website?
  4. Will this affect my SEO rankings on Google?
  5. How can I improve my SEO rankings on search engines?

 

There has been much talk recently about Google’s 15 MB crawl limit. It sounds complicated, so let us break it down.

How does Google work?

To show users the pages most relevant to their search, Google first indexes web pages. The index entry for each page records its address and a description of its content.

To index pages, Google has to know what is there, so it crawls the web looking for new and updated pages. That’s a massive undertaking. Google achieves this with automated software, called Googlebot, that crawls pages from the web and indexes them.

As a web administrator or manager, you develop content that improves your SEO ranking, so that your site is more likely to be found by users searching against a particular set of criteria. If you are selling second-hand washing machines in Darwin, for example, you want your website to be found when someone searches “used washing machines, Darwin”.

There have been no changes in how Google indexes web pages. Google has simply documented and published the process it applies. That process indicates that Googlebot only ever “sees” the first 15 megabytes (MB) of each HTML or supported text-based file it fetches; anything after that point is not presented for indexing. So the first 15 MB of every page is the part that matters most. But is this a big deal?

 

What does this mean for my website’s SEO?

SEO is all about making sure that when Googlebot crawls your page, it is indexed in a way that helps you be discovered by those looking for what you offer, and found earlier than competitors chasing that same user.

How many web pages does Google index? Think billions. Although Google can crawl almost every public page on the internet, you can block crawling of pages that do not serve your SEO ranking objectives.
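Blocking crawling is normally done with a robots.txt file at the root of the site. As a rough sketch, Python's standard-library robots.txt parser can show which URLs a given set of rules would block. The rules, paths and example.com URLs below are made up for illustration, and Python's parser is not Google's own, so treat the result as an approximation rather than a statement of how Googlebot will behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /checkout/
Disallow: /internal-search/

User-agent: *
Disallow: /admin/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/products/",
        "https://example.com/checkout/cart",
    ):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind that robots.txt controls crawling only, which is the sense in which blocking is used above.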

So we now know with certainty that Googlebot will only index the first 15 MB of every page of your website. You should ensure that any SEO efforts are focused on this part of every page. Anything beyond that first 15 MB is not seen by Googlebot, so it should not impact SEO performance.

How much is 15 MB worth of HTML content?

There are very few pages on the internet that are bigger than 15 MB. In fact, the median size of an HTML file is about 500 times smaller, at 30 kilobytes (kB). The limit applies to the HTML file itself, not to the resources referenced within the page.

If you have a page with more than 15 MB of HTML content, it is likely to be slow to open and load as well. That could be a problem.

There are several ways to measure the size of your webpage HTML content. The easiest is probably using your own browser and its Developer Tools. Load the page as you usually would, then launch the Developer Tools and switch to the Network tab. Reload the page, and you should see all the requests your browser had to make to render the page. The top request is normally the HTML document itself, and that is the one you’re looking for, with the byte size of the page in the Size column.
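The same check can be scripted. The sketch below is only an illustration: it assumes that 15 MB means 15 × 1024 × 1024 bytes, and the function names and the user-agent string are invented for this example. A browser's size column may show the compressed transfer size, whereas Google's documentation, as far as we understand it, describes the limit as applying to uncompressed data, so the script measures the decoded HTML body.

```python
from urllib.request import Request, urlopen

# Assumption: "15 MB" is read as 15 * 1024 * 1024 bytes.
GOOGLEBOT_HTML_LIMIT_BYTES = 15 * 1024 * 1024


def html_size_report(html: bytes, limit: int = GOOGLEBOT_HTML_LIMIT_BYTES) -> dict:
    """Summarise how large an HTML document is relative to the crawl limit."""
    size = len(html)
    return {
        "size_bytes": size,
        "size_mb": round(size / (1024 * 1024), 3),
        "percent_of_limit": round(100 * size / limit, 2),
        "over_limit": size > limit,
    }


def fetch_html(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw HTML of a page (hypothetical helper, needs network)."""
    # urllib does not ask for gzip by default, so the body usually arrives
    # uncompressed; the user-agent string is a made-up placeholder.
    request = Request(url, headers={"User-Agent": "html-size-check/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # A 30 kB document, roughly the median HTML size mentioned earlier.
    print(html_size_report(b"a" * 30_000))
```

To check a live page, pass the result of `fetch_html("https://example.com/")` to `html_size_report`, replacing the address with your own.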

Can I still use images and videos on my website?

Google has stated that resources such as images and videos are fetched separately. Based on Google’s wording, it appears that the 15 MB cutoff applies to HTML only. Embedded resources and content referenced with IMG tags are not part of the HTML file and do not count towards the 15 MB total. An IMG tag embeds an image in an HTML page by referencing the address of the image file.

Will this affect my SEO rankings on Google?

SEO will remain an essential strategy for your website, always. It is unlikely, however, that the announcement of how Google crawls webpages will make any difference to your SEO success.

The likelihood of your pages being greater than 15 MB is small. And again, if pages are larger than 15 MB, it is likely that you are already encountering performance issues in terms of loading times.

To hedge your bets and help ensure your webpages are accurately classified by Googlebot, place critical content near the top of webpages. This means code must be structured so that SEO-relevant information is found within the first 15 MB of an HTML or supported text-based file.
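One way to sanity-check this is to look at the byte offset of a key piece of content. The snippet below is a minimal sketch: the function name and sample page are invented for illustration, it again assumes 15 MB means 15 × 1024 × 1024 bytes, and the demonstration uses a tiny 64-byte limit only so that the effect is visible on a small page.

```python
# Assumption: "15 MB" is read as 15 * 1024 * 1024 bytes.
CRAWL_LIMIT_BYTES = 15 * 1024 * 1024

# Made-up sample page: title near the top, footer after some filler.
SAMPLE_PAGE = (
    b"<html><head><title>Used washing machines, Darwin</title></head>"
    + b"<p>filler</p>" * 10
    + b"<footer>Contact us</footer></html>"
)


def is_within_crawl_window(
    html: bytes, marker: str, limit: int = CRAWL_LIMIT_BYTES
) -> bool:
    """Return True if the first occurrence of marker ends within `limit` bytes."""
    needle = marker.encode("utf-8")
    start = html.find(needle)
    if start == -1:
        return False  # the marker does not appear in the page at all
    return start + len(needle) <= limit


if __name__ == "__main__":
    # Artificially small limit, purely to demonstrate the cutoff.
    print(is_within_crawl_window(SAMPLE_PAGE, "<title>", limit=64))
    print(is_within_crawl_window(SAMPLE_PAGE, "<footer>", limit=64))
```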

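It also means images and videos should be compressed and referenced as separate files, not encoded directly into the HTML, because anything encoded inline becomes part of the HTML file and counts towards its size.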
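Images encoded directly into the HTML usually appear as base64 data URIs. As a rough illustration, the snippet below estimates how many bytes of a page are taken up by such inline data. The regular expression is a simplification that only looks for base64 data URIs, and the function name and sample markup are invented for this example.

```python
import base64
import re

# Simplified pattern for base64 data URIs such as data:image/png;base64,....
DATA_URI_PATTERN = re.compile(
    rb"data:[\w.+-]+/[\w.+-]+(?:;[\w=.+-]+)*;base64,[A-Za-z0-9+/=]+"
)


def inline_data_uri_bytes(html: bytes) -> int:
    """Return the total number of bytes occupied by base64 data URIs."""
    return sum(len(match.group(0)) for match in DATA_URI_PATTERN.finditer(html))


if __name__ == "__main__":
    # 300 bytes of dummy image data, encoded inline for demonstration.
    payload = base64.b64encode(b"\x00" * 300)
    inline_page = b'<img src="data:image/png;base64,' + payload + b'">'
    linked_page = b'<img src="/images/logo.png">'

    print(inline_data_uri_bytes(inline_page), "of", len(inline_page), "bytes")
    print(inline_data_uri_bytes(linked_page), "of", len(linked_page), "bytes")
```

A page that references its images by address reports zero inline bytes, while the inline version carries the whole encoded image inside the HTML.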

 

How can I improve my SEO rankings on search engines?

There are many different strategies employed by businesses to improve their findability online. Knowing how to do Google optimisation may seem daunting, but it need not be.

Begin by completing a free SEO health check to see how your website is travelling.

Then start trying to improve your understanding of SEO. Search our blog. It has plenty of informative SEO articles written by industry experts that can help you better understand SEO and develop strategies to improve your SEO rankings.

Articles such as:

SEO explained in 90 seconds

How to do an SEO Content Analysis of your website

Increase Prospects for Your Business with SEO

Google Search Console is a great way to understand how Google sees your website. Familiarise yourself with it, and you can measure your site’s search traffic and performance, fix issues and boost your site in terms of Google search results. Using a Google crawler test is also helpful in understanding how Google crawls and ranks your web pages.

But if you don’t have time and want to engage the experts to boost your SEO rankings, why not set up a free consultation with one of our SEO consultants?
