SEO Measures for Beginners 1: Checking Indexing

2024.05.21

contents

1 How to Check If You’re Indexed
- 1.1 Method 1: Checking with Google Search Console
  - 1.1.1 URL Inspection
  - 1.1.2 Page Indexing
- 1.2 Method 2: Checking with Google Search
2 Reasons and Benefits for Checking Indexing
3 Low Content Quality
4 Frequently Asked Questions When Checking Indexing
5 Summary


After building a website and publishing content, the next SEO measure to tackle is checking the indexing status, that is, whether your pages have been registered in the search engine’s database. Knowing the indexing status of your website lets you identify the challenges your site or pages face and reveals the next actions to take.

There are several checkpoints to keep in mind when checking indexing, and this article walks through them. If your site or pages are not indexed at all, improvements are necessary: identify the reasons behind the lack of indexing and resolve the issues.

This article explains the first SEO measure for beginners: checking indexing. If you are a complete beginner, please read “What is SEO? The Latest Guide to SEO and Complete SEO Checklist 100 for Beginners to Advanced” before proceeding with this page.

How to Check If You’re Indexed

There are primarily two methods to check if your site is indexed.

Method 1: Check with Google Search Console

Method 2: Check with a Google search

Method 1: Checking with Google Search Console

Google Search Console, a tool for managing websites, allows you to check indexing status. As a web analytics tool provided by Google, Google Search Console offers high reliability and functionality, making it valuable for detailed investigations into indexing status.

The main features for checking indexing are the following:

URL Inspection
Page Indexing

If you have not yet used Google Search Console, start by registering your website with Google Search Console.

Related Article: What is Google Search Console? An Explanation of How to Set Up and Use Google Search Console

URL Inspection

With URL inspection, you can check detailed indexing status on a page-by-page basis. For example, in addition to whether or not a page is indexed, you can also check items such as:

Whether the Googlebot has crawled the page
The crawled HTML information
Whether the page is mobile-friendly
Information on structured data, like breadcrumbs

Page Indexing

Page indexing allows you to check the overall indexing status of the site. Specifically, you can find out the following:

How many pages of the site are indexed
How many pages are not indexed
The reasons why some pages are not indexed

Method 2: Checking with Google Search

With Google search, anyone can easily check the indexing status. Enter the following into the Google search box, with no space after the colon, and a list of the pages indexed under that URL is displayed as the search results:

site:(enter the “site or page URL” here)

The pages displayed in the results are indexed in Google’s database.
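
As a minimal sketch, the same query can be assembled in Python. The helper name site_query_url and the domain example.com are illustrative assumptions, not part of any Google tool; the code only builds a search URL that you could open in a browser.

```python
from urllib.parse import urlencode


def site_query_url(site_or_page_url: str) -> str:
    """Build a Google search URL for the query `site:<url>`."""
    # The operator and the URL form a single term: no space after "site:".
    query = f"site:{site_or_page_url.strip()}"
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical example: list indexed pages under example.com/blog/
print(site_query_url("example.com/blog/"))
```

Passing a whole domain lists pages across the site, while passing a single page URL checks that page only.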

Checkpoints When Checking Indexing

When checking the indexing status, please note the following checkpoints:

Check the overall indexing status of the site
Check the indexing status of individual pages
Verify whether the pages have been crawled

Checking the Overall Index Status of the Site

Please check what percentage of the pages you’ve published are indexed. If the number of indexed pages is extremely low, it may indicate that your website is facing some issues.
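
The percentage mentioned above is simple arithmetic. The sketch below assumes you have already read the two page counts from a tool such as Google Search Console; the function name and the sample numbers are hypothetical.

```python
def indexed_ratio(indexed: int, published: int) -> float:
    """Return the share of published pages that are indexed, as a percentage."""
    if published <= 0:
        raise ValueError("published must be a positive page count")
    return 100.0 * indexed / published


# Hypothetical counts: 120 published pages, 30 of them indexed.
print(f"{indexed_ratio(30, 120):.1f}% of published pages are indexed")
```

What counts as “extremely low” depends on the site, so treat the result as a prompt for investigation rather than a fixed threshold.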

Checking the Index Status of Each Page

For each published page, check the index status in detail. Even if a page is indexed, it may still have issues. For example, if a page is indexed but not mobile-friendly, it needs to be updated.

Checking If the Page Has Been Crawled

If it’s determined that a page is not indexed, first check whether the published page has been crawled at all.

As part of the process for a page to be indexed by Google, a crawler developed by Google, known as Googlebot, first checks the page. Based on this check, Google’s system decides whether or not to index the page. If a page has not been crawled at all, it means that Google has not recognized the page.

Reasons and Benefits for Checking Indexing

Checking the indexing status offers several advantages, such as:

Being able to check for any issues with the site structure
Being able to check for any issues with the quality of page content
Being able to infer Google’s evaluation of the site or pages
Being able to understand the total number of indexed pages
Reasons for Not Being Indexed

Continually updating your website and providing more accurate information is crucial. This requires creating new articles and updating existing ones. Checking the indexing status can help answer questions like “What should I create?” and “What needs to be fixed?”.

Checking for Issues with the Site Structure

By checking indexing, you can determine if there are any issues with your site structure. For instance, if the number of indexed pages is low despite increasing published articles, it may indicate a problem with the site’s structure.

Here, “site structure” refers to what is known as the directory structure or the site hierarchy. An inappropriate site structure can make it difficult for Googlebot to understand the overall picture of the site. Ensure that the site structure is as simple and understandable as possible.

Checking the Quality of Page Content

Checking indexing allows you to assess if there are any issues with the quality of page content. Typically, Google deems low-quality articles as “content of little value for indexing.” Moreover, an increase in low-quality articles within a website can lead to a vicious cycle where the site’s overall rating drops, making it harder for new articles to be indexed.

Inferring Google’s Evaluation of the Site or Pages

If a page is indexed quickly after publication, it may indicate that Google rates the quality of the website or page highly. Conversely, if indexing is slow, it might mean that the site or page is considered a low priority for indexing.

Causes and Solutions for Not Being Indexed

The main reasons for a page not being indexed include:

The website is not recognized by Google.
The content quality is low.
Cannibalization is occurring.
The site uses technology that Google cannot read.
Instructions to block indexing are in place.
The domain has been penalized by Google.
A certain amount of time is necessary for indexing.

The website is not recognized by Google

A newly launched website is not yet recognized by Google. From Google’s perspective, it’s an unknown entity with neither positive nor negative evaluations. To make Google aware of your website, you can use the following methods:

Submit a sitemap
Submit a crawl request
Acquire backlinks

First, submit your sitemap to Google to make your site known. You can do this through Google Search Console.
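
If your site does not generate a sitemap for you, a minimal one can be produced with a few lines of Python. This is a sketch under stated assumptions: it follows the sitemaps.org XML format with only the required loc element, and the function name build_sitemap and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls: list[str]) -> str:
    """Return a minimal sitemap.xml document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs for illustration only.
print(build_sitemap(["https://example.com/", "https://example.com/blog/first-post/"]))
```

The resulting text would be saved as sitemap.xml on your server, and its URL is what you submit in Google Search Console.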

Additionally, you can use the URL Inspection tool within Google Search Console to request a crawl from Googlebot. Submitting a request when you create new articles can make Google aware of their existence. However, whether Google actually crawls them is at Google’s discretion; a request does not guarantee a crawl.

Related Article: Mastering the Search Console’s URL Inspection Tool! A Clear Explanation on How to Promote Index Registration for SEO

Another way to make Google aware of your website is to keep creating quality content. This increases the chances of acquiring backlinks, which gradually leads Google to recognize your site and makes it easier to be indexed.

Low Content Quality

Pages with low content quality tend to be indexed less frequently. Low-quality pages are those that are not useful to site visitors and typically have characteristics such as:

No demand for the content
Copied from articles on other sites that are already indexed
Overstuffed with topics, making it unclear what the message is
Poor grammar in the text, making it unclear what is being explained
Content is too thin for the theme
Over-designed, making the text hard to read

If quality content is not being indexed, it’s likely not created with Googlebot’s specifications in mind. To get your pages indexed, ensure the content is accessible for Googlebot just as it is for site visitors.

If you’re not experienced in content creation, consult Google Search Central (formerly Google Webmasters) for guidance as you proceed.

Cannibalization is occurring

Cannibalization within a website can make it harder to be indexed. The term comes from “cannibalism,” which means consuming one’s own kind. In SEO, it refers to a situation where multiple pages within a website target the same keyword and end up competing with each other for it.

When cannibalization occurs, Google may struggle to decide which article to index preferentially. As a result, a page that the author did not intend may be prioritized for indexing, leading to significant delays in indexing the intended pages.

If you realize that cannibalization is occurring and it’s a significant issue, consider the following solutions:

Merge duplicate elements between pages
Consolidate the pages themselves
Link the pages to each other with internal links
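
One rough way to spot candidates for cannibalization is to group your pages by the keyword each one targets. The sketch below assumes you keep your own mapping from page URL to target keyword; the function name, URLs, and keywords are all hypothetical.

```python
from collections import defaultdict


def find_cannibalization(target_keywords: dict[str, str]) -> dict[str, list[str]]:
    """Return keywords targeted by more than one page, with the competing pages."""
    pages_by_keyword = defaultdict(list)
    for page_url, keyword in target_keywords.items():
        # Normalize so "SEO Basics" and "seo basics" count as the same keyword.
        pages_by_keyword[keyword.strip().lower()].append(page_url)
    return {kw: urls for kw, urls in pages_by_keyword.items() if len(urls) > 1}


# Hypothetical mapping of pages to their target keywords.
pages = {
    "/blog/seo-basics/": "SEO basics",
    "/guide/seo-for-beginners/": "seo basics",
    "/blog/robots-txt/": "robots.txt",
}
print(find_cannibalization(pages))
```

Pages that share a keyword are only candidates; whether to merge, consolidate, or interlink them is still a judgment call.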

Using Technologies That Google Cannot Read

Google can index certain file formats but not others. Essentially, a website is composed of a combination of files and folders placed on a server. If the file formats used here are unreadable by Google, they will not be indexed. The file formats that can be indexed include the following:

Text (.txt, .text, and other file extensions). This includes the source code of common programming languages such as:
- Basic source code (.bas)
- C, C++ source code (.c, .cc, .cpp, .cxx, .h, .hpp)
- C# source code (.cs)
- Java source code (.java)
- Perl source code (.pl)
- Python source code (.py)

Citation: File Formats That Can Be Indexed by Google

For more details on the file formats that can be indexed, you can check on Google Search Central’s page about file formats that Google can index.

Instructions That Block Indexing

There are instructions that can block Google’s indexing, and if these instructions are set mistakenly, the page will not be indexed. There are mainly two types of instructions:

robots.txt
noindex

The former is a text file containing instructions that deny crawling, placed in the root directory of the website. The latter is a value set in an HTML meta tag that denies indexing. If either is set by mistake, the page will generally not be indexed, so be cautious.

Related Article: What is robots.txt? Explaining the Purpose and How to Write It

Domain Has Been Penalized by Google

A domain is like an address that indicates the location of a website. If there have been serious violations of Google’s policies in the past, the domain may be penalized.

Violations of the spam policies for Google web search, described in Google Search Central, are of particular concern. In some cases, no matter how high-quality your content is, it might not be indexed at all.

There are two patterns when a domain is penalized.

The domain has been penalized for violations committed by the company or individual in the past.
The previous owner of a second-hand domain that was acquired has been penalized for violations.

Typically, it’s rare for beginners in website operation to use second-hand domains. However, there are cases where a newly acquired domain turns out to be a second-hand domain without the buyer’s knowledge. If unsure, consider checking the domain’s history using tools like the Wayback Machine.

Related Article: What is the Wayback Machine?
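
A domain’s archive history can also be looked up from a script. The sketch below assumes the Internet Archive’s availability endpoint at archive.org/wayback/available and a JSON response containing an archived_snapshots.closest entry; treat both as assumptions to verify against the Internet Archive’s own documentation. The function names and example.com are placeholders.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def availability_url(domain: str) -> str:
    """Build the lookup URL for the (assumed) Wayback availability endpoint."""
    return "https://archive.org/wayback/available?" + urlencode({"url": domain})


def closest_snapshot(domain: str, timeout: float = 10.0) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if none is reported.

    This performs a network request, so it is not called at import time.
    """
    with urlopen(availability_url(domain), timeout=timeout) as response:
        payload = json.load(response)
    closest = payload.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None


print(availability_url("example.com"))
```

An archived snapshot on a domain you believed to be new suggests it had a previous owner, which is the case worth investigating.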

If it’s determined that the domain has been penalized by Google, consider changing the domain.

A Certain Amount of Time is Needed for Indexing

Indexing inherently takes a certain amount of time, and not only for new sites. While it may happen within a day, it can also take over a month in slower cases. If you’ve tried various measures and still aren’t indexed, consider waiting some time.

Handling Pages You Don’t Want to Be Indexed

There are pages on a website that should not be indexed, such as sitemaps or community pages. Additionally, there may be low-quality pages that you wish to keep.

When there are pages you don’t want indexed, the following methods can help avoid indexing:

Specify pages with noindex
Block crawling with robots.txt

However, note that neither method guarantees that indexing will be completely avoided.

Specifying Pages with noindex

noindex is a value set in an HTML meta tag that instructs search engines not to index a page. It is the usual choice when you want to keep a page out of the index.

To include noindex directly in the HTML, write the following between the <head></head> tags:

<meta name="robots" content="noindex">

Related Article: What is noindex? Explaining the Difference from nofollow and How to Utilize It
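
To confirm whether a page carries a noindex instruction, intentionally or by mistake, you can scan its HTML for a robots meta tag. This is a minimal sketch using Python’s standard html.parser; the class and function names are hypothetical, and it checks only meta tags, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # A content value may hold several comma-separated directives.
            self.directives.update(part.strip().lower() for part in content.split(","))


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(sample))
```

Running this over the pages you expect to be indexed is a quick way to catch a noindex that was left in place by accident.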

Blocking Crawling with robots.txt

robots.txt is a text file placed in the root directory on the server where the website is located, instructing Googlebot whether to allow access and crawling.

By using robots.txt, you can block crawling by Googlebot. However, be aware that this controls crawling, not indexing: a blocked page can still end up indexed if other pages link to it.

Related Article: What is robots.txt? Explaining the Purpose and How to Write It
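
Before publishing a robots.txt file, it can help to test which URLs its rules block. The sketch below uses Python’s standard urllib.robotparser on a hypothetical rule set; the paths and example.com are placeholders, and the standard-library parser may not match Googlebot’s handling of every pattern, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks one directory for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/private/draft.html"))
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/blog/post.html"))
```

A check like this is useful in both directions: it confirms that pages you want hidden are blocked, and that pages you want indexed are not blocked by mistake.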

Frequently Asked Questions When Checking Indexing

Here, we’ve compiled frequently asked questions and answers regarding checking indexing.

Q: What exactly is indexing?

A: Indexing refers to the registration of web pages or site information in Google’s database. Websites and pages that have been indexed appear in the search results of Google’s search engine. The search ranking at that time is determined by Google’s evaluation of the website or page.

>>https://www.switchitmaker2.com/wordpress/seo/about-index/

Q: Why is it important to be indexed?

A: Indexing is considered important because it directly relates to securing traffic to the site. Typically, the purpose of operating a website is to attract visitors. Google’s search engine is used by people worldwide. By having your website or pages indexed, you create an opportunity to attract this vast user base to your site.

Q: Is there a simple way to check if something is indexed?

A: You can easily check if a page is indexed by searching for the URL in the Google search engine like this:

site:(enter the “site or page URL”)

For example, to check the indexing status of Tokyo SEO Maker, you would search:

site:https://www.switchitmaker2.com/wordpress/

Q: How long does it take for a new page to be indexed?

A: There is no set time for a new page to be indexed. It could be indexed within a few hours or take several months.

Q: My indexing is slow; are there any tips to speed it up?

A: Pages that tend to be indexed more quickly include:

New pages from domains with high ratings
New pages considered to contain high-quality content

Therefore, one strategy to promote indexing is to continuously create high-quality articles and improve the domain’s evaluation.

Q: What does the number of indexes refer to?

A: The number of indexes refers to the count of pages within a site that have been indexed. However, having a high number of indexed pages does not necessarily translate to a higher site evaluation. What’s important for enhancing a site’s evaluation is the extent to which high-quality content is indexed.

>>https://www.switchitmaker2.com/wordpress/seo/about-index/

Q: What is index coverage?

A: Index coverage refers to a feature in Google Search Console that allows you to check the indexing status within your site. It was originally a term used in the Google Search Console menu, but as of 2023, the term “coverage” is not being used in Google Search Console.

>>https://www.switchitmaker2.com/wordpress/seo/index-coverage/

Summary

The indexing status determines whether a website or page can appear in Google’s search results at all. To gather traffic to your website efficiently, it’s essential to implement SEO measures and promote indexing. When you start operating a website, make it a habit to check the indexing status first. If you find your pages are not being indexed, identify the causes and fix them.

Author Profile

International Web Consultant Paveena Suphawet

A trilingual professional in English, Thai, and Japanese, she has numerous achievements in international SEO. She studied the latest IT technologies at Assumption International University, Thailand, and majored in International Business at the University of Greenwich, UK. Following her tenure at ExxonMobil’s Thai branch, she became a key member of Admano from its establishment.
