
Duplicate web pages on a site can undo the effort put into search engine optimization. They often cause a sharp drop in search rankings and in the number of visitors, which in turn reduces the earnings of the owner of an online store, blog, or online portal.
Another consequence of duplicates is that pages drop out of Google's index. In this case, visitors may never see the categories, product cards, or useful articles you were counting on most.
In short, duplicates should not be allowed to appear on a site. In a previous article on the Cityhost blog, we talked about how to find duplicates; today you will learn how to eliminate duplicate pages and prevent them from appearing in the future.
How to start eliminating duplicate pages on the site
The first thing to do before removing duplicate web pages is to find the cause of this problem.
One of the most common reasons is CMS issues. Popular engines are designed to simplify creating a website right after renting hosting and registering a domain. Simply put, even non-technical users can create a blog or online store on WordPress, Joomla, or OpenCart in 15-30 minutes. However, out-of-the-box solutions are often imperfect and can cause a variety of problems, including duplicate content.
Another reason is incorrect filter and search settings. If the categories of an online store have filters, web spiders may index the pages with filtered results. Their content can be identical even when different filter groups are selected, which generates many duplicates. The same applies to internal site search: the generated result pages can be the same for similar queries. In this case, the duplicate pages should be removed as soon as possible.
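To see how filter parameters multiply addresses for what may be the same content, here is a minimal Python sketch. It assumes a hypothetical set of parameter names (color, size, sort) that only filter or reorder a category; your store will use its own names. The function names are also made up for illustration. The result is a list of candidates to check by hand, not proof that the pages are identical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter parameter names; replace them with the ones your store uses.
FILTER_PARAMS = {"color", "size", "sort"}


def strip_filter_params(url):
    """Return the URL without the filter parameters listed above."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTER_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_by_base(urls):
    """Group URLs that collapse to the same address once filters are removed."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_filter_params(url)].append(url)
    # Keep only the addresses that were reached through more than one URL.
    return {base: found for base, found in groups.items() if len(found) > 1}
```

Feeding it the URLs from a crawl or from server logs gives a short list of addresses where filtered pages are worth comparing with the base category.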
Last but not least, there are errors and shortcomings in the site structure. Imagine this situation: a webmaster who maintains an online store for women's shoes decides to create separate categories for products of the same size but does not take into account that Model No. 1, Model No. 2, and Model No. 3 each come in three sizes: 36, 37, and 38. As a result, each of these models falls into the categories "Shoes of size 36", "Shoes of size 37", and "Shoes of size 38", and the site ends up with three sections that are identical in content and differ only in titles and meta tags.
How to eliminate duplicate pages on the site
Once the cause of the duplicates has been determined, it's time to eliminate them. There are six ways to do this.
Specify the canonical (original) page. To do this, put the following code in the <head>…</head> section of each duplicate:
<link rel="canonical" href="https://example.com/canonical-page" />
Replace the specified URL with the URL of the original (canonical) page.
Note: this does not remove the duplicates from the site. It tells web spiders that the page specified in the code is the one that should be indexed and that its copies should be ignored.
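After adding the tag, it is worth confirming that each duplicate really points at the original. The following is a small sketch using only the Python standard library; the class and function names are hypothetical, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

An empty list means the tag is missing, and more than one entry means the page declares conflicting canonical URLs; both cases deserve a closer look.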
Remove duplicates manually. If a blog or online store has no more than 200 pages, it is realistic to look through each of them and determine whether there are duplicates.
Block crawling of duplicates in robots.txt. To do this, use the Disallow directive. For example, if the duplicate is located at https://example.com/pagecopy/, add the following rules to robots.txt to hide it:
User-agent: *
Disallow: /pagecopy
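Before uploading the file, you can test the rule locally. The sketch below uses Python's standard urllib.robotparser module. The helper name is_blocked is made up for this example, and the standard-library parser only approximates how a particular search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pagecopy
"""


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if the given rules forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/pagecopy/",
        "https://example.com/pagecopy-old",
        "https://example.com/originalpage",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

Disallow matches by path prefix, so this rule also blocks an address such as /pagecopy-old. If only the one directory should be hidden, a trailing slash in the rule narrows the match.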
Perform a 301 redirect from the duplicate to the original page. To do this, add the following line to the .htaccess file located in the root directory of the site:
Redirect 301 /pagecopy https://example.com/originalpage
In this example, /pagecopy is the relative address of the duplicate, and https://example.com/originalpage is the full address of the canonical page.
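When there are many duplicates, typing these lines by hand invites mistakes. As a rough sketch, the hypothetical Python helper below turns a mapping of duplicate paths to original URLs into lines of the same form. It follows the convention of the example above (relative path for the duplicate, full address for the original) and rejects anything else.

```python
def build_redirects(mapping):
    """Return one 'Redirect 301' line per duplicate path -> original URL pair."""
    lines = []
    for duplicate_path, original_url in sorted(mapping.items()):
        if not duplicate_path.startswith("/"):
            raise ValueError(f"duplicate path must start with '/': {duplicate_path!r}")
        if not original_url.startswith(("http://", "https://")):
            raise ValueError(f"original must be a full URL: {original_url!r}")
        lines.append(f"Redirect 301 {duplicate_path} {original_url}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical pairs; replace them with the duplicates found on your site.
    print(build_redirects({
        "/pagecopy": "https://example.com/originalpage",
        "/pagecopy2": "https://example.com/originalpage",
    }))
```

The printed lines can be pasted into .htaccess after a quick review.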
Block duplicates from indexing with the noindex meta tag. Like other meta tags, it must be added to the <head>…</head> section of each duplicate page:
<meta name="robots" content="noindex">
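To confirm that a duplicate actually carries the tag, a short check like the one below can help. It is a sketch built on Python's standard html.parser, with hypothetical names, and it assumes the page's HTML is already available as a string. It also handles a content attribute that lists several values, such as "noindex, nofollow".

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Split comma-separated values and normalize case and whitespace.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the page asks robots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```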
Remove duplicate pages by returning a 410 status code. Strictly speaking, 410 is not a redirect: the 410 (Gone) response tells web spiders that the page no longer exists and that no alternative is offered. To set it up, write the following in the .htaccess file:
Redirect 410 /pagecopy
In this example: /pagecopy is the relative address of the duplicate.
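The difference between the two .htaccess approaches is easiest to see in the responses themselves. The following toy WSGI application is only an illustration, not how Apache processes .htaccess, and the paths in it are hypothetical. It answers 410 for a removed duplicate and 301 for a duplicate that has a replacement.

```python
# Hypothetical paths for illustration only.
GONE = {"/pagecopy"}  # duplicates removed for good
MOVED = {"/oldcopy": "https://example.com/originalpage"}  # duplicates with a replacement


def app(environ, start_response):
    """Minimal WSGI app that answers 410, 301, or 200 depending on the path."""
    path = environ.get("PATH_INFO", "/")
    if path in GONE:
        # 410: the page is gone and no new address is given.
        start_response("410 Gone", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"This page has been removed.\n"]
    if path in MOVED:
        # 301: the Location header points to the original page.
        start_response("301 Moved Permanently", [("Location", MOVED[path])])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Regular page.\n"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve on localhost:8000 to try the paths in a browser or with curl.
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

The 410 response carries no Location header, so a crawler has nowhere to go next, while the 301 response sends it to the original page.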
Now you know how to find duplicate pages on a site and how to remove them, which will help you take your site's technical optimization to a new level, attract more visitors, and make search engines fall in love with your online resource.