CityHost.UA

How to remove duplicate website pages

13.06.2019

Duplicate pages on a site can undo the effort put into search engine optimization. They often cause a sharp drop in search rankings and in the number of visitors, which in turn reduces the income of the owner of an online store, blog, or online portal.

Another consequence of duplicates is that pages drop out of the Google and Yandex indexes. In that case, visitors may never see the categories, product cards, or useful articles you were counting on most.

As you can see, duplicates are a problem that should not be allowed to appear on a site. In a previous article on the CityHost blog, we talked about how to find duplicates; today you will learn how to eliminate duplicate pages and prevent them from appearing in the future.

How to start eliminating duplicate pages on the site

The first thing to do before removing duplicate web pages is to find the cause of the problem. Most often, it is one of the following:

Shortcomings of the CMS. Popular engines are designed to simplify creating a website right after renting hosting and registering a domain. Simply put, even non-technical users can create a blog or online store on WordPress, Joomla! or OpenCart in 15-30 minutes. However, the "out of the box" settings are often imperfect and can cause a variety of problems on the site, including duplicates.

Incorrect filter and search settings. If the categories of an online store have filters, web spiders may index the pages with filtered results. Their content can be identical even when different filter groups are selected, which generates many duplicates. The same applies to internal site search: the generated result pages can be the same for similar queries. In this case, the duplicate pages should be removed as soon as possible.

Errors and deficiencies in the structure. Imagine the situation: a webmaster who maintains an online store for women's shoes decides to create separate categories for groups of products with the same size, but does not take into account that Model No. 1, Model No. 2 and Model No. 3 each come in three sizes: 36, 37 and 38. As a result, each of these models falls into the categories "Shoes of size 36", "Shoes of size 37" and "Shoes of size 38", and three sections with identical content appear on the site, differing only in titles and meta tags.
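The shoe-size situation can be detected mechanically. Below is a minimal sketch, assuming the catalog can be exported as a mapping from category name to the set of models it lists. The `categories` data, the "Sandals" entry, and the function name `find_identical_categories` are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical catalog export: category name -> models listed in it.
categories = {
    "Shoes of size 36": {"Model 1", "Model 2", "Model 3"},
    "Shoes of size 37": {"Model 1", "Model 2", "Model 3"},
    "Shoes of size 38": {"Model 1", "Model 2", "Model 3"},
    "Sandals": {"Model 4"},
}


def find_identical_categories(catalog):
    """Return groups of category names whose product sets are identical."""
    by_content = defaultdict(list)
    for name, models in catalog.items():
        # frozenset is hashable, so equal product sets land in the same bucket.
        by_content[frozenset(models)].append(name)
    return [sorted(names) for names in by_content.values() if len(names) > 1]


duplicate_groups = find_identical_categories(categories)
print(duplicate_groups)
```

Each group this returns is a candidate for merging into one section or for picking one canonical page.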

How to eliminate duplicate pages on the site

Once the cause of the duplicates is determined, it's time to eliminate the problem. There are six ways to do this.

Specify the canonical (original) page. To do this, put the following code in the <head>…</head> section of each duplicate:

<link rel="canonical" href="https://example.com/canonical-page" />

Replace the specified URL with the URL of the original (canonical) page.

Note: this does not remove duplicates from the site. It tells web spiders that only the page specified in the code should be indexed and that its copies should be ignored.
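When duplicates come from filter or sort parameters, the canonical tag can be generated rather than typed by hand. The sketch below assumes that the original page is the same URL without its query string and fragment. That holds only if the query carries nothing but filters or sorting, so treat it as an assumption to check against your own site. The function name `canonical_tag` and the sample URL are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(url):
    """Build a canonical <link> tag pointing at `url` without query or fragment."""
    parts = urlsplit(url)
    # Assumption: everything after "?" or "#" only filters or sorts the page.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="{}" />'.format(escape(clean, quote=True))


tag = canonical_tag("https://example.com/shoes?size=36&sort=price")
print(tag)
```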

Remove duplicates manually. If a blog or online store has no more than 200 pages, it is realistic to look at each of them, determine whether it is a duplicate, and delete the copies.
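Even on a small site, a short script can narrow down which pages to look at. Here is a minimal sketch, assuming the page texts have already been downloaded into a dictionary. The `pages` sample and the function name `group_duplicates` are hypothetical. It catches only pages whose text is identical after whitespace and letter case are normalized, so near-duplicates still need a manual check.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose text is identical after whitespace and case normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical sample: URL -> page text that was already downloaded.
pages = {
    "https://example.com/originalpage": "Red shoes, size 36",
    "https://example.com/pagecopy/": "Red  shoes,\nsize 36",
    "https://example.com/about": "About the store",
}
duplicates = group_duplicates(pages)
print(duplicates)
```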

Block crawling of duplicates in robots.txt. To do this, use the Disallow directive. For example, if the duplicate is located at https://example.com/pagecopy/, add the following rules to robots.txt to hide it from crawlers:

User-agent: *
Disallow: /pagecopy
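Before uploading the file, the rule can be checked locally. The sketch below uses Python's standard `urllib.robotparser` to test the two lines above against a few URLs. The `/pagecopy-archive` path is a hypothetical example, added to show that Disallow matches by prefix.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt example.
rules = [
    "User-agent: *",
    "Disallow: /pagecopy",
]

parser = RobotFileParser()
parser.parse(rules)

copy_allowed = parser.can_fetch("*", "https://example.com/pagecopy/")
original_allowed = parser.can_fetch("*", "https://example.com/originalpage")
# Prefix matching: a hypothetical /pagecopy-archive path is covered by the rule too.
prefix_allowed = parser.can_fetch("*", "https://example.com/pagecopy-archive")

print(copy_allowed, original_allowed, prefix_allowed)
```

If the last result is not what you want, ending the rule with a slash (Disallow: /pagecopy/) limits it to that directory.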

Perform a 301 redirect from the duplicate to the original page. To do this, add the following line to the .htaccess file located in the root directory of the site:

Redirect 301 /pagecopy https://example.com/originalpage

In this example, /pagecopy is the relative address of the duplicate (a placeholder), and https://example.com/originalpage is the full address of the canonical page.
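If there are many duplicates, the Redirect lines can be generated from a list of pairs. The following is a minimal sketch, assuming you keep a mapping from each duplicate path to its canonical URL. The function name `redirect_lines` and the second pair in `mapping` are hypothetical.

```python
def redirect_lines(mapping):
    """Turn {duplicate path: canonical URL} into Redirect 301 directives."""
    lines = []
    for duplicate_path, canonical_url in sorted(mapping.items()):
        # The first argument of Redirect is a URL path and must start with "/".
        if not duplicate_path.startswith("/"):
            raise ValueError("duplicate path must start with '/': " + duplicate_path)
        lines.append("Redirect 301 {} {}".format(duplicate_path, canonical_url))
    return lines


# Hypothetical mapping; only the first pair comes from the example above.
mapping = {
    "/pagecopy": "https://example.com/originalpage",
    "/old-category": "https://example.com/category",
}
htaccess_lines = redirect_lines(mapping)
print("\n".join(htaccess_lines))
```

The printed lines can be pasted into .htaccess after a review.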

Block the duplicate from indexing with the noindex meta tag. Like other meta tags, it must be added to the <head>…</head> section of the duplicate page:

<meta name="robots" content="noindex">
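To confirm that the tag actually ended up in a page, the HTML can be parsed with the standard library. This is a minimal sketch, assuming the page source is already available as a string. The class `RobotsMetaParser`, the function `has_noindex`, and both sample documents are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # The content attribute may hold several comma-separated directives.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


duplicate_html = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
original_html = "<html><head><title>Original</title></head><body></body></html>"
print(has_noindex(duplicate_html), has_noindex(original_html))
```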

Remove duplicate pages by returning a 410 (Gone) status. Strictly speaking, 410 is not a redirect: this response tells web spiders that the page no longer exists and that no alternative is offered. To return a 410 status, write the following in the .htaccess file:

Redirect 410 /pagecopy

In this example, /pagecopy is the relative address of the duplicate.
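To see how this differs from the 301 option, note that 410 is a "Gone" status rather than a redirect, so no target URL is given. The sketch below uses Python's standard `http.HTTPStatus` to print the official reason phrase for both codes. The helper name `describe` is hypothetical.

```python
from http import HTTPStatus


def describe(code):
    """Describe a status code and say whether it is a redirect (3xx)."""
    status = HTTPStatus(code)
    kind = "redirect" if 300 <= status.value < 400 else "not a redirect"
    return "{} {} ({})".format(status.value, status.phrase, kind)


for code in (301, 410):
    print(describe(code))
```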

Now you know how to remove duplicate pages, so you can raise the technical optimization of your site to a new level, attract more visitors, and make search robots fall in love with your online resource.

Was the publication informative? Then share it on social networks and join our Telegram channel. We remind you that you can buy Ukrainian hosting from the hosting company CityHost. For technical questions, contact the online chat or call 0 800 219 220.
