
A sitemap is an XML (eXtensible Markup Language) file that lists the URLs of a website's pages. It helps search engine crawlers discover and index those pages faster. By convention, the XML sitemap is placed in the root directory of the web project and named sitemap.xml.
In addition to URLs, a sitemap can specify the date each page was last modified (the lastmod tag). When crawlers read sitemap.xml, they use this value to decide whether a specific page needs to be revisited and re-indexed.
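To make the role of the last-modification date more concrete, here is a minimal Python sketch of how a script could read a sitemap and pick out the pages changed since a previous visit. The sample sitemap, the function name urls_modified_since, and the comparison rule are illustrative assumptions; real search engine crawlers use their own, undisclosed logic. The sketch also assumes every lastmod value starts with a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# A tiny sample sitemap (hypothetical pages and dates).
SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-05-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/pageslug/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
"""


def urls_modified_since(sitemap_xml: str, last_visit: date) -> list[str]:
    """Return URLs whose lastmod is later than last_visit (or is missing)."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    changed = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None:
            continue  # skip malformed entries without a URL
        # Assumption: lastmod begins with YYYY-MM-DD; any time part is ignored.
        if lastmod is None or date.fromisoformat(lastmod.strip()[:10]) > last_visit:
            changed.append(loc.strip())
    return changed


print(urls_modified_since(SITEMAP, date(2024, 3, 1)))
# ['https://example.com/']
```

Pages without a lastmod value are treated as changed here, which is a conservative choice for this sketch rather than documented crawler behavior.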
The frequency with which a crawler visits web pages is influenced not only by the date of the last modification specified in the sitemap, but also by the loading speed. It has been observed that slow blogs and online stores are crawled less frequently than fast ones. When ordering even the cheapest hosting from Cityhost, you don't have to worry that your online resource will start to "lag" with the arrival of the first visitors, because we use high-speed 100% NVMe drives for instant page loading.
What an XML sitemap should look like for good site indexing
To ensure that crawlers quickly read the sitemap and correctly interpret the information specified in it, follow these rules when creating sitemap.xml:
- it must be saved in UTF-8 encoding;
- the server must return a 200 OK response when accessing the sitemap;
- the maximum allowed size of a single sitemap file is 50 MB uncompressed;
- a single sitemap file can list at most 50,000 URLs;
- if the file would exceed 50 MB or 50,000 URLs, split it into several XML files and list them in sitemap.xml, which then acts as a sitemap index;
- the purpose of the sitemap is to help search engines quickly perform indexing, so every URL in the XML file must be absolute and include the protocol, for example https://example.com/pageslug/;
- it is recommended to remove links from the sitemap that are blocked in the robots.txt file.
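The splitting rule from the list above can be sketched in Python. The function build_sitemaps, the file names sitemap-1.xml, sitemap-2.xml and so on, and the base_url parameter are hypothetical choices made for this illustration. The sketch splits only by URL count; a production script would also need to check the size of each generated file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit


def build_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return {filename: xml_bytes}: child sitemaps plus an index in sitemap.xml."""
    base_url = base_url.rstrip("/")
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)

    for n, start in enumerate(range(0, len(urls), max_urls), start=1):
        name = f"sitemap-{n}.xml"  # hypothetical naming scheme
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for page_url in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page_url
        # encoding="utf-8" makes tostring return UTF-8 bytes with a declaration.
        files[name] = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
        # Register the child file in the index.
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{name}"

    files["sitemap.xml"] = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    return files


# Usage with a deliberately small limit so the split is visible.
pages = [f"https://example.com/p{i}/" for i in range(5)]
result = build_sitemaps(pages, "https://example.com", max_urls=2)
print(sorted(result))
# ['sitemap-1.xml', 'sitemap-2.xml', 'sitemap-3.xml', 'sitemap.xml']
```

Each value in the returned dictionary can be written to disk as-is, since the bytes are already UTF-8 encoded.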
A properly created sitemap.xml significantly increases the chances that search engines will quickly find and process the pages. That is why it is recommended to update the file after significant changes to the site and to check its validity in Google Search Console (GSC). To do this, go to GSC → Sitemaps and check the status: it should show a green Success label. Then click on the sitemap and check the status of each child sitemap it contains (category-sitemap, page-sitemap, post-sitemap, and others).
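Before submitting the file, some of the rules listed earlier can also be checked locally. The following Python sketch is one possible pre-check, not a replacement for the report in Google Search Console. The function name check_sitemap and the wording of the messages are assumptions; the size limit is a parameter whose default is set to the 50 MB uncompressed limit from the sitemaps.org protocol. The server response code is not checked, because that requires a network request.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def check_sitemap(data: bytes,
                  max_bytes: int = 50 * 1024 * 1024,
                  max_urls: int = 50_000) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    if len(data) > max_bytes:
        problems.append("file is larger than the size limit")

    # The file must be saved in UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
        return problems

    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        problems.append(f"XML is not well-formed: {exc}")
        return problems

    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]
    if len(locs) > max_urls:
        problems.append("file lists more URLs than allowed")

    # Every URL must be absolute and include the protocol.
    for loc in locs:
        parts = urlsplit(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {loc}")

    return problems
```

A call such as check_sitemap(open("sitemap.xml", "rb").read()) returns an empty list when none of these checks fail.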
How to properly configure sitemap.xml
There are three main ways to create a sitemap: manually, using specialized online services, and using the built-in capabilities of the engine (CMS) the site runs on.
Creating sitemap.xml manually is a meticulous task that requires focus and a lot of time. This method assumes knowledge of sitemap.xml syntax and the ability to work with XML files. We recommend creating the sitemap manually only if you have a technical background and the number of pages on the web resource does not exceed 100.
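For a small site, the required syntax is compact enough to produce with a short script instead of typing it by hand. The sketch below is one way to do it in Python; the PAGES list, its dates, and the function name render_sitemap are made up for this example. It relies on ET.indent, which is available in Python 3.9 and later.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical pages with their last-modification dates.
PAGES = [
    ("https://example.com/", date(2024, 5, 1)),
    ("https://example.com/pageslug/", date(2024, 1, 15)),
]


def render_sitemap(pages) -> str:
    """Build the text of a sitemap.xml file from (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = modified.isoformat()  # YYYY-MM-DD
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


print(render_sitemap(PAGES))
```

When saving the result, open the file with encoding="utf-8" so that the bytes on disk match the encoding named in the XML declaration.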
Online services help speed up the creation of the sitemap. They all operate on a similar principle: you need to enter the homepage URL, specify additional parameters, and wait for the file generation to finish. These services have limits on the number of URLs added to the XML file, which can be lifted for an additional fee.
Popular services for creating sitemap.xml:
- xml-sitemaps.com — free for up to 500 URLs;
- freesitemapgenerator.com — free sitemap generation for up to 5000 URLs;
- xmlsitemapgenerator.org — the basic plan offers generation of up to two sitemaps with a maximum of 25 URLs each.
The third method of creating an XML sitemap is to use the built-in features of the engine your site runs on. Note that some CMSs do not generate an XML file with page URLs automatically "out of the box". This problem is easily solved by installing plugins and extensions developed for this purpose. For example, if you create or manage websites on WordPress, install the Google XML Sitemaps, Rank Math, or Yoast SEO plugin; each of them can generate an XML sitemap automatically once activated.
A properly configured sitemap.xml is one of the factors that most strongly influence how quickly web crawlers index a site, so we recommend approaching its creation with care. Now that you know what a sitemap is and how to create one properly, the process should not take much of your time.