Automatic Sitemap Improvements

Before I start: I know you can manually create the sitemap, hence the request to improve automatic sitemaps :)

The problem:

We have a lot of pages "in progress" and a lot of designers working on our project, so at any given time there are many pages not fit for public consumption.

There are also certain types of pages you don't want indexed, e.g. paid media pages and conversion pages that need to be kept out of public view and out of search indexes.

Current workflow:

The current solution is to password protect the draft pages, but they are still included in our sitemap, which presents a few problems.

  1. The sitemap makes our production roadmap public to whoever takes the time to check it out (competitors). We also can't use funky naming to mask our activities because designers will get confused and it'll just make for a crazy workflow.
  2. We can't exclude those paid media landing pages or conversion event pages.
  3. If we exclude these pages from crawling in the robots.txt (which we do), we are sending a conflicting message to search engines, and we get the message "Sitemap contains urls which are blocked by robots.txt." in Search Console.
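To make the conflict in point 3 concrete, here is a minimal sketch that lists the sitemap URLs a robots.txt file blocks. It uses only the Python standard library; the function name `blocked_sitemap_urls`, the example.com URLs, and the `/drafts/` folder are made up for illustration and are not anything Webflow provides.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical example data: one public page and one draft page.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/drafts/new-pricing</loc></url>
</urlset>"""

EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /drafts/\n"

if __name__ == "__main__":
    for url in blocked_sitemap_urls(EXAMPLE_SITEMAP, EXAMPLE_ROBOTS):
        print("In sitemap but blocked by robots.txt:", url)
```

Any URL this reports is one that the sitemap advertises while robots.txt forbids crawling it, which is the mismatch Search Console complains about.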

Current solution:

The current solution is to manually create the sitemap, but who really wants to do that? Meh.

Dream solution - automagic sitemap:

The request here is to make the automatic sitemap smarter.

Options:

  • Page level
    • Set a page to draft so that it is not included in the sitemap (in page settings)
    • Set a page to be excluded from the sitemap (in page settings)
  • Folder level
    • Set a folder to draft so that its pages are not included in the sitemap (in page settings)
    • Set a folder to be excluded from the sitemap (in page settings)
  • Site-wide option to include/exclude password-protected pages in sitemap/robots
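As a rough sketch of how these options could combine, the snippet below filters a page list by page-level flags, folder-level exclusions, and the site-wide password setting before writing the sitemap. The `Page` fields, the function names, and the example paths are all hypothetical; this is not how Webflow models pages internally.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Page:
    # All fields are hypothetical stand-ins for the requested page settings.
    path: str
    draft: bool = False
    exclude_from_sitemap: bool = False
    password_protected: bool = False


def in_sitemap(page, excluded_folders=(), include_password_protected=False):
    """Apply the page-level, folder-level, and site-wide options in turn."""
    if page.draft or page.exclude_from_sitemap:
        return False
    if page.password_protected and not include_password_protected:
        return False
    for folder in excluded_folders:
        prefix = folder.rstrip("/")
        # Exclude the folder itself and everything underneath it.
        if page.path == prefix or page.path.startswith(prefix + "/"):
            return False
    return True


def build_sitemap(base_url, pages, excluded_folders=(), include_password_protected=False):
    """Return sitemap XML containing only the pages that pass in_sitemap."""
    base = base_url.rstrip("/")
    entries = [
        "  <url><loc>" + escape(base + page.path) + "</loc></url>"
        for page in pages
        if in_sitemap(page, excluded_folders, include_password_protected)
    ]
    return "\n".join([
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        *entries,
        "</urlset>",
    ])


if __name__ == "__main__":
    pages = [
        Page("/"),
        Page("/pricing-v2", draft=True),
        Page("/lp/summer-ads"),
        Page("/thanks", exclude_from_sitemap=True),
        Page("/members", password_protected=True),
    ]
    print(build_sitemap("https://example.com", pages, excluded_folders=["/lp"]))
```

With the example data only the home page passes every check, so drafts, the ad landing page folder, the conversion page, and the password-protected page all stay out of the sitemap.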

There are possibly some further improvements. One that comes to mind is auto-generating the robots.txt file from the resulting options, but I'll leave that to your genius team.
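For what it's worth, the robots.txt side could be as simple as the sketch below, which turns the same exclusion choices into Disallow rules. The function name `build_robots_txt` and the example paths are assumptions for illustration only.

```python
def build_robots_txt(excluded_pages, excluded_folders, sitemap_url):
    """Build robots.txt text that disallows the excluded folders and pages."""
    lines = ["User-agent: *"]
    for folder in sorted(excluded_folders):
        # A trailing slash limits the rule to the folder's contents.
        lines.append("Disallow: " + folder.rstrip("/") + "/")
    for page in sorted(excluded_pages):
        # Disallow rules are prefix matches, so "/thanks" also
        # covers paths such as "/thanks-again".
        lines.append("Disallow: " + page)
    lines.append("")
    lines.append("Sitemap: " + sitemap_url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(
        excluded_pages=["/thanks"],
        excluded_folders=["/lp"],
        sitemap_url="https://example.com/sitemap.xml",
    ))
```

Generating both files from one set of options would keep them consistent, so a page hidden from the sitemap is never the one that robots.txt also has to explain away.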

  • Nick Soper
  • May 17 2017
  • Kasper Dam commented
    June 19, 2023 18:45

    This seems like kindergarten functionality, that was just seemingly forgotten? :(


    But also seems to not be a priority with only 250 votes, and no activity from Webflow since 2017. :(

  • Andrew Chisholm commented
    September 11, 2022 13:16

    I really wish this feature request was acted upon. The automatic sitemap feature in Webflow has ruined my SEO by showing Google tag indexes, user quote indexes, etc. I'd love to be able to remove sections of my site from the automatic sitemap...

  • Collin Belt commented
    October 26, 2021 12:55

    Agreed, this would be incredibly helpful. In other platforms (such as Squarespace) there is a toggle to no-index the page and hide it from the automatically generated sitemap. Having a similar option would save us a lot of time and be much cleaner from an SEO perspective.

  • Raghu Kashyap commented
    August 13, 2021 10:08

    I think the ability to define which subdomain to use as the default is very important.

    The issue is reported here: https://discourse.webflow.com/t/sitemaps-invalid-w-multiple-custom-domains/47697

    We use a reverse proxy, so we cannot set any domain as the default in Webflow, and this messes up the sitemaps that get generated.

  • Tiphaine Bruel commented
    July 23, 2020 15:28

    +1

  • Kjetil Grøsland commented
    July 08, 2020 23:42

    I'm a bit surprised to encounter this one as Webflow is so rich already on functionality.
    I wish for a simple toggle in Page settings, allowing us to hide a page from the sitemap and to set no-index.

    Same issue here. I'm getting an error from Search Console because I added no-index to a page but it's still in sitemap.xml, which confuses Google.

  • Chris Erickson commented
    June 16, 2020 17:10

    This would be such an SEO win. No need to waste crawl budget on pages we have no index tag on. Also might be some pages we want to hide from public view (like landing pages for ads, etc). Exposing all pages in a sitemap makes all of this more difficult than it needs to be.

  • Robert commented
    May 20, 2020 11:39

    It would be great to exclude folders from the sitemap :)

  • Justin commented
    May 15, 2020 14:07

    We need the ability on each page, cms template page, and cms child pages to check a checkbox that adds a no index meta tag. This would also remove that page from the auto sitemap since Google Search console marks a page as an error if you add no index code but it shows on the sitemap. Also, the auto sitemap needs to include all pages including cms pages. All of this is important to technical SEO.

  • Aditya Lakhe commented
    January 10, 2020 11:07

    +1

  • Juan Manuel Garrido commented
    September 30, 2019 20:30

    +1

  • Axel Sturmann commented
    August 16, 2019 13:32

    This would be very useful!
    I'd also like a page-level option (preferably with check boxes) to automatically add the following to each page:
    Noindex, Nofollow, Noarchive, Nosnippet
    Many thanks,

  • Austin Hellman commented
    July 19, 2019 12:43

    This is an essential feature for SEO. I am getting Google Search Console errors for using my robots.txt file or using a noindex tag with pages that are submitted in the automatically generated sitemap. There seems to be no way to get around this unless I upload the sitemap manually which is not user-friendly or realistic.

  • Brandon Urich commented
    February 16, 2019 22:23

    Yes! I'm adding No Index, No Follow in the head on certain pages, but then you get a GSC error because the pages still show up in the submitted site map. A control to allow "Exclude from Sitemap" is highly needed in order to adhere to best practices. This is very much needed for collection page templates since in some cases the individual collection item page doesn't ever need to be visited.

  • Christoffer Furnes commented
    February 12, 2018 12:28

    Sometimes I make collections specifically to be used for multi-reference filtering in other collections. The problem is that such a collection also gets a public collection URL that will be indexed by Google if the Auto-generate Sitemap function in Webflow is used.

    Leaving the collection I do not want listed out of a manual sitemap is a solution, but then the sitemap has to be updated every time the site structure changes in the future.

    What if there could be an option in the collection settings to exclude it from the automatic site map list?

    Not a big problem, but a function for excluding a collection/page from the auto generated sitemap would ensure an always updated sitemap, but exclude the things you do not want to show.

  • Diarmuid Sexton commented
    February 09, 2018 09:25

    This feature would be great and is much needed! Clients are seeing lots of errors in Search Console due to <meta name="robots" content="noindex"> that I've embedded on collection pages which are just for reference. If these pages were hidden from the sitemap, there would be no errors.

  • Evan McDaniel commented
    February 05, 2018 16:36

    I'd like to see the ability to indicate which domain should be used for the sitemap file. We use 4 domains per site on average (internal version, client facing non-public version, www live version, non-www live version that redirects to www), and it's a mystery as to which gets used in the sitemap, and there is currently no way to change it.

    This makes the auto-generation of the sitemap a no-go for us. Manual updating is tedious, so a solution for this would be a big improvement for us.

     

    Thanks!

  • Cameron Roe commented
    August 09, 2017 03:04

    I also think it would be amazing to have a sitemap diagramming tool built into Webflow. For example, you could go to a separate section under the Assets tab and it would show a visual sitemap as a diagram of all the pages on the site. The view would overlap the designer section and you could drag your page structure around in a more sophisticated manner. What would be nice is to see all the routes (with dynamic parameters from CMS) and update them by simply re-ordering or renaming the route structure.

    For example:

    => /

      => /about

      => /contact

      => /blog

      => /blog/:published_on_year/:published_on_month/:published_on_day/:post_title

      => /events/:event_title

      => /projects/:category/:project_title

  • Noelle Greenwood commented
    July 31, 2017 05:21

    Yes! This! Webflow masters, please can you advise when this might be considered? Sending conflicting messages to Google is not great for SEO... :(

  • +150