Pressable sites and robots.txt

Last modified: September 11, 2025

The robots.txt file gives you control over how automated bots and web crawlers access your site. Except on staging sites (which use a .mystagingwebsite.com address), most bots are allowed to crawl your site by default.

Adding rules to robots.txt allows you to limit or block specific bots from accessing all or part of your site.
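
For example, the following rules (a generic illustration using a hypothetical crawler name, not a Pressable default) block a bot called ExampleBot from the entire site while leaving it open to all other bots:

User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: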

Default Staging Site robots.txt

By default, staging sites created on Pressable are “hidden” from search engines: their robots.txt file prevents indexing. This is generally a good thing, as you would not want clones of your live site appearing in search results.

User-agent: *
Disallow: /

Default Live Site robots.txt

For a site created as (or converted to) a live site, the default robots.txt file allows indexing:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: [URL]/wp-sitemap.xml

Custom Staging Domain robots.txt

A custom staging domain can be added to your Pressable account in Settings, under the Company section. When you enter the custom domain, there are two settings you can adjust. Enabling “Existing Staging Sites” applies the custom domain to all existing staging sites.

Enabling “Overwrite robots.txt file” overwrites the robots.txt file when the custom domain is created, ensuring all staging sites continue to prevent search engines from indexing them.
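
With this option enabled, the affected staging sites are served the same disallow-all rules shown above for default staging sites:

User-agent: *
Disallow: /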

Cloned and Restored Site robots.txt

Accessing and Customizing robots.txt

If you need to override these defaults, you can upload your own custom robots.txt file to the root of your site via SFTP or a file manager plugin. When a custom robots.txt file exists, it takes precedence over the system-generated one.
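
For example, a custom robots.txt uploaded to the site root might keep the live-site defaults while also blocking an additional directory (the /private-files/ path here is purely illustrative):

User-agent: *
Disallow: /wp-admin/
Disallow: /private-files/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/wp-sitemap.xml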

You can also customize robots.txt to disallow GPTBot and other AI crawlers from ingesting your site content by following this guide.
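
A minimal sketch of such rules, assuming you want to block OpenAI's GPTBot along with other common AI crawlers (see the linked guide for a current list of user-agent names):

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /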