1 : Anonymous2022/04/28(Thu) 08:31:10 ID: udqbn8
How does this work in practice? I'm struggling with the interpretation of it. Google's robots.txt documentation states that "In case of conflicting rules, including those with wildcards, Google uses the least restrictive rule." My client's site has the following:

User-agent: *
Allow: /
Disallow: /login/

If the least restrictive rule applied, then surely the /login/ page would be crawlable (because of the Allow: / directive), yet upon checking, Google is blocked from crawling this page and it is not getting indexed.
2 : Anonymous2022/04/28(Thu) 09:09:03 ID: i6igmb8
That only applies when all other things are equal. In your example, the rules aren't actually conflicting, because the disallow rule is more specific, per the spec: "The most specific match found MUST be used. The most specific match is the match that has the most octets." The "use the least restrictive" part only comes into play when multiple rules match with the same specificity. Here is an example where that is the case:

URL: example.com/members/bar

User-agent: foobot
allow: /members/
disallow: /members/

Both rules provide an equally specific match (same number of octets). In this case, the allow is applied, as it's the less restrictive of the two.
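The matching logic described above can be sketched in a few lines of Python. This is a simplified illustration of the RFC 9309 / Google rule (longest match wins; a tie goes to allow), not Google's actual implementation, and it only does plain prefix matching with no wildcard support:

```python
def is_allowed(rules, path):
    """Decide crawlability for `path` against a list of ("allow"|"disallow", prefix) rules.

    Longest (most specific) matching prefix wins; if an allow and a disallow
    match with equal length, the less restrictive rule (allow) is used.
    """
    best_len = -1
    best_allow = True  # no matching rule at all => crawling is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        if len(prefix) > best_len:          # strictly more specific match
            best_len = len(prefix)
            best_allow = (kind == "allow")
        elif len(prefix) == best_len:       # tie: least restrictive wins
            best_allow = best_allow or (kind == "allow")
    return best_allow

# Your client's case: Disallow: /login/ (7 octets) beats Allow: / (1 octet)
print(is_allowed([("allow", "/"), ("disallow", "/login/")], "/login/"))
# -> False (blocked)

# The tie case: both rules match /members/bar equally, so allow wins
print(is_allowed([("allow", "/members/"), ("disallow", "/members/")], "/members/bar"))
# -> True (allowed)
```

Note the result is the same regardless of the order the tied rules appear in the file, which is exactly why the spec needs an explicit tie-breaker.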
ID: i6ijr7h
Thanks for your reply, that's really helpful. In the example you gave, why would "allow" be used? Both rules are identical
3 : Anonymous2022/04/28(Thu) 08:59:26 ID: i6ig02r
Try practicing with the robots.txt testing tool in Search Console; you can try out new directives there to better understand how the rules interact.