Walk Settings to include only directly linked pages on parent domain

Posted: Thu Jul 22, 2021 9:19 am
by Bryan
I am not having much luck configuring the walk settings so that only the content we want is indexed for our search results.

The main site is a subdomain, support.domain.com, but some (not all) of the content we want searchable lives on domain.com.
Of course the Base URL(s) setting has our support.domain.com entered.

If I add www.domain.com to the Extra Domain setting everything on that site seems to get crawled. I only want directly linked pages to be included.

If I set Off-Site Pages to "Y" it includes the content I want crawled but also some domains that are not under our control that we do not want.

So I am looking for some settings that allow the parent domain pages to be crawled but only those with direct links in the support subdomain. Any suggestions on how to implement this?

Re: Walk Settings to include only directly linked pages on parent domain

Posted: Thu Jul 22, 2021 9:43 am
by John
There are a couple of approaches you could take.

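The simplest might be to add the extra domain and then use "Exclude By Field". In the field URL, use a query of either domain.com -support.domain.com or www.domain.com, and select Exclude Links Only. The first query excludes links found on any domain.com site that isn't support.domain.com; the second excludes links found on the www.domain.com site explicitly. Either way, a parent-domain page that the support site links to can still be indexed; only the links on that page are skipped, so the walk does not continue further into the parent site.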
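
To illustrate the intended effect of that combination, here is a minimal Python sketch of the rule. It is only a toy model over a made-up link graph, not how the crawler itself is implemented; the page URLs, the LINKS table, and the function names are all hypothetical, and it assumes "Exclude Links Only" means "index the page but do not follow its links".

```python
from collections import deque
from urllib.parse import urlparse

BASE_HOST = "support.domain.com"   # stands in for the Base URL(s) setting
EXTRA_HOST = "www.domain.com"      # stands in for the Extra Domain setting

# Hypothetical link graph: page URL -> URLs that page links to.
LINKS = {
    "https://support.domain.com/": [
        "https://support.domain.com/faq",
        "https://www.domain.com/products",
        "https://other.example.org/",
    ],
    "https://support.domain.com/faq": ["https://www.domain.com/pricing"],
    "https://www.domain.com/products": ["https://www.domain.com/about"],
    "https://www.domain.com/pricing": [],
    "https://www.domain.com/about": [],
}


def host(url):
    """Return the lowercased host name of a URL."""
    return (urlparse(url).hostname or "").lower()


def allowed(url):
    """Only the base site and the extra domain may be fetched at all."""
    return host(url) in (BASE_HOST, EXTRA_HOST)


def links_excluded(url):
    """Model of the query 'domain.com -support.domain.com' on the URL field:
    true for any domain.com host other than the support site."""
    h = host(url)
    on_parent = h == "domain.com" or h.endswith(".domain.com")
    return on_parent and h != BASE_HOST


def crawl(start):
    """Breadth-first walk that indexes every fetched page but skips the
    links on pages matched by links_excluded()."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        indexed.append(url)
        if links_excluded(url):
            continue  # page is indexed, but its links are not followed
        for target in LINKS.get(url, []):
            if allowed(target) and target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed


if __name__ == "__main__":
    for page in crawl("https://support.domain.com/"):
        print(page)
```

In this toy graph the walk picks up the two support pages plus /products and /pricing on www.domain.com, because the support site links to them directly. It never reaches /about, which is linked only from a www.domain.com page, and it never fetches other.example.org, which is neither the base site nor the extra domain.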

Re: Walk Settings to include only directly linked pages on parent domain

Posted: Fri Jul 23, 2021 10:41 am
by Bryan
Thank you. This worked perfectly!