Limit Crawling to Address and Sub-directories Only
Use this option to restrict the scope of a scan to part of a web application. By default, the option Limit Crawling to address and sub-directories only is enabled for new Targets.
When the option is enabled, the scope of the scan is limited to the Target address up to its last forward slash (/), together with everything beneath that location.
NOTE: If a Target URL has a path but no trailing slash, the crawler treats the final part of the path as a file, not a folder. As a result, the parent folder of that file becomes the real target URL. For example, with the Target URL http://www.example.com/part1 (no trailing slash), part1 is treated as a file and the scope of the scan starts at http://www.example.com/.
Limiting Scan Scope - Examples
Example 1
- Scan the full domain:
- Set the Target URL to http://www.example.com (with or without the trailing forward slash). In this case, the option Limit Crawling to address and sub-directories only will have no effect on the scope of the scan.
Example 2
- Scan only part of the site or domain:
- Set the Target URL to http://www.example.com/part1/ (with the trailing forward slash) and enable the option Limit Crawling to address and sub-directories only. This limits the scope of the scan to resources beneath the /part1/ folder.
- If you disable the option Limit Crawling to address and sub-directories only, then any path specified in the target URL will be ignored and you will scan the full domain.
Therefore, if your Target URL is set to http://www.example.com/task/subtask, you can disable the option Limit Crawling to address and sub-directories only to instruct the crawler to also look for resources in http://www.example.com/task/ and http://www.example.com.
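The scoping rule and the examples above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in this section, not the scanner's actual implementation; the function names scope_root and in_scope and the parameter limit_to_subdirs are hypothetical and exist only for this example.

```python
from urllib.parse import urlsplit


def _normalise(url: str) -> str:
    """Return scheme://host/path, using "/" when the URL has no path."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"


def scope_root(target_url: str, limit_to_subdirs: bool = True) -> str:
    """Hypothetical helper: the URL prefix that bounds the scan scope."""
    parts = urlsplit(target_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    if not limit_to_subdirs:
        # Option disabled: the path is ignored and the full domain is scanned.
        return origin + "/"
    path = parts.path or "/"
    # Option enabled: keep everything up to and including the last "/".
    # Without a trailing slash, the final segment is treated as a file.
    return origin + path[: path.rfind("/") + 1]


def in_scope(url: str, target_url: str, limit_to_subdirs: bool = True) -> bool:
    """Hypothetical helper: True if url falls under the computed scope."""
    return _normalise(url).startswith(scope_root(target_url, limit_to_subdirs))


if __name__ == "__main__":
    for target in (
        "http://www.example.com",
        "http://www.example.com/part1/",
        "http://www.example.com/part1",
        "http://www.example.com/task/subtask",
    ):
        print(target, "->", scope_root(target))
```

With the option enabled, http://www.example.com/task/subtask yields a scope of http://www.example.com/task/; passing limit_to_subdirs=False widens it to http://www.example.com/.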