Web Site Crawler - Metasploit
This page contains detailed information about how to use the auxiliary/scanner/http/crawler Metasploit module. For a list of all Metasploit modules, visit the Metasploit Module Library.
Module Overview
Name: Web Site Crawler
Module: auxiliary/scanner/http/crawler
Source code: modules/auxiliary/scanner/http/crawler.rb
Disclosure date: -
Last modification time: 2021-01-28 10:35:25 +0000
Supported architecture(s): -
Supported platform(s): -
Target service / protocol: http, https
Target network port(s): 80, 443, 3000, 8000, 8008, 8080, 8443, 8880, 8888
List of CVEs: -
Crawl a web site and store information about what was found
Module Ranking and Traits
Module Ranking:
- normal: The exploit is otherwise reliable, but depends on a specific version and can't (or doesn't) reliably autodetect. More information about ranking can be found here.
Basic Usage
msf > use auxiliary/scanner/http/crawler
msf auxiliary(crawler) > show options
... show and set options ...
msf auxiliary(crawler) > run
Required Options
- RHOSTS: The target host(s), range CIDR identifier, or hosts file with syntax 'file:<path>'
Knowledge Base
Description
This module is an HTTP crawler: it recursively follows the links it finds on the target web site. If you have loaded a database plugin and connected to a database, the module will also record the discovered web pages and web forms.
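At its core, a crawl loop extracts the links from each fetched page and resolves them against the page URL before queueing them for the next round. The Ruby sketch below illustrates that one step; it is purely illustrative (the real module delegates crawling to the Anemone library), and the regex-based link extraction is a deliberate simplification.

```ruby
require 'uri'

# Extract href targets from an HTML body and resolve them against the
# page URL. Regex extraction is a simplification -- a real crawler
# parses the HTML properly.
def extract_links(base_url, html)
  base = URI(base_url)
  html.scan(/href=["']([^"']+)["']/i).flatten.map do |href|
    URI.join(base, href).to_s rescue nil
  end.compact.uniq
end

page  = '<a href="/webgoat/login.mvc">Login</a> <a href="css/main.css">CSS</a>'
links = extract_links('http://127.0.0.1:8080/webgoat/', page)
# links => ["http://127.0.0.1:8080/webgoat/login.mvc",
#           "http://127.0.0.1:8080/webgoat/css/main.css"]
```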
Vulnerable Application
You can use any web application to test the crawler.
Options
URI
The starting page to crawl; the default path is /
DirBust
When true (the default), the crawler also brute-forces common URL paths. This can discover unlinked content,
but may generate noise in reports.
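For example, to keep the crawl quiet you might disable the brute-force step and also skip paths you never want requested (the option names are from this module; the pattern value is just an illustration):

```
msf6 auxiliary(scanner/http/crawler) > set DirBust false
msf6 auxiliary(scanner/http/crawler) > set ExcludePathPatterns logout*
```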
HttpPassword, HttpUsername, HTTPAdditionalHeaders, HTTPCookie
Use these options to supply HTTP credentials, a cookie, or additional headers to send with each request
UserAgent
The default User-Agent is Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Verification Steps
- Do:
use auxiliary/scanner/http/crawler
- Do:
set RHOSTS [IP]
- Do:
set RPORT [PORT]
- Do:
set URI [PATH]
- Do:
run
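The steps above can also be collected into an msfconsole resource file and replayed in one go. The values below are placeholders for a local WebGoat-style target; adjust them for your environment:

```
# crawl.rc -- run with: msfconsole -q -r crawl.rc
use auxiliary/scanner/http/crawler
set RHOSTS 127.0.0.1
set RPORT 8080
set URI /webgoat/
run
```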
Scenarios
Example against WebGoat
msf > use auxiliary/scanner/http/crawler
msf auxiliary(crawler) > set RHOSTS 127.0.0.1
msf auxiliary(crawler) > set RPORT 8080
msf auxiliary(crawler) > set URI /webgoat/
msf auxiliary(crawler) > set DirBust false
msf auxiliary(crawler) > run
[*] Crawling http://127.0.0.1:8080/webgoat/...
[*] [00001/00500] 302 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/ -> /webgoat/login.mvc
[*] [00002/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/login.mvc
[*] FORM: POST /webgoat/j_spring_security_check;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202
[-] [00003/00500] 404 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/images/favicon.ico
[*] [00004/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/plugins/bootstrap/css/bootstrap.min.css
[*] [00005/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/css/font-awesome.min.css
[*] [00006/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/css/animate.css
[*] [00007/00500] 302 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/j_spring_security_check;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202 -> /webgoat/login.mvc;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202?error
[*] [00008/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/login.mvc;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202?error
[*] FORM: GET /webgoat/login.mvc
[*] FORM: POST /webgoat/j_spring_security_check;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202
[*] [00009/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/css/main.css
[*] [00010/00500] 302 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/start.mvc -> http://127.0.0.1:8080/webgoat/login.mvc
[*] [00011/00500] 200 - 127.0.0.1 - http://127.0.0.1:8080/webgoat/login.mvc
[*] FORM: POST /webgoat/j_spring_security_check
[*] Crawl of http://127.0.0.1:8080/webgoat/ complete
[*] Auxiliary module execution completed
Follow-on: Wmap
As you can see, the raw output is not very user friendly.
You can instead view a tree of the crawled site with the Wmap plugin. Simply run:
msf auxiliary(crawler) > load wmap
msf auxiliary(crawler) > wmap_sites -l
[*] Available sites
===============
Id Host Vhost Port Proto # Pages # Forms
-- ---- ----- ---- ----- ------- -------
0 127.0.0.1 127.0.0.1 8080 http 70 80
msf auxiliary(crawler) > wmap_sites -s 0
[127.0.0.1] (127.0.0.1)
└── webgoat (7)
├── css (3)
│ ├── animate.css
│ ├── font-awesome.min.css
│ └── main.css
├── j_spring_security_check;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202
├── login.mvc
├── login.mvc;jsessionid=8B1EAF2554B60EFC93A52AFCA4B6C202
├── plugins (1)
│ └── bootstrap (1)
│ └── css (1)
│ └── bootstrap.min.css
├── start.mvc
└── j_spring_security_check
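From here you can continue inside Wmap itself, for example by marking the crawled site as a target and launching Wmap's own test modules. This is a sketch of the usual workflow; exact output depends on your Metasploit version:

```
msf auxiliary(crawler) > wmap_targets -t http://127.0.0.1:8080/webgoat/
msf auxiliary(crawler) > wmap_run -t
msf auxiliary(crawler) > wmap_run -e
```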
Msfconsole Usage
Here is how the scanner/http/crawler auxiliary module looks in the msfconsole:
msf6 > use auxiliary/scanner/http/crawler
msf6 auxiliary(scanner/http/crawler) > show info
Name: Web Site Crawler
Module: auxiliary/scanner/http/crawler
License: Metasploit Framework License (BSD)
Rank: Normal
Provided by:
hdm <[email protected]>
tasos
Check supported:
No
Basic options:
Name Current Setting Required Description
---- --------------- -------- -----------
DOMAIN WORKSTATION yes The domain to use for windows authentication
HttpPassword no The HTTP password to specify for authentication
HttpUsername no The HTTP username to specify for authentication
MAX_MINUTES 5 yes The maximum number of minutes to spend on each URL
MAX_PAGES 500 yes The maximum number of pages to crawl per URL
MAX_THREADS 4 yes The maximum number of concurrent requests
Proxies no A proxy chain of format type:host:port[,type:host:port][...]
RHOSTS yes The target host(s), range CIDR identifier, or hosts file with syntax 'file:<path>'
RPORT 80 yes The target port
SSL false no Negotiate SSL/TLS for outgoing connections
URI / yes The starting page to crawl
VHOST no HTTP server virtual host
Description:
Crawl a web site and store information about what was found
Module Options
This is a complete list of options available in the scanner/http/crawler auxiliary module:
msf6 auxiliary(scanner/http/crawler) > show options
Module options (auxiliary/scanner/http/crawler):
Name Current Setting Required Description
---- --------------- -------- -----------
DOMAIN WORKSTATION yes The domain to use for windows authentication
HttpPassword no The HTTP password to specify for authentication
HttpUsername no The HTTP username to specify for authentication
MAX_MINUTES 5 yes The maximum number of minutes to spend on each URL
MAX_PAGES 500 yes The maximum number of pages to crawl per URL
MAX_THREADS 4 yes The maximum number of concurrent requests
Proxies no A proxy chain of format type:host:port[,type:host:port][...]
RHOSTS yes The target host(s), range CIDR identifier, or hosts file with syntax 'file:<path>'
RPORT 80 yes The target port
SSL false no Negotiate SSL/TLS for outgoing connections
URI / yes The starting page to crawl
VHOST no HTTP server virtual host
Advanced Options
Here is a complete list of advanced options supported by the scanner/http/crawler auxiliary module:
msf6 auxiliary(scanner/http/crawler) > show advanced
Module advanced options (auxiliary/scanner/http/crawler):
Name Current Setting Required Description
---- --------------- -------- -----------
BasicAuthPass no The HTTP password to specify for basic authentication
BasicAuthUser no The HTTP username to specify for basic authentication
DirBust true no Bruteforce common URL paths
ExcludePathPatterns no Newline-separated list of path patterns to ignore ('*' is a wildcard)
HTTPAdditionalHeaders no A list of additional headers to send (separated by \x01)
HTTPCookie no A HTTP cookie header to send with each request
RedirectLimit 5 no The maximum number of redirects for a single request
RequestTimeout 15 no The maximum number of seconds to wait for a reply
RetryLimit 5 no The maximum number of attempts for a single request
SSLVersion             Auto                                                                      yes       Specify the version of SSL/TLS to be used (Auto, TLS and SSL23 are auto-negotiate) (Accepted: Auto, TLS, SSL23, SSL3, TLS1, TLS1.1, TLS1.2)
UserAgent              Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)  yes       The User-Agent header to use for all requests
VERBOSE false no Enable detailed status messages
WORKSPACE no Specify the workspace for this module
Auxiliary Actions
This is a list of all auxiliary actions that the scanner/http/crawler module can perform:
msf6 auxiliary(scanner/http/crawler) > show actions
Auxiliary actions:
Name Description
---- -----------
Evasion Options
Here is the full list of evasion options supported by the scanner/http/crawler auxiliary module, used to evade defenses (e.g. antivirus, EDR, firewalls, NIDS):
msf6 auxiliary(scanner/http/crawler) > show evasion
Module evasion options:
Name Current Setting Required Description
---- --------------- -------- -----------
Related Pull Requests
- #14696 Merged Pull Request: Zeitwerk rex folder
- #8716 Merged Pull Request: Print_Status -> Print_Good (And OCD bits 'n bobs)
- #8338 Merged Pull Request: Fix msf/core and self.class msftidy warnings
- #6655 Merged Pull Request: use MetasploitModule as a class name
- #6648 Merged Pull Request: Change metasploit class names
- #3409 Merged Pull Request: Fix HTTP Crawler Anemone::Page method error
- #3400 Merged Pull Request: Fix the last of the Set-Cookie msftidy warnings
- #2525 Merged Pull Request: Change module boilerplate
- #1487 Merged Pull Request: Web crawler updated to skip paths (ExcludePathPatterns option)
- #1228 Merged Pull Request: MSFTIDY cleanup #1 - auxiliary
- #1096 Merged Pull Request: Tiny Web-crawler bugfix and reformatting
- #1000 Merged Pull Request: Updated support for Web modules and analysis techniques
See Also
Also check the following modules related to this one:
- auxiliary/analyze/crack_aix
- auxiliary/analyze/crack_databases
- auxiliary/analyze/crack_linux
- auxiliary/analyze/crack_mobile
- auxiliary/analyze/crack_osx
- auxiliary/analyze/crack_webapps
- auxiliary/crawler/msfcrawler
- exploit/windows/mssql/mssql_linkcrawler
Authors
- hdm
- tasos
Version
This page has been produced using Metasploit Framework version 6.1.27-dev. For more modules, visit the Metasploit Module Library.