Gather PDF Authors - Metasploit
This page describes how to use the auxiliary/gather/http_pdf_authors Metasploit module. For a list of all Metasploit modules, visit the Metasploit Module Library.
Module Overview
Name: Gather PDF Authors
Module: auxiliary/gather/http_pdf_authors
Source code: modules/auxiliary/gather/http_pdf_authors.rb
Disclosure date: -
Last modification time: 2019-05-29 22:36:50 +0000
Supported architecture(s): -
Supported platform(s): -
Target service / protocol: http, https
Target network port(s): 80, 443, 3000, 8000, 8008, 8080, 8443, 8880, 8888
List of CVEs: -
This module downloads PDF documents and extracts the author's name from the document metadata. Provide a single URL with the URL option, or supply the path to a file containing a list of URLs with the URL_LIST option. The URL_TYPE option specifies how the supplied URL(s) are treated:
- pdf: the URL(s) are treated as PDF documents. The module downloads the documents and extracts the authors' names from the document metadata.
- html: the URL(s) are treated as HTML pages. The module scrapes the pages for links to PDF documents, downloads those documents, and extracts the authors' names from the document metadata.
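The html mode described above can be pictured with a small standalone sketch. This is not the module's implementation: the helper name extract_pdf_links is reused only for readability, the regex-based href scan is a simplification, and the sample HTML and URLs are made up for illustration.

```ruby
require 'uri'

# Hypothetical helper: collect absolute URLs of links whose path ends in
# .pdf from an HTML string. Regex-based for brevity; a real scraper would
# normally use an HTML parser.
def extract_pdf_links(html, base_url)
  base = URI.parse(base_url)
  hrefs = html.scan(/href\s*=\s*["']([^"']+)["']/i).flatten
  hrefs.filter_map do |href|
    begin
      uri = base.merge(href) # resolve relative links against the page URL
    rescue URI::Error
      next                   # skip hrefs that are not valid URIs
    end
    uri.to_s if uri.path.to_s.downcase.end_with?('.pdf')
  end.uniq
end

html = <<~HTML
  <a href="/docs/report.pdf">Report</a>
  <a href="notes.PDF">Notes</a>
  <a href="/about.html">About</a>
  <a href="http://example.com/files/a.pdf?download=1">A</a>
HTML

links = extract_pdf_links(html, 'http://127.0.0.1/index.html')
puts links
```

Matching on the URI path rather than the raw href keeps links with query strings, such as the last one in the sample.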
Module Ranking and Traits
Module Ranking:
- normal: The exploit is otherwise reliable, but depends on a specific version and can't (or doesn't) reliably autodetect. See the Metasploit exploit ranking documentation for more information.
Basic Usage
msf > use auxiliary/gather/http_pdf_authors
msf auxiliary(http_pdf_authors) > show options
... show and set options ...
msf auxiliary(http_pdf_authors) > set URL [URL]
msf auxiliary(http_pdf_authors) > run
Knowledge Base
This module downloads PDF files and extracts the author's name from the document metadata.
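To give a feel for what "author metadata" means, here is a deliberately naive sketch that looks for an /Author entry in raw PDF bytes. It is an assumption-laden illustration only: it handles just an unencrypted, uncompressed literal string without nested parentheses, and the helper name naive_pdf_author and the sample bytes are hypothetical. The module itself relies on PDF::Reader, as the source excerpts further down show.

```ruby
# Naive sketch: pull the /Author entry out of raw PDF bytes.
# Only handles a plain literal string such as
#   /Author (Administrator)
# Compressed object streams, encrypted files, hex strings and nested
# parentheses are not handled; a full parser is needed for those.
def naive_pdf_author(data)
  match = data.b.match(%r{/Author\s*\(((?:\\.|[^\\()])*)\)})
  return nil unless match

  # Undo backslash escapes for parentheses and backslashes, then label
  # the bytes as UTF-8 and replace anything that is not valid UTF-8.
  match[1].gsub(/\\([()\\])/, '\1').force_encoding('UTF-8').scrub('?')
end

sample = "%PDF-1.4\n1 0 obj\n<< /Title (Test) /Author (Evil1) /Producer (x) >>\nendobj\n"
puts naive_pdf_author(sample)
```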
Verification Steps
- Start msfconsole
- Do: use auxiliary/gather/http_pdf_authors
- Do: set URL [URL]
- Do: run
Options
URL
The URL of a PDF to analyze.
URL_LIST
File containing a list of PDF URLs to analyze.
OUTFILE
File to store extracted author names.
Scenarios
URL
msf auxiliary(http_pdf_authors) > set url http://127.0.0.1/test4.pdf
url => http://127.0.0.1/test4.pdf
msf auxiliary(http_pdf_authors) > run
[*] Processing 1 URLs...
[*] Downloading 'http://127.0.0.1/test4.pdf'
[*] HTTP 200 -- Downloaded PDF (38867 bytes)
[+] PDF Author: Administrator
[*] 100.00% done (1/1 files)
[+] Found 1 authors: Administrator
[*] Auxiliary module execution completed
URL_LIST with OUTFILE
msf auxiliary(http_pdf_authors) > set outfile /root/output
outfile => /root/output
msf auxiliary(http_pdf_authors) > set url_list /root/urls
url_list => /root/urls
msf auxiliary(http_pdf_authors) > run
[*] Processing 8 URLs...
[*] Downloading 'http://127.0.0.1:80/test.pdf'
[*] HTTP 200 -- Downloaded PDF (89283 bytes)
[*] 12.50% done (1/8 files)
[*] Downloading 'http://127.0.0.1/test2.pdf'
[*] HTTP 200 -- Downloaded PDF (636661 bytes)
[+] PDF Author: sqlmap developers
[*] 25.00% done (2/8 files)
[*] Downloading 'http://127.0.0.1/test3.pdf'
[*] HTTP 200 -- Downloaded PDF (167478 bytes)
[+] PDF Author: Evil1
[*] 37.50% done (3/8 files)
[*] Downloading 'http://127.0.0.1/test4.pdf'
[*] HTTP 200 -- Downloaded PDF (38867 bytes)
[+] PDF Author: Administrator
[*] 50.00% done (4/8 files)
[*] Downloading 'http://127.0.0.1/test5.pdf'
[*] HTTP 200 -- Downloaded PDF (34312 bytes)
[+] PDF Author: ekama
[*] 62.50% done (5/8 files)
[*] Downloading 'http://127.0.0.1/doesnotexist.pdf'
[*] HTTP 404 -- Downloaded PDF (289 bytes)
[-] Could not parse PDF: PDF is malformed
[*] 75.00% done (6/8 files)
[*] Downloading 'https://127.0.0.1/test.pdf'
[-] Connection failed: Failed to open TCP connection to 127.0.0.1:443 (Connection refused - connect(2) for "127.0.0.1" port 443)
[*] Downloading 'https://127.0.0.1:80/test.pdf'
[-] Connection failed: SSL_connect returned=1 errno=0 state=unknown state: unknown protocol
[+] Found 4 authors: sqlmap developers, Evil1, Administrator, ekama
[*] Writing data to /root/output...
[*] Auxiliary module execution completed
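If console output like the above has been saved as text, the author names can be pulled back out of it. The following is a minimal sketch that assumes only the "[+] PDF Author: <name>" line format shown in these scenarios. The helper name authors_from_log is hypothetical, and removing duplicates is a choice made here, not a statement about how the module builds its summary line.

```ruby
# Hypothetical helper: collect unique author names from saved console
# output, relying on the "[+] PDF Author: <name>" lines.
def authors_from_log(log)
  log.each_line
     .filter_map { |line| line[/^\[\+\] PDF Author: (.+)$/, 1] }
     .map(&:strip)
     .uniq
end

log = <<~LOG
  [*] Downloading 'http://127.0.0.1/test2.pdf'
  [+] PDF Author: sqlmap developers
  [*] Downloading 'http://127.0.0.1/test3.pdf'
  [+] PDF Author: Evil1
  [-] Could not parse PDF: PDF is malformed
  [+] PDF Author: Evil1
LOG

authors = authors_from_log(log)
puts "Found #{authors.size} authors: #{authors.join(', ')}"
```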
Msfconsole Usage
Here is how the gather/http_pdf_authors auxiliary module looks in the msfconsole:
msf6 > use auxiliary/gather/http_pdf_authors
msf6 auxiliary(gather/http_pdf_authors) > show info
Name: Gather PDF Authors
Module: auxiliary/gather/http_pdf_authors
License: Metasploit Framework License (BSD)
Rank: Normal
Provided by:
bcoles
Check supported:
No
Basic options:
Name        Current Setting  Required  Description
----        ---------------  --------  -----------
STORE_LOOT  true             no        Store authors in loot
URL                          no        The target URL
URL_LIST                     no        File containing a list of target URLs
URL_TYPE    html             yes       The type of URL(s) specified (Accepted: pdf, html)
Description:
This module downloads PDF documents and extracts the author's name
from the document metadata. This module expects a URL to be provided
using the URL option. Alternatively, multiple URLs can be provided
by supplying the path to a file containing a list of URLs in the
URL_LIST option. The URL_TYPE option is used to specify the type of
URLs supplied. By specifying 'pdf' for the URL_TYPE, the module will
treat the specified URL(s) as PDF documents. The module will
download the documents and extract the authors' names from the
document metadata. By specifying 'html' for the URL_TYPE, the module
will treat the specified URL(s) as HTML pages. The module will
scrape the pages for links to PDF documents, download the PDF
documents, and extract the author's name from the document metadata.
Module Options
This is a complete list of options available in the gather/http_pdf_authors auxiliary module:
msf6 auxiliary(gather/http_pdf_authors) > show options
Module options (auxiliary/gather/http_pdf_authors):
Name        Current Setting  Required  Description
----        ---------------  --------  -----------
STORE_LOOT  true             no        Store authors in loot
URL                          no        The target URL
URL_LIST                     no        File containing a list of target URLs
URL_TYPE    html             yes       The type of URL(s) specified (Accepted: pdf, html)
Advanced Options
Here is a complete list of advanced options supported by the gather/http_pdf_authors auxiliary module:
msf6 auxiliary(gather/http_pdf_authors) > show advanced
Module advanced options (auxiliary/gather/http_pdf_authors):
Name                  Current Setting                                     Required  Description
----                  ---------------                                     --------  -----------
DOMAIN                WORKSTATION                                         yes       The domain to use for Windows authentication
DigestAuthIIS         true                                                no        Conform to IIS, should work for most servers. Only set to false for non-IIS servers
FingerprintCheck      true                                                no        Conduct a pre-exploit fingerprint verification
HttpClientTimeout                                                         no        HTTP connection and receive timeout
HttpPassword                                                              no        The HTTP password to specify for authentication
HttpRawHeaders                                                            no        Path to ERB-templatized raw headers to append to existing headers
HttpTrace             false                                               no        Show the raw HTTP requests and responses
HttpTraceColors       red/blu                                             no        HTTP request and response colors for HttpTrace (unset to disable)
HttpTraceHeadersOnly  false                                               no        Show HTTP headers only in HttpTrace
HttpUsername                                                              no        The HTTP username to specify for authentication
SSLVersion            Auto                                                yes       Specify the version of SSL/TLS to be used (Auto, TLS and SSL23 are auto-negotiate) (Accepted: Auto, TLS, SSL23, SSL3, TLS1, TLS1.1, TLS1.2)
UserAgent             Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)  no        The User-Agent header to use for all requests
VERBOSE               false                                               no        Enable detailed status messages
WORKSPACE                                                                 no        Specify the workspace for this module
Auxiliary Actions
This is a list of all auxiliary actions that the gather/http_pdf_authors module can do:
msf6 auxiliary(gather/http_pdf_authors) > show actions
Auxiliary actions:
Name Description
---- -----------
Evasion Options
Here is the full list of possible evasion options supported by the gather/http_pdf_authors auxiliary module in order to evade defenses (e.g. Antivirus, EDR, Firewall, NIDS etc.):
msf6 auxiliary(gather/http_pdf_authors) > show evasion
Module evasion options:
Name                          Current Setting  Required  Description
----                          ---------------  --------  -----------
HTTP::header_folding          false            no        Enable folding of HTTP headers
HTTP::method_random_case      false            no        Use random casing for the HTTP method
HTTP::method_random_invalid   false            no        Use a random invalid, HTTP method for request
HTTP::method_random_valid     false            no        Use a random, but valid, HTTP method for request
HTTP::pad_fake_headers        false            no        Insert random, fake headers into the HTTP request
HTTP::pad_fake_headers_count  0                no        How many fake headers to insert into the HTTP request
HTTP::pad_get_params          false            no        Insert random, fake query string variables into the request
HTTP::pad_get_params_count    16               no        How many fake query string variables to insert into the request
HTTP::pad_method_uri_count    1                no        How many whitespace characters to use between the method and uri
HTTP::pad_method_uri_type     space            no        What type of whitespace to use between the method and uri (Accepted: space, tab, apache)
HTTP::pad_post_params         false            no        Insert random, fake post variables into the request
HTTP::pad_post_params_count   16               no        How many fake post variables to insert into the request
HTTP::pad_uri_version_count   1                no        How many whitespace characters to use between the uri and version
HTTP::pad_uri_version_type    space            no        What type of whitespace to use between the uri and version (Accepted: space, tab, apache)
HTTP::uri_dir_fake_relative   false            no        Insert fake relative directories into the uri
HTTP::uri_dir_self_reference  false            no        Insert self-referential directories into the uri
HTTP::uri_encode_mode         hex-normal       no        Enable URI encoding (Accepted: none, hex-normal, hex-noslashes, hex-random, hex-all, u-normal, u-all, u-random)
HTTP::uri_fake_end            false            no        Add a fake end of URI (eg: /%20HTTP/1.0/../../)
HTTP::uri_fake_params_start   false            no        Add a fake start of params to the URI (eg: /%3fa=b/../)
HTTP::uri_full_url            false            no        Use the full URL for all HTTP requests
HTTP::uri_use_backslashes     false            no        Use back slashes instead of forward slashes in the uri
HTTP::version_random_invalid  false            no        Use a random invalid, HTTP version for request
HTTP::version_random_valid    false            no        Use a random, but valid, HTTP version for request
Error Messages
This module may fail with the following error messages:
- No URL(s) specified
- File '<URL_LIST>' does not exist
- Could not parse PDF: PDF is malformed (MalformedPDFError)
- Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)
- Could not parse PDF: PDF is malformed (SystemStackError)
- Could not parse PDF: PDF is malformed (SyntaxError)
- Could not parse PDF: PDF is malformed (Timeout)
- Could not parse PDF: Unhandled exception: <E>
- Found no links to PDF files
- Found no authors
- Warning: Truncated author's name at <MAX_LEN> characters
Check the code snippets below, taken from the module source code, for possible causes. They can often help identify the root cause of the problem.
No URL(s) specified
Here is a relevant code snippet related to the "No URL(s) specified" error message:
def load_urls
  return [ datastore['URL'] ] unless datastore['URL'].to_s.eql? ''

  if datastore['URL_LIST'].to_s.eql? ''
    fail_with Failure::BadConfig, 'No URL(s) specified'
  end

  unless File.file? datastore['URL_LIST'].to_s
    fail_with Failure::BadConfig, "File '#{datastore['URL_LIST']}' does not exist"
  end
File '<URL_LIST>' does not exist
Here is a relevant code snippet related to the "File '<URL_LIST>' does not exist" error message:
  if datastore['URL_LIST'].to_s.eql? ''
    fail_with Failure::BadConfig, 'No URL(s) specified'
  end

  unless File.file? datastore['URL_LIST'].to_s
    fail_with Failure::BadConfig, "File '#{datastore['URL_LIST']}' does not exist"
  end

  File.open(datastore['URL_LIST'], 'rb') { |f| f.read }.split(/\r?\n/)
end
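The option handling in the excerpts above (URL wins over URL_LIST, and the list file is split on LF or CRLF line endings) can be tried outside Metasploit with a standalone sketch. The assumptions are stated in the comments: a plain Hash stands in for the module datastore, and ArgumentError stands in for fail_with Failure::BadConfig.

```ruby
require 'tempfile'

# Standalone sketch of the precedence in load_urls. `options` is a plain
# Hash standing in for the datastore; ArgumentError replaces fail_with.
def load_urls(options)
  url = options['URL'].to_s
  return [url] unless url.empty?

  list = options['URL_LIST'].to_s
  raise ArgumentError, 'No URL(s) specified' if list.empty?
  raise ArgumentError, "File '#{list}' does not exist" unless File.file?(list)

  # One URL per line; both LF and CRLF line endings are accepted.
  File.binread(list).split(/\r?\n/)
end

Tempfile.create('urls') do |f|
  f.write("http://127.0.0.1/test.pdf\r\nhttp://127.0.0.1/test2.pdf\n")
  f.flush
  p load_urls('URL_LIST' => f.path)
  p load_urls('URL' => 'http://127.0.0.1/test4.pdf', 'URL_LIST' => f.path)
end
```

The second call returns only the single URL, because a non-empty URL short-circuits before URL_LIST is read.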
Could not parse PDF: PDF is malformed (MalformedPDFError)
Here is a relevant code snippet related to the "Could not parse PDF: PDF is malformed (MalformedPDFError)" error message:
  Timeout.timeout(10) do
    reader = PDF::Reader.new data
    return parse reader
  end
rescue PDF::Reader::MalformedPDFError
  print_error "Could not parse PDF: PDF is malformed (MalformedPDFError)"
  return
rescue PDF::Reader::UnsupportedFeatureError
  print_error "Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)"
  return
rescue SystemStackError
Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)
Here is a relevant code snippet related to the "Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)" error message:
  end
rescue PDF::Reader::MalformedPDFError
  print_error "Could not parse PDF: PDF is malformed (MalformedPDFError)"
  return
rescue PDF::Reader::UnsupportedFeatureError
  print_error "Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)"
  return
rescue SystemStackError
  print_error "Could not parse PDF: PDF is malformed (SystemStackError)"
  return
rescue SyntaxError
Could not parse PDF: PDF is malformed (SystemStackError)
Here is a relevant code snippet related to the "Could not parse PDF: PDF is malformed (SystemStackError)" error message:
  return
rescue PDF::Reader::UnsupportedFeatureError
  print_error "Could not parse PDF: PDF contains unsupported features (UnsupportedFeatureError)"
  return
rescue SystemStackError
  print_error "Could not parse PDF: PDF is malformed (SystemStackError)"
  return
rescue SyntaxError
  print_error "Could not parse PDF: PDF is malformed (SyntaxError)"
  return
rescue Timeout::Error
Could not parse PDF: PDF is malformed (SyntaxError)
Here is a relevant code snippet related to the "Could not parse PDF: PDF is malformed (SyntaxError)" error message:
  return
rescue SystemStackError
  print_error "Could not parse PDF: PDF is malformed (SystemStackError)"
  return
rescue SyntaxError
  print_error "Could not parse PDF: PDF is malformed (SyntaxError)"
  return
rescue Timeout::Error
  print_error "Could not parse PDF: PDF is malformed (Timeout)"
  return
rescue => e
Could not parse PDF: PDF is malformed (Timeout)
Here is a relevant code snippet related to the "Could not parse PDF: PDF is malformed (Timeout)" error message:
  return
rescue SyntaxError
  print_error "Could not parse PDF: PDF is malformed (SyntaxError)"
  return
rescue Timeout::Error
  print_error "Could not parse PDF: PDF is malformed (Timeout)"
  return
rescue => e
  print_error "Could not parse PDF: Unhandled exception: #{e}"
  return
end
Could not parse PDF: Unhandled exception: <E>
Here is a relevant code snippet related to the "Could not parse PDF: Unhandled exception: <E>" error message:
  return
rescue Timeout::Error
  print_error "Could not parse PDF: PDF is malformed (Timeout)"
  return
rescue => e
  print_error "Could not parse PDF: Unhandled exception: #{e}"
  return
end

def parse(reader)
  # PDF
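The rescue chain in the excerpts above names SystemStackError and SyntaxError explicitly. In Ruby, neither descends from StandardError, so the final bare rescue would not catch them. The sketch below reproduces that ordering around an arbitrary block so the behaviour can be observed outside Metasploit. The wrapper name guarded_parse is hypothetical, and warn stands in for print_error.

```ruby
require 'timeout'

# Hypothetical wrapper with the same rescue ordering around any block.
# Returns the block's value, or nil after reporting why it failed.
def guarded_parse(seconds = 10)
  Timeout.timeout(seconds) { yield }
rescue SystemStackError, SyntaxError => e
  # Not StandardError subclasses: a bare `rescue` would miss these.
  warn "Could not parse: #{e.class}"
  nil
rescue Timeout::Error
  warn 'Could not parse: timed out'
  nil
rescue => e
  warn "Could not parse: Unhandled exception: #{e}"
  nil
end

p guarded_parse { :ok }
p guarded_parse { raise SyntaxError, 'unexpected token' }
p guarded_parse(0.1) { sleep 1 }
p guarded_parse { raise ArgumentError, 'bad input' }
```

Listing Timeout::Error before the bare rescue is what lets a slow parse produce its own message instead of the generic "Unhandled exception" one.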
Found no links to PDF files
Here is a relevant code snippet related to the "Found no links to PDF files" error message:
if datastore['URL_TYPE'].eql? 'html'
  urls = extract_pdf_links urls

  if urls.empty?
    print_error 'Found no links to PDF files'
    return
  end

  print_line
  print_good "Found links to #{urls.size} PDF files:"
Found no authors
Here is a relevant code snippet related to the "Found no authors" error message:
authors = extract_authors urls

print_line

if authors.empty?
  print_status 'Found no authors'
  return
end

print_good "Found #{authors.size} authors: #{authors.join ', '}"
Warning: Truncated author's name at <MAX_LEN> characters
Here is a relevant code snippet related to the "Warning: Truncated author's name at <MAX_LEN> characters" error message:
pdf.puts file
author = read pdf
unless author.blank?
  print_good "PDF Author: #{author}"
  if author.length > max_len
    print_warning "Warning: Truncated author's name at #{max_len} characters"
    authors << author[0...max_len]
  else
    authors << author
  end
end
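The truncation step in the excerpt above relies on an exclusive range, author[0...max_len], which keeps exactly max_len characters. Here is a minimal standalone sketch. The helper name clip_author is hypothetical, and max_len is passed in as a parameter because the excerpt does not show the limit the module uses.

```ruby
# Minimal sketch of the truncation step. max_len is a parameter here
# because the excerpt does not show the module's actual limit.
def clip_author(author, max_len)
  return author if author.length <= max_len

  warn "Warning: Truncated author's name at #{max_len} characters"
  author[0...max_len] # exclusive range keeps exactly max_len characters
end

puts clip_author('Administrator', 5)
puts clip_author('ekama', 256)
```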
Related Pull Requests
- #11898 Merged Pull Request: move require for pdf-reader
- #11535 Merged Pull Request: add deregister_http_client_options
- #11234 Merged Pull Request: revisionism
- #10660 Merged Pull Request: deregister_options RHOSTS
- #8686 Merged Pull Request: HTTP PDF Authors Gather Module: Use Msf::Exploit::Remote::HttpClient
- #8658 Merged Pull Request: Add Gather PDF Authors auxiliary module
See Also
See also the following related modules:
- auxiliary/dos/windows/http/http_sys_accept_encoding_dos_cve_2021_31166
- auxiliary/fuzzers/http/http_form_field
- auxiliary/fuzzers/http/http_get_uri_long
- auxiliary/fuzzers/http/http_get_uri_strings
- auxiliary/scanner/http/http_header
- auxiliary/scanner/http/http_hsts
- auxiliary/scanner/http/http_login
- auxiliary/scanner/http/http_put
- auxiliary/scanner/http/http_sickrage_password_leak
- auxiliary/scanner/http/http_traversal
- auxiliary/scanner/http/http_version
- auxiliary/server/capture/http_basic
- auxiliary/server/capture/http_javascript_keylogger
- auxiliary/server/capture/http_ntlm
- auxiliary/server/http_ntlmrelay
- auxiliary/pdf/foxit/authbypass
- auxiliary/gather/firefox_pdfjs_file_theft
- auxiliary/fileformat/badpdf
- exploit/android/fileformat/adobe_reader_pdf_js_interface
- exploit/multi/browser/firefox_pdfjs_privilege_escalation
- exploit/windows/browser/verypdf_pdfview
- exploit/windows/fileformat/activepdf_webgrabber
- exploit/windows/fileformat/adobe_pdf_embedded_exe
- exploit/windows/fileformat/adobe_pdf_embedded_exe_nojs
- exploit/windows/fileformat/a_pdf_wav_to_mp3
- exploit/windows/fileformat/coolpdf_image_stream_bof
- exploit/windows/fileformat/corelpdf_fusion_bof
- exploit/windows/fileformat/documalis_pdf_editor_and_scanner
- exploit/windows/fileformat/nuance_pdf_launch_overflow
- exploit/windows/fileformat/shaper_pdf_bof
Authors
bcoles
Version
This page has been produced using Metasploit Framework version 6.1.36-dev. For more modules, visit the Metasploit Module Library.