Building a Firefox extension to parse script tags

Posted on 2024-11-16 15:27:09

I am in the process of building my first Firefox extension and I've hit a wall. I'm also not familiar with JavaScript, though I do know how to program.

The extension attempts to identify malicious JavaScript code by running all of the code contained within script tags through a classifier. I have the classifier built in Python already, but I can't figure out how to identify the JavaScript and send it to the classifier. What is the best way to capture everything between script tags, one by one, and send it to the classifier? The log says that each item I am capturing in the array with "var scripts = document.getElementsByTagName( 'script' );" is of type xulelement object, but I don't know how to get the actual code. In the for loop, I'd like to send each item in the array to the classifier. I've included what I have so far below:

function extractScripts(){
    var scripts = document.getElementsByTagName( 'script' );
    scriptExtractor_Log( scripts.length + ' scripts were found' );
    var sLen = scripts.length;
    for ( var i=0, len=sLen; i<len; ++i ){
        scriptExtractor_Log( 'script ' + i + ': ' + scripts[i] );
    }
    return 0;
}

Comments (2)

风追烟花雨 2024-11-23 15:27:09

Extracting the script tags won't be sufficient to identify malicious scripts. Consider this typical XSS code for example:

<img src="this_does_not_exist" onerror="alert('Doing something evil')">

If you load the HTML page in the browser anyway, you can have a look at the JavaScript Deobfuscator extension. This extension uses the JavaScript debugger service to intercept all JavaScript code that is being compiled or executed, even if it is generated dynamically. You can find some code examples at https://developer.mozilla.org/en/Code_snippets/JavaScript_Debugger_Service. JavaScript Deobfuscator sets debuggerService.scriptHook to intercept scripts being compiled (onScriptCreated and onScriptDestroyed will be called).
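If you stay with static extraction, inline handlers such as the onerror attribute above can at least be collected alongside the script tags. The following is only a rough sketch: extractInlineHandlers is a name made up for illustration, and treating every attribute whose name starts with "on" as an event handler is a simplifying assumption. It relies only on the DOM-style attributes list (entries with name and value), so it can be tried outside a browser with a stand-in object.

```javascript
// Collect inline event-handler code (onerror, onclick, ...) from elements.
// Each element is expected to expose a DOM-style `attributes` list whose
// entries have `name` and `value`, as real DOM elements do.
// Assumption: any attribute whose name starts with "on" is a handler.
function extractInlineHandlers(elements) {
  var found = [];
  for (var i = 0; i < elements.length; i++) {
    var attrs = elements[i].attributes;
    for (var j = 0; j < attrs.length; j++) {
      var name = attrs[j].name.toLowerCase();
      if (name.indexOf('on') === 0) {
        found.push({
          tag: elements[i].tagName,
          attribute: name,
          code: attrs[j].value
        });
      }
    }
  }
  return found;
}

// Stand-in for the <img> element from the example above. In a page you
// would pass document.getElementsByTagName('*') instead.
var fakeImg = {
  tagName: 'IMG',
  attributes: [
    { name: 'src', value: 'this_does_not_exist' },
    { name: 'onerror', value: "alert('Doing something evil')" }
  ]
};

var handlers = extractInlineHandlers([fakeImg]);
console.log(handlers);
```

This still misses code that is generated at run time, which is why the debugger-service approach is the more complete option.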

溺ぐ爱和你が 2024-11-23 15:27:09

Scripts are either external or inline, so you need to check the src attribute of each tag. For an external script, you can then make an AJAX request to get its source code. However, if the script is from another domain (which it often is), you can't retrieve it due to the cross-domain policy.

var scripttxt;
for ( var i=0, len=scripts.length; i<len; ++i ){
  if ( scripts[i].src ) { // external script: src is set
    var XHR = new XMLHttpRequest(); // create a new XHR object
    XHR.open("GET", scripts[i].src, false); // false makes the request synchronous
    XHR.send(); // send the request
    scripttxt = XHR.responseText; // body of the fetched script
  } else {
    scripttxt = scripts[i].innerHTML; // inline script: the code is the tag's content
  }
  scriptExtractor_Log( 'script ' + i + ': ' + scripttxt);
}
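A synchronous request blocks the browser while each script downloads, and a failed cross-domain request should not stop the whole loop. Here is a possible asynchronous variant of the same idea, offered as a sketch rather than a drop-in replacement. collectScriptSources and xhrLoadText are hypothetical names, and the loader is passed in as a parameter so the collection logic can be exercised without a browser.

```javascript
// Gather source text for each script element. Inline scripts are read from
// textContent; external ones are loaded through `loadText(url)`, a
// caller-supplied function that returns a Promise of the response body.
// A failed load is recorded in `error` instead of aborting the whole run.
function collectScriptSources(scripts, loadText) {
  var jobs = [];
  for (var i = 0; i < scripts.length; i++) {
    (function (script) {
      if (script.src) {
        jobs.push(loadText(script.src).then(
          function (text) {
            return { url: script.src, code: text, error: null };
          },
          function (err) {
            return { url: script.src, code: null, error: String(err) };
          }
        ));
      } else {
        jobs.push(Promise.resolve(
          { url: null, code: script.textContent, error: null }
        ));
      }
    })(scripts[i]);
  }
  return Promise.all(jobs);
}

// One possible loader, using an asynchronous XMLHttpRequest.
// It does not inspect the HTTP status; add that check if you need it.
function xhrLoadText(url) {
  return new Promise(function (resolve, reject) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.onload = function () { resolve(xhr.responseText); };
    xhr.onerror = function () { reject(new Error('request failed: ' + url)); };
    xhr.send();
  });
}
```

In the extension this would be called as collectScriptSources(document.getElementsByTagName('script'), xhrLoadText).then(...), with each result logged or passed on to the classifier.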

If you're using Python already, it might just be easier to use Python libraries to parse the HTML and pull out the body of the tags.
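If the extraction stays in the extension instead, the collected source still has to reach the Python classifier somehow. One option is to POST each script as JSON to a local process. The sketch below assumes a setup the question does not describe: the endpoint CLASSIFIER_URL, the payload field names, and the function names are all hypothetical.

```javascript
// Build the JSON body for one script. The field names are illustrative;
// the Python side would need to expect the same ones.
function buildClassifierPayload(pageUrl, index, code) {
  return JSON.stringify({ page: pageUrl, index: index, code: code });
}

// Hypothetical address of a Python process listening locally.
var CLASSIFIER_URL = 'http://127.0.0.1:8000/classify';

// POST one payload to the classifier and report the raw response text
// (or an error) through a Node-style callback.
function sendToClassifier(payload, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', CLASSIFIER_URL, true);
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onload = function () { onDone(null, xhr.responseText); };
  xhr.onerror = function () { onDone(new Error('classifier unreachable')); };
  xhr.send(payload);
}

var example = buildClassifierPayload('http://example.com/', 0, 'alert(1);');
console.log(example);
```

Inside the loop over the script tags, each scripttxt would be wrapped with buildClassifierPayload and handed to sendToClassifier.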
