Building a Firefox extension to parse script tags
I am in the process of building my first Firefox extension and I've hit a wall. I'm also not familiar with JavaScript, though I do know how to program.
The extension attempts to identify malicious JavaScript code by running all of the code contained within script tags through a classifier. I have the classifier built in Python already, but I can't figure out how to identify the JavaScript and send it to the classifier. What is the best way to capture everything in between script tags, one by one, and send it to the classifier? The log says that each item I am capturing in the array with "var scripts = document.getElementsByTagName( 'script' );" is of type xulelement object, but I don't know how to get the actual code. In the for loop, I'd like to send each item in the array to the classifier. I've included what I have so far below:
function extractScripts(){
    var scripts = document.getElementsByTagName( 'script' );
    scriptExtractor_Log( scripts.length + ' scripts were found' );
    var sLen = scripts.length;
    for ( var i=0, len=sLen; i<len; ++i ){
        scriptExtractor_Log( 'script ' + i + ': ' + scripts[i]);
    }
    return 0;
}
Comments (2)
Extracting the script tags won't be sufficient to identify malicious scripts. Consider this typical XSS code for example:
If you load the HTML page in the browser anyway, you can have a look at the JavaScript Deobfuscator extension. This extension uses the JavaScript debugger service to intercept all JavaScript code that is being compiled or executed, even if it is generated dynamically. You can find some code examples at https://developer.mozilla.org/en/Code_snippets/JavaScript_Debugger_Service. JavaScript Deobfuscator sets debuggerService.scriptHook to intercept scripts being compiled (onScriptCreated and onScriptDestroyed will be called).
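A minimal sketch of the script hook idea described above, not the code JavaScript Deobfuscator itself uses. It assumes privileged (chrome) code in a legacy Firefox that still ships the debugger service under the contract ID shown; later Firefox releases removed this service, as far as I know. The names makeScriptHook, installScriptHook and sendToClassifier are hypothetical, and reading script.fileName and script.functionSource assumes the jsdIScript interface of that era.

```javascript
// Build a hook object with the two callbacks named in the answer.
// `sendToClassifier` is a hypothetical callback supplied by the extension.
function makeScriptHook(sendToClassifier) {
  return {
    onScriptCreated: function (script) {
      // Assumption: the script object exposes fileName and functionSource.
      sendToClassifier({
        fileName: script.fileName,
        source: script.functionSource
      });
    },
    onScriptDestroyed: function (script) {
      // Nothing to classify when a script goes away.
    }
  };
}

// Register the hook. Returns false when the debugger service is not
// available (for example outside privileged Firefox code).
function installScriptHook(sendToClassifier) {
  if (typeof Components === 'undefined') {
    return false;
  }
  var serviceClass = Components.classes['@mozilla.org/js/jsd/debugger-service;1'];
  if (!serviceClass) {
    return false;
  }
  var debuggerService = serviceClass.getService(
    Components.interfaces.jsdIDebuggerService);
  debuggerService.scriptHook = makeScriptHook(sendToClassifier);
  // Assumption: the service has to be switched on before hooks fire.
  if (!debuggerService.isOn) {
    debuggerService.asyncOn({ onDebuggerActivated: function () {} });
  }
  return true;
}
```

Keeping the hook in its own function means the classifier hand-off can be exercised with plain objects, without a running browser.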
Scripts are either external or inline, so you need to check the src attribute of each tag. For an external script you can then make an AJAX request to get its source code. However, if the script comes from another domain (which it often does), you can't retrieve it because of the cross-domain policy.
If you're using Python already, it might just be easier to use Python libraries to parse the HTML and pull out the body of the tags.
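The inline/external split can be sketched roughly as follows. The function name collectScripts is hypothetical, and the sketch assumes each element exposes src and textContent, as script elements in an HTML page do. In an overlay extension, document usually refers to the browser window's XUL document rather than the web page, which may be why the log shows xulelement objects; the page itself is typically reached through something like content.document, though that depends on how the extension is set up.

```javascript
// Split script elements into inline code and external URLs.
// Accepts any array-like of objects with `src` and `textContent`,
// such as the result of getElementsByTagName('script') on a page document.
function collectScripts(scripts) {
  var inline = [];
  var external = [];
  for (var i = 0; i < scripts.length; i++) {
    var el = scripts[i];
    if (el.src) {
      // External script: only the URL is known here; the source
      // would have to be fetched separately.
      external.push(el.src);
    } else if (el.textContent && el.textContent.trim()) {
      // Inline script: the code is the element's text content.
      inline.push(el.textContent);
    }
  }
  return { inline: inline, external: external };
}

// Usage with plain objects standing in for DOM elements:
var result = collectScripts([
  { src: 'http://example.com/lib.js', textContent: '' },
  { src: '', textContent: 'alert(1);' }
]);
// result.inline holds ['alert(1);']
// result.external holds ['http://example.com/lib.js']
```

Each string in inline could be passed to the classifier directly, while the URLs in external still need a separate fetch.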