Feasibility of machine learning techniques for network intrusion detection

Posted 2024-12-12 21:32:42

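Is there a machine learning concept (an algorithm or a multi-classifier system) that can detect variance in network attacks, or at least attempt to?

One of the biggest problems for signature-based intrusion detection systems is the inability to detect new or variant attacks.

From what I have read, anomaly detection still seems to be a statistics-based approach: it refers to detecting patterns in a given data set, which is not the same as detecting variations in packet payloads. An anomaly-based NIDS monitors network traffic and compares it against an established baseline of a normal traffic profile. The baseline describes what is "normal" for the network, such as normal bandwidth usage, the protocols commonly used, and the correct combinations of port numbers and devices.

Say someone uses Virus A to spread through a network, and someone then writes a rule to stop Virus A. Another person then writes a "variation" of Virus A, called Virus B, purely to evade that initial rule, while still using most if not all of the same tactics and code. Is there no way to detect the variance?

If there is an umbrella term for this, what would it fall under? I had been under the impression that anomaly detection was it.

Can machine learning be used for pattern recognition (rather than pattern matching) at the packet payload level?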

落叶缤纷 2024-12-19 21:32:42

I think your intuition to look at machine learning techniques is correct, or will turn out to be correct. (One of the biggest problems for signature-based intrusion detection systems is the inability to detect new or variant attacks.) The superior performance of ML techniques is in general due to the ability of these algorithms to generalize (a multiplicity of soft constraints rather than a few hard constraints) and to adapt (updates based on new training instances to frustrate simple countermeasures), two attributes that I would imagine are crucial for identifying network attacks.
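To make the "soft constraints rather than hard constraints" point concrete, here is a minimal sketch using the Virus A / Virus B scenario from the question. It is not a real detector and not something taken from the literature cited below: the payloads, the signature, and the helper names (`byte_ngrams`, `jaccard`, `similarity_to_known_attack`) are all invented for illustration, and byte n-gram Jaccard similarity is just one simple stand-in for a learned similarity measure.

```python
def byte_ngrams(payload: bytes, n: int = 3) -> set[bytes]:
    """Return the set of all length-n byte substrings of the payload."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared n-grams divided by all distinct n-grams."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical payloads, invented purely for illustration.
VIRUS_A = b"GET /index.php?cmd=exec&payload=AAAA_download_and_run_stage2"
VIRUS_B = b"GET /index.php?cmd=exec&payload=BBBB_download_and_run_stage2"
BENIGN = b"GET /images/logo.png HTTP/1.1 Host: example.com Accept: image/png"

# Hard constraint: an exact byte pattern written after seeing Virus A.
SIGNATURE = b"payload=AAAA"


def signature_match(payload: bytes) -> bool:
    """Classic signature rule: fires only on an exact substring match."""
    return SIGNATURE in payload


def similarity_to_known_attack(payload: bytes) -> float:
    """Soft score in [0, 1]: how much the payload resembles Virus A."""
    return jaccard(byte_ngrams(payload), byte_ngrams(VIRUS_A))


print(signature_match(VIRUS_A))  # True
print(signature_match(VIRUS_B))  # False: the variant evades the rule
print(similarity_to_known_attack(VIRUS_B))
print(similarity_to_known_attack(BENIGN))
```

The exact rule misses the variant entirely, while the similarity score for Virus B stays high and the score for the benign request stays low. Where to put the decision threshold is a tuning question that this sketch does not answer.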

The theoretical promise aside, there are practical difficulties with applying ML techniques to problems like the one described in the question. By far the most significant is the difficulty of gathering data to train the classifier. In particular, reliably labeling data points as "intrusion" is probably not easy; likewise, my guess is that these instances are sparsely distributed in the raw data.

I suppose it's this limitation that has led to the increased interest (as evidenced at least by the published literature) in applying unsupervised ML techniques to problems like network intrusion detection.

Unsupervised techniques differ from supervised techniques in that the data is fed to the algorithm without a response variable (i.e., without class labels). In these cases you rely on the algorithm to discern structure in the data, i.e., some inherent ordering of the data into reasonably stable groups or clusters (possibly what you had in mind by "variance"). So with an unsupervised technique there is no need to explicitly show the algorithm instances of each class, nor is it necessary to establish baseline measurements, etc.

The most frequently used unsupervised ML technique applied to problems of this type is probably the Kohonen Map (also sometimes called a self-organizing map, or SOM).

I use Kohonen Maps frequently, but so far not for this purpose. There are, however, numerous published reports of their successful application in your domain of interest, e.g.,

Dynamic Intrusion Detection Using Self-Organizing Maps

Multiple Self-Organizing Maps for Intrusion Detection

I know of at least one available MATLAB implementation of the Kohonen Map: the SOM Toolbox. The homepage for this Toolbox also contains a brief introduction to Kohonen Maps.
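For readers without MATLAB, the idea behind a Kohonen Map can be sketched in a few dozen lines of plain Python. This is only a rough illustration under stated assumptions, not the SOM Toolbox API and not the method of either paper above: the two-dimensional "traffic features" are synthetic, and the grid size, learning-rate schedule, and every function name (`train_som`, `best_matching_unit`, `quantization_error`) are choices made for this example. The map is trained on unlabeled points only, and the distance from a new point to its closest unit is then used as a rough "how unusual is this" score.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d2 = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d2 = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d2 < best_d2:
                best, best_d2 = (r, c), d2
    return best


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM on unlabeled data; returns weights[r][c]."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighborhood shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Units near the winner on the grid are pulled harder.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(weights, x):
    """Distance from x to its best-matching unit; large means unusual."""
    r, c = best_matching_unit(weights, x)
    return math.dist(weights[r][c], x)


# Synthetic, unlabeled "traffic" with two normal modes (assumed features,
# already scaled to roughly [0, 1]); no real capture data is used here.
data_rng = random.Random(1)


def make_cluster(cx, cy, n):
    return [(data_rng.gauss(cx, 0.03), data_rng.gauss(cy, 0.03)) for _ in range(n)]


normal = make_cluster(0.2, 0.2, 100) + make_cluster(0.7, 0.3, 100)
som = train_som(normal)

normal_scores = [quantization_error(som, x) for x in normal]
anomaly = (0.95, 0.95)  # a point far from both normal modes
anomaly_score = quantization_error(som, anomaly)
print(max(normal_scores), anomaly_score)
```

In this toy setting the outlying point scores well above anything seen during training. On real traffic the hard parts are the ones mentioned earlier: choosing features, and the possibility that the unlabeled training data already contains attacks.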
