A simple/practical example of the fuzzy c-means algorithm
I am writing my master's thesis on dynamic keystroke authentication. To support the ongoing research, I am writing code to test different methods of feature extraction and feature matching.
My current simple approach checks whether the reference password keycodes match the currently typed keycodes, and whether the key-press times (dwell) and the key-to-key times (flight) are within +/- 100 ms (tolerance) of the reference times. This is of course very limited, and I want to extend it with some sort of fuzzy c-means pattern matching.
For each key, the features are: keycode, dwell time, flight time (the first flight time is always 0).
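For reference, the tolerance check described above could be sketched roughly as follows. This is only an illustration of the baseline, not the actual thesis code: the `Keystroke` tuple layout and the function name `matches_with_tolerance` are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical layout for one key: (keycode, dwell_ms, flight_ms).
# The first flight time of a password is always 0.
Keystroke = Tuple[int, float, float]


def matches_with_tolerance(reference: List[Keystroke],
                           attempt: List[Keystroke],
                           tolerance_ms: float = 100.0) -> bool:
    """Return True if keycodes match exactly and every timing is within tolerance."""
    if len(reference) != len(attempt):
        return False
    for (ref_code, ref_dwell, ref_flight), (code, dwell, flight) in zip(reference, attempt):
        if code != ref_code:
            return False  # keycodes must be identical
        if abs(dwell - ref_dwell) > tolerance_ms:
            return False  # dwell time too far from the reference
        if abs(flight - ref_flight) > tolerance_ms:
            return False  # flight time too far from the reference
    return True
```

Every timing is either accepted or rejected outright here, which is the limitation the question is about.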
Obviously the keycodes can be taken out of the fuzzy algorithm because they have to be exactly the same.
In this context, what would a practical implementation of fuzzy c-means look like?
Comments (2)
Generally, you would do the following:
I'm not an expert, but this seems like an odd approach to determining whether a login attempt is authentic. I've seen FCM used for pattern recognition (e.g. which facial expression am I making?), which makes sense because you're dealing with several categories (e.g. happy, sad, angry) that each have defining characteristics. In your case, you really only have one category (authentic) with defining characteristics. Non-authentic keystrokes are simply "not like" authentic keystrokes, so they won't cluster.
Perhaps I am missing something?
I don't think you really want to do clustering here. You might want to do some proper fuzzy matching though instead of just allowing some delta on each value.
For clustering, you need to have many data points. Additionally, you'd need to know in advance how many clusters (the "c" in c-means) to ask for.
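To make those two requirements concrete, here is a minimal pure-Python sketch of the standard fuzzy c-means iteration. It assumes each data point is a numeric vector, for example a hypothetical `(dwell_ms, flight_ms)` pair collected over many typings of the same key. The function name, parameter defaults, and random initialisation are choices made for this example only.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iterations=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for `points` using c clusters and fuzzifier m > 1."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iterations):
        # Center update: mean of all points, weighted by membership ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append([sum(w * p[d] for w, p in zip(weights, points)) / wsum
                            for d in range(dim)])

        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            new_row = [1.0 / sum((dists[j] / dists[k]) ** (2.0 / (m - 1.0))
                                 for k in range(c))
                       for j in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break  # memberships have stopped changing
    return centers, u
```

Note that the caller has to supply both a whole set of `points` and the cluster count `c`; a single login attempt provides neither.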
But what are these multiple objects meant to be? You have one data point for every keycode. You don't want to have the user type the password 100 times to see if he can do it consistently. And even then, what do you expect the clusters to be? You already know which keycode comes at which position; you don't want to find out which keycodes the user uses for his password...
Sorry, I really don't see any clustering here. The term "fuzzy" seems to have misled you to this clustering algorithm. Try "fuzzy logic" instead.
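As a rough illustration of what fuzzy matching could mean here, the sketch below replaces the hard +/- tolerance with a degree of match between 0 and 1 for each timing and averages those degrees into one score. The triangular membership shape, the 100 ms width, the averaging, and all names are assumptions made for the example, not an established method for keystroke authentication.

```python
def triangular_membership(value, reference, width):
    """1.0 at the reference, falling linearly to 0.0 at reference +/- width."""
    return max(0.0, 1.0 - abs(value - reference) / width)


def fuzzy_match_score(reference, attempt, width_ms=100.0):
    """Score in [0, 1] for two lists of (keycode, dwell_ms, flight_ms) tuples."""
    if len(reference) != len(attempt):
        return 0.0
    degrees = []
    for (ref_code, ref_dwell, ref_flight), (code, dwell, flight) in zip(reference, attempt):
        if code != ref_code:
            return 0.0  # keycodes stay a crisp, exact comparison
        degrees.append(triangular_membership(dwell, ref_dwell, width_ms))
        degrees.append(triangular_membership(flight, ref_flight, width_ms))
    # Average degree of match over all timings (one possible aggregation).
    return sum(degrees) / len(degrees) if degrees else 0.0


reference = [(80, 100.0, 0.0), (65, 120.0, 200.0)]
attempt = [(80, 150.0, 0.0), (65, 120.0, 200.0)]
print(fuzzy_match_score(reference, attempt))  # 0.875
```

The login decision then becomes a threshold on the score (say, accept above 0.8, a value that would need tuning on real data), so an attempt that is slightly off on one key is penalised gradually instead of being rejected outright.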