How can I detect whether one character is close to another on a QWERTY keyboard?
I'm developing a spam detection system and have been alerted that it can't detect strings like "asdfsdf".
My solution involves detecting whether consecutive keys are near each other on the keyboard. I'm not reading the input (to detect spam from) from the keyboard; I receive it as a string.
All I want to know is whether a character is one key, two keys, or more than two keys away from another character.
For example, on a modern QWERTY keyboard, the characters 'q' and 'w' are 1 key apart. So are the characters 'q' and 's'. Humans can figure this out logically; how could I do this in code?
Comments (5)
You could simply create a two-dimensional map for the standard QWERTY keyboard.
Basically, each key is stored at the column (x) and row (y) where it sits on the keyboard, so 'q' is at (0, 0), 'w' at (1, 0), 's' at (1, 1), and so on.
When you get two characters, you simply need to find their x and y in the map and calculate the distance using the Pythagorean theorem. This does not quite meet your requirement that 'q' and 's' be 1 key apart; their distance comes out as sqrt(1^2 + 1^2), approximately 1.4.
The formula would be: dist = sqrt((x2 - x1)^2 + (y2 - y1)^2)
For example:
Say you get the characters c1 = 'q' and c2 = 'w'. Examine the map and find that 'q' has coordinates (x1, y1) = (0, 0) and 'w' has coordinates (x2, y2) = (1, 0). The distance is sqrt((1 - 0)^2 + (0 - 0)^2) = 1.
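A minimal Python sketch of this idea, assuming only the three letter rows and ignoring the physical stagger between rows. The names `KEYBOARD`, `find_key`, and `key_distance` are illustrative and do not come from the original answer:

```python
import math

# Letter rows of a standard QWERTY keyboard; row index is y, column index is x.
# Assumption: rows are treated as perfectly aligned (no stagger).
KEYBOARD = [
    "qwertyuiop",
    "asdfghjkl",
    "zxcvbnm",
]


def find_key(ch):
    """Return (x, y) of a single character in KEYBOARD, or raise ValueError."""
    ch = ch.lower()
    for y, row in enumerate(KEYBOARD):
        x = row.find(ch)
        if x != -1:
            return x, y
    raise ValueError(f"{ch!r} is not on the map")


def key_distance(c1, c2):
    """Euclidean (Pythagorean) distance between two keys."""
    x1, y1 = find_key(c1)
    x2, y2 = find_key(c2)
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


print(key_distance("q", "w"))            # 1.0
print(round(key_distance("q", "s"), 2))  # 1.41
```

If diagonal neighbours such as 'q' and 's' should count as exactly one key apart, one option is to replace the square root with `max(abs(x2 - x1), abs(y2 - y1))` (Chebyshev distance).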
Well, let's see. That's a tough one. I always take the brute-force method and stay away from advanced concepts like the ones that Pythagoras guy tried to foist on us, so how about a two-dimensional table? Something like this, maybe: one row and one column per letter, with each cell holding how far apart the two keys are.
Could that work for ya? You could even use negative numbers to show that one key is to the left of the other. PLUS you could put a 2-integer struct in each cell, where the second int is positive or negative to show that the second letter is up or down from the first. Get my patent attorney on the phone, quick!
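One way such a brute-force table might be generated rather than typed by hand, sketched in Python. The layout, the names `ROWS`, `POSITION`, and `OFFSET`, and the sign convention (negative dx means left, negative dy means a row above) are all assumptions for illustration:

```python
# Assumed layout: three aligned letter rows (no stagger).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# (column, row) for every letter.
POSITION = {ch: (x, y) for y, row in enumerate(ROWS) for x, ch in enumerate(row)}

# Brute-force table: OFFSET[a][b] is (dx, dy) going from key a to key b.
# dx < 0 means b is to the left of a; dy < 0 means b is on a row above a.
OFFSET = {
    a: {b: (bx - ax, by - ay) for b, (bx, by) in POSITION.items()}
    for a, (ax, ay) in POSITION.items()
}

print(OFFSET["w"]["q"])  # (-1, 0)
print(OFFSET["q"]["s"])  # (1, 1)
```

The table has 26 x 26 entries, so lookups need no arithmetic at all, at the cost of storing every pair.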
Build a map from keys to positions on an idealized keyboard, where each key gets a coordinate pair.
Then you can take the "distance" as the mathematical distance between the two points.
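A possible Python sketch of an idealized keyboard that also models row stagger. The offsets (0, 0.25, 0.75 key widths) and the bucket cutoffs (1.75 and 2.75) are assumptions chosen for illustration, with the cutoffs picked so that 'q' and 's' land in the "one key" bucket as the question expects:

```python
import math

# Assumed horizontal offset of each letter row, in key widths, to mimic the
# stagger of a typical physical keyboard. These values are illustrative.
ROWS = [
    (0.0, "qwertyuiop"),
    (0.25, "asdfghjkl"),
    (0.75, "zxcvbnm"),
]

# Map each key to an (x, y) point on the idealized keyboard.
POSITIONS = {
    ch: (offset + col, float(row))
    for row, (offset, keys) in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def distance(c1, c2):
    """Straight-line distance between two keys, in key widths."""
    return math.dist(POSITIONS[c1], POSITIONS[c2])


def bucket(c1, c2):
    """Classify a pair as 1, 2, or 3 (meaning more than two) keys apart.

    The cutoffs are arbitrary assumptions, not part of the original answer.
    """
    d = distance(c1, c2)
    if d < 1.75:
        return 1
    if d < 2.75:
        return 2
    return 3


print(bucket("q", "w"), bucket("q", "s"), bucket("q", "e"), bucket("q", "p"))
# 1 1 2 3
```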
The basic idea is to create a map of characters and their positions on the keyboard. You can then use a simple distance formula to determine how close together they are.
For example, consider the left side of the keyboard, with each key addressed as [row, column].
Character a has the position [2, 0] and character b has the position [3, 4]. The formula for their distance apart is sqrt((x2 - x1)^2 + (y2 - y1)^2). So the distance between a and b is sqrt((4 - 0)^2 + (3 - 2)^2) = sqrt(17), about 4.12.
It'll take you a little bit of effort to map the keys into a rectangular grid (my example isn't perfect, but it gives you the idea). But after that you can build a map (or dictionary), and lookup is simple and fast.
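The dictionary approach could be sketched in Python as follows. The original example grid is not shown, so `GRID` here is a guess that merely agrees with a = [2, 0] and b = [3, 4]; `POSITIONS` and `distance` are likewise illustrative names:

```python
import math

# Assumed grid for the left side of the keyboard; chosen only because it is
# consistent with the positions a = [2, 0] and b = [3, 4] given in the text.
GRID = [
    "12345",
    "qwert",
    "asdfg",
    "zxcvb",
]

# Dictionary from character to (row, column), built once for fast lookup.
POSITIONS = {
    ch: (r, c) for r, line in enumerate(GRID) for c, ch in enumerate(line)
}


def distance(k1, k2):
    """Euclidean distance between two keys on the rectangular grid."""
    (r1, c1), (r2, c2) = POSITIONS[k1], POSITIONS[k2]
    return math.sqrt((c2 - c1) ** 2 + (r2 - r1) ** 2)


print(POSITIONS["a"], POSITIONS["b"])  # (2, 0) (3, 4)
print(round(distance("a", "b"), 2))    # 4.12
```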
I developed a function for the same purpose in PHP because I wanted to see whether I could use it to analyse strings and figure out whether they're likely to be spam.
This is for the QWERTZ keyboard, but it can easily be changed. The first number in the array $keys is the approximate distance from the left and the second is the row number from the top.
I used multibyte functions because I was thinking about extending it to other characters. One could also extend it by checking the case of characters.
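The PHP function itself is not reproduced here. As a rough illustration of the same idea in Python (not the author's code), the sketch below scores a string by the average distance between consecutive keys on a QWERTZ layout. The names `ROWS`, `KEYS`, and `mean_key_distance`, the omission of umlauts, and the use of whole-number column positions are all assumptions:

```python
import math

# QWERTZ letter rows (German layout without umlauts, an assumption made to
# keep the sketch short). Positions are (distance from left, row from top).
ROWS = ["qwertzuiop", "asdfghjkl", "yxcvbnm"]
KEYS = {ch: (x, y) for y, row in enumerate(ROWS) for x, ch in enumerate(row)}


def mean_key_distance(text):
    """Average distance between consecutive mapped characters, or None.

    Characters missing from KEYS (digits, spaces, punctuation) are skipped.
    """
    chars = [ch for ch in text.lower() if ch in KEYS]
    if len(chars) < 2:
        return None
    steps = [math.dist(KEYS[a], KEYS[b]) for a, b in zip(chars, chars[1:])]
    return sum(steps) / len(steps)


print(mean_key_distance("asdfsdf"))
print(mean_key_distance("keyboard"))
```

A low average suggests keys mashed next to each other, as in "asdfsdf"; where to draw the line between spam and ordinary text is something that would have to be tuned on real data.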