Determining the difficulty of typing a word on a QWERTY keyboard
I'm looking for a reasonably simple algorithm to determine how difficult it is to type a word on the QWERTY layout.
The words would not necessarily be dictionary words, so a list of commonly mistyped words or the like is not an option. I'm sure there must be an existing, well-tested algorithm, but I can't find anything.
Can anyone offer any help or advice? I'm coding the algorithm in Python, but any other language or pseudo-code is welcome.
Comments (4)
There is this comparison between QWERTY, Colemak and Dvorak layouts, which calculates the distance between the keys typed, the percentage of keys on the same hand, etc. with source code in Java. These metrics in combination should give a very good estimate of the 'typeability' of a word.
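The Java source of that comparison is not reproduced here, so the following is only an independent Python sketch of two of the metrics it mentions. The key coordinates, the row stagger values, the left/right hand split, and the function names `travel_distance` and `same_hand_ratio` are all assumptions made for illustration. "Same hand" is interpreted here as the share of consecutive letter pairs typed by one hand.

```python
import math

# Assumption: the three letter rows of a standard US QWERTY keyboard.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
# Assumption: horizontal stagger of each row, measured in key widths.
ROW_OFFSETS = [0.0, 0.25, 0.75]
# Assumption: conventional touch-typing split between the hands.
LEFT_HAND = set("qwertasdfgzxcvb")

# Map each letter to an (x, y) position in key-width units.
KEY_POS = {
    ch: (col + ROW_OFFSETS[row], float(row))
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def _letters(word):
    """Keep only characters that exist on the letter rows."""
    return [c for c in word.lower() if c in KEY_POS]


def travel_distance(word):
    """Sum of straight-line distances between consecutive keys."""
    letters = _letters(word)
    return sum(
        math.dist(KEY_POS[a], KEY_POS[b])
        for a, b in zip(letters, letters[1:])
    )


def same_hand_ratio(word):
    """Fraction of consecutive letter pairs typed by the same hand."""
    letters = _letters(word)
    pairs = list(zip(letters, letters[1:]))
    if not pairs:
        return 0.0
    same = sum((a in LEFT_HAND) == (b in LEFT_HAND) for a, b in pairs)
    return same / len(pairs)
```

How to weight the two numbers against each other is left open; a word with a long travel distance and a high same-hand ratio would presumably rank as harder.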
I don't have any algorithms to propose, but a few hints:
I use both hands to type, meaning that the keyboard is roughly split into two halves. I frequently have coordination issues between the two hands: each hand types its letters in the "right" order, but the interleaving is wrong. This is especially true if one hand has more letters to type than the other. A typical example is "the", because the left hand types 't' and 'e' and the right hand types 'h'.
"Slips" are frequent, meaning that one often misses the key and hits another key instead. "Additions" and "deletions" are frequent too, i.e. typing an extra key or not pushing hard enough. This means that (obviously) the more letters there are, the harder it is to get the word right.
Mixed case makes it harder. It requires synchronization between pushing CAPS and hitting the keys, so it's likely that nearby letters won't have the right upper/lower case.
Hope this helps...
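These hints can be turned into simple per-word features. The sketch below is only one possible reading of them: the hand split and the function name `typing_hints` are assumptions, and the features are raw counts, not a tested difficulty model.

```python
# Assumption: conventional touch-typing split of the QWERTY letter keys.
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm")
ALL_LETTERS = LEFT_HAND | RIGHT_HAND


def typing_hints(word):
    """Collect counts related to length, hand balance and case changes."""
    letters = [c for c in word if c.lower() in ALL_LETTERS]
    pairs = list(zip(letters, letters[1:]))
    left = sum(c.lower() in LEFT_HAND for c in letters)
    return {
        # Longer words leave more room for slips, additions and deletions.
        "length": len(letters),
        # An uneven split between hands may make interleaving errors likelier.
        "left": left,
        "right": len(letters) - left,
        # Each hand switch is a point where the two hands must be coordinated.
        "hand_switches": sum(
            (a.lower() in LEFT_HAND) != (b.lower() in LEFT_HAND)
            for a, b in pairs
        ),
        # Each case change needs the shift key to be synchronized.
        "case_switches": sum(a.isupper() != b.isupper() for a, b in pairs),
    }
```

For "the", this reports two left-hand letters, one right-hand letter, and two hand switches, which matches the interleaving example above.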
Take out your Scrabble set, note down the scores for each letter, total the scores for a word, hey presto you have your algorithm. Not sure it entirely satisfies your requirements, but it might point you in a useful direction. You might, for instance, want to assign scores not only to individual letters but also to di- and tri-grams.
I'm not aware of any existing source of the information you need. Perhaps you could come up with your own letter scores by examining the keyboard and assigning higher scores to the more difficult letters: so 1 for 'a', 8 for 'q', 2 for 'm', and so on.
EDIT: I seem to have confused people more than I usually do when I reply on SO. Here are the bare bones of my proposal:
a) List all trigrams and digrams which occur in English (or your language). To each of them assign a difficulty-of-typing score. Do the same for individual letters (after all a 4 letter word might be composed of a trigram and a letter rather than two digrams).
b) Score the difficulty of typing a word as the sum of the difficulty of typing its components.
As for the difficulty scores, I haven't a clue, but you could start from 1 for a letter on the home keys on a keyboard, 2 for a letter which uses the index fingers but is not a home key, 3 for a letter which uses the 2nd or 3rd fingers on your hand, and so on. Then for digrams, score low for easy letters on left and right (or right and left) in sequence, high for difficult letters on one hand in sequence (eg qz, though that's perhaps not valid for English). And on you go.
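A minimal sketch of this proposal follows, with several simplifications that are assumptions and not part of the answer: digram scores are computed from the letter scores instead of being listed by hand, trigrams are left out, a word is scored over its overlapping digrams, and the pinky score of 4 is one guess at what "and so on" could mean. The names `letter_score`, `digram_score` and `word_difficulty` are hypothetical.

```python
# Assumption: conventional touch-typing split between the hands.
LEFT_HAND = set("qwertasdfgzxcvb")

# Letter scores following the answer's starting point:
# 1 = home keys, 2 = index finger off the home keys,
# 3 = middle or ring finger off the home keys,
# 4 = little finger off the home keys (an assumed extension).
LETTER_SCORES = {}
for letters, score in [
    ("asdfjkl", 1),
    ("rtgvbyuhnm", 2),
    ("wexcio", 3),
    ("qzp", 4),
]:
    for ch in letters:
        LETTER_SCORES[ch] = score


def letter_score(ch):
    """Difficulty of a single letter."""
    return LETTER_SCORES[ch]


def digram_score(a, b):
    """Low for alternating hands, doubled when one hand types both letters."""
    base = letter_score(a) + letter_score(b)
    same_hand = (a in LEFT_HAND) == (b in LEFT_HAND)
    return base * 2 if same_hand else base


def word_difficulty(word):
    """Sum of the scores of the word's components (overlapping digrams)."""
    letters = [c for c in word.lower() if c in LETTER_SCORES]
    if len(letters) == 1:
        return letter_score(letters[0])
    return sum(digram_score(a, b) for a, b in zip(letters, letters[1:]))
```

With these numbers, "qz" (two little-finger letters on one hand) scores far higher than "al" (two home keys on alternating hands), which is the ordering the answer describes. The actual values would need tuning against real typing data.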
I think the Manhattan distance algorithm could be the closest to what you are looking for. It measures the distance of the target from the source in a grid-like fashion, i.e. as horizontal steps plus vertical steps.
As for an implementation in Python, for your specific need (difficulty on QWERTY) you will have to write one yourself. Otherwise, a few Manhattan distance implementations can be found if you google for "n puzzle solver in python".
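One way to adapt Manhattan distance to this problem is sketched below. It treats the letter rows as a plain grid and ignores the physical stagger between rows, which is an assumption; the names `manhattan` and `manhattan_travel` are made up for this example.

```python
# Assumption: letter rows of a standard US QWERTY keyboard, treated as
# an unstaggered grid of (row, column) cells.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_GRID = {
    ch: (row, col)
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def manhattan(a, b):
    """Grid distance between two keys: row steps plus column steps."""
    (r1, c1), (r2, c2) = KEY_GRID[a], KEY_GRID[b]
    return abs(r1 - r2) + abs(c1 - c2)


def manhattan_travel(word):
    """Total grid distance covered when typing the word key by key."""
    letters = [c for c in word.lower() if c in KEY_GRID]
    return sum(manhattan(a, b) for a, b in zip(letters, letters[1:]))
```

This measures finger travel as if a single pointer moved across the keyboard, so it does not account for the two hands working independently.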