How to efficiently check whether two characters are neighbours on the keyboard?
I want to develop a soft keyboard for Android and already have an autocorrect algorithm that makes suggestions based on whether the input character and the corresponding character of a dictionary word are neighbours on the keyboard. This works in combination with the Levenshtein algorithm (if a character has to be replaced with a different one, it checks whether the two are neighbours). That is why this check is called very frequently. Currently, it accounts for 50% of the time spent on autocorrection.
My current approach is a separate trie with 3 layers. First layer: the first character. Second layer: the second character. Third layer: a boolean holding whether the two characters are neighbours. But I'm afraid a trie is overkill. The internal hash maps for each node's children may slow it down as well. Should I build a hash map with my own charToNumber function?
How would you do this? Which bottlenecks can be avoided? Character.toLowerCase() also seems inefficient when it is called every time a check is performed.
I hope you can help me speed up the task :)
Comments (4)
You just want to determine whether two characters are next to each other on the keyboard? Why not use a map from a character to a set of adjacent characters? With efficient data structures you get O(1) time: use an array for the map (continuous key space: the ASCII codes of the keys) and a BitSet for the set of adjacent keys. This is also very compact. Here is some sample code:
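A minimal sketch of this approach (not the answerer's original code), assuming lowercase ASCII keys only. The class name `KeyAdjacency` and the `"key:neighbours"` entry format are hypothetical, chosen just for illustration:

```java
import java.util.BitSet;

/** Adjacency lookup: an array indexed by character code, each slot a BitSet of neighbours. */
final class KeyAdjacency {
    private static final int TABLE_SIZE = 128; // assumption: ASCII keys only
    private final BitSet[] neighbours = new BitSet[TABLE_SIZE];

    KeyAdjacency() {
        for (int i = 0; i < TABLE_SIZE; i++) {
            neighbours[i] = new BitSet(TABLE_SIZE);
        }
    }

    /** Marks a and b as adjacent in both directions. Both must be ASCII characters. */
    void link(char a, char b) {
        neighbours[a].set(b);
        neighbours[b].set(a);
    }

    /** One array index plus one bit test; characters outside the table are never neighbours. */
    boolean isNeighbour(char a, char b) {
        return a < TABLE_SIZE && b < TABLE_SIZE && neighbours[a].get(b);
    }

    /** Builds a table from entries such as "s:awedxz" (hypothetical config format). */
    static KeyAdjacency fromEntries(String... entries) {
        KeyAdjacency table = new KeyAdjacency();
        for (String entry : entries) {
            char key = entry.charAt(0);
            // Index 1 is the ':' separator; neighbours start at index 2.
            for (int i = 2; i < entry.length(); i++) {
                table.link(key, entry.charAt(i));
            }
        }
        return table;
    }
}
```

To avoid calling Character.toLowerCase() on every check, one option is to also link the upper-case variants once when the table is built, so the lookup itself never has to normalize case.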
This should be very efficient: no loops and no complicated computations like hashCode. Of course, you have to initialize the table manually; I would advise doing this once at application startup from some external configuration file. BTW, neat idea!
I really like the idea.
For raw speed, you would use a massive switch statement. The code would be large, but there would be nothing faster.

Here's a "standard" way to do it that should still perform well:
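Both ideas can be sketched as follows (again, not the answerer's original code). The class name `NeighbourLookup` is hypothetical, and the neighbour table assumes a staggered QWERTY letter layout with lowercase input. The switch variant is deliberately shown for only two keys:

```java
final class NeighbourLookup {
    // Assumed QWERTY neighbours, indexed by (letter - 'a').
    // Every pair is stored twice, once per direction.
    private static final String[] NEIGHBOURS = {
        "qwsz",   // a
        "vghn",   // b
        "xdfv",   // c
        "serfcx", // d
        "wsdr",   // e
        "drtgvc", // f
        "ftyhbv", // g
        "gyujnb", // h
        "ujko",   // i
        "huikmn", // j
        "jiolm",  // k
        "kop",    // l
        "njk",    // m
        "bhjm",   // n
        "iklp",   // o
        "ol",     // p
        "wa",     // q
        "edft",   // r
        "awedxz", // s
        "rfgy",   // t
        "yhji",   // u
        "cfgb",   // v
        "qase",   // w
        "zsdc",   // x
        "tghu",   // y
        "asx",    // z
    };

    /** Table lookup: index by the first letter, then scan its short neighbour string. */
    static boolean isNeighbour(char a, char b) {
        if (a < 'a' || a > 'z') {
            return false;
        }
        return NEIGHBOURS[a - 'a'].indexOf(b) >= 0;
    }

    /** Switch variant, written out for two keys only; a full version needs one case per key. */
    static boolean isNeighbourSwitch(char a, char b) {
        switch (a) {
            case 'q': return b == 'w' || b == 'a';
            case 'w': return b == 'q' || b == 'a' || b == 's' || b == 'e';
            default:  return false; // remaining keys omitted in this sketch
        }
    }
}
```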
This algorithm does not make use of the fact that if a isneighbour b then b isneighbour a, but rather sacrifices data size for code simplicity.
What about assigning numbers to each key and using them to determine proximity?
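One way this could look, as a sketch under stated assumptions: each letter gets a row number and a horizontal position measured in half key widths, with each row shifted half a key to the right of the row above it. The class name `KeyGrid` and the stagger model are assumptions, not taken from the answer:

```java
final class KeyGrid {
    // Assumed QWERTY letter rows.
    private static final String[] ROWS = {"qwertyuiop", "asdfghjkl", "zxcvbnm"};
    private static final int[] ROW = new int[26]; // row number per letter
    private static final int[] X = new int[26];   // horizontal position in half key widths

    static {
        for (int r = 0; r < ROWS.length; r++) {
            for (int c = 0; c < ROWS[r].length(); c++) {
                int k = ROWS[r].charAt(c) - 'a';
                ROW[k] = r;
                X[k] = 2 * c + r; // assumption: each row is offset by half a key
            }
        }
    }

    private static boolean isLetter(char c) {
        return c >= 'a' && c <= 'z';
    }

    /** Neighbours are one key apart in the same row, or half a key apart in an adjacent row. */
    static boolean isNeighbour(char a, char b) {
        if (!isLetter(a) || !isLetter(b)) {
            return false;
        }
        int dRow = Math.abs(ROW[a - 'a'] - ROW[b - 'a']);
        int dX = Math.abs(X[a - 'a'] - X[b - 'a']);
        return (dRow == 0 && dX == 2) || (dRow == 1 && dX == 1);
    }
}
```

Because the positions are plain numbers, the same two arrays could also yield a graded distance instead of a yes/no answer, if the autocorrect wants to weight near misses differently.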
Partial Output:
Here is my Hungarian version (if somebody needs it):