T9类型字典背后的数据结构

发布于 2024-08-27 10:35:39 字数 136 浏览 8 评论 0原文

T9 词典如何工作?其背后的数据结构是怎样的。如果我们输入“4663”,当我们按下按钮时,我们会得到“gone”,然后是“home”等...

编辑:如果用户输入46,那么它应该显示“go”,当按下向下箭头时应该显示“go”显示“消失”等...

How does a T9 dictionary work? What is the data structure behind it. If we type '4663' we get 'good' when we press down button we get 'gone' then 'home' etc...

EDIT: If the user types in 46 then it should show 'go' and when pressed down arrow should show 'gone' etc...

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

天煞孤星 2024-09-03 10:35:39

它可以通过多种方式实现,其中之一是 Trie。路径由数字表示,节点指向单词的集合。

它也可以使用嵌套哈希表来实现,哈希表的键是一个字母和对于每个数字,算法都会计算所有可能的路线(O(3^n) 路线)。

It can be implemented in several ways, one of them is Trie. The route is represented by the digits and the nodes point to collection of words.

It can be implemented using nested hash tables as well, the key of the hash table is a letter and on every digit the algorithm calculates all possible routes (O(3^n) routes).

吹梦到西洲 2024-09-03 10:35:39
4663

翻译为

{G,H,I}{M,N,O}{M,N,O}{D,E,F}

T9 的工作原理是从第一个可能的字母开始按顺序过滤可能性。因此,示例中的第一步是将字典列表过滤为以 G、H 或 I 开头的所有单词。下一步,获取该列表并按 M、N、O 过滤第二个字母。依此类推...

4663

translates to

{G,H,I}{M,N,O}{M,N,O}{D,E,F}

T9 works by filtering the possibilities down sequentially starting with the first possible letters. So the first step in your example will be to filter the dictionary list to all words beginning with G, H, or I. Next step, take that list and filter the second letters by M, N, O. And so on...

红颜悴 2024-09-03 10:35:39

我猜想,就像之前的 T9 使用 trie 一样,其中链接由位图表示(每个字母 1 位)。这被称为简洁的数据结构,正如 Steve Hanov 很好地解释的那样:

http://stevehanov。 ca/blog/index.php?id=120

I guess, as those before that T9 uses a trie, where the links are represented by a bitmap (1 bit per letter). This is called a succinct data structure, as Steve Hanov explains very nicely:

http://stevehanov.ca/blog/index.php?id=120

蓝海 2024-09-03 10:35:39

我认为 T9 使用的是基于位图的 Trie。第一层有一个 32 位(我假设它们扩展为 32)字,其中每个位代表一个字母。所有有效作为单词起始字母的字母都已设置其位。

在下一个级别中,对于上一个级别中设置的每个位都有一个 32 位映射。在这些映射中,如果相应的字母作为单词中的第二个字母有效,则设置每个位,而起始字母由第一级决定。

然后该结构继续。使用每个字母一位可以实现非常高效的存储。然而,寻址方案必定具有挑战性。还可能存在对短单词使用上述模式而对较长单词使用其他模式的某种优化。

I think that T9 is using a bit-map-based Trie. On the first level there is a 32-bit (I assume they expanded to 32) word where each bit represents a letter. All letters that are valid as start letters for a word have their bit set.

In the next level there is one 32-bit map for each of those bits that were set in the previous level. In these maps each bit is set if the corresponding letter is valid as a second letter in a word, with the starting letter decided by the first level.

The structure then continues. Using one-bit-per-letter makes a very efficient storage. The addressing scheme however must be challenging. There might also be some kind of optimization using the above schema for short words while using something else for longer words.

弃爱 2024-09-03 10:35:39

使用 Trie 的 C# 实现

        public interface ICellT9
        {
            void Add(string a_name);

            List<string> GetNames(string a_number);
        }

        public class Cell : ICellT9
        {
            private Dictionary<int, Cell> m_nodeHolder;
            private List<string> m_nameList;

            public Cell()
            {
                m_nameList = new List<string>();
                m_nodeHolder = new Dictionary<int, Cell>();

                for (int i = 2; i < 10; i++)
                {
                    m_nodeHolder.Add(i, null);
                }
            }

            public void Add(string a_name)
            {
                Add(a_name, a_name);
            }

            private void Add(string a_name, string a_originalName)
            {
                if (string.IsNullOrEmpty(a_name) && 
                   (string.IsNullOrEmpty(a_originalName) == false))
                {
                    m_nameList.Add(a_originalName);
                }
                else
                {
                    int l_firstNumber = CharToNumber(a_name[0].ToString());

                    if (m_nodeHolder[l_firstNumber] == null)
                    {
                        m_nodeHolder[l_firstNumber] = new Cell();
                    }

                    m_nodeHolder[l_firstNumber].Add(a_name.Remove(0, 1), a_originalName);
                }
            }

            public List<string> GetNames(string a_number)
            {
                List<string> l_result = null;

                if (string.IsNullOrEmpty(a_number))
                {
                    return l_result;
                }

                int l_firstNumber = a_number[0] - '0';

                if (a_number.Length == 1)
                {
                    l_result = m_nodeHolder[l_firstNumber].m_nameList;
                }
                else if(m_nodeHolder[l_firstNumber] != null)
                {
                    l_result = m_nodeHolder[l_firstNumber].GetNames(a_number.Remove(0, 1));
                }

                return l_result;

            }

            private int CharToNumber(string c)
            {
                int l_result = 0;

                if (c == "a" || c == "b" || c == "c")
                {
                    l_result = 2;
                }
                else if (c == "d" || c == "e" || c == "f")
                {
                    l_result = 3;
                }
                else if (c == "g" || c == "h" || c == "i")
                {
                    l_result = 4;
                }
                else if (c == "j" || c == "k" || c == "l")
                {
                    l_result = 5;
                }
                else if (c == "m" || c == "n" || c == "o")
                {
                    l_result = 6;
                }
                else if (c == "p" || c == "q" || c == "r" || c == "s")
                {
                    l_result = 7;
                }
                else if (c == "t" || c == "u" || c == "v")
                {
                    l_result = 8;
                }
                else if (c == "w" || c == "x" || c == "y" || c == "z")
                {
                    l_result = 9;
                }

                return l_result;
            }
        }

C# implementation using Trie

        public interface ICellT9
        {
            void Add(string a_name);

            List<string> GetNames(string a_number);
        }

        public class Cell : ICellT9
        {
            private Dictionary<int, Cell> m_nodeHolder;
            private List<string> m_nameList;

            public Cell()
            {
                m_nameList = new List<string>();
                m_nodeHolder = new Dictionary<int, Cell>();

                for (int i = 2; i < 10; i++)
                {
                    m_nodeHolder.Add(i, null);
                }
            }

            public void Add(string a_name)
            {
                Add(a_name, a_name);
            }

            private void Add(string a_name, string a_originalName)
            {
                if (string.IsNullOrEmpty(a_name) && 
                   (string.IsNullOrEmpty(a_originalName) == false))
                {
                    m_nameList.Add(a_originalName);
                }
                else
                {
                    int l_firstNumber = CharToNumber(a_name[0].ToString());

                    if (m_nodeHolder[l_firstNumber] == null)
                    {
                        m_nodeHolder[l_firstNumber] = new Cell();
                    }

                    m_nodeHolder[l_firstNumber].Add(a_name.Remove(0, 1), a_originalName);
                }
            }

            public List<string> GetNames(string a_number)
            {
                List<string> l_result = null;

                if (string.IsNullOrEmpty(a_number))
                {
                    return l_result;
                }

                int l_firstNumber = a_number[0] - '0';

                if (a_number.Length == 1)
                {
                    l_result = m_nodeHolder[l_firstNumber].m_nameList;
                }
                else if(m_nodeHolder[l_firstNumber] != null)
                {
                    l_result = m_nodeHolder[l_firstNumber].GetNames(a_number.Remove(0, 1));
                }

                return l_result;

            }

            private int CharToNumber(string c)
            {
                int l_result = 0;

                if (c == "a" || c == "b" || c == "c")
                {
                    l_result = 2;
                }
                else if (c == "d" || c == "e" || c == "f")
                {
                    l_result = 3;
                }
                else if (c == "g" || c == "h" || c == "i")
                {
                    l_result = 4;
                }
                else if (c == "j" || c == "k" || c == "l")
                {
                    l_result = 5;
                }
                else if (c == "m" || c == "n" || c == "o")
                {
                    l_result = 6;
                }
                else if (c == "p" || c == "q" || c == "r" || c == "s")
                {
                    l_result = 7;
                }
                else if (c == "t" || c == "u" || c == "v")
                {
                    l_result = 8;
                }
                else if (c == "w" || c == "x" || c == "y" || c == "z")
                {
                    l_result = 9;
                }

                return l_result;
            }
        }
寻找一个思念的角度 2024-09-03 10:35:39

我可能会从一本字典开始,然后(对于每个单词)将单词推到一个列表中,该列表由可以形成该单词的数字键入。

然后,您可能需要以某种方式对结果列表进行排序,以便更可能的单词出现在不太可能的单词之前(如果您有空间,我还会为计数器添加一个小区域,这样我们就可以逐步重新排序)对这些进行排序或简单地将最后使用的内容移至建议列表的前面,因此随着时间的推移,我们倾向于为用户提供更好的答案)。

这样,当您有 4663 作为输入时,您可以通过大约 O(1) 的哈希表查找来非常简单地检索相关链表。

I'd probably do it by starting with a dictionary, then (for each word) push the word onto a list keyed by the numbers that could form the word.

Then, you'll probably need to sort the resulting lists in some fashion, so more likely words appear before less likely words (if you have the space, I'd also include a small area for a counter, so we can incrementally re-sort these OR simply mov ethe last used to the front of the suggestion list, so we, over time, tend towards giving the user a better answer).

That way, when you have 4663 as an input, you can quite simply retrieve the relevant linked list wit ha roughly O(1) hash table lookup.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文