Large text-file dictionary of random words for benchmarking purposes?
I was wondering if anyone could point me to a very, very large dictionary of random words that could be used to test some high-performance string data structures. I'm finding some in the ~2MB range, but I'd like something larger if possible. I'm guessing there has to be a large standard string dataset somewhere that could be used. Thanks!
Comments (2)
http://norvig.com/big.txt
The above link was mentioned in Norvig's spell checker article - http://norvig.com/spell-correct.html
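A corpus like big.txt is running prose rather than a word list, so it needs a small preprocessing step before it can feed a string data structure. Here is a minimal sketch of that step, assuming the file has already been downloaded to a local path. The function names `extract_words` and `build_word_list` are made up for illustration, and the tokenizer (lowercase ASCII letter runs only) is just one possible choice.

```python
import re
from pathlib import Path


def extract_words(text):
    """Return every run of ASCII letters in text, lowercased, in order of appearance."""
    return re.findall(r"[a-z]+", text.lower())


def build_word_list(corpus_path, out_path):
    """Write the unique words of a corpus to out_path, one per line.

    Returns the number of unique words written.
    corpus_path is assumed to be a local copy of a plain-text corpus.
    """
    text = Path(corpus_path).read_text(encoding="utf-8", errors="replace")
    words = sorted(set(extract_words(text)))
    Path(out_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    return len(words)
```

Something like `build_word_list("big.txt", "words.txt")` would then produce the benchmark input. Note that the unique-word list will be much smaller than the corpus itself, so for insert-heavy benchmarks it may be more useful to feed the full output of `extract_words`, duplicates included.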
I'd recommend taking a look through the material available from TREC (the Text REtrieval Conference). It offers some good datasets that might meet your requirements.