Does Whoosh require all strings to be unicode?
I am redoing my search app in Whoosh, moving over from Solr. I am learning from the quick start, but I keep running into problems each time I have to deal with strings:
>>> writer.add_document(iden=fil, content=F2T.file_to_text(fil_path))
ValueError: 'File Name.doc' is not unicode or sequence
and then:
>>> query = QueryParser("content", ix.schema).parse("first")
AssertionError: 'first' is not unicode
And THAT line comes straight from the quick-start tutorial! Does Whoosh require all fields to be in unicode? It will be really hard work to make my app unicode-aware (and it's not even worth it). As for "not unicode or sequence", I understand that a string is also a sequence data type.
Comments (1)
Yes, it requires strings to be in Unicode.
Change that to:
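A minimal sketch of one way to make that change, not necessarily the exact code the answer had in mind. The helper name `ensure_text` and the UTF-8 default are assumptions made for illustration; on Python 2, writing the literal as `u"first"` or wrapping an ASCII value in `unicode(...)` has the same effect.

```python
import sys

# The Unicode text type is `unicode` on Python 2 and `str` on Python 3.
# The conditional is lazy, so `unicode` is never looked up on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def ensure_text(value, encoding="utf-8"):
    """Return `value` as Unicode text (hypothetical helper, not part of Whoosh).

    Byte strings are decoded with `encoding`; the UTF-8 default is an
    assumption about how the application's data is encoded.
    """
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):  # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding)
    raise TypeError("expected text or bytes, got %r" % type(value))


# Applied to the two calls from the question (names as in the question):
#   writer.add_document(iden=ensure_text(fil),
#                       content=ensure_text(F2T.file_to_text(fil_path)))
#   query = QueryParser("content", ix.schema).parse(ensure_text("first"))
```

Converting at the boundary where values enter Whoosh keeps the rest of the app unchanged. On Python 3 every `str` is already Unicode, so only values read as `bytes` need this step.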