对文档和相关数据进行全文检索 mssql
目前正在构建知识库应用程序,我有点不确定存储和索引文档信息的最佳方式。
用户上传文档,并在执行此操作时从下拉列表中选择多个选项(例如类别、主题、区域...,注意这些并非全部是强制性的),他们还输入一些关键字和文档的描述。目前,所选类别(和其他类别)使用类别表中的 id 作为外键存储在文档表中。 我们想要做的是不仅对文档所在的 varchar(max) 列中的信息进行 FREETEXTTABLE 或 CONTAINSTABLE 操作,还对类别名称、主题名称和区域名称等进行操作。
我查看了该选项创建索引视图的方法,但由于针对类别列的 LEFT JOIN,这是不可能的。因此,我不确定如何能够做到这一点,任何想法将不胜感激。
Currently in the middle of building a knowledge base app and am a bit unsure on the best way to store and index the document information.
The user uploads the document and when doing so selects a number of options from dropdown lists (such as category,topic,area..., note these are not all mandatory) they also enter some keywords and a description of the document. At the moment the category (and others) selected is stored as foreign key in the documents table using the id from the categories table.
What we want to be able to do is do a FREETEXTTABLE or CONTAINSTABLE on not only the information within the varchar(max) column where the document is located but also on the category name, topic name and area name etc.
I looked at the option of creating an indexed view but this wasn't possible due to the LEFT JOIN against the category column. So I'm not sure how to go about being able to do this any ideas would be most appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我假设您想将这两个搜索组合在一起。例如,查找包含文本“foo”且位于“汽车维修”类别中的所有文档。
也许您不需要完整显示附加数据,只需使用 = 或类似的符号即可?如果附加数据相当小,则可能无法保证全文的复杂性。
但是,如果您想对两者使用全文,请使用存储过程将结果汇总在一起。这里的技巧是暂存结果,而不是试图立即返回结果集。
这是一个粗略的起点。
I assume that you want to AND the two searches together. For example find all documents containing the text "foo" AND in category the "Automotive Repair".
Perhaps you don't need to full text the additional data and can just use = or like? If the additional data is reasonably small it may not warrant the complication of full text.
However, if you want to use full text on both, use a stored procedure that pulls the results together for you. The trick here is to stage the results rather than trying to get a result set back straight away.
This is rough starting point.
您可以为全文索引创建一个新列,其中包含原始文档以及作为元数据附加的类别。然后对该列的搜索可以同时搜索文档和类别。您需要发明一个标签系统,使它们在文档中保持唯一,但标签本身不太可能用作搜索短语。也许是这样的:
You could create a new column for your full text index which would contain the original document plus the categories appended as metadata. Then a search on that column could search both the document and the categories simultaneously. You'd need to invent a tagging system that would keep them unique within your document yet the tags would not be likely to be used as search phrases themselves. Perhaps something like: