How do I map subjective data in the Semantic Web?
I've been looking at the Freebase project for storing data. It seems to be a great place to store concrete, objective data like names, locations and dates. Is it a good place to store subjective data like opinions or ratings? Is there another or better open data store, semantic data store, or strategy for storing and querying this kind of information?
Additionally, since the data is subjective, I can be sure that others will not agree with my opinion. How would I store the opinions of others inline so that the crowd opinion is better represented?
Is Freebase the right place to store this type of data?
For example: a restaurant rating or a movie rating. The movie rating would probably be less time-sensitive than the restaurant rating. Any non-identifying information about the person who entered the data would be interesting for determining other factors and relationships.
Comments (4)
The Semantic Web is, for the most part, a variant of first-order logic, so the important part is to have a clear understanding of what each of your predicates "means". This idea is very simple but applicable to a wide variety of meaning representations; for example, it is what lies behind the entity model of databases.
There should be no problem representing the information you mentioned in a Semantic Web representation. Just be sure to have a clear definition of what each of your predicates denotes, so that the meaning doesn't shift over time and leave you with an inconsistent representation.
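To make that concrete, here is a minimal sketch of a rating stored as triples, with each predicate's meaning written down next to it. Everything in it is an assumption for illustration: the `ex:` names, the 1 to 5 scale, and the tiny in-memory `store` are made up and are not Freebase's or any real vocabulary's schema.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


# Hypothetical vocabulary: every predicate gets one fixed, written-down meaning.
PREDICATES = {
    "ex:ratedItem": "the thing (restaurant, movie, ...) this rating is about",
    "ex:ratedBy": "the pseudonymous account that gave the rating",
    "ex:score": "integer from 1 (worst) to 5 (best)",
    "ex:ratedOn": "ISO 8601 date on which the rating was given",
}


def add_rating(store, rating_id, item, rater, score, date):
    """Append one rating to the store as four triples about a rating node."""
    if not 1 <= score <= 5:
        raise ValueError("ex:score is defined as an integer from 1 to 5")
    store.extend([
        Triple(rating_id, "ex:ratedItem", item),
        Triple(rating_id, "ex:ratedBy", rater),
        Triple(rating_id, "ex:score", score),
        Triple(rating_id, "ex:ratedOn", date),
    ])


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Made-up example data: two people rate the same restaurant differently.
store = []
add_rating(store, "ex:rating/1", "ex:restaurant/luigis", "ex:user/a", 4, "2009-03-01")
add_rating(store, "ex:rating/2", "ex:restaurant/luigis", "ex:user/b", 2, "2009-03-05")

# Find every rating node about the restaurant, then look up each score.
ratings = [t.subject for t in match(store, predicate="ex:ratedItem",
                                    obj="ex:restaurant/luigis")]
scores = [match(store, subject=r, predicate="ex:score")[0].obj for r in ratings]
```

Making each rating its own node means disagreeing opinions sit side by side instead of overwriting one another, and the date on each node covers the time-sensitivity concern from the question.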
Genesereth's book is old but a good one if you are interested in reading about this in further detail. I think a lot of people who worked on the Semantic Web were involved in Douglas Lenat's Cyc project, which gradually shifted to a logic-based meaning representation over time.
http://www.amazon.com/Logical-Foundations-Artificial-Intelligence-Genesereth/dp/0934613311
The site for Cyc:
http://www.cyc.com/
I find designing or selecting data formats very hard without an understanding of the questions I will be asking of that data. What purpose do you expect the data to be used for? Come up with some use cases; they may guide your search.
Storing attributed data is an open research topic, with development in (among other places) the intelligence community: these users obviously need to keep track of where information came from, and who has added to it along the way, both to verify its reliability and to do things like track whether Secret information has been included by accident. That may be a good place to look.
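As a rough illustration of what "attributed" could mean in practice, the sketch below keeps the original source and the chain of later contributors on every statement. The `Statement` fields, the `amend` helper, and the `sensitivity` labels are all hypothetical names chosen for this example; they are not taken from any particular provenance system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    value: object
    source: str                  # who first asserted the statement
    contributors: tuple = ()     # who changed it afterwards, in order
    sensitivity: str = "public"  # hypothetical label, e.g. "public" or "secret"


def amend(stmt, contributor, **changes):
    """Return a changed copy that records who made the change."""
    return replace(stmt, contributors=stmt.contributors + (contributor,), **changes)


def flag_sensitive(statements, level="secret"):
    """Return the statements carrying the given sensitivity label."""
    return [s for s in statements if s.sensitivity == level]


# Made-up example: user:b revises a rating that user:a entered.
original = Statement("restaurant:luigis", "score", 4, source="user:a")
amended = amend(original, "user:b", value=3)

# A statement that was labelled secret and should not be published.
leaked = Statement("restaurant:luigis", "inspection_note", "pending",
                   source="user:c", sensitivity="secret")
flagged = flag_sensitive([original, amended, leaked])
```

Because the dataclass is frozen, `amend` never edits a statement in place, so the earlier version and its attribution stay available for checking reliability later.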
Data is data. What you want to do is label the data as what it is: an opinion or a rating. A "fact", I suppose, that could be inferred from such data would be that most people had x subjective opinion about said topic.
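A small sketch of that idea, assuming each record carries a `kind` label and that "most people" means a strict majority. The record layout and the `majority_opinion` name are made up for illustration.

```python
from collections import Counter

# Made-up records; "kind" labels each one as a rating rather than a fact.
records = [
    {"kind": "rating", "item": "movie:example", "by": "user:a", "value": 4},
    {"kind": "rating", "item": "movie:example", "by": "user:b", "value": 4},
    {"kind": "rating", "item": "movie:example", "by": "user:c", "value": 2},
    {"kind": "rating", "item": "movie:example", "by": "user:d", "value": 4},
    {"kind": "fact", "item": "movie:example", "by": "user:a", "value": "1999"},
]


def majority_opinion(records, item):
    """Return (value, share) if more than half of the ratings agree, else None."""
    values = [r["value"] for r in records
              if r["kind"] == "rating" and r["item"] == item]
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    if count * 2 > len(values):
        return value, count / len(values)
    return None


# The inferred "fact": most raters gave this movie the same score.
result = majority_opinion(records, "movie:example")
```

The individual opinions stay in the data untouched; only the derived statement about the crowd is treated as a fact.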
From Twitter:
jimpick @the_real_kevinw Each user and app/base has their own namespace, but I'd ask on the developers mailing list. A mashup might fit better.