What are some ways to extract a location from a sentence or query?
I want to recognize and extract a location that is embedded in a sentence. For example, I might have the sentence:
"I love the pizza in Boston, Ma." but the same sentence could also be written as
"Pizza in Boston, I love it." OR
"I love the pizza in Boston."
So I have to be able to find the location anywhere in the sentence, even when the state is not included. To make things even more complicated, people write abbreviations such as "ft." or "s." for "fort" or "south", so I need a way to recognize those as well.
Comments (2)
I've had pretty good luck using AlchemyAPI to extract celebrity names from sentences. You might want to see if their entity extraction will do what you need (looking at the example on the bottom of that page, it just might).
How about parsing each word in the sentence and checking it against a "location" dictionary? After a hit, secondary tests could look at the word before or after it to complement the location with a corresponding city (word before) or state (word after).
This gets tricky for things like "The best Buffalo Wings in Kansas City". When you find multiple locations in a single statement, you could use a separate "dual-use" dictionary to eliminate words like "Buffalo". You could also search for prepositions like "in", "near", or "at" to identify the real location.
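To make the dictionary idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the tiny `CITIES`, `STATES`, `DUAL_USE`, and `ABBREVIATIONS` tables are made up, and a real system would load a proper gazetteer. It expands abbreviations such as "ft." first, looks up city names (longest match first), drops dual-use hits when another location is present, prefers a hit that follows a preposition, and finally checks whether the next word is a state.

```python
import re

# Hypothetical, deliberately tiny dictionaries; a real gazetteer would be far larger.
CITIES = {"boston", "buffalo", "kansas city", "fort worth", "south bend"}
STATES = {"ma": "MA", "massachusetts": "MA", "mo": "MO", "tx": "TX", "texas": "TX"}
DUAL_USE = {"buffalo"}  # city names that also appear as ordinary words
ABBREVIATIONS = {"ft": "fort", "s": "south", "n": "north"}
PREPOSITIONS = {"in", "near", "at"}
MAX_CITY_WORDS = 3


def tokenize(sentence):
    """Lowercase and split into words, expanding dotted abbreviations ("ft." -> "fort")."""
    tokens = []
    for raw in re.findall(r"[a-z]+\.?", sentence.lower()):
        word = raw.rstrip(".")
        # Only expand when the period is present, so the "s" in "Boston's" is left alone.
        if raw.endswith(".") and word in ABBREVIATIONS:
            word = ABBREVIATIONS[word]
        tokens.append(word)
    return tokens


def find_candidates(tokens):
    """Return (start, end, city) for each dictionary hit, trying longer phrases first."""
    hits = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_CITY_WORDS, 0, -1):
            if i + n > len(tokens):
                continue
            phrase = " ".join(tokens[i:i + n])
            if phrase in CITIES:
                hits.append((i, i + n, phrase))
                i += n
                break
        else:
            i += 1
    return hits


def extract_location(sentence):
    """Return (city, state_or_None), or None when no known city is found."""
    tokens = tokenize(sentence)
    hits = find_candidates(tokens)
    if not hits:
        return None
    # With several hits, drop dual-use words such as "buffalo" if anything else remains.
    if len(hits) > 1:
        hits = [h for h in hits if h[2] not in DUAL_USE] or hits
    # Prefer a hit that directly follows a preposition ("in Boston").
    after_prep = [h for h in hits if h[0] > 0 and tokens[h[0] - 1] in PREPOSITIONS]
    start, end, city = (after_prep or hits)[0]
    # Secondary test: is the word right after the city a known state?
    state = STATES.get(tokens[end]) if end < len(tokens) else None
    return city.title(), state


print(extract_location("I love the pizza in Boston, Ma."))
print(extract_location("The best Buffalo Wings in Kansas City"))
print(extract_location("Best tacos near Ft. Worth, TX"))
```

Note that a dual-use word that is the only hit (for example "I love buffalo wings") is still returned as a location here; handling that would need extra context checks or a statistical entity recognizer.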