What are some ways to extract a location from a sentence or query?

Posted on 2024-09-27 00:54:29 · Word count: 200 · Views: 8 · Comments: 0

I want to recognize and extract a location that's built into a sentence. For example I might have a sentence:

"I love the pizza in Boston, Ma." but this same sentence could also be written as
"Pizza in Boston, I love it." OR
"I love the pizza in Boston."

So I have to be able to find the location anywhere in the sentence, even when the state is not included. To make things even more complicated, people write abbreviations such as "ft." or "s." for "fort" or "south", so I need a way to recognize these as well.

Comments (2)

两相知 2024-10-04 00:54:30

I've had pretty good luck using AlchemyAPI to extract celebrity names from sentences. You might want to see whether their entity extraction will do what you need (judging by the example at the bottom of that page, it just might).

吾性傲以野 2024-10-04 00:54:29

How about parsing each word in the sentence and checking it against a "location" dictionary? After a hit, secondary tests could look at the word before or after to complement the location with a corresponding city (word before) or state (word after).
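A rough sketch of that lookup in Python, under some loud assumptions: the `CITIES`, `STATES`, and `ABBREVIATIONS` tables are tiny hypothetical stand-ins for a real gazetteer, and the function names are made up for illustration. It only checks the word after a city hit for a state, and it expands "ft." and "s." only when they are written with a period.

```python
import re

# Hypothetical, tiny gazetteer for illustration only; a real one would be far larger.
CITIES = {"boston", "kansas city", "fort worth"}
STATES = {"ma", "massachusetts", "mo", "missouri", "tx", "texas"}
# Abbreviations from the question; expanded only when written with a trailing period.
ABBREVIATIONS = {"ft.": "fort", "s.": "south"}


def tokenize(sentence):
    """Lowercase the sentence, split it into words, and expand known abbreviations."""
    raw = re.findall(r"[a-z]+\.?", sentence.lower())
    return [ABBREVIATIONS.get(tok, tok.rstrip(".")) for tok in raw]


def find_locations(sentence, max_words=2):
    """Return (city, state_or_None) pairs found by dictionary lookup."""
    tokens = tokenize(sentence)
    found = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate first so multi-word cities are matched whole.
        for n in range(max_words, 0, -1):
            if i + n <= len(tokens) and " ".join(tokens[i:i + n]) in CITIES:
                match_len = n
                break
        if match_len == 0:
            i += 1
            continue
        city = " ".join(tokens[i:i + match_len])
        # Secondary test: a state right after the city complements the hit.
        nxt = tokens[i + match_len] if i + match_len < len(tokens) else None
        state = nxt if nxt in STATES else None
        found.append((city, state))
        i += match_len + (1 if state else 0)
    return found


print(find_locations("I love the pizza in Boston, Ma."))
print(find_locations("Pizza in Boston, I love it."))
print(find_locations("Best steak in Ft. Worth"))
```

Matching the longest phrase first is a design choice so that two-word names such as "Fort Worth" are not split; whether two words is a sufficient window depends on the gazetteer you actually use.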

This will get tricky for things like "The best Buffalo Wings in Kansas City". When you find multiple locations in a single statement, you could use a separate "dual-use" dictionary to eliminate words like "Buffalo". You could also search for prepositions like "in", "near", "at", etc. to identify the real location.
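The disambiguation idea could look roughly like the following. This is only a sketch: `LOCATIONS`, `DUAL_USE`, and `PREPOSITIONS` are small hypothetical word lists, and the rule order (preposition first, then the dual-use filter) is one possible choice, not something the approach dictates.

```python
import re

# Hypothetical word lists for illustration only.
LOCATIONS = {"buffalo", "kansas city", "boston"}
DUAL_USE = {"buffalo"}  # place names that also occur as ordinary words
PREPOSITIONS = {"in", "near", "at"}


def candidate_locations(sentence, max_words=2):
    """Return the token list and every (start_index, location) dictionary hit."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    hits = []
    i = 0
    while i < len(tokens):
        # Longest phrase first, so "kansas city" is matched as one location.
        for n in range(max_words, 0, -1):
            if i + n <= len(tokens) and " ".join(tokens[i:i + n]) in LOCATIONS:
                hits.append((i, " ".join(tokens[i:i + n])))
                i += n - 1  # skip the remaining words of the matched phrase
                break
        i += 1
    return tokens, hits


def pick_location(sentence):
    """Choose the most plausible location among the hits, or None if there is none."""
    tokens, hits = candidate_locations(sentence)
    if not hits:
        return None
    # Rule 1: a hit directly after a preposition is most likely the real location.
    after_prep = [loc for i, loc in hits if i > 0 and tokens[i - 1] in PREPOSITIONS]
    if after_prep:
        return after_prep[0]
    # Rule 2: with several hits, drop the dual-use words if anything else remains.
    if len(hits) > 1:
        unambiguous = [loc for _, loc in hits if loc not in DUAL_USE]
        if unambiguous:
            return unambiguous[0]
    return hits[0][1]


print(pick_location("The best Buffalo Wings in Kansas City"))
print(pick_location("I live in Buffalo"))
```

Note that a dual-use word is only discarded when another candidate exists, so a sentence that mentions Buffalo alone still resolves to Buffalo.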
