Name matching against a table with alternative spellings

Posted 2024-10-20 22:45:21

I have a table with alternative spellings of country names:

Use name, Alt1, Alt2, Alt3, Alt4
[...]
Bahamas, "Bahamas, The"
Bolivia
Bosnia and Herzegovina, Bosnia & Herzegovina
[...]

(Some countries have 0 alternative spellings, others up to 4.)

Given a country string, what is the best-performing way to return the element in the first column? (In most cases, regardless of the number of alternative spellings, the string matches the first column and doesn't have to be name-matched. In the other cases, the probability is evenly distributed across columns 2-X.)

(Preferably in JavaScript or PHP, thanks :) )

Comments (1)

顾北清歌寒 2024-10-27 22:45:21

In my opinion, you should put this in two separate database tables:

countries: id, ..., ...

countries_names: country_id, default (bool or enum('yes', 'no')), name

Put an index on the name column so that you can search fast for the correct name and country_id:

-- `default` is a reserved word in MySQL, so the column name is quoted with backticks
SELECT name FROM countries_names
WHERE `default` = 'yes'
AND country_id = (SELECT country_id FROM countries_names
                  WHERE name = 'search_string'
                  LIMIT 1)
LIMIT 1;

(optionally, you can add a LEFT JOIN if you need more info from the main table)
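
Since the question asks for JavaScript or PHP, the same two-step lookup can be sketched in memory. This is only an illustration under assumptions, not part of the schema above: the sample rows, the camelCase field names (countryId, isDefault) and the function findDefaultName are all hypothetical. One Map plays the role of the index on the name column, and a second Map holds the default name per country.

```javascript
// Hypothetical sample rows of countries_names: (country_id, default, name).
const countriesNames = [
  { countryId: 1, isDefault: true,  name: "Bahamas" },
  { countryId: 1, isDefault: false, name: "Bahamas, The" },
  { countryId: 2, isDefault: true,  name: "Bolivia" },
  { countryId: 3, isDefault: true,  name: "Bosnia and Herzegovina" },
  { countryId: 3, isDefault: false, name: "Bosnia & Herzegovina" },
];

// Acts like the index on the name column: name -> country_id.
const idByName = new Map();
// Default name per country: country_id -> name.
const defaultNameById = new Map();

for (const row of countriesNames) {
  // Keep the first match only, like the LIMIT 1 in the subquery.
  if (!idByName.has(row.name)) idByName.set(row.name, row.countryId);
  if (row.isDefault) defaultNameById.set(row.countryId, row.name);
}

// Returns the default name for any known spelling, or null if unknown.
function findDefaultName(searchString) {
  if (!idByName.has(searchString)) return null;
  const countryId = idByName.get(searchString);
  return defaultNameById.has(countryId) ? defaultNameById.get(countryId) : null;
}
```

Both steps are hash lookups, so the cost does not depend on how many alternative spellings a country has.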

Another option would be to create a table only for alternative names:

countries: id, name, ..., ...

countries_alternative_names: country_id, name

But you'd have to search in two tables when looking for a first match.


EDIT: Static JavaScript solution:

// Maps any known spelling to the name in the first column; returns null if unknown.
function getCountryName(name) {
  switch (name) {
    case "Bahamas":
    case "The Bahamas":
      return "Bahamas";
    case "Bolivia":
      return "Bolivia";
    case "Bosnia and Herzegovina":
    case "Bosnia & Herzegovina":
      return "Bosnia and Herzegovina";

    // ...

    default:
      return null;
  }
}
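
As a variation on the static switch, the lookup could be built from the table data itself, so that new spellings only need to be added to the data. The sketch below is an assumption-laden illustration rather than the answer's own method: the rows array copies the sample rows from the question, and the names canonicalBySpelling and lookupCountryName are hypothetical.

```javascript
// Rows as in the question: the first entry is the name to use,
// the remaining entries (0 to 4 of them) are alternative spellings.
const rows = [
  ["Bahamas", "Bahamas, The"],
  ["Bolivia"],
  ["Bosnia and Herzegovina", "Bosnia & Herzegovina"],
];

// Every spelling, including the first-column name, maps to the first-column name.
const canonicalBySpelling = new Map();
for (const [canonical, ...alternatives] of rows) {
  canonicalBySpelling.set(canonical, canonical);
  for (const alt of alternatives) {
    canonicalBySpelling.set(alt, canonical);
  }
}

// Returns the first-column name for a known spelling, or null otherwise.
function lookupCountryName(name) {
  return canonicalBySpelling.has(name) ? canonicalBySpelling.get(name) : null;
}
```

The Map is built once, and each call afterwards is a single hash lookup whether the input is the first-column name or an alternative spelling.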