How do I build a cross-referenced multilingual dictionary database?

Posted 2024-12-14 11:55:52

This database should support three-way cross-referencing between Mongolian, English, and Chinese, as well as stand as a dictionary of its own for each language.

The information encoded for English would include things like: word, IPA pronunciation(s), definition(s), example sentence(s), plural spelling, plural pronunciation, synonyms, antonyms, word type, study note(s), Chinese equivalent(s), Mongolian Cyrillic equivalent(s), Mongolian script equivalent(s)

Chinese: traditional character, simplified character, definition(s), pinyin pronunciation(s), example sentence(s), synonyms, antonyms, HSK test level, strokes, radical(s), lookup radical, collocating measure words, word type, simple explanation of character, in-depth explanation of character, study note(s), English equivalent(s), Mongolian Cyrillic equivalent(s), Mongolian script equivalent(s)

Mongolian Cyrillic: Cyrillic word, Cyrillic definition(s), Cyrillic examples, Cyrillic synonyms, Cyrillic antonyms, Inner Mongolian (script) equivalent meaning, Mongolian script equivalent spelling, English equivalent(s), Chinese equivalent(s)

Mongolian script: script, script alternative ending, script definition(s), script synonyms, script antonyms, word type(s), study note(s), English equivalent(s), Chinese equivalent(s), Outer Mongolian (Cyrillic) equivalent meaning(s), Mongolian Cyrillic equivalent spelling.

I'm very new to databases. At first I considered making one table for each language, but that leaves a problem with all the "(s)" items, which can hold more than one value. Now I wonder if I need a table for each item of each language, so that I don't end up unable to include all the information I need. I was thinking that, for each entry, the link between languages would be based on its ID/PK.

  1. Do I have the right idea with the database?
  2. If I want to include this much information, then each item that may include multiple values needs its own table, no?
  3. Editing should still be easy, right? Provided each value is linked by its PK, I can edit all the values for a language (or across languages) from one interface?
  4. What about not knowing in advance how many entries an item may have? For instance, some words may have more equivalents, or more same-language synonyms, than others. Is this an issue, or do you just add more columns to the table?


Comments (1)

简单气质女生网名 2024-12-21 11:55:52

I would aim for a unified structure for all languages. This will make the database easier to maintain and make it easier to write an editor for it.

Additionally, I would normalize the attributes, so you don't have many repeated or unused columns. This also helps in cases where an attribute can have multiple values, like multiple definitions or multiple plural forms.
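To make the multiple-values point concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a single word_attributes table with word_id, attribute, and attribute_value columns (the names used in the outline in this answer); the helper functions and the sample definitions are hypothetical. Each value is stored as its own row, so a word can have any number of definitions without adding columns.

```python
import sqlite3


def build_attribute_store():
    """Create an in-memory database with one row per attribute value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE word_attributes (
            word_id         INTEGER NOT NULL,
            attribute       TEXT    NOT NULL,
            attribute_value TEXT    NOT NULL
        )
        """
    )
    return conn


def add_attribute(conn, word_id, attribute, value):
    """Store one value; call again to add another value of the same attribute."""
    conn.execute(
        "INSERT INTO word_attributes VALUES (?, ?, ?)",
        (word_id, attribute, value),
    )


def get_attributes(conn, word_id):
    """Return {attribute: [values...]} for one word, in insertion order."""
    rows = conn.execute(
        "SELECT attribute, attribute_value FROM word_attributes "
        "WHERE word_id = ? ORDER BY rowid",
        (word_id,),
    )
    grouped = {}
    for attribute, value in rows:
        grouped.setdefault(attribute, []).append(value)
    return grouped


conn = build_attribute_store()
# Illustrative sample data for word 1 ("lamp").
add_attribute(conn, 1, "definition", "a device that gives light")
add_attribute(conn, 1, "definition", "a device that gives heat")
add_attribute(conn, 1, "plural", "lamps")
attrs = get_attributes(conn, 1)
print(attrs)
```

A word with ten synonyms simply gets ten rows, and a word with none gets no rows, so the number of values never has to be known up front.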

This is how I would start. I'm leaving many design decisions open, such as whether to always use numeric ids and whether foreign key constraints are enforced. I've bolded the table name and the primary key(s).

  • Table: word_language (domain table)
    • word_id: a number, probably auto incrementing. primary key, FK to word_attributes
    • language(_id): either a string (eg: "english"), a FK into "languages" domain table, or both
    • name (optional, could also be an attribute): a string, the word (eg: "lamp")
  • Table: word_attributes (one to many)
    • word_id: primary key, FK to word_language
    • attribute(_id|_key): either a FK to an "attributes" domain table, a string (eg: "plural"), or both
    • attribute_value: a string, the actual value (eg: "lamps")
  • Table: languages (optional, domain table)
    • language(_id): either an auto-incrementing number or a string (eg: "english"), primary key
    • name (optional, use if language is an id): string (eg: "english"), etc
    • (other useful columns describing languages)
  • Table: attributes (optional, domain table)
    • attribute(_id|_key): number or string, primary key
    • language(_id) (optional): makes it easy to look up what attributes a language has, part of primary key
    • description: describe the attribute, perhaps what will show up in the editing tool
  • Table: equivalents (optional, many to many, could also be an attribute)
    • source_word_id: a word primary key
    • destination_word_id: equivalent word in a different language
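
The outline above could be sketched roughly as follows with Python's built-in sqlite3 module. This is one possible reading, not a definitive schema: it assumes string language ids, stores the word itself in a name column, enforces foreign keys, and looks up equivalents in both directions. The helper functions (open_dictionary, add_word, link, equivalents_of) and the sample words are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE languages (
    language_id TEXT PRIMARY KEY
);
CREATE TABLE word_language (
    word_id     INTEGER PRIMARY KEY,  -- auto-assigned by SQLite
    language_id TEXT NOT NULL REFERENCES languages(language_id),
    name        TEXT NOT NULL
);
CREATE TABLE equivalents (
    source_word_id      INTEGER NOT NULL REFERENCES word_language(word_id),
    destination_word_id INTEGER NOT NULL REFERENCES word_language(word_id),
    PRIMARY KEY (source_word_id, destination_word_id)
);
"""


def open_dictionary():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def add_word(conn, language_id, name):
    """Insert a word (registering its language if needed) and return its word_id."""
    conn.execute("INSERT OR IGNORE INTO languages VALUES (?)", (language_id,))
    cur = conn.execute(
        "INSERT INTO word_language (language_id, name) VALUES (?, ?)",
        (language_id, name),
    )
    return cur.lastrowid


def link(conn, word_a, word_b):
    """Record that two words are equivalents; one row covers both directions."""
    conn.execute("INSERT OR IGNORE INTO equivalents VALUES (?, ?)", (word_a, word_b))


def equivalents_of(conn, word_id, language_id):
    """Names of words in language_id linked to word_id, from either side of the link."""
    rows = conn.execute(
        """
        SELECT w.name FROM equivalents e
        JOIN word_language w ON w.word_id = e.destination_word_id
        WHERE e.source_word_id = ? AND w.language_id = ?
        UNION
        SELECT w.name FROM equivalents e
        JOIN word_language w ON w.word_id = e.source_word_id
        WHERE e.destination_word_id = ? AND w.language_id = ?
        ORDER BY 1
        """,
        (word_id, language_id, word_id, language_id),
    )
    return [row[0] for row in rows]


conn = open_dictionary()
# Sample words for illustration only.
lamp = add_word(conn, "english", "lamp")
deng = add_word(conn, "chinese", "灯")
denluu = add_word(conn, "mongolian_cyrillic", "дэнлүү")
link(conn, lamp, deng)
link(conn, lamp, denluu)
print(equivalents_of(conn, deng, "english"))
```

In this sketch each pair of equivalents has to be linked explicitly: the Chinese and Mongolian words above are both tied to "lamp" but not to each other, so a direct Chinese-to-Mongolian lookup needs its own row in equivalents.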