How do I build a cross-referenced multilingual dictionary database?

Posted 2024-12-14 11:55:52 · 879 characters · 1 view · 0 comments

This database should support three-way cross-referencing between Mongolian, English, and Chinese, as well as stand as a dictionary of its own for each language.

The information encoded for English would include things like: Word, IPA pronunciation(s), definition(s), example sentence(s), plural spelling, plural pronunciation, synonyms, antonyms, word type, study note(s), Chinese equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s)

Chinese: Traditional character, Simplified character, definition(s), pinyin pronunciation(s), example sentence(s), synonyms, antonyms, HSK test level, strokes, radical(s), lookup radical, collocating measure words, word type, simple explanation of character, in-depth explanation of character, study note(s), English equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s)

Mongolian Cyrillic: Cyrillic word, Cyrillic definition(s), Cyrillic examples, c. synonyms, c. antonyms, Inner Mongolian (script) equivalent meaning, Mongolian script equivalent spelling, Eng. Equiv.(s), Chin. Equiv.(s)

Mongolian script: script, script alternative ending, script definition(s), s. synonyms, s. antonyms, word type(s), study note(s), Eng. Equiv.(s), Chin. Equiv.(s), Outer Mongolian (Cyrillic) equivalent meaning(s), Mongolian Cyrillic equivalent spelling.

I'm very new to databases. At first I considered making a table for each language, but this leaves a problem with all the items that can have multiple values (the ones marked "(s)").
Now I wonder if I need a table for each item of each language, to make sure I don't end up unable to include all the information I need. I was thinking that, for each entry, the link between languages would be based on its ID/PK.

Do I have the right idea with the database?
If I want to include this much information, then each item that may hold multiple values needs its own table, no?
But editing should still be easy: provided each value is linked by its PK, I can edit all values for a language (or across languages) from one interface, right?
What about not knowing how many entries a multi-valued item may have? For instance, some words may have more equivalents, or more same-language synonyms, than others. Is this an issue, or do you just add more columns to the table, no problem?

简单气质女生网名 2024-12-21 11:55:52

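I would aim for one unified structure for all languages. That makes the database easier to maintain and makes it easier to write an editor for it.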

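I would also normalize the attributes, so that you don't end up with lots of duplicated or unused columns. This helps too when an attribute can have several values (for example multiple definitions or multiple plural forms).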
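
To make the point about multi-valued attributes concrete, here is a minimal sketch using Python's built-in sqlite3 module. Each value is stored as its own row, so a word can have any number of definitions or synonyms without a column ever being added. The table name word_attribute_demo, the column names, and the sample rows are all hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE word_attribute_demo (
        word_id         INTEGER NOT NULL,
        attribute_key   TEXT    NOT NULL,   -- e.g. 'definition', 'synonym'
        attribute_value TEXT    NOT NULL
    )
    """
)

# Made-up sample rows for a single word (word_id = 1).
rows = [
    (1, "definition", "a device that gives light"),
    (1, "definition", "a source of intellectual light"),
    (1, "synonym", "lantern"),
]
conn.executemany("INSERT INTO word_attribute_demo VALUES (?, ?, ?)", rows)

# A third definition is just one more row; the table layout is unchanged.
conn.execute(
    "INSERT INTO word_attribute_demo VALUES (?, ?, ?)",
    (1, "definition", "a lamp-shaped object"),
)


def values_for(word_id, key):
    """Return every stored value of one attribute for one word."""
    cur = conn.execute(
        "SELECT attribute_value FROM word_attribute_demo "
        "WHERE word_id = ? AND attribute_key = ? ORDER BY rowid",
        (word_id, key),
    )
    return [value for (value,) in cur]


definitions = values_for(1, "definition")
synonyms = values_for(1, "synonym")
```

With this layout, the answer to "how many synonyms can a word have?" is simply "as many rows as you insert", so there is no need to guess a maximum up front.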

This is how I would start. I have left many design decisions open, such as whether to always use numeric IDs and whether to enforce foreign key constraints.

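Table: word_language (domain table)
- word_id: a number, probably auto-incremented. Primary key; referenced by word_attributes
- language(_id): a string (e.g. "english"), an FK into the "language" domain table, or both
- name (optional, could also be an attribute): a string, the word itself (e.g. "lamp")
Table: word_attributes (one-to-many)
- word_id: FK to word_language; part of the primary key, since one word has many attribute rows
- attribute(_id|_key): an FK into the "attribute" domain table, a string (e.g. "plural"), or both
- attribute_value: a string, the actual value (e.g. "lamps")
Table: language (optional, domain table)
- language(_id): an auto-incremented number or a string (e.g. "english"). Primary key
- name (optional, used if language is an ID): a string (e.g. "english")
- (other useful columns describing the language)
Table: attribute (optional, domain table)
- attribute(_id|_key): a number or a string. Primary key
- language(_id) (optional): makes it easy to look up which attributes a language has; part of the primary key
- description: describes the attribute, perhaps what the editing tool should display
Table: equivalent (optional, many-to-many, could also be an attribute)
- source_word_id: primary key of a word
- destination_word_id: the equivalent word in a different language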
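
The outline above could be turned into a working schema roughly as follows. This is only a sketch under several assumptions: it uses SQLite through Python's sqlite3 module, picks string language IDs, enforces foreign keys, and chooses a composite key for word_attributes. Those are exactly the decisions left open above. The optional attribute domain table is omitted for brevity, and the sample words are illustrative rather than vetted dictionary data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

conn.executescript(
    """
    CREATE TABLE language (
        language_id TEXT PRIMARY KEY              -- e.g. 'english'
    );
    CREATE TABLE word_language (
        word_id     INTEGER PRIMARY KEY,
        language_id TEXT NOT NULL REFERENCES language(language_id),
        name        TEXT NOT NULL                 -- the word itself
    );
    CREATE TABLE word_attributes (
        word_id         INTEGER NOT NULL REFERENCES word_language(word_id),
        attribute_key   TEXT NOT NULL,            -- e.g. 'plural', 'pinyin'
        attribute_value TEXT NOT NULL,
        PRIMARY KEY (word_id, attribute_key, attribute_value)
    );
    CREATE TABLE equivalent (
        source_word_id      INTEGER NOT NULL REFERENCES word_language(word_id),
        destination_word_id INTEGER NOT NULL REFERENCES word_language(word_id),
        PRIMARY KEY (source_word_id, destination_word_id)
    );
    """
)

# Illustrative sample rows only.
conn.executemany(
    "INSERT INTO language VALUES (?)",
    [("english",), ("chinese",), ("mongolian_cyrillic",)],
)
conn.executemany(
    "INSERT INTO word_language VALUES (?, ?, ?)",
    [(1, "english", "lamp"), (2, "chinese", "灯"), (3, "mongolian_cyrillic", "дэнлүү")],
)
conn.executemany(
    "INSERT INTO word_attributes VALUES (?, ?, ?)",
    [(1, "plural", "lamps"), (2, "traditional", "燈"), (2, "pinyin", "dēng")],
)
# Each pair is stored once; the lookup below reads it in both directions.
conn.executemany("INSERT INTO equivalent VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)])


def equivalents(word_id):
    """Return (language_id, name) for every word linked to word_id."""
    cur = conn.execute(
        """
        SELECT w.language_id, w.name
        FROM equivalent AS e
        JOIN word_language AS w
          ON w.word_id = CASE WHEN e.source_word_id = :id
                              THEN e.destination_word_id
                              ELSE e.source_word_id END
        WHERE :id IN (e.source_word_id, e.destination_word_id)
        ORDER BY w.word_id
        """,
        {"id": word_id},
    )
    return cur.fetchall()


chinese_links = equivalents(2)
plural_forms = [
    value
    for (value,) in conn.execute(
        "SELECT attribute_value FROM word_attributes "
        "WHERE word_id = ? AND attribute_key = ?",
        (1, "plural"),
    )
]
```

Because every word in every language lives in the same word_language table, one equivalent table covers all language pairs, and a single editor screen can load a word, its attributes, and its cross-language links by word_id.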