使用 mysql/PHP 进行本地化
我目前正在使用 PHP 和 MySQL 构建一个多语言网站,我想知道关于我的数据模型的本地化最佳方法是什么。每个表都包含需要翻译成多种语言的字段(例如name
、description
...)。
第一个想法是为每种语言创建一个字段(例如 name_en
、name_de
、name_fr
),并使用变量从 PHP 检索适当的字段(例如$entry['name_'.LANGUAGE]
)。虽然它可行,但我认为这种方法有很多缺点:
-您需要的每个字段的出现次数与您所使用的语言一样多(请记住,您可以使用 en-US、en-CA、en-GB...)
-如果添加或删除语言,则需要相应地修改数据库结构
-如果您有未翻译的字段,仍然会为每个条目创建它们,这似乎不是很优化
第二个想法是创建一个翻译表,可用于存储数据库中任何表的任何字段的翻译:
---------------- translation ---------------- id INT table_name VARCHAR field_name VARCHAR value VARCHAR language_id VARCHAR
< code>table_name 和 field_name
将允许识别翻译涉及哪个表和哪个字段,而 language_id
将指示翻译的语言。我们的想法是创建模型,根据用户选择的语言,用相应的翻译来替换可翻译字段的值(例如name
、description
)。
您能看出这种方法的缺点吗?您有什么建议吗?
谢谢。
I'm currently building a multilingual website using PHP and MySQL and am wondering what the best approach regarding localization is with regards to my data model. Each table contains fields that need to be translated (e.g. name
, description
....) in several languages.
The first idea was to create a field for each language (e.g. name_en
, name_de
, name_fr
) and retrieve the appropriate field from PHP using a variable (e.g. $entry['name_' . LANGUAGE]
). While it would work, this approach has in my opinion many drawbacks:
-you need as many occurrences of each field as you have languages (bearing in mind you can have en-US, en-CA, en-GB...)
-if you add or remove languages you need to modify the database structure accordingly
-if you have untranslated fields, they are still created for each entry which doesn't seem very optimized
The second idea is to create a translation table that can be used to store the translation of any field of any table in the database:
---------------- translation ---------------- id INT table_name VARCHAR field_name VARCHAR value VARCHAR language_id VARCHAR
The table_name
and field_name
will allow identifying which table and which field the translation is about, while language_id
will indicate which language that translation if for. The idea is to create models that would replace the value of the translatable fields (e.g. name
, description
) by their corresponding translation based on the language selected by the user.
Can you see drawbacks with this approach? Have you got suggestions to make?
Thanks.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
主要缺点是通过将表名和字段名等元数据存储为应用程序数据会破坏关系模型。你的查询太丑陋而且无效。
另一个缺点是您只能使用可翻译数据的一种数据类型。您的表结构将定义
这意味着您将始终在 VARCHAR(255) 中存储需要较小字段的数据。如果你想让它更通用来存储大文本,你需要定义它,
这更糟糕。
流行的模型如下。对于每个实体,您定义不依赖于语言的字段和依赖于语言的字段,并始终创建 2 个表。例如:
这是正确的关系方法。当然,它也有一个主要缺点,那就是您总是会连接 2 个表来获取项目,并且添加/更新数据变得有点复杂。
The main drawback is that you destroy the relational model by storing metadata like table name and field name as application data. You queries would be too ugly and non-effective.
Another drawback is that you are limited only to one data type of the translatable data. Your table structure would define
which means you would store data that would require smaller field always in VARCHAR(255). And if you like to have it even more universal to store also large text you need to define it
which is even worse.
The popular model is the following. For every entity you define the fields which are not language dependent and those which are language dependent and create always 2 tables. For example:
This is the proper relational approach. Of course, it also has drawbacks major one being that you would always join 2 table to fetch items and adding/updating of data becomes a bit more complex.
不确定这个答案是否会让您满意,但我区分了两种类型的文本:
静态文本是您自己提供的,用于用户无法影响的一般应用程序文本。诸如表单输入标签和介绍性文本之类的东西。我使用 gettext 来处理这些内容,因此如果我需要翻译,我可以将其发送给专业翻译人员。
动态文本是应用程序用户提供的文本,这似乎就是您所说的。就我个人而言,我也将动态文本分为两种不同的类型。
一般类型的一个示例是 HTML 选择元素内的选项或标记系统。它们并不特定于单个内容元素,但(可以)适用于多个内容元素。
特定文本的示例可以是 CMS 内的实际内容(例如文章)或在线商店中的产品描述。
对于第一个,我使用一种中央查找表,其中实际原始文本的哈希值作为索引,我将其称为使用该字符串的表中的外键。然后,您在中心表中查找该哈希值以回显其背后的真实文本(当然,您应该在此处使用某种缓存)。
对于后一个,我使用经典的内容表,其中包含特定于该逻辑内容单元的每个内容区域的列和每种语言的行。
到目前为止,一切进展顺利。
Not sure if this answer will satisfy you, but I discern between two types of texts:
Static text is provided by yourself for general application text that users have no influence on. Stuff like form input labels and introductory text. I use gettext for those, so I can send it off to professional translators, if i need it translated.
Dynamic text is text provided by the user of the application, which seems to be what you're talking about. Personally, I discern dynamic text into 2 different types as well.
An example of the general type would be options inside of HTML select elements, or a tagging system. They're not specific to a single content element, but (can) apply to multiple ones.
Examples for a specific text would be the actual content inside of a CMS like an article, or a product description in an online shop.
For the first one I use a kind of central lookup table with a hash of the actual, original text as the index, which i refer to as a foreign key in tables where i use that string. Then you look up that hash in the central table to echo the real text behind it (of course, you ought to use some sort of caching here).
For the latter one I use a classic content table with columns for every content area specific to that logical content unit and a row for each language.
Thus far it's working out pretty well.