How to store metadata data types
How do I store metadata that describes assets, given that I don't know in advance which metadata fields I'll have?
I have a table with assets (let's say persons),
a table with metadata fields ("name", "age", "day of birth", ...),
a table with metadata values that links to the metadata fields table ("John Doe", 44, "1968-10-10", ...),
and a cross table, metadata, that links the metadata fields to the assets.
My problem is how to handle the different datatypes in the metadata fields table.
"John Doe" is text, 44 is an int, and 1968-10-10 is a date.
Do I store all of these in a txt field in my metadatafields table? If so, will I still be able to compare dates?
Or do I store the datatype in this table and make 3 fields, for txt, int and date? Then I end up with a lot of empty fields.
Or do I make a separate metadata fields table for each datatype (e.g. metadatafields_txt, metadatafields_int, metadatafields_date)? Then I can't link properly to the metadata table.
What is the best practice here?
Thanks
Comments (2)
The answer to your question depends on what you want to do with the metadata. If you just want to store it, a text representation in a VARCHAR field might be enough, and it will simplify your queries.

The moment you want to query your metadata, stay away from storing a textual representation, or you will be burned by 8 < 10 vs. '8' > '10' and friends. In this case I recommend that you either have a metadata table with 3 fields or even have 3 metadata tables. I suspect the sweet spot is a single table with 3 fields - queries are still quite easy, and the wasted space is manageable (an int is 4 bytes, and an empty varchar costs only a couple of bytes).

BTW: you could make good use of the int field for other data types. By storing the unix timestamp for a date, you might be able to avoid some UNIX_TIMESTAMP() or FROM_UNIXTIME() magic later. For strings you might want to store the length there (esp. if you use a C-ish API).
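A minimal sketch of the single-table layout with one column per type, written in Python with an in-memory SQLite database so it runs as-is. The table and column names (metadata_values, value_txt, value_int, value_date), the helper functions and the second person "Jane Roe" are all assumptions made for illustration; they are not from the question. Dates are stored as unix timestamps, as suggested above, so they compare as plain integers.

```python
import sqlite3
from datetime import datetime, timezone


def to_unix(date_str):
    """Convert an ISO date (YYYY-MM-DD) to a UTC unix timestamp."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE metadata_fields (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    datatype TEXT NOT NULL CHECK (datatype IN ('txt', 'int', 'date'))
);
-- Hypothetical layout: one row per (asset, field), one column per type.
-- Only the column matching the field's datatype is filled; the rest stay NULL.
CREATE TABLE metadata_values (
    asset_id INTEGER NOT NULL REFERENCES assets(id),
    field_id INTEGER NOT NULL REFERENCES metadata_fields(id),
    value_txt TEXT,
    value_int INTEGER,
    value_date INTEGER,  -- unix timestamp
    PRIMARY KEY (asset_id, field_id)
);
""")
conn.executemany("INSERT INTO assets VALUES (?, ?)",
                 [(1, "John Doe"), (2, "Jane Roe")])
conn.executemany("INSERT INTO metadata_fields VALUES (?, ?, ?)",
                 [(1, "age", "int"), (2, "day of birth", "date")])
conn.executemany(
    "INSERT INTO metadata_values (asset_id, field_id, value_int, value_date) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 44, None), (1, 2, None, to_unix("1968-10-10")),
     (2, 1, 8, None), (2, 2, None, to_unix("2004-03-01"))],
)

# The pitfall: as text, '8' sorts after '10'; as integers, 8 < 10.
text_cmp, int_cmp = conn.execute("SELECT '8' < '10', 8 < 10").fetchone()


def assets_where(field_name, condition, params=()):
    """Return asset labels whose value for field_name satisfies condition.

    condition is a trusted SQL fragment written by the caller, not user input.
    """
    sql = f"""
        SELECT a.label
        FROM metadata_values v
        JOIN assets a ON a.id = v.asset_id
        JOIN metadata_fields f ON f.id = v.field_id
        WHERE f.name = ? AND {condition}
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql, (field_name, *params))]


younger_than_10 = assets_where("age", "v.value_int < ?", (10,))
born_before_1970 = assets_where(
    "day of birth", "v.value_date < ?", (to_unix("1970-01-01"),))
```

Because each value sits in a column of its own type, the age and date filters are ordinary integer comparisons and need no casting in the query.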
All of these are valid options (with a slight modification in the first case - I would store an additional field to indicate which type of data is being stored, and convert appropriately where required).
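A rough sketch of that modified first option, assuming a single text column plus a type indicator and a cast at query time. It uses Python with an in-memory SQLite database; the table name typed_text_values, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical layout: every value is stored as text, and datatype
-- records how it should be interpreted.
CREATE TABLE typed_text_values (
    asset TEXT NOT NULL,
    field TEXT NOT NULL,
    datatype TEXT NOT NULL,
    value TEXT NOT NULL
);
INSERT INTO typed_text_values VALUES
    ('John Doe', 'age', 'int', '44'),
    ('Jane Roe', 'age', 'int', '8');
""")

# Comparing the raw text misses '8', because '8' sorts after '10'.
naive = [row[0] for row in conn.execute(
    "SELECT asset FROM typed_text_values "
    "WHERE field = 'age' AND value < '10'")]

# Converting according to the stored datatype gives the numeric result.
converted = [row[0] for row in conn.execute(
    "SELECT asset FROM typed_text_values "
    "WHERE field = 'age' AND datatype = 'int' "
    "AND CAST(value AS INTEGER) < 10")]
```

The cast has to be repeated in every query that filters or sorts on the value, which is the price of keeping a single text column.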
The third option should be viable - instead of inner joining to one table, you would left join to the three different tables in your query.
As Eugen says, it depends - do you know in advance roughly how many metadata records (just to an order of magnitude - thousands, millions, more?) you expect to have, and whether they are likely to be overwhelmingly of one type?
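The left-join variant could look roughly like the sketch below, again in Python with an in-memory SQLite database. The table names metadata_txt, metadata_int and metadata_date are assumptions, the field name is folded into the metadata table to keep the example short, and the date is stored as ISO-8601 text because SQLite has no dedicated date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical layout: one metadata row per (asset, field), with the
-- value living in exactly one of the three typed tables.
CREATE TABLE metadata (
    id INTEGER PRIMARY KEY,
    asset_id INTEGER NOT NULL,
    field_name TEXT NOT NULL
);
CREATE TABLE metadata_txt (
    metadata_id INTEGER PRIMARY KEY REFERENCES metadata(id),
    value TEXT NOT NULL
);
CREATE TABLE metadata_int (
    metadata_id INTEGER PRIMARY KEY REFERENCES metadata(id),
    value INTEGER NOT NULL
);
CREATE TABLE metadata_date (
    metadata_id INTEGER PRIMARY KEY REFERENCES metadata(id),
    value TEXT NOT NULL  -- ISO-8601 date, YYYY-MM-DD
);

INSERT INTO metadata VALUES
    (1, 1, 'name'), (2, 1, 'age'), (3, 1, 'day of birth');
INSERT INTO metadata_txt VALUES (1, 'John Doe');
INSERT INTO metadata_int VALUES (2, 44);
INSERT INTO metadata_date VALUES (3, '1968-10-10');
""")

# Left join to all three typed tables; the two that do not hold the
# value for a given row contribute NULL (None in Python).
rows = conn.execute("""
    SELECT m.field_name, t.value, i.value, d.value
    FROM metadata m
    LEFT JOIN metadata_txt t ON t.metadata_id = m.id
    LEFT JOIN metadata_int i ON i.metadata_id = m.id
    LEFT JOIN metadata_date d ON d.metadata_id = m.id
    WHERE m.asset_id = 1
    ORDER BY m.id
""").fetchall()

for field_name, txt, num, date in rows:
    print(field_name, txt, num, date)
```

Each result row carries one non-NULL value column, so the caller picks whichever of the three is set.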