Database design for a product aggregator
I'm trying to design a database for a product aggregator. Each product has information about where it comes from, what it costs, what type of thing it is, price, color, etc. Users need to be able to search and filter results based on any of those product attributes. I also expect to have a large number of users. My initial thought was to have one big table with every product in it, a column for each piece of information, and an index on anything I need to be able to search by, but I think this might be inefficient with a lot of users pounding on this one table. My other thought was to organize the database to promote a tree-like navigation of tables, but because you can search by anything, I'm not sure how I would organize the tables.
Any thoughts on some good practices?
Comments (4)
One table of products - databases are designed to have lots of users pounding on tables.
(from the comments)
You need to model your data. This comes from looking at all the data you have and determining what is related to what (a table is called a relation because all the attributes in a row are related to a candidate key). You haven't really given enough information about the scope of what data (unstructured?) you have on these products and how it varies. Are you going to have difficulties because Shoes have brand, model, size and color, but Desks only have brand, model and finish? All this is going to inform your data model. Typically you have one products table, and other things link to it.
Some of those attributes will be foreign keys to lookup tables; others (price) would be simple scalars. With appropriate indexing, you'll be fine. For advanced analytics, consider a dimensionally modeled star schema, but perhaps not for your live transaction system; it depends on what your data flow/workflow/transactions are. Or consider some benefits of its principles in your transactional database. Ralph Kimball is a good source of information on dimensional modeling.
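To make the "one products table with lookup tables" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (product, brand, color), the price_cents column, the sample rows and the find_products helper are all assumptions made up for illustration; none of them come from the question, and a real schema would follow from modeling your actual data.

```python
import sqlite3

def build_catalog(conn):
    """Create one product table plus two lookup tables, then load sample rows."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE brand (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE color (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE product (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            brand_id    INTEGER NOT NULL REFERENCES brand(id),  -- lookup table
            color_id    INTEGER REFERENCES color(id),           -- lookup table
            price_cents INTEGER NOT NULL                        -- simple scalar
        );
        -- Index the columns users are expected to filter on.
        CREATE INDEX idx_product_brand ON product(brand_id);
        CREATE INDEX idx_product_color ON product(color_id);
        CREATE INDEX idx_product_price ON product(price_cents);
    """)
    conn.executemany("INSERT INTO brand VALUES (?, ?)",
                     [(1, "Acme"), (2, "Globex")])
    conn.executemany("INSERT INTO color VALUES (?, ?)",
                     [(1, "red"), (2, "black")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?, ?)", [
        (1, "Running shoe", 1, 1, 5900),
        (2, "Office desk", 2, 2, 24900),
        (3, "Trail shoe", 1, 2, 8900),
    ])

def find_products(conn, brand=None, color=None, max_price_cents=None):
    """Return product names matching whichever filters the caller supplies."""
    sql = ("SELECT p.name FROM product p "
           "JOIN brand b ON b.id = p.brand_id "
           "LEFT JOIN color c ON c.id = p.color_id")
    clauses, params = [], []
    if brand is not None:
        clauses.append("b.name = ?")
        params.append(brand)
    if color is not None:
        clauses.append("c.name = ?")
        params.append(color)
    if max_price_cents is not None:
        clauses.append("p.price_cents <= ?")
        params.append(max_price_cents)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY p.id"
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
build_catalog(conn)
acme_under_80 = find_products(conn, brand="Acme", max_price_cents=8000)
print(acme_under_80)  # ['Running shoe']
```

Every filter is optional and passed as a bound parameter, so the same single table serves any combination of search criteria.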
I don't see any need for the tree structure here. You can do it with a single table.
If you insist on a tree structure with a hierarchy, here is an example to get you started.
For text-based search, and for ease of startup and design, I strongly recommend Apache SOLR. The SOLR API is easy to use (especially JSON). Databases do text search poorly, so I would instead recommend that you just make sure the database responds properly to primary/unique key queries; those are the fields you should index.
One table for the products, and another table for the product category hierarchy (you don't specifically say you have this, but "tree-like navigation of tables" makes me think you might).
I can see you might be concerned about over-indexing causing problems if you plan to index almost every column. In that case, it might be best to index on the top 5 or 10 columns you think users are likely to search for, unless it's possible for a user to search on ANY column. In that case you might want to look at building a data warehouse. Maybe you'll want to look into data cubes to see if those will help...?
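One way to check whether a given filter is actually helped by an index is to ask the engine for its query plan. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration; the product table, its color and finish columns, the index name and the query_plan helper are hypothetical, and other engines expose plans through their own EXPLAIN variants with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        color  TEXT,
        finish TEXT
    );
    -- Only color is indexed; finish is deliberately left without an index.
    CREATE INDEX idx_product_color ON product(color);
""")

def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (last column = detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

indexed_plan = query_plan(
    conn, "SELECT * FROM product WHERE color = ?", ("red",))
unindexed_plan = query_plan(
    conn, "SELECT * FROM product WHERE finish = ?", ("oak",))

print(indexed_plan)    # mentions idx_product_color
print(unindexed_plan)  # a full scan of product
```

The exact wording of the plan text varies between SQLite versions, so treat it as a diagnostic to read rather than something to parse.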
For hierarchical data, you need a PRODUCT_CATEGORY table looking something like this:
Some sample data:
Some SQL engines (such as Oracle) allow you to write recursive queries to traverse the hierarchy in a single query. In this example, the root of the tree has a PARENT_ID of NULL, but if you don't want this column to be nullable, I've also seen -1 used for the same purpose.
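As a rough illustration of a self-referencing category table and a single-query traversal, here is a sketch using SQLite's WITH RECURSIVE through Python's sqlite3 module (not Oracle's syntax). Only the PRODUCT_CATEGORY and PARENT_ID names come from the answer; the ID and NAME columns, the sample rows and the subtree helper are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT_CATEGORY (
        ID        INTEGER PRIMARY KEY,
        PARENT_ID INTEGER REFERENCES PRODUCT_CATEGORY(ID),  -- NULL marks the root
        NAME      TEXT NOT NULL
    );
    -- Hypothetical sample rows forming a small tree.
    INSERT INTO PRODUCT_CATEGORY (ID, PARENT_ID, NAME) VALUES
        (1, NULL, 'All products'),
        (2, 1,    'Footwear'),
        (3, 1,    'Furniture'),
        (4, 2,    'Running shoes'),
        (5, 3,    'Desks');
""")

def subtree(conn, root_id):
    """Return (NAME, DEPTH) for a category and all of its descendants."""
    sql = """
        WITH RECURSIVE tree(ID, NAME, DEPTH) AS (
            -- Anchor row: the category the walk starts from.
            SELECT ID, NAME, 0 FROM PRODUCT_CATEGORY WHERE ID = ?
            UNION ALL
            -- Recursive step: children of rows already in the result.
            SELECT c.ID, c.NAME, tree.DEPTH + 1
            FROM PRODUCT_CATEGORY c
            JOIN tree ON c.PARENT_ID = tree.ID
        )
        SELECT NAME, DEPTH FROM tree ORDER BY DEPTH, ID
    """
    return conn.execute(sql, (root_id,)).fetchall()

footwear = subtree(conn, 2)
print(footwear)  # [('Footwear', 0), ('Running shoes', 1)]
```

With a -1 sentinel instead of NULL for the root, the traversal itself would stay the same, but the self-referencing foreign key would no longer fit unless a row with ID -1 also existed.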