是使用多个数据库各有一个架构更好,还是一个数据库有多个架构更好?

发布于 2024-07-26 23:49:03 字数 1705 浏览 13 评论 0原文

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

痴骨ら 2024-08-02 23:49:03

PostgreSQL“架构”与 MySQL“数据库”大致相同。 在 PostgreSQL 安装上安装多个数据库可能会出现问题; 拥有许多模式将毫无问题。 因此,您肯定希望使用一个数据库和该数据库中的多个模式。

A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database.

临走之时 2024-08-02 23:49:03

我建议不要接受接受的答案 - 多个数据库而不是多个架构,原因如下:

  1. 如果您正在运行微服务,您希望强制无法加入之间的“模式”,因此数据不会纠缠在一起,开发人员最终不会加入其他微服务的模式,并想知道为什么当其他团队进行更改时,他们的东西不再起作用。
  2. 如果您的负载需要轻松,您可以稍后迁移到单独的数据库计算机。
  3. 如果您需要设置高可用性和/或复制,最好拥有彼此完全独立的单独数据库。 与整个数据库相比,您无法仅复制一种架构。

I would recommend against accepted answer - multiple databases instead of multiple schemas for this set of reasons:

  1. If you are running microservices, you want to enforce the inability to join between your "schemas", so the data is not entangled and developers won't end up joining other microservice's schema and wonder why when other team makes a change their stuff no longer works.
  2. You can later migrate to a separate database machine if your load requires with ease.
  3. If you need to have a high-availability and/or replication set up, it's better to have separate databases completely independent of each other. You cannot replicate one schema only compared to the whole database.
妄断弥空 2024-08-02 23:49:03

当然,我会采用一数据库多模式方法。 这使我可以转储所有数据库,但可以通过多种方式非常轻松地仅恢复一个数据库:

  1. 转储数据库(所有模式),将转储加载到新数据库中,仅转储我需要的模式,然后在主数据库中恢复D b。
  2. 单独转储模式,一个接一个(但我认为机器会受到更多这种方式的影响 - 我预计大约 500 个模式!)

否则,谷歌搜索我发现没有自动过程来复制模式(使用一个作为模板),但许多人建议这样做:

  1. 创建一个模板模式
  2. 当需要复制时,用新名称
  3. 重命名它
  4. 转储它 重命名回来
  5. 恢复转储
  6. 魔法就完成了。

我已经用 Python 编写了两行来做到这一点; 我希望他们可以帮助某人(在 2 秒内编写代码,不要在生产中使用它):

import os
import sys
import pg

# Take the new schema name from the second cmd arguments (the first is the filename)
newSchema = sys.argv[1]

# Temperary folder for the dumps
dumpFile = '/test/dumps/' + str(newSchema) + '.sql'

# Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'

# Connection
pgConnect = pg.connect(dbname= db_name, host='localhost', user= db_user, passwd= db_pass)

# Rename schema with the new name
pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))

# Dump it
command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
os.system(command)

# Rename back with its default name
pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)

# Restore the previous dump to create the new schema
restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
os.system(restore)

# Want to delete the dump file?
os.remove(dumpFile)

# Close connection
pgConnect.close()

Definitely, I'll go for the one-db-many-schemas approach. This allows me to dump all the database, but restore just one very easily, in many ways:

  1. Dump the db (all the schema), load the dump in a new db, dump just the schema I need, and restore back in the main db.
  2. Dump the schema separately, one by one (but I think the machine will suffer more this way - and I'm expecting like 500 schemas!)

Otherwise, googling around I've seen that there is no auto-procedure to duplicate a schema (using one as a template), but many suggest this way:

  1. Create a template-schema
  2. When need to duplicate, rename it with new name
  3. Dump it
  4. Rename it back
  5. Restore the dump
  6. The magic is done.

I've written two rows in Python to do that; I hope they can help someone (in-2-seconds-written-code, don’t use it in production):

import os
import sys
import pg

# Take the new schema name from the second cmd arguments (the first is the filename)
newSchema = sys.argv[1]

# Temperary folder for the dumps
dumpFile = '/test/dumps/' + str(newSchema) + '.sql'

# Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'

# Connection
pgConnect = pg.connect(dbname= db_name, host='localhost', user= db_user, passwd= db_pass)

# Rename schema with the new name
pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))

# Dump it
command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
os.system(command)

# Rename back with its default name
pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)

# Restore the previous dump to create the new schema
restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
os.system(restore)

# Want to delete the dump file?
os.remove(dumpFile)

# Close connection
pgConnect.close()
尾戒 2024-08-02 23:49:03

我想说,使用多个数据库和多个模式:)

PostgreSQL 中的模式很像 Oracle 中的包,如果您熟悉这些的话。 数据库旨在区分整个数据集,而模式更像是数据实体。

例如,您可以为整个应用程序使用一个数据库,其模式为“UserManagement”、“LongTermStorage”等。 然后,“UserManagement”将包含“User”表,以及用户管理所需的所有存储过程、触发器、序列等。

数据库是整个程序,模式是组件。

I would say, go with multiple databases AND multiple schemas :)

Schemas in PostgreSQL are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to differentiate between entire sets of data, while schemas are more like data entities.

For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences, etc. that are needed for the user management.

Databases are entire programs, schemas are components.

夜无邪 2024-08-02 23:49:03

在 PostgreSQL 上下文中,我建议使用一个具有多个模式的数据库,因为您可以(例如)跨模式进行 UNION ALL,但不能跨数据库。 因此,一个数据库实际上与另一个数据库完全隔离,而模式并不与同一数据库中的其他模式隔离。

如果您出于某种原因将来必须跨模式整合数据,那么在多个模式上执行此操作将很容易。 对于多个数据库,您将需要多个数据库连接,并通过应用程序逻辑“手动”从每个数据库收集和合并数据。

后者在某些情况下具有优势,但对于主要部分,我认为单数据库多模式方法更有用。

In a PostgreSQL context I recommend to use one db with multiple schemas, as you can (e.g.) UNION ALL across schemas, but not across databases. For that reason, a database is really completely insulated from another database while schemas are not insulated from other schemas within the same database.

If you -for some reason- have to consolidate data across schemas in the future, it will be easy to do this over multiple schemas. With multiple databases you would need multiple db-connections and collect and merge the data from each database "manually" by application logic.

The latter have advantages in some cases, but for the major part I think the one-database-multiple-schemas approach is more useful.

天邊彩虹 2024-08-02 23:49:03

许多模式应该比许多数据库更轻量级,尽管我找不到证实这一点的参考文献。

但是,如果您确实想让事情保持非常独立(而不是重构 Web 应用程序以便将“客户”列添加到您的表中),您可能仍然希望使用单独的数据库:我断言您可以更轻松地恢复以这种方式访问​​特定客户的数据库,而不会干扰其他客户。

A number of schemas should be more lightweight than a number of databases, although I cannot find a reference which confirms this.

But if you really want to keep things very separate (instead of refactoring the web application so that a "customer" column is added to your tables), you may still want to use separate databases: I assert that you can more easily make restores of a particular customer's database this way -- without disturbing the other customers.

┊风居住的梦幻卍 2024-08-02 23:49:03

这取决于系统的可用性和连接性的设计方式。 这些数据库中存储的数据是什么。如果它们是链接数据,则它们可以保存在单个数据库实例上,但如果它们部分链接并且在一个系统关闭时可以部分运行,那么它们必须位于不同的实例上。

详细说明:-

1) 当您使用一个数据库实例并且使用多个数据库时,您会遇到这样的问题:如果您的连接中断(由于系统崩溃或 mysql 服务器宕机),所有数据库都会恢复正常同一实例上的应用程序也已关闭,因此您的所有应用程序都会受到影响。

2)当您为每个数据库分离数据库实例时,如果任何一个数据库系统发生故障,您的其他应用程序不会受到影响。因此其他应用程序只能运行依赖于故障数据库受影响的应用程序。

另外,在这两种情况下,我认为您还必须使用复制机制,以便可以在从数据库上完成负载平衡。

It depends on how the availability and connectivity of your system is designed. What are the data that are stored in these Databases.If they are linked data, there they can be kept on single DB instance but if they are partially linked and can run partially if one system is down then it must be on different instances.

Detailed explanation:-

1) When you use one DB instance and in that you use multiple databases, then you are caught up with the issue that if your connection goes down(due to system crash or mysql server is down),all Databases as they are on same instance are also down, so all your applications are impacted.

2) When you separate DB instance for each Database,then if any one Database system is down,your other applications doesn't have impact.So other application can run only the application which depends on down DB is impacted.

Also,in both the cases i think you must also use replication mechanism so that load balancing can be done on slave Databases.

烈酒灼喉 2024-08-02 23:49:03

<块引用>

使用具有多个模式的单个数据库是一个好方法
在postgres数据库中实践是因为:

  1. postgres 中的数据库之间不共享任何数据。
  2. 与服务器的任何给定连接只能访问连接请求中指定的单个数据库中的数据。

使用多个模式:

  1. 允许多个用户使用同一个数据库而不互相干扰。
  2. 将数据库对象组织成逻辑组,使其更易于管理。
  3. 第三方应用程序可以放入单独的架构中,这样它们就不会与其他对象的名称发生冲突。

Working with single Database with multiple Schemas is good way to
practice in postgres database because:

  1. No any data is shared across databases in postgres.
  2. any given connection to the server can access only the data in the single database, the one specified in the connection request.

With using multiple schemas:

  1. To allow many users to use one database without interfering with eachother.
  2. To organize database objects into logical groups to make them more manageable.
  3. Third party applications can be put into separate schemas so they cannot collide with the names of other objects.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文