Storing a lot of images in a database: a good experience?

Posted 2024-11-07 22:02:30

I'm writing an app that will store a large number of image (and possibly video) files. After they're uploaded, they will immediately be pushed out to a cloud CDN for actual serving to the public. The idea is to keep the images in a reliable, back-uppable store. I anticipate on the order of 200,000 objects of up to 10KB each, and possibly a smaller number of video files of a few MB.

By default I would go with Postgres, which the documentation suggests would be OK.

  • Is this a sensible idea?
  • Will it make backing up the database a complete nightmare? Experiences?
  • Any reliability issues?
  • Will this affect the performance for other parts of the db? Bear in mind that the db will only be hit once or twice for each image.
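
For a rough sense of scale, the figures in the question can be turned into a back-of-the-envelope estimate. This is only a sketch: the question gives no video count or average video size, so `video_count` and `video_mb` below are placeholder assumptions, and database overhead (indexes, row headers, write-ahead log) is ignored.

```python
# Back-of-the-envelope storage estimate for the workload described above.
# video_count and video_mb are hypothetical; the question gives no exact figures.

KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def estimate_bytes(image_count, image_kb, video_count, video_mb):
    """Return the raw payload size in bytes, ignoring database overhead."""
    return image_count * image_kb * KB + video_count * video_mb * MB


# Upper bound for the images alone: 200,000 objects of up to 10 KB each.
images_only = estimate_bytes(200_000, 10, 0, 0)
print(f"images: {images_only / GB:.2f} GB")

# Assumed example: 1,000 videos of 5 MB each (placeholder numbers).
with_videos = estimate_bytes(200_000, 10, 1_000, 5)
print(f"images + videos: {with_videos / GB:.2f} GB")
```

The images alone come to roughly 2 GB at most, so a few hundred videos of a few MB each would already match them in size.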

Comments (3)

过潦 2024-11-14 22:02:30

I have experience storing images in a database this way in Oracle and MySQL. Performance and reliability are not an issue. Backing up is: your backups will get very large. Since backing up is time-consuming and expensive, it might be a good idea to save space. If you can shrink your database by 80% just by removing the images from it, it might be a good idea to store them elsewhere. Backing up separate files is more efficient, because you can easily create incremental backups containing only new and modified images.
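
As an illustration of why separate files are easy to back up incrementally, here is a minimal sketch that copies only new or changed files into a backup directory. The function names and the choice of SHA-256 content hashes are assumptions made for this example, not something from the answer above; in practice an existing tool such as rsync would usually do this job.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def incremental_backup(source_dir, backup_dir):
    """Copy files that are new or changed since the last run; return their names."""
    source, backup = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        # Skip files whose backup copy already has identical contents.
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied
```

The first run copies everything. Later runs copy only what was added or modified, which is the property a full database dump containing the images does not have.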

浴红衣 2024-11-14 22:02:30

I have experience with PostgreSQL storing images as bytea (a BLOB-like data type), which was a good experience, and with storing images in a "dual solution" (images on the filesystem, metadata in a database such as MySQL or PostgreSQL), which I do not recommend.

There are three aspects, or architectural considerations, that can help with the decision:

  1. Unified solution or not? Image volume (both the size and the number of images) keeps growing in all applications, so a "unified solution" is the goal. Example: Wikimedia is a unified and specialized solution for Wikipedia.
  2. Direct or indirect storage? Like the old "dual solutions" that do not store the image in the SQL table, some solutions use an external database or an external data pointer. In PostgreSQL, BLOB data types (large objects) use indirect storage, kept outside your table and backed up separately, while the bytea data type is direct (backed up with the table). The choice needs technical and performance considerations.
  3. Original or processed images? We need to distinguish the "original image" from a "processed image" such as a thumbnail, which needs database storage (for caching!) but does not need to be backed up.

I recommend:

  • Store as blob (Binary Large OBject, with indirect storage) in your table: for storing original images, with a separate backup. See Ivan's answer, the PostgreSQL additional supplied modules, how-tos, etc.

  • Store as bytea (or blob) in a separate database (with DBlink): for storing original images in another (unified) database. In this case I prefer bytea, but blob is nearly the same. A separate database is the best way to build a "unified image web service".

  • Store as bytea (BYTE Array, with direct storage) in your table: for caching processed images (typically thumbnails). Cache the small images so they can be sent quickly to the web browser (avoiding rendering problems) and to reduce server processing. Also cache the essential metadata, such as width and height. Database caching is the easiest way, but check your needs and server configuration (e.g. Apache modules): storing thumbnails on the filesystem may be better, so compare performance. Remember that this is a (unified) web service, so the cache can live in a separate database with no backups, serving many tables. See also the PostgreSQL binary data types manual, tests with a bytea column, etc.
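
The split between originals (which need backing up) and cached thumbnails (which can be rebuilt) can be sketched roughly as follows. To keep the example runnable without a server, it uses SQLite from the Python standard library as a stand-in; in PostgreSQL the `data` columns would be `bytea`. All table, column, and function names are hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; its BLOB columns play the role of bytea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Originals: the data that must be backed up.
    CREATE TABLE original_image (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        data BLOB NOT NULL
    );
    -- Derived thumbnails: a cache that can be rebuilt, so it needs no backup.
    CREATE TABLE thumbnail_cache (
        image_id INTEGER PRIMARY KEY REFERENCES original_image(id),
        width    INTEGER NOT NULL,
        height   INTEGER NOT NULL,
        data     BLOB NOT NULL
    );
""")


def store_original(name, data):
    """Insert an original image and return its id."""
    cur = conn.execute(
        "INSERT INTO original_image (name, data) VALUES (?, ?)", (name, data)
    )
    return cur.lastrowid


def cache_thumbnail(image_id, width, height, data):
    """Insert or refresh the cached thumbnail and its metadata."""
    conn.execute(
        "INSERT OR REPLACE INTO thumbnail_cache VALUES (?, ?, ?, ?)",
        (image_id, width, height, data),
    )


def get_thumbnail(image_id):
    """Return (width, height, data) for a cached thumbnail, or None."""
    return conn.execute(
        "SELECT width, height, data FROM thumbnail_cache WHERE image_id = ?",
        (image_id,),
    ).fetchone()


image_id = store_original("cat.jpg", b"raw image bytes")
cache_thumbnail(image_id, 120, 80, b"thumbnail bytes")
```

Because the cache table holds only derived data, it can be emptied or left out of backups without losing anything that cannot be regenerated from `original_image`.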

青衫儰鉨ミ守葔 2024-11-14 22:02:30

My experience is limited to SQL Server, but I have several million PDF files larger than 10KB in a database that is still performing quite nicely. Of course, indexes are required. A full database backup takes no longer than expected with that amount of data. Again, this is for MS SQL Server!
