Best practice for mixing an RDBMS with files on the filesystem
In one of the tables in the schema I am working on, I need to deal with a couple of thousand "data sheets", which are mostly PDF documents and sometimes image files such as PNG or JPG. The schema models an electronics distributor's portal, where new products are added to the portfolio frequently.
These documents (data sheets) are added when a new product is introduced, but they need updates from time to time (for example, because a newer version of the document is released, not because the product itself changed), so I expect the update to be an asynchronous procedure.
Given this, should I keep only the file name/path of the data sheets (and similar documents) in my table, with the actual files on the filesystem, or should I take the BLOB approach? I am almost certain that it should be the former, but I still wanted to ask for the community's advice and see whether there are pitfalls to watch out for.
Comments (1)
For completeness, let me just mention that some databases allow a "hybrid" of these two approaches, for example Oracle BFILE or MS SQL Server FILESTREAM.
There is also an interesting discussion at Ask Tom on storing files in Oracle BLOBs (in a nutshell: "BLOBs are better than files").
BTW, you don't necessarily need to choose one over the other. If you can afford the storage overhead and you are operating in a read-mostly environment, you could store the "master" data in the BLOB for integrity but "cache" that same data in a file for quick read-only access. Some considerations:
So, this is not the simplest approach but, depending on your needs, it may be a good tradeoff between integrity, performance and implementation effort.
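To make the "master in a BLOB, cached copy in a file" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: it uses SQLite instead of Oracle or SQL Server so that it runs anywhere, and the `datasheet` table, its columns, and the helper names `put_datasheet` and `cached_path` are all made up for this example. The cache file is named after the SHA-256 of the content, which is one possible way to notice that a cached copy is stale after an update.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def put_datasheet(conn, name, data):
    """Store or replace the master copy of a document in the database."""
    digest = hashlib.sha256(data).hexdigest()
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO datasheet (name, sha256, content) "
            "VALUES (?, ?, ?)",
            (name, digest, data),
        )
    return digest


def cached_path(conn, cache_dir, name):
    """Return a file holding the current version, refreshing the cache if needed."""
    row = conn.execute(
        "SELECT sha256 FROM datasheet WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    # The file name is the content hash, so a new document version
    # automatically maps to a new cache file.
    target = cache_dir / row[0]
    if not target.exists():
        (content,) = conn.execute(
            "SELECT content FROM datasheet WHERE name = ?", (name,)
        ).fetchone()
        tmp = target.with_suffix(".tmp")
        tmp.write_bytes(content)
        tmp.replace(target)  # rename, so readers never see a half-written file
    return target


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datasheet ("
    "name TEXT PRIMARY KEY, sha256 TEXT NOT NULL, content BLOB NOT NULL)"
)
cache_dir = Path(tempfile.mkdtemp())

put_datasheet(conn, "part-123.pdf", b"%PDF-1.4 version 1")
first = cached_path(conn, cache_dir, "part-123.pdf")

# A newer version of the document replaces the master copy;
# the next lookup writes a fresh cache file.
put_datasheet(conn, "part-123.pdf", b"%PDF-1.4 version 2")
second = cached_path(conn, cache_dir, "part-123.pdf")
```

In this sketch the database stays the single source of truth, and the cache directory can be deleted at any time because it is rebuilt on demand. Cleaning up cache files for superseded versions is left out and would need its own policy.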