Options for storing large text blobs in an SQL database?
I have some large volumes of text (log files) which may be very large (up to gigabytes). They are associated with entities which I'm storing in a database, and I'm trying to figure out whether I should store them within the SQL database, or in external files.
It seems like in-database storage may be limited to 4GB for LONGTEXT fields in MySQL, and presumably other DBs have similar limits. Also, storing in the database presumably precludes any kind of seeking when viewing this data -- I'd have to load the full length of the data to render any part of it, right?
So it seems like I'm leaning towards storing this data out-of-DB: are my misgivings about storing large blobs in the database valid, and if I'm going to store them out of the database then are there any frameworks/libraries to help with that?
(I'm working in Python but am interested in technologies in other languages too)
Comments (2)
Your misgivings are valid.
Databases gained the ability to handle large binary and text fields some years ago, and after everybody tried it, we gave up.
The problem stems from the fact that your operations on large objects tend to be very different from your operations on the atomic values. So the code gets difficult and inconsistent.
So most veterans just go with storing them on the filesystem with a pointer in the db.
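To make the "file on disk, pointer in the db" idea concrete, here is a minimal sketch in Python. It assumes SQLite via the standard `sqlite3` module as a stand-in for whatever database you use, and the table name `entity_logs` and the helpers `store_log` and `read_log_slice` are made up for illustration. The point is that seeking becomes an ordinary file operation, so you can render any part of a log without loading the rest.

```python
import os
import sqlite3
import tempfile


def store_log(conn, entity_id, data, log_dir):
    """Write the log bytes to a file and record only its path in the database."""
    path = os.path.join(log_dir, f"entity_{entity_id}.log")
    with open(path, "wb") as f:
        f.write(data)
    conn.execute(
        "INSERT INTO entity_logs (entity_id, path) VALUES (?, ?)",
        (entity_id, path),
    )
    conn.commit()
    return path


def read_log_slice(conn, entity_id, offset, length):
    """Look up the file path in the database, then seek and read one slice."""
    row = conn.execute(
        "SELECT path FROM entity_logs WHERE entity_id = ?", (entity_id,)
    ).fetchone()
    if row is None:
        raise KeyError(entity_id)
    with open(row[0], "rb") as f:
        f.seek(offset)          # jump straight to the requested byte offset
        return f.read(length)   # only `length` bytes are read into memory


# Hypothetical schema: one row per entity, holding a pointer to its log file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_logs (entity_id INTEGER PRIMARY KEY, path TEXT NOT NULL)"
)
log_dir = tempfile.mkdtemp()
store_log(conn, 1, b"line one\nline two\nline three\n", log_dir)
print(read_log_slice(conn, 1, 9, 8))  # b'line two'
```

The trade-off is that the file and the row are no longer covered by one transaction, so deletes and backups have to handle both.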
I know PHP, MySQL, Oracle, and probably others let you work with large database objects as if you had a file pointer, which gets around the memory issues.
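As a rough illustration of reading part of a stored value without pulling the whole thing into your program, here is a sketch using SQLite from Python. Note that this is not the file-pointer API mentioned above: it swaps in a plain SQL `substr()` call on a BLOB column, and the table `logs` and helper `read_blob_slice` are hypothetical. Whether the engine avoids touching the rest of the value internally depends on the database, but the client only receives the requested slice.

```python
import sqlite3


def read_blob_slice(conn, log_id, offset, length):
    """Return `length` bytes starting at a 0-based `offset` of a stored BLOB."""
    # SQLite's substr() is 1-indexed and counts bytes when applied to a BLOB.
    row = conn.execute(
        "SELECT substr(body, ?, ?) FROM logs WHERE id = ?",
        (offset + 1, length, log_id),
    ).fetchone()
    if row is None:
        raise KeyError(log_id)
    return row[0]


# Hypothetical schema: the log body lives in the database itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body BLOB NOT NULL)")
conn.execute(
    "INSERT INTO logs (id, body) VALUES (?, ?)",
    (1, b"line one\nline two\nline three\n"),
)
conn.commit()
print(read_blob_slice(conn, 1, 9, 8))  # b'line two'
```

If you are on Python 3.11 or later, the `sqlite3` module also offers `Connection.blobopen()`, which returns a handle with `read()` and `seek()` and is closer to the file-pointer style described above.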