Querying MongoDB GridFS?

Posted on 2024-12-21 08:21:12 · 240 words · 1 view · 0 comments

I have a blogging system that stores uploaded files in GridFS. The problem is, I don't understand how to query it!

I am using Mongoose with NodeJS, which doesn't yet support GridFS, so I am using the actual mongodb module for the GridFS operations. There doesn't SEEM to be a way to query the file metadata the way you query documents in a regular collection.

Would it be wise to store the metadata in a separate document pointing to the GridFS objectId, so it can be queried easily?

Any help would be GREATLY appreciated, I'm kinda stuck :/

Comments (4)

贱贱哒 2024-12-28 08:21:12

GridFS works by storing a number of chunks for each file. This way, you can store and deliver very large files without holding the entire file in RAM. It also lets you store files that are larger than the maximum document size. The default chunk size when this was written was 256 KB (262144 bytes); current drivers default to 255 KB.
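To make the chunking idea concrete, here is a minimal in-memory sketch of how a file might be sliced into chunk documents. It is not what the driver actually runs: `splitIntoChunks` and the `"demo-file-id"` value are hypothetical names made up for illustration, and the sizes are simply a 486912-byte file with 262144-byte chunks.

```javascript
// Illustrative only: mimics how GridFS slices a file into fs.chunks-style documents.
// splitIntoChunks and the filesId value passed in are hypothetical, not driver APIs.
function splitIntoChunks(fileBuffer, chunkSize, filesId) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < fileBuffer.length; offset += chunkSize, n += 1) {
    chunks.push({
      files_id: filesId,
      n, // sequence number of this chunk, starting at 0
      data: fileBuffer.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

// A 486912-byte file with 262144-byte chunks needs ceil(486912 / 262144) = 2 chunks.
const demoChunks = splitIntoChunks(Buffer.alloc(486912), 262144, "demo-file-id");
console.log(demoChunks.length);         // 2
console.log(demoChunks[1].data.length); // 224768 (the final, shorter chunk)
```

Only the last chunk is shorter than `chunkSize`; every earlier chunk is exactly `chunkSize` bytes, which is why no single document has to hold the whole file.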

The file metadata field can be used to store additional file-specific metadata, which can be more efficient than storing the metadata in a separate document. This greatly depends on your exact requirements, but the metadata field, in general, offers a lot of flexibility. Keep in mind that some of the more obvious metadata is already part of the fs.files document, by default:

> db.fs.files.findOne();
{
    "_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "filename" : "2e117dc7f5ba434c90be29c767426c29",
    "length" : 486912,
    "chunkSize" : 262144,
    "uploadDate" : ISODate("2011-10-18T09:05:54.851Z"),
    "md5" : "4f31970165766913fdece5417f7fa4a8",
    "contentType" : "application/pdf"
}

To actually read the file from GridFS you'll have to fetch the file document from fs.files and the chunks from fs.chunks. The most efficient way to do that is to stream this to the client chunk-by-chunk, so you don't have to load the entire file in RAM. The chunks collection has the following structure:

> db.fs.chunks.findOne({}, {"data" :0});
{
    "_id" : ObjectId("4e9d4172b2ceac15506445e1"),
    "files_id" : ObjectId("4f9d4172b2ceac15506445e1"),
    "n" : 0, // this is the 0th chunk of the file
    "data" : /* loads of data */
}
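As a rough sketch of the read path described above, the following in-memory example joins chunk documents back into a file by ordering them on `n`. `assembleFile` is a hypothetical helper, the ids are plain strings rather than ObjectIds (which would need an `equals` comparison instead of `===`), and everything is buffered for brevity, whereas a real application would stream chunk by chunk.

```javascript
// Illustrative only: rebuilds a file from fs.chunks-style documents held in memory.
// assembleFile is a hypothetical helper; string ids stand in for real ObjectIds.
function assembleFile(fileDoc, chunkDocs) {
  const ordered = chunkDocs
    .filter((chunk) => chunk.files_id === fileDoc._id)
    .sort((a, b) => a.n - b.n); // chunks must be joined in ascending "n" order
  const data = Buffer.concat(ordered.map((chunk) => chunk.data));
  if (data.length !== fileDoc.length) {
    throw new Error(`expected ${fileDoc.length} bytes, got ${data.length}`);
  }
  return data;
}

const sampleFile = { _id: "file-1", length: 11, chunkSize: 4 };
const sampleChunks = [
  { files_id: "file-1", n: 2, data: Buffer.from("rld") },
  { files_id: "file-1", n: 0, data: Buffer.from("hell") },
  { files_id: "other", n: 0, data: Buffer.from("zzzz") },
  { files_id: "file-1", n: 1, data: Buffer.from("o wo") },
];
console.log(assembleFile(sampleFile, sampleChunks).toString()); // "hello world"
```

With the Node.js mongodb driver you would normally not do this by hand: `GridFSBucket` offers `openDownloadStream(id)`, which yields the chunks in order as a readable stream.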

If you want to use the metadata field of fs.files for your queries, make sure you understand the dot notation, e.g.

> db.fs.files.find({"metadata.OwnerId": new ObjectId("..."), 
                    "metadata.ImageWidth" : 280});

Also make sure your queries can use an index; you can check this with explain().
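If the metadata criteria arrive as a plain object in your Node.js code, a small helper can turn them into a dot-notation filter. This is only a sketch: `buildMetadataFilter` is a hypothetical name, and the string `"owner-123"` is a placeholder where a real ObjectId would go.

```javascript
// Hypothetical helper: prefixes every key with "metadata." so that each field
// is matched individually via dot notation.
function buildMetadataFilter(fields) {
  const filter = {};
  for (const [key, value] of Object.entries(fields)) {
    filter[`metadata.${key}`] = value;
  }
  return filter;
}

// "owner-123" is a placeholder; in a real query this would be an ObjectId.
const metadataFilter = buildMetadataFilter({ OwnerId: "owner-123", ImageWidth: 280 });
console.log(metadataFilter);
```

The resulting object can be passed to `db.collection("fs.files").find(metadataFilter)` with the mongodb module, and an index such as `createIndex({ "metadata.OwnerId": 1 })` on that collection is the kind of thing explain() would confirm is being used.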

爱的故事 2024-12-28 08:21:12

As the specification says, you can store whatever you want in the metadata field.

Here's what a document from the files collection looks like:

Required fields

{
  "_id" : <unspecified>,                  // unique ID for this file
  "length" : data_number,                 // size of the file in bytes
  "chunkSize" : data_number,              // size of each of the chunks.  Default is 256k
  "uploadDate" : data_date,               // date when object first stored
  "md5" : data_string                     // result of running the "filemd5" command on this file's chunks
}

Optional fields

{    
  "filename" : data_string,               // human name for the file
  "contentType" : data_string,            // valid mime type for the object
  "aliases" : data_array of data_string,  // optional array of alias strings
  "metadata" : data_object,               // anything the user wants to store
}

So store anything you want in the metadata and query it normally like you would in MongoDB:

db.fs.files.find({"metadata.some_info" : "sample"});

南笙 2024-12-28 08:21:12

I know the question doesn't ask about the Java way of querying for metadata, but here it is, assuming you add gender as a metadata field:

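import com.mongodb.DB;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.util.JSON;
import java.util.List;

// Get your database's GridFS (legacy driver: the GridFS constructor takes a DB, not a name)
DB db = new MongoClient().getDB("myDatabase");
GridFS gfs = new GridFS(db);

// Write out your JSON query within JSON.parse() and cast it as a DBObject.
// Dot notation matches the gender field even when metadata holds other fields too.
DBObject dbObject = (DBObject) JSON.parse("{'metadata.gender': 'Male'}");

// Querying action (find)
List<GridFSDBFile> gridFSDBFiles = gfs.find(dbObject);

// Loop through the results
for (GridFSDBFile gridFSDBFile : gridFSDBFiles) {
    System.out.println(gridFSDBFile.getFilename());
}

红焚 2024-12-28 08:21:12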

Metadata is stored in the metadata field. You can query it like this:

db.fs.files.find({metadata: {content_type: 'text/html'}}) 
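One caveat worth illustrating: a filter of the form `{metadata: {...}}` compares the whole metadata subdocument, so it only matches files whose metadata contains exactly those fields, whereas `"metadata.content_type"` matches on that one field. The sketch below imitates the two behaviours in plain JavaScript. Both helpers are hypothetical simplifications, not the server's real matcher.

```javascript
// Simplified stand-ins for two MongoDB match styles; not the server's real matcher.
// Both helper names are hypothetical.
function matchesWholeSubdocument(doc, field, expected) {
  // Whole-subdocument equality: same fields, same values, same order.
  return JSON.stringify(doc[field]) === JSON.stringify(expected);
}

function matchesDotPath(doc, path, expected) {
  // Dot notation: walk the path and compare only that nested value.
  const value = path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), doc);
  return value === expected;
}

const htmlFile = { metadata: { content_type: "text/html", author: "alice" } };

console.log(matchesWholeSubdocument(htmlFile, "metadata", { content_type: "text/html" })); // false
console.log(matchesDotPath(htmlFile, "metadata.content_type", "text/html"));               // true
```

So if your files carry more than one metadata field, `db.fs.files.find({"metadata.content_type": "text/html"})` is usually the form you want.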