What happens to a $match term in a pipeline?
I'm a newbie to MongoDB and Python scripts. I'm confused about how a $match term is handled in a pipeline.
Let's say I manage a library, where books are tracked as JSON documents in MongoDB. There is one JSON document for each copy of a book. The book.json files look like this:
{
    "Title": "A Tale of Two Cities",
    "subData":
    {
        "status": "Checked In"
        ...more data here...
    }
}
Here, status will be one string from a finite set of strings, perhaps just: { "Checked In", "Checked Out", "Missing", etc. } But note also that there may not be a status field at all:
{
    "Title": "Great Expectations",
    "subData":
    {
        ...more data here...
    }
}
Okay: I am trying to write a MongoDB pipeline within a Python script that does the following:
- For each book in the library:
    - Groups and counts the different instances of the status field
So my target output from my Python script would be something like this:
{ "A Tale of Two Cities" 'Checked In' 3 }
{ "A Tale of Two Cities" 'Checked Out' 4 }
{ "Great Expectations" 'Checked In' 5 }
{ "Great Expectations" '' 7 }
Here's my code:
mydatabase = client.JSON_DB
mycollection = mydatabase.JSON_all_2
listOfBooks = mycollection.distinct("bookname")
for book in listOfBooks:
    match_variable = {
        "$match": { 'Title': book }
    }
    group_variable = {
        "$group":{
            '_id': '$subdata.status',
            'categories' : { '$addToSet' : '$subdata.status' },
            'count': { '$sum': 1 }
        }
    }
    project_variable = {
        "$project": {
            '_id': 0,
            'categories' : 1,
            'count' : 1
        }
    }
    pipeline = [
        match_variable,
        group_variable,
        project_variable
    ]
    results = mycollection.aggregate(pipeline)
    for result in results:
        print(str(result['Title'])+" "+str(result['categories'])+" "+str(result['count']))
As you can probably tell, I have very little idea what I'm doing. When I run the code, I get an error because I'm trying to reference my $match term:
Traceback (most recent call last):
  File "testScript.py", line 34, in main
    print(str(result['Title'])+" "+str(result['categories'])+" "+str(result['count']))
KeyError: 'Title'
So a $match term is not included in the pipeline? Or am I not including it in the group_variable or project_variable?
And on a general note, the above seems like a lot of code to do something relatively easy. Does anyone see a better way? It's easy to find simple examples online, but this is one step of complexity beyond anything I can locate. Thank you.
Comments (1)
Here's one aggregation pipeline to "$group" all the books by "Title" and "subData.status".
Example output for one of the "books":
Try it on mongoplayground.net.
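For reference, here is a minimal sketch of what grouping by "Title" and "subData.status" could look like from Python. It is an illustration under stated assumptions, not necessarily the exact pipeline referred to above. The background is that "$group" outputs only "_id" plus the accumulator fields it defines, so a field used in "$match" (such as "Title") is not carried through unless it is part of the group key, which is why the question's code raises KeyError: 'Title'. Putting "Title" into a compound "_id" removes the need for the per-book loop and the "$match" stage. The function names build_status_count_pipeline and count_statuses_locally are hypothetical, the choice of "" for a missing status follows the question's target output, and the local helper is only a rough stand-in for checking the logic without a server.

```python
from collections import Counter


def build_status_count_pipeline():
    """Build a pipeline that counts book copies per (Title, status) pair."""
    return [
        {
            "$group": {
                # Compound _id: one group per distinct Title/status pair.
                # $ifNull maps a missing (or null) status to "".
                "_id": {
                    "Title": "$Title",
                    "status": {"$ifNull": ["$subData.status", ""]},
                },
                "count": {"$sum": 1},
            }
        },
        {
            # Lift the grouping keys back up to top-level fields.
            "$project": {
                "_id": 0,
                "Title": "$_id.Title",
                "status": "$_id.status",
                "count": 1,
            }
        },
        {"$sort": {"Title": 1, "status": 1}},
    ]


def count_statuses_locally(docs):
    """Rough pure-Python stand-in for the pipeline (no server needed)."""
    counts = Counter(
        (doc.get("Title"), (doc.get("subData") or {}).get("status") or "")
        for doc in docs
    )
    return [
        {"Title": title, "status": status, "count": n}
        for (title, status), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Made-up sample documents shaped like the ones in the question.
    sample = [
        {"Title": "A Tale of Two Cities", "subData": {"status": "Checked In"}},
        {"Title": "A Tale of Two Cities", "subData": {"status": "Checked Out"}},
        {"Title": "Great Expectations", "subData": {}},
    ]
    for row in count_statuses_locally(sample):
        print(row["Title"], repr(row["status"]), row["count"])

    # Against a real collection (mycollection as in the question):
    # for row in mycollection.aggregate(build_status_count_pipeline()):
    #     print(row["Title"], repr(row["status"]), row["count"])
```

Note that the field path is written "$subData.status" with a capital D, matching the sample documents; field names in MongoDB are case-sensitive, so "$subdata.status" would not match them.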