Filtering search results by operating hours with elasticsearch

Posted on 2024-11-30 05:09:30

I'm using elasticsearch to index and search locations, and I've run into one particular issue, filtering by operating hours, that I don't know how to solve.

Basically, each location has operating hours for every day of the week, and each day may have more than one set of operating hours (we use 2 for now).

For example:
Monday:
open 9am / close 12pm
open 1pm / close 9pm

Given the current time and the current day of the week, I need to search for the "open" locations.

I don't know how I should index these operating hours together with the location details, or how to use them to filter the results. Any help or suggestions would be really appreciated.

Regards

Comments (3)

夜深人未静 2024-12-07 05:09:30

A better way to do this would be to use nested documents.

First: set up your mapping to specify that the hours document should be treated as nested:

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1'  -d '
{
   "mappings" : {
      "location" : {
         "properties" : {
            "hours" : {
               "include_in_root" : 1,
               "type" : "nested",
               "properties" : {
                  "open" : {
                     "type" : "short"
                  },
                  "close" : {
                     "type" : "short"
                  },
                  "day" : {
                     "index" : "not_analyzed",
                     "type" : "string"
                  }
               }
            },
            "name" : {
               "type" : "string"
            }
         }
      }
   }
}
'

Add some data: (note the multiple values for opening hours)

curl -XPOST 'http://127.0.0.1:9200/foo/location?pretty=1'  -d '
{
   "name" : "Test",
   "hours" : [
      {
         "open" : 9,
         "close" : 12,
         "day" : "monday"
      },
      {
         "open" : 13,
         "close" : 17,
         "day" : "monday"
      }
   ]
}
'

Then run your query, filtering by the current day and time:

curl -XGET 'http://127.0.0.1:9200/foo/location/_search?pretty=1'  -d '
{
   "query" : {
      "filtered" : {
         "query" : {
            "text" : {
               "name" : "test"
            }
         },
         "filter" : {
            "nested" : {
               "path" : "hours",
               "filter" : {
                  "and" : [
                     {
                        "term" : {
                           "hours.day" : "monday"
                        }
                     },
                     {
                        "range" : {
                           "hours.close" : {
                              "gte" : 10
                           }
                        }
                     },
                     {
                        "range" : {
                           "hours.open" : {
                              "lte" : 10
                           }
                        }
                     }
                  ]
               }
            }
         }
      }
   }
}
'

This should work.

Unfortunately, in 0.17.5 it throws an NPE. This is likely a simple bug which will be fixed shortly. I have opened an issue for it here: https://github.com/elasticsearch/elasticsearch/issues/1263

UPDATE: Bizarrely, I now can't reproduce the NPE. This query seems to work correctly on version 0.17.5 and above. It must have been some temporary glitch.
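
The day and hour in the query above are hard-coded. As a minimal sketch, the same nested filter could be generated from a clock reading in Python. The function name `open_now_filter` is hypothetical, and the sketch assumes that days are indexed as lowercase English names and hours as whole numbers, as in the sample document.

```python
from datetime import datetime

# Lowercase day names as stored in the "day" field of the sample document.
DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]


def open_now_filter(now):
    """Build the nested filter from the answer for a given datetime.

    Hypothetical helper: assumes whole-hour values (0-23) and lowercase
    English day names, matching the mapping and sample data above.
    """
    day = DAYS[now.weekday()]  # datetime.weekday(): Monday == 0
    hour = now.hour
    return {
        "nested": {
            "path": "hours",
            "filter": {
                "and": [
                    {"term": {"hours.day": day}},
                    {"range": {"hours.close": {"gte": hour}}},
                    {"range": {"hours.open": {"lte": hour}}},
                ]
            },
        }
    }


if __name__ == "__main__":
    import json
    # 2011-08-22 was a Monday; 10:00 reproduces the query shown above.
    print(json.dumps(open_now_filter(datetime(2011, 8, 22, 10, 0)), indent=2))
```

The returned dictionary would take the place of the "filter" value inside the "filtered" query.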

clint

山人契 2024-12-07 05:09:30

The above solution doesn't work: if a location is open 2-4 on Monday and 6-8 on Tuesday, then filtering for Monday at 6 will return the document. Below is some pseudo-JSON to illustrate how it should be done.

{
    "business_document": "...",
    "hours": {
        "1": [
            {
                "open": 930,
                "close": 1330
            },
            {
                "open": 1530,
                "close": 2130
            }
        ],
        "2": [
            {
                "open": 1000,
                "close": 2100
            }
        ],
        "3": [
            {
                "open": 1000,
                "close": 2100
            }
        ],
        "4": [
            {
                "open": 1000,
                "close": 2100
            }
        ],
        "5": [
            {
                "open": 1000,
                "close": 2100
            }
        ],
        "6": [
            {
                "open": 1000,
                "close": 2100
            }
        ],
        "7": [
            {
                "open": 930,
                "close": 1330
            },
            {
                "open": 1530,
                "close": 2130
            }
        ]
    }
} 


Sample filter (can be applied to any query for businesses):
{
    "filter": {
        "and": [ // Must match all of the following clauses
            {
                "range": {
                    "hours.1.open": { // Open hour of day 1 (current day)
                        "lte": 1343 // Store open time is at or before 13:43 (current time)
                    }
                }
            },
            {
                "range": {
                    "hours.1.close": { // Close hour of day 1 (current day)
                        "gte": 1343 // Store close time is at or after 13:43 (current time)
                    }
                }
            }
        ]
    }
}

All times should be in 24-hour format using a standard timezone (GMT).
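
As a rough sketch, the day key and time value used in the sample filter might be derived from a UTC clock reading like this. The helper name `day_filter` is hypothetical, and numbering the days 1 (Monday) through 7 (Sunday) is an assumption, since the answer does not say which day is day 1.

```python
from datetime import datetime, timezone


def day_filter(now_utc):
    """Build the open/close range filter for the given UTC datetime.

    Assumptions (not stated in the answer): day 1 is Monday and day 7 is
    Sunday; times are stored as HHMM integers, e.g. 13:43 -> 1343.
    """
    day = now_utc.isoweekday()                  # Monday == 1 ... Sunday == 7
    hhmm = now_utc.hour * 100 + now_utc.minute  # 13:43 -> 1343
    return {
        "filter": {
            "and": [
                {"range": {f"hours.{day}.open": {"lte": hhmm}}},
                {"range": {f"hours.{day}.close": {"gte": hhmm}}},
            ]
        }
    }


if __name__ == "__main__":
    print(day_filter(datetime.now(timezone.utc)))
```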

沧笙踏歌 2024-12-07 05:09:30

The simplest way to do it is to name and index the time slots during which a location is open. First, you need to come up with a scheme that assigns a name to each time slot in which a location can be open. For example, thu17 may represent 5 PM on Thursday. The location in your example should then be indexed with multiple "open" field values: mon09, mon10, mon11, mon13, mon14, mon15, mon16, mon17, mon18, mon19, mon20, tue09, tue10 and so on. To show only locations that are open on Thursday at 7 AM, you just need to add this filter to your query: open:thu07.
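
A minimal sketch of how such slot names could be generated at index time and at query time. The function names `slot_names` and `slot_for` are hypothetical, and one slot per whole hour is assumed, as in the thu17 example.

```python
from datetime import datetime

# Three-letter day codes; index 0 is Monday, matching datetime.weekday().
DAY_CODES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def slot_names(day, open_hour, close_hour):
    """Return one slot name per whole hour from open_hour up to,
    but not including, close_hour."""
    return [f"{day}{hour:02d}" for hour in range(open_hour, close_hour)]


def slot_for(now):
    """Return the slot name to filter on for the given datetime."""
    return f"{DAY_CODES[now.weekday()]}{now.hour:02d}"


if __name__ == "__main__":
    # Monday from the question: 9am-12pm and 1pm-9pm.
    print(slot_names("mon", 9, 12) + slot_names("mon", 13, 21))
    # 2011-08-25 was a Thursday.
    print(slot_for(datetime(2011, 8, 25, 7, 0)))
```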

You don't have to use this particular naming scheme. You can, for example, just count the number of hours from the beginning of the week. In this case, 9 AM on Monday would be 9, 11 PM on Monday would be 23, 2 AM on Tuesday would be 26, and so on.
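
The hour-counting variant can be sketched as below. The function name `week_hour` is hypothetical, and the week is assumed to start at midnight on Monday, which is consistent with the examples given.

```python
def week_hour(day_index, hour):
    """Hours elapsed since the start of the week.

    day_index: 0 for Monday through 6 for Sunday (assumed week start).
    hour: hour of the day, 0-23.
    """
    return day_index * 24 + hour


if __name__ == "__main__":
    print(week_hour(0, 9))   # 9 AM on Monday
    print(week_hour(0, 23))  # 11 PM on Monday
    print(week_hour(1, 2))   # 2 AM on Tuesday
```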
