How to return the latest distinct rows in Elasticsearch, ignoring the timestamp field
I have documents like:
{'foo': 'foo', 'bar': 'bar', ..., 'timestamp': 3}
{'foo': 'diffval', 'bar': 'diffval', ..., 'timestamp': 2}
{'foo': 'foo', 'bar': 'bar', ..., 'timestamp': 2}
{'foo': 'foo', 'bar': 'bar', ..., 'timestamp': 1}
I search them via _search?from=0&size=20&sort=timestamp%3Adesc
I would now like to return only the latest distinct rows, e.g.:
{'foo': 'foo', 'bar': 'bar', ..., 'timestamp': 3}
{'foo': 'diffval', 'bar': 'diffval', ..., 'timestamp': 2}
But I would like to do this without explicitly listing the foo, bar, etc. fields, as there could be many of them and they are not consistently present. The timestamp field, however, is always there.
Comments (1)
I have found a solution: before storing each document, I compute a hash field over all of its fields apart from the timestamp. Then I use the collapse functionality in OpenSearch on that hash field, with the search sorted by timestamp descending. The hits then contain only the latest document for each distinct hash.
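
The approach above could be sketched roughly as follows. This is only an illustration, not the exact code used: the helper names content_hash and with_hash are made up for this example, SHA-256 over a key-sorted JSON serialization is one possible choice of hash, and it assumes every field value is JSON-serializable. It also assumes the hash field is mapped as keyword, since collapse needs a single-valued keyword or numeric field with doc values.

```python
import hashlib
import json


def content_hash(doc, ignore=("timestamp",)):
    """Return a stable hex digest of every field except those in `ignore`.

    Hypothetical helper: the hash algorithm and serialization are assumptions.
    """
    payload = {k: v for k, v in doc.items() if k not in ignore}
    # sort_keys makes the digest independent of the order of the keys
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_hash(doc):
    """Return a copy of `doc` with the extra `hash` field, ready for indexing."""
    return {**doc, "hash": content_hash(doc)}


# Request body for _search: newest first, one hit per distinct hash.
# Assumes `hash` is mapped as a keyword field.
search_body = {
    "from": 0,
    "size": 20,
    "sort": [{"timestamp": {"order": "desc"}}],
    "collapse": {"field": "hash"},
}
```

Each document would be passed through with_hash before indexing, and search_body sent as the JSON body of the _search request. Because the hits are sorted by timestamp descending, the representative hit of each collapsed group is the newest document with that hash.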