如何使用线性插值填充缺失的字段值?
我有一组代表特定日期值的文档。
其中一些日期没有值(该字段可能缺失或设置为 null
)。
我想通过使用按日期排序的最接近的先前和后续文档进行线性插值来填充那些缺失或未定义的值。
这样:
{ date: ISODate("2021-03-07"), value: 10 }
{ date: ISODate("2021-03-08") }
{ date: ISODate("2021-03-09"), value: 15 }
{ date: ISODate("2021-03-10"), value: null }
{ date: ISODate("2021-03-11") }
{ date: ISODate("2021-03-12"), value: 3 }
变为:
{ date: ISODate("2021-03-07"), value: 10 }
{ date: ISODate("2021-03-08"), value: 12.5 } <=
{ date: ISODate("2021-03-09"), value: 15 }
{ date: ISODate("2021-03-10"), value: 11 } <=
{ date: ISODate("2021-03-11"), value: 7 } <=
{ date: ISODate("2021-03-12"), value: 3 }
I have a collection of documents representing values at specific dates.
Some of those dates don't have a value (the field can be missing or set to null
).
I'd like to fill in those missing or undefined values by linearly interpolating with the closest previous and following documents ordered by date.
Such that:
{ date: ISODate("2021-03-07"), value: 10 }
{ date: ISODate("2021-03-08") }
{ date: ISODate("2021-03-09"), value: 15 }
{ date: ISODate("2021-03-10"), value: null }
{ date: ISODate("2021-03-11") }
{ date: ISODate("2021-03-12"), value: 3 }
becomes:
{ date: ISODate("2021-03-07"), value: 10 }
{ date: ISODate("2021-03-08"), value: 12.5 } <=
{ date: ISODate("2021-03-09"), value: 15 }
{ date: ISODate("2021-03-10"), value: 11 } <=
{ date: ISODate("2021-03-11"), value: 7 } <=
{ date: ISODate("2021-03-12"), value: 3 }
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
从
Mongo 5.3
开始,这是新的$fill
聚合运算符:文档按
日期
按时间顺序排序(sortBy: { date: 1 }
),以便使用线性
插值填充缺失的值
(value: { method : 前一个值和后一个值之间的“线性”}
)。请注意,这实际上是基于所选
sortBy
字段的线性插值,这样,如果您缺少日期,您将正确得到:另请注意,如果您有上面提到的间隙,并且还想插入文档来代替这些间隙,您可以使用
$densify
阶段以获得最终的缺失数据填充!:Starting in
Mongo 5.3
, it's a nice use case for the new$fill
aggregation operator:Documents are chronologically ordered by
date
(sortBy: { date: 1 }
), such that missingvalue
s are populated using alinear
interpolation (value: { method: "linear" }
) between previous and following values.Note that this really is a linear interpolation based on the chosen
sortBy
field, such that if you had a missing date, you'd correctly get:Also note that if you have gaps as mentioned just above and and also want to insert documents in place of those gaps, you can use a
$densify
stage to get the ultimate missing data filler!: