Sorting/combining related arrays

Posted on 2024-11-02 22:43:20

There must be some algorithm that will make this easier than what I'm doing...

What I have are two arrays, each with two columns. One column in both is a timestamp, and the other in both is a measurement.

What needs to happen is turn this into a single array of: timestamp, measurement1, measurement2

The problem is the timestamps often won't match up exactly. One array might be missing a value completely for a time period, or the timestamps might be off by an insignificant amount (insignificant enough that it would be OK to assign both measurements to the same timestamp).

Is there some well-known way of doing this fuzzy merge operation? A simple public domain function??

Comments (1)

幸福还没到 2024-11-09 22:43:20

Start by asking yourself these questions: Do the arrays have the same number of elements? How do you want to combine two items with the same timestamp? How do you want to combine two items with different timestamps?

You will probably have to write the algorithm yourself. Something like this would be easy to implement:

  1. Start by sorting each of the arrays individually by timestamp.
  2. Declare two iterators, one at the beginning of each input array, and an empty output array.
  3. Then check which of the two current elements has the earlier timestamp. Call its array EARLY, and the other LATE.
    • If EARLY is close to LATE (by less than some constant), you apply a merge operation and insert the result at the end of the output array. Increment both iterators and go back to 3.
    • Otherwise, EARLY is far from LATE. You need to handle a missing value in the LATE array, perhaps by repeating the previous value or interpolating it using some function. Decide whether or not to insert a value in the output array. In this case you increment only the EARLY array iterator and go back to 3.
  4. If you have reached the end of either one of the arrays, the rest of the other array is LATE. You may want to interpret this as missing values and also repeat or interpolate the measurements.
  5. Return the output array.
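
Step 1 can be done with the standard library. The following is only a minimal sketch: the `Sample` struct and the helper names `earlier` and `sortByTimestamp` are assumptions made for illustration, since the question does not say how the two-column arrays are stored.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sample type: one timestamp and one measurement.
struct Sample {
    double timestamp;
    double measure;
};

// Orders samples by timestamp only; the measurement is ignored.
inline bool earlier(const Sample& lhs, const Sample& rhs) {
    return lhs.timestamp < rhs.timestamp;
}

// Sorts in place. stable_sort keeps samples with equal timestamps
// in their original relative order.
inline void sortByTimestamp(std::vector<Sample>& samples) {
    std::stable_sort(samples.begin(), samples.end(), earlier);
}
```

Each input array would be passed through `sortByTimestamp` once before the merge loop starts.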

#include <limits>
#include <vector>

// Assumed types and tolerance; adapt them to your data.
struct Sample { double timestamp; double measure; };
struct Merged { double timestamp; double measure1; double measure2; };
typedef std::vector<Sample> InputArray;
typedef std::vector<Merged> OutputArray;
const double TIMESTAMP_CLOSE_ENOUGH = 0.5;

// Both inputs must already be sorted by timestamp (step 1).
OutputArray merge(InputArray& a, InputArray& b) {
    OutputArray outputArray;
    Merged output;
    // output carries the previous values forward; NaN until an input has produced a sample.
    output.measure1 = output.measure2 = std::numeric_limits<double>::quiet_NaN();
    InputArray::iterator a_it = a.begin();
    InputArray::iterator b_it = b.begin();
    while (a_it != a.end() && b_it != b.end()) {
        const bool a_first = a_it->timestamp < b_it->timestamp;
        InputArray::iterator& early = a_first ? a_it : b_it;
        InputArray::iterator& late = a_first ? b_it : a_it;
        if (late->timestamp - early->timestamp < TIMESTAMP_CLOSE_ENOUGH) {
            output.timestamp = (late->timestamp + early->timestamp) / 2; // mean value
            output.measure1 = a_it->measure;
            output.measure2 = b_it->measure;
            ++a_it; ++b_it;
        } else {
            output.timestamp = early->timestamp;
            // Update only the early side; the other side keeps its previous value.
            (a_first ? output.measure1 : output.measure2) = early->measure;
            ++early;
        }
        outputArray.push_back(output);
    }

    // One input is exhausted; the rest of the other has no counterpart.
    const bool a_left = a_it != a.end();
    InputArray::iterator& late = a_left ? a_it : b_it;
    InputArray::iterator late_end = a_left ? a.end() : b.end();
    for (; late != late_end; ++late) {
        output.timestamp = late->timestamp;
        (a_left ? output.measure1 : output.measure2) = late->measure; // other side: previous value
        outputArray.push_back(output);
    }
    return outputArray;
}
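
The code above fills a gap by repeating the previous value. The interpolation alternative mentioned in step 3 could look roughly like the sketch below. It assumes linear interpolation is acceptable for the measurements, and both the `Sample` struct and the `interpolate` helper are hypothetical names, not part of the answer's code.

```cpp
// Hypothetical sample type: one timestamp and one measurement.
struct Sample {
    double timestamp;
    double measure;
};

// Estimates the measurement at time t from the two samples that bracket it.
// Assumes before.timestamp <= t <= after.timestamp.
inline double interpolate(const Sample& before, const Sample& after, double t) {
    const double span = after.timestamp - before.timestamp;
    if (span <= 0.0) {
        return before.measure; // identical timestamps: nothing to interpolate
    }
    const double fraction = (t - before.timestamp) / span;
    return before.measure + fraction * (after.measure - before.measure);
}
```

To use it in the merge loop, the LATE array's previous and current elements would serve as `before` and `after`, and the EARLY timestamp as `t`. At the very start or end of an array there is only one neighbor, so repeating that neighbor's value remains the fallback.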