Algorithm for subdividing an array into "semi-equal", uniform sub-arrays

Posted on 2024-12-15 08:42:44

Given an array with N elements, I am looking for M (M < N) successive sub-arrays whose lengths are equal or differ by at most 1. For example, if N = 12 and M = 4, all sub-arrays would have equal lengths of N/M = 3. If N = 100 and M = 12, I expect sub-arrays with lengths 8 and 9, and both sizes should be spread uniformly within the original array. This simple task turned out to be a little subtle to implement. I came up with an adaptation of Bresenham's line algorithm, which looks like this when coded in C++:

#include <stddef.h>  // size_t
#include <stdint.h>  // intptr_t
#include <vector>

/// The function suggests how an array with num_data items can be
/// subdivided into successively arranged groups (intervals) with
/// equal or "similar" length. The number of intervals is specified
/// by the parameter num_intervals. The result is stored into a vector
/// with (num_intervals + 1) items, each of which indicates the start
/// index of an interval, the last additional item being a sentinel
/// which contains the value num_data.
///
/// Example:
///
///    Input:  num_data ........... 14,
///            num_intervals ...... 4
///
///    Result: result_start_idx ... [ 0, 3, 7, 10, 14 ]
///

void create_uniform_intervals( const size_t         num_data,
                               const size_t         num_intervals,
                               std::vector<size_t>& result_start_idx )
{
    // precondition: num_intervals > 0, otherwise the divisions below are undefined
    const size_t avg_interval_len  = num_data / num_intervals;
    // remainder: the number of intervals that receive one extra item
    const size_t last_interval_len = num_data % num_intervals;

    // establish the new size of the result vector
    result_start_idx.resize( num_intervals + 1 );
    // write the sentinel value at the end:
    result_start_idx[ num_intervals ] = num_data;

    size_t offset = 0; // current offset

    // use Bresenham's line algorithm to distribute the
    // last_interval_len extra items over num_intervals intervals:
    intptr_t error = static_cast<intptr_t>( num_intervals / 2 );

    for( size_t i = 0; i < num_intervals; i++ )
    {
        result_start_idx[ i ] = offset;
        offset += avg_interval_len;
        error -= static_cast<intptr_t>( last_interval_len );
        if( error < 0 )
        {
            offset++; // this interval gets one of the extra items
            error += static_cast<intptr_t>( num_intervals );
        } // if
    } // for
}

For N = 100 and M = 12, this code produces the interval lengths 8 9 8 8 9 8 8 9 8 8 9 8.
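
One way to check a result like this is to convert the start indices back into interval lengths and confirm that the largest and smallest length differ by at most 1. The following is only a minimal sketch; the helper names `interval_lengths` and `is_nearly_uniform` are hypothetical and not part of the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: turns start indices (with trailing sentinel)
// into the lengths of the individual intervals.
std::vector<std::size_t> interval_lengths( const std::vector<std::size_t>& start_idx )
{
    assert( !start_idx.empty() ); // the sentinel must be present
    std::vector<std::size_t> lengths;
    lengths.reserve( start_idx.size() - 1 );
    for( std::size_t i = 0; i + 1 < start_idx.size(); i++ )
    {
        assert( start_idx[ i ] <= start_idx[ i + 1 ] ); // indices must not decrease
        lengths.push_back( start_idx[ i + 1 ] - start_idx[ i ] );
    }
    return lengths;
}

// Hypothetical helper: true if no two interval lengths differ by more than 1.
bool is_nearly_uniform( const std::vector<std::size_t>& lengths )
{
    if( lengths.empty() )
        return true;
    const auto mm = std::minmax_element( lengths.begin(), lengths.end() );
    return *mm.second - *mm.first <= 1;
}
```

For the start indices [ 0, 3, 7, 10, 14 ] from the doc comment, `interval_lengths` yields 3, 4, 3, 4, which `is_nearly_uniform` accepts.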

My actual problem is that I don't know what this task is called, so I had difficulty searching for it.

Are there other algorithms for accomplishing such a task?
What are they called? Maybe the names would come up if I knew other areas of application.

I needed the algorithm as part of a larger data-clustering algorithm. I think it could also be useful for implementing a parallel sort(?).

悲欢浪云 2024-12-22 08:42:44

If your language has integer division that truncates, an easy way to compute the size of section i is via (N*i+N)/M - (N*i)/M. For example, the Python program (written for Python 3, where truncating division of non-negative integers is spelled //)

  N = 100; M = 12
  for i in range(M): print((N*i + N)//M - (N*i)//M)

outputs the numbers 8 8 9 8 8 9 8 8 9 8 8 9. With N=12;M=5 it outputs 2 2 3 2 3. With N=12;M=3 it outputs 4 4 4.
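
Why the sizes differ by at most 1 can be sketched as follows. The symbols b_i (start of section i), \ell_i (its length), q and r are introduced here only for illustration, with N = qM + r and 0 <= r < M:

```latex
b_i = \left\lfloor \frac{iN}{M} \right\rfloor = iq + \left\lfloor \frac{ir}{M} \right\rfloor,
\qquad
\ell_i = b_{i+1} - b_i
       = q + \left\lfloor \frac{(i+1)r}{M} \right\rfloor - \left\lfloor \frac{ir}{M} \right\rfloor
       \in \{\, q,\; q+1 \,\},
\qquad
\sum_{i=0}^{M-1} \ell_i = b_M - b_0 = N .
```

Because r/M < 1, the two floor terms differ by 0 or 1, so every section has length q or q + 1. Since the lengths sum to N, exactly r sections have length q + 1.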

If your section numbers are 1-based rather than 0-based, the expression is instead (N*i)/M - (N*i-N)/M.
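
The same formula can fill a start-index array in the layout the question uses (M start indices followed by a sentinel equal to N). This is a minimal C++ sketch, assuming the product n * m fits in `std::size_t`; the function name `floor_interval_starts` is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: start index of section i is floor(n * i / m);
// the last entry (i == m) is the sentinel n.
// Assumption: n * m does not overflow std::size_t.
std::vector<std::size_t> floor_interval_starts( std::size_t n, std::size_t m )
{
    assert( m > 0 ); // at least one section is required
    std::vector<std::size_t> starts( m + 1 );
    for( std::size_t i = 0; i <= m; i++ )
        starts[ i ] = n * i / m; // unsigned division truncates
    return starts;
}
```

With n = 14 and m = 4 this yields [ 0, 3, 7, 10, 14 ], which happens to match the example in the question. For other inputs the longer sections may land in different positions than with the Bresenham variant, as the two N = 100, M = 12 outputs show.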

最好是你 2024-12-22 08:42:44

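Space-filling curves and fractals subdivide the plane and reduce complexity. Examples include the Z-curve, the Hilbert curve, and the Morton curve.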
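
As an illustration of what such a curve does, a Z-order (Morton) index maps a 2-D cell to a position along a 1-D curve by interleaving the bits of its coordinates. The answer does not say how this should be applied to the question, so the sketch below is only an assumption about the intended idea; `morton_index` is a hypothetical name.

```cpp
#include <cstdint>

// Interleaves the bits of x and y into a Z-order (Morton) index:
// bit b of x goes to bit 2b, bit b of y goes to bit 2b + 1.
std::uint32_t morton_index( std::uint16_t x, std::uint16_t y )
{
    std::uint32_t code = 0;
    for( unsigned b = 0; b < 16; b++ )
    {
        code |= ( ( static_cast<std::uint32_t>( x ) >> b ) & 1u ) << ( 2 * b );
        code |= ( ( static_cast<std::uint32_t>( y ) >> b ) & 1u ) << ( 2 * b + 1 );
    }
    return code;
}
```

Cells sorted by this index form a 1-D sequence, which could then be cut into nearly equal runs with either of the methods discussed above.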
