快速块放置算法，需要建议吗？

发布于 2024-08-31 06:19:02 字数 3416 浏览 13 评论 0原文

我需要模拟 Fluxbox 窗口管理器的窗口放置策略。

作为一个粗略的指导，想象一下随机大小的窗口一次填满屏幕，其中每个窗口的粗略大小导致屏幕上平均有 80 个窗口，没有任何窗口与另一个窗口重叠。

如果您有 Fluxbox并在您的系统上安装了 Xterm，您可以尝试 xwinmidiarptoy BASH脚本来查看我想要发生的事情的粗略原型。请参阅我的 xwinmidiarptoy.txt 注释写了一篇关于它的文章，解释了它的作用以及如何使用它。

需要注意的是，窗口将关闭，并且先前关闭的窗口占用的空间将再次可用于放置新窗口。

该算法需要是在线算法以串行方式逐个处理数据，即按照输入的顺序被输入到算法中，而从一开始就没有完整的输入。”

Fluxbox 窗口放置策略具有三个我想要模拟的二元选项：

Windows 构建水平行或垂直列（可能）
窗口从左到右或从右到左放置
< p>窗口从上到下放置或从下到上

目标算法之间的差异和窗口放置算法

坐标单位不是像素。将放置块的网格将为 128 x 128 单位。此外，放置区域可以通过放置在网格内的边界区域进一步缩小。

为什么该算法是一个问题？

它需要在音频应用程序中的实时线程的最后期限内运行。

目前我只关心获得快速算法，不要担心实时线程的影响以及由此带来的所有编程障碍。

尽管该算法永远不会放置与另一个窗口重叠的窗口，但用户将能够放置和移动某些类型的块，但重叠窗口将会存在。用于存储窗口和/或可用空间的数据结构需要能够处理这种重叠。

到目前为止，我有两个选择，我已经为其构建了松散的原型：

1）将 Fluxbox 放置算法移植到我的代码中。

这样做的问题是，客户端（我的程序）被踢出当我尝试使用算法放置 256 个块的最坏情况时，音频服务器（JACK）。该算法对放置第 256 个窗口时已放置的块列表执行超过 14000 次完整（线性）扫描。

为了演示这一点，我创建了一个名为 text_boxer-0.0.2.tar.bz2 它将文本文件作为输入并将其排列在 ASCII 框中。发出 make 来构建它。有点不友好，使用 --help （或任何其他无效选项）获取命令行选项列表。您必须使用该选项指定文本文件。

2）我的替代方法。

仅部分实现，此方法对矩形空闲未使用空间的每个区域使用数据结构（窗口列表可以完全独立，并且是测试该算法不需要）。该数据结构充当双向链表中的节点（具有排序插入），并包含左上角的坐标以及宽度和高度。

此外，每个块数据结构还包含四个链接，这些链接连接到四个边的每一个上的每个直接相邻（接触）的块。

重要规则：每个方块每边只能与一个方块接触。这是特定于算法存储空闲未使用空间的方式的规则，并且不考虑有多少实际窗口可以相互接触。

这种方法的问题是，它非常复杂。我已经实现了简单的情况，其中 1) 从块的一个角落删除空间，2) 分割相邻的块，以便遵守重要规则。

不太简单的情况，即要删除的空间只能在一列或一行盒子中找到，仅部分实现 - 如果要删除的块之一精确适合宽度（即列）或高度（即行）然后就会出现问题。 甚至更不用说这只检查一格宽的列和一格高的行。

我已经用 C 实现了这个算法 - 我在这个项目中使用的语言（我已经已经有几年没有使用 C++ 了，在将所有注意力集中在 C 开发上之后，使用它感到不舒服，这是一种爱好）。该实现有 700 多行代码（包括大量空行、大括号行、注释等）。该实现仅适用于水平行+左右+上下放置策略。

因此，我要么必须添加某种方法，使这 700 行代码适用于其他 7 个放置策略选项，要么必须为其他 7 个选项复制这 700 行代码。这些都没有吸引力，第一，因为现有代码足够复杂，第二，因为臃肿。

由于缺少功能，该算法甚至还没有达到可以在实时最坏情况下使用它的阶段，所以我仍然不知道它实际上是否比第一种方法表现更好或更差。

该算法的 C 实现的当前状态是 freespace.c 。我使用 gcc -O0 -ggdb freespace.c 来构建并在大小至少为 124 x 60 个字符的 xterm 中运行它。

还有什么？

我已经浏览并打折了：

Bin Packing算法：他们的强调最佳拟合并不符合本条要求算法。
递归二分放置算法：听起来很有希望，但是这些用于电路设计。他们的重点是最佳导线长度。

这两者，尤其是后者，所有要放置/打包的元素在算法开始之前都是已知的。

您对此有何看法？你会如何处理它？我还应该考虑哪些其他算法？或者，由于我没有学过计算机科学/软件工程，所以我应该研究哪些概念？

如果需要更多信息，请在评论中提出问题。

提出这个问题后提出了进一步的想法

我的“替代算法”与空间哈希图的某种组合，用于确定要放置的大窗口是否会覆盖多个可用空间块。

原文

I need to emulate the window placement strategy of the Fluxbox window manager.

As a rough guide, visualize randomly sized windows filling up the screen one at a time, where the rough size of each results in an average of 80 windows on screen without any window overlapping another.

If you have Fluxbox and Xterm installed on your system, you can try the xwinmidiarptoy BASH script to see a rough prototype of what I want happening. See the xwinmidiarptoy.txt notes I've written about it explaining what it does and how it should be used.

It is important to note that windows will close and the space that closed windows previously occupied becomes available once more for the placement of new windows.

The algorithm needs to be an Online Algorithm processing data "piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start."

The Fluxbox window placement strategy has three binary options which I want to emulate:

Windows build horizontal rows or vertical columns (potentially)
Windows are placed from left to right or right to left
Windows are placed from top to bottom or bottom to top

Differences between the target algorithm and a window-placement algorithm

The coordinate units are not pixels. The grid within which blocks will be placed will be 128 x 128 units. Furthermore, the placement area may be further shrunk by a boundary area placed within the grid.

Why is the algorithm a problem?

It needs to operate to the deadlines of a real time thread in an audio application.

At this moment I am only concerned with getting a fast algorithm, don't concern yourself over the implications of real time threads and all the hurdles in programming that that brings.

And although the algorithm will never ever place a window which overlaps another, the user will be able to place and move certain types of blocks, overlapping windows will exist. The data structure used for storing the windows and/or free space, needs to be able to handle this overlap.

So far I have two choices which I have built loose prototypes for:

1) A port of the Fluxbox placement algorithm into my code.

The problem with this is, the client (my program) gets kicked out of the audio server (JACK) when I try placing the worst case scenario of 256 blocks using the algorithm. This algorithm performs over 14000 full (linear) scans of the list of blocks already placed when placing the 256th window.

For a demonstration of this I created a program called text_boxer-0.0.2.tar.bz2 which takes a text file as input and arranges it within ASCII boxes. Issue make to build it. A little unfriendly, use --help (or any other invalid option) for a list of command line options. You must specify the text file by using the option.

2) My alternative approach.

Only partially implemented, this approach uses a data structure for each area of rectangular free unused space (the list of windows can be entirely separate, and is not required for testing of this algorithm). The data structure acts as a node in a doubly linked list (with sorted insertion), as well as containing the coordinates of the top-left corner, and the width and height.

Furthermore, each block data structure also contains four links which connect to each immediately adjacent (touching) block on each of the four sides.

IMPORTANT RULE: Each block may only touch with one block per side. This is a rule specific to the algorithm's way of storing free unused space and bears no factor in how many actual windows may touch each other.

The problem with this approach is, it's very complex. I have implemented the straightforward cases where 1) space is removed from one corner of a block, 2) splitting neighbouring blocks so that the IMPORTANT RULE is adhered to.

The less straightforward case, where the space to be removed can only be found within a column or row of boxes, is only partially implemented - if one of the blocks to be removed is an exact fit for width (ie column) or height (ie row) then problems occur. And don't even mention the fact this only checks columns one box wide, and rows one box tall.

I've implemented this algorithm in C - the language I am using for this project (I've not used C++ for a few years and am uncomfortable using it after having focused all my attention to C development, it's a hobby). The implementation is 700+ lines of code (including plenty of blank lines, brace lines, comments etc). The implementation only works for the horizontal-rows + left-right + top-bottom placement strategy.

So I've either got to add some way of making this +700 lines of code work for the other 7 placement strategy options, or I'm going to have to duplicate those +700 lines of code for the other seven options. Neither of these is attractive, the first, because the existing code is complex enough, the second, because of bloat.

The algorithm is not even at a stage where I can use it in the real time worst case scenario, because of missing functionality, so I still don't know if it actually performs better or worse than the first approach.

The current state of C implementation of this algorithm is freespace.c. I use gcc -O0 -ggdb freespace.c to build, and run it in an xterm sized to atleast 124 x 60 chars.

What else is there?

I've skimmed over and discounted:

Bin Packing algorithms: their
emphasis on optimal fit does not
match the requirements of this
algorithm.
Recursive Bisection Placement algorithms: sounds promising, but
these are for circuit design. Their
emphasis is optimal wire length.

Both of these, especially the latter, all elements to be placed/packs are known before the algorithm begins.

What are your thoughts on this? How would you approach it? What other algorithms should I look at? Or even what concepts should I research seeing as I've not studied computer science/software engineering?

Please ask questions in comments if further information is needed.

Further ideas developed since asking this question

Some combination of my "alternative algorithm" with a spatial hashmap for identifying if a large window to be placed would cover several blocks of free space.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

梦在深巷 2024-09-07 06:19:02

我会考虑某种空间哈希结构。想象一下你的整个自由空间被粗略地网格化，称之为块。随着窗口的出现和消失，它们占据了某些连续的矩形块组。对于每个块，跟踪与每个角相关的最大未使用矩形，因此每个块需要存储 2*4 实数。对于空块，每个角的矩形的大小等于块的大小。因此，一个区块只能在其角落“用完”，因此任何区块中最多可以有 4 个窗户。

现在，每次添加窗口时，您都必须搜索适合该窗口的一组矩形块，并且在搜索时更新可用角尺寸。您应该调整块的大小，以便将少数（~4x4）块放入一个典型的窗口中。对于每个窗口，跟踪它接触的块（您只需要跟踪范围），以及哪些窗口接触给定块（在此算法中最多 4 个）。块的粒度和每个窗口插入/删除的工作量之间存在明显的权衡。

删除窗口时，循环遍历它接触的所有块，并为每个块重新计算自由角大小（您知道哪些窗口接触它）。这很快，因为内部循环的长度最多为 4。

这样的数据结构，

struct block{
    int free_x[4]; // 0 = top left, 1 = top right,
    int free_y[4]; // 2 = bottom left, 3 = bottom right
    int n_windows; // number of windows that occupy this block
    int window_id[4]; // IDs of windows that occupy this block
};
block blocks[NX][NY];

struct window{
    int id;
    int used_block_x[2]; // 0 = first index of used block,
    int used_block_y[2]; // 1 = last index of used block
};

我想象一个像Edit

这是一张图片：

alt text

它显示了两个示例块。彩色点表示块的角，从它们发出的箭头表示从该角开始的最大自由矩形的范围。

您在编辑中提到，将放置窗口的网格已经相当粗糙（127x127），因此块大小可能类似于一侧的 4 个网格单元，这可能不会给您带来太多好处。如果您的窗口角坐标可以采用很多值（我认为它们是像素），则此方法适用，但在您的情况下则不适用。您仍然可以尝试一下，因为它很简单。您可能还想保留一个完全空的块的列表，以便如果出现一个大于 2 个块宽度的窗口，那么您首先在该列表中查找，然后再在块网格中寻找一些合适的可用空间。

I would consider some kind of spatial hashing structure. Imagine your entire free space is gridded coarsely, call them blocks. As windows come and go, they occupy certain sets of contiguous rectangular blocks. For each block, keep track of the largest unused rectangle incident to each corner, so you need to store 2*4 real numbers per block. For an empty block, the rectangles at each corner have size equal to the block. Thus, a block can only be "used up" at its corners, and so at most 4 windows can sit in any block.

Now each time you add a window, you have to search for a rectangular set of blocks for which the window will fit, and when you do, update the free corner sizes. You should size your blocks so that a handful (~4x4) of them fit into a typical window. For each window, keep track of which blocks it touches (you only need to keep track of extents), as well as which windows touch a given block (at most 4, in this algorithm). There is an obvious tradeoff between the granularity of the blocks and the amount of work per window insertion/removal.

When removing a window, loop over all blocks it touches, and for each block, recompute the free corner sizes (you know which windows touch it). This is fast since the inner loop is at most length 4.

I imagine a data structure like

struct block{
    int free_x[4]; // 0 = top left, 1 = top right,
    int free_y[4]; // 2 = bottom left, 3 = bottom right
    int n_windows; // number of windows that occupy this block
    int window_id[4]; // IDs of windows that occupy this block
};
block blocks[NX][NY];

struct window{
    int id;
    int used_block_x[2]; // 0 = first index of used block,
    int used_block_y[2]; // 1 = last index of used block
};

Edit

Here is a picture:

alt text

It shows two example blocks. The colored dots indicate the corners of the block, and the arrows emanating from them indicate the extents of the largest-free-rectangle from that corner.

You mentioned in the edit that the grid on which your windows will be placed is already quite coarse (127x127), so the block sizes would probably be something like 4 grid cells on a side, which probably wouldn't gain you much. This method is suitable if your window corner coordinates can take on a lot of values (I was thinking they would be pixels), but not so much in your case. You can still try it, since it's simple. You would probably want to also keep a list of completely empty blocks so that if a window comes in that is larger than 2 block widths, then you look first in that list before looking for some suitable free space in the block grid.

回复收藏 0 原文

我最亲爱的 2024-09-07 06:19:02

经过一些错误的开始，我最终到达了这里。这里已经放弃使用数据结构来存储自由空间的矩形区域。相反，有一个包含 128 x 128 元素的二维数组可以实现相同的结果，但复杂性要低得多。

以下函数扫描数组中 width * height 大小的区域。它找到的第一个位置将左上角坐标写入 resultx 和 resulty 指向的位置。

_Bool freespace_remove( freespace* fs,
                        int width,     int height,
                        int* resultx,  int* resulty)
{
    int x = 0;
    int y = 0;
    const int rx = FSWIDTH - width;
    const int by = FSHEIGHT - height;

    *resultx = -1;
    *resulty = -1;

    char* buf[height];

    for (y = 0; y < by; ++y)
    {
        x = 0;
        char* scanx = fs->buf[y];

        while (x < rx)
        {
            while(x < rx && *(scanx + x))
                ++x;

            int w, h;

            for (h = 0; h < height; ++h)
                buf[h] = fs->buf[y + h] + x;

            _Bool usable = true;
            w = 0;

            while (usable && w < width)
            {
                h = 0;
                while (usable && h < height)
                    if (*(buf[h++] + w))
                        usable = false;
                ++w;
            }

            if (usable)
            {
                for (w = 0; w < width; ++w)
                    for (h = 0; h < height; ++h)
                        *(buf[h] + w) = 1;

                *resultx = x;
                *resulty = y;
                return true;
            }

            x += w;
        }
    }

    return false;
}

二维数组初始化为零。数组中使用该空间的任何区域都设置为 1。该结构和函数将独立于占用标有 1 的区域的实际窗口列表而工作。

这种方法的优点是简单。它只使用一种数据结构——数组。该函数很短，并且应该不会太难适应处理其余的放置选项（这里它只处理 Row Smart + Left to Right + Top to Bottom）。

我的初步测试在速度方面看起来也很有希望。虽然我认为这不适合窗口管理器将窗口放置在具有像素精度的 1600 x 1200 桌面上，但就我的目的而言，我相信它会比我之前使用的任何方法都要好得多尝试过。

可编译的测试代码在这里：
http://jwm-art.net/art/text/freespace_grid.c
（在Linux中我使用gcc -ggdb -O0 freespace_grid.c来编译）

After some false starts, I have eventually arrived here. Here is where the use of data structures for storing rectangular areas of free space have been abandoned. Instead, there is a 2d array with 128 x 128 elements to achieve the same result but with much less complexity.

The following function scans the array for an area width * height in size. The first position it finds it writes the top left coordinates of, to where resultx and resulty point to.

_Bool freespace_remove( freespace* fs,
                        int width,     int height,
                        int* resultx,  int* resulty)
{
    int x = 0;
    int y = 0;
    const int rx = FSWIDTH - width;
    const int by = FSHEIGHT - height;

    *resultx = -1;
    *resulty = -1;

    char* buf[height];

    for (y = 0; y < by; ++y)
    {
        x = 0;
        char* scanx = fs->buf[y];

        while (x < rx)
        {
            while(x < rx && *(scanx + x))
                ++x;

            int w, h;

            for (h = 0; h < height; ++h)
                buf[h] = fs->buf[y + h] + x;

            _Bool usable = true;
            w = 0;

            while (usable && w < width)
            {
                h = 0;
                while (usable && h < height)
                    if (*(buf[h++] + w))
                        usable = false;
                ++w;
            }

            if (usable)
            {
                for (w = 0; w < width; ++w)
                    for (h = 0; h < height; ++h)
                        *(buf[h] + w) = 1;

                *resultx = x;
                *resulty = y;
                return true;
            }

            x += w;
        }
    }

    return false;
}

The 2d array is zero initialized. Any areas in the array where the space is used are set to 1. This structure and function will work independently from the actual list of windows that are occupying the areas marked with 1's.

The advantages of this method are its simplicity. It only uses one data structure - an array. The function is short, and should not be too difficult to adapt to handle the remaining placement options (here it only handles Row Smart + Left to Right + Top to Bottom).

My initial tests also look promising on the speed front. Though I don't think this would be suitable for a window manager placing windows on, for example, a 1600 x 1200 desktop with pixel accuracy, for my purposes I believe it is going to be much better than any of the previous methods I have tried.

Compilable test code here:
http://jwm-art.net/art/text/freespace_grid.c
(in Linux I use gcc -ggdb -O0 freespace_grid.c to compile)

回复收藏 0 原文

盛装女皇 2024-09-07 06:19:02

#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>


#define FSWIDTH 128
#define FSHEIGHT 128


#ifdef USE_64BIT_ARRAY
    #define FSBUFBITS 64
    #define FSBUFWIDTH 2
    typedef uint64_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctzl(( v ))
    #define LEADING_ONES( v )   __builtin_clzl(~( v ))
#else
#ifdef USE_32BIT_ARRAY
    #define FSBUFBITS 32
    #define FSBUFWIDTH 4
    typedef uint32_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz(( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ))
#else
#ifdef USE_16BIT_ARRAY
    #define FSBUFBITS 16
    #define FSBUFWIDTH 8
    typedef uint16_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz( 0xffff0000 | ( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ) << 16)
#else
#ifdef USE_8BIT_ARRAY
    #define FSBUFBITS 8
    #define FSBUFWIDTH 16
    typedef uint8_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz( 0xffffff00 | ( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ) << 24)
#else
    #define FSBUFBITS 1
    #define FSBUFWIDTH 128
    typedef unsigned char fsbuf_type;
    #define TRAILING_ZEROS( v ) (( v ) ? 0 : 1)
    #define LEADING_ONES( v )   (( v ) ? 1 : 0)
#endif
#endif
#endif
#endif


static const fsbuf_type fsbuf_max =   ~(fsbuf_type)0;
static const fsbuf_type fsbuf_high =  (fsbuf_type)1 << (FSBUFBITS - 1);


typedef struct freespacegrid
{
    fsbuf_type buf[FSHEIGHT][FSBUFWIDTH];

    _Bool left_to_right;
    _Bool top_to_bottom;

} freespace;


void freespace_dump(freespace* fs)
{
    int x, y;

    for (y = 0; y < FSHEIGHT; ++y)
    {
        for (x = 0; x < FSBUFWIDTH; ++x)
        {
            fsbuf_type i = FSBUFBITS;
            fsbuf_type b = fs->buf[y][x];

            for(; i != 0; --i, b <<= 1)
                putchar(b & fsbuf_high ? '#' : '/');
/*
            if (x + 1 < FSBUFWIDTH)
                putchar('|');
*/
        }
        putchar('\n');
    }
}


freespace* freespace_new(void)
{
    freespace* fs = malloc(sizeof(*fs));

    if (!fs)
        return 0;

    int y;

    for (y = 0; y < FSHEIGHT; ++y)
    {
        memset(&fs->buf[y][0], 0, sizeof(fsbuf_type) * FSBUFWIDTH);
    }

    fs->left_to_right = true;
    fs->top_to_bottom = true;

    return fs;
}


void freespace_delete(freespace* fs)
{
    if (!fs)
        return;

    free(fs);
}

/* would be private function: */
void fs_set_buffer( fsbuf_type buf[FSHEIGHT][FSBUFWIDTH],
                    unsigned x,
                    unsigned y1,
                    unsigned xoffset,
                    unsigned width,
                    unsigned height)
{
    fsbuf_type v;
    unsigned y;

    for (; width > 0 && x < FSBUFWIDTH; ++x)
    {
        if (width < xoffset)
            v = (((fsbuf_type)1 << width) - 1) << (xoffset - width);
        else if (xoffset < FSBUFBITS)
            v = ((fsbuf_type)1 << xoffset) - 1;
        else
            v = fsbuf_max;

        for (y = y1; y < y1 + height; ++y)
        {
#ifdef FREESPACE_DEBUG
            if (buf[y][x] & v)
                printf("**** over-writing area ****\n");
#endif
            buf[y][x] |= v;
        }

        if (width < xoffset)
            return;

        width -= xoffset;
        xoffset = FSBUFBITS;
    }
}


_Bool freespace_remove(   freespace* fs,
                          unsigned width, unsigned height,
                          int* resultx,   int* resulty)
{
    unsigned x, x1, y;
    unsigned w, h;
    unsigned xoffset, x1offset;
    unsigned tz; /* trailing zeros */

    fsbuf_type* xptr;
    fsbuf_type mask =   0;
    fsbuf_type v;

    _Bool scanning = false;
    _Bool offset = false;

    *resultx = -1;
    *resulty = -1;

    for (y = 0; y < (unsigned) FSHEIGHT - height; ++y)
    {
        scanning = false;
        xptr = &fs->buf[y][0];

        for (x = 0; x < FSBUFWIDTH; ++x, ++xptr)
        {
            if(*xptr == fsbuf_max)
            {
                scanning = false;
                continue;
            }

            if (!scanning)
            {
                scanning = true;
                x1 = x;
                x1offset = xoffset = FSBUFBITS;
                w = width;
            }
retry:
            if (w < xoffset)
                mask = (((fsbuf_type)1 << w) - 1) << (xoffset - w);
            else if (xoffset < FSBUFBITS)
                mask = ((fsbuf_type)1 << xoffset) - 1;
            else
                mask = fsbuf_max;

            offset = false;

            for (h = 0; h < height; ++h)
            {
                v = fs->buf[y + h][x] & mask;

                if (v)
                {
                    tz = TRAILING_ZEROS(v);
                    offset = true;
                    break;
                }
            }

            if (offset)
            {
                if (tz)
                {
                    x1 = x;
                    w = width;
                    x1offset = xoffset = tz;
                    goto retry;
                }
                scanning = false;
            }
            else
            {
                if (w <= xoffset) /***** RESULT! *****/
                {
                    fs_set_buffer(fs->buf, x1, y, x1offset, width, height);
                    *resultx = x1 * FSBUFBITS + (FSBUFBITS - x1offset);
                    *resulty = y;
                    return true;
                }
                w -= xoffset;
                xoffset = FSBUFBITS;
            }
        }
    }
    return false;
}


int main(int argc, char** argv)
{
    int x[1999];
    int y[1999];
    int w[1999];
    int h[1999];

    int i;

    freespace* fs = freespace_new();

    for (i = 0; i < 1999; ++i, ++u)
    {
        w[i] = rand() % 18 + 4;
        h[i] = rand() % 18 + 4;

        freespace_remove(fs, w[i], h[i], &x[i], &y[i]);
/*
        freespace_dump(fs);
        printf("w:%d h:%d x:%d y:%d\n", w[i], h[i], x[i], y[i]);
        if (x[i] == -1)
            printf("not removed space %d\n", i);
        getchar();
*/
    }

    freespace_dump(fs);
    freespace_delete(fs);

    return 0;
}

上述代码需要USE_64BIT_ARRAY、USE_32BIT_ARRAY、USE_16BIT_ARRAY、USE_8BIT_ARRAY之一定义，否则它将回退到仅使用 unsigned char 的高位来存储网格单元的状态。

函数 fs_set_buffer 不会在标头中声明，并且当此代码在 .h 和 .c 文件之间拆分时，该函数将在实现中变为静态。将提供隐藏实现细节的更加用户友好的功能，以从网格中删除已用空间。

总的来说，这种实现在没有优化的情况下比我的更快先前的答案具有最大优化（在64位Gentoo上使用GCC，分别优化选项-O0和-O3）。

关于 USE_NNBIT_ARRAY 和不同的位大小，我使用了两种不同的方法对代码进行计时，这些方法在 1999 年调用了 freespace_remove。

使用 Unix time 命令对 main() 进行计时（并禁用代码中的任何输出）似乎证明了我的期望是正确的 - 位大小越高，速度越快。

另一方面，对 freespace_remove 的各个调用进行计时（使用 gettimeofday）并比较 1999 年调用所花费的最大时间似乎表明位数越小速度越快。

这仅在 64 位系统（Intel Dual Core II）上进行了测试。

#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>


#define FSWIDTH 128
#define FSHEIGHT 128


#ifdef USE_64BIT_ARRAY
    #define FSBUFBITS 64
    #define FSBUFWIDTH 2
    typedef uint64_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctzl(( v ))
    #define LEADING_ONES( v )   __builtin_clzl(~( v ))
#else
#ifdef USE_32BIT_ARRAY
    #define FSBUFBITS 32
    #define FSBUFWIDTH 4
    typedef uint32_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz(( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ))
#else
#ifdef USE_16BIT_ARRAY
    #define FSBUFBITS 16
    #define FSBUFWIDTH 8
    typedef uint16_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz( 0xffff0000 | ( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ) << 16)
#else
#ifdef USE_8BIT_ARRAY
    #define FSBUFBITS 8
    #define FSBUFWIDTH 16
    typedef uint8_t fsbuf_type;
    #define TRAILING_ZEROS( v ) __builtin_ctz( 0xffffff00 | ( v ))
    #define LEADING_ONES( v )   __builtin_clz(~( v ) << 24)
#else
    #define FSBUFBITS 1
    #define FSBUFWIDTH 128
    typedef unsigned char fsbuf_type;
    #define TRAILING_ZEROS( v ) (( v ) ? 0 : 1)
    #define LEADING_ONES( v )   (( v ) ? 1 : 0)
#endif
#endif
#endif
#endif


static const fsbuf_type fsbuf_max =   ~(fsbuf_type)0;
static const fsbuf_type fsbuf_high =  (fsbuf_type)1 << (FSBUFBITS - 1);


typedef struct freespacegrid
{
    fsbuf_type buf[FSHEIGHT][FSBUFWIDTH];

    _Bool left_to_right;
    _Bool top_to_bottom;

} freespace;


void freespace_dump(freespace* fs)
{
    int x, y;

    for (y = 0; y < FSHEIGHT; ++y)
    {
        for (x = 0; x < FSBUFWIDTH; ++x)
        {
            fsbuf_type i = FSBUFBITS;
            fsbuf_type b = fs->buf[y][x];

            for(; i != 0; --i, b <<= 1)
                putchar(b & fsbuf_high ? '#' : '/');
/*
            if (x + 1 < FSBUFWIDTH)
                putchar('|');
*/
        }
        putchar('\n');
    }
}


freespace* freespace_new(void)
{
    freespace* fs = malloc(sizeof(*fs));

    if (!fs)
        return 0;

    int y;

    for (y = 0; y < FSHEIGHT; ++y)
    {
        memset(&fs->buf[y][0], 0, sizeof(fsbuf_type) * FSBUFWIDTH);
    }

    fs->left_to_right = true;
    fs->top_to_bottom = true;

    return fs;
}


void freespace_delete(freespace* fs)
{
    if (!fs)
        return;

    free(fs);
}

/* would be private function: */
void fs_set_buffer( fsbuf_type buf[FSHEIGHT][FSBUFWIDTH],
                    unsigned x,
                    unsigned y1,
                    unsigned xoffset,
                    unsigned width,
                    unsigned height)
{
    fsbuf_type v;
    unsigned y;

    for (; width > 0 && x < FSBUFWIDTH; ++x)
    {
        if (width < xoffset)
            v = (((fsbuf_type)1 << width) - 1) << (xoffset - width);
        else if (xoffset < FSBUFBITS)
            v = ((fsbuf_type)1 << xoffset) - 1;
        else
            v = fsbuf_max;

        for (y = y1; y < y1 + height; ++y)
        {
#ifdef FREESPACE_DEBUG
            if (buf[y][x] & v)
                printf("**** over-writing area ****\n");
#endif
            buf[y][x] |= v;
        }

        if (width < xoffset)
            return;

        width -= xoffset;
        xoffset = FSBUFBITS;
    }
}


_Bool freespace_remove(   freespace* fs,
                          unsigned width, unsigned height,
                          int* resultx,   int* resulty)
{
    unsigned x, x1, y;
    unsigned w, h;
    unsigned xoffset, x1offset;
    unsigned tz; /* trailing zeros */

    fsbuf_type* xptr;
    fsbuf_type mask =   0;
    fsbuf_type v;

    _Bool scanning = false;
    _Bool offset = false;

    *resultx = -1;
    *resulty = -1;

    for (y = 0; y < (unsigned) FSHEIGHT - height; ++y)
    {
        scanning = false;
        xptr = &fs->buf[y][0];

        for (x = 0; x < FSBUFWIDTH; ++x, ++xptr)
        {
            if(*xptr == fsbuf_max)
            {
                scanning = false;
                continue;
            }

            if (!scanning)
            {
                scanning = true;
                x1 = x;
                x1offset = xoffset = FSBUFBITS;
                w = width;
            }
retry:
            if (w < xoffset)
                mask = (((fsbuf_type)1 << w) - 1) << (xoffset - w);
            else if (xoffset < FSBUFBITS)
                mask = ((fsbuf_type)1 << xoffset) - 1;
            else
                mask = fsbuf_max;

            offset = false;

            for (h = 0; h < height; ++h)
            {
                v = fs->buf[y + h][x] & mask;

                if (v)
                {
                    tz = TRAILING_ZEROS(v);
                    offset = true;
                    break;
                }
            }

            if (offset)
            {
                if (tz)
                {
                    x1 = x;
                    w = width;
                    x1offset = xoffset = tz;
                    goto retry;
                }
                scanning = false;
            }
            else
            {
                if (w <= xoffset) /***** RESULT! *****/
                {
                    fs_set_buffer(fs->buf, x1, y, x1offset, width, height);
                    *resultx = x1 * FSBUFBITS + (FSBUFBITS - x1offset);
                    *resulty = y;
                    return true;
                }
                w -= xoffset;
                xoffset = FSBUFBITS;
            }
        }
    }
    return false;
}


int main(int argc, char** argv)
{
    int x[1999];
    int y[1999];
    int w[1999];
    int h[1999];

    int i;

    freespace* fs = freespace_new();

    for (i = 0; i < 1999; ++i, ++u)
    {
        w[i] = rand() % 18 + 4;
        h[i] = rand() % 18 + 4;

        freespace_remove(fs, w[i], h[i], &x[i], &y[i]);
/*
        freespace_dump(fs);
        printf("w:%d h:%d x:%d y:%d\n", w[i], h[i], x[i], y[i]);
        if (x[i] == -1)
            printf("not removed space %d\n", i);
        getchar();
*/
    }

    freespace_dump(fs);
    freespace_delete(fs);

    return 0;
}

The above code requires one of USE_64BIT_ARRAY, USE_32BIT_ARRAY, USE_16BIT_ARRAY, USE_8BIT_ARRAY to be defined otherwise it will fall back to using only the high bit of an unsigned char for storing the state of grid cells.

The function fs_set_buffer will not be declared in the header, and will become static within the implementation when this code gets split between .h and .c files. A more user friendly function hiding the implementation details will be provided for removing used space from the grid.

Overall, this implementation is faster without optimization than my previous answer with maximum optimization (using GCC on 64bit Gentoo, optimization options -O0 and -O3 respectively).

Regarding USE_NNBIT_ARRAY and the different bit sizes, I used two different methods of timing the code which make 1999 calls to freespace_remove.

Timing main() using the Unix time command (and disabling any output in the code) seemed to prove my expectations correct - that higher bit sizes are faster.

On the other hand, timing individual calls to freespace_remove (using gettimeofday) and comparing the maximum time taken over the 1999 calls seemed to indicate lower bit sizes were faster.

This has only been tested on a 64bit system (Intel Dual Core II).

回复收藏 0 原文

~没有更多了~