C++,用一个字节存储两个变量

发布于 2024-08-29 01:22:28 字数 1388 浏览 13 评论 0原文

我正在研究国际象棋棋盘的表示,我计划将其存储在 32 字节数组中,其中每个字节将用于存储两个棋子。 (这样每块只需要 4 位)

这样做会导致访问板的特定索引的开销。 您认为该代码可以优化还是可以使用完全不同的访问索引的方法?

我同样

char getPosition(unsigned char* c, int index){
    //moving pointer
    c+=(index>>1);

    //odd number
    if (index & 1){
        //taking right part
        return *c & 0xF;
    }else
    {
        //taking left part
        return *c>>4;
    }
}


void setValue(unsigned char* board, char value, int index){
    //moving pointer
    board+=(index>>1);

    //odd number
    if (index & 1){
        //replace right part
                 //save left       value only 4 bits
        *board = (*board & 0xF0) + value;
    }else
    {
        //replacing left part
        *board  = (*board & 0xF) + (value<<4);
    }
}


int main() {

    char* c = (char*)malloc(32);

    for (int i = 0; i < 64 ; i++){
        setValue((unsigned char*)c, i % 8,i);
    }

    for (int i = 0; i < 64 ; i++){
        cout<<(int)getPosition((unsigned char*)c, i)<<" ";

        if (((i+1) % 8 == 0) && (i > 0)){
            cout<<endl;
        }


    }


    return 0;
}

对您关于国际象棋表示以及上述方法的优化作为一个独立问题的看法感兴趣。

非常感谢

编辑

感谢您的回复。不久前,我创建了跳棋游戏,其中使用 64 字节棋盘表示。这次我尝试了一些不同的方法,只是想看看我喜欢什么。内存并不是什么大问题。 Bit-boards 绝对在我的尝试清单上。谢谢

I am working on representation of the chess board, and I am planning to store it in 32 bytes array, where each byte will be used to store two pieces. (That way only 4 bits are needed per piece)

Doing it in that way, results in a overhead for accessing particular index of the board.
Do you think that, this code can be optimised or completely different method of accessing indexes can be used?

c++

char getPosition(unsigned char* c, int index){
    //moving pointer
    c+=(index>>1);

    //odd number
    if (index & 1){
        //taking right part
        return *c & 0xF;
    }else
    {
        //taking left part
        return *c>>4;
    }
}


void setValue(unsigned char* board, char value, int index){
    //moving pointer
    board+=(index>>1);

    //odd number
    if (index & 1){
        //replace right part
                 //save left       value only 4 bits
        *board = (*board & 0xF0) + value;
    }else
    {
        //replacing left part
        *board  = (*board & 0xF) + (value<<4);
    }
}


int main() {

    char* c = (char*)malloc(32);

    for (int i = 0; i < 64 ; i++){
        setValue((unsigned char*)c, i % 8,i);
    }

    for (int i = 0; i < 64 ; i++){
        cout<<(int)getPosition((unsigned char*)c, i)<<" ";

        if (((i+1) % 8 == 0) && (i > 0)){
            cout<<endl;
        }


    }


    return 0;
}

I am equally interested in your opinions regarding chess representations, and optimisation of the method above, as a stand alone problem.

Thanks a lot

EDIT

Thanks for your replies. A while ago I created checkers game, where I was using 64 bytes board representation. This time I am trying some different methods, just to see what I like. Memory is not such a big problem. Bit-boards is definitely on my list to try. Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

相思碎 2024-09-05 01:22:28

这就是过早优化的问题。您的棋盘原本需要 64 个字节来存储,现在需要 32 个字节。这真正给您带来了什么?您是否真的分析了情况以查看是否需要保存该内存?

假设您使用了最不理想的搜索方法之一,直接 AB 搜索到深度 D,没有启发式,并且您在搜索之前在某个位置生成了所有可能的移动,那么您的棋盘所需的绝对最大内存将是 sizeof(board) * W * D。如果我们假设相当大的 W = 100 和大 D = 30,那么您将在深度 D 的内存中拥有 3000 个板。64k 与 32k...真的值得吗?

另一方面,您增加了访问 board[location] 所需的操作量,并且每次搜索将调用数百万次。

在构建国际象棋人工智能时,您最终要寻找的主要内容是 CPU 周期,而不是内存。如果您的目标是手机或其他设备,这可能会有所不同,但即使如此,在达到足够的深度以导致任何内存问题之前,您也会更加担心速度。

至于我更喜欢​​哪种表示形式......我喜欢位板。没有进行大量认真的测量,但我确实比较了我制作的两个引擎,一个位板和一个阵列,位板一个速度更快,并且可以比另一个引擎到达更大的深度。

That's the problem with premature optimization. Where your chess board would have taken 64 bytes to store, now it takes 32. What has this really boughten you? Did you actually analyze the situation to see if you needed to save that memory?

Assuming that you used one of the least optimal search method, straight AB search to depth D with no heuristics, and you generate all possible moves in a position before searching, then absolute maximum memory required for your board is going to be sizeof(board) * W * D. If we assume a rather large W = 100 and large D = 30 then you're going to have 3000 boards in memory at depth D. 64k vs 32k...is it really worth it?

On the other hand, you've increased the amount of operations necessary to access board[location] and this will be called many millions of times per search.

When building chess AI's the main thing you'll end up looking for is cpu cycles, not memory. This may vary a little bit if you're targeting a cell phone or something, but even at that you're going to worry more about speed before you'll ever reach enough depth to cause any memory issues.

As to which representation I prefer...I like bitboards. Haven't done a lot of serious measurements but I did compare two engines I made, one bitboard and one array, and the bitboard one was faster and could reach much greater depths than the other.

别想她 2024-09-05 01:22:28

让我第一个指出潜在的错误(取决于编译器和编译器设置)。错误就是为什么过早优化是邪恶的:

   //taking left part
    return *c>>4;

如果 *c 是负数,那么 >>>可以重复负高位。即二进制:

0b10100000 >> 4 == 0b11111010

对于某些编译器(即 C++ 标准将其留给编译器来决定 - 是否携带高位,以及 char 是有符号还是无符号)。

如果您确实想继续处理您的打包位(让我说您可能不应该打扰,但这取决于您),我建议将打包位包装到一个类中,并覆盖 [] ,这样可以

board[x][y] 

给出你是解压的部分。然后您可以轻松地打开和关闭打包,并且在任何情况下都具有相同的语法。如果您内联运算符重载,它应该与您现在的代码一样高效。

Let me be the first to point out a potential bug (depending on compilers and compiler settings). And bugs being why premature optimization is evil:

   //taking left part
    return *c>>4;

if *c is negative, then >> may repeat the negative high bit. ie in binary:

0b10100000 >> 4 == 0b11111010

for some compilers (ie the C++ standard leaves it to the compiler to decide - both whether to carry the high bit, and whether a char is signed or unsigned).

If you do want to go forward with your packed bits (and let me say that you probably shouldn't bother, but it is up to you), I would suggest wrapping the packed bits into a class, and overriding [] such that

board[x][y] 

gives you the unpacked bits. Then you can turn the packing on and off easily, and having the same syntax in either case. If you inline the operator overloads, it should be as efficient as the code you have now.

毁梦 2024-09-05 01:22:28

嗯,64 字节是非常小的 RAM。你最好只使用 char[8][8] 。也就是说,除非您计划存储大量棋盘。执行 char[8][8] 可以更轻松(且更快)地访问板并对其执行更复杂的操作。

如果您仍然对以打包表示形式存储板感兴趣(无论是为了练习还是存储大量板),我说您在位操作方面“做得正确”。如果您想使用 inline 关键字提高速度,您可能需要考虑内联访问器。

Well, 64 bytes is a very small amount of RAM. You're better off just using a char[8][8]. That is, unless you plan on storing a ton of chess boards. Doing char[8][8] makes it easier (and faster) to access the board and do more complex operations on it.

If you're still interested in storing the board in packed representation (either for practice or to store a lot of boards), I say you're "doing it right" regarding the bit operations. You may want to consider inlining your accessors if you're going for speed using the inline keyword.

故事和酒 2024-09-05 01:22:28

当您不能仅使用完整字节来表示正方形时,空间是否足够考虑?这将使程序上的访问更容易遵循,而且由于不需要位操作,因此很可能更快。

否则,为了确保一切顺利,我将确保所有类型都是无符号的:getPosition 返回无符号字符,并用“U”(例如 0xF0U)限定所有数字文字,以确保它们始终被解释为无符号。您很可能不会遇到任何符号性问题,但为什么要在某些行为异常的架构上冒险呢?

Is space enough of a consideration where you can't just use a full byte to represent a square? That would make accesses easier to follow on the program and additionally most likely faster as the bit manipulations are not required.

Otherwise to make sure everything goes smoothly I would make sure all your types are unsigned: getPosition return unsigned char, and qualify all your numeric literals with "U" (0xF0U for example) to make sure they're always interpreted as unsigned. Most likely you won't have any problems with signed-ness but why take chances on some architecture that behaves unexpectedly?

望喜 2024-09-05 01:22:28

不错的代码,但如果您真的对性能优化如此深入,您可能应该更多地了解您的特定 CPU 架构。

AFAIK,您可能会发现将棋子存储在 8 个字节中会更有效。即使深度递归 15 次,L2 缓存大小也几乎不会成为约束,但 RAM 未对齐可能。我猜想正确处理棋盘将包括 Expand() 和 Reduce() 函数,以在算法的不同部分期间在棋盘表示之间进行转换:有些在紧凑表示上可能更快,有些反之亦然。例如,缓存和涉及通过两个相邻单元的组合进行散列的算法可能有利于紧凑结构,但其他一切都没有。

如果性能如此重要的话,我还会考虑开发一些辅助硬件,例如一些 FPGA 板或一些 GPU 代码。

Nice code, but if you are really that deep into performance optimization, you should probably learn more about your particular CPU architecture.

AFAIK, you may found that storing a chess piece in as much 8 bytes will be more efficient. Even if you recurse 15 moves deep, L2 cache size would hardly be a constraint, but RAM misalignment may be. I would guess that proper handling of a chess board would include Expand() and Reduce() functions to translate between board representations during different parts of the algorithm: some may be faster on compact representation, and some vice versa. For example, caching, and algorithms involving hashing by composition of two adjacent cells might be good for the compact structure, all else no.

I would also consider developing some helper hardware, like some FPGA board, or some GPU code, if performance is so important..

夏至、离别 2024-09-05 01:22:28

作为一名国际象棋棋手,我可以告诉你:走位不仅仅取决于每颗棋子的位置。你必须考虑其他一些事情:

  1. 哪一方接下来必须采取行动?
  2. 白色和/或黑色城堡可以是国王和/或皇后区吗?
  3. 典当可以被带走吗?
  4. 自上次棋子移动和/或吃子移动以来已经经过了多少步?

如果您用来表示仓位的数据结构不能反映此信息,那么您就有大麻烦了。

As a chess player, I can tell you: There's more to a position than the mere placement of each piece. You have to take in to consideration some other things:

  1. Which side has to move next?
  2. Can white and/or black castle king and/or queenside?
  3. Can a pawn be taken en passant?
  4. How many moves have passed since the last pawn move and/or capturing move?

If the data structure you use to represent a position doesn't reflect this information, then you're in big trouble.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文