SET 游戏赔率模拟 (MATLAB)

发布于 2024-09-01 07:03:36 字数 1702 浏览 10 评论 0原文

我最近发现了一张很棒的卡片 - SET。简而言之，共有 81 张卡片，具有四种特征：符号（椭圆形、波浪线或菱形）、颜色（红色、紫色或绿色）、数字（一个、两个或三个）和阴影（纯色、条纹或开放）。任务是（从选定的 12 张卡片中）找到一组 3 张卡片，其中每张卡片上的四个特征全部相同或每张卡片上全部不同（没有 2+1 组合）。

我在 MATLAB 中对其进行了编码，以找到解决方案并估计随机选择的卡片中出现一组的几率。

这是我估计赔率的代码：

%% initialization
K = 12; % cards to draw
NF = 4; % number of features (usually 3 or 4)
setallcards = unique(nchoosek(repmat(1:3,1,NF),NF),'rows'); % all cards: rows - cards, columns - features
setallcomb = nchoosek(1:K,3); % index of all combinations of K cards by 3

%% test
tic
NIter=1e2; % number of test iterations
setexists = 0; % test results holder
% C = progress('init'); % if you have progress function from FileExchange
for d = 1:NIter
% C = progress(C,d/NIter);    

% cards for current test
setdrawncardidx = randi(size(setallcards,1),K,1);
setdrawncards = setallcards(setdrawncardidx,:);

% find all sets in current test iteration
for setcombidx = 1:size(setallcomb,1)
    setcomb = setdrawncards(setallcomb(setcombidx,:),:);
    if all(arrayfun(@(x) numel(unique(setcomb(:,x))), 1:NF)~=2) % test one combination
        setexists = setexists + 1;
        break % to find only the first set
    end
end
end
fprintf('Set:NoSet = %g:%g = %g:1\n', setexists, NIter-setexists, setexists/(NIter-setexists))
toc

100-1000 次迭代很快，但要小心更多。在我的家用计算机上，一百万次迭代大约需要 15 个小时。不管怎样，有了 12 张卡牌和 4 个功能，我的一套牌的比例大约是 13:1。这其实是一个问题。说明书上说这个数字应该是33:1。最近 Peter Norvig 证实了这一点。他提供了Python代码，但我还没有测试。

那么你能发现错误吗？欢迎任何有关性能改进的意见。

原文

I have recently found the great card came - SET. Briefly, there are 81 cards with the four features: symbol (oval, squiggle or diamond), color (red, purple or green), number (one, two or three) and shading (solid, striped or open). The task is to locate (from selected 12 cards) a SET of 3 cards, in which each of the four features is either all the same on each card or all different on each card (no 2+1 combination).

I've coded it in MATLAB to find a solution and to estimate odds of having a set in randomly selected cards.

Here is my code to estimate odds:

%% initialization
K = 12; % cards to draw
NF = 4; % number of features (usually 3 or 4)
setallcards = unique(nchoosek(repmat(1:3,1,NF),NF),'rows'); % all cards: rows - cards, columns - features
setallcomb = nchoosek(1:K,3); % index of all combinations of K cards by 3

%% test
tic
NIter=1e2; % number of test iterations
setexists = 0; % test results holder
% C = progress('init'); % if you have progress function from FileExchange
for d = 1:NIter
% C = progress(C,d/NIter);    

% cards for current test
setdrawncardidx = randi(size(setallcards,1),K,1);
setdrawncards = setallcards(setdrawncardidx,:);

% find all sets in current test iteration
for setcombidx = 1:size(setallcomb,1)
    setcomb = setdrawncards(setallcomb(setcombidx,:),:);
    if all(arrayfun(@(x) numel(unique(setcomb(:,x))), 1:NF)~=2) % test one combination
        setexists = setexists + 1;
        break % to find only the first set
    end
end
end
fprintf('Set:NoSet = %g:%g = %g:1\n', setexists, NIter-setexists, setexists/(NIter-setexists))
toc

100-1000 iterations are fast, but be careful with more. One million iterations takes about 15 hours on my home computer. Anyway, with 12 cards and 4 features I've got around 13:1 of having a set. This is actually a problem. The instruction book said this number should be 33:1. And it was recently confirmed by Peter Norvig. He provides the Python code, but I didn't test it yet.

So can you find an error? Any comments on performance improvement are welcome.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

鸠魁 2024-09-08 07:03:36

在查看您的代码之前，我解决了编写自己的实现的问题。我的第一次尝试与您已有的非常相似:)

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# check for valid sets: features are all the same or all different
    for s=1:size(setsInd,1)
        %# set of 3 cards
        set = cardsDrawn(setsInd(s,:),:);

        %# check if set is valid
        count = arrayfun(@(k) numel(unique(set(:,k))), 1:FEAT_NUM);
        isValid = (count==1|count==3);

        %# increment counter
        if isValid
            counterValidSet = counterValidSet + 1;
            break           %# break early if found valid set among candidates
        end
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

在使用 Profiler 发现热点后，可以通过在可能的情况下尽早退出循环来进行一些改进。主要瓶颈是对 UNIQUE 函数的调用。上面我们检查有效集的两行可以重写为：

%# check if set is valid
isValid = true;
for k=1:FEAT_NUM
    count = numel(unique(set(:,k)));
    if count~=1 && count~=3
        isValid = false;
        break   %# break early if one of the features doesnt meet conditions
    end
end

不幸的是，对于较大的模拟来说，模拟仍然很慢。因此，我的下一个解决方案是矢量化版本，对于每次迭代，我们从 12 张抽取的牌中构建所有可能的 3 张牌组的单个矩阵。对于所有这些候选集，我们使用逻辑向量来指示存在哪些特征，从而避免调用 UNIQUE/NUMEL（我们希望该组的每张卡上的特征全部相同或全部不同）。

我承认代码现在可读性较差且难以遵循（因此我发布了两个版本以进行比较）。原因是我尝试尽可能地优化代码，以便每个迭代循环都是完全矢量化的。这是最终代码：

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# optimizations: some calculations taken out of the loop
ss = setsInd(:);
set_sz2 = numel(ss)*FEAT_NUM/SET_SZ;
col = repmat(1:set_sz2,SET_SZ,1);
col = FEAT_SZ.*(col(:)-1);
M = false(FEAT_SZ,set_sz2);

%# progress indication
%#hWait = waitbar(0./NUM_ITER, 'Simulation...');

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# update progress
    %#waitbar(i./NUM_ITER, hWait);

    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# put all possible sets of 3 cards next to each other
    set = reshape(cardsDrawn(ss,:)',[],SET_SZ)';
    set = set(:);

    %# check for valid sets: features are all the same or all different
    M(:) = false;            %# if using PARFOR, it will complain about this
    M(set+col) = true;
    isValid = all(reshape(sum(M)~=2,FEAT_NUM,[]));

    %# increment counter if there is at least one valid set in all candidates
    if any(isValid)
        counterValidSet = counterValidSet + 1;
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

%# close progress bar
%#close(hWait)

如果您有并行处理工具箱，您可以轻松地用并行 PARFOR 替换普通的 FOR 循环（您可能希望再次将矩阵 M 的初始化移到循环内：将 M(:) = false; 替换为 M = false(FEAT_SZ,set_sz2);)

以下是 50000 次模拟的一些示例输出（PARFOR 与 2 个池一起使用）本地实例）：

» tic, SET_game2, toc
Size=12, Set=48376, NoSet=1624, Set:NoSet=29.7882
Elapsed time is 5.653933 seconds.

» tic, SET_game2, toc
Size=15, Set=49981, NoSet=19, Set:NoSet=2630.58
Elapsed time is 9.414917 seconds.

并且经过一百万次迭代（PARFOR 为 12，no-PARFOR 为 15）：

» tic, SET_game2, toc
Size=12, Set=967516, NoSet=32484, Set:NoSet=29.7844
Elapsed time is 110.719903 seconds.

» tic, SET_game2, toc
Size=15, Set=999630, NoSet=370, Set:NoSet=2701.7
Elapsed time is 372.110412 seconds.

优势比与彼得·诺维格。

I tackled the problem writing my own implementation before looking at your code. My first attempt was very similar to what you already had :)

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# check for valid sets: features are all the same or all different
    for s=1:size(setsInd,1)
        %# set of 3 cards
        set = cardsDrawn(setsInd(s,:),:);

        %# check if set is valid
        count = arrayfun(@(k) numel(unique(set(:,k))), 1:FEAT_NUM);
        isValid = (count==1|count==3);

        %# increment counter
        if isValid
            counterValidSet = counterValidSet + 1;
            break           %# break early if found valid set among candidates
        end
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

After using the Profiler to discover hot spots, some improvement can be made mainly by early-break'ing out of loops when possible. The main bottleneck is the call to the UNIQUE function. Those two lines above where we check for valid sets can be rewritten as:

%# check if set is valid
isValid = true;
for k=1:FEAT_NUM
    count = numel(unique(set(:,k)));
    if count~=1 && count~=3
        isValid = false;
        break   %# break early if one of the features doesnt meet conditions
    end
end

Unfortunately, the simulation is still slow for larger simulation. Thus my next solution is a vectorized version, where for each iteration, we build a single matrix of all possible sets of 3 cards from the hand of 12 drawn cards. For all these candidate sets, we use logical vectors to indicate what feature is present, thus avoiding the calls to UNIQUE/NUMEL (we want features all the same or all different on each card of the set).

I admit that the code is now less readable and harder to follow (thus I posted both versions for comparison). The reason being that I tried to optimize the code as much as possible, so that each iteration-loop is fully vectorized. Here is the final code:

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# optimizations: some calculations taken out of the loop
ss = setsInd(:);
set_sz2 = numel(ss)*FEAT_NUM/SET_SZ;
col = repmat(1:set_sz2,SET_SZ,1);
col = FEAT_SZ.*(col(:)-1);
M = false(FEAT_SZ,set_sz2);

%# progress indication
%#hWait = waitbar(0./NUM_ITER, 'Simulation...');

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# update progress
    %#waitbar(i./NUM_ITER, hWait);

    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# put all possible sets of 3 cards next to each other
    set = reshape(cardsDrawn(ss,:)',[],SET_SZ)';
    set = set(:);

    %# check for valid sets: features are all the same or all different
    M(:) = false;            %# if using PARFOR, it will complain about this
    M(set+col) = true;
    isValid = all(reshape(sum(M)~=2,FEAT_NUM,[]));

    %# increment counter if there is at least one valid set in all candidates
    if any(isValid)
        counterValidSet = counterValidSet + 1;
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

%# close progress bar
%#close(hWait)

If you have the Parallel Processing Toolbox, you can easily replace the plain FOR-loop with a parallel PARFOR (you might want to move the initialization of the matrix M inside the loop again: replace M(:) = false; with M = false(FEAT_SZ,set_sz2);)

Here are some sample outputs of 50000 simulations (PARFOR used with a pool of 2 local instances):

» tic, SET_game2, toc
Size=12, Set=48376, NoSet=1624, Set:NoSet=29.7882
Elapsed time is 5.653933 seconds.

» tic, SET_game2, toc
Size=15, Set=49981, NoSet=19, Set:NoSet=2630.58
Elapsed time is 9.414917 seconds.

And with a million iterations (PARFOR for 12, no-PARFOR for 15):

» tic, SET_game2, toc
Size=12, Set=967516, NoSet=32484, Set:NoSet=29.7844
Elapsed time is 110.719903 seconds.

» tic, SET_game2, toc
Size=15, Set=999630, NoSet=370, Set:NoSet=2701.7
Elapsed time is 372.110412 seconds.

The odds ratio agree with the results reported by Peter Norvig.

回复收藏 0 原文

浮世清欢 2024-09-08 07:03:36

这是一个矢量化版本，大约一分钟就可以计算出 1M 手牌。我用它得到了大约 28:1，所以找到“所有不同”的集合可能仍然有点偏差。我的猜测是，这也是您的解决方案遇到的问题。

%# initialization
K = 12; %# cards to draw
NF = 4; %# number of features (this is hard-coded to 4)
nIter = 100000; %# number of iterations

%# each card has four features. This means that a card can be represented
%# by a coordinate in 4D space. A set is a full row, column, etc in 4D
%# space. We can even parallelize the iterations, at least as long as we
%# have RAM (each hand costs 81 bytes)
%# make card space - one dimension per feature, plus one for the iterations
cardSpace = false(3,3,3,3,nIter);

%# To draw cards, we put K trues into each cardSpace. I can't think of a
%# good, fast way to draw exactly K cards that doesn't involve calling
%# unique
for i=1:nIter
    shuffle = randperm(81) + (i-1) * 81;
    cardSpace(shuffle(1:K)) = true;
end

%# to test, all we have to do is check whether there is any row, column,
%# with all 1's
isEqual = squeeze(any(any(any(all(cardSpace,1),2),3),4) | ...
    any(any(any(all(cardSpace,2),1),3),4) | ...
    any(any(any(all(cardSpace,3),2),1),4) | ...
    any(any(any(all(cardSpace,4),2),3),1));
%# to get a set of 3 cards where all symbols are different, we require that
%# no 'sub-volume' is completely empty - there may be something wrong with this
%# but since my test looked ok, I'm not going to investigate on Friday night
isDifferent = squeeze(~any(all(all(all(~cardSpace,1),2),3),4) & ...
    ~any(all(all(all(~cardSpace,1),2),4),3) & ...
    ~any(all(all(all(~cardSpace,1),3),4),2) & ...
    ~any(all(all(all(~cardSpace,4),2),3),1));

isSet = isEqual | isDifferent;

%# find the odds
fprintf('odds are %5.2f:1\n',sum(isSet)/(nIter-sum(isSet)))

Here's a vectorized version, where 1M hands can be calculated in about a minute. I got about 28:1 with it, so there might still be something a little off with finding 'all different' sets. My guess is that this is what your solution has trouble with, as well.

%# initialization
K = 12; %# cards to draw
NF = 4; %# number of features (this is hard-coded to 4)
nIter = 100000; %# number of iterations

%# each card has four features. This means that a card can be represented
%# by a coordinate in 4D space. A set is a full row, column, etc in 4D
%# space. We can even parallelize the iterations, at least as long as we
%# have RAM (each hand costs 81 bytes)
%# make card space - one dimension per feature, plus one for the iterations
cardSpace = false(3,3,3,3,nIter);

%# To draw cards, we put K trues into each cardSpace. I can't think of a
%# good, fast way to draw exactly K cards that doesn't involve calling
%# unique
for i=1:nIter
    shuffle = randperm(81) + (i-1) * 81;
    cardSpace(shuffle(1:K)) = true;
end

%# to test, all we have to do is check whether there is any row, column,
%# with all 1's
isEqual = squeeze(any(any(any(all(cardSpace,1),2),3),4) | ...
    any(any(any(all(cardSpace,2),1),3),4) | ...
    any(any(any(all(cardSpace,3),2),1),4) | ...
    any(any(any(all(cardSpace,4),2),3),1));
%# to get a set of 3 cards where all symbols are different, we require that
%# no 'sub-volume' is completely empty - there may be something wrong with this
%# but since my test looked ok, I'm not going to investigate on Friday night
isDifferent = squeeze(~any(all(all(all(~cardSpace,1),2),3),4) & ...
    ~any(all(all(all(~cardSpace,1),2),4),3) & ...
    ~any(all(all(all(~cardSpace,1),3),4),2) & ...
    ~any(all(all(all(~cardSpace,4),2),3),1));

isSet = isEqual | isDifferent;

%# find the odds
fprintf('odds are %5.2f:1\n',sum(isSet)/(nIter-sum(isSet)))

回复收藏 0 原文

巨坚强 2024-09-08 07:03:36

我发现了我的错误。感谢乔纳斯对 RANDPERM 的提示。

我用RANDI随机抽了K张牌，但即使是12张牌，也有大约50%的机会重复。当我用 randperm 替换这一行时，经过 10000 次迭代，我得到了 33.8:1，非常接近说明书中的数字。

setdrawncardidx = randperm(81);
setdrawncardidx = setdrawncardidx(1:K);

不管怎样，看到解决这个问题的其他方法会很有趣。

I found my error. Thanks Jonas for the hint with RANDPERM.

I used RANDI to randomly drawn K cards, but there is about 50% chance to get repeats even in 12 cards. When I substituted this line with randperm, I've got 33.8:1 with 10000 iterations, very close to the number in instruction book.

setdrawncardidx = randperm(81);
setdrawncardidx = setdrawncardidx(1:K);

Anyway, it would be interesting to see other approaches to the problem.

回复收藏 0 原文

愛放△進行李 2024-09-08 07:03:36

我确信我对这些赔率的计算有问题，因为其他几个人已经通过模拟确认了赔率接近 33:1，如说明所示，但是以下逻辑有什么问题吗？

对于 12 张随机牌，三张牌有 220 种可能的组合 (12!/(9!3!) = 220)。每三张牌的组合都有 1/79 的机会成为一组，因此任意三张牌不成为一组的机会有 78/79。因此，如果您检查了所有 220 个组合，并且每个组合都不是一组的可能性为 78/79，那么检查所有可能组合时找不到一组的可能性将为 78/79 的 220 次方，即 0.0606，大约是赔率17：1。

我一定是错过了什么……？

克里斯托弗

回复收藏 0 原文

~没有更多了~