Identifying (and removing) sequences in a vector in Matlab/Octave

Posted 2024-11-10 11:15:52

I'm trying to prune any sequence of length 3 or more from a vector of numbers in Matlab (or Octave), where a sequence is a run of consecutive integers such as 17 18 19 20. For example, given the vector dataSet,

dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

Removing all sequences of length 3 or more would yield prunedDataSet:

prunedDataSet = [7 9 11 13 22 28 30 31];

I can brute-force a solution, but I suspect there is a more succinct (and perhaps more efficient) way to do it using vector/matrix operations. I always get confused about whether something yields an index or the value at that index. Suggestions?

Here's the brute force method I came up with:

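dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

% collect every index at which three consecutive integers start
benign = [];
for i = 1:numel(dataSet)-2
    if dataSet(i) == dataSet(i+1)-1 && dataSet(i) == dataSet(i+2)-2
        benign = [benign i];
    end
end

% each start index marks itself and the next two positions for removal
remove = [];
for i = 1:numel(benign)
    remove = [remove benign(i) benign(i)+1 benign(i)+2];
end

% overlapping runs produce repeated indices, so deduplicate them
remove = unique(remove);

% note: setdiff works on values and returns them sorted and unique
prunedDataSet = setdiff(dataSet, dataSet(remove));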

怀念你的温柔 2024-11-17 11:15:52

Here's a solution using DIFF and STRFIND:

% define dataset
dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];

% take the difference: neighbouring members of a sequence differ by 1
dds = diff(dataSet);

% a sequence of 3 gives two consecutive ones; a sequence of 4 is two overlapping sequences of 3
% (strfind returns the start index of every [1 1] match in dds)
seqIdx = strfind(dds,[1 1]);

% remove start, start+1, start+2 for every match
dataSet(bsxfun(@plus,seqIdx,[0;1;2])) = []
dataSet =

     7     9    11    13    22    28    30    31
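
Where numeric input to strfind is not accepted (this seems to be the case in some Octave versions, which is an assumption worth checking locally), the same idea can be sketched with diff and logical masks alone. This is not part of the answer above: minLen, isStep, edges, runStart, runEnd and mask are illustrative names, and dataSet is assumed to be a row vector.

```matlab
% Sketch: drop every run of at least minLen consecutive integers.
% minLen and all helper names here are illustrative assumptions.
dataSet = [1 2 3 7 9 11 13 17 18 19 20 22 24 25 26 28 30 31];
minLen = 3;

isStep = diff(dataSet) == 1;          % true where the next element is current + 1
edges = diff([false isStep false]);   % +1 where a block of steps begins, -1 just past its end
runStart = find(edges == 1);          % index of the first element of each run
runEnd = find(edges == -1);           % index of the last element of each run
runLen = runEnd - runStart + 1;       % number of elements in each run

mask = false(size(dataSet));
for k = find(runLen >= minLen)
    mask(runStart(k):runEnd(k)) = true;   % mark whole runs that are long enough
end

prunedDataSet = dataSet(~mask);       % order of the remaining values is preserved
```

Because whole runs are located first, changing minLen is enough to prune runs of a different minimum length, which the fixed [1 1] pattern does not allow directly.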

思慕 2024-11-17 11:15:52

Here's an attempt using vector-matrix notation:

s1 = [(dataSet(1:end-1) == dataSet(2:end)-1), false];
s2 = [(dataSet(1:end-2) == dataSet(3:end)-2), false, false];
s3 = s1 & s2;
s = s3 | [false, s3(1:end-1)] | [false, false, s3(1:end-2)];
dataSet(~s)

The idea is: s1 is true at every position where a number a is immediately followed by a+1. s2 is true at every position where a is followed, two positions later, by a+2. s3 is true where both conditions hold, i.e. at the start of three consecutive numbers. Then we build s by propagating every true value in s3 to its two successors, so that all three members are marked.

Finally, dataSet(~s) keeps all the values for which s is false, that is, the numbers that are not part of any sequence of 3 or more.
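
To make the masks concrete, here is a small trace of the same statements on a made-up vector. The vector x and the name kept are illustrative and not from the answer; the values in the comments were worked out by hand for this input only.

```matlab
% Worked trace on a short, made-up row vector (x is an illustrative assumption).
x = [4 5 6 7 9 10];

s1 = [(x(1:end-1) == x(2:end)-1), false];         % [1 1 1 0 1 0]
s2 = [(x(1:end-2) == x(3:end)-2), false, false];  % [1 1 0 0 0 0]
s3 = s1 & s2;                                     % [1 1 0 0 0 0], starts of 3-element runs
s  = s3 | [false, s3(1:end-1)] | [false, false, s3(1:end-2)];  % [1 1 1 1 0 0]

kept = x(~s);                                     % [9 10]
```

The run 4 5 6 7 has two overlapping starts in s3 (at 4 and at 5), and propagating both covers all four elements, while the pair 9 10 is left alone because s2 is false there.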
