Reshaping a vector using unique() in MATLAB

Posted on 2024-10-15 07:08:43

Let's say A is a 200-item cell array containing 4 different strings (each repeated 50 times).
B is a 200-item vector of integers.

I'm calling [cellNos cellStartInd enumCells] = unique(A) to find which unique string each item in A matches (enumCells is an array of integers 1-4 that enumerates the strings).

I want to use this information to create a 4x50 matrix of values from B so that each column will have the values for a specific unique string. In other words, I want to reshape B into a matrix whose columns are arranged according to the unique strings in A.

Comments (4)

小…楫夜泊 2024-10-22 07:08:43

Assuming that you already know how many repetitions there will be, and that all strings are repeated with equal frequency, you can do the following:

%# sort to find where the entries occur (remember: sort does stable sorting)
[~,sortIdx] = sort(enumCells);

%# preassign the output to 50-by-4 for easy linear indexing
newB = zeros(50,4);

%# fill in values from B: first the 50 ones, then the 50 2's etc
newB(:) = B(sortIdx);

%# transpose to get a 4-by-50 array
newB = newB';

Or, in a more compact fashion (thanks @Rich C)

[~,sortIdx] = sort(enumCells);
newB = reshape(B(sortIdx),50,4)';
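
To see the sort-and-reshape approach end to end, here is a minimal self-contained sketch. The sample data (the labels 'w' to 'z', the variable groupOfItem, and the way B encodes its group in the thousands digit) are assumptions made up for illustration; they are not part of the original question.

```matlab
% Hypothetical sample data: 4 labels, 50 occurrences each, in shuffled order
labels = {'w', 'x', 'y', 'z'};
groupOfItem = repmat(1:4, 1, 50);           % group number of each item, 1..4
groupOfItem = groupOfItem(randperm(200));   % shuffle the 200 items
A = labels(groupOfItem);                    % 1x200 cell array of strings
B = 1000 * groupOfItem + (1:200);           % thousands digit encodes the group

[cellNos, cellStartInd, enumCells] = unique(A);

% Stable sort: items of group 1 come first, then group 2, and so on,
% each group keeping the order it had in B
[~, sortIdx] = sort(enumCells);
newB = reshape(B(sortIdx), 50, 4)';         % 4-by-50, one row per unique string

% Every entry in row k should belong to group k
rowsOk = all(all(floor(newB / 1000) == repmat((1:4)', 1, 50)));
```

Because the sort is stable, the values within each row of newB also stay in the order in which they appeared in B.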
抚你发端 2024-10-22 07:08:43

In the general case, where you have N different strings and the i-th string occurs M_i times, each corresponding set of values in B will have a different length, so you won't be able to concatenate the sets into a numeric array. Instead, you will have to store the sets in an N-element cell array, which you can build using the functions UNIQUE and ACCUMARRAY:

>> A = {'a' 'b' 'b' 'c' 'a' 'a' 'a' 'c' 'd' 'b'};  %# Sample array A
>> B = 1:10;                                       %# Sample array B
>> [uniqueStrings,~,index] = unique(A)
>> associatedValues = accumarray(index(:),B,[],@(x) {x})

associatedValues = 

    [4x1 double]    %# The values 1, 5, 6, and 7
    [3x1 double]    %# The values 2, 3, and 10
    [2x1 double]    %# The values 4 and 8
    [         9]    %# The value 9

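In the specific case where each string occurs the same number of times, the above code will still work just fine, and you will have the option of converting the output from a cell array to a numeric array like so (each cell holds a column vector, so the result has one column per unique string; transpose it if you want one row per string):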

associatedValues = [associatedValues{:}];

NOTE: Since ACCUMARRAY is not guaranteed to maintain the relative order of items it accumulates, the order of items within the cells of associatedValues may not match the relative order they had in the vector B. One way to ensure that the original relative order in B is maintained is to modify the call to ACCUMARRAY as follows:

 associatedValues = accumarray(index(:),1:numel(B),[],@(x) {B(sort(x))});

Or you could sort the inputs to ACCUMARRAY to get the same effect:

[index,sortIndex] = sort(index);
associatedValues = accumarray(index(:),B(sortIndex),[],@(x) {x});
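
As a small worked case, the sketch below combines the pre-sorting variant with the conversion to a numeric array for equally sized groups. The six-element A and B and the names sortedIndex, groups, valuesByColumn, and valuesByRow are made up for illustration.

```matlab
% Hypothetical data: 3 unique strings, 2 occurrences each
A = {'b' 'a' 'c' 'a' 'c' 'b'};
B = (1:6)';                                  % column vector of values

[uniqueStrings, ~, index] = unique(A);       % uniqueStrings is {'a','b','c'}

% Sort the subscripts first so ACCUMARRAY sees each group's values
% in the order they had in B (sort is stable)
[sortedIndex, sortIndex] = sort(index(:));
groups = accumarray(sortedIndex, B(sortIndex), [], @(x) {x});

valuesByColumn = [groups{:}];                % 2-by-3, one column per unique string
valuesByRow = valuesByColumn';               % 3-by-2, one row per unique string
```

The concatenation step only works because every group has the same number of elements; with unequal groups, keep the cell array.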
我恋#小黄人 2024-10-22 07:08:43

If I understand your question correctly, this can be done using the find function.
http://www.mathworks.com/help/techdoc/ref/find.html

To create your matrix, just write:

M(:,1) = B(find(enumCells==1));
M(:,2) = B(find(enumCells==2));
M(:,3) = B(find(enumCells==3));
M(:,4) = B(find(enumCells==4));

There's probably a more elegant way to do it, but this should work.

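EDIT: You could try using "sort" to do it. The sort function can return the sorting permutation as its second output. The following yields a 50-by-4 matrix with one column per unique string (transpose it if you need 4-by-50):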

[s perm] = sort(enumCells);
M = reshape(B(perm),50,4);
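
The four find lines can also be written as a loop over the group numbers, using logical indexing instead of find. This is only a sketch on made-up data (the six-element enumCells and B are assumptions), and it relies on every group having the same size; it also checks that the loop and the sort-based version agree.

```matlab
% Hypothetical data: 3 groups, 2 members each
enumCells = [2 1 3 1 3 2]';
B = [10 20 30 40 50 60]';

numGroups = max(enumCells);
M = zeros(numel(B) / numGroups, numGroups);  % one column per group
for k = 1:numGroups
    M(:, k) = B(enumCells == k);             % logical indexing, no find needed
end

% Same result via the sorting permutation
[~, perm] = sort(enumCells);
M2 = reshape(B(perm), [], numGroups);
sameResult = isequal(M, M2);
```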
っ〆星空下的拥抱 2024-10-22 07:08:43

This method will work if the number of entries per string is the same. If the counts vary, see @gnovice's solution.

NumStrings = numel(cellNos);
M = zeros(numel(B)/NumStrings,NumStrings);   %# one column per unique string
for i = 1:NumStrings
    %# pick the values of B whose label in A matches the i-th unique string
    M(:,i) = B(strcmp(A,cellNos{i}));
end

Also, if you know what the unique strings are ahead of time (i.e., the contents of cellNos), this allows you to skip the unique call, which is relatively expensive.
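
If the strings are known in advance, one possible way to skip unique and still confirm that the equal-count assumption holds is to number the items with ismember and count the members of each group. The sketch below uses made-up data; knownStrings, counts, and equalCounts are hypothetical names, and the use of ismember is my substitution rather than part of the answer above.

```matlab
% Hypothetical data: labels known ahead of time, 2 occurrences each
knownStrings = {'x', 'y'};
A = {'y' 'x' 'x' 'y'};
B = [1 2 3 4]';

% Position of each item of A within knownStrings (0 would mean "not found")
[found, enumCells] = ismember(A, knownStrings);

% Count the members of each group and check that all counts agree
counts = accumarray(enumCells(:), 1, [numel(knownStrings) 1]);
equalCounts = all(found) && all(counts == counts(1));

if equalCounts
    M = zeros(counts(1), numel(knownStrings));
    for i = 1:numel(knownStrings)
        M(:, i) = B(strcmp(A, knownStrings{i}));
    end
end
```

When equalCounts is false, the numeric matrix cannot be built and a cell array is the safer container.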
