如何使用 Matlab 或 Excel 获取两个表中行的交集?

发布于 2024-12-22 01:47:05 字数 1620 浏览 4 评论 0原文

我一直在谷歌搜索,但到目前为止发现结果并不完全符合我的需要。如果有人发布了同一问题的答案,有人可以告诉我吗?非常感谢。

我有来自两个 Excel 文件的两个表,一个仅包含数值,另一个包含 txt 和数值。例如,file1.xlsx 只有一个 StudentID 的列向量,它们是整数,例如,

1  
3    
5     
11     
20 ....    

并且 file2.xlsx 包含一个工作表:

 studentID   Name     score    email        
 1           Cassie   60       [email protected]     
 2           John     100      [email protected]      
 3           Marry    80       ...    
 4           Bob      70       ...     
 5           Kevin    99       ....     

我想查找学生 ID 在 file1.xlsx 中的学生行,因此最终结果表应该是这样的:

studentID  Name    score  email       
1          Cassie  60      [email protected]...     
3          Marry   80      [email protected]    
5          Kevin   99     [email protected]    
...    

我没有 SQL 或 Perl 等脚本语言。我知道在matlab中,有一个交集函数,但是,它只适用于数值矩阵。对于我的 file2.xlsx,它包含数字值和 txt 值。我不想把他们分开。而且我也不知道如何将 Excel 文件读入单元格数组,因为该表既有数字部分又有 txt 部分。如果有人知道如何使用 Matlab 或 Excel 解决这个问题对我来说没问题,我只需要得到最终的结果表。

非常感谢你,

I have been googling but so far results found not fully match to my need. If someone has posted answer for the same question, can someone let me know ? Thank you very much.

I have two tables from two excel files that one contains only numerical values and the other contains both txt and numerical values. For example file1.xlsx has only a column vector of studentid which are integer number such as,

1  
3    
5     
11     
20 ....    

And the file2.xlsx contains a worksheet as:

 studentID   Name     score    email        
 1           Cassie   60       [email protected]     
 2           John     100      [email protected]      
 3           Marry    80       ...    
 4           Bob      70       ...     
 5           Kevin    99       ....     

I would like to find rows of students whose studentID are in the file1.xlsx so the final resulted table should be looked like:

studentID  Name    score  email       
1          Cassie  60      [email protected]...     
3          Marry   80      [email protected]    
5          Kevin   99     [email protected]    
...    

I do not have script languages such as SQL or Perl. I know in matlab, there is a interset function, however, it only work with numerical matrix. For my file2.xlsx, it contains both numerical and txt values. I do not want to separate them. And I do not know how to read a excel file into a cellarray either since the table has both numerical and txt parts. If there is anyone who knows how to solve this problem either by using Matlab or Excel is fine with me, I just need to get the final resulted table.

Thank you so very much,

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

耀眼的星火 2024-12-29 01:47:05

在 Excel 中,您可以使用以下公式在 file2.xlsx 中创建一个新列:

=MATCH(A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0)

其中 $A$2:$A$20 是 file1.xlsx 中包含学生 ID 的范围。最后的0表示完全匹配。将此公式分布到所有行。如果 ID 不在 file1 中,则该函数在行中返回 #N/A。您可以使用此列过滤掉这些行。

为了更容易过滤,您可以添加 ISERROR 函数:

=NOT(ISERROR(MATCH(A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0)))

然后 TRUE 将对应于现有 ID,FALSE 对应于缺失 ID。

如果 file1.xlsx 中有附加列并希望将其与 file2 中的数据合并,则可以使用 INDEX 函数:

=INDEX([\path\file1.xlsx]Sheet1!B$2:B$20,MATCH($A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0),1)

请注意,MATCH/INDEX 组合不需要对搜索列中的数据进行排序,就像 VLOOKUP 的情况一样。


在 MATLAB 中,您可以使用 ISMEMBER 函数。

假设您已将两个文件导入 MATLAB,并且有元胞数组 file1 和 file2。然后您可以执行以下操作:

[~,~,file1] = xlsread('file1.xlsx');
[~,~,file2] = xlsread('file2.xlsx');
[idx2, idx1] = ismember(file2(:,1), file1(:,1));
filtered2 = file2(idx2,:);

要查找 file2.xlsx 中存在的 file1.xlsx 中的行,并按相同顺序对它们进行排序:

idx1(idx1==0) = [];
filtered1 = file1(idx1,:);

如果要保留 file1.xlsx 中的行顺序,只需在 ismember 语句中切换 file2 和 file1 即可。

In Excel you create a new column in file2.xlsx with a formula:

=MATCH(A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0)

where $A$2:$A$20 is a range with student IDs in file1.xlsx. Last 0 means exact match. Distribute this formula to all the rows. The function returns #N/A in rows if ID is not in file1. You can filter those rows out using this column.

For easier filtering you can add ISERROR function:

=NOT(ISERROR(MATCH(A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0)))

Then the TRUE will correspond to existing IDs, and FALSE to missing IDs.

If you have additional column in file1.xlsx and want to combine it with data in file2 you can use INDEX function:

=INDEX([\path\file1.xlsx]Sheet1!B$2:B$20,MATCH($A2,[\path\file1.xlsx]Sheet1!$A$2:$A$20,0),1)

Note that MATCH/INDEX combination does not require the data in search column to be sorted, as in case of VLOOKUP.


In MATLAB you can use ISMEMBER function.

Let's say you have imported both files to MATLAB and you have cell arrays file1 and file2. Then you can do:

[~,~,file1] = xlsread('file1.xlsx');
[~,~,file2] = xlsread('file2.xlsx');
[idx2, idx1] = ismember(file2(:,1), file1(:,1));
filtered2 = file2(idx2,:);

To find the rows in file1.xlsx existing in file2.xlsx and sort them in the same order:

idx1(idx1==0) = [];
filtered1 = file1(idx1,:);

If you want to keep the order of rows in file1.xlsx just switch file2 and file1 in the ismember statement.

生生漫 2024-12-29 01:47:05

您可以尝试 Excels VLOOKUP(英语)/ SVERWEIS(德语)函数。
考虑 Excel 文档的以下布局:

1.xlsx
 |  A|  B|  C|  D|
1|  1|   |   |   |
2|  3|   |   |   |
3|  5|   |   |   |
4| 11|   |   |   |
5| 20|   |   |   |


2.xlsx
 |  A|  B|   C|      D|
1|  1|  C|  60| [email protected]|
2|  2|  J| 100| [email protected]|
3|  3|  M|  80| [email protected]|
4|  4|  B|  70|       |
5|  5|  K|  99|       |

使用 2.xlsx 中的学生详细信息填充 1.xlsx 将以下公式分配给 1.xlsx:

B1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;2;TRUE)
B2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;2;TRUE)
...
C1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;3;TRUE)
C2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;3;TRUE)
...
D1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;4;TRUE)
D2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;4;TRUE)
...

VLOOKUP 在第二个参数指定的工作表中搜索第一个参数 ($A1, ..) 的出现次数并复制第二个参数指定的矩阵中的第 n 个值(第三个参数)。第四个布尔参数指定是否需要精确匹配。

你需要替换 '< PATH_TO_2.xlsx >'和'< NAME_OF_SHEET_IN_2.xlsx >'使用正确的值“[]”和“!”必须保留

进一步说明:http://www.techonthenet.com/excel/formulas/ vlookup.php

祝你好运:)

You can try Excels VLOOKUP (english) / SVERWEIS (german) function.
Considering the following layout of your Excel documents:

1.xlsx
 |  A|  B|  C|  D|
1|  1|   |   |   |
2|  3|   |   |   |
3|  5|   |   |   |
4| 11|   |   |   |
5| 20|   |   |   |


2.xlsx
 |  A|  B|   C|      D|
1|  1|  C|  60| [email protected]|
2|  2|  J| 100| [email protected]|
3|  3|  M|  80| [email protected]|
4|  4|  B|  70|       |
5|  5|  K|  99|       |

to populate 1.xlsx with students' details from 2.xlsx assign following formulas to 1.xlsx:

B1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;2;TRUE)
B2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;2;TRUE)
...
C1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;3;TRUE)
C2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;3;TRUE)
...
D1: =VLOOKUP($A1;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;4;TRUE)
D2: =VLOOKUP($A2;[<PATH_TO_2.xlsx>]<NAME_OF_SHEET_IN_2.xlsx>!$A$1:$D$5;4;TRUE)
...

VLOOKUP searches for occurrences of first parameter ($A1, ..) in sheet specified by second parameter and copies n-th value (3rd param) in matrix specified in 2nd param. 4th boolean param specifies whether exact match is required.

you need to replace '< PATH_TO_2.xlsx >' and '< NAME_OF_SHEET_IN_2.xlsx >' with correct values, '[]' and '!' must stay in place

further explanation: http://www.techonthenet.com/excel/formulas/vlookup.php

good luck :)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文