Linq 连接参数化的不同键
我正在尝试基于动态键 LINQ 两个表。用户可以通过组合框更改密钥。键可能是钱、字符串、双精度数、整数等。目前我得到的数据很好,但没有过滤掉双精度数。我可以在 VB 中过滤双精度值,但速度很慢。我想立即在 LINQ 查询中执行此操作。
数据如下:
第一个表:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
第二个表:
------------------------------------------------------------
| OrangeIndex | OrangeCost | OrangeColor | OrangeDescription |
------------------------------------------------------------
| 1 | 1 | Orange | This is an Orange |
| 2 | 3 | Orange | |
| 3 | 2 | Orange | This is an Orange |
| 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------
当前,我使用以下代码获取太多数据:
Dim Matches = From mRows In LinqMasterTable Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows Distinct
结果:
-------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | Orange | This is an Orange |
| 1 | 3 | Red | This is an duplicate | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 3 | Orange | |
| 2 | 5 | Green | This is an duplicate | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 2 | Orange | This is an Orange |
| 3 | 4 | Pink | This is an duplicate | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 3 | Orange | |
| 4 | 2 | Yellow | This is an duplicate | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 2 | Orange | This is an Orange |
| 5 | 2 | Orange | This is an duplicate | 2 | Orange | This is an Orange |
-------------------------------------------------------------------------
期望的结果:
------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 2 | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 3 | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------------------
我尝试了以下操作:
'Get the original Column Names into an Array List
'MasterTableColumns = GetColumns(qMasterDS, TheMasterTable) '(external code)
'Plug the Existing DataSet into a DataView:
Dim View As DataView = New DataView(qMasterTable)
'Sort by the Primary Key:
View.Sort = ThePrimaryKey
'Build a new table listing only one column:
Dim newListTable As DataTable = _
View.ToTable("UniqueData", True, ThePrimaryKey)
这返回一个唯一的列表,但没有关联的数据:
-------------
| AppleIndex |
-------------
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
-------------
所以我尝试了这个:
'Build a new table with ALL the columns:
Dim newFullTable As DataTable = _
View.ToTable("UniqueData", True, _
MasterTableColumns(0), _
MasterTableColumns(1), _
MasterTableColumns(2), _
MasterTableColumns(3))
不幸的是,它产生以下结果...有重复:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
有什么想法吗?
~~ ~~~~~~~~~~ 更新:~~~~~~~~~~~~
Jeff M 建议使用以下代码。 (谢谢杰夫)但是,它给了我一个错误。有谁知道在 VB 中实现此功能的语法吗?我已经搞砸了一点,但似乎无法做到正确。
Dim matches = _
From mRows In (From row In LinqMasterTable _
Group row By row(ThePrimaryKey) Into g() _
Select g.First()) _
Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows
“row(ThePrimaryKey)”第三行中的错误:
“只能从不带参数的简单名称或限定名称推断范围变量名称。”
I'm trying to LINQ two tables based on a dynamic key. User can change key via a combo box. Key may be money, string, double, int, etc. Currently I'm getting the data just fine, but without filtering out the doubles. I can filter the double in VB, but it's slooooow. I'd like to do it in the LINQ query right out of the gate.
Here's the data:
First Table:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
Second Table:
------------------------------------------------------------
| OrangeIndex | OrangeCost | OrangeColor | OrangeDescription |
------------------------------------------------------------
| 1 | 1 | Orange | This is an Orange |
| 2 | 3 | Orange | |
| 3 | 2 | Orange | This is an Orange |
| 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------
Currently, I'm using the following code to get too much data:
Dim Matches = From mRows In LinqMasterTable Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows Distinct
Outcome:
-------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | Orange | This is an Orange |
| 1 | 3 | Red | This is an duplicate | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 3 | Orange | |
| 2 | 5 | Green | This is an duplicate | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 2 | Orange | This is an Orange |
| 3 | 4 | Pink | This is an duplicate | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 3 | Orange | |
| 4 | 2 | Yellow | This is an duplicate | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 2 | Orange | This is an Orange |
| 5 | 2 | Orange | This is an duplicate | 2 | Orange | This is an Orange |
-------------------------------------------------------------------------
Desired Outcome:
------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 2 | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 3 | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------------------
I have tried the following:
'Get the original Column Names into an Array List
'MasterTableColumns = GetColumns(qMasterDS, TheMasterTable) '(external code)
'Plug the Existing DataSet into a DataView:
Dim View As DataView = New DataView(qMasterTable)
'Sort by the Primary Key:
View.Sort = ThePrimaryKey
'Build a new table listing only one column:
Dim newListTable As DataTable = _
View.ToTable("UniqueData", True, ThePrimaryKey)
This returns a unique list, but no associated data:
-------------
| AppleIndex |
-------------
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
-------------
So I tried this instead:
'Build a new table with ALL the columns:
Dim newFullTable As DataTable = _
View.ToTable("UniqueData", True, _
MasterTableColumns(0), _
MasterTableColumns(1), _
MasterTableColumns(2), _
MasterTableColumns(3))
Unfortunately, it yields the following... with duplicates:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
Any ideas?
~~~~~~~~~~~~ Update: ~~~~~~~~~~~~
Jeff M suggested the following code. (Thanks Jeff) However, it gives me a error. Does anyone know the syntax for making this work in VB? I've monkeyed with it a bit and can't seem to get it right.
Dim matches = _
From mRows In (From row In LinqMasterTable _
Group row By row(ThePrimaryKey) Into g() _
Select g.First()) _
Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows
Error in Third row at "row(ThePrimaryKey)":
"Range variable name can be inferred only from a simple or qualified name with no arguments."
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
嗯,基本问题不是 LINQ。事实上,您的第一个表包含“重复项”,它们并不是真正的重复项,因为在您的示例中,每一行都是独特的。
因此,我们向您提出的问题是“我们如何识别原始表中的重复项?”。一旦回答了这个问题,剩下的事情就变得微不足道了。
例如(在 C# 中,因为我不确定 VB 语法)
Well, the basic problem isn't the LINQ. It's the fact the your First Table contains "duplicates", which aren't really duplicates, since in your example, every row is distinctive.
So, our question to you is "How do we identify the duplicates in the original table?". Once that is answered, the rest should be trivial.
For example (In C# since I'm not sure of the VB syntax)
编辑:
以下是我编写 C# LINQ 查询的方式。这是一个替代版本,而不是使用
Distinct()
,而是使用带有分组的嵌套查询,该查询应该具有类似的语义。它应该很容易转换为VB。以及我对上述内容的 VB 版本的尝试:
编辑:
至于最近的错误,我完全知道如何处理。当我玩VB LINQ时,我发现编译器不喜欢复杂的分组表达式。要解决这个问题,请将 row(ThePrimaryKey) 分配给临时变量并按该变量进行分组。那么它应该可以工作。
事实上,经过第二次检查,发现分组依据需要一个名称。以下内容将起作用。
Edit:
Here's how I would write the C# LINQ query. Here's an alternate version rather than using
Distinct()
, uses a nested query with grouping which should have similar semantics. It should be easily convertible to VB.and my attempt at a VB version of the above:
Edit:
As for the most recent error, I know exactly how to deal with it. When I was playing with VB LINQ, I found that the compiler doesn't like complex grouping expressions. To get around that, assign
row(ThePrimaryKey)
to a temporary variable and group by that variable. It should work then.Actually upon second inspection, it turns out that what is being grouped by needs a name. The following will work.
声明等:
获取数据并按所选键对其进行排序:
将结果放入数据集表中:
提供给用户是否清理双打的选项:
如果需要的话删除重复项:
然后清理所有剩余部分:
最终分析:
针对 2 个具有 4 列、65,535 行和一些双精度的文件进行此测试。处理时间,大约1秒。事实上,将字段加载到内存中比解析数据花费的时间更长。
Declarations and Such:
Get the Data and order it by the Selected Key:
Put the Results into a Dataset Table:
Give the user an option to clean doubles or not:
Remove the Duplicates if they so desire:
Then clean up any leftovers:
Final Analysis:
Ran this against 2 Files with 4 columns, 65,535 rows, and with some doubles. Process time, roughly 1 second. In fact it took longer to load the fields into memory than it did to parse the data.