Is LINQ faster on a list or on a table?
I have many queries to do and I was wondering if there is a significant performance difference between querying a List and a DataTable, or even an indexed SQL Server table? Or maybe would it be faster if I go with another type of collection?
In general, what do you think?
Thank you!
2 Answers
It should almost always be faster to query anything in memory, like a List<T> or a DataTable, vis-à-vis a database. Having said that, you have to get the data into an in-memory object like a List<T> before it can be queried, so I certainly hope you're not thinking of dumping your DB into a List<T> for fast querying. That would be a very bad idea. Am I getting the point of your question?
You might be confusing Linq with a database query language. I would suggest reading up on Linq, particularly IQueryable vs IEnumerable.
In short, Linq is an in-code query language, which can be pointed at nearly any collection of data to perform searches, projections, aggregates, etc. in a similar fashion to SQL, but not limited to RDBMSes. It is not, on its face, a DB query language like SQL; it can merely be translated into one by use of an IQueryable provider, like Linq2SQL, Linq2Azure, Linq for Entities... the list goes on.
The IEnumerable side of Linq, which works on in-memory objects that are already in the heap, will almost certainly perform better than the IQueryable side, which exists to be translated into a native query language like SQL. However, that's not because of any inherent weakness or strength in either side of the language. It is instead a factor of (usually) having to send the translated IQueryable command over a network channel and get the results over same, which will perform much more slowly than your local computer's memory.
However, the "heavy lifting" of pulling records out of a data store and creating in-memory object representations has to be done at some time, and IQueryable Linq will almost certainly be faster than instantiating ALL records as in-memory objects, THEN using IEnumerable Linq (Linq 2 Objects) to filter to get your actual data.
To illustrate: You have a table MyTable; it contains a relatively modest 200 million rows. Using a Linq provider like Linq2SQL, your code might look like this:
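(The original code sample did not survive; here is a sketch of what it might have looked like, assuming a generated Linq2SQL MyDataContext with a MyTable mapping and an illustrative Value column:)

```csharp
// Hypothetical Linq2SQL context and column names, for illustration only.
using (var db = new MyDataContext())
{
    // This builds an IQueryable<MyTable> expression tree.
    // Nothing touches the database yet (deferred execution).
    var query = from row in db.MyTable
                where row.Value > 1000
                select row;

    // Enumerating the query translates it to SQL and runs it on the server,
    // so only the matching rows ever cross the network.
    var results = query.ToList();
}
```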
This will be digested by the Linq2SQL IQueryable provider into a SQL string like this:
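(Again, the original sample is missing; assuming the hypothetical query above, the generated SQL would look roughly like this — Linq2SQL parameterizes the constant:)

```sql
SELECT [t0].[Id], [t0].[Value]
FROM [MyTable] AS [t0]
WHERE [t0].[Value] > @p0
```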
This query can be easily digested by the SQL engine to return EXACTLY the information needed (say 500 rows).
Without a Linq provider, but wanting to use Linq, you may do something like this:
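(A sketch of the Linq-to-Objects version, where GetAllRecords() is a hypothetical method that materializes every row of the table into memory first:)

```csharp
// Pulls ALL 200 million rows across the network and instantiates
// an object for each one BEFORE any filtering can happen.
List<MyTableRow> everything = GetAllRecords();

// The same where-clause, but now evaluated in memory (IEnumerable),
// after the entire table has already been loaded.
var results = from row in everything
              where row.Value > 1000
              select row;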
On the surface, the difference is subtle. Behind the scenes, the devil's in those details. This second query relies on a method that retrieves and instantiates an object for every record in the database. That means it has to pull all those records, and create a space in memory for them. That will give you a list of 200 MILLION records, which isn't so modest anymore now that each of those records was transmitted over the network and is now taking up residence in your page file. The first query MAY introduce some overhead in building and then digesting the expression tree into SQL, but it's MUCH preferred over dumping an entire table into an in-memory collection and iterating over it.