当前位置：文江博客话题详情

LINQ 在列表上更快还是在表上更快？

发布于 2024-10-20 06:44:42 字数 126 浏览 1 评论 0原文

我有很多查询要做，我想知道查询 List 和 DataTable 甚至 SQL Server 索引表之间是否存在显着的性能差异？或者如果我选择其他类型的集合，也许会更快？

总的来说，你觉得怎么样？

谢谢你！

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

家住魔仙堡 2024-10-27 06:44:42

查询内存中的任何内容几乎总是更快，例如相对于数据库的List或DataTable。

话虽如此，您必须先将数据放入内存对象（例如 List）中，然后才能进行查询，所以我当然希望您不要考虑将数据库转储到 List 用于快速查询。这将是一个非常糟糕的主意。

我明白你问题的重点了吗？

回复收藏 0 原文

最单纯的乌龟 2024-10-27 06:44:42

您可能会将 Linq 与数据库查询语言混淆。我建议阅读 Linq，特别是 IQueryable 与 IEnumerable。

简而言之，Linq 是一种代码内查询语言，它可以指向几乎任何数据集合，以与 SQL 类似的方式执行搜索、投影、聚合等，但不限于 RDBMS 。从表面上看，它不是像 SQL 那样的数据库查询语言；而是一种数据库查询语言。它只能通过使用 IQueryable 提供程序将其转换为一个，例如 Linq2SQL、Linq2Azure、Linq for Entities...这样的例子不胜枚举。

Linq 的 IEnumerable 端适用于已经在堆中的内存中对象，几乎肯定会比 IQueryable 端表现更好，IQueryable 端的存在是为了翻译成 SQL 等本机查询语言。然而，这并不是因为该语言的任何一方都有任何固有的弱点或优势。相反，（通常）必须通过网络通道发送翻译后的 IQueryable 命令并通过相同的命令获取结果，这将比本地计算机的内存执行得慢得多。

然而，从数据存储中提取记录并创建内存中对象表示的“繁重工作”必须在某个时候完成，并且 IQueryable Linq 几乎肯定会比将所有记录实例化为内存中对象然后使用更快IEnumerable Linq（Linq 2 对象）进行过滤以获取实际数据。

举例说明：你有一个表MyTable；它包含相对较少的 2 亿行。使用像 Linq2SQL 这样的 Linq 提供程序，您的代码可能如下所示：

//GetContext<>() is a method that will return the IQueryable provider
//used to produce MyTable entitiy objects

//pull all records for the past 5 days
var results = from t in Repository.GetContext<MyTable>()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

这将被 Linq2SQL IQueryable 提供程序消化成这样的 SQL 字符串：

SELECT [each of MyTable's fields] FROM MyTable WHERE SomeDate Between @p1 and @p2; @p1 = '2/26/2011', @p2 = '3/3/2011 9:30:00'

SQL 引擎可以轻松地消化此查询，以准确返回所需的信息（例如 500 行））。

没有 Linq 提供程序，但想要使用 Linq，您可以执行以下操作：

//GetAllMyTable() is a method that will execute and return the results of
//"Select * from MyTable"

//pull all records for the past 5 days
var results = from t in Repository.GetAllMyTable()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

从表面上看，差异很微妙。在幕后，细节决定成败。第二个查询依赖于为数据库中的每条记录检索并实例化对象的方法。这意味着它必须提取所有这些记录，并在内存中为它们创建一个空间。这将为您提供 2 亿条记录的列表，由于每条记录都通过网络传输，并且现在已驻留在您的页面文件中，因此该列表不再那么简单。第一个查询可能会在构建表达式树并将其消化为 SQL 时引入一些开销，但它比将整个表转储到内存集合中并对其进行迭代要好得多。

You might be confusing Linq with a database query language. I would suggest reading up on Linq, particularly IQueryable vs IEnumerable.

In short, Linq is an in-code query language, which can be pointed at nearly any collection of data to perform searches, projections, aggregates, etc in a similar fashion as SQL, but not limited to RDBMSes. It is not, on its face, a DB query language like SQL; it can merely be translated into one by use of an IQueryable provider, line Linq2SQL, Linq2Azure, Linq for Entities... the list goes on.

The IEnumerable side of Linq, which works on in-memory objects that are already in the heap, will almost certainly perform better than the IQueryable side, which exists to be translated into a native query language like SQL. However, that's not because of any inherent weakness or strength in either side of the language. It is instead a factor of (usually) having to send the translated IQueryable command over a network channel and get the results over same, which will perform much more slowly than your local computer's memory.

However, the "heavy lifting" of pulling records out of a data store and creating in-memory object representations has to be done at some time, and IQueryable Linq will almost certainly be faster than instantiating ALL records as in-memory objects, THEN using IEnumerable Linq (Linq 2 Objects) to filter to get your actual data.

To illustrate: You have a table MyTable; it contains a relatively modest 200 million rows. Using a Linq provider like Linq2SQL, your code might look like this:

//GetContext<>() is a method that will return the IQueryable provider
//used to produce MyTable entitiy objects

//pull all records for the past 5 days
var results = from t in Repository.GetContext<MyTable>()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

This will be digested by the Linq2SQL IQueryable provider into a SQL string like this:

SELECT [each of MyTable's fields] FROM MyTable WHERE SomeDate Between @p1 and @p2; @p1 = '2/26/2011', @p2 = '3/3/2011 9:30:00'

This query can be easily digested by the SQL engine to return EXACTLY the information needed (say 500 rows).

Without a Linq provider, but wanting to use Linq, you may do something like this:

//GetAllMyTable() is a method that will execute and return the results of
//"Select * from MyTable"

//pull all records for the past 5 days
var results = from t in Repository.GetAllMyTable()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

On the surface, the difference is subtle. Behind the scenes, the devil's in those details. This second query relies on a method that retrieves and instantiates an object for every record in the database. That means it has to pull all those records, and create a space in memory for them. That will give you a list of 200 MILLION records, which isn't so modest anymore now that each of those records was transmitted over the network and is now taking up residence in your page file. The first query MAY introduce some overhead in building and then digesting the expression tree into SQL, but it's MUCH preferred over dumping an entire table into an in-memory collection and iterating over it.

回复收藏 0 原文

~没有更多了~