Is there anything faster than SqlDataReader in .NET?

Posted on 2024-09-19 20:45:20

I need to load one column of strings from a table on SQL Server into an array in memory using C#.
Is there a faster way than opening a SqlDataReader and looping through it?
The table is large and time is critical.

EDIT
I am trying to build a .dll and use it on the server for some operations on the database. But it is too slow for now. If this is the fastest way, then I have to redesign the database. I thought there might be some way to speed things up.

Comments (11)

妖妓 2024-09-26 20:45:47

If responsiveness is an issue when loading a great deal of data, look at using the asynchronous methods, such as BeginExecuteReader.

I use this all the time for populating large GUI elements in the background while the app continues to be responsive.

You haven't said exactly how large this data is, or why you are loading it all into an array.

Often, for large amounts of data, you may want to leave it in the database or let the database do the heavy lifting. But we'd need to know what kind of processing you are doing that needs it all in an array at one time.
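
A minimal sketch of the asynchronous approach, with one swap that should be stated plainly: it uses the Task-based ExecuteReaderAsync and ReadAsync rather than the older BeginExecuteReader pattern named above. The class and method names are hypothetical, and the code is written against DbCommand and DbDataReader (the base classes of SqlCommand and SqlDataReader) so that it does not depend on a particular SQL client package. It assumes the string column is at ordinal 0 and that NULL rows can be skipped.

```csharp
using System.Collections.Generic;
using System.Data.Common;
using System.Threading.Tasks;

// Hypothetical helper; the names are illustrative and not from the original post.
public static class AsyncColumnLoader
{
    // Runs the command without blocking the calling thread
    // and collects column 0 of the result set.
    public static async Task<List<string>> LoadAsync(DbCommand command)
    {
        using (DbDataReader reader = await command.ExecuteReaderAsync().ConfigureAwait(false))
        {
            return await ReadAllAsync(reader).ConfigureAwait(false);
        }
    }

    // Reads column 0 of every remaining row; NULL values are skipped (an assumption).
    public static async Task<List<string>> ReadAllAsync(DbDataReader reader)
    {
        var values = new List<string>();
        while (await reader.ReadAsync().ConfigureAwait(false))
        {
            if (!reader.IsDBNull(0))
            {
                values.Add(reader.GetString(0));
            }
        }
        return values;
    }
}
```

This keeps a UI thread responsive, but it does not make the transfer itself faster; the total load time is still bounded by the query and the network.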

不乱于心 2024-09-26 20:45:45

Some surface-level things to consider that may affect speed (besides a data-reader):

  1. Database Query Optimization
    • OrderBy is expensive
    • Distinct is expensive
    • RowCount is expensive
    • GroupBy is expensive
    • etc. Sometimes you can't live without these things, but if you can handle some of these things in your C# code instead, it may be faster.
  2. Database Table indexing (for starters, are the fields in your WHERE clause indexed?)
  3. Database Table DataTypes (are you using the smallest possible, given the data?)
  4. Why are you converting the datareader to an array?
    • e.g., would it serve just as well to create an adapter/datatable that you then would not need to convert to an array?
  5. Have you looked into Entity Framework? (might be slower...but if you're out of options, might be worthwhile to look into just to make sure)

Just random thoughts. Not sure what might help in your situation.

剑心龙吟 2024-09-26 20:45:43

You have four sources of overhead:
- Disk access
- .NET code (CPU)
- SQL Server code (CPU)
- Time to switch between managed and unmanaged code (CPU)

First, is a query such as the following (with your own table and column names)

select * from YourTable where column = 'junk'

fast enough for you? If not, the only solution is to make the disk faster. (You can't get data out of SQL Server faster than it can read it from disk.)

You may be able to define a SQL Server function in C# and then run the function over the column; sorry, I don't know how to do it. This may be faster than a data reader.

If you have more than one CPU, and you know a value in the middle of the table, you could try using more than one thread.

You may be able to write some T-SQL that combines all the strings into a single string using a separator you know is safe, and then split the string up again in C#. This will reduce the number of round trips between managed and unmanaged code.
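
The combine-and-split idea could look roughly like the sketch below. Everything named here is an assumption for illustration: the table dbo.Items, the column Name, the @sep parameter, and the choice of the unit separator character (U+001F) as a "safe" delimiter. The query relies on STRING_AGG, which is only available in SQL Server 2017 and later, and STRING_AGG skips NULL values. Whether this actually beats a plain reader loop would need to be measured.

```csharp
using System;

// Hypothetical helper; table, column, and parameter names are illustrative only.
public static class PackedColumn
{
    // Unit separator (U+001F). Assumed never to occur in the stored strings.
    public const char Separator = '\u001F';

    // Untested sketch of the server-side half: returns one row with one big string.
    // @sep must be supplied as a command parameter holding Separator.
    public const string Query =
        "SELECT STRING_AGG(CAST(Name AS nvarchar(max)), @sep) FROM dbo.Items";

    // Client-side half: splits the packed value back into individual strings.
    public static string[] Unpack(string packed)
    {
        if (string.IsNullOrEmpty(packed))
        {
            return Array.Empty<string>();
        }
        return packed.Split(Separator);
    }
}
```

The packed value would be read with ExecuteScalar or a single GetString(0) call and then passed to Unpack. Note that the whole result has to fit in memory twice for a moment (once packed, once split).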

黎夕旧梦 2024-09-26 20:45:41

"Provides a way of reading a forward-only stream of rows from a SQL Server database." That is how MSDN describes SqlDataReader. The data structure behind SqlDataReader only allows reading forward; it is optimized for reading data in one direction. In my opinion, SqlDataReader is a better choice than DataSet for simple data reading.

半边脸i 2024-09-26 20:45:40

SqlDataReader will be the fastest way.
Optimize your use of it by calling the appropriate GetXxx method (for example GetString), which takes an ordinal as its parameter.

If it is not fast enough, see if you can tweak your query. Put a covering index on the column(s) that you want to retrieve. That way, SQL Server only has to read the index and does not have to go to the table itself to retrieve the required data.
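
As a rough illustration of reading by ordinal, the sketch below loads one string column into an array. The class and method names are hypothetical. It accepts a DbDataReader, which SqlDataReader derives from, and assumes the query selects only the one column of interest so that it sits at ordinal 0.

```csharp
using System.Collections.Generic;
using System.Data.Common;

// Hypothetical helper; names are illustrative and not from the original post.
public static class ColumnLoader
{
    // Reads column 0 of every row into an array.
    public static string[] ReadFirstColumn(DbDataReader reader)
    {
        var values = new List<string>();
        while (reader.Read())
        {
            // GetString(0) addresses the column by ordinal, so no name lookup
            // or boxing happens per row. IsDBNull guards against NULLs,
            // which GetString cannot return; they are stored as null here.
            values.Add(reader.IsDBNull(0) ? null : reader.GetString(0));
        }
        return values.ToArray();
    }
}
```

With a real connection this would be called as `ColumnLoader.ReadFirstColumn(command.ExecuteReader())`, where the command text selects just the one column.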

我为君王 2024-09-26 20:45:37

What about transforming one column of rows to one row of columns, and having only one row to read? SqlDataReader has an optimization for reading a single row (System.Data.CommandBehavior.SingleRow argument of ExecuteReader), so maybe it can improve the speed a bit.

I see several advantages:

  • Single row improvement,
  • No need to access an array on each iteration (reader[0]),
  • Cloning an array (reader) to another one may be faster than looping through elements and adding each one to a new array.

On the other hand, it has the disadvantage of forcing the SQL database to do more work.

猫性小仙女 2024-09-26 20:45:36

SqlDataReader is the fastest way. Make sure you use the get by ordinal methods rather than get by column name. e.g. GetString(1);

Also worthwhile is experimenting with MinPoolSize in the connection string so that there are always some connections in the pool.
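
One way to experiment with the pool setting is to build the connection string in code, as in the sketch below. The helper name and the server and database values are hypothetical. It uses the generic DbConnectionStringBuilder and the SqlClient keyword "Min Pool Size" (the keyword behind the MinPoolSize setting mentioned above); the typed SqlConnectionStringBuilder would work as well if the SQL client library is referenced.

```csharp
using System.Data.Common;

// Hypothetical helper; names and values are illustrative only.
public static class PooledConnectionString
{
    // Builds a connection string that asks the pool to keep
    // at least minPoolSize connections open.
    public static string Build(string server, string database, int minPoolSize)
    {
        var builder = new DbConnectionStringBuilder();
        builder["Data Source"] = server;
        builder["Initial Catalog"] = database;
        builder["Integrated Security"] = true;   // assumes Windows authentication
        builder["Min Pool Size"] = minPoolSize;
        return builder.ConnectionString;
    }
}
```

A warm pool only saves the cost of opening a connection, so it matters most when the load is repeated often rather than when a single large read dominates.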

黎夕旧梦 2024-09-26 20:45:34

I suspect that SqlDataReader is about as good as you're going to get.

深海蓝天 2024-09-26 20:45:32

No. It is actually not only the fastest way - it is the ONLY (!) way. All other mechanisms INTERNALLY use a DataReader anyway.

好久不见√ 2024-09-26 20:45:30

If SqlDataReader isn't fast enough, perhaps you should store your stuff somewhere else, such as an (in-memory) cache.

秋意浓 2024-09-26 20:45:28

Data Reader

About the fastest access you will get to SQL is with the SqlDataReader.

Profile it

It's worth actually profiling where your performance issue is. Usually, where you think the performance issue is, is proven to be totally wrong after you've profiled it.

For example it could be:

  1. The time... the query takes to run
  2. The time... the data takes to copy across the network/process boundary
  3. The time... .Net takes to load the data into memory
  4. The time... your code takes to do something with it

Profiling each of these in isolation will give you a better idea of where your bottleneck is. For profiling your code, there is a great article from Microsoft.
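
Before reaching for a full profiler, a crude first pass at timing the stages listed above can be done with Stopwatch. The helper below is a hypothetical sketch, not a substitute for proper profiling: it simply measures how long one stage takes so that the query, the read loop, and the processing can be compared.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; names are illustrative only.
public static class StageTimer
{
    // Runs one stage and returns how long it took.
    public static TimeSpan Measure(Action stage)
    {
        var watch = Stopwatch.StartNew();
        stage();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

For example, wrapping the ExecuteReader call, the Read loop, and the code that consumes the array in three separate Measure calls shows which of the three dominates. Run each measurement several times, since the first run includes connection setup and JIT compilation.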

Cache it

The thing to look at to improve performance is to work out if you need to load all that data every time. Can the list (or part of it) be cached? Take a look at the new System.Runtime.Caching namespace.
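
A minimal caching sketch using MemoryCache from System.Runtime.Caching is shown below. The class name, the cache key, and the expiry policy are assumptions for illustration; on modern .NET the namespace comes from the System.Runtime.Caching package rather than the base framework. The load delegate stands in for whatever code reads the column from the database and is assumed never to return null.

```csharp
using System;
using System.Runtime.Caching;

// Hypothetical helper; names are illustrative only.
public static class ColumnCache
{
    // Returns the cached array for the key, or calls load() once
    // and keeps the result for the given lifetime.
    public static string[] GetOrLoad(string key, Func<string[]> load, TimeSpan lifetime)
    {
        MemoryCache cache = MemoryCache.Default;

        if (cache.Get(key) is string[] cached)
        {
            return cached;
        }

        // Not guarded by a lock: two threads that miss at the same time
        // may both call load(), which is acceptable for a sketch.
        string[] loaded = load();
        cache.Set(key, loaded, DateTimeOffset.UtcNow.Add(lifetime));
        return loaded;
    }
}
```

Whether this helps depends on how stale the data is allowed to be; the lifetime passed in is the trade-off between freshness and how often the full load is paid.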

Rewrite as T-SQL

If you're doing purely data operations (as your question suggests), you could rewrite your code which is using the data to be T-SQL and run natively on SQL. This has the potential to be much faster, as you will be working with the data directly and not shifting it about.

If your code has a lot of necessary procedural logic, you can try mixing T-SQL with CLR Integration giving you the benefits of both worlds.

This very much comes down to the complexity (or more procedural nature) of your logic.

If all else fails

If all areas are optimal (or as near as possible) and your design is without fault, I wouldn't even get into micro-optimisation; I'd just throw hardware at it.

What hardware? Try the Reliability and Performance Monitor to find out where the bottleneck is. For the problem you describe, the most likely culprit is the HDD or RAM.
