30 million records a day, SQL Server can't keep up: do we need a different kind of database system?
Some time ago I thought out a new statistics system for our multi-million-user website, to log and report user actions for our customers.
The database design is quite simple: one table with a foreignId (200,000 distinct ids), a datetime field, an actionId (30 distinct ids), and two more fields containing some meta-information (just smallints). There are no constraints to other tables. Furthermore we have two indexes, each containing 4 fields, which cannot be dropped, as users get timeouts when we use smaller indexes. The foreignId is the most important field, as each and every query contains this field.
We chose to use SQL Server, but after implementation a relational database doesn't seem like a good fit: we cannot insert 30 million records a day (it's insert only, we don't do any updates) while also doing a lot of random reads on the database, because the indexes cannot be updated fast enough. Ergo: we have a massive problem :-) We have temporarily solved the problem, yet
a relational database doesn't seem to be suited for this problem!
Would a database like BigTable be a better choice, and why? Or are there other, better choices when dealing with this kind of problem?
NB. At this point we use a single 8-core Xeon system with 4 GB memory and Win 2003 32-bit. RAID10 SCSI as far as I know. The index size is about 1.5x the table size.
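For reference, a minimal sketch of the table and indexes as described above. All names and exact column types here are assumptions for illustration; the real schema will differ. The sketches in the answers below reuse these hypothetical names.

-- Hypothetical reconstruction of the statistics table.
CREATE TABLE dbo.UserAction (
    foreignId  int      NOT NULL,  -- ~200,000 distinct values, present in every query
    actionDate datetime NOT NULL,
    actionId   smallint NOT NULL,  -- ~30 distinct values
    meta1      smallint NOT NULL,
    meta2      smallint NOT NULL
);

-- The two 4-column indexes, both leading on foreignId.
CREATE INDEX IX_UserAction_ForeignId_Date   ON dbo.UserAction (foreignId, actionDate, actionId, meta1);
CREATE INDEX IX_UserAction_ForeignId_Action ON dbo.UserAction (foreignId, actionId, actionDate, meta2);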
Comments (8)
You say that your system is capable of inserting 3,000 records per second without indexes, but only about 100 with two additional non-clustered indexes. If 3k/s is the maximum throughput your I/O permits, adding two indexes should in theory reduce the throughput to about 1,000-1,500/sec. Instead you see a degradation ten times worse than that. The proper answer is 'it depends', and some serious troubleshooting and bottleneck identification would have to be carried out. With that in mind, if I were to venture a guess, I'd name two possible culprits:
A. The additional non-clustered indexes distribute the writes of dirty pages across more allocation areas. The solution would be to place the clustered index and each non-clustered index into its own filegroup, and place the three filegroups onto separate LUNs on the RAID (a sketch of both options follows after point B).
B. The low selectivity of the non-clustered indexes creates high contention between reads and writes (key conflicts as well as %lockres% conflicts), resulting in long lock wait times for both inserts and selects. Possible solutions would be using SNAPSHOT isolation with read committed snapshot mode, but I must warn about the danger of adding a lot of IO to the version store (i.e. in tempdb) on a system that may already be under high IO stress. A second solution is using database snapshots for reporting; they cause lower IO stress and can be better controlled (no tempdb version store involved), but the reporting is no longer on real-time data.
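A minimal sketch of both remedies, using the hypothetical dbo.UserAction schema sketched under the question (database, filegroup, file and path names are assumptions):

-- A: move each index into its own filegroup, each filegroup on its own LUN.
ALTER DATABASE StatsDb ADD FILEGROUP FG_IX1;
ALTER DATABASE StatsDb ADD FILE (NAME = StatsDb_IX1, FILENAME = 'E:\Data\StatsDb_IX1.ndf') TO FILEGROUP FG_IX1;
CREATE INDEX IX_UserAction_ForeignId_Date
    ON dbo.UserAction (foreignId, actionDate, actionId, meta1)
    WITH (DROP_EXISTING = ON)
    ON FG_IX1;
-- (repeat for the second non-clustered index and a third filegroup/LUN)

-- B: row versioning so readers stop blocking writers, at the cost of extra tempdb IO.
ALTER DATABASE StatsDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Alternative for B: a database snapshot for reporting (not real-time; needs one sparse file
-- per data file of the source database; the logical file name below is an assumption).
CREATE DATABASE StatsDb_Reporting
    ON (NAME = StatsDb_Data, FILENAME = 'E:\Snapshots\StatsDb_Reporting.ss')
    AS SNAPSHOT OF StatsDb;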
I tend to believe B) is the likely cause, but I must again stress the need for proper investigation and proper root cause analysis.
'RAID10' is not a very precise description.
As to the question whether relational databases are appropriate for something like this: yes, absolutely. There are many more factors to consider: recoverability, availability, the toolset ecosystem, available expertise, ease of development, ease of deployment, ease of management and so on. Relational databases can easily handle your workload; they just need the proper tuning. 30 million inserts a day, about 350 per second, is small change for a database server. But a 32-bit system with 4 GB of RAM is hardly a database server, regardless of the number of CPUs.
It sounds like you may be suffering from two particular problems. The first issue you are hitting is that your indexes require updating every time you perform an insert - are you really trying to run live reports off a transactional server (this is usually considered a no-no)? Secondly, you may also be hitting issues with the server having to resize the database - check to ensure that you have allocated enough space and aren't relying on the database to do this for you.
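For the second point, a minimal sketch of pre-sizing the data file so the server never has to autogrow in the middle of the insert load (database and logical file names are assumptions):

-- Pre-allocate plenty of space and use a fixed growth increment as a safety net,
-- instead of relying on the default autogrow settings.
ALTER DATABASE StatsDb
    MODIFY FILE (NAME = StatsDb_Data, SIZE = 200GB, FILEGROWTH = 10GB);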
Have you considered looking into something like indexed views in SQL Server? They are a good way to remove the indexing from the main table, and move it into a materialised view.
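A minimal sketch of an indexed view over the hypothetical dbo.UserAction table from the question (view and index names, and the chosen grouping, are illustrative):

-- The view must be schema-bound and, with GROUP BY, must include COUNT_BIG(*).
CREATE VIEW dbo.vUserActionCounts
WITH SCHEMABINDING
AS
SELECT foreignId,
       actionId,
       COUNT_BIG(*) AS actionCount
FROM dbo.UserAction
GROUP BY foreignId, actionId;
GO

-- Materialise it; reports can read the small aggregate instead of the base-table indexes.
CREATE UNIQUE CLUSTERED INDEX IX_vUserActionCounts
    ON dbo.vUserActionCounts (foreignId, actionId);

Note that SQL Server maintains the materialised aggregate on every insert, so this trades some per-row insert cost for much cheaper reporting queries.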
You could try making the table a partitioned one. This way the index updates will affect smaller sets of rows. Probably daily partitioning will be sufficient. If not, try partitioning by the hour!
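A minimal sketch of daily partitioning on the datetime column of the hypothetical dbo.UserAction table (function, scheme and boundary values are illustrative; table partitioning requires Enterprise Edition on SQL Server versions of that era):

-- Daily boundaries; a scheduled job would keep adding new ones with ALTER PARTITION FUNCTION.
CREATE PARTITION FUNCTION pfUserActionDay (datetime)
    AS RANGE RIGHT FOR VALUES ('20090701', '20090702', '20090703');

CREATE PARTITION SCHEME psUserActionDay
    AS PARTITION pfUserActionDay ALL TO ([PRIMARY]);

-- Cluster the table on the partition scheme so each day's index updates stay local.
CREATE CLUSTERED INDEX CIX_UserAction_Date
    ON dbo.UserAction (actionDate, foreignId)
    ON psUserActionDay (actionDate);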
You aren't providing enough information; I'm not certain why you say that a relational database seems like a bad fit, other than the fact that you're experiencing performance problems now. What sort of machine is the RDBMS running on? Given that you have foreign IDs, it seems that a relational database is exactly what's called for here. SQL Server should be able to handle 30 million inserts per day, assuming that it's running on sufficient hardware.
Replicating the database for reporting seems like the best route, given heavy traffic. However, a couple of things to try first...
Go with a single index, not two indexes. A clustered index is probably going to be a better choice than a non-clustered one. Fewer, wider indexes will generally perform better than more, narrower indexes. And, as you say, it's the indexing that's killing your app.
You don't say what you're using for IDs, but if you're using GUIDs, you might want to change your keys over to bigints. Because GUIDs are random, they put a heavy burden on indexes, both in building indexes and in using them. Using a bigint identity column will keep the index running pretty much chronologically, and if you're really interested in real-time access for queries on your recent data, your access pattern is much better suited to monotonically increasing keys.
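A minimal sketch of both suggestions applied to the hypothetical dbo.UserAction table from the question (column and index names are illustrative):

-- A monotonically increasing surrogate key: inserts always append at the end of the clustered index.
ALTER TABLE dbo.UserAction
    ADD userActionId bigint IDENTITY(1, 1) NOT NULL;

CREATE UNIQUE CLUSTERED INDEX CIX_UserAction_Id
    ON dbo.UserAction (userActionId);

-- A single non-clustered index on foreignId, instead of two, for the per-customer queries.
CREATE INDEX IX_UserAction_ForeignId
    ON dbo.UserAction (foreignId, actionDate)
    INCLUDE (actionId, meta1, meta2);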
Sybase IQ seems pretty good for this goal, as our architects/DBAs indicated (as in, they explicitly moved all our stats onto IQ, citing that capability as the reason). I cannot substantiate this myself though - I merely nod at the people in our company who generally know what they are talking about from past experience.
However, I'm wondering whether you MUST store all 30 million records. Would it not be better to store some pre-aggregated data?
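A minimal sketch of what pre-aggregation could look like on the hypothetical dbo.UserAction table (the summary table, hourly granularity and interval values are all illustrative; the right granularity depends on the reports):

-- Hourly roll-up per customer and action; raw rows could then be archived or trimmed.
CREATE TABLE dbo.UserActionHourly (
    foreignId   int      NOT NULL,
    actionId    smallint NOT NULL,
    actionHour  datetime NOT NULL,
    actionCount int      NOT NULL,
    CONSTRAINT PK_UserActionHourly PRIMARY KEY (foreignId, actionId, actionHour)
);

-- Run periodically (e.g. from a SQL Agent job) for the interval that just closed.
DECLARE @intervalStart datetime, @intervalEnd datetime;
SET @intervalStart = '20090701 10:00';
SET @intervalEnd   = '20090701 11:00';

INSERT INTO dbo.UserActionHourly (foreignId, actionId, actionHour, actionCount)
SELECT foreignId,
       actionId,
       DATEADD(HOUR, DATEDIFF(HOUR, 0, actionDate), 0) AS actionHour,
       COUNT(*)
FROM dbo.UserAction
WHERE actionDate >= @intervalStart AND actionDate < @intervalEnd
GROUP BY foreignId, actionId, DATEADD(HOUR, DATEDIFF(HOUR, 0, actionDate), 0);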
Not sure about SQL Server, but in another database system I used long ago, the ideal method for this type of activity was to store the updates and then, as a batch, turn off the indexes, add the new records, and then reindex. We did this once per night. I'm not sure whether your reporting needs would fit this type of solution, or even whether it can be done in MS SQL, but I'd think it could.
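In SQL Server this pattern can be approximated by disabling the non-clustered indexes before a nightly bulk load and rebuilding them afterwards; a minimal sketch, using the hypothetical index names from the question and an assumed staging table:

-- Disable the non-clustered indexes before the nightly bulk load...
ALTER INDEX IX_UserAction_ForeignId_Date   ON dbo.UserAction DISABLE;
ALTER INDEX IX_UserAction_ForeignId_Action ON dbo.UserAction DISABLE;

-- ...load the staged rows in one go...
INSERT INTO dbo.UserAction (foreignId, actionDate, actionId, meta1, meta2)
SELECT foreignId, actionDate, actionId, meta1, meta2
FROM dbo.UserAction_Staging;

-- ...then rebuild each index once instead of updating it row by row.
ALTER INDEX IX_UserAction_ForeignId_Date   ON dbo.UserAction REBUILD;
ALTER INDEX IX_UserAction_ForeignId_Action ON dbo.UserAction REBUILD;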
You don't say how the inserts are managed. Are they batched or is each statistic written separately? Because inserting one thousand rows in a single operation would probably be way more efficient than inserting a single row in one thousand separate operations. You could still insert frequently enough to offer more-or-less real time reporting ;)
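A minimal sketch of batching on the SQL side, assuming the hypothetical dbo.UserAction table and SQL Server 2008 or later for the multi-row VALUES syntax (the values shown are made up):

-- One round trip and one transaction for many rows instead of one per row.
INSERT INTO dbo.UserAction (foreignId, actionDate, actionId, meta1, meta2)
VALUES (117,  '20090701 10:00:01', 3, 1, 0),
       (2045, '20090701 10:00:01', 7, 0, 2),
       (117,  '20090701 10:00:02', 3, 1, 0);
-- ...up to 1000 rows per statement; older versions can batch by wrapping many
-- single-row INSERTs in one transaction, or by using bulk copy (SqlBulkCopy / BCP).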