Is TStringList the best way to load huge text files in Delphi?

Posted on 2024-10-19 05:24:39

What is the best way to load huge text file data in Delphi? Is there any component that can load a text file super fast?

Let's say I have a text file that contains a database stored in fixed-length format.
It contains 150 fields, each at least 50 characters long.
1. I need to load it into memory.
2. I need to parse it and probably store it in a memdataset for processing.

My questions:
1. Is it enough to use the TStringList.LoadFromFile method?
2. Is there a better component for manipulating the text file?
3. Should I use low-level reading from the text file?

Thank you in advance.

Comments (4)

雨后咖啡店 2024-10-26 05:24:39

TStringList is never the optimal way of working with lots of text, but it's the simplest. If you've got small files on your hands, you can use TStringList without issues. Even if you have large (but not huge) files, you might implement a first version of your algorithm using TStringList for testing purposes, because it's simple and easy to understand.
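As a baseline for that kind of testing, here is a minimal sketch of the TStringList approach. It assumes each record sits on its own line and that every field is exactly 50 characters wide; `FieldWidth` and `FieldOf` are illustrative names, not part of the question.

```delphi
program StringListBaseline;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

const
  FieldWidth = 50; // assumption: every field is padded to exactly 50 characters

// Returns the zero-based field of a fixed-width line, without trailing padding.
function FieldOf(const Line: string; FieldIndex: Integer): string;
begin
  Result := TrimRight(Copy(Line, FieldIndex * FieldWidth + 1, FieldWidth));
end;

var
  Lines: TStringList;
  I: Integer;
begin
  Lines := TStringList.Create;
  try
    // The whole file is read into memory and split into lines here.
    Lines.LoadFromFile(ParamStr(1));
    for I := 0 to Lines.Count - 1 do
      Writeln(FieldOf(Lines[I], 0)); // print the first field of each record
  finally
    Lines.Free;
  end;
end.
```

The file content is held in memory while it is split into one string per line, which is the overhead the other approaches in this thread try to avoid.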

If your files are large, as they probably are since you call them "databases", you need to look into alternative technologies that will enable you to read only as much as you need from the database. Look into:

  • TFileStream
  • Memory mapped files.

Don't bother with the old "file"-based APIs still available in Delphi; they're simply outdated.
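To illustrate the memory-mapped option, the following is a rough, Windows-only sketch that scans fixed-length records in place without copying the file. It assumes records of 150 × 50 single-byte characters with no line terminators, and it maps the whole file as one view, which may fail for very large files in a 32-bit process. `CountRecordsStartingWith` is a made-up example task, not something from the question.

```delphi
program MapFixedFile;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils; // Winapi.Windows, System.SysUtils in newer Delphi versions

const
  FieldCount = 150;
  FieldWidth = 50;
  RecordSize = FieldCount * FieldWidth; // assumption: no CR/LF between records

// Counts the records whose first byte equals Marker, reading straight
// from the mapped view.
function CountRecordsStartingWith(const FileName: string; Marker: AnsiChar): Int64;
var
  hFile, hMap: THandle;
  View, P: PAnsiChar;
  SizeLow, SizeHigh: DWORD;
  Remaining: Int64;
begin
  Result := 0;
  hFile := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if hFile = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    SizeLow := GetFileSize(hFile, @SizeHigh);
    Remaining := (Int64(SizeHigh) shl 32) or SizeLow;
    if Remaining < RecordSize then
      Exit; // nothing to scan; an empty file cannot be mapped anyway
    hMap := CreateFileMapping(hFile, nil, PAGE_READONLY, 0, 0, nil);
    if hMap = 0 then
      RaiseLastOSError;
    try
      View := MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
      if View = nil then
        RaiseLastOSError;
      try
        P := View;
        while Remaining >= RecordSize do
        begin
          if P^ = Marker then
            Inc(Result);
          Inc(P, RecordSize); // step to the start of the next record
          Dec(Remaining, RecordSize);
        end;
      finally
        UnmapViewOfFile(View);
      end;
    finally
      CloseHandle(hMap);
    end;
  finally
    CloseHandle(hFile);
  end;
end;

begin
  Writeln(CountRecordsStartingWith(ParamStr(1), 'A'));
end.
```

With a mapped view the operating system pages data in on demand, so only the parts of the file that are actually touched need to be read from disk.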

I'm not going to go into detail on how to access text using those methods, because we've recently had two similar questions on SO:

How Can I Efficiently Read The First Few Lines of Many Files in Delphi

and

Fast Search to see if a String Exists in Large Files with Delphi

情释 2024-10-26 05:24:39

Since you're working with a fixed record length, you can build an access class based on TList, with a TWriter and TReader, that takes your record layout into account. You'll have none of the overhead of a TStringList (not that it's a bad thing, but if you don't need it, why have it?), and you can build your own record access into the class.
Ultimately, it depends on what you are trying to accomplish with the data once you have it loaded into memory. While TStringList is easy to use, it isn't as efficient as "rolling your own".

However, efficiency in data manipulation may not be that much of an issue, as you are using text files to hold a database. If you just need to read in and make decisions based on data in the file, the more flexible TList may be overkill.
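A rough sketch of such a TList-based access class follows. For simplicity it reads with `TStream.ReadBuffer` rather than the TReader/TWriter pair mentioned above, and it assumes 150 fields of exactly 50 single-byte characters with no line terminators. All type and method names are hypothetical.

```delphi
unit FixedRecords;

interface

uses
  Classes, SysUtils;

const
  FieldCount = 150;
  FieldWidth = 50;
  RecordSize = FieldCount * FieldWidth; // assumption: no CR/LF between records

type
  PFixedRecord = ^TFixedRecord;
  TFixedRecord = array[0..RecordSize - 1] of AnsiChar;

  TFixedRecordList = class
  private
    FItems: TList; // holds PFixedRecord pointers owned by this class
    function GetCount: Integer;
    function GetField(RecIndex, FieldIndex: Integer): AnsiString;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Clear;
    procedure LoadFromStream(Stream: TStream);
    property Count: Integer read GetCount;
    property Fields[RecIndex, FieldIndex: Integer]: AnsiString read GetField;
  end;

implementation

constructor TFixedRecordList.Create;
begin
  inherited Create;
  FItems := TList.Create;
end;

destructor TFixedRecordList.Destroy;
begin
  Clear;
  FItems.Free;
  inherited Destroy;
end;

procedure TFixedRecordList.Clear;
var
  I: Integer;
begin
  for I := 0 to FItems.Count - 1 do
    Dispose(PFixedRecord(FItems[I]));
  FItems.Clear;
end;

function TFixedRecordList.GetCount: Integer;
begin
  Result := FItems.Count;
end;

procedure TFixedRecordList.LoadFromStream(Stream: TStream);
var
  Rec: PFixedRecord;
begin
  Clear;
  // Read whole records only; a trailing partial record is ignored.
  while Stream.Size - Stream.Position >= RecordSize do
  begin
    New(Rec);
    try
      Stream.ReadBuffer(Rec^, RecordSize);
      FItems.Add(Rec);
    except
      Dispose(Rec);
      raise;
    end;
  end;
end;

function TFixedRecordList.GetField(RecIndex, FieldIndex: Integer): AnsiString;
var
  Start: PAnsiChar;
  Len: Integer;
begin
  if (FieldIndex < 0) or (FieldIndex >= FieldCount) then
    raise ERangeError.CreateFmt('Field index %d out of range', [FieldIndex]);
  // TList itself raises EListError for an invalid RecIndex.
  Start := PAnsiChar(@PFixedRecord(FItems[RecIndex])^[FieldIndex * FieldWidth]);
  Len := FieldWidth;
  while (Len > 0) and (Start[Len - 1] = ' ') do
    Dec(Len); // strip trailing space padding
  SetString(Result, Start, Len);
end;

end.
```

A caller would create a `TFixedRecordList`, pass it a `TFileStream` via `LoadFromStream`, and then read values such as `List.Fields[0, 3]`. Each record costs one fixed-size allocation, and a field string is only built when it is requested.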

是伱的 2024-10-26 05:24:39

I recommend sticking with TStringList if you find it convenient for your problem. Optimization is a separate concern that should be dealt with later.

As for TStringList, the optimization is to declare a descendant class that overrides the TStrings.LoadFromStream method. By taking the structure of your files into account, you can make it practically as fast as possible.
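One way such a descendant might look is sketched below, assuming records of 150 × 50 single-byte characters with no line terminators, so the stream can be cut into lines by size instead of scanning for line breaks. `TFixedWidthStringList` is a hypothetical name. In Unicode-era Delphi versions, `LoadFromFile` is believed to route through the two-argument `LoadFromStream(Stream, Encoding)` overload, so this sketch calls `LoadFromStream` on a stream directly; check your RTL before relying on either path.

```delphi
program FixedWidthList;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

const
  FieldCount = 150;
  FieldWidth = 50;
  RecordSize = FieldCount * FieldWidth; // assumption: no CR/LF between records

type
  // Hypothetical descendant that knows the record length in advance.
  TFixedWidthStringList = class(TStringList)
  public
    procedure LoadFromStream(Stream: TStream); override;
  end;

procedure TFixedWidthStringList.LoadFromStream(Stream: TStream);
var
  Buf: array[0..RecordSize - 1] of AnsiChar;
  Line: AnsiString;
  Remaining: Int64;
begin
  BeginUpdate;
  try
    Clear;
    Remaining := Stream.Size - Stream.Position;
    // Pre-size the list once instead of letting it grow repeatedly.
    Capacity := Integer(Remaining div RecordSize);
    while Remaining >= RecordSize do
    begin
      Stream.ReadBuffer(Buf, RecordSize);
      SetString(Line, PAnsiChar(@Buf[0]), RecordSize);
      Add(string(Line)); // one list entry per fixed-length record
      Dec(Remaining, RecordSize);
    end;
  finally
    EndUpdate;
  end;
end;

var
  List: TFixedWidthStringList;
  Stream: TFileStream;
begin
  List := TFixedWidthStringList.Create;
  try
    Stream := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
    try
      List.LoadFromStream(Stream);
    finally
      Stream.Free;
    end;
    Writeln(List.Count, ' records loaded');
  finally
    List.Free;
  end;
end.
```

The gain over the stock implementation comes from skipping the line-break scan and from setting `Capacity` up front. The memory cost of holding every record as a string remains.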

路弥 2024-10-26 05:24:39

It is not entirely clear from your question why you need to load the entire file into memory before going on to create an in-memory data set. Are you conflating the two issues? That is, because you need to create an in-memory data set, do you think you first need to load the source data entirely into memory? Or is there some initial pre-processing of the source file that is only possible with the entire file loaded in memory? That is unlikely, and even if it is the case, it isn't necessary with a navigable stream object such as a TFileStream.

But I think the answer you are looking for is right there in the question....

If you are loading this file in order to parse it and populate/initialise a further data structure (the data set) for further processing, then using an existing high level data structure is an unnecessary and potentially costly (in terms of time) step.

Use the lowest level means of access that provides the capabilities you need.

In this case a TFileStream will likely provide the best balance of convenience and ease of use.
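As a minimal sketch of that approach, the program below reads one fixed-length record at a time from a TFileStream and hands it to a processing routine, so the file is never held in memory as a whole. It assumes records of 150 × 50 single-byte characters with no line terminators. `HandleRecord` is a placeholder for whatever parsing or dataset-append logic is needed.

```delphi
program StreamFixedFile;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

const
  FieldCount = 150;
  FieldWidth = 50;
  RecordSize = FieldCount * FieldWidth; // assumption: no CR/LF between records

type
  TRecordBuffer = array[0..RecordSize - 1] of AnsiChar;

// Placeholder for the real work, e.g. appending a row to an in-memory dataset.
procedure HandleRecord(const Rec: TRecordBuffer; RecNo: Int64);
var
  FirstField: AnsiString;
begin
  SetString(FirstField, PAnsiChar(@Rec[0]), FieldWidth);
  Writeln(RecNo, ': ', FirstField);
end;

var
  Stream: TFileStream;
  Rec: TRecordBuffer;
  RecNo: Int64;
begin
  Stream := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    RecNo := 0;
    // Read returns the number of bytes actually read, so the loop stops
    // at end of file; a trailing partial record is silently skipped.
    while Stream.Read(Rec, RecordSize) = RecordSize do
    begin
      HandleRecord(Rec, RecNo);
      Inc(RecNo);
    end;
  finally
    Stream.Free;
  end;
end.
```

Because the stream is navigable, a specific record can also be reached directly by setting `Stream.Position` to the record number multiplied by `RecordSize` before reading.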
