
How to find the number of characters in a file without traversing its contents

Posted 2025-01-02 14:38:47

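In a project I have to read a file, and I need to work with the number of characters in it. Is there a way to get the character count without reading the file character by character? (Otherwise I would have to read the file twice, once just to find out how many characters it contains.)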

Is it possible?


染火枫林 2025-01-09 14:38:47

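Yes.

Seek to the end and ask for the current position: the position of the end is the size.

#include <stdio.h>

FILE* file = fopen("Plop", "rb");  // fopen needs a mode; binary mode keeps positions in bytes
fseek(file, 0, SEEK_END);
long size = ftell(file);           // This is the size of the file (ftell returns -1L on error).
                                   // But note it is in bytes.
                                   // Also note that if you are reading it into memory this
                                   // is the value you want, unless you plan to dynamically
                                   // convert the character encoding as you read.

fseek(file, 0, SEEK_SET);          // Move the position back to the start.

In C++, streams have the same functionality:

#include <fstream>

std::ifstream file("Plop", std::ios::binary);
file.seekg(0, std::ios_base::end);
std::streamoff size = file.tellg();  // size in bytes, or -1 on failure

file.seekg(0, std::ios_base::beg);   // Move the position back to the start.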


如梦 2025-01-09 14:38:47

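You can try this:

FILE *fp = ... /*open as usual*/;
fseek(fp, 0L, SEEK_END);
long fileSize = ftell(fp);   /* ftell returns long, and -1L on error */

However, this returns the number of bytes in the file, not the number of characters. The two are the same only if the encoding is known to use one byte per character (e.g. ASCII).

Once you have the size, you need to "rewind" the file back to the beginning:

fseek(fp, 0L, SEEK_SET);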
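
To make the byte/character distinction concrete, here is a small sketch that counts code points in a UTF-8 buffer. It assumes the input is valid UTF-8 and treats one code point as one "character", which ignores combining sequences. The name `count_utf8_code_points` is chosen for this example only. Note that it has to look at every byte, which is exactly why the file size alone is not enough for multi-byte encodings.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes, which always
// have the bit pattern 10xxxxxx. Assumes the input is valid UTF-8.
std::size_t count_utf8_code_points(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) ++count;   // ASCII byte or lead byte of a sequence
    }
    return count;
}
```

For plain ASCII text the result equals `bytes.size()`; for text containing multi-byte sequences it is smaller.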


一桥轻雨一伞开 2025-01-09 14:38:47

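The simple answer is no. More precisely, it depends on the system: under Unix it is possible (e.g. using stat); under Windows it is not possible for text files, but if you read the file as binary, there is a function, GetFileSize, that you can use.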

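Although it is not guaranteed, under all of the implementations I know (for these two platforms), seeking to the end of the file and then calling ftell will return something which, when converted to a sufficiently large integral type, gives the same result as above (with the same restrictions).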

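Finally: why do you need this information? If it is just to allocate a buffer of the appropriate size then, even for text files, GetFileSize (and seeking to the end followed by ftell) will return a value that is at least as large as, and typically slightly larger than, the number of bytes you can actually read. Your buffer will be slightly oversized, but that is generally not a problem.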


毁梦 2025-01-09 14:38:47

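I think you may be looking for a dynamic memory solution. What you are literally asking is "is there a way to get the number of characters in a file without reading it?". The answer (assuming one byte per character) is yes: you can use a stat call to get the file size, and the file size in bytes is the number of characters. For UTF-8 the answer is no, but let's set that aside for now, since budding computer scientists don't usually worry about internationalization.

I think the reason you want to know the number of characters is so that you can have storage big enough to hold all of them. You don't need to know how big the file is in order to store the whole thing.

If you have a std::vector, it can start out holding ten characters, then grow to hold twenty, then ten thousand... By the time you have finished reading the file it will hold everything, even though you never knew in advance how many characters there would be.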
