How do I open a file of arbitrary length in C?

Posted on 2024-12-04 05:54:03

As a school assignment I'm tasked with writing a program that opens any text file and performs a number of operations on the text. The text must be loaded using a linked list, meaning an array of structs containing the char pointer and the pointer to the next struct. One line per struct.

But I'm having problems actually loading the file. It seems the memory required to load the text into memory must be allocated before I actually read the text. Hence I have to open the file several times. Once to count the number of lines, then twice per line; once to count the characters in the line then once to read them. It seems absurd to open a file hundreds of times just to read it into memory.

Obviously there are better ways of doing this, I just don't know them :-)

Examples

  • Can the point from which fgetc fetches a character be moved without re-opening the file?
  • Can the number of lines or characters in a file be checked before it is "opened"?
  • Can I somehow read a line or string from a file and save it to memory without allocating a fixed amount of bytes?
  • etc

Comments (3)

や莫失莫忘 2024-12-11 05:54:03

There is no need to open the file more than once, nor to pass through it more than once.

Look at the POSIX getline() function. It reads lines into allocated space. You can use it to read the lines, and then copy the results for your linked list.

There is no need with a linked list to know how many lines there are ahead of time; that's an advantage of lists.

So, the code can be done with a single pass. Even if you can't use getline(), you can use fgets() and monitor whether it reads to end of line each time, and if it doesn't you can allocate (and reallocate) space to hold the line as needed (malloc(), realloc() and eventually free() from <stdlib.h>).
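The fgets()-based approach described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the name read_line, the starting size of 64 bytes, and the doubling strategy are all assumptions chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a malloc()ed string without the trailing newline, or NULL at
 * end of file or on allocation failure. The caller must free() it. */
char *read_line(FILE *fp)
{
    size_t cap = 64;   /* current buffer size (assumed starting value) */
    size_t len = 0;    /* characters stored so far */
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* fgets() appends after what we already have. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';   /* complete line: strip the newline */
            return buf;
        }
        if (len + 1 == cap) {
            /* Buffer is full and no newline was seen: grow and go on. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {    /* nothing was read: end of file */
        free(buf);
        return NULL;
    }
    return buf;        /* last line of a file with no final newline */
}
```

Calling read_line() in a loop until it returns NULL walks the whole file in a single pass, with the file opened only once.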

Your specific questions are largely immaterial if you adopt anything of the approach I suggest, but:

  • Using fseek() (and in extremis rewind()) will move the read pointer (for fgetc() and all other functions), unless the 'file' does not support seeking (e.g., a pipe provided as standard input).

  • The number of characters (the file size in bytes) can be determined with stat(), fstat(), or variants. The number of lines cannot be determined except by reading the file.

  • Since the file could be from zero bytes to gigabytes in size, there isn't a sensible way of doing fixed size allocations. You are pretty much forced into dynamic memory allocation with malloc() et al. (Behind the scenes, getline() uses malloc() and realloc().)
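To illustrate the first two points, here is a small sketch that uses only standard C calls (ftell(), fseek(), rewind(), fgetc()) rather than stat(). The helper names stream_size and count_lines are made up for this example, and measuring size by seeking to the end is assumed to behave as on a typical POSIX system.

```c
#include <stdio.h>

/* Hypothetical helper: size of a seekable stream in bytes, or -1 if the
 * stream cannot seek (for example a pipe). Restores the read position. */
long stream_size(FILE *fp)
{
    long here = ftell(fp);
    long size;

    if (here < 0 || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    size = ftell(fp);
    if (fseek(fp, here, SEEK_SET) != 0)
        return -1;
    return size;
}

/* Hypothetical helper: count lines by reading every character, then
 * rewind so the caller can read again from the start of the file. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int c;
    int last = '\n';   /* treat an empty file as having no lines */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;       /* final line without a trailing newline */
    rewind(fp);
    return lines;
}
```

Both helpers work on a file that is already open, so neither requires opening it a second time. With the single-pass approach, though, neither is actually needed to build the list.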

梦过后 2024-12-11 05:54:03

You cannot count the number of lines in a file without actually traversing it. You could get the total file size, but that's not what's intended here. The idea of using a linked list of lines is that you operate on the file one line at a time. You do not need to read anything in advance. While you haven't read the whole file, read a line, add it to its own node at the end of the linked list, and move on to the next line.
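The loop described above might be sketched as follows. The struct layout and the names line_node, next_line, load_lines, and free_lines are assumptions for illustration; the assignment may prescribe different ones.

```c
#include <stdio.h>
#include <stdlib.h>

/* One node per line (field names are assumed for this sketch). */
struct line_node {
    char *text;                /* malloc()ed line, no trailing newline */
    struct line_node *next;
};

/* Hypothetical helper: read one line of any length with fgetc().
 * Returns NULL at end of file or on allocation failure. */
static char *next_line(FILE *fp)
{
    size_t cap = 32, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {          /* keep room for the final '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {        /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Read the whole stream in one pass, appending one node per line. */
struct line_node *load_lines(FILE *fp)
{
    struct line_node *head = NULL;
    struct line_node **tail = &head;   /* where the next node is linked */
    char *text;

    while ((text = next_line(fp)) != NULL) {
        struct line_node *node = malloc(sizeof *node);
        if (node == NULL) {
            free(text);
            break;                     /* return what was loaded so far */
        }
        node->text = text;
        node->next = NULL;
        *tail = node;
        tail = &node->next;
    }
    return head;
}

/* Release every node and the line it owns. */
void free_lines(struct line_node *head)
{
    while (head != NULL) {
        struct line_node *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}
```

Keeping a pointer to the last next field means each append is constant time, and the number of lines never has to be known in advance.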

平生欢 2024-12-11 05:54:03

Regarding your first question: you can change the position in the file you are reading from with the fseek() function.

There are several ways you could do this. For example, you could have a fixed-size buffer, fill it with bytes from the file, copy lines from the buffer to the list, fill the buffer again and so on.
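One possible reading of the fixed-size buffer idea is sketched below. The 4096-byte chunk size, the callback interface (line_fn, for_each_line), and the line_stats example are all assumptions made for illustration. In the actual assignment the callback would append a copy of each line to the list.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: called once per complete line. */
typedef void (*line_fn)(const char *line, void *ctx);

/* Fill a fixed-size chunk with fread(), split it at newlines, and pass
 * each line to emit(). A line that spans several chunks is accumulated
 * in a growing heap buffer. Returns 0 on success, -1 if out of memory. */
int for_each_line(FILE *fp, line_fn emit, void *ctx)
{
    char chunk[4096];
    char *line = NULL;   /* current (possibly partial) line */
    size_t len = 0;      /* bytes stored in line so far */
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        size_t start = 0;
        while (start < n) {
            char *nl = memchr(chunk + start, '\n', n - start);
            size_t seg = nl ? (size_t)(nl - (chunk + start)) : n - start;
            char *tmp = realloc(line, len + seg + 1);
            if (tmp == NULL) {
                free(line);
                return -1;
            }
            line = tmp;
            memcpy(line + len, chunk + start, seg);
            len += seg;
            line[len] = '\0';
            if (nl != NULL) {
                emit(line, ctx);   /* the callback must copy if it keeps it */
                len = 0;
                start += seg + 1;  /* skip past the newline */
            } else {
                start += seg;      /* rest of chunk is a partial line */
            }
        }
    }
    if (len > 0)
        emit(line, ctx);           /* last line without a final newline */
    free(line);
    return 0;
}

/* Example callback: tally the line count and the longest line. */
struct line_stats {
    size_t lines;
    size_t longest;
};

void tally_line(const char *line, void *ctx)
{
    struct line_stats *s = ctx;
    size_t n = strlen(line);

    s->lines++;
    if (n > s->longest)
        s->longest = n;
}
```

The chunk buffer never grows, and only a line longer than what is left in the chunk causes extra allocation.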
