How to read a file using offsets in C

Posted on 2024-12-18 12:27:30

How can I read the contents of a file under the following constraints?

  1. I have to read the file in parts, using the start value and the length of each part.
  2. The start value and length of each part are read from another file.

Overall, I am trying to compute the MD5 value of these parts (you could also call them chunks).
The start value and length of each chunk have already been computed and stored in a file.
I tried to use fread() as follows, but it does not give me sensible results:

char *chunk_buffer;
//chunk_buffer is a pointer to a memory block 
while(cur_poly != NULL) {
    //cur_poly is a structure which is used to store the start and length of chunks
    chunk_buffer = (char*) malloc ((cur_poly->length)*8);
    //here I am trying to allocate memory based on the size of each chunk
    int x=fread (chunk_buffer,1, cur_poly->length, c_file);
    //c_file is the file to be read according to the offsets
    char hash[32];
    hash=md5(chunk_buffer);
    //md5() is a function which can generate the md5 hash values for the chunks
}

Comments (2)

等往事风中吹 2024-12-25 12:27:30

I see two potential issues.

  1. What units does cur_poly->length represent? You malloc memory as if it were a count of 64-bit words, yet read the file as if it were a count of bytes. If the field is a length in bytes, you are reading correctly but allocating eight times too much memory. If the field is a length in 64-bit words, you are allocating the right amount of memory but reading only 1/8 of the data.

  2. The code seems to ignore the offsets (or to assume that all chunks are contiguous). If you want to read from an arbitrary offset, call fseek(fp, offset, SEEK_SET); before the fread.
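The seek-then-read step from point 2 could look roughly like the sketch below. The helper name read_chunk and its parameters are made up for illustration, and it assumes the length is counted in bytes and that the offset fits in a long.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read `length` bytes starting at byte `offset`
   of `fp` into `buf`. Returns the number of bytes actually read, which
   may be less than `length` near the end of the file, or 0 if the seek
   fails. */
size_t read_chunk(FILE *fp, long offset, size_t length, unsigned char *buf)
{
    assert(fp != NULL && buf != NULL);

    /* Move the file position to the start of the chunk. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;

    /* Element size 1, so the return value is a byte count. */
    return fread(buf, 1, length, fp);
}
```

Comparing the return value with the requested length is what tells you whether the whole chunk was read.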

If the chunks are supposed to be contiguous, there may still be padding at the end of each one to force every chunk to start on an even boundary. You would then have to seek over the padding whenever the byte count is odd (the .WAV format does this, for example).
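If the file really is laid out that way, skipping the pad byte might be sketched as follows. Whether the asker's file uses any padding is not known; the function name read_padded_chunk is hypothetical, and the sketch assumes a single pad byte after each odd-length chunk.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one chunk of `length` bytes at the current
   file position, then skip a single pad byte if `length` is odd, so that
   the next chunk starts on an even boundary.
   Returns 0 on success, -1 on a short read or a failed seek. */
int read_padded_chunk(FILE *fp, unsigned char *buf, size_t length)
{
    assert(fp != NULL && buf != NULL);

    if (fread(buf, 1, length, fp) != length)
        return -1;

    /* Odd byte count: step over the pad byte. */
    if (length % 2 != 0 && fseek(fp, 1, SEEK_CUR) != 0)
        return -1;

    return 0;
}
```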

老娘不死你永远是小三 2024-12-25 12:27:30

I want to note some more issues with that code. You might need to add some more details on these points.

  1. If you want to read consecutive chunks from your file, you usually don't need to modify the file position yourself. Just read a chunk, then read the next one. If you need to read the chunks in random order, use fseek, which sets the start position of the next file operation to an offset from the beginning of the file, the end of the file, or the current position.

  2. You have a char pointer chunk_buffer that you apparently use to store the data from your file temporarily. That is, its contents are only needed for the current loop iteration.
    If this is the case, I suggest doing the malloc once before you enter the loop:

    char * chunk_buffer = malloc (MAXIMUM_CHUNK_SIZE);
    

    In the loop you can clear this buffer with memset or simply overwrite the data. Also note that malloc()ed memory is not initialized to '\0' (I don't know whether you rely on that assumption).

  3. I am not sure why you allocate a buffer of size length*8 and then read only length bytes into it. Probably

    int x = fread (chunk_buffer, SIZE_OF_ITEM, THIS_CHUNK_SIZE, c_file);
    

    would fit your needs better, if your items are indeed larger than a byte.

  4. It is unclear what the md5() function actually does. What value does it return? A pointer to a dynamically allocated buffer? A pointer to a local array? Either way, you assign the return value to a local array of chars, which C does not allow because arrays are not assignable. You probably do not need to allocate 32 bytes for this, just a pointer:

    char * hash = md5 (chunk_buffer);
    

    Make sure you keep that pointer somewhere you can find it when the loop moves on to the next iteration. Note that md5() cannot return a pointer to a non-static array declared in its own local scope, because that array no longer exists once the function returns.

  5. How does your md5() function know the size of a chunk? It is passed a pointer but, as far as I can see, not the size of the valid data. You might need to adapt the function to take the length of the input array as an additional parameter.

  6. What does the md5() function produce: a C-style string (alphanumeric characters, null-terminated) or an array of byte-sized unsigned integers (uint8_t)?

  7. Make sure that you free() the memory you allocate dynamically. If you want to keep the malloc() inside the loop, make sure every iteration ends with

    free (chunk_buffer);
    
  8. For us to help you any further, you need to tell us
    a) what results you would consider logical and
    b) what results you actually get.
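Points 2, 5 and 7 can be combined in one loop, roughly as in the sketch below. Everything in it is an assumption made for illustration: struct chunk only imitates cur_poly from the question, hash_chunks is a made-up name, and fnv1a is a 32-bit FNV-1a checksum standing in for md5(). It is not MD5 and only shows the (pointer, length) interface that a real digest function would need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical chunk descriptor, modelled on cur_poly from the question. */
struct chunk {
    long start;          /* byte offset of the chunk in the file */
    size_t length;       /* chunk length in bytes */
    struct chunk *next;
};

/* Stand-in for md5(): 32-bit FNV-1a. NOT MD5; it takes the data length
   explicitly instead of guessing it from the pointer. */
static uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash every chunk in the list and store the results in out[].
   Returns the number of chunks hashed, or -1 on any error
   (allocation failure, failed seek, short read, or out[] too small). */
int hash_chunks(FILE *fp, const struct chunk *list, uint32_t *out, int max_out)
{
    assert(fp != NULL && out != NULL);

    /* Find the largest chunk so that one buffer can be reused. */
    size_t max_len = 0;
    for (const struct chunk *c = list; c != NULL; c = c->next)
        if (c->length > max_len)
            max_len = c->length;

    /* Allocate once, before the loop. */
    unsigned char *buf = malloc(max_len ? max_len : 1);
    if (buf == NULL)
        return -1;

    int n = 0;
    for (const struct chunk *c = list; c != NULL; c = c->next) {
        if (n == max_out ||
            fseek(fp, c->start, SEEK_SET) != 0 ||
            fread(buf, 1, c->length, fp) != c->length) {
            free(buf);
            return -1;
        }
        out[n++] = fnv1a(buf, c->length);
    }

    free(buf);           /* released exactly once, after the loop */
    return n;
}
```

With a real MD5 routine in place of fnv1a, the output array would hold 16-byte digests rather than uint32_t values; the seek, read and free pattern stays the same.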
