Reading a generic file

Posted on 2024-11-09 11:26:29

I'm making a program that reads in a file from stdin, does something to it and sends it to stdout.

As it stands, I have a line in my program:

while((c = getchar()) != EOF){

where c is an int.

However, the problem is that I want to use this program on ELF executables. It appears that the executable must contain the byte that represents EOF for ASCII files, which results in the output being truncated (correct me if I'm wrong here - this is just my hypothesis).

What is an effective general way to go about doing this? I could dig up documents on the ELF format and then just check for whatever comes at the end. That would be useful, but I think it would be better if I could still apply this program to any kind of file.

江城子 2024-11-16 11:26:29

You'll be fine: the EOF constant is not a valid ASCII value (it's typically -1), so no byte in the file can be mistaken for it as long as c is an int.

For example, below is an excerpt from stdio.h on my system:

/* End of file character.
   Some things throughout the library rely on this being -1.  */
#ifndef EOF
# define EOF (-1)
#endif
屋檐 2024-11-16 11:26:29

You might want to go a bit lower level and use system functions such as open(), close() and read(). That way you can do what you like with the input, because it is stored in your own buffer.
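As a rough illustration of this lower-level approach, here is a minimal sketch assuming a POSIX system. The function name copy_fd and the 4096-byte buffer size are illustrative choices, not something from the answer above. It reads raw bytes into its own buffer and writes them out unchanged, so no byte value is treated specially; end of input is reported by read() returning 0 rather than by any value inside the data.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy every byte from in_fd to out_fd through our own buffer.
   Returns the number of bytes copied, or -1 on a read/write error.
   (copy_fd is a hypothetical helper name used for this sketch.) */
long long copy_fd(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);

    unsigned char buf[4096];
    long long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;          /* end of input: read() returned 0 */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
        total += n;
    }
}
```

Calling copy_fd(STDIN_FILENO, STDOUT_FILENO) from main() would give the stdin-to-stdout filter described in the question; the "do something" step would operate on buf between the read and the write.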

纸短情长 2024-11-16 11:26:29

You are doing it correctly.

EOF is not a character. No byte in the stream can make c equal to EOF. If c does contain EOF, that value did not originate from the file itself, but from the underlying library / OS. EOF signals that no more bytes could be read, either because the end of the input was reached or because a read error occurred; feof() and ferror() tell the two apart.

Make sure c is an int, though: a char cannot hold every byte value and EOF as distinct values.

Oh ... and you might want to read from a stream under your control. In the absence of code to do otherwise, stdin is a text stream and may be subject to "text translation" (on Windows, for example), which is usually not what you want when reading binary data.

FILE *mystream = fopen(filename, "rb");
if (mystream) {
    /* use fgetc() instead of getchar() */
    while((c = fgetc(mystream)) != EOF) {
        /* ... */
    }
    fclose(mystream);
} else {
    /* error */
}
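Since EOF can mean either end of input or a read error, it can help to check which one ended the loop. The following is a minimal sketch of that check; the helper name count_bytes and its had_error parameter are made up for illustration. It works on any FILE * that was opened in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Read `in` to the end, one byte at a time.
   Returns the number of bytes read; *had_error is set to 1 if the loop
   stopped because of a read error rather than end-of-file.
   (count_bytes is a hypothetical helper name used for this sketch.) */
long count_bytes(FILE *in, int *had_error)
{
    assert(in != NULL && had_error != NULL);

    long count = 0;
    int c;                              /* int, not char: holds 0..255 and EOF */

    while ((c = fgetc(in)) != EOF)
        count++;

    *had_error = ferror(in) ? 1 : 0;    /* no error means end-of-file was hit */
    return count;
}
```

With a stream opened via fopen(filename, "rb") as above, the returned count should match the file size even when the file contains bytes such as 0xff or 0x00.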
寂寞笑我太脆弱 2024-11-16 11:26:29

From the getchar(3) man page:

Character values are returned as an
unsigned char converted to an int.

This means a character value read via getchar() can never be equal to the signed integer -1. This little program illustrates it:

#include <stdio.h>

int main(void)
{
        int a;
        unsigned char c = EOF;  /* -1 wraps to 0xff in an unsigned char */

        a = (int)c;
        //output: 000000ff - 000000ff - ffffffff
        printf("%08x - %08x - %08x\n", (unsigned)a, (unsigned)c, (unsigned)-1);
        return 0;
}
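The same point can be checked against a real stream. This is a small sketch under a few assumptions: the helper name read_ff_byte is invented here, tmpfile() is assumed to be usable, and the narrowing to signed char is implementation-defined (it wraps to -1 on common compilers such as GCC and Clang). It writes the single byte 0xff, reads it back with fgetc(), and records both the int result and the same value squeezed into a signed char.

```c
#include <assert.h>
#include <stdio.h>

/* Write the single byte 0xff to a temporary file and read it back.
   *as_int receives the int returned by fgetc(); *as_char receives that
   value narrowed to signed char. Returns 0 on success, or -1 if the
   temporary file could not be created or written.
   (read_ff_byte is a hypothetical helper name used for this sketch.) */
int read_ff_byte(int *as_int, signed char *as_char)
{
    assert(as_int != NULL && as_char != NULL);

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    if (fputc(0xFF, f) == EOF) {
        fclose(f);
        return -1;
    }
    rewind(f);

    int c = fgetc(f);
    fclose(f);

    *as_int = c;
    /* Implementation-defined narrowing; typically yields -1. */
    *as_char = (signed char)c;
    return 0;
}
```

Stored in an int, the byte stays 255 and is distinct from EOF. Narrowed to a signed char, it typically becomes -1, which compares equal to EOF on systems where EOF is -1; that is the situation in which a loop like the one in the question would stop early on a binary file.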