"Extra content at the end of the document" error when using libxml2 to read a file descriptor created with shm_open

Posted on 2024-12-29 00:05:49

I'm trying to write a unit test that checks some XML parsing code. The unit test uses shm_open to create a file descriptor for an in-memory XML document and then passes it to xmlReaderForFd(). But the subsequent xmlTextReaderRead() fails with an "Extra content at the end of the document" error. The parsing code works fine on a file descriptor opened from an actual file (I did a byte-for-byte comparison with the shm_open one and it's the exact same set of bytes). Why is libxml2 choking on a file descriptor created with shm_open?

Here's my code:

void unitTest() {
  int fd = shm_open("/temporary", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
  char *pText = "<?xml version=\"1.0\"?><foo></foo>";
  write(fd, pText, strlen(pText) + 1);
  lseek(fd, 0, SEEK_SET);

  xmlTextReaderPtr pReader = xmlReaderForFd(
    fd,            // file descriptor
    "/temporary",  // base uri
    NULL,          // encoding
    0);            // options

  int result = xmlTextReaderRead(pReader);
  // result is -1
  // Get this error at console:
  //   /temporary:1: parser error : Extra content at the end of the document
  //   <?xml version="1.0"?><foo></foo>
  //                                   ^
}

Comments (2)

故事与诗 2025-01-05 00:05:49

I figured out the problem. I was writing out the NUL terminator: strlen(pText) + 1 also writes the trailing '\0', and libxml2 treats that byte after </foo> as extra content (although I could have sworn I already tried it without the terminator, d'oh!). The fixed code should simply be:

 write(fd, pText, strlen(pText));
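
One possible reason an earlier attempt without the terminator still failed: a POSIX shared-memory object persists until it is removed with shm_unlink(), and O_CREAT without O_TRUNC does not discard its existing contents. An object left over from a previous run would therefore keep its trailing '\0' even after the write length is corrected. Below is a minimal sketch of a reset-then-write step under that assumption; reset_and_write and fd_size are hypothetical helper names, not part of libxml2 or the original code.

```c
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: empties the object behind fd, writes the text
 * WITHOUT its trailing '\0', and rewinds so a reader starts at offset 0.
 * Returns 0 on success, -1 on error. */
static int reset_and_write(int fd, const char *text) {
    size_t len = strlen(text);              /* excludes the '\0' terminator */

    if (ftruncate(fd, 0) != 0) return -1;   /* drop bytes left by earlier runs */
    if (lseek(fd, 0, SEEK_SET) < 0) return -1;

    while (len > 0) {                       /* write() may write fewer bytes than asked */
        ssize_t n = write(fd, text, len);
        if (n < 0) return -1;
        text += n;
        len -= (size_t)n;
    }
    return lseek(fd, 0, SEEK_SET) < 0 ? -1 : 0;
}

/* Hypothetical helper: size in bytes of the object behind fd, or -1 on error. */
static long fd_size(int fd) {
    struct stat st;
    return fstat(fd, &st) == 0 ? (long)st.st_size : -1;
}
```

In the unit test from the question, reset_and_write(fd, pText) would stand in for the write() and lseek() calls before the descriptor is handed to xmlReaderForFd(), and calling shm_unlink("/temporary") at the end of the test keeps the object from leaking into the next run.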

合久必婚 2025-01-05 00:05:49

Also, make sure you are reading the file as binary, not text. On Windows, text mode translates each CR/LF pair to a single LF, so fewer bytes are read than the file size reports, which leaves detritus at the end of the buffer.

Example (VS 2010):

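struct _stat st;                     // struct _stat is the type _fstat() expects
char *buf;
FILE *f = fopen("123.XML", "rb");    // right: binary mode, bytes are read unchanged
//f = fopen("123.XML", "rt");        // WRONG: text mode translates CR/LF to LF
assert(f != NULL);
_fstat(_fileno(f), &st);
buf = (char *)malloc(st.st_size);
assert(buf != NULL);
size_t ret = fread(buf, st.st_size, 1, f);   // one item of st_size bytes
assert(ret == 1);
// etc.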