How do I use libtidy with tidyParseBuffer()?
I'm trying to clean up some HTML with libtidy (in C). The problem is this:
I want to build a TidyDoc (a tree-like structure) with tidyParseBuffer(), but I can't get it to work.
tidyParseFile() gives me no trouble. As for tidyParseBuffer(): I'm sure I read the file properly and that the TidyBuffer structure I pass to tidyParseBuffer() is filled in correctly.
Any ideas?
Here is the code:
//declaration
tidyInput = malloc(sizeof(TidyBuffer));
tidyOutput = malloc(sizeof(TidyBuffer));
do {
len = fread(pbInputData, 1, nInputData, h->file);
tidyBufAttach(tidyInput, (void*)pbInputData, len);
tidyParseBuffer(h->doc, tidyInput); // doc is the TidyDoc
} while (len >= nInputData);
tidyOptSetBool(h->doc, TidyForceOutput, yes);
tidySaveFile(handler->doc, "C://test.xhtml");
I did simplify the code.

The problem stems from the fact that you are parsing the file's contents in chunks: you read each chunk into a buffer and call tidyParseBuffer() once per chunk. The tidyParseXxx() functions parse the whole input in a single call rather than appending to a previous parse, so to do what you want you should take a look at TidyInputSource and tidyParseSource().