libzip: zip_name_locate() fails on a specific filename, even when trying every possible encoding combination

Posted on 2025-01-13 08:21:31

I am trying to build a "failsafe" layer on top of libzip, but libzip is giving me some trouble here.

First, I add a file to my (empty) archive with zip_file_add(...), which accepts one of 3 user-selectable encoding flags. Then I try to locate the name with zip_name_locate(...), which also accepts one of 3 encoding flags.

This MCVE checks all possible encoding combinations, and all of them fail for the specific filename x%²»Ã-ØÑ–6¨wx.txt. With a more conventional filename such as file.txt, zip_name_locate() succeeds every time.

#include <zip.h>
#include <include/libzip.h> // local header: #pragmas that link the libzip .lib files
#include <cstring>  // strlen
#include <iostream>
#include <vector>
#include <utility>

/*
    'zip_file_add' possible encodings:
        ZIP_FL_ENC_GUESS
        ZIP_FL_ENC_UTF_8
        ZIP_FL_ENC_CP437

    'zip_name_locate' possible encodings:
        ZIP_FL_ENC_RAW
        ZIP_FL_ENC_GUESS
        ZIP_FL_ENC_STRICT
*/

/*
    build encoding pairs (trying all possibilities)
*/
std::vector<std::pair<unsigned, unsigned>>
encoding_pairs{
    { ZIP_FL_ENC_GUESS, ZIP_FL_ENC_RAW },
    { ZIP_FL_ENC_UTF_8, ZIP_FL_ENC_RAW },
    { ZIP_FL_ENC_CP437, ZIP_FL_ENC_RAW },
    { ZIP_FL_ENC_GUESS, ZIP_FL_ENC_GUESS },
    { ZIP_FL_ENC_UTF_8, ZIP_FL_ENC_GUESS },
    { ZIP_FL_ENC_CP437, ZIP_FL_ENC_GUESS },
    { ZIP_FL_ENC_GUESS, ZIP_FL_ENC_STRICT },
    { ZIP_FL_ENC_UTF_8, ZIP_FL_ENC_STRICT },
    { ZIP_FL_ENC_CP437, ZIP_FL_ENC_STRICT },
};

int main(int argc, char** argv) {

    const char* file_buf = "hello world";
#if 0
    const char* file_name = "file.txt";
#else
    const char* file_name = "x%²»Ã-ØÑ–6¨wx.txt";
#endif

    zip_error_t ze;
    zip_error_init(&ze);
    {
        zip_source_t* zs = zip_source_buffer_create(nullptr, 0, 1, &ze);
        if (zs == NULL)
            return -1;

        zip_t* z = zip_open_from_source(zs, ZIP_CHECKCONS, &ze);
        if (z == NULL)
            return -1;
        {
            for (size_t ep = 0; ep < encoding_pairs.size(); ep++) {
                std::cout << "ep = " << ep << std::endl;

                // Create a fresh source on every iteration: a successful
                // zip_file_add() hands ownership of the source to the archive.
                // 0 = libzip must not free file_buf (it points to a string literal)
                zip_source_t* s = zip_source_buffer(z, file_buf, strlen(file_buf), 0);
                if (s == NULL)
                    return -1;

                // zip_file_add() returns a signed zip_int64_t, -1 on error
                zip_int64_t index = zip_file_add(z, file_name, s, encoding_pairs[ep].first);
                if (index == -1) {
                    zip_source_free(s); // the source is still ours when the add fails
                    std::cout << "could not zip_file_add() with encoding " << encoding_pairs[ep].first << std::endl;
                    continue;
                }

                if (zip_name_locate(z, file_name, encoding_pairs[ep].second) == -1) {
                    std::cout << "the name '" << file_name << "' could not be located." << std::endl;
                    std::cout << " encoding pair: " << encoding_pairs[ep].first << " <-> " << encoding_pairs[ep].second << std::endl;
                }
                else {
                    std::cout << "the name was located." << std::endl;
                }

                // Remove the entry again so the next pair starts from an empty archive
                if (zip_delete(z, static_cast<zip_uint64_t>(index)) == -1)
                    return -1;
            }
        }
        zip_close(z);
    }
    zip_error_fini(&ze);

    return 0;
}

I don't understand what I might be doing wrong here, or whether libzip simply cannot resolve such a name.

If it can't, what criteria should a name meet to avoid this?


Comments (1)

靑春怀旧 2025-01-20 08:21:31

It turns out the problem was the encoding of my source file itself. It was saved as ANSI, so I converted it to UTF-8, which solved the issue.
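To make the difference concrete, here is a minimal sketch, assuming "ANSI" means Windows-1252 (the post does not name the code page). The helper is_valid_utf8() is hypothetical and not part of libzip; it only shows that the same visible characters are different bytes in the two encodings, and that the ANSI bytes are not well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a libzip function): true if `s` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong forms,
// UTF-16 surrogates and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }            // plain ASCII byte
        std::size_t len;
        unsigned long cp;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                          // invalid lead byte
        if (i + len > n) return false;              // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp[len]) return false;         // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        if (cp > 0x10FFFF) return false;
        i += len;
    }
    return true;
}

// The text "x" + U+00B2 + U+00BB written as two different byte sequences.
void self_check() {
    const std::string ansi = "x\xB2\xBB";          // assumed Windows-1252 bytes
    const std::string utf8 = "x\xC2\xB2\xC2\xBB";  // UTF-8 bytes
    assert(!is_valid_utf8(ansi));
    assert(is_valid_utf8(utf8));
    assert(is_valid_utf8("file.txt"));             // pure ASCII is valid UTF-8
}
```

A failsafe layer could run a check like this on a name before handing it to zip_file_add(), and reject or convert the name when the check fails. That is one possible design, not something libzip requires.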

What I still don't understand is why libzip can't zip_name_locate() a name from an input C string that is exactly the same as the input C string passed to zip_file_add() (whatever the source file encoding might be). "Lost in translation", perhaps?
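One way to take the source file encoding out of the picture is to spell the name as explicit bytes, so the literal is the same no matter how the .cpp file is saved. The sketch below assumes the characters shown in the question are the intended ones and, again, that "ANSI" means Windows-1252; the constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// The file name from the question as explicit UTF-8 bytes.
// Adjacent string literals are concatenated by the compiler. Keeping "6" in
// its own literal stops the preceding hex escape from swallowing it as a digit.
const char* const kFileNameUtf8 =
    "x%"
    "\xC2\xB2"      // U+00B2 superscript two
    "\xC2\xBB"      // U+00BB right-pointing double angle quotation mark
    "\xC3\x83"      // U+00C3 A with tilde
    "-"
    "\xC3\x98"      // U+00D8 O with stroke
    "\xC3\x91"      // U+00D1 N with tilde
    "\xE2\x80\x93"  // U+2013 en dash
    "6"
    "\xC2\xA8"      // U+00A8 diaeresis
    "wx.txt";

// The same characters as single bytes in Windows-1252 (assumed "ANSI" code page).
const char* const kFileNameAnsi =
    "x%\xB2\xBB\xC3-\xD8\xD1\x96" "6" "\xA8" "wx.txt";

// Same visible text, different byte strings.
void check_name_bytes() {
    assert(std::strlen(kFileNameUtf8) == 25);
    assert(std::strlen(kFileNameAnsi) == 17);
    assert(std::strcmp(kFileNameUtf8, kFileNameAnsi) != 0);
}
```

With the bytes pinned like this, re-saving the source file in another encoding no longer changes what is passed to zip_file_add() and zip_name_locate().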

(Special thanks to Thomas Klausner for helping me find the issue.)
