How to deal with Python unicode strings with null bytes the 'right' way?

Posted on 2024-10-30 13:42:08

Question

It seems that PyWin32 is comfortable with giving null-terminated unicode strings as return values. I would like to deal with these strings the 'right' way.

Let's say I'm getting a string like: u'C:\\Users\\Guest\\MyFile.asy\x00\x00sy'. This appears to be a C-style null-terminated string hanging out in a Python unicode object. I want to trim this bad boy down to a regular ol' string of characters that I could, for example, display in a window title bar.

Is trimming the string off at the first null byte the right way to deal with it?

I didn't expect to get a return value like this, so I wonder if I'm missing something important about how Python, Win32, and unicode play together... or if this is just a PyWin32 bug.

Background

I'm using the Win32 file chooser function GetOpenFileNameW from the PyWin32 package. According to the documentation, this function returns a tuple containing the full filename path as a Python unicode object.

When I open the dialog with an existing path and filename set, I get a strange return value.

For example I had the default set to: C:\\Users\\Guest\\MyFileIsReallyReallyReallyAwesome.asy

In the dialog I changed the name to MyFile.asy and clicked save.

The full path part of the return value was: u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'

I expected it to be: u'C:\\Users\\Guest\\MyFile.asy'

The function is returning a recycled buffer without trimming off the terminating bytes. Needless to say, the rest of my code wasn't set up for handling a C-style null-terminated string.

Demo Code

The following code demonstrates the null-terminated string in the return value from GetSaveFileNameW.

Directions: In the dialog change the filename to 'MyFile.asy' then click Save. Observe what is printed to the console. The output I get is u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'.

import win32gui, win32con

if __name__ == "__main__":
    initial_dir = 'C:\\Users\\Guest'
    initial_file = 'MyFileIsReallyReallyReallyAwesome.asy'
    # Filter entries are null-separated pairs: description, then pattern.
    filter_string = 'All Files\0*.*\0'
    # The call returns a (filename, customfilter, flags) tuple.
    (filename, customfilter, flags) = \
        win32gui.GetSaveFileNameW(InitialDir=initial_dir,
                    Flags=win32con.OFN_EXPLORER, File=initial_file,
                    DefExt='txt', Title="Save As", Filter=filter_string,
                    FilterIndex=0)
    # repr() makes any embedded null characters visible.
    print(repr(filename))

Note: If you don't shorten the filename enough (for example, if you try MyFileIsReally.asy) the string will be complete without a null byte.

Environment

Windows 7 Professional 64-bit (no service pack), Python 2.7.1, PyWin32 Build 216

UPDATE: PyWin32 Tracker Artifact

Based on the comments and answers I have received so far, this is likely a pywin32 bug so I filed a tracker artifact.

UPDATE 2: Fixed!

Mark Hammond reported in the tracker artifact that this is indeed a bug. A fix was checked in to rev f3fdaae5e93d, so hopefully that will make the next release.

I think Aleksi Torhamo's answer below is the best solution for versions of PyWin32 before the fix.

Comments (3)

溺孤伤于心 2024-11-06 13:42:08

I'd say it's a bug. The right way to deal with it would probably be fixing pywin32, but in case you aren't feeling adventurous enough, just trim it.

You can get everything before the first '\x00' with filename.split('\x00', 1)[0].
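As a minimal sketch of that workaround, the split can be wrapped in a small helper. The name trim_at_null is hypothetical and not part of PyWin32. A string with no null character passes through unchanged.

```python
def trim_at_null(text):
    """Return everything before the first null character in text."""
    # split() with maxsplit=1 cuts at the first null only; if there is
    # no null, the single resulting piece is the whole string.
    return text.split(u'\x00', 1)[0]


raw = u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'
print(repr(trim_at_null(raw)))

# A name that was not shortened enough comes back without a null
# and is left as-is.
print(repr(trim_at_null(u'C:\\Users\\Guest\\MyFileIsReally.asy')))
```

Calling the helper on every filename returned by the dialog should be harmless on PyWin32 versions that include the fix, since those strings contain no null to cut at.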

花海 2024-11-06 13:42:08

This doesn't happen on the version of PyWin32/Windows/Python I tested; I don't get any nulls in the returned string even if it's very short. You might investigate if a newer version of one of the above fixes the bug.

铜锣湾横着走 2024-11-06 13:42:08

ISTR that I had this issue some years ago, then I discovered that such Win32 filename-dialog-related functions return a sequence of 'filename1\0filename2\0...filenameN\0\0', while including possible garbage characters depending on the buffer that Windows allocated.

Now, you might prefer a list instead of the raw return value, but that would be an RFE, not a bug.

PS When I had this issue, I quite understood why one would expect GetOpenFileName to possibly return a list of filenames, while I couldn't imagine why GetSaveFileName would. Perhaps this is considered as API uniformity. Who am I to know, anyway?
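If the return value really does follow the 'filename1\0filename2\0...filenameN\0\0' layout described in this answer, turning it into a list could look roughly like the sketch below. The helper name split_null_separated is hypothetical, and the sketch assumes that anything after the first double null is leftover buffer content.

```python
def split_null_separated(raw):
    """Split a null-separated, double-null-terminated buffer into entries.

    Assumes entries are separated by single nulls, the list ends at the
    first double null, and whatever follows is garbage to be ignored.
    """
    # Keep only the part before the double-null terminator, if present.
    payload = raw.split(u'\x00\x00', 1)[0]
    # Drop empty pieces, such as the one produced by a single trailing null.
    return [entry for entry in payload.split(u'\x00') if entry]


print(split_null_separated(u'a.txt\x00b.txt\x00\x00junk'))
print(split_null_separated(u'C:\\Users\\Guest\\MyFile.asy\x00\x00sy'))
```

This approach cannot tell a second entry from garbage that follows a single null. For u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy' it would report u'wesome.asy' as a second filename. For a single-file dialog, cutting at the first null is therefore the safer choice.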
