Detecting case mismatches in filenames on Windows (preferably using Python)?

Posted on 2024-08-03 08:13:50

I have some XML configuration files that are created in a Windows environment but deployed on Linux. These configuration files reference each other by file path. We've had problems with case sensitivity and trailing spaces before, and I'd like to write a script that checks for these problems. We have Cygwin, if that helps.

Example:

Let's say I have a reference to the file foo/bar/baz.xml. I'd write this:

<someTag fileref="foo/bar/baz.xml" />

Now if we by mistake do this:

<someTag fileref="fOo/baR/baz.Xml  " />

It will still work on Windows, but it will fail on Linux.

What I want to do is detect the cases where a file reference in these files doesn't match the real file with respect to case.

Comments (2)

梦冥 2024-08-10 08:13:50

os.listdir on a directory, in all case-preserving filesystems (including those on Windows), returns the actual case for the filenames in the directory you're listing.
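As a minimal sketch of that behaviour (the helper name on_disk_name is hypothetical, not something from the answer), the spelling stored on disk can be looked up by comparing names case-insensitively against the directory listing:

```python
import os

def on_disk_name(parent, name):
    """Return the spelling stored on disk for `name` in `parent`, or None.

    Hypothetical helper: it only looks at a single directory level and
    returns the first entry that matches when case is ignored.
    """
    for entry in os.listdir(parent):
        if entry.lower() == name.lower():
            return entry
    return None
```

If the returned name differs from the name in the reference, the reference would work on Windows but break on a case-sensitive filesystem.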

So you need to do this check at each level of the path:

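import os

def onelevelok(parent, thislevel):
  # Compare against the names as actually stored in the parent directory.
  for fn in os.listdir(parent):
    if fn.lower() == thislevel.lower():
      # Same name ignoring case: OK only if the case matches exactly too.
      return fn == thislevel
  raise ValueError('No %r in dir %r!' % (
      thislevel, parent))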

Here I'm assuming that the complete absence of any case variation of a name is a different kind of error, and using an exception for that. For the whole path (assuming no drive letters or UNC paths, which wouldn't translate to Linux anyway):

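def allpathok(path):
  # os.path.split only splits off the last component, so split on '/'
  # (the separator used in the XML references) to get every level.
  levels = [lv for lv in path.split('/') if lv]
  top = '/' if path.startswith('/') else '.'
  # parents[i] is the directory that should contain levels[i].
  parents = [top]
  for lv in levels:
    parents.append(os.path.join(parents[-1], lv))
  return all(onelevelok(p, t)
             for p, t in zip(parents, levels))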

You may need to adapt this if, e.g., foo/bar is not to be taken to mean that foo is in the current directory but somewhere else; or, of course, if UNC paths or drive letters are in fact needed (but as I mentioned, translating them to Linux is not trivial anyway ;-).

Implementation notes: I'm taking advantage of the fact that zip just drops "extra entries" beyond the length of the shortest of the sequences it's zipping, so I don't need to explicitly slice the last entry off the first argument; zip does it for me. all short-circuits where it can, returning False as soon as it detects a false value, so it's just as good as an explicit loop but more concise.
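To tie this back to the question, here is a rough, self-contained sketch of how such a check might be run over the references in an XML file. The names FILEREF_RE, check_ref and scan_xml are hypothetical, the attribute name fileref is taken from the question's example, and the regular expression is a shortcut that assumes double-quoted attributes rather than a real XML parse. Unlike the functions above, it reports problems as strings instead of raising, and it also flags surrounding whitespace.

```python
import os
import re

# Assumption: references always look like fileref="..." as in the question.
FILEREF_RE = re.compile(r'fileref="([^"]*)"')

def check_ref(base_dir, ref):
    """Return a list of problem descriptions for one reference (empty if OK)."""
    problems = []
    if ref != ref.strip():
        problems.append('leading/trailing whitespace')
    parent = base_dir
    for part in ref.strip().split('/'):
        if part in ('', '.'):
            continue
        if part == '..':
            parent = os.path.join(parent, part)
            continue
        if not os.path.isdir(parent):
            problems.append('not a directory: %r' % parent)
            break
        names = os.listdir(parent)
        if part not in names:
            # Look for a name that differs only in case.
            variants = [n for n in names if n.lower() == part.lower()]
            if not variants:
                problems.append('missing: %r' % part)
                break
            problems.append('case mismatch: %r should be %r'
                            % (part, variants[0]))
            part = variants[0]  # keep walking with the on-disk spelling
        parent = os.path.join(parent, part)
    return problems

def scan_xml(xml_path, base_dir):
    """Yield (reference, problems) for each problematic fileref in a file."""
    with open(xml_path, encoding='utf-8') as f:
        text = f.read()
    for ref in FILEREF_RE.findall(text):
        problems = check_ref(base_dir, ref)
        if problems:
            yield ref, problems
```

Something like `for ref, problems in scan_xml('config.xml', '.'): print(ref, problems)` would then list the suspicious references, where config.xml stands in for one of your own files. Because the comparison is done against os.listdir output, the sketch should behave the same whether it runs on Windows or on Linux.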

枕花眠 2024-08-10 08:13:50

It's hard to judge what exactly your problem is, but if you apply os.path.normcase along with str.strip before saving your file names, it should solve all your problems.
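For illustration only, here is a small sketch of what that normalisation does to the reference from the question. The function name normalized_ref is hypothetical. It calls ntpath and posixpath directly (the modules that os.path resolves to on Windows and on Linux respectively) so the result does not depend on where it runs. Note that the Windows flavour also converts forward slashes to backslashes, and the POSIX flavour leaves the string unchanged, so this only helps if the normalisation happens on the Windows side and the files on disk follow the same lower-case convention.

```python
import ntpath
import posixpath

def normalized_ref(ref, flavour=ntpath):
    """Strip surrounding whitespace, then apply the flavour's normcase."""
    return flavour.normcase(ref.strip())

ref = 'fOo/baR/baz.Xml  '
windows_result = normalized_ref(ref, ntpath)     # lower-cased, '/' -> '\\'
linux_result = normalized_ref(ref, posixpath)    # only the strip has an effect
```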

As I said in a comment, it's not clear how you are ending up with such a mistake. However, it would be trivial to check for an existing file, as long as you have some sensible convention (all file names are lower case, for example):

try:
    f = open(fname)
except IOError:
    # Exact name not found: fall back to the lower-case convention.
    f = open(fname.lower())