Using findstr and a regular expression to search a CSV

Posted 2024-10-14 15:58:55

I was wondering if it's possible to use findstr to search through a CSV for anything matching this regular expression:

^([BPXT][0-9]{6})|([a-zA-Z][a-zA-z][0-9][0-9](adm)?)$

独夜无伴 2024-10-21 15:58:55

I don't know which language you're talking about, but there is one obvious problem with your regex: The ^ and $ anchors require that it matches the entire string, and you seem to be planning on matching individual entries in your CSV file.

Therefore, you should use word boundary anchors instead if your regex engine supports them:

\b(?:([BPXT][0-9]{6})|([a-zA-Z]{2}[0-9]{2}(adm)?))\b

I've also added another non-capturing group around the alternation. In your regex the anchors at the start and end of the string would have been part of the alternation, which is probably not intended. Whether you really need all the other parentheses depends on what you're going to do with the match.
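To make the difference concrete, here is a minimal sketch in Python (chosen only for illustration, since the question does not name a language). The sample line and the `matches` helper are made up for this example; the first pattern is copied verbatim from the question, the second is the one suggested above.

```python
import re

# Pattern copied verbatim from the question. "^" binds only to the first
# alternative and "$" only to the second one.
ORIGINAL = re.compile(r"^([BPXT][0-9]{6})|([a-zA-Z][a-zA-z][0-9][0-9](adm)?)$")

# Pattern from this answer: word boundaries around a non-capturing group.
BOUNDED = re.compile(r"\b(?:([BPXT][0-9]{6})|([a-zA-Z]{2}[0-9]{2}(adm)?))\b")

# Hypothetical CSV line, invented for this example.
line = "alice,B123456,ab12adm,zzab12"


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(ORIGINAL, line))  # ['ab12']
print(matches(BOUNDED, line))   # ['B123456', 'ab12adm']
```

With the original pattern, only the tail of the last field matches, because the first alternative is tied to the start of the line and the second to its end. With word boundaries, the two complete entries are found and the partial `ab12` inside `zzab12` is skipped.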

⊕婉儿 2024-10-21 15:58:55

No, it is not possible to use findstr to search for matching substrings, especially those matching the complex expression you've provided.

findstr is a Windows built-in.

findstr /? shows the subset of regex that it can use:

Regular expression quick reference:
  .        Wildcard: any character
  *        Repeat: zero or more occurrences of previous character or class
  ^        Line position: beginning of line
  $        Line position: end of line
  [class]  Character class: any one character in set
  [^class] Inverse class: any one character not in set
  [x-y]    Range: any characters within the specified range
  \x       Escape: literal use of metacharacter x
  \<xyz    Word position: beginning of word
  xyz\>    Word position: end of word

This means that most of your expression is out the window.

Also, findstr can't limit its output to just the matched expression; it only identifies lines containing matches.

It is entirely unsuitable for the task described.
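If a scripting language is available, something along these lines may be closer to what you want. This is only a sketch in Python, assuming that each CSV field should match the expression as a whole; the function name `find_matching_fields` and the sample data are invented for illustration, and the pattern is a simplified form of the one in the question without capturing groups.

```python
import csv
import io
import re

# Simplified form of the expression from the question. fullmatch() below
# requires the entire field to match, so no anchors are needed.
FIELD = re.compile(r"(?:[BPXT][0-9]{6}|[a-zA-Z]{2}[0-9]{2}(?:adm)?)")


def find_matching_fields(csv_file):
    """Yield (row_number, column_number, value) for each fully matching field."""
    for row_no, row in enumerate(csv.reader(csv_file), start=1):
        for col_no, value in enumerate(row, start=1):
            value = value.strip()
            if FIELD.fullmatch(value):
                yield row_no, col_no, value


# Hypothetical data; for a real file use open(path, newline="") instead.
sample = io.StringIO("name,id,login\nalice,B123456,ab12adm\nbob,Q000000,zzab12\n")
hits = list(find_matching_fields(sample))
for hit in hits:
    print(hit)
# (2, 2, 'B123456')
# (2, 3, 'ab12adm')
```

Unlike findstr, this reports only the matching values (with their row and column) rather than whole lines.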
