Batch: find strings listed in textlist.txt in the other .txt files of a directory and copy the matching values to a new file

Posted 2024-12-29 01:39:40 | Length 772 | Views 3 | Comments 0

I am having trouble writing a batch file that finds the strings listed in textlist.txt in a set of text files and copies them. Is it possible to write a batch file which finds all values (from textlist.txt) in the text files in a directory and copies all of these values to a new file?
I have a directory with:

textlist.txt contains:

  • 3010
  • 3020
  • 3030
    ....

and another directory with txt files:

3010.txt contains tab-delimited lines, for example:

  • 3010 a1
  • 3011 b1
  • 3012 c1
    ....

3020.txt contains for example:

  • 3020 a4
  • 3021 b3
  • 3022 g5
    ....

3030.txt contains for example:

  • 3030 h5 g7
  • 3031 f2
  • 3032 t4
    ....

and others: 3040.txt, 3050.txt, etc.

I need a result txt file like this:

  • 3010 a1 { this string comes from 3010.txt, but it may also be found in another txt file }
  • 3020 a4 { this string comes from 3010.txt, but it may also be found in another txt file }
  • 3030 h5 g7 { this string comes from 3010.txt, but it may also be found in another txt file }

Thanks for your help.

Comments (2)

月寒剑心 2025-01-05 01:39:41

The following will produce a file containing matches across all the files for all the input lines:

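@ECHO OFF
REM Redirect the output of the whole loop to results.txt.
>results.txt (
  FOR /F "delims=" %%i IN (textlist.txt) DO (
    REM Search every file for the term, then drop header lines and empty lines.
    FIND "%%i" files\*.txt | FINDSTR /v "^---- ^$"
  )
)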

It works like this:

  • FIND takes an input line, searches the specified files for it and prints the results, which are piped to FINDSTR.

  • FINDSTR removes the ‘noise’ produced by FIND (empty lines and lines like ------ filename.txt) and prints the rest.

  • The parsed result is printed on the standard output, which is redirected to results.txt for the entire loop.

If, as per your comment, you additionally need to create another file containing the lines that had no matches, you could modify the above script like this:

@ECHO OFF
>results.txt 2>notinfile.txt (
  FOR /F "delims=" %%i IN (textlist.txt) DO (
    (FIND "%%i" files\*.txt || (ECHO %%i) 1>&2) | FINDSTR /v "^---- ^$"
  )
)

The logic is mostly the same, except when FIND gets no matches for the input line. In that case the search term is printed on the standard error (ECHO … 1>&2). The standard output of FIND still gets piped to FINDSTR and, as it contains only noise in this case, FINDSTR produces nothing.

So, the loop produces results both on the standard output and on the standard error, each time depending on the result of FIND. The standard output is redirected to results.txt, like in the previous version, and the standard error is redirected to notinfile.txt.
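
For readers who want to cross-check the batch output, the same logic can be sketched outside cmd. The following is a minimal Python sketch, not part of the original answer: the function name search_terms, the made-up term 9999 and the temporary folder are assumptions for illustration only. It does a case-sensitive substring search, like FIND without /I, and returns the matched lines and the unmatched terms separately, which correspond to results.txt and notinfile.txt.

```python
import tempfile
from pathlib import Path


def search_terms(terms, folder):
    """Return (matches, unmatched) for a substring search over folder/*.txt."""
    files = sorted(Path(folder).glob("*.txt"))
    matches, unmatched = [], []
    for term in terms:
        found = False
        for path in files:
            for line in path.read_text().splitlines():
                # Plain substring test, comparable to FIND "term".
                if term in line:
                    matches.append(line)
                    found = True
        if not found:
            # Comparable to echoing the term to standard error.
            unmatched.append(term)
    return matches, unmatched


if __name__ == "__main__":
    # Sample data from the question; "9999" is a hypothetical term with no match.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "3010.txt").write_text("3010\ta1\n3011\tb1\n3012\tc1\n")
        Path(tmp, "3020.txt").write_text("3020\ta4\n3021\tb3\n3022\tg5\n")
        matches, unmatched = search_terms(["3010", "3020", "9999"], tmp)
        print("matches:", matches)
        print("unmatched:", unmatched)
```

Because the test is a substring test, a term such as 3010 would also match a line containing 13010 or 30100, which is the same behaviour the batch version has.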

UPDATE

The second script has got a tiny flaw which may or may not be serious in your situation, depending on how you are going to use the notinfile.txt file. The flaw consists in an extra space being added to the end of every value that goes to notinfile.txt.

Whether it is a bug or an artefact of some sort, I don't know, but I found out that if I store the results of FIND to a temporary file and later load them from it into FINDSTR, no extra space is produced. I couldn't find any other way to fix the extra space issue, so here's the modified version:

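@ECHO OFF
>results.txt 2>notinfile.txt (
  FOR /F "delims=" %%i IN (textlist.txt) DO (
    REM Store the raw FIND output; if nothing matched, report the term on standard error.
    FIND "%%i" files\*.txt >tmpResults || (ECHO %%i) 1>&2
    REM Filter the stored output, dropping header lines and empty lines.
    FINDSTR /v "^---- ^$" <tmpResults
  )
)
REM Remove the temporary file.
DEL tmpResults 2>NUL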

UPDATE 2 (following additional comments)

Since the values in the files are tab delimited, you can include the delimiter in the search string, just after the searched term, to prevent matching aaa with e.g. aaa/bbb. So instead of

FIND "%%i"

you would have

FIND "%%i   "

where the wide space after %%i is the tab character.
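
The effect of appending the tab can be illustrated with a short Python sketch. It is not part of the original answer, and both function names are hypothetical. The first function mirrors searching for the term immediately followed by a tab; the second is a stricter variant (an assumption beyond what the answer proposes) that requires the first tab-separated field to equal the term.

```python
def term_then_tab(line, term):
    """Mirror of FIND "term<TAB>": the term must be directly followed by a tab."""
    return (term + "\t") in line


def first_field_is(line, term):
    """Stricter variant (assumption): the first tab-separated field equals the term."""
    return line.split("\t", 1)[0] == term


if __name__ == "__main__":
    # Hypothetical sample lines; only the first one is taken from the question.
    for line in ["3010\ta1", "30100\ta1", "13010\ta1"]:
        print(repr(line), term_then_tab(line, "3010"), first_field_is(line, "3010"))
```

The tab suffix rules out longer values that merely start with the term, but a value that ends with the term still matches; comparing the whole first field avoids both cases.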

I would also like to suggest an alternative version of the FINDSTR command. As you are searching for just one term at a time, you can change this:

FINDSTR /v "^---- ^$"

to simply this:

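FINDSTR "%%i"

Note that the header lines printed by FIND contain the file name, so with this filter a header line also gets through whenever the file name itself contains the term (for example, the term 3010 and the file 3010.txt).
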
这样的小城市 2025-01-05 01:39:41

Use the FOR command (with the /F switch) to execute a command for each line in a file, and use FIND to search the files. Here is a sample that demonstrates their use.

FIND reports matches but also adds a line starting with "----" to show which file the match was found in. That's why the first FOR dumps to a temp file. The second FOR treats any line that starts with "-" as a comment and thus filters that information out of the FIND output. If you run the batch more than once, it will delete any previous results.txt file.

It is important that the textlist.txt file is NOT in the same folder as the search folder, otherwise its contents will be included in the results.

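@echo off
setlocal
rem Folder containing the .txt files to search (example path).
set "searchFolder=C:\theFolder"

rem Remove leftovers from a previous run.
del tempResults.txt >nul 2>nul
del results.txt >nul 2>nul

rem Collect the raw FIND output for every term listed in textlist.txt.
for /F "delims=" %%i in (textlist.txt) do find "%%i" "%searchFolder%\*.txt" >> tempResults.txt
rem Copy every line that does not start with "-" (the FIND file headers) to results.txt.
for /F "eol=- delims=" %%i in (tempResults.txt) do >>results.txt echo %%i
del tempResults.txt >nul 2>nul

endlocal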