Faster way to parse a file for all list elements and append lines to new files based on the list element

Posted 2024-12-02 19:15:14

I am trying to parse a log file that has a thread ID on every line. Any number of threads can be configured, and all of them write to the same log file; I am splitting it into a new file per thread so each thread can be checked later.
Below I capture the thread IDs in a list.
The code below does the job, but I feel it is not efficient. Is there anything faster?

import os

sThdiD = ["abc", "cde\"efg"]            # thread IDs to look for
folderpath = "newdir"
os.makedirs(folderpath, exist_ok=True)  # create the output directory
for line in open(filetoopen):           # filetoopen is the path to the shared log file
    for i in sThdiD:
        if i in line:
            # re-opens the output file for every matching line
            open(folderpath + "/" + i + ".log", "a+").write(line)

Comments (1)

太傻旳人生 2024-12-09 19:15:14

Assuming you can fit the whole log file into memory, I'd keep a dictionary mapping thread IDs to lines written by that thread, and then write out whole files at the end.

from collections import defaultdict

thread_map = defaultdict(list)  # keys are thread IDs; values are lists of log lines
with open(filetoopen) as in_file:
    for line in in_file:
        for i in sThdiD:
            if i in line:
                thread_map[i].append(line)

# Write each per-thread file in one go.
for key, lines in thread_map.items():
    with open(folderpath + "/" + key + ".log", "w") as f:
        f.writelines(lines)

If you can't keep the whole log file in memory, try a multipass solution in which you write each file one at a time.

with open(filetoopen) as in_file:
    for i in sThdiD:
        in_file.seek(0)  # Rewind so each thread ID scans the file from the beginning.
        with open(folderpath + "/" + i + ".log", "w") as out_file:
            for line in in_file:
                if i in line:
                    out_file.write(line)
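
If the list of thread IDs is small, you can also split the log in a single pass by keeping one output file open per thread ID, which avoids both the memory cost of the first approach and the repeated scans of the second. A minimal sketch (not from the original answer), assuming len(sThdiD) stays well below the OS limit on open file descriptors and reusing the question's filetoopen, sThdiD, and folderpath names:

# Single pass: open each per-thread file once up front, write matching lines
# as they are read, and close every handle at the end.
handles = {i: open(folderpath + "/" + i + ".log", "w") for i in sThdiD}
try:
    with open(filetoopen) as in_file:
        for line in in_file:
            for i in sThdiD:
                if i in line:
                    handles[i].write(line)
finally:
    for f in handles.values():
        f.close()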