Faster way to parse a file for all list elements and append lines to new files based on the list element

Posted 2024-12-02 19:15:14

I am trying to parse a log file that has a thread ID on every line. Any number of threads can be configured, and all of them write to the same log file; I am splitting it into a new file per thread so each thread can be checked later.
Below I capture the thread IDs in a list.
The code below does the job, but I feel it is not efficient. Is there anything faster?

import os

sThdiD = ["abc", "cde\"efg"]            # thread IDs to look for
folderpath = "newdir"
os.makedirs(folderpath, exist_ok=True)  # create the output directory
for line in open(filetoopen):           # filetoopen is the path to the shared log file
    for i in sThdiD:
        if i in line:
            # re-opens the output file for every matching line
            open(folderpath + "/" + i + ".log", "a+").write(line)

Comments (1)

太傻旳人生 2024-12-09 19:15:14

Assuming you can fit the whole log file into memory, I'd keep a dictionary mapping thread IDs to lines written by that thread, and then write out whole files at the end.

from collections import defaultdict

thread_map = defaultdict(list)  # keys are thread IDs; values are lists of log lines
with open(filetoopen) as in_file:
    for line in in_file:
        for i in sThdiD:
            if i in line:
                thread_map[i].append(line)

# Write each per-thread file in one go.
for key, lines in thread_map.items():
    with open(folderpath + "/" + key + ".log", "w") as f:
        f.writelines(lines)

If you can't keep the whole log file in memory, try a multipass solution in which you write each file one at a time.

with open(filetoopen) as in_file:
    for i in sThdiD:
        in_file.seek(0)  # Rewind so each thread ID scans the file from the beginning.
        with open(folderpath + "/" + i + ".log", "w") as out_file:
            for line in in_file:
                if i in line:
                    out_file.write(line)
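
If the list of thread IDs is small, you can also split the log in a single pass by keeping one output file open per thread ID, which avoids both the memory cost of the first approach and the repeated scans of the second. A minimal sketch (not from the original answer), assuming len(sThdiD) stays well below the OS limit on open file descriptors and reusing the question's filetoopen, sThdiD, and folderpath names:

# Single pass: open each per-thread file once up front, write matching lines
# as they are read, and close every handle at the end.
handles = {i: open(folderpath + "/" + i + ".log", "w") for i in sThdiD}
try:
    with open(filetoopen) as in_file:
        for line in in_file:
            for i in sThdiD:
                if i in line:
                    handles[i].write(line)
finally:
    for f in handles.values():
        f.close()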