Python: how to find profiles in a log file
I am new to Python.
I want to find profiles in a log file that meet the following criteria:
- user logged in, user changed password, user logged off within the same second
- those actions (log in, change password, log off) happened one after another with no other entries in between.
The .txt file looks like this:
Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged in| -
Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user changed password| -
Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged off| -
Mon, 22 Aug 2016 13:15:42 +0200|178.57.66.225|iukj| - |user logged in| -
Mon, 22 Aug 2016 13:15:40 +0200|178.57.66.215|klij| - |user logged in| -
Mon, 22 Aug 2016 13:15:49 +0200|178.57.66.215|klij| - |user changed password| -
Mon, 22 Aug 2016 13:15:49 +0200|178.57.66.215|klij| - |user logged off| -
Mon, 22 Aug 2016 13:15:59 +0200|178.57.66.205|plnb| - |user logged in| -
Mon, 22 Aug 2016 13:15:59 +0200|178.57.66.205|plnb| - |user logged in| -
Mon, 22 Aug 2016 13:15:59 +0200|178.57.66.205|plnb| - |user changed password| -
Mon, 22 Aug 2016 13:15:59 +0200|178.57.66.205|plnb| - |user logged off| -
Mon, 22 Aug 2016 13:17:50 +0200|178.57.66.205|qweq| - |user logged in| -
Mon, 22 Aug 2016 13:17:50 +0200|178.57.66.205|qweq| - |user changed password| -
Mon, 22 Aug 2016 13:17:50 +0200|178.57.66.205|qweq| - |user changed profile| -
Mon, 22 Aug 2016 13:17:50 +0200|178.57.66.205|qweq| - |user logged off| -
Mon, 22 Aug 2016 13:19:19 +0200|178.56.66.225|zzad| - |user logged in| -
Mon, 22 Aug 2016 13:19:19 +0200|178.56.66.225|zzad| - |user changed password| -
Mon, 22 Aug 2016 13:19:19 +0200|178.56.66.225|zzad| - |user logged off| -
Mon, 22 Aug 2016 13:20:42 +0200|178.57.67.225|yytr| - |user logged in| -
asdf is a typical profile name from the log file.
Here is what I have done so far:
import collections
import time

# Count how often each (stripped) line occurs in the log.
with open('logfiles.txt') as infile:
    counts = collections.Counter(l.strip() for l in infile)

# Print every distinct line with its count, most frequent first.
for line, count in counts.most_common():
    print(line, count)
time.sleep(10)
I know the logic is to compare the hours, minutes, and seconds; if they are duplicated, I print the profile.
But I am confused about how to get the time from the file.
Any help is very much appreciated.
EDIT:
The output would be:
asdf
klij
plnb
zzad
Comments (2)
I think this is more complicated than you might have imagined. Your sample data is very straightforward, but the description (requirements) implies that the log might have interspersed lines that you need to account for. So I think it's a case of working through the log file sequentially, recording certain actions (log on, log off) and keeping a note of what was observed on the previous lines. This seems to work with your data:
Output:
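A minimal sketch of the sequential approach described above, not the answer's original code: it keeps the last three parsed entries and reports a profile when they are a login, a password change and a logoff for the same profile with an identical timestamp. The names parse_line and find_profiles are made up for illustration, and the field positions are assumed from the sample log. Skipping lines that do not parse is also an assumption, not a stated requirement.

```python
from collections import deque
from datetime import datetime

# The three actions that must appear back to back, in this order.
EXPECTED = ("user logged in", "user changed password", "user logged off")


def parse_line(line):
    """Return (timestamp, profile, action) for one log line, or None if malformed."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5:
        return None
    try:
        # Format assumed from the sample, e.g. "Mon, 22 Aug 2016 13:15:39 +0200".
        stamp = datetime.strptime(fields[0], "%a, %d %b %Y %H:%M:%S %z")
    except ValueError:
        return None
    return stamp, fields[2], fields[4]


def find_profiles(lines):
    """Scan the lines in order and collect profiles matching the strict pattern."""
    window = deque(maxlen=3)  # remembers the last three parsed entries
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # assumption: malformed lines are ignored
        window.append(entry)
        if len(window) < 3:
            continue
        same_second = len({stamp for stamp, _, _ in window}) == 1
        same_profile = len({profile for _, profile, _ in window}) == 1
        actions = tuple(action for _, _, action in window)
        if same_second and same_profile and actions == EXPECTED:
            found.append(window[0][1])
    return found


SAMPLE = [
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged in| -",
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user changed password| -",
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged off| -",
    "Mon, 22 Aug 2016 13:15:42 +0200|178.57.66.225|iukj| - |user logged in| -",
    "Mon, 22 Aug 2016 13:15:40 +0200|178.57.66.215|klij| - |user logged in| -",
    "Mon, 22 Aug 2016 13:15:49 +0200|178.57.66.215|klij| - |user changed password| -",
    "Mon, 22 Aug 2016 13:15:49 +0200|178.57.66.215|klij| - |user logged off| -",
]
print(find_profiles(SAMPLE))  # ['asdf']
```

Run over the full sample from the question, this strict reading reports asdf, plnb and zzad. klij is not reported because its login is stamped 13:15:40 while its other two entries are stamped 13:15:49, so the "same second" condition would have to be relaxed to include it.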
To parse the time, I would use a regex to match the time expression on each line.
Something like this would work.
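A minimal sketch of such a regex, not necessarily the one this answer had in mind: it pulls the HH:MM:SS part out of a log line. The names TIME_RE and extract_time are hypothetical, and the pattern assumes two-digit hours, minutes and seconds as in the sample log.

```python
import re

# Matches the HH:MM:SS part of a timestamp such as "Mon, 22 Aug 2016 13:15:39 +0200".
TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def extract_time(line):
    """Return (hours, minutes, seconds) as ints, or None if the line has no time."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


line = "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged in| -"
print(extract_time(line))  # (13, 15, 39)
```

Returning None for lines without a time makes it easy to skip entries that do not follow the expected format.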
EDIT: I omitted the lines which don't correspond to the formatting.
As far as the profile name is concerned, I would use split() on the most common lines, as @Matthias suggested, and your code would look something like this:
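A rough sketch of the split-and-count idea, under the assumption that the profile is the third "|"-separated field and the timestamp is the first, as in the sample log. The helper names and the minimum=3 threshold are illustrative choices, not part of the original answer.

```python
import collections


def profile_and_second(line):
    """Return (profile, timestamp) for a log line, or None if it is malformed."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5:
        return None
    return fields[2], fields[0]


def profiles_with_repeated_second(lines, minimum=3):
    """List profiles that have at least `minimum` entries with the same timestamp."""
    keys = (profile_and_second(line) for line in lines)
    counts = collections.Counter(key for key in keys if key is not None)
    # Counter keeps first-seen order, so the result follows the log order.
    return [profile for (profile, _), count in counts.items() if count >= minimum]


sample = [
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged in| -",
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user changed password| -",
    "Mon, 22 Aug 2016 13:15:39 +0200|178.57.66.225|asdf| - |user logged off| -",
    "Mon, 22 Aug 2016 13:15:42 +0200|178.57.66.225|iukj| - |user logged in| -",
]
print(profiles_with_repeated_second(sample))  # ['asdf']
```

Counting alone checks neither which actions occurred nor their order, so on the full sample it would also report qweq, whose four entries share one second. Treat it as a first filter, not a complete answer to the stated criteria.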