How to extract certain values from a collection of text files
Say, I have a collection of text files I need to process (e.g. search for a certain label and extract the value). What would be the general way to tackle the problem?
I also read this: "Retrieve Variable Values from Python", but it does not seem applicable to some of the cases I face (for example, where a tab is used as the delimiter instead of :).
I just want to know the most appropriate way to tackle the problem regardless of the language used.
Say I have something like:
Name: Backup Operators SID: S-1-5-32-551 Caption: COMMSVR21\Backup Operators Description: Backup Operators can override security restrictions for the sole purpose of backing up or restoring files Domain: COMMSVR21
COMMERCE/cabackup
COMMSVR21/sys5erv1c3
I want to be able to access/retrieve the values of Backup Operators and get COMMERCE/cabackup & COMMSVR21/sys5erv1c3 in return.
How would you do it?
My idea is to read the whole text file, run a regex search, and probably add some if/else statements. Is this effective? Or would it be better to parse the text file into some kind of array and retrieve the value from that? I'm not sure.
As another example, say:
GPO: xxx & yyy Servers
Policy: MaximumPasswordAge
Computer Setting: 45
How would you check the text file for Policy = MaximumPasswordAge and return the value 45?
Thanks!
p/s -- I might be doing this in Python (zero knowledge, so picking it up on the fly) or Java
pp/s -- I just realised that there's no spoiler tag. Hmm
--
E.g. of the logs:
Log with Directory Permissions:
C:\:
BUILTIN\Administrators Allowed: Full Control
NT AUTHORITY\SYSTEM Allowed: Full Control
BUILTIN\Users Allowed: Read & Execute
BUILTIN\Users Allowed: Special Permissions:
Create Folders
BUILTIN\Users Allowed: Special Permissions:
Create Files
\Everyone Allowed: Read & Execute
(No auditing)
C:\WINDOWS:
BUILTIN\Users Allowed: Read & Execute
BUILTIN\Power Users Allowed: Modify
BUILTIN\Power Users Allowed: Special Permissions:
Delete
BUILTIN\Administrators Allowed: Full Control
NT AUTHORITY\SYSTEM Allowed: Full Control
(No auditing)
Another one with the following:
Audit Policy
------------
GPO: xxx & yyy Servers
Policy: AuditPolicyChange
Computer Setting: Success
GPO: xxx & yyy Servers
Policy: AuditPrivilegeUse
Computer Setting: Failure
GPO: xxx & yyy Servers
Policy: AuditDSAccess
Computer Setting: No Auditing
This is the tab delimited one:
User Name Full Name Description Account Type SID Domain PasswordIsChangeable PasswordExpires PasswordRequired AccountDisabled AccountLocked Last Login
53cuR1ty Built-in account for administering the computer/domain 512 S-1-5-21-2431866339-2595301809-2847141052-500 COMMSVR21 True False True False False 09/11/2010 7:14:27 PM
ASPNET ASP.NET Machine Account Account used for running the ASP.NET worker process (aspnet_wp.exe) 512
I always shove Python into people's faces ;)
I recommend looking at Regex: http://docs.python.org/howto/regex.html, as it might fit your needs. I won't do it for you (because I can't), but I know this will work if your files are colon-delimited key/value pairs separated by newline characters. Here's a quick start (which might work):
This matches three groups (hopefully): A group before the colon (group 1), the spaces (group 2, which can be thrown away), and the text between that and a new line (group 3).
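A pattern matching that description might look like the sketch below. It is not the snippet from the original answer, only an illustration that assumes each line holds one colon-delimited key/value pair. The name `KEY_VALUE` and the sample text are made up for the example.

```python
import re

# Hypothetical pattern (an assumption, not the answer's original code):
#   group 1 = text before the first colon
#   group 2 = spaces or tabs after the colon (can be thrown away)
#   group 3 = the rest of the line, up to the newline
KEY_VALUE = re.compile(r"^([^:\n]+):([ \t]*)(.*)$", re.MULTILINE)

sample = (
    "GPO: xxx & yyy Servers\n"
    "Policy: MaximumPasswordAge\n"
    "Computer Setting: 45\n"
)

# Collect (key, value) tuples, ignoring the whitespace group.
pairs = [(m.group(1).strip(), m.group(3).strip())
         for m in KEY_VALUE.finditer(sample)]
print(pairs)
```

Lines without a colon (headings, separator rows such as `------------`) simply do not match, so they are skipped rather than raising an error.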
Play with that (I don't want to have a regex aneurysm, so this is as far as I can help for now). Good luck!
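Applied to the GPO example from the question, the same idea can be turned into a small lookup. This is only a sketch: it assumes each record is a run of `GPO:`, `Policy:` and `Computer Setting:` lines in that order, and `find_setting` is a hypothetical helper name, not something from a library.

```python
import re

# One "key: value" pair per line; leading whitespace is tolerated.
LINE = re.compile(r"^\s*([^:\n]+):[ \t]*(.*)$")

def find_setting(text, policy):
    """Return the 'Computer Setting' that follows the given Policy, or None."""
    current_policy = None
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # heading, separator, or blank line
        key, value = match.group(1).strip(), match.group(2).strip()
        if key == "Policy":
            current_policy = value
        elif key == "Computer Setting" and current_policy == policy:
            return value
    return None

log = """Audit Policy
------------
GPO: xxx & yyy Servers
Policy: MaximumPasswordAge
Computer Setting: 45
GPO: xxx & yyy Servers
Policy: AuditDSAccess
Computer Setting: No Auditing
"""

print(find_setting(log, "MaximumPasswordAge"))
```

For the tab-delimited file, a regex is probably unnecessary: Python's standard `csv` module can read it with `csv.reader(f, delimiter="\t")`, which yields one list of fields per row.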