How to extract certain values from a collection of text files

Posted on 2024-10-10 17:52:18

Say, I have a collection of text files I need to process (e.g. search for a certain label and extract the value). What would be the general way to tackle the problem?

I also read this: "Retrieve Variable Values from Python", but it does not seem applicable to some of the cases I face (for example, where a tab is used as the delimiter instead of a colon).

I just want to know the most appropriate way to tackle the problem regardless of the language used.

Say I have something like:

Name: Backup Operators  SID: S-1-5-32-551   Caption: COMMSVR21\Backup Operators Description: Backup Operators can override security restrictions for the sole purpose of backing up or restoring files  Domain: COMMSVR21   
COMMERCE/cabackup
COMMSVR21/sys5erv1c3

I want to be able to access/retrieve the values of Backup Operators and get COMMERCE/cabackup & COMMSVR21/sys5erv1c3 in return.

How would you do it?

What I thought of is to read the whole text file, run a regex search, and probably add some if/else statements. Is this effective? Or would it be better to parse the text file into some kind of array and retrieve the values from that? I'm not sure.

As another example:

        GPO: xxx & yyy Servers
            Policy:            MaximumPasswordAge
            Computer Setting:  45

How would you check the text file for Policy = MaximumPasswordAge and return the value 45?

Thanks!

p/s -- I might be doing this in Python (zero knowledge, so picking it up on the fly) or Java

pp/s -- I just realised that there's no spoiler tag. Hmm

--

Examples of the logs. A log with directory permissions:

C:\:
    BUILTIN\Administrators  Allowed:    Full Control
    NT AUTHORITY\SYSTEM Allowed:    Full Control
    BUILTIN\Users   Allowed:    Read & Execute
    BUILTIN\Users   Allowed:    Special Permissions: 
            Create Folders
    BUILTIN\Users   Allowed:    Special Permissions: 
            Create Files
    \Everyone   Allowed:    Read & Execute
    (No auditing)

C:\WINDOWS:
    BUILTIN\Users   Allowed:    Read & Execute
    BUILTIN\Power Users Allowed:    Modify
    BUILTIN\Power Users Allowed:    Special Permissions: 
            Delete
    BUILTIN\Administrators  Allowed:    Full Control
    NT AUTHORITY\SYSTEM Allowed:    Full Control
    (No auditing)

Another one with the following:

    Audit Policy
    ------------
        GPO: xxx & yyy Servers
            Policy:            AuditPolicyChange
            Computer Setting:  Success

        GPO: xxx & yyy Servers
            Policy:            AuditPrivilegeUse
            Computer Setting:  Failure

        GPO: xxx & yyy Servers
            Policy:            AuditDSAccess
            Computer Setting:  No Auditing

This is the tab-delimited one:

User Name   Full Name   Description Account Type    SID Domain  PasswordIsChangeable    PasswordExpires PasswordRequired    AccountDisabled AccountLocked   Last Login
53cuR1ty        Built-in account for administering the computer/domain  512 S-1-5-21-2431866339-2595301809-2847141052-500   COMMSVR21   True    False   True    False   False   09/11/2010 7:14:27 PM
ASPNET  ASP.NET Machine Account Account used for running the ASP.NET worker process (aspnet_wp.exe) 512 

Comments (1)

心房的律动 2024-10-17 17:52:18

I always shove Python into people's faces ;)

I recommend looking at Regex: http://docs.python.org/howto/regex.html, as it might fit your needs. I won't do it for you (because I can't), but I know this will work if your files are colon-delimited key/value pairs separated by newline characters. Here's a quick start (which might work):

regex = '(.*):( *)(.*)\n'

This matches three groups (hopefully): A group before the colon (group 1), the spaces (group 2, which can be thrown away), and the text between that and a new line (group 3).
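As a minimal sketch of how that pattern might be applied to the Policy / Computer Setting example from the question: the helper name `find_setting` and the variable names are made up for illustration, and the sample text is just the three lines quoted in the question. It assumes each "Computer Setting" line belongs to the most recent "Policy" line above it.

```python
import re

# The pattern from above, written as a raw string and compiled once.
PAIR = re.compile(r'(.*):( *)(.*)\n')

def find_setting(text, policy_name):
    """Return the 'Computer Setting' value that follows the given Policy, or None.

    Hypothetical helper: assumes one 'key: value' pair per line.
    """
    current_policy = None
    for key, _spaces, value in PAIR.findall(text):
        key, value = key.strip(), value.strip()  # drop the indentation
        if key == "Policy":
            current_policy = value
        elif key == "Computer Setting" and current_policy == policy_name:
            return value
    return None

sample = (
    "        GPO: xxx & yyy Servers\n"
    "            Policy:            MaximumPasswordAge\n"
    "            Computer Setting:  45\n"
)
print(find_setting(sample, "MaximumPasswordAge"))
```

Note that the first group is greedy, so on a line with several colons (such as the `Name: ... SID: ...` line in the question) it splits at the last colon. Lines like that would probably need their own handling.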

Play with that (I don't want to have a regex aneurysm, so this is as far as I can help for now). Good luck!
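For the tab-delimited file, a regex is probably not needed at all. A rough sketch, assuming Python's standard `csv` module and a header row like the one in the question; the function name `rows_by_user` is invented here, and the sample is shortened to a few of the original columns:

```python
import csv
import io

def rows_by_user(text):
    """Index tab-delimited rows by their 'User Name' column.

    Hypothetical helper: assumes the first line is the header row.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["User Name"]: row for row in reader}

# Shortened sample: only some of the columns from the question are kept.
sample = (
    "User Name\tFull Name\tDescription\tAccount Type\tDomain\n"
    "53cuR1ty\t\tBuilt-in account for administering the computer/domain"
    "\t512\tCOMMSVR21\n"
)

users = rows_by_user(sample)
print(users["53cuR1ty"]["Domain"])
```

When reading from a real file, the same reader can be given the open file object instead of `io.StringIO`.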
