Parsing a comma-separated file with PowerShell

Posted on 2024-09-19 18:58:33

I have a text file which contains several lines, each of which is a comma separated string. The format of each line is:

<Name, Value, Bitness, OSType>

Bitness and OSType are optional.

For example the file can be like this:

Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
....
....

I want to parse each line into 4 variables and perform some operation on them. This is the PowerShell script that I use:

Get-Content $inputFile | ForEach-Object {
    $Line = $_;

    $_var = "";
    $_val = "";
    $_bitness = "";
    $_ostype = "";

    $envVarArr = $Line.Split(",");
    For($i=0; $i -lt $envVarArr.Length; $i++) {
        Switch ($i) {
            0 {$_var = $envVarArr[$i].Trim();}
            1 {$_val = $envVarArr[$i].Trim();}
            2 {$_bitness = $envVarArr[$i].Trim();}
            3 {$_ostype = $envVarArr[$i].Trim();}
        }                                    
    }
    # perform some operation using the 4 temporary variables
}

However, I wanted to know if it is possible to do this using regex in PowerShell. Would you please provide sample code for doing that? Note that the 3rd and 4th values in each line can be optionally empty.
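
For reference, a minimal sketch of what a regex-based version could look like. The pattern, the sample lines, and the property names `Var`, `Val`, `Bitness`, and `OSType` are assumptions made for illustration. It also assumes that fields never contain commas or quotes; it is not a general CSV parser.

```powershell
# Two required fields followed by up to two optional ones.
# Each field is "anything except a comma"; whitespace is trimmed afterwards.
$pattern = '^(?<var>[^,]*),(?<val>[^,]*)(?:,(?<bitness>[^,]*))?(?:,(?<ostype>[^,]*))?$'

# Sample lines in the format described above (illustrative data).
$lines = @(
    'Name1, Value1, X64, Windows7'
    'Name4, Value3, , Windows7'
    'Name4, Value3, X64'
)

$parsed = foreach ($line in $lines) {
    if ($line -match $pattern) {
        # $Matches has no entry for an optional group that did not take part in
        # the match; "$(...)" turns the resulting $null into an empty string.
        [pscustomobject]@{
            Var     = "$($Matches['var'])".Trim()
            Val     = "$($Matches['val'])".Trim()
            Bitness = "$($Matches['bitness'])".Trim()
            OSType  = "$($Matches['ostype'])".Trim()
        }
    }
}

$parsed | Format-Table
```

Lines that do not match the pattern (for example, lines with more than four fields) are silently skipped in this sketch.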

Comments (4)

夏末的微笑 2024-09-26 18:58:34

Wouldn't it be better to use Import-Csv, which does all this (and more reliably) for you?

请远离我 2024-09-26 18:58:34

As Tim suggests, you can use Import-Csv. The difference is that Import-Csv reads from a file, whereas ConvertFrom-Csv, used here, parses text that is already in memory:

@"
Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
"@ | ConvertFrom-Csv -header var, val, bitness, ostype

# Result

var   val    bitness                                 ostype  
---   ---    -------                                 ------  
Name1 Value1 X64                                     Windows7
Name2 Value2 X86                                     XP      
Name3 Value3 X64                                     XP      
Name4 Value3                                         Windows7
Name4 Value3 X64 /*Note that no comma follows X64 */         
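
A minimal sketch of how the resulting objects might then be consumed, one row at a time, in place of the four temporary variables from the question. The `'any'` fallback and the output format are assumptions made for illustration. Values are trimmed explicitly so the code does not depend on how the CSV parser treats whitespace around fields.

```powershell
$rows = @"
Name1, Value1, X64, Windows7
Name4, Value3, , Windows7
Name4, Value3, X64
"@ | ConvertFrom-Csv -Header var, val, bitness, ostype

$summary = foreach ($row in $rows) {
    # A field that is absent from the line may come back as $null;
    # "$(...)" turns that into an empty string before trimming.
    $var     = "$($row.var)".Trim()
    $val     = "$($row.val)".Trim()
    $bitness = "$($row.bitness)".Trim()
    $ostype  = "$($row.ostype)".Trim()

    # Hypothetical fallback for the two optional fields.
    if (-not $bitness) { $bitness = 'any' }
    if (-not $ostype)  { $ostype  = 'any' }

    '{0}={1} [{2}/{3}]' -f $var, $val, $bitness, $ostype
}

$summary
```
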
你曾走过我的故事 2024-09-26 18:58:34

This is slower than molasses, but after spending 20 years cobbling together a dozen or more partial solutions, I decided to tackle the problem definitively. Of course, all sorts of parser libraries have become available in the meantime.


function SplitDelim($Line, $Delim=",", $Default=$Null, $Size=$Null) {

    # 4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # "4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # ,4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a

    $Field = ""
    $Fields = @()
    $Quotes = 0
    $State = 'NOF' # INFIELD, INQFIELD, NOFIELD; start outside any field so a line may begin with a quoted field
    $NextState = $Null

    for ($i=0; $i -lt $Line.length; $i++) {
        $Char = $Line.substring($i,1)

        if($State -eq 'NOF') {

            # NOF and Char is Quote
            # NextState becomes INQ
            if ($Char -eq '"') {
                $NextState = 'INQ'
            }

            # NOF and Char is Delim
            # NextState becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

            # NOF and Char is not Delim, Quote or space
            # NextState becomes INF
            elseif ($Char -ne " ") {
                $NextState = 'INF'
            }

        } elseif ($State -eq 'INF') {

            # INF and Char is Quote
            # Error
            if ($Char -eq '"') {
                return $Null
            }

            # INF and Char is Delim
            # NextState Becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

        } elseif ($State -eq 'INQ') {

            # INQ and Char is Delim and consecutive Quotes mod 2 is 0
            # NextState is NOF
            if ($Char -eq $Delim -and $Quotes % 2 -eq 0) {
                $NextState = 'NOF'
                $Char = $Null
            }
        }

        # Track consecutive quote for purposes of mod 2 logic
        if ($Char -eq '"') {
            $Quotes++
        } elseif ($NextState -eq 'INQ') {
            $Quotes = 0
        }

        # Normal duty
        if ($State -ne 'NOF' -or $NextState -ne 'NOF') {
            $Field += $Char
        }

        # Push to $Fields and clear
        if ($NextState -eq 'NOF') {
            $Fields += (IfBlank $Field $Default)
            $Field = ''
        }

        if ($NextState) {
            $State = $NextState
            $NextState = $Null
        }
    }

    $Fields += (IfNull $Field $Default)

    while ($Size -and $Fields.count -lt $Size) {
        $Fields += $Default
    }

    return $Fields
}
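
SplitDelim calls two helpers, IfBlank and IfNull, that the post does not define. A minimal sketch of what they might look like, assuming each simply substitutes the default value; both definitions are guesses for illustration, not the author's originals.

```powershell
# Hypothetical helper: return $Default when $Value is $null, otherwise $Value.
function IfNull($Value, $Default) {
    if ($null -eq $Value) { return $Default }
    return $Value
}

# Hypothetical helper: return $Default when $Value is $null, empty,
# or consists only of whitespace; otherwise return $Value unchanged.
function IfBlank($Value, $Default) {
    if ([string]::IsNullOrWhiteSpace($Value)) { return $Default }
    return $Value
}

# Usage: a whitespace-only field counts as blank, but it is not null.
IfBlank '   ' 'n/a'
IfNull  '   ' 'n/a'
```
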
动次打次papapa 2024-09-26 18:58:33

You can specify an alternate column header row for the imported file with the -Header parameter of the Import-Csv cmdlet:

Import-Csv .\test.txt -Header Col1,Col2,Bitness,OSType
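
A minimal, self-contained sketch of the same call followed by a filter on one of the named columns. The temporary file, the sample lines, and the X64 filter are assumptions made for illustration.

```powershell
# Write sample lines to a temporary file so the example is self-contained.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value @(
    'Name1, Value1, X64, Windows7'
    'Name2, Value2, X86, XP'
    'Name4, Value3, , Windows7'
)

# The file has no header row, so the column names are supplied here.
$rows = Import-Csv -Path $path -Header Col1, Col2, Bitness, OSType

# Keep only the rows whose third column is X64 (trimmed defensively).
$x64 = @($rows | Where-Object { "$($_.Bitness)".Trim() -eq 'X64' })
$x64 | ForEach-Object { "$($_.Col1)".Trim() }

Remove-Item -Path $path
```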