PowerShell | Editing every row of column A in a CSV file to enforce a maximum character count

Posted on 2025-01-19 02:29:12

$lower = Import-Csv "C:\\Users\\X\\Desktop\\U\\cvv.csv"  
$lower | ForEach-Object {

       src['A']=src['A'].str[:20].str.lower()
    
     } 

$lower |
Export-Csv -Path "C:\\Users\\X\\Desktop\\U\\cvv2.csv"

I tried this approach, but it does not work.

I want any value in column A that is longer than 20 characters to be truncated to a maximum of 20 characters.

Comments (2)

只是一片海 2025-01-26 02:29:13

It looks like you're mixing Python and PowerShell syntax.

You're probably looking for this:

$lower = Import-Csv 'C:\Users\X\Desktop\U\cvv.csv'
$lower | ForEach-Object {
  $_.A = $_.A.Substring(0, 20).ToLower() 
}
# ... Export-Csv command omitted.

However, if there's a chance that some property values have fewer than 20 characters, more work is needed, namely to avoid the exception that the .Substring() method would otherwise throw.

$lower = Import-Csv 'C:\Users\X\Desktop\U\cvv.csv'
$lower | ForEach-Object {
  $val = if ($_.A.Length -gt 20) { $_.A.Substring(0, 20) } else { $_.A }
  $_.A = $val.ToLower() 
}
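  • The following alternative avoids the explicit length check, but it performs poorly if many of the input strings are shorter than 20 characters, because exception handling is expensive. Note that inside a catch block $_ refers to the error record rather than the pipeline object, so the object must be captured in a variable first:

    • $obj = $_; $val = try { $obj.A.Substring(0, 20) } catch { $obj.A }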
  • In PowerShell (Core) 7+, you can shorten the if statement to:

    • $_.A.Length -gt 20 ? $_.A.Substring(0, 20) : $_.A
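
Since the snippets above omit the Export-Csv step, here is a minimal end-to-end sketch of the if-based variant. The temp-file names and the sample rows are made up for illustration, and -NoTypeInformation is added on the assumption that you don't want the #TYPE header line that Windows PowerShell 5.1 writes by default:

```powershell
# Hypothetical sample data, written to the temp folder so the sketch is self-contained.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'cvv-sample.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'cvv-sample-out.csv'
@'
A,B
SHORT VALUE,1
THIS VALUE IS LONGER THAN TWENTY CHARACTERS,2
'@ | Set-Content -Path $inPath

$maxLen = 20
Import-Csv $inPath | ForEach-Object {
  # Truncate only when needed, so that .Substring() never throws.
  $val = if ($_.A.Length -gt $maxLen) { $_.A.Substring(0, $maxLen) } else { $_.A }
  $_.A = $val.ToLower()
  $_   # Pass the modified object on to Export-Csv.
} | Export-Csv -Path $outPath -NoTypeInformation

# Read the result back to check column A.
$result = Import-Csv $outPath
$result.A
```

Emitting $_ at the end of the ForEach-Object block is what lets the whole thing run as a single pipeline instead of storing the rows in a variable first.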

Optional reading: comparing the performance of various substring-extraction approaches.

  • There are several approaches to extracting substrings in PowerShell, and they vary widely with respect to verbosity and performance:

    • The two aspects aren't related, however, and, in fact, the most verbose approach is fastest in this case.

    • Broadly speaking, the approaches can be classified as:

      • Use of the .NET .Substring() method
      • Use of array slicing, i.e. treating a string as a character array and extracting a sub-array from it
      • Use of regex operations via the -replace operator
  • Below are the results of benchmarks, which give a rough sense of relative performance:

    • Performance measurements in PowerShell aren't an exact science, and the results depend on many factors, not least the host hardware; the benchmarks below average 50 runs to get a better sense, and it is the relative performance, reflected in the Factor column, that is of interest (1.00 reflects the fastest time; all other values are multiples of it).

    • Substring extraction of (up to) 20 chars. is performed on 1,000 strings, half of which are longer than that, half of which are shorter.

  • Important: The benchmarks juxtapose conditional solutions for .Substring() calls with unconditional -replace and array-slicing solutions, which skews the results - to compare true substring-extraction performance, the latter two approaches need to be modified to use conditionals too.

    • The reason for using conditional processing only for the .Substring() approach is that it is a necessity there - in order to avoid exceptions - whereas the appeal of the other approaches is concision, i.e. not having to use conditionals.
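
To illustrate what "modified to use conditionals too" could look like, here is a rough sketch that applies the same length guard to one representative of each of the three categories. It is not part of the benchmark below; the property names and the two sample strings are chosen purely for illustration:

```powershell
# One string longer and one shorter than the 20-character limit.
$strs = ('x' * 30), ('y' * 10)

$results = foreach ($s in $strs) {
  [pscustomobject]@{
    # 1. .NET .Substring(), guarded by a length check
    Substring = if ($s.Length -gt 20) { $s.Substring(0, 20) } else { $s }
    # 2. Array slicing, with the same guard
    Slicing   = if ($s.Length -gt 20) { -join $s[0..19] } else { $s }
    # 3. Regex via -replace (lookbehind variant), with the same guard
    Replace   = if ($s.Length -gt 20) { $s -replace '(?<=^.{20}).+' } else { $s }
  }
}

$results | Format-Table
```

With the guard in place, strings that are already short enough skip the slicing and regex work entirely, which is the part the unconditional versions pay for on every input.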

Benchmark results:

  • Results from running in Windows PowerShell v5.1 on a Windows 10 machine:
Factor Secs (50-run avg.) Command
------ ------------------ -------
1.00   0.001              # .Substring + if...
1.71   0.002              # .Substring + [math]::Min()...
5.24   0.006              # -replace + capture group...
8.32   0.010              # -replace + lookbehind...
160.84 0.198              # .Substring + try...
229.07 0.281              # array slicing + [string]::new()...
294.62 0.362              # array slicing + -join ...
  • Results from running in PowerShell (Core) 7.3.0 on the same Windows 10 machine:
Factor Secs (50-run avg.) Command
------ ------------------ -------
1.00   0.002              # .Substring + ternary conditional…
1.09   0.002              # .Substring + if…
2.98   0.005              # .Substring + [math]::Min()…
3.79   0.006              # -replace + capture group…
6.64   0.011              # -replace + lookbehind…
132.11 0.215              # array slicing + [string]::new()…
160.99 0.262              # array slicing + -join …
163.68 0.266              # .Substring + try…
  • Summary:
    • The .Substring()-based approaches are by far the fastest - except if combined with try / catch (exception handling is expensive).
      • The v7+ ternary conditional (? :) performs about the same as the equivalent if statement.
    • The -replace-based solutions are slower by a factor of 3-5 with the capture-group variant, and about twice as slow as that with the variant that uses a look-behind assertion.
    • By far the slowest are the array-slicing approaches and the solution involving try / catch, by two orders of magnitude.

Benchmark source code:

  • To run these benchmarks yourself, you must download function Time-Command from this Gist.

    • Assuming you have looked at the linked Gist's source code to ensure that it is safe (which I can personally assure you of, but you should always check), you can install it directly as follows:

      irm https://gist.github.com/mklement0/9e1f13978620b09ab2d15da5535d1b27/raw/Time-Command.ps1 | iex
      
# Create 1000 strings, half of which are longer than 20 chars., and half shorter.
$strs = , ('x' * 30) * 500 + , ('y' * 10) * 500

# Construct an array of script blocks with the various
# substring-extraction methods.
$cmds = 
{ # -replace + capture group
  foreach ($s in $strs) {
    $s -replace '^(.{20}).+', '$1'
  }
}, 
{ # -replace + lookbehind
  foreach ($s in $strs) {
    $s -replace '(?<=^.{20}).+'
  }
},
{ # .Substring + try
  foreach ($s in $strs) {
    try { $s.Substring(0, 20) } catch { $s } # inside catch, $_ would be the error record
  }
},
{ # .Substring + if
  foreach ($s in $strs) {
    if ($s.Length -gt 20) { $s.Substring(0, 20) } else { $s }
  }
},
{ # .Substring + [math]::Min()
  foreach ($s in $strs) {
    $s.Substring(0, [Math]::Min($s.Length, 20))
  }
},
{ # array slicing + -join 
  foreach ($s in $strs) {
    -join $s[0..19]
  }
},
{ # array slicing + [string]::new()
  foreach ($s in $strs) {
    [string]::new($s[0..19])
  }
}

# PowerShell (Core): add variant with ternary conditional.
if ($IsCoreClr) {
  # Note: The script block must be constructed *as a string*,
  #       to avoid breaking the parsing stage of the script in Windows PowerShell.
  $cmds += [scriptblock]::Create(@'
  # .Substring + ternary conditional
  foreach ($s in $strs) {
    $s.Length -gt 20 ? $s.Substring(0, 20) : $s
  }  
'@)
}

# Compare the performance of various substring extraction methods,
# averaged over 50 runs.
Time-Command -Count 50 $cmds
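
If you would rather not download anything, a much cruder comparison is possible with the built-in Measure-Command cmdlet. The following is only a sketch under that assumption: it is not the Time-Command function from the Gist, it covers just two of the methods, and the names $candidates and $timings are made up for illustration:

```powershell
# Same input shape as above: 500 strings of 30 chars. and 500 of 10 chars.
$strs = @('x' * 30) * 500 + @('y' * 10) * 500
$runs = 50

# Two of the methods from the benchmark, keyed by a descriptive label.
$candidates = [ordered]@{
  '.Substring + if'          = { foreach ($s in $strs) { if ($s.Length -gt 20) { $s.Substring(0, 20) } else { $s } } }
  '-replace + capture group' = { foreach ($s in $strs) { $s -replace '^(.{20}).+', '$1' } }
}

$timings = foreach ($name in $candidates.Keys) {
  $block = $candidates[$name]
  # Measure-Command discards the block's output and returns the elapsed time.
  $total = (Measure-Command { for ($i = 0; $i -lt $runs; $i++) { & $block } }).TotalSeconds
  [pscustomobject]@{ Command = $name; AvgSecs = $total / $runs }
}

$timings | Sort-Object AvgSecs
```

Absolute numbers from such a loop will vary from machine to machine and run to run, so only the ordering and rough ratios are worth reading.
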
别靠近我心 2025-01-26 02:29:13

I would personally use the index operator [ ] in combination with the range operator ..:

Import-Csv "C:\\Users\\X\\Desktop\\U\\cvv.csv" | ForEach-Object {
    $_.A = [string]::new($_.A[0..19]).ToLower() # Update the `A` value
    $_ # Output the object
} | Export-Csv -Path "C:\\Users\\X\\Desktop\\U\\cvv2.csv"

It handles strings that are shorter or longer than the desired length:

PS /> 'HELLO WORLD', 'ONLY 20 CHARS LENGTH ALLOWED' | ForEach-Object {
    [string]::new($_[0..19]).ToLower()
}


hello world
only 20 chars length