PowerShell | Editing each row in column A of a CSV file to enforce a maximum character count

Posted on 2025-01-19 02:29:12

$lower = Import-Csv "C:\\Users\\X\\Desktop\\U\\cvv.csv"  
$lower | ForEach-Object {

       src['A']=src['A'].str[:20].str.lower()
    
     } 

$lower |
Export-Csv -Path "C:\\Users\\X\\Desktop\\U\\cvv2.csv"

I tried this approach, but it doesn't work.

I want any value in column A that is longer than 20 characters to be truncated to a maximum of 20 characters.


只是一片海 2025-01-26 02:29:13

It looks like you're mixing Python and PowerShell syntax.

You're probably looking for this:

$lower = Import-Csv 'C:\Users\X\Desktop\U\cvv.csv'
$lower | ForEach-Object {
  $_.A = $_.A.Substring(0, 20).ToLower() 
}
# ... Export-Csv command omitted.

However, if there's a chance that some property values have fewer than 20 characters, more work is needed, namely to avoid the exception that the .Substring() method would otherwise throw.

$lower = Import-Csv 'C:\Users\X\Desktop\U\cvv.csv'
$lower | ForEach-Object {
  $val = if ($_.A.Length -gt 20) { $_.A.Substring(0, 20) } else { $_.A }
  $_.A = $val.ToLower() 
}

The following is a shorter alternative, but will perform poorly if many of the input strings are shorter than 20 characters, because exception handling is expensive in terms of performance:
- try { $_.A.Substring(0, 20) } catch { $_.A }
In PowerShell (Core) 7+, you can shorten the if statement to:
- $_.A.Length -gt 20 ? $_.A.Substring(0, 20) : $_.A
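
The pieces above can be combined into an end-to-end sketch that also includes the export step. This is only an illustration under stated assumptions: the helper name Get-TruncatedLower, the sample rows, and the temporary file names are hypothetical and not part of the original answer. It uses the [Math]::Min() form so that short values don't trigger an exception, and it passes -NoTypeInformation so that Windows PowerShell 5.1 doesn't write a #TYPE header line to the output file.

```powershell
# Hypothetical helper (not from the original answer): returns at most
# $MaxLength characters of $Text, converted to lowercase.
function Get-TruncatedLower {
    param(
        [string] $Text,
        [int] $MaxLength = 20
    )
    # [Math]::Min() caps the requested length at the actual length,
    # so .Substring() never throws for short (or empty) values.
    $Text.Substring(0, [Math]::Min($Text.Length, $MaxLength)).ToLower()
}

# Sample data in temporary files, so the sketch doesn't touch real data.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'cvv-sample.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'cvv-sample-out.csv'
@'
A,B
SHORT VALUE,1
THIS VALUE IS LONGER THAN TWENTY CHARACTERS,2
'@ | Set-Content -Path $inPath

Import-Csv $inPath |
    ForEach-Object {
        $_.A = Get-TruncatedLower $_.A
        $_   # Pass the modified row on to Export-Csv.
    } |
    Export-Csv -Path $outPath -NoTypeInformation

# Show the resulting file.
Get-Content $outPath
```

To apply this to the file from the question, replace $inPath and $outPath with the real input and output paths.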

Optional reading: comparing the performance of various substring-extraction approaches.

There are several approaches to extracting substrings in PowerShell, and they vary widely with respect to verbosity and performance:
- The two aspects aren't related, however, and, in fact, the most verbose approach is fastest in this case.
- Broadly speaking, the approaches can be classified as:
  - Use of the .NET .Substring() method
  - Use of array slicing, i.e. treating a string as a character array and extracting a sub-array from it
  - Use of regex operations via the -replace operator
Below are the results of benchmarks, which give a rough sense of relative performance:
- Performance measurements in PowerShell aren't an exact science, and the results depend on many factors, not least the host hardware. The benchmarks below average 50 runs to get a better sense; what matters is the relative performance reflected in the Factor column (1.00 is the fastest time, and all other values are multiples of it).
- Substring extraction of (up to) 20 chars. is performed on 1,000 strings, half of which are longer than that, half of which are shorter.
Important: The benchmarks juxtapose conditional solutions for .Substring() calls with unconditional -replace and array-slicing solutions, which skews the results - to compare true substring-extraction performance, the latter two approaches need to be modified to use conditionals too.
- The reason for using conditional processing only for the .Substring() approach is that it is a necessity there - in order to avoid exceptions - whereas the appeal of the other approaches is concision, i.e. not having to use conditionals.

Benchmark results:

Results from running in Windows PowerShell v5.1 on a Windows 10 machine:

Factor Secs (50-run avg.) Command
------ ------------------ -------
1.00   0.001              # .Substring + if...
1.71   0.002              # .Substring + [math]::Min()...
5.24   0.006              # -replace + capture group...
8.32   0.010              # -replace + lookbehind...
160.84 0.198              # .Substring + try...
229.07 0.281              # array slicing + [string]::new()...
294.62 0.362              # array slicing + -join ...

Results from running in PowerShell (Core) 7.3.0 on the same Windows 10 machine:

Factor Secs (50-run avg.) Command
------ ------------------ -------
1.00   0.002              # .Substring + ternary conditional…
1.09   0.002              # .Substring + if…
2.98   0.005              # .Substring + [math]::Min()…
3.79   0.006              # -replace + capture group…
6.64   0.011              # -replace + lookbehind…
132.11 0.215              # array slicing + [string]::new()…
160.99 0.262              # array slicing + -join …
163.68 0.266              # .Substring + try…

Summary:
- The .Substring()-based approaches are by far the fastest - except if combined with try / catch (exception handling is expensive).
  - The v7+ ternary conditional (? :) performs about the same as the equivalent if statement.
- The -replace-based solutions are slower by a factor of roughly 3-5 with the capture-group variant, and nearly twice as slow again with the variant that uses a look-behind assertion.
- By far the slowest are the array-slicing approaches and the solution involving try / catch, by two orders of magnitude.

Benchmark source code:

To run these benchmarks yourself, you must first download the Time-Command function from this Gist.
- Assuming you have looked at the linked Gist's source code to ensure that it is safe (which I can personally assure you of, but you should always check), you can install it directly as follows:
```
irm https://gist.github.com/mklement0/9e1f13978620b09ab2d15da5535d1b27/raw/Time-Command.ps1 | iex
```

# Create 1000 strings, half of which longer than 20 chars., and half shorter.
$strs = , ('x' * 30) * 500 + , ('y' * 10) * 500

# Construct an array of script blocks with the various
# substring-extraction methods.
$cmds = 
{ # -replace + capture group
  foreach ($s in $strs) {
    $s -replace '^(.{20}).+', '$1'
  }
}, 
{ # -replace + lookbehind
  foreach ($s in $strs) {
    $s -replace '(?<=^.{20}).+'
  }
},
{ # .Substring + try
  foreach ($s in $strs) {
    try { $s.Substring(0, 20) } catch { $s } # On failure, output the original (short) string, not the error record.
  }
},
{ # .Substring + if
  foreach ($s in $strs) {
    if ($s.Length -gt 20) { $s.Substring(0, 20) } else { $s }
  }
},
{ # .Substring + [math]::Min()
  foreach ($s in $strs) {
    $s.Substring(0, [Math]::Min($s.Length, 20))
  }
},
{ # array slicing + -join 
  foreach ($s in $strs) {
    -join $s[0..19]
  }
},
{ # array slicing + [string]::new()
  foreach ($s in $strs) {
    [string]::new($s[0..19])
  }
}

# PowerShell (Core): add variant with ternary conditional.
if ($IsCoreClr) {
  # Note: The script block must be constructed *as a string*,
  #       to avoid breaking the parsing stage of the script in Windows PowerShell.
  $cmds += [scriptblock]::Create(@'
  # .Substring + ternary conditional
  foreach ($s in $strs) {
    $s.Length -gt 20 ? $s.Substring(0, 20) : $s
  }  
'@)
}

# Compare the performance of various substring extraction methods,
# averaged over 50 runs.
Time-Command -Count 50 $cmds
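
As the Important note above points out, a fair comparison would require the -replace and array-slicing approaches to use conditionals too. The following is a rough sketch of what such variants might look like; it is not part of the original benchmark. The variable names $conditionalCmds and $timings are hypothetical, and the built-in Measure-Command cmdlet is used with a single run per variant instead of Time-Command, so the timings will be noisy and are not comparable to the tables above.

```powershell
# Same test data as in the benchmark: 500 long and 500 short strings.
$strs = , ('x' * 30) * 500 + , ('y' * 10) * 500

# Conditional variants: only strings longer than 20 chars. are processed.
$conditionalCmds = [ordered] @{
  '-replace + capture group + if' = {
    foreach ($s in $strs) {
      if ($s.Length -gt 20) { $s -replace '^(.{20}).+', '$1' } else { $s }
    }
  }
  'array slicing + -join + if' = {
    foreach ($s in $strs) {
      if ($s.Length -gt 20) { -join $s[0..19] } else { $s }
    }
  }
  '.Substring + if (baseline)' = {
    foreach ($s in $strs) {
      if ($s.Length -gt 20) { $s.Substring(0, 20) } else { $s }
    }
  }
}

# Time each variant once with Measure-Command and collect the results.
$timings = foreach ($name in $conditionalCmds.Keys) {
  $cmd = $conditionalCmds[$name]
  [pscustomobject] @{
    Command = $name
    Ms      = (Measure-Command { & $cmd }).TotalMilliseconds
  }
}

# Fastest first.
$timings | Sort-Object Ms
```

For averaged results, the script blocks in $conditionalCmds.Values could be passed to Time-Command in the same way as $cmds above.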


别靠近我心 2025-01-26 02:29:13

I would personally use the index operator [ ] in combination with the range operator ..:

Import-Csv "C:\\Users\\X\\Desktop\\U\\cvv.csv" | ForEach-Object {
    $_.A = [string]::new($_.A[0..19]).ToLower() # Update the `A` value
    $_ # Output the object
} | Export-Csv -Path "C:\\Users\\X\\Desktop\\U\\cvv2.csv"

This handles strings that are shorter or longer than the desired length:

PS /> 'HELLO WORLD', 'ONLY 20 CHARS LENGTH ALLOWED' | ForEach-Object {
    [string]::new($_[0..19]).ToLower()
}


hello world
only 20 chars length
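
To try the same approach without touching any files, the CSV text can be parsed in memory. The following is a small sketch, assuming a made-up two-column sample (the $csvText and $rows names and the B column are hypothetical); it uses ConvertFrom-Csv and ConvertTo-Csv in place of Import-Csv and Export-Csv.

```powershell
# Hypothetical sample data; column A holds the values to truncate.
$csvText = @'
A,B
HELLO WORLD,1
ONLY 20 CHARS LENGTH ALLOWED,2
'@

$rows = $csvText | ConvertFrom-Csv | ForEach-Object {
    # Take (up to) the first 20 characters and lowercase them.
    $_.A = [string]::new($_.A[0..19]).ToLower()
    $_ # Output the modified object
}

# Render the modified rows as CSV text again.
$rows | ConvertTo-Csv -NoTypeInformation
```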

About the author: 美羊羊
