Addressing a specific occurrence of a character in sed

Posted 2024-11-02 01:15:21

How do I remove or address a specific occurrence of a character in sed?

I'm editing a CSV file and I want to remove all text between the third and the fifth occurrence of the comma (that is, drop fields four and five). Is there any way to achieve this using sed?

E.g.:

% cat myfile
one,two,three,dropthis,dropthat,six,...

% sed -i 's/someregex//' myfile

% cat myfile
one,two,three,,six,...
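
For reference, the substitution described above can be sketched as a regular expression. This is only a minimal illustration in Python, assuming no field contains a quoted comma; the function name is hypothetical. The same pattern should carry over to sed's extended syntax, for example `sed -E 's/^(([^,]*,){3})[^,]*,[^,]*/\1/'`, though that is untested here.

```python
import re

# Capture the first three "field," groups, then match fields four and five
# (and the comma between them) so they can be dropped.
PATTERN = re.compile(r'^((?:[^,\n]*,){3})[^,\n]*,[^,\n]*')

def strip_between_3rd_and_5th_comma(line):
    """Remove everything between the third and fifth comma of one line."""
    return PATTERN.sub(r'\1', line, count=1)

print(strip_between_3rd_and_5th_comma('one,two,three,dropthis,dropthat,six'))
# one,two,three,,six
```

Lines with fewer than five fields do not match the pattern and are left unchanged.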

Comments (3)

人事已非 2024-11-09 01:15:21

If using the cut command is acceptable, then:

$ cut -d, -f1-3,6- file
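
As a rough illustration of what that field list selects, here is a minimal Python sketch of the same idea (the function name is hypothetical). Note that the two fields are removed entirely, so the sample line becomes `one,two,three,six,...` rather than the `one,two,three,,six,...` shown in the question.

```python
def cut_fields(line, delimiter=','):
    """Keep fields 1-3 and 6 onward, similar to `cut -d, -f1-3,6-`."""
    fields = line.rstrip('\n').split(delimiter)
    # Slices tolerate short rows: missing fields are simply absent.
    return delimiter.join(fields[:3] + fields[5:])

print(cut_fields('one,two,three,dropthis,dropthat,six,seven'))
# one,two,three,six,seven
```
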
菩提树下叶撕阳。 2024-11-09 01:15:21

awk, or any other tool that can split strings on a delimiter, is better suited to this job than sed.

$ cat file
1,2,3,4,5,6,7,8,9,10

Ruby (1.9+). As written, s[2,3] removes fields three to five; use s[3,2] to drop only fields four and five:

$ ruby -ne 's=$_.split(","); s[2,3]=nil ;puts s.compact.join(",") ' file
1,2,6,7,8,9,10

Using awk (as written, this removes fields three to five):

$ awk 'BEGIN{FS=OFS=","}{$3=$4=$5="";}{gsub(/,,*/,",")}1'  file
1,2,6,7,8,9,10
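
One caveat with blanking fields and then squeezing runs of commas is that fields which were already empty disappear too. The Python sketch below is only an approximation of the awk one-liner (both function names are hypothetical); it contrasts that approach with deleting the fields by index.

```python
import re

def blank_and_collapse(line, first, last):
    """Empty fields first..last (1-based), then squeeze runs of commas."""
    fields = line.split(',')
    for i in range(first - 1, min(last, len(fields))):
        fields[i] = ''
    return re.sub(r',+', ',', ','.join(fields))

def delete_by_index(line, first, last):
    """Delete fields first..last (1-based) and leave the rest untouched."""
    fields = line.split(',')
    del fields[first - 1:last]
    return ','.join(fields)

row = '1,,3,4,5,6'                      # field two is legitimately empty
print(blank_and_collapse(row, 4, 5))    # 1,3,6   (empty field two is lost)
print(delete_by_index(row, 4, 5))       # 1,,3,6  (empty field two survives)
```
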
晨曦÷微暖 2024-11-09 01:15:21

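A real parser in action:

#!/usr/bin/env python3

import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('my-data.csv', newline='') as src, \
     open('stripped-data.csv', 'w', newline='') as dst:
    cr = csv.reader(src)
    cw = csv.writer(dst)
    for row in cr:
        # keep fields 1-3 and everything from field 6 onward
        cw.writerow(row[0:3] + row[5:])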

But do note the preface to the csv module:

The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases. There is no “CSV standard”, so the format is operationally defined by the many applications which read and write it. The lack of a standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources. Still, while the delimiters and quoting characters vary, the overall format is similar enough that it is possible to write a single module which can efficiently manipulate such data, hiding the details of reading and writing the data from the programmer.

$ cat my-data.csv
1
1,2
1,2,3
1,2,3,4,
1,2,3,4,5
1,2,3,4,5,6
1,2,3,4,5,6,
1,2,,4,5,6
1,2,"3,3",4,5,6
1,"2,2",3,4,5,6
,,3,4,5
,,,4,5
,,,,5
$ python csvdrop.py
$ cat stripped-data.csv
1
1,2
1,2,3
1,2,3
1,2,3
1,2,3,6
1,2,3,6,
1,2,,6
1,2,"3,3",6
1,"2,2",3,6
,,3
,,
,,
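
The quoted rows in the sample data are where a parser pays off. The following minimal sketch (both function names are hypothetical) runs one such row through a plain split on commas and through the csv module, so the difference is visible without touching any files.

```python
import csv
import io

def drop_naive(line):
    """Split on every comma, ignoring quoting."""
    fields = line.rstrip('\n').split(',')
    return ','.join(fields[:3] + fields[5:])

def drop_parsed(line):
    """Let the csv module handle quoting on both read and write."""
    row = next(csv.reader([line]))
    out = io.StringIO()
    csv.writer(out, lineterminator='\n').writerow(row[:3] + row[5:])
    return out.getvalue()

line = '1,2,"3,3",4,5,6\n'
print(drop_naive(line))          # 1,2,"3,5,6   (quoted field torn apart)
print(drop_parsed(line), end='') # 1,2,"3,3",6
```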