Script to look up a word in file1, copy the next word, and replace that word in file2

Posted on 2025-01-23 03:05:08 · 1172 characters · 0 views · 0 comments

I have file1

(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),

(2,'a lot of brazil  4.2.3.1, 'some other info',0,null, 12345),

(3,'a lot of india 3.4.2.1, 'some other info',0,null, 12345),

(4,'a lot of laos 1.3.4.5, 'some other info',0,null, 12345),

(5,'a lot of china 1.2.3.5, 'some other info',0,null, 12345);

and file2

(1'a lot of singapore A.B.C.D 'some other info',0,null, 12345),

(2,'a lot of brazil E.F.G.H, 'some other info',0,null, 12345),

(3,'a lot of india H.I.J.K, 'some other info',0,null, 12345),

(4,'a lot of laos L.M.N.O, 'some other info',0,null, 12345),

(5,'a lot of china P.Q.R.S, 'some other info',0,null, 12345);

I have created a script that copies and replaces by line number, but I need it to work by keyword instead: look for singapore in file1 and copy the next word (1.2.3.4), then look for singapore in file2 and replace the next word there (A.B.C.D) with 1.2.3.4, so that the corresponding line in the final file2 looks like this:

(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),

A Python, Awk, or sed script, or any other kind of script, would be helpful.

So far I have created this, which copies and replaces by line number:

sed -i '2d' File2.txt
awk 'NR==5380{a=$0}NR==FNR{next}FNR==2{print a}1' file1.txt file2.txt

Comments (3)

南街女流氓 2025-01-30 03:05:08

I'm not sure this is the best solution or that it will work in every case, but you need something like this.

import re

def try_to_get_country_data(line, country):
    line_parts = line.split(',')
    part_with_data = line_parts[1]
    
    if (match := re.search(f'.* {country} (.*)', part_with_data)) is not None:
        return match.group(1)
    
    return None
    
if __name__ == "__main__":
    found_data = None
    country = 'singapore'

    with open('some_file.txt', 'r') as f:
        for line in f:
            if (found_data := try_to_get_country_data(line, country)) is not None:
                break

    if found_data is not None:
        with open('second_file.txt', 'r') as f2:
            data = f2.readlines()

        for i, line in enumerate(data):
            if (replaced_data := try_to_get_country_data(line, country)) is not None:
                data[i] = line.replace(replaced_data, found_data)
                break

        with open('second_file.txt', 'w') as f2:
            f2.writelines(data)

I've checked it, and it works as long as every line follows the same pattern.
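The function above looks only at the second comma-separated field, so a line such as the first sample line (which has no comma after the 1) would not match. A minimal sketch of a variant that instead matches the keyword anywhere on the line and takes the whitespace-delimited token right after it is shown below. The names `next_token` and `replace_next_token` are hypothetical, and the functions work on lists of strings so that reading and writing the files is left to the caller.

```python
import re


def next_token(lines, keyword):
    """Return the whitespace-delimited token that follows `keyword`, or None."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\s+(\S+)")
    for line in lines:
        match = pattern.search(line)
        if match:
            return match.group(1)  # first match wins
    return None


def replace_next_token(lines, keyword, new_token):
    """Return a copy of `lines` with the token after `keyword` replaced."""
    pattern = re.compile(r"(\b" + re.escape(keyword) + r"\s+)\S+")
    # A function is used as the replacement so that backslashes in
    # new_token are never interpreted as group references.
    return [pattern.sub(lambda m: m.group(1) + new_token, line) for line in lines]
```

Assuming the two files have been read with `readlines()`, the result of `replace_next_token(file2_lines, "singapore", next_token(file1_lines, "singapore"))` could then be written back with `writelines()`. Any punctuation attached to the token (such as a trailing comma) is copied along with it.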

梓梦 2025-01-30 03:05:08

Here is a simple Awk script to look for the replacement text from the first input file and replace the corresponding token in the second input file.

awk -v country="singapore" 'NR == FNR {
    for (i=2; i<=NF; i++) if ($(i-1) == country) token = $i; next }
  $0 ~ country { for(i=2; i<=NF; i++) if ($(i-1) == country) $i = token
    } 1' file1 file2 >newfile2

When we are reading file1, NR == FNR is true. We loop over the input tokens and check for one which matches country; if we find one, we set token to that value. This means that if there are multiple matches on the country keyword, the last one in the first input file will be extracted.

The next statement causes Awk to skip the rest of the script for this input file, so the lines from file1 are only read, and not processed further.

If we fall through to the last line, we are now reading file2. If we see a line which contains the keyword, we replace the token after the country keyword with the saved token. (This requires the keyword to be an isolated token, not a substring within a longer word etc.) The final 1 causes all lines which get this far to be printed back to standard output, thus generating a copy of the second file with any substitutions performed.

If you have any control over the data format used here, perhaps try to figure out a way to get the input in a less haphazard ad-hoc format, like JSON.
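To illustrate why a structured format helps, here is a rough sketch of the same update if both files were JSON arrays of records. The field names `country` and `ip` and the function name `copy_field` are assumptions made up for this example; nothing in the question says the data has such fields.

```python
import json


def copy_field(source_json, target_json, country, field="ip"):
    """Copy `field` for `country` from the source records into the target records.

    Both arguments are JSON strings holding a list of objects; the
    "country" and "ip" keys are hypothetical.
    """
    source = {rec["country"]: rec for rec in json.loads(source_json)}
    target = json.loads(target_json)
    for rec in target:
        if rec["country"] == country and country in source:
            rec[field] = source[country][field]
    return json.dumps(target)
```

With data in this shape, the update is a keyed lookup, so there is no need to count tokens or worry about stray quotes and commas.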

网白 2025-01-30 03:05:08

If you would like a short bash script, and assuming that the structure of the files is constant, you could try something like this:

country="singapore"
a=$(grep "${country}" file0 | awk '{print $5}')

if [[ "${a}" ]]
then
    b=$(grep -w "${country}" file1 | awk '{print $5}')
    sed "s/${country} ${b}/${country} ${a}/g" file1
fi

Find below the output of the script:

(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),

(2,'a lot of brazil E.F.G.H, 'some other info',0,null, 12345),

(3,'a lot of india H.I.J.K, 'some other info',0,null, 12345),

(4,'a lot of laos L.M.N.O, 'some other info',0,null, 12345),

(5,'a lot of china P.Q.R.S, 'some other info',0,null, 12345);

Use sed -i in order to edit file1 in place.

To avoid reading the same file multiple times, at the cost of a little readability, the initial approach can easily be refactored as follows:

country="singapore"
file0c=$(cat file0)
file1c=$(cat file1)

a=$(echo "${file0c}" | grep -w "${country}" | awk '{print $5}')

if [[ "${a}" ]]
then
    b=$(echo "${file1c}" | grep -w "${country}" | awk '{print $5}')
    echo "${file1c}" | sed "s/${country} ${b}/${country} ${a}/g" | 
    tee file1_new
fi
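One thing to keep in mind with the sed approach is that the old value is interpolated into a regular expression, so the dots in something like A.B.C.D match any character. As a rough sketch under the same assumption as the script (the value is always the fifth whitespace-separated field), the Python version below does a literal string replacement instead. The function names are hypothetical, and unlike the pipeline above it only considers the first matching line.

```python
def fifth_field(text, country):
    """Roughly mimic `grep -w country | awk '{print $5}'` for the first match."""
    for line in text.splitlines():
        fields = line.split()
        if country in fields and len(fields) >= 5:
            return fields[4]
    return None


def swap_value(source_text, target_text, country):
    """Return target_text with the value after `country` taken from source_text."""
    new = fifth_field(source_text, country)
    old = fifth_field(target_text, country)
    if new is None or old is None:
        return target_text  # nothing to do, leave the text unchanged
    # str.replace is literal, so dots in the old value are not wildcards.
    return target_text.replace(f"{country} {old}", f"{country} {new}")
```

Here `source_text` and `target_text` would be the full contents of file0 and file1, each read once.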