BASH：将数据从平面文件导入模板

发布于 2024-12-25 05:48:17 字数 1194 浏览 9 评论 0原文

我有一个平面记录文件，每行有 33 行。我需要将此文件格式化为模板中的规格。模板为 DOS 格式，源文件为 NIX 格式。该模板具有必须遵守的特定缩进和间距。我想到了几个选项：

BASH 与经典的 nix 工具：sed、awk、grep 等...
BASH 与模板工具包
Perl eith 模板工具包
Perl

这些是按照我熟悉的顺序排列的。这是一个示例源记录（NIX 格式）：我减少了换行符的数量以节省空间（通常为 33 行）：

JACKSON HOLE SANITARIUM AND REPTILE ZOO
45 GREASY HOLLER LN
JACKSON HOLE, AK   99999


Change Service Requested


BUBBA HOTEP
3 DELIVERANCE RD
MINNEAPOLIS, MN   99998


BUBBA HOTEP 09090909090909

You have a hold available for pickup as of 2012-01-04:

Title: Banjo for Fun and Profit
Author: Williams, Billy Dee
Price: $10

这是模板（DOS 格式 - 行数减少 - 通常为 66 行）：

     <%BRANCH-NAME%>
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


<%CUST-NAME%> <%BARCODE%>
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             <%CUST-NAME%>
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file

它实际上确实在每条记录的末尾表示“文件结束”。

想法？我倾向于把事情过于复杂化。

UPDATE2

弄清楚了。

我的回答如下。请随意提出改进建议。

原文

I have a flat file of records, each 33 lines long. I need to format this file to specs in a template. The template is in DOS format while the source file is in NIX format. The template has specific indenting and spacing which must be adhered to. I've thought of a few options:

BASH with classic nix tools: sed, awk, grep etc...
BASH with template toolkit
Perl eith template toolkit
Perl

These are in order of my familiarity. Here's a sample source record ( NIX format ):
I've reduced the number of newlines to save space ( normally 33 lines ):

JACKSON HOLE SANITARIUM AND REPTILE ZOO
45 GREASY HOLLER LN
JACKSON HOLE, AK   99999


Change Service Requested


BUBBA HOTEP
3 DELIVERANCE RD
MINNEAPOLIS, MN   99998


BUBBA HOTEP 09090909090909

You have a hold available for pickup as of 2012-01-04:

Title: Banjo for Fun and Profit
Author: Williams, Billy Dee
Price: $10

Here's the template ( DOS format -- lines reduced - 66 lines normally):

     <%BRANCH-NAME%>
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


<%CUST-NAME%> <%BARCODE%>
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             <%CUST-NAME%>
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file

It actually does say "end of file" at the end of each record.

Thoughts? I tend to over-complicate things.

UPDATE2

Figured it out.

My answer is below. Feel free to suggest improvements.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

江南月 2025-01-01 05:48:17

作为初学者，这里有一个提示：Perl HERE-documents（仅显示一些替换作为演示）：

#!/usr/bin/perl
use strict;
use warnings;

my @lines = qw/branchname cust_name barcode bogus whatever/; # (<>);

my ($branchname, $cust_name, $barcode, undef, $whatever) = @lines;

print <<TEMPLATE;
     $branchname
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


$cust_name $barcode
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             $cust_name
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file
TEMPLATE

将虚拟输入数组替换为从标准输入读取的行 (<>)如果你愿意的话。（使用循环读取 n 行并将其推送到数组（如果效率更高）。我只是展示了要点，根据需要添加更多变量，并通过为“capture”变量指定 undef 来跳过输入行（如图所示）。

现在，只需将这些变量插入到您的文本中即可。

如果行结束给你带来任何麻烦，请考虑使用 chomp 例如：

my @lines = (<>); # just read em all...
my @cleaned = map { chomp } @lines;

As a starter, here is a hint: Perl HERE-documents (showing just a few substitutions as a demo):

#!/usr/bin/perl
use strict;
use warnings;

my @lines = qw/branchname cust_name barcode bogus whatever/; # (<>);

my ($branchname, $cust_name, $barcode, undef, $whatever) = @lines;

print <<TEMPLATE;
     $branchname
     <%BRANCH-ADDR%>
     <%BRANCH-CTY%>


$cust_name $barcode
You have a hold available for pickup as of <%DATE%>:

Title: <%TITLE%>
Author: <%AUTHOR%>
Price: <%PRICE%>


             $cust_name
             <%CUST-ADDR%>
             <%CUST-CTY%>

end of file
TEMPLATE

Replace the dummy input array with the lines read from the stdin with (<>) if you will. (Use a loop reading n lines and push it to the array if that is more efficient). I just showed the gist, add more variables as required, and skip input lines by specifting undef for the 'capture' variable (as shown).

Now, simply interpolate these variables into your text.

If line-ends are giving you any grief, consider using chomp eg.:

my @lines = (<>); # just read em all...
my @cleaned = map { chomp } @lines;

回复收藏 0 原文

℉絮湮 2025-01-01 05:48:17

这就是我在这个项目中使用的。请随意提出改进建议，或者提交更好的解决方案。

cp $FILE $WORKING # we won't mess with original

NUM_RECORDS=$( grep "^Price:" "$FILE" | wc -l ) # need to know how many records we have 
                                              # counting occurences of end of record r

TMP=record.txt # holds single record, used as temp storage in loop below

# Sanity
# Make sure temp storage exists. If not create -- if so, clear it.
[ ! -f $TMP ] && touch $TMP || cat /dev/null >$TMP

# functions
function make_template () {
    local _file="$1"
    mapfile -t filecontent < "$_file"
    _loc_name="${filecontent[0]}"
    _loc_strt="${filecontent[1]}"
    _loc_city="${filecontent[2]}"
    _pat_name="${filecontent[14]}"
    _pat_addr="${filecontent[15]}"
    _pat_city="${filecontent[16]}"
    _barcode=${filecontent[27]:(-14)} # pull barcode from end of string
    _date=${filecontent[29]:(-11)}    # pull date from end of string
    # Test title length - truncate if necessary - 70 chars.
    _title=$(grep -E "^Title:" $_file)
    MAXLEN=70
    [ "${#_title}" -gt "$MAXLEN" ] && _title="${filecontent[31]:0:70}" || :
    _auth=$(grep -E "^Author:" $_file)
    _price=$(grep -E "^Price:" $_file)
    sed "
        s@<%BRANCH-NAME%>@${_loc_name}@g
        s@<%BRANCH-ADDR%>@${_loc_strt}@g
        s@<%BRANCH-CTY%>@${_loc_city}@g
        s@<%CUST-NAME%>@${_pat_name}@g
        s@<%CUST-ADDR%>@${_pat_addr}@
        s@<%CUST-CTY%>@${_pat_city}@
        s@<%BARCODE%>@${_barcode}@g
        s@<%DATE%>@${_date}@
        s@<%TITLE%>@${_title}@
        s@<%AUTHOR%>@${_auth}@
        s@<%PRICE%>@${_price}@" "$TEMPLATE"
}

####################################
#  MAIN
####################################

for((i=1;i<="$NUM_RECORDS";i++))
do
    sed -n '1,/^Price:/{p;}' "$WORKING" >"$TMP"  # copy first record with end of record
                                                # and copy to temp storage.

    sed -i '1,/^Price:/d' "$WORKING"             # delete first record using EOR regex.
    make_template "$TMP"                        # send temp file/record to template fu
done

# cleanup
exit 0

This is what I am using for this project. Feel free to suggest improvements, or, submit better solutions.

cp $FILE $WORKING # we won't mess with original

NUM_RECORDS=$( grep "^Price:" "$FILE" | wc -l ) # need to know how many records we have 
                                              # counting occurences of end of record r

TMP=record.txt # holds single record, used as temp storage in loop below

# Sanity
# Make sure temp storage exists. If not create -- if so, clear it.
[ ! -f $TMP ] && touch $TMP || cat /dev/null >$TMP

# functions
function make_template () {
    local _file="$1"
    mapfile -t filecontent < "$_file"
    _loc_name="${filecontent[0]}"
    _loc_strt="${filecontent[1]}"
    _loc_city="${filecontent[2]}"
    _pat_name="${filecontent[14]}"
    _pat_addr="${filecontent[15]}"
    _pat_city="${filecontent[16]}"
    _barcode=${filecontent[27]:(-14)} # pull barcode from end of string
    _date=${filecontent[29]:(-11)}    # pull date from end of string
    # Test title length - truncate if necessary - 70 chars.
    _title=$(grep -E "^Title:" $_file)
    MAXLEN=70
    [ "${#_title}" -gt "$MAXLEN" ] && _title="${filecontent[31]:0:70}" || :
    _auth=$(grep -E "^Author:" $_file)
    _price=$(grep -E "^Price:" $_file)
    sed "
        s@<%BRANCH-NAME%>@${_loc_name}@g
        s@<%BRANCH-ADDR%>@${_loc_strt}@g
        s@<%BRANCH-CTY%>@${_loc_city}@g
        s@<%CUST-NAME%>@${_pat_name}@g
        s@<%CUST-ADDR%>@${_pat_addr}@
        s@<%CUST-CTY%>@${_pat_city}@
        s@<%BARCODE%>@${_barcode}@g
        s@<%DATE%>@${_date}@
        s@<%TITLE%>@${_title}@
        s@<%AUTHOR%>@${_auth}@
        s@<%PRICE%>@${_price}@" "$TEMPLATE"
}

####################################
#  MAIN
####################################

for((i=1;i<="$NUM_RECORDS";i++))
do
    sed -n '1,/^Price:/{p;}' "$WORKING" >"$TMP"  # copy first record with end of record
                                                # and copy to temp storage.

    sed -i '1,/^Price:/d' "$WORKING"             # delete first record using EOR regex.
    make_template "$TMP"                        # send temp file/record to template fu
done

# cleanup
exit 0

回复收藏 0 原文

~没有更多了~