1cli Documentation and Tutorial

1cli

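Use JavaScript code snippets within "one-line" shell command lines for file munging.

  • Special support for json, jsonl, comma-separated csv files, tab-separated csv files, and pipe (|) separated csv files.

  • Short names for useful functionality:

    • _ represents the current "line".

    • _p(...) is like console.log(...).

    • _j(o) converts o to or from json.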
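
To make the short names concrete, here is a minimal sketch in plain JavaScript. The definitions below are hypothetical stand-ins written for illustration, not 1cli's actual source; in particular, the rule that a string argument is parsed while anything else is serialized is an assumption about how a "to or from json" helper could decide direction.

```javascript
// Hypothetical stand-ins for 1cli's short names (not the tool's real code).
const _p = (...args) => console.log(...args);

// Assumption: a string is treated as JSON text and parsed;
// any other value is serialized to JSON text.
const _j = (arg) =>
  typeof arg === "string" ? JSON.parse(arg) : JSON.stringify(arg);

const line = { user: "mary", ip: "10.1.2.3" };
const text = _j(line); // object -> JSON text
const back = _j(text); // JSON text -> object
_p(text);
```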

For example, to convert a .csv file to a .jsonl file:

# show contents of csv file:
$ cat examples/vm-ips.csv 
user,course,ip
john,cs123,192.168.1.2
bill,cs223,192.168.1.3
mary,cs123,10.1.2.3
sue,cs223,192.168.1.4
# convert
$ 1cli -e '_p(_j(_))' examples/vm-ips.csv > ~/tmp/vm-ips.jsonl
# show result
$ cat ~/tmp/vm-ips.jsonl 
{"user":"john","course":"cs123","ip":"192.168.1.2"}
{"user":"bill","course":"cs223","ip":"192.168.1.3"}
{"user":"mary","course":"cs123","ip":"10.1.2.3"}
{"user":"sue","course":"cs223","ip":"192.168.1.4"}
$

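Because the input file has a .csv extension, it is read in as a list of objects. The -e (--eval) option runs the provided code block _p(_j(_)) once for each csv object, printing it as json.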
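
The conversion above can be approximated in plain JavaScript. This is only a sketch under simplifying assumptions: the helper name csvToObjects is made up for illustration, and it handles neither quoted fields nor embedded separators, which a real CSV parser would.

```javascript
// Parse simple separated text whose first line is a header.
// Assumption: no quoted fields and no separators inside values.
function csvToObjects(text, sep = ",") {
  const [header, ...rows] = text.trim().split(/\r?\n/);
  const keys = header.split(sep);
  return rows.map((row) => {
    const cells = row.split(sep);
    return Object.fromEntries(keys.map((k, i) => [k, cells[i]]));
  });
}

const csv = `user,course,ip
john,cs123,192.168.1.2
mary,cs123,10.1.2.3`;

// Rough equivalent of running _p(_j(_)) once per "line".
const jsonl = csvToObjects(csv)
  .map((obj) => JSON.stringify(obj))
  .join("\n");
console.log(jsonl);
```

Each header name becomes a key, so every data row turns into one object and therefore one line of jsonl.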

The following example converts the same vm-ips.csv file into a single json file keyed by user:

$ 1cli -e 'BEGIN o={}' -e 'o[_.user]=_' -e 'END _p(_j(o))' \
     examples/vm-ips.csv >~/tmp/vm-ips.json
# show output (edited to limit line length)
$ cat ~/tmp/vm-ips.json
{"john":{"user":"john","course":"cs123","ip":"192.168.1.2"},
 "bill":{"user":"bill","course":"cs223","ip":"192.168.1.3"},
 "mary":{"user":"mary","course":"cs123","ip":"10.1.2.3"},
 "sue":{"user":"sue","course":"cs223","ip":"192.168.1.4"}
}
$

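The BEGIN code block is executed only once, at the start of the script, and sets up an empty object o. The next code block runs for each csv object and adds it to o under its user key. The END code block runs once at the end and prints o as json.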
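
The three phases can be pictured as three functions called in order. The sketch below is plain JavaScript written for illustration; the names begin, each, and end are hypothetical, and the END step returns the json text instead of printing it so that the result can be inspected.

```javascript
const rows = [
  { user: "john", course: "cs123", ip: "192.168.1.2" },
  { user: "mary", course: "cs123", ip: "10.1.2.3" },
];

let o;
const begin = () => { o = {}; };         // like -e 'BEGIN o={}'
const each = (_) => { o[_.user] = _; };  // like -e 'o[_.user]=_'
const end = () => JSON.stringify(o);     // like -e 'END _p(_j(o))', but returned

begin();            // runs once before any "line"
rows.forEach(each); // runs once per "line"
const keyed = end(); // runs once after the last "line"
console.log(keyed);
```

If two rows shared the same user value, the later row would overwrite the earlier one, since both are stored under the same key.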

The following example filters vm-ips.csv to produce jsonl output containing only the csv objects whose ip begins with the prefix "10.":

$ 1cli -e '_.ip.startsWith("10.") && _p(_j(_))' examples/vm-ips.csv
{"user":"mary","course":"cs123","ip":"10.1.2.3"}
$
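
The condition && action idiom works as a filter because && short-circuits: the right-hand side runs only when the left-hand side is true. A small plain JavaScript sketch follows; the row for user zed is hypothetical data added to show why the trailing dot in the prefix matters.

```javascript
const vmIps = [
  { user: "bill", ip: "192.168.1.3" },
  { user: "mary", ip: "10.1.2.3" },
  { user: "zed", ip: "100.7.7.7" }, // hypothetical row, not in vm-ips.csv
];

// Collect instead of print so the result can be inspected.
const printed = [];
for (const _ of vmIps) {
  _.ip.startsWith("10.") && printed.push(JSON.stringify(_));
}

// Without the trailing dot, 100.x.x.x addresses would match as well.
const looser = vmIps.filter((row) => row.ip.startsWith("10"));

console.log(printed.join("\n"));
```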

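The following example uses the previously generated vm-ips.json file together with a new student-info.jsonl file to join the entries that belong to the same user. Note that the user field is matched against the portion of the email before the @ in student-info.jsonl. Because .json files are never split into lines, _contents[0] holds the whole parsed vm-ips.json object, while the code block runs once for each line of student-info.jsonl: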

# show student-info.jsonl input file
$ cat examples/student-info.jsonl 
{ "id": "123-465-mar", "first": "Mary", "last": "Traub", "email": "mary@x.com" }
{ "id": "132-456-sue", "first": "Sue", "last": "Rawls", "email": "sue@x.com" }
{ "id": "123-456-jhn", "first": "John", "last": "Smith", "email": "john@x.com" }
{ "id": "123-456-bil", "first": "Bill", "last": "Gray", "email": "bill@x.com" }
# join user from vm-ips.json to email id from student-info.jsonl
$ 1cli -e 'u=_.email.m(/^[^@]+/)[0]; x={..._contents[0][u], ..._}; _p(_j(x))' \
    ~/tmp/vm-ips.json examples/student-info.jsonl \
    > ~/tmp/student-vm-ips.jsonl
# show results (edited for line-length)
$ cat ~/tmp/student-vm-ips.jsonl 
{"user":"mary","course":"cs123","ip":"10.1.2.3","id":"123-465-mar",
    "first":"Mary","last":"Traub","email":"mary@x.com"}
{"user":"sue","course":"cs223","ip":"192.168.1.4","id":"132-456-sue",
    "first":"Sue","last":"Rawls","email":"sue@x.com"}
{"user":"john","course":"cs123","ip":"192.168.1.2","id":"123-456-jhn",
    "first":"John","last":"Smith","email":"john@x.com"}
{"user":"bill","course":"cs223","ip":"192.168.1.3","id":"123-456-bil",
    "first":"Bill","last":"Gray","email":"bill@x.com"}
$
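
The join logic can be sketched in plain JavaScript as follows. The variable names vmByUser and students are hypothetical, and the student with the address nobody@x.com is made-up data that shows what happens when no matching user exists: spreading undefined adds no properties, so only the student fields remain.

```javascript
// Stands in for _contents[0], the parsed vm-ips.json object.
const vmByUser = {
  mary: { user: "mary", course: "cs123", ip: "10.1.2.3" },
};

// Stands in for the "lines" of student-info.jsonl.
const students = [
  { id: "123-465-mar", first: "Mary", last: "Traub", email: "mary@x.com" },
  { id: "000-000-nob", first: "No", last: "Body", email: "nobody@x.com" }, // hypothetical
];

const joined = students.map((s) => {
  const u = s.email.match(/^[^@]+/)[0]; // part of the email before the @
  return { ...vmByUser[u], ...s };      // vm fields first, then student fields
});

console.log(joined.map((x) => JSON.stringify(x)).join("\n"));
```

Because the student fields are spread last, they would win if both sources had a key with the same name.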

Here is the manual produced by the --man option:

$ 1cli --man
Usage: 1cli [options] [...path]

Options:
  -d, --debug                  output generated functions on stderr
  -e, --eval <code>            evaluate <code>. Can be specified multiple
                               times. If --loop, then evaluate for each _
                               "line". If <code> starts with BEGIN/END then
                               evaluate only at start/end.  (default: [])
  -f, --field-sep <sep>        use <sep> to split _ line into _0, _1, ... when
                               --loop (default: "/\\s+/")
  -h, --help                   display help for command
  -l, --no-loop                run code only once
  -L, --loop                   repeat code for each _ "line" (default: true)
  -m, --no-monkey              do not monkey-patch standard classes
  -M, --monkey                 monkey-patch standard classes (default: true)
  --man                        output manual on stdout
  -p, --no-print               do not print _ "line" after each loop iteration
  -P, --print                  print _ "line" after each loop iteration
                               (default: false)
  --src <path>                 like specifying <path> in [...path]; always
                               recognize --ext extensions and loop over "lines"
                               when applicable
  --src-l <path>               alias for --src-no-loop
  --src-lx <path>              alias for --src-no-ext-no-split
  --src-no-ext <path>          like specifying <path> in [...path]; loop over
                               "lines" but do not recognize special -X
                               extensions
  --src-no-loop <path>         like specifying <path> in [...path]; recognize
                               --ext extensions but do not loop over "lines"
  --src-no-loop-no-ext <path>  like specifying <path> in [...path] arguments;
                               do not recognize special extensions; do not loop
                               over "lines"
  --src-x <path>               alias for --src-no-ext
  -v, --version                output the version number
  -x, --no-ext                 no special handling for extensions in [path...]
  -X, --ext                    special handling for json, jsonl, csv, psv, tsv
                               extensions in [path...] (default: true)

Process files specified by [...path] or by --src* options
using --eval <code> blocks. 

If a path is specified as '-', then read from standard input; if an
extension is required, then attempt to guess an extension based on the
initial content.

If --loop, then repeat <code> blocks for each "line" of file contents.
A <code> block starting with 'BEGIN' is executed only once at the
start. A <code> block starting with 'END' is executed only once at the
end.

Unless extension processing has been turned off by specifying
--no-ext or by using the --src-*no-ext options, the following
special extensions are recognized:

  .csv:       parsed as comma-separated CSV with first line as header
  .json:      parsed as JSON content; never split into lines.
  .jsonl:     each line parsed as JSON; always split into lines
  .psv:       parsed as pipe '|' separated CSV with first line as header
  .tsv:       parsed as tab-separated CSV with first line as header


Note that all of the above extensions except .json are read
in as an array of objects and processed within the --eval
loop blocks (unless --no-loop or --src-no-loop* is specified).

The code for each block has access to the following constants:

  _contents:  array of contents of all files specified by <path...> or --src
  _d:         _d(path): return array of contents of directory dir
  _entries:   _entries(obj) => Object.entries(obj)
  _f:         _f(path): returns array of "lines" from path
  _j:         _j(arg) => convert arg to/from JSON
  _keys:      _keys(obj) => Object.keys(obj)
  _p:         _p(...) is an alias for console.log(...)
  _paths:     array of paths of all files specified by <path...> or --src
  _values:    _values(obj) => Object.values(obj)
  _x:         _x(cmd) returns stdout for executing shell command cmd


When a block is being executed repeatedly because
of the --loop option, it has access to the following additional
variables:

  _:          current "line" being processed
  _c:         contents of current path
  _n:         current line number (1-origin)
  _path:      current path being processed


Specifying --monkey-patch, patches standard classes with
convenience methods:

  m:          str.m(...) => str.match(...); results[0, 1]...] put into $0, $1...
  r:          str.r(...) => str.replace(...)
  s:          str.s(...) => str.split(...)


$ 
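
The extension rules listed in the manual can be sketched as a small dispatch function in plain JavaScript. This is an illustration only, not 1cli's implementation: the names SEPARATORS and readByExtension are hypothetical, the parsing ignores quoted fields, and treating unknown extensions as plain text lines is an assumption.

```javascript
// Separator for each CSV-like extension named in the manual.
const SEPARATORS = { ".csv": ",", ".psv": "|", ".tsv": "\t" };

function readByExtension(path, text) {
  const dot = path.lastIndexOf(".");
  const ext = dot >= 0 ? path.slice(dot) : "";
  if (ext === ".json") return JSON.parse(text); // whole document, never split
  const lines = text.split(/\r?\n/).filter((l) => l !== "");
  if (ext === ".jsonl") return lines.map((l) => JSON.parse(l)); // one value per line
  const sep = SEPARATORS[ext];
  if (sep === undefined) return lines; // assumption: other files are plain lines
  const keys = lines[0].split(sep);    // first line is the header
  return lines.slice(1).map((l) => {
    const cells = l.split(sep);
    return Object.fromEntries(keys.map((k, i) => [k, cells[i]]));
  });
}

console.log(readByExtension("vm.psv", "user|ip\nmary|10.1.2.3\n"));
```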
