Removing whitespace with Erlang regex

Posted 2024-07-20 21:51:11

I want to remove all whitespace, i.e. tabs, spaces, and newline characters.

T = {xmlelement,"presence",
     [{"xml:lang","en"}],
     [{xmlcdata,<<"\n">>},
      {xmlelement,"priority",[],
       [{xmlcdata,<<"5">>}]},
      {xmlcdata,<<"\n">>},
      {xmlelement,"c",
       [{"xmlns","http://jabber.org/protocol/caps"},
        {"node","http://psi-im.org/caps"},
        {"ver","0.12.1"},
        {"ext","cs ep-notify html"}],
       []},
      {xmlcdata,<<"\n">>}]}.

I tried the following, but it does not work:

trim_whitespace(Input) ->
    re:replace(Input, "(\r\n)*", "").

Comments (4)

霞映澄塘 2024-07-27 21:51:11

To remove every occurrence in a string, you need to pass the global option to re:replace(). Your regex also matches only "\r\n" sequences, so spaces, tabs, and bare newlines are never touched. The call should probably look like this:

trim_whitespace(Input) -> re:replace(Input, "\\s+", "", [global]).
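
One thing the call above leaves open is the return type: by default re:replace/4 returns iodata rather than a flat string. A minimal sketch of a wrapper that returns a binary instead, assuming the input is a string or binary (the module name ws_strip and function name strip_all are made up for illustration):

```erlang
-module(ws_strip).
-export([strip_all/1]).

%% Remove every whitespace character (spaces, tabs, newlines) from a
%% string or binary. `global` replaces all matches, not just the first;
%% `{return, binary}` flattens the iodata that re:replace/4 returns by default.
%% Note: ws_strip and strip_all are hypothetical names, not from the thread.
strip_all(Input) ->
    re:replace(Input, "\\s+", "", [global, {return, binary}]).
```

For example, ws_strip:strip_all("a \t b\r\nc") evaluates to <<"abc">>. This works on text only; passing the xmlelement tuple from the question directly would fail, because re:replace/4 expects iodata.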

夜空下最亮的亮点 2024-07-27 21:51:11

I faced the same issue and came here to share a more efficient approach:

trim(Subject) ->
  {match, [[Trimmed]|_]} = re:run(Subject, "^\\s*([^\\s]*(?:.*[^\\s]+)?)\\s*$", 
    [{capture, all_but_first, binary}, global, dollar_endonly, unicode, dotall]),
  Trimmed.

The idea is very much the same; the regex is just better. Note that it trims only leading and trailing whitespace and leaves interior whitespace intact.
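
For comparison, trimming only the ends can also be sketched without capture groups. The following is an alternative, not the answer's method: one variant uses string:trim/1 (an assumption here is OTP 20 or later, where that function exists), the other an anchored re:replace/4. The module and function names are made up for illustration.

```erlang
-module(ws_trim).
-export([trim_ends/1, trim_ends_re/1]).

%% Trim leading and trailing whitespace; interior whitespace is kept.
%% Assumes OTP 20+ for string:trim/1. Returns the same type it was given
%% (list in, list out; binary in, binary out).
trim_ends(Subject) ->
    string:trim(Subject).

%% Regex variant: delete a whitespace run anchored at the start or at the
%% end of the subject. Always returns a binary.
trim_ends_re(Subject) ->
    re:replace(Subject, "^\\s+|\\s+$", "", [global, {return, binary}]).
```

Either function turns "  a b \n" into a value containing just a b, with the inner space preserved, which is different from stripping all whitespace as the question asks.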

辞旧 2024-07-27 21:51:11

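All the whitespace in your question is in cdata sections, so why not just filter those out of the tuple? Note that this removes every xmlcdata node, including non-whitespace ones such as <<"5">>, as the result at the end shows.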

remove_cdata(List) when is_list(List) ->
    remove_list_cdata(List);
remove_cdata({xmlelement, Name, Attrs, Els}) ->
    {xmlelement, Name, remove_cdata(Attrs), remove_cdata(Els)}.

remove_list_cdata([]) ->
    [];
remove_list_cdata([{xmlcdata,_}|Rest]) ->
    remove_list_cdata(Rest);
remove_list_cdata([E = {xmlelement,_,_,_}|Rest]) ->
    [remove_cdata(E) | remove_list_cdata(Rest)];
remove_list_cdata([Item | Rest]) ->
    [Item | remove_list_cdata(Rest)].


remove_cdata(T) =:= 
    {xmlelement,"presence",
     [{"xml:lang","en"}],
     [{xmlelement,"priority",[],[]},
      {xmlelement,"c",
       [{"xmlns","http://jabber.org/protocol/caps"},
        {"node","http://psi-im.org/caps"},
        {"ver","0.12.1"},
        {"ext","cs ep-notify html"}],
       []}]}
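
If the priority value should survive, a variant can drop only the cdata nodes that are entirely whitespace. This is a minimal sketch under the assumption that the tree uses the {xmlelement, Name, Attrs, Children} and {xmlcdata, Data} shapes shown in the question; the module and function names are made up for illustration.

```erlang
-module(xml_ws).
-export([strip_blank_cdata/1]).

%% Walk an {xmlelement, Name, Attrs, Children} tree and drop only those
%% {xmlcdata, Data} children that consist entirely of whitespace.
%% Attributes are left untouched.
strip_blank_cdata({xmlelement, Name, Attrs, Children}) ->
    {xmlelement, Name, Attrs,
     [strip_blank_cdata(C) || C <- Children, not is_blank_cdata(C)]};
strip_blank_cdata(Other) ->
    Other.

%% True for cdata whose content is empty or whitespace-only.
is_blank_cdata({xmlcdata, Data}) ->
    re:run(Data, "^\\s*$", [{capture, none}]) =:= match;
is_blank_cdata(_) ->
    false.
```

Applied to T from the question, this keeps {xmlcdata,<<"5">>} inside the priority element and removes the three {xmlcdata,<<"\n">>} nodes.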

怪我太投入 2024-07-27 21:51:11

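re:replace is tricky. Keep in mind that without global only the first match is replaced, and without {return, list} the result is iodata rather than a flat string: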

Eshell V5.9.3.1  (abort with ^G)
1> re:replace("0 1  2    3 4 5 6 7 8 9", " ", "", [global, {return, list}]).
"0123456789"
2> re:replace("0 1  2    3 4 5 6 7 8 9", " ", "", [{return, list}]).
"01  2    3 4 5 6 7 8 9"
3> re:replace("0 1  2    3 4 5 6 7 8 9", " ", "").
[<<"0">>,[]|<<"1  2    3 4 5 6 7 8 9">>]