Using grep in BBEdit to replace every instance of a pattern after the first one

Posted on 2024-11-29 10:16:52


So I've got a really long txt file that follows this pattern:

},
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red",
   "moustache" : "no"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red",
   "moustache" : "yes"
},
"303" :
{
   "id" : "fask2l",
   "color" : "green",
   "moustache" : "yes"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "tthj2l",
   "color" : "red",
   "moustache" : "yes"
},
"304" :
{
   "id" : "hjsk2l",
   "color" : "green",
   "moustache" : "no"
},
"305" :
{
   "id" : "h6shgfbs",
   "color" : "red",
   "moustache" : "no"
},
"305" :
{
   "id" : "fdh33hk7",
   "color" : "cyan",
   "moustache" : "yes"
},

and I'm trying to format it as a proper JSON object with the following structure:

"303" :
   { "list" : [
     {
      "id" : "4k4hk2l",
      "color" : "red",
      "moustache" : "no"
     },
     {
      "id" : "4k52k2l",
      "color" : "red",
      "moustache" : "yes"
     },
     {
      "id" : "fask2l",
      "color" : "green",
      "moustache" : "yes"
     }
    ]
   }
"304" :
   { "list" : [
 etc...

In other words, I want to find every match of ^"\d\d\d" : and keep the first one for each distinct key, but remove all the subsequent ones (for example, keep the first instance of "303" : but completely remove the rest of them, then keep the first instance of "304" : but completely remove the rest of them, etc.).

I've been attempting to do this in BBEdit, which has a grep option for search/replace, but my pattern-matching fu is too weak to accomplish this. Any ideas? Or is there a better way to accomplish this task?


灯角 2024-12-06 10:16:52


You can't capture every iteration of a repeated capturing group. The capture will always contain only the last match of the group. So there's no way to do this with a single search/replace, short of dumbly repeating the group in the pattern. And even that is only a solution if you know the maximum number of elements in the resulting groups.
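To make the point about repeated groups concrete, here is a minimal sketch in Python. Python's `re` module is only a stand-in here (BBEdit's grep is a different engine, so treat this as an illustration rather than a BBEdit recipe), and the function name `last_repeat_capture` is made up for the example.

```python
import re

def last_repeat_capture(text):
    # The group ([a-z]) sits inside a repeated non-capturing group,
    # so each iteration overwrites the previous capture.
    m = re.fullmatch(r'(?:1([a-z]);)+', text)
    return m.group(1) if m else None

# Three iterations match, but only the last letter survives in group 1.
print(last_repeat_capture("1a;1b;1c;"))  # prints: c
```

The earlier letters `a` and `b` were matched but are no longer reachable through the group, which is why a single search/replace cannot collect them.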

Say we have a string that is a simplified version of your data:

1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;

We see that the maximum number of elements per key is 5, so the pattern needs five capturing slots for values: the first value plus four optional repeats of the capturing group.

/([0-9])([a-z]*);?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?/

And replace that with

\1:\2\4\6\8\10;

Then we get the desired result:

1:abcde;2:defg;3:xyz;
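The same search/replace can be checked outside the editor. The sketch below uses Python's `re` module as an assumption of convenience (the answer itself is about BBEdit), and `collapse` is a hypothetical helper name. The pattern is the one given above; the replacement writes group 10 as `\g<10>` so it cannot be misread as group 1 followed by a literal 0.

```python
import re

# The pattern from the answer: one key digit, the first value, then
# four optional repeats that must start with the same digit (\1).
PATTERN = re.compile(
    r'([0-9])([a-z]*);?'
    r'(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?(\1([a-z]);)?'
)

def collapse(text):
    # Groups that did not participate are replaced by an empty string
    # (Python 3.5 and later).
    return PATTERN.sub(r'\1:\2\4\6\8\g<10>;', text)

print(collapse("1a;1b;1c;1d;1e;2d;2e;2f;2g;3x;3y;3z;"))
# prints: 1:abcde;2:defg;3:xyz;
```

A key with more than five values would be split across two matches, which is the limitation of hard-coding the number of repeats.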

You can apply this technique to your data if you're in a great hurry (and after two days I suppose you aren't), but using a scripting language would be a better and cleaner solution.
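As a rough idea of what the scripting route could look like for the data in the question, here is a sketch in Python (my choice of language, not something the question specifies). It assumes every entry is a flat object with no nested braces and no `}` inside a string value. The names `ENTRY` and `group_entries` are hypothetical.

```python
import json
import re

# Assumption: each entry looks like  "303" : { ...flat JSON object... }
# with no nested braces, so a non-greedy match up to the first "}" is enough.
ENTRY = re.compile(r'"(\d+)"\s*:\s*(\{.*?\})', re.DOTALL)

def group_entries(text):
    grouped = {}
    for key, body in ENTRY.findall(text):
        # Create {"list": []} the first time a key is seen, then append.
        grouped.setdefault(key, {"list": []})["list"].append(json.loads(body))
    return grouped

sample = '''},
"303" :
{
   "id" : "4k4hk2l",
   "color" : "red"
},
"303" :
{
   "id" : "4k52k2l",
   "color" : "red"
},
"304" :
{
   "id" : "4k4hf4f4",
   "color" : "red"
},
'''

print(json.dumps(group_entries(sample), indent=2))
```

Because the result is built as a dictionary and serialized with `json.dumps`, the commas and brackets come out valid without any hand editing, and it does not matter how many entries share a key.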

For my simplified example, you would iterate over the matches of /([0-9])[a-z];?(\1[a-z];?)*/. Those will be: