JavaScript regular expression: loop over all matches

Posted on 2024-11-04 07:14:50

I'm trying to do something similar to what Stack Overflow's rich text editor does. Given this text:

[Text Example][1]

[1][http://www.example.com]

I want to loop over each [string][int] that is found, which I do this way:

var Text = "[Text Example][1]\n[1][http: //www.example.com]";
// Find resource links
var arrMatch = null;
var rePattern = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi"
);
while (arrMatch = rePattern.exec(Text)) {
  console.log("ok");
}

This works great: it logs 'ok' for each [string][int]. What I need to do, though, is for each match found, replace the initial match with components of a second match.

So in the loop, $2 would represent the int part originally matched, and I would run this regexp (pseudocode):

while (arrMatch = rePattern.exec(Text)) {
    var FindIndex = $2; // This would be 1 in our example
    new RegExp("\\[" + FindIndex + "\\]\\[(.+?)\\]", "g")

    // Replace original match now with hyperlink
}

This would match

[1][http://www.example.com]

The end result for the first example would be:

<a href="http://www.example.com" rel="nofollow">Text Example</a>

Edit

I've gotten as far as this now:

var Text = "[Text Example][1]\n[1][http: //www.example.com]";
// Find resource links
reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while ((result = reg.exec(Text)) !== null) {
  var LinkText = result[1];
  var Match = result[0];
  Text = Text.replace(new RegExp(Match, "g"), '<a href="#">" + LinkText + "</a>');
}
console.log(Text);

Comments (7)

情绪失控 2024-11-11 07:14:50

I agree with Jason that it’d be faster/safer to use an existing Markdown library, but you’re looking for String.prototype.replace (also, use RegExp literals!):

var Text = "[Text Example][1]\n[1][http: //www.example.com]";
var rePattern = /\[(.+?)\]\[([0-9]+)\]/gi;

console.log(Text.replace(rePattern, function(match, text, urlId) {
  // return an appropriately-formatted link
  return `<a href="${urlId}">${text}</a>`;
}));
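
The callback above puts the reference number itself into href. One way to fill in the real URL, sketched here under the assumption that each reference definition sits on its own line in the form [id][url], is to look the id up inside the callback. The helper name findUrl is made up for this illustration.

```javascript
var Text = "[Text Example][1]\n[1][http://www.example.com]";
var rePattern = /\[(.+?)\]\[([0-9]+)\]/g;

// Hypothetical helper: find the "[id][url]" definition line for a given id.
// urlId only ever contains digits here, so it is safe to splice into the pattern.
function findUrl(source, urlId) {
  var definition = new RegExp("^\\[" + urlId + "\\]\\[(.+?)\\]$", "m").exec(source);
  return definition ? definition[1] : null;
}

var linked = Text.replace(rePattern, function (match, text, urlId) {
  var url = findUrl(Text, urlId);
  // Leave the placeholder untouched when no definition exists.
  return url === null ? match : '<a href="' + url + '">' + text + "</a>";
});

console.log(linked);
```

Note that this sketch leaves the definition line itself in the output; removing it would need a second replace.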

一花一树开 2024-11-11 07:14:50

I managed to do it in the end with this:

var Text = "[Text Example][1]\n[1][http: //www.example.com]";
// Find resource links
var reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while (result = reg.exec(Text)) {
  var LinkText = result[1];
  var Match = result[0];
  var LinkID = result[2];
  var FoundURL = new RegExp("\\[" + LinkID + "\\]\\[(.+?)\\]", "g").exec(Text);
  Text = Text.replace(Match, '<a href="' + FoundURL[1] + '" rel="nofollow">' + LinkText + '</a>');
}
console.log(Text);
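
A side note on why this version passes Match to replace as a plain string: the earlier attempt in the question wrapped it in new RegExp(Match, "g"), where the brackets are read as character classes, so the pattern no longer matches the literal text. If a regex is really needed, the special characters have to be escaped first. The escapeRegExp helper below is an illustrative name, not a built-in, and the sample strings are made up.

```javascript
// Illustrative helper (not a built-in): escape characters that are special in a regex.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var Match = "[Text Example][1]";
var Sample = "See [Text Example][1] for details.";

// Unescaped, "[Text Example][1]" is two character classes, so the literal text is not found.
var unescapedHit = new RegExp(Match).test(Sample);

// Escaped, the pattern matches the literal brackets.
var escapedHit = new RegExp(escapeRegExp(Match)).test(Sample);

console.log(unescapedHit, escapedHit); // → false true
```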

风追烟花雨 2024-11-11 07:14:50

Here we're using the exec method. Called in a while loop, it returns every match in turn, along with the position of each matched string.

    var input = "A 3 numbers in 333";
    var regExp = /\b(\d+)\b/g, match;
    while (match = regExp.exec(input))
      console.log("Found", match[1], "at", match.index);
    // → Found 3 at 2
    // → Found 333 at 15
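
In engines that support String.prototype.matchAll (ES2020), the same loop can be written with for...of, which avoids relying on the regex's lastIndex state. A minimal sketch using the same input; the found array is only there to collect the results:

```javascript
const input = "A 3 numbers in 333";
const found = [];

// matchAll requires the g flag and yields one match object per hit.
for (const match of input.matchAll(/\b(\d+)\b/g)) {
  found.push({ value: match[1], index: match.index });
  console.log("Found", match[1], "at", match.index);
}
```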

半山落雨半山空 2024-11-11 07:14:50

Use back-references to restrict the match, so that the code will match if your text is:

[Text Example][1]\n[1][http://www.example.com]

and the code will not match if your text is:

[Text Example][1]\n[2][http://www.example.com]

var re = /\[(.+?)\]\[([0-9]+)\]\s*.*\s*\[(\2)\]\[(.+?)\]/gi;
var str = '[Text Example][1]\n[1][http://www.example.com]';
var subst = '<a href="$4">$1</a>';

var result = str.replace(re, subst);
console.log(result);

\number is used inside a regex to refer back to the text captured by that group number, and $number is used in the replacement string of the replace function in the same way, to refer to the group's result.
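
To check the claim about the back-reference, the sketch below runs a pattern of similar shape against both inputs. It is a simplified variant, not the exact regex above: it allows only whitespace between the placeholder and its definition, and it uses named groups (ES2018) purely for readability. The function name linkify is made up.

```javascript
// \k<id> must repeat whatever (?<id>...) captured, just like \2 does for group 2.
const linkPattern = /\[(?<text>.+?)\]\[(?<id>[0-9]+)\]\s*\[\k<id>\]\[(?<url>.+?)\]/g;

// Made-up helper name; $<name> in the replacement string refers to a named group.
function linkify(source) {
  return source.replace(linkPattern, '<a href="$<url>">$<text></a>');
}

const matching = linkify("[Text Example][1]\n[1][http://www.example.com]");
const mismatched = linkify("[Text Example][1]\n[2][http://www.example.com]");

console.log(matching);   // the link is produced
console.log(mismatched); // the input comes back unchanged
```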

戒ㄋ 2024-11-11 07:14:50

This format is based on Markdown. There are several JavaScript ports available. If you don't want the whole syntax, then I recommend stealing the portions related to links.

魂归处 2024-11-11 07:14:50

Another way to iterate over all matches, without relying on the subtleties of exec and match, is to call the string replace function with the regex as the first parameter and a function as the second one. When used like this, the function receives the whole match as its first parameter, the captured groups as the next parameters, and then the match offset followed by the whole string:

var text = "[Text Example][1]\n[1][http: //www.example.com]";
// Find resource links
var arrMatch = null;
var rePattern = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
text.replace(rePattern, function(match, g1, g2, index){
    // Do whatever
})

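You can even iterate over all groups of each match using the function's arguments object (it is local to each call, not a global variable), skipping the first entry (the whole match) and the trailing offset and string entries.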
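
A sketch of iterating over the groups with rest parameters instead of arguments. It assumes the pattern has no named groups, so the last two callback arguments are the match offset and the full string; the variable names are illustrative.

```javascript
const text = "[Text Example][1]\n[1][http://www.example.com]";
const rePattern = /\[(.+?)\]\[([0-9]+)\]/g;
const seen = [];

text.replace(rePattern, (match, ...rest) => {
  // Without named groups, rest = [group1, ..., groupN, offset, wholeString].
  const groups = rest.slice(0, -2);
  const offset = rest[rest.length - 2];
  seen.push({ match, groups, offset });
  return match; // return the match unchanged; replace is used here only to iterate
});

console.log(seen);
```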

俏︾媚 2024-11-11 07:14:50

I know it's old, but since I stumbled upon this post, I want to straighten things out.

First of all, your way of thinking about this problem is too complicated. When the solution to a supposedly simple problem becomes too complicated, it is time to stop and think about what went wrong.
Second, your solution is inefficient: you first find what you want to replace, and then, for each match, you search the same text again for the referenced link information. So the computational complexity eventually becomes O(n^2).

It is disappointing to see so many upvotes on something wrong, because people who come here learn mostly from the accepted solution, assume it is a legitimate answer, and use the concept in their projects, which then become badly implemented products.

The approach to this problem is pretty simple. All you need to do is find all the referenced links in the text, save them in a dictionary, and only then search for the placeholders to replace, using the dictionary. That's it. In this case the complexity is just O(n).

So this is how it goes:

const text = `
 [2][https://en.wikipedia.org/wiki/Scientific_journal][5][https://en.wikipedia.org/wiki/Herpetology]

The Wells and Wellington affair was a dispute about the publication of three papers in the Australian Journal of [Herpetology][5] in 1983 and 1985. The publication was established in 1981 as a [peer-reviewed][1] [scientific journal][2] focusing on the study of [3][https://en.wikipedia.org/wiki/Amphibian][amphibians][3] and [reptiles][4] ([herpetology][5]). Its first two issues were published under the editorship of Richard W. Wells, a first-year biology student at Australia's University of New England. Wells then ceased communicating with the journal's editorial board for two years before suddenly publishing three papers without peer review in the journal in 1983 and 1985. Coauthored by himself and high school teacher Cliff Ross Wellington, the papers reorganized the taxonomy of all of Australia's and New Zealand's [amphibians][3] and [reptiles][4] and proposed over 700 changes to the binomial nomenclature of the region's herpetofauna.
[1][https://en.wikipedia.org/wiki/Academic_peer_review]    
[4][https://en.wikipedia.org/wiki/Reptile]          
`;

const linkRefs = {};
const linkRefPattern = /\[(?<id>\d+)\]\[(?<link>[^\]]+)\]/g;
const linkPlaceholderPattern = /\[(?<text>[^\]]+)\]\[(?<refid>\d+)\]/g;

const parsedText = text
    .replace(linkRefPattern, (...[,,,,,ref]) => (linkRefs[ref.id] = ref.link, ''))
    .replace(linkPlaceholderPattern, (...[,,,,,placeholder]) => `<a href="${linkRefs[placeholder.refid]}">${placeholder.text}</a>`)
    .trim();

console.log(parsedText);
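
The same two-pass idea can be wrapped in a function. The sketch below is one possible variation, not the exact code above: it stores the references in a Map, reads the named groups from the last callback argument, and leaves a placeholder untouched when its id has no definition. The function name resolveLinks and the sample input are assumptions made for this example.

```javascript
function resolveLinks(source) {
  const refs = new Map();
  const refPattern = /\[(?<id>\d+)\]\[(?<link>[^\]]+)\]/g;
  const placeholderPattern = /\[(?<text>[^\]]+)\]\[(?<refid>\d+)\]/g;

  // Pass 1: collect every "[id][url]" definition and strip it from the text.
  const withoutRefs = source.replace(refPattern, (...args) => {
    const { id, link } = args[args.length - 1]; // named groups are the last argument
    refs.set(id, link);
    return "";
  });

  // Pass 2: swap each "[text][id]" placeholder for an anchor, if the id is known.
  return withoutRefs
    .replace(placeholderPattern, (match, ...args) => {
      const { text, refid } = args[args.length - 1];
      return refs.has(refid) ? `<a href="${refs.get(refid)}">${text}</a>` : match;
    })
    .trim();
}

const resolved = resolveLinks("[Text Example][1] and [Other][7]\n[1][http://www.example.com]");
console.log(resolved);
```

Here [Other][7] stays as written because no [7][...] definition exists, which avoids emitting href="undefined".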
