How do I use a regular expression to find and replace HTML table tags?

Posted 2024-08-12 06:05:16

I have blocks of code that look like this:

<table border="0"><tr><td><img src='http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg'/>&nbsp;&nbsp;</td><td>Gift of Life Marathon Blood Drive - "the group stood before a sea of 1,000 Long Trail Brewing Co. pint glasses..." (Rutland Herald, VT)</td></tr></table>

I need to find & replace everything but http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg with nothing. So at the end, it should just be the url.

The only values that change as we go through the loop are the url and the description within the 2nd set of td tags. The # of characters in the description won't always be the same.

I got Regex Buddy & looked at a reference site for hours last night. Matching a single character seems pretty straightforward but I think it will take a while for me to figure this one out.

I believe there are different types of RegEx. The one I am working with is in Yahoo Pipes, not sure what type it is: http://pipes.yahoo.com/pipes/pipe.edit?_id=436a316234281be629d357bbecae46b1

烟火散人牵绊 2024-08-19 06:05:16

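If your HTML looks exactly like the example above, this should be simple:
img src='([^']*)'
The parentheses () form a capture group: whatever matches inside them is stored in a separate result variable. So look at that captured group rather than at the whole regex match.
[^']* matches any run of characters other than '.

...And I don't think you need an HTML parser for this task. Only if you want to create really robust code :-)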
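As a rough illustration of reading the captured group instead of the whole match, here is a minimal Python sketch. The answer does not name a language, so Python, the sample string (with the description shortened), and the helper name `extract_img_url` are assumptions made for this example only.

```python
import re

# Sample fragment modeled on the question; the description is shortened.
html = (
    "<table border=\"0\"><tr><td>"
    "<img src='http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg'/>"
    "&nbsp;&nbsp;</td><td>Gift of Life Marathon Blood Drive</td></tr></table>"
)

# The pattern from the answer: group 1 captures everything between the quotes.
IMG_SRC = re.compile(r"img src='([^']*)'")

def extract_img_url(fragment):
    """Return the captured src value, or None if the pattern is not found."""
    match = IMG_SRC.search(fragment)
    return match.group(1) if match else None

print(extract_img_url(html))
```

Note that the pattern only works when the attribute is written exactly as `img src='...'` with single quotes and a single space, which is the "looks exactly like the above" condition.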

谜泪 2024-08-19 06:05:16

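I strongly recommend using an HTML parser. HTML is not a regular language, so parsing it with regular expressions is prone to errors, edge cases, and so on.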
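For comparison, a parser-based approach might look like the sketch below. It assumes Python and its standard-library `html.parser` module, neither of which is mentioned in the thread (Yahoo Pipes itself is not scriptable this way); the class name `ImgSrcCollector` and the function `img_sources` are hypothetical.

```python
from html.parser import HTMLParser

class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img .../> are also routed here by default.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)

def img_sources(fragment):
    """Return a list of all img src values found in the fragment."""
    collector = ImgSrcCollector()
    collector.feed(fragment)
    collector.close()
    return collector.sources

sample = (
    "<table border=\"0\"><tr><td>"
    "<img src='http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg'/>"
    "&nbsp;&nbsp;</td><td>Gift of Life Marathon Blood Drive</td></tr></table>"
)
print(img_sources(sample))
```

Unlike the regex, this does not depend on the quote style, attribute order, or spacing inside the tag.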

爱你是孤单的心事 2024-08-19 06:05:16

Pipes is a slightly different beast. Because I am new at this, I ended up creating 3 separate find and replace rules to get the code down to just the essential url:

Replace ^.*= with [nothing]

This leaves:

'http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg'/>   Gift of Life Marathon Blood Drive - "the group stood before a sea of 1,000 Long Trail Brewing Co. pint glasses..." (Rutland Herald, VT)

Replace . with [nothing]

This just removes ' at the beginning.

Replace '.* with [nothing]

This removes everything after jpg beginning with '

End result: http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg

I'm sure there is a way to combine those 3 rules into one but I got errors when I tried to do that. This works and does so consistently.
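One possible way to fold the three rules into a single one is to capture the URL and replace the whole string with the captured group. The sketch below tries this in Python rather than in Pipes, so it is only an approximation: it assumes each rule replaces the first match only, and the function names `three_rules` and `one_rule` are made up for the example. Python writes the captured group as `\1` in the replacement; many other regex flavors write it as `$1`.

```python
import re

item = (
    "<table border=\"0\"><tr><td>"
    "<img src='http://profile.ak.fbcdn.net/object3/686/9/q142163634919_249.jpg'/>"
    "&nbsp;&nbsp;</td><td>Gift of Life Marathon Blood Drive - "
    "\"the group stood before a sea of 1,000 Long Trail Brewing Co. pint glasses...\" "
    "(Rutland Herald, VT)</td></tr></table>"
)

def three_rules(text):
    """The three separate rules from the post, applied in order."""
    text = re.sub(r"^.*=", "", text, count=1)  # greedy: drops everything through the last '='
    text = re.sub(r".", "", text, count=1)     # first match only: drops the leading quote
    text = re.sub(r"'.*", "", text, count=1)   # drops the closing quote and everything after it
    return text

def one_rule(text):
    """A single rule: match the whole string, keep only the captured URL."""
    return re.sub(r"^.*?src='([^']*)'.*$", r"\1", text, count=1)

print(three_rules(item))
print(one_rule(item))
```

The three-rule version depends on the description containing no `=` or `'` characters, while the single rule only depends on the `src='...'` attribute being present.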
