How do I remove HTML-encoded characters from a string?
I have a string that contains some HTML-encoded characters, and I want to remove them:
"<div>Hi All,</div><div class=\"paragraph_break\"><br /></div><div>Starting today we are initiating PoLS.</div><div class=\"paragraph_break\"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class=\"paragraph_break\"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class=\"paragraph_break\"><br /></div><div>All the best!</div><div class=\"paragraph_break\"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>"
What you want can be done in several ways. It may help to look at why you want to do it. Usually when I want to remove encoded HTML, what I really want is to recover the contents of the HTML. Ruby has some modules that make this easy.
Decoding the entities with Ruby's CGI module gives back the original markup, with the div and br tags intact.
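The decoding step can be sketched roughly as follows. This is only an illustration, not the answer's original code: the `encoded` value is a shortened, hypothetical stand-in for the string in the question, and it assumes the entities are the usual `&lt;` / `&gt;` kind that `CGI.unescapeHTML` from the standard library understands.

```ruby
require 'cgi'

# Hypothetical, shortened stand-in for the encoded string in the question.
encoded = '&lt;div&gt;Hi All,&lt;/div&gt;&lt;div class="paragraph_break"&gt;&lt;br /&gt;&lt;/div&gt;'

# CGI.unescapeHTML converts entities such as &lt; and &gt; back into characters.
decoded = CGI.unescapeHTML(encoded)

puts decoded
```

After this step the string holds ordinary HTML markup, which can be rendered or passed to a parser.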
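If I want to take it a step further and remove the tags, retrieving all the text, I parse the decoded markup and keep only its text content.
The output is then just the text of the message, with no markup.
That is usually where I want to end up when I see that sort of string.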
Ruby's CGI makes encoding and decoding HTML easy. The Nokogiri gem makes it easy to remove the tags.
I think this is the easiest way to do it, assuming you want to use the HTML in the string.
If you have assigned that string to a variable s, is this the result you want?
I would suggest: