Remove HTML tags in MongoDB
I am creating a query to extract customer descriptions from MongoDB. Unfortunately, the description is stored as HTML. Is there a way to replace every HTML tag with " ", or to remove the HTML tags entirely?
Below is a sample document:
{
"_id" : ObjectId("61f72aefdc85500a8baa6bb8"),
"CustomerPin" : "22010871",
"CustomerName" : "TestLastName, TestFirstName",
"Age" : 39.0,
"Gender" : "Male",
"Description" : "<p><span>This will be a test description</span><br/></p>",
}
The output should have the "p", "span", and "br" tags removed. Is there a function in MongoDB that removes them all at once, without repeating $project?
This is the expected output:
{
"_id" : ObjectId("61f72aefdc85500a8baa6bb8"),
"CustomerPin" : "22010871",
"CustomerName" : "TestLastName, TestFirstName",
"Age" : 39.0,
"Gender" : "Male",
"Description" : "This will be a test description",
}
Thanks!
Comments (2)
One way to do it is to strip all tags with a regex in the pre hook of the save method.
See hooks here
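A minimal sketch of the regex idea, in plain JavaScript. The function names `stripHtmlTags` and `cleanDescriptionBeforeSave` are made up for illustration, and how the second one gets wired into a pre-save hook depends on the library in use, which the answer does not name. It also assumes the description never contains a literal "<" or ">" outside of a tag.

```javascript
// Hypothetical helper: removes anything that looks like an HTML tag.
// Assumption: no literal "<" or ">" appears outside of tags.
function stripHtmlTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Hypothetical hook body: call this on the document just before it is saved.
function cleanDescriptionBeforeSave(doc) {
  if (typeof doc.Description === "string") {
    doc.Description = stripHtmlTags(doc.Description);
  }
  return doc;
}

const sample = {
  CustomerPin: "22010871",
  Description: "<p><span>This will be a test description</span><br/></p>",
};

console.log(cleanDescriptionBeforeSave(sample).Description);
// prints: This will be a test description
```

Note that this cleans the data at write time, so documents that are already stored with HTML would still need a one-off cleanup or a query-time approach.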
If you use MongoDB 4.2, you have to find the exact regex that extracts the content from the HTML. Below you can find an aggregation pipeline and the regex as well.
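As a rough sketch of what such a pipeline could look like (not necessarily the one this answer had in mind), the stage below uses `$regexFindAll`, available from MongoDB 4.2, to capture every run of text that sits between two tags, then joins the captures with `$reduce`. The collection name `customers` is an assumption, and the regex `>([^<>]+)<` only picks up text that is enclosed by tags on both sides, so text outside any tag would be dropped.

```javascript
// Sketch only: rebuilds Description from the text found between tags.
const pipeline = [
  {
    $addFields: {
      Description: {
        $reduce: {
          // Each match is a document { match, idx, captures }.
          input: {
            $regexFindAll: { input: "$Description", regex: ">([^<>]+)<" },
          },
          initialValue: "",
          // Append the first capture group of each match.
          in: {
            $concat: ["$$value", { $arrayElemAt: ["$$this.captures", 0] }],
          },
        },
      },
    },
  },
];

// In the mongo shell, with a hypothetical collection name:
// db.customers.aggregate(pipeline)
```

Because everything happens in a single `$addFields` stage, there is no need to repeat `$project` once per tag.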