HTML data extraction

Posted on 2024-09-19 15:45:54

I'm accessing a website and need to extract some data from it. More specifically, from this part:

<input type="hidden" value="1" name="d520783895194bd08750e47c744d553d">

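I need to extract the "name" part, that is, the value of the name attribute. I have heard that regular expressions are not the best solution, so I'd like to ask what the best way is to get at this piece of data.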



坠似风落 2024-09-26 15:45:54

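After parsing the website with NekoHTML or TagSoup (either of which should take care of the fact that your input field tag is not closed), I suggest using an XPath expression: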

//input[@type='hidden'][@value=1]/@name

In Groovy, you would do this with GPath.
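As a rough illustration of the same idea (a lenient HTML parser plus attribute matching, rather than a regular expression), here is a minimal sketch in Python. The language choice is an assumption, since the question does not name one. It swaps in the standard library's html.parser for NekoHTML/TagSoup and plain attribute checks for the XPath expression. The names HiddenInputNames and extract_hidden_names are made up for this example.

```python
from html.parser import HTMLParser


class HiddenInputNames(HTMLParser):
    """Collect the name attribute of <input type="hidden" value="1"> tags."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and does not
        # require the <input> tag to be closed.
        attributes = dict(attrs)
        if (
            tag == "input"
            and attributes.get("type") == "hidden"
            and attributes.get("value") == "1"
            and attributes.get("name")
        ):
            self.names.append(attributes["name"])


def extract_hidden_names(html):
    """Return the names of all matching hidden inputs in document order."""
    parser = HiddenInputNames()
    parser.feed(html)
    parser.close()
    return parser.names


# Sample markup built around the tag from the question (the <form> wrapper is illustrative).
page = (
    '<form>'
    '<input type="hidden" value="1" name="d520783895194bd08750e47c744d553d">'
    '</form>'
)
print(extract_hidden_names(page))
```

The three attribute checks mirror the predicates in the XPath expression above: an input element, type equal to hidden, and value equal to 1, with the name attribute as the result.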
