Beautiful Soup parsing HTML that contains JSON

Posted 2025-02-03 08:53:20 · 787 characters · 3 views · 0 comments

I'm using Python 3 with Beautiful Soup to parse NWS weather alerts, which appear to contain JSON objects, and I got this far. BS outputs this (snippet from the top of the output):

>>> soup.body
<body><p>{
    "@context": [
        "https://geojson.org/geojson-ld/geojson-context.jsonld",
        {
            "@version": "1.1",
            "wx": "https://api.weather.gov/ontology#",
            "@vocab": "https://api.weather.gov/ontology#"
        }
    ],
    "type": "FeatureCollection",
    "features": [
        {
            "id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.957a95b11de1ec54b622b137ccf43a662d44061f.001.1",
            "type": "Feature",
            "geometry": null,
            "properties": ....(snip)

From what I understand, the "@context" tag indicates that the subsequent lines within braces are JSON data; is that correct?

How do I get at the elements inside the square brackets and curly braces?

BS apparently has a JSON parser, but I haven't found any good how-to tutorials for someone who's a noob in this situation.

Pointers would be most welcome.

×纯※雪 2025-02-10 08:53:20

The question should be improved with some additional details. As mentioned in the comments, the response does not look like plain HTML but rather like JSON.

The HTML in your soup is just wrapping added by the 'lxml' parser.
You do not need BeautifulSoup for this task, and no, it is not a JSON parser.
Instead, use .json() on your response (see the docs).

Example

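import requests

# Fetch the response and decode its JSON body into Python dicts and lists
json_data = requests.get('YOUR URL').json()

# Each entry in "features" is a dict; print its "id"
for i in json_data['features']:
    print(i['id'])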

...
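
To answer the part about square and curly braces more directly: the whole body is one JSON object, and "@context" is simply one of its keys (a JSON-LD keyword), not a marker that JSON follows. Once the text is decoded, square brackets become Python lists and curly braces become dicts, so you index them as usual. Here is a minimal offline sketch using the standard json module. The sample_text string is trimmed from the snippet in the question, and the empty "properties" value is only a placeholder assumption, since the original output is cut off there.

```python
import json

# Sample trimmed from the snippet in the question. The "properties" value
# is a placeholder (assumption): the original output is truncated there.
sample_text = """
{
    "@context": [
        "https://geojson.org/geojson-ld/geojson-context.jsonld",
        {
            "@version": "1.1",
            "wx": "https://api.weather.gov/ontology#",
            "@vocab": "https://api.weather.gov/ontology#"
        }
    ],
    "type": "FeatureCollection",
    "features": [
        {
            "id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.957a95b11de1ec54b622b137ccf43a662d44061f.001.1",
            "type": "Feature",
            "geometry": null,
            "properties": {}
        }
    ]
}
"""

# Decode the JSON text into nested Python objects
data = json.loads(sample_text)

# "@context" is a list: index 0 is a string, index 1 is a dict
context = data["@context"]
context_url = context[0]
context_version = context[1]["@version"]

# "features" is a list of dicts; collect the "id" of each one
feature_ids = [feature["id"] for feature in data["features"]]

# JSON null is decoded as Python None
first_geometry = data["features"][0]["geometry"]

print(context_url)
print(context_version)
print(feature_ids)
```

The object returned by .json() on a requests response has the same shape, so the same indexing should apply to json_data in the example above.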
