JSONDecodeError: Expecting ',' delimiter in a long JSON string

Posted on 2025-01-23 16:41:06

I'm trying to parse the following JSON, but I always get the error "JSONDecodeError: Expecting ',' delimiter".

Here is the code I'm using:

import requests
from bs4 import BeautifulSoup
import json

page_link="https://www.indeed.com/cmp/Ocean-Beauty-Seafoods/reviews?start=0"
page_response = requests.get(page_link, verify=False)
soup = BeautifulSoup(page_response.content, 'html.parser')
strJson=soup.findAll('script')[16].text.replace("\n    window._initialData=JSON.parse(\'","").replace("');","")
json.loads(strJson)

Many thanks.

Comments (1)

庆幸我还是我 2025-01-30 16:41:06

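The JSON as extracted isn't valid. It is still the body of the JavaScript string literal passed to JSON.parse, so its backslash escapes have not been decoded yet. Try to "preprocess" it first with ast.literal_eval, which decodes them: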

import json
import requests
from ast import literal_eval
from bs4 import BeautifulSoup


page_link = "https://www.indeed.com/cmp/Ocean-Beauty-Seafoods/reviews?start=0"
# verify=False (kept from the question) skips TLS certificate verification
page_response = requests.get(page_link, verify=False)
soup = BeautifulSoup(page_response.content, "html.parser")

# Take the 17th <script> tag and strip the JavaScript wrapper around the payload
strJson = (
    soup.find_all("script")[16]
    .text.replace("\n    window._initialData=JSON.parse('", "")
    .replace("');", "")
)

# Evaluate the text as a Python string literal to decode its backslash escapes
s = literal_eval("'''" + strJson + "'''")
data = json.loads(s)

print(json.dumps(data, indent=4))

Prints:

{
    "breadcrumbs": {
        "breadcrumbs": [
            {
                "name": "Companies",
                "noFollow": false,
                "url": "https://www.indeed.com/companies"
            },
            {
                "name": "Ocean Beauty Seafoods",
                "noFollow": false,
                "url": "https://www.indeed.com/cmp/Ocean-Beauty-Seafoods"
            },
            {
                "name": "Employee Reviews",
                "noFollow": false
            }
        ]
    },
    "companyPageFooter": {
        "enabledToShowUserFeedbackForm": false,
        "encodedFccId": "a9c95405fb0cdb1c",
        "stickyJobsTabLink": {
            "jobsLink": "/cmp/Ocean-Beauty-Seafoods/jobs"
        }
    },
    "companyPageHeader": {
        "auroraLogoUrl": "https://d2q79iu7y748jz.cloudfront.net/s/_squarelogo/64x64/147cafc3914ffb4693dc99df6ad0b169",
        "auroraLogoUrl2x": "https://d2q79iu7y748jz.cloudfront.net/s/_squarelogo/128x128/147cafc3914ffb4693dc99df6ad0b169",
        "brandColor": "#FFFFFF",
        "companyHeader": {
            "name": "Ocean Beauty Seafoods",
            "rating": 3.7,
            "reviewCount": 114,
            "reviewCountFormatted": "114",
            "reviewsUrl": "/cmp/Ocean-Beauty-Seafoods/reviews"
        },

...and so on.
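
To see why the extra step helps without hitting the site, here is a minimal sketch on a made-up script body. SAMPLE_SCRIPT, PATTERN, extract_literal and decode_initial_data are hypothetical names of mine, not part of the page or of any library, and the payload is invented. I am assuming the real script has the same window._initialData=JSON.parse('...'); shape that the replace() calls rely on. The sketch also finds the assignment with a regular expression instead of the hard-coded index 16, which may shift whenever the page changes.

```python
import json
import re
from ast import literal_eval

# Hypothetical stand-in for the <script> body; the real payload is much larger.
SAMPLE_SCRIPT = r"""
    window._initialData=JSON.parse('{"name":"Ocean Beauty Seafoods","note":"say \\"hi\\""}');
"""

# Capture everything between JSON.parse(' and the closing ');
PATTERN = re.compile(r"window\._initialData=JSON\.parse\('(.*)'\);", re.DOTALL)


def extract_literal(script_text):
    """Return the still-escaped body of the JavaScript string literal."""
    match = PATTERN.search(script_text)
    if match is None:
        raise ValueError("window._initialData assignment not found")
    return match.group(1)


def decode_initial_data(script_text):
    """Undo the string-literal escaping, then parse the result as JSON."""
    escaped = extract_literal(script_text)
    # Wrapping in triple quotes lets literal_eval decode the backslash escapes.
    unescaped = literal_eval("'''" + escaped + "'''")
    return json.loads(unescaped)


# Parsing the escaped text directly reproduces the error from the question.
try:
    json.loads(extract_literal(SAMPLE_SCRIPT))
except json.JSONDecodeError as exc:
    print("raw text fails:", exc.msg)

data = decode_initial_data(SAMPLE_SCRIPT)
print(data["note"])
```

On this sample the raw text fails because json.loads reads \\" as an escaped backslash followed by a closing quote, so the string ends early and the parser then expects a comma. The literal_eval trick only holds as long as the payload uses escapes that Python and JavaScript interpret the same way, so treat it as a workaround rather than a general JavaScript parser.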