LinkedIn web scraping problem with Selenium

Posted on 2025-02-11 11:33:22

The element <div class="block mt2"> does not show up when I search the output of print(soup).

# Scrape the data of one LinkedIn profile and write it to a CSV file
wd.get("https://www.linkedin.com/company/pacific-retail-capital-partners/")
soup = BeautifulSoup(wd.page_source, "html.parser")
soup
Output exceeds the size limit. Open the full output data in a text editor
<html class="theme theme--mercado artdeco windows" lang="en"><head>
<script type="application/javascript">!function(i,n){void 0!==i.addEventListener&&void 0!==i.hidden&&(n.liVisibilityChangeListener=function(){i.hidden&&(n.liHasWindowHidden=!0)},i.addEventListener("visibilitychange",n.liVisibilityChangeListener))}(document,window);</script>
<title>LinkedIn</title>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta class="mercado-icons-sprite" content="https://static-exp2.licdn.com/sc/h/7438dbnn8galtczp2gk2s4bgb" id="artdeco-icons/static/images/sprite-asset" name="asset-url"/>
<meta content="" name="description"/>
<meta content="notranslate" name="google"/>
<meta content="voyager-web" name="service"/>

HTML shown in the browser inspector:

<div class="block mt2">
    <div>
    <h1 id="ember30" class="ember-view t-24 t-black t-bold
        full-width" title="Pacific Retail Capital Partners">
      <span dir="ltr">Pacific Retail Capital Partners</span>
    </h1>

Since the HTML document is loaded in our script, we should be able to scrape the company name using the div tag.

info_div = soup.find('div', {'class' : 'block mt2'})
print(info_div)

The output is None; no information is printed.
Can you explain what is happening and what needs to be fixed?

Comments (1)

浮萍、无处依 2025-02-18 11:33:22

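find() with a multi-class string such as 'block mt2' only matches when the class attribute is exactly that string, in that order, so it is a fragile way to match several classes.
Use the CSS selector method select_one() instead to get the company details.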

info_div = soup.select_one('div.block.mt2')
print(info_div.text)

Or, to get only the company name, use this.

  company = soup.select_one('div.block.mt2 h1>span')
  print(company.text)
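
To see how the two lookups treat a multi-class value, here is a minimal sketch that runs against a static copy of the HTML from the question instead of a live LinkedIn page. The `html` string and the variable names are illustrative assumptions; only `find()`, `select_one()` and `get_text()` come from BeautifulSoup itself.

```python
from bs4 import BeautifulSoup

# Static stand-in for the inspected markup (assumption: the live page
# contains this structure once it has finished rendering).
html = """
<div class="block mt2">
  <div>
    <h1 class="ember-view t-24 t-black t-bold full-width" title="Pacific Retail Capital Partners">
      <span dir="ltr">Pacific Retail Capital Partners</span>
    </h1>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A string containing a space is compared against the whole class value.
exact = soup.find("div", {"class": "block mt2"})
# Same classes in a different order: no longer an exact match.
reordered = soup.find("div", {"class": "mt2 block"})
# CSS class selectors do not care about the order of the classes.
by_css = soup.select_one("div.mt2.block")

# Guard against None before reading the text to avoid an AttributeError.
name_tag = soup.select_one("div.block.mt2 h1 > span")
company = name_tag.get_text(strip=True) if name_tag is not None else None

print(exact is not None, reordered is None, by_css is not None, company)
```

If `exact` comes back as None on the real page even though the inspector shows class="block mt2", the element is probably not in `wd.page_source` at all, and changing the selector alone will not help.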

If you still want to use the find() method, try matching on a single class.

info_div = soup.find('div', {'class' : 'mt2'})
print(info_div.text)