LinkedIn Webscraping Selenium problem

Posted 2025-02-11 11:33:22, word count 1761, views 2, comments 0

The element <div class="block mt2"> does not show up when I search the output of print(soup).

# Scrape the data of one LinkedIn profile and write it to a CSV file
# (wd is assumed to be an already-created Selenium WebDriver)
from bs4 import BeautifulSoup

wd.get("https://www.linkedin.com/company/pacific-retail-capital-partners/")
soup = BeautifulSoup(wd.page_source, "html.parser")
soup

Output exceeds the size limit. Open the full output data in a text editor
<html class="theme theme--mercado artdeco windows" lang="en"><head>
<script type="application/javascript">!function(i,n){void 0!==i.addEventListener&&void 0!==i.hidden&&(n.liVisibilityChangeListener=function(){i.hidden&&(n.liHasWindowHidden=!0)},i.addEventListener("visibilitychange",n.liVisibilityChangeListener))}(document,window);</script>
<title>LinkedIn</title>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta class="mercado-icons-sprite" content="https://static-exp2.licdn.com/sc/h/7438dbnn8galtczp2gk2s4bgb" id="artdeco-icons/static/images/sprite-asset" name="asset-url"/>
<meta content="" name="description"/>
<meta content="notranslate" name="google"/>
<meta content="voyager-web" name="service"/>

HTML shown in the browser inspector:

<div class="block mt2">
    <div>
    <h1 id="ember30" class="ember-view t-24 t-black t-bold
        full-width" title="Pacific Retail Capital Partners">
      <span dir="ltr">Pacific Retail Capital Partners</span>
    </h1>

Since the HTML document is loaded in our script, we should be able to scrape the company name using the div tag.

info_div = soup.find('div', {'class' : 'block mt2'})
print(info_div)

The output is None; no information is printed.
Can you explain what is happening and what needs to be corrected?

浮萍、无处依 2025-02-18 11:33:22

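It seems the find() method does not handle multiple class names well: a string such as 'block mt2' is compared against the exact value of the class attribute (same classes, same order), so the lookup is fragile.
Use select_one() with the following CSS selector to get the company details.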

info_div = soup.select_one('div.block.mt2')
print(info_div.text)

Or, to get only the company name, use this:

company = soup.select_one('div.block.mt2 h1>span')
print(company.text)

If you still want to use the find() method, try searching on a single class:

info_div = soup.find('div', {'class' : 'mt2'})
print(info_div.text)
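
To compare the three lookups above side by side, here is a minimal sketch that runs against an inline HTML string instead of wd.page_source. The second div (classes in a different order plus an extra class) and the company_name() helper are hypothetical additions for illustration; they are not part of the LinkedIn page or of the original code. It assumes a recent beautifulsoup4 release, where select() is available.

```python
from bs4 import BeautifulSoup

# Inline stand-in for wd.page_source. The second div is a hypothetical
# variant: same two classes, different order, plus one extra class.
HTML = """
<div class="block mt2">
  <div>
    <h1 class="ember-view t-24 t-black t-bold full-width"
        title="Pacific Retail Capital Partners">
      <span dir="ltr">Pacific Retail Capital Partners</span>
    </h1>
  </div>
</div>
<div class="mt2 extra block"><h1><span>Variant Co</span></h1></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A multi-class string is compared with the exact class attribute value,
# so only the first div matches here.
exact = soup.find_all("div", {"class": "block mt2"})

# A CSS selector requires both classes, in any order, and allows extras.
css = soup.select("div.block.mt2")

# A single class name matches every div that carries that class.
single = soup.find_all("div", {"class": "mt2"})

print(len(exact), len(css), len(single))


def company_name(parsed):
    """Return the company name, or None when the element is absent.

    Hypothetical helper: it guards against select_one() returning None,
    which would otherwise raise AttributeError on `.text`.
    """
    span = parsed.select_one("div.block.mt2 h1 > span")
    return span.get_text(strip=True) if span is not None else None


print(company_name(soup))
print(company_name(BeautifulSoup("<html></html>", "html.parser")))
```

If company_name() returns None for the real page, the element is probably not in the HTML that was parsed at all, which matches the question's remark that it does not appear in the printed soup. In that case the selector is not the cause.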

About the author

冷情
