LinkedIn Web Scraping Selenium Issue
The element <div class="block mt2"> does not show up when I search the output of print(soup).
# Scrape the data of 1 LinkedIn profile, write the data to a csv file
from bs4 import BeautifulSoup

# wd is the Selenium WebDriver instance
wd.get("https://www.linkedin.com/company/pacific-retail-capital-partners/")
soup = BeautifulSoup(wd.page_source, "html.parser")
soup
Output exceeds the size limit. Open the full output data in a text editor
<html class="theme theme--mercado artdeco windows" lang="en"><head>
<script type="application/javascript">!function(i,n){void 0!==i.addEventListener&&void 0!==i.hidden&&(n.liVisibilityChangeListener=function(){i.hidden&&(n.liHasWindowHidden=!0)},i.addEventListener("visibilitychange",n.liVisibilityChangeListener))}(document,window);</script>
<title>LinkedIn</title>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta class="mercado-icons-sprite" content="https://static-exp2.licdn.com/sc/h/7438dbnn8galtczp2gk2s4bgb" id="artdeco-icons/static/images/sprite-asset" name="asset-url"/>
<meta content="" name="description"/>
<meta content="notranslate" name="google"/>
<meta content="voyager-web" name="service"/>
HTML from the browser inspector:
<div class="block mt2">
<div>
<h1 id="ember30" class="ember-view t-24 t-black t-bold full-width" title="Pacific Retail Capital Partners">
<span dir="ltr">Pacific Retail Capital Partners</span>
</h1>
Since the HTML document is loaded in our script, we should be able to scrape the name of the company using the div tag.
info_div = soup.find('div', {'class' : 'block mt2'})
print(info_div)
The output is None; nothing is printed.
Can you explain what is happening and what needs to be corrected?
Comments (1)
It seems the find() method does not handle multiple class names well. Use the CSS selector method select_one() to get the company details, or to get only the company name.
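The snippets that went with this comment are not preserved here, so what follows is only a minimal sketch of the select_one() approach, not the commenter's exact code. It assumes the inspected markup from the question is present in the parsed document; the html string is a static stand-in for wd.page_source, and the selector strings are my own guess at what fits that markup.

```python
from bs4 import BeautifulSoup

# Static stand-in for wd.page_source, copied from the inspected HTML in the question.
html = """
<div class="block mt2">
  <div>
    <h1 id="ember30" class="ember-view t-24 t-black t-bold full-width" title="Pacific Retail Capital Partners">
      <span dir="ltr">Pacific Retail Capital Partners</span>
    </h1>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# "div.block.mt2" matches a div that carries both classes, in any order.
info_div = soup.select_one("div.block.mt2")

# Company name only: the h1 nested inside that div.
name_tag = soup.select_one("div.block.mt2 h1")
company_name = name_tag.get_text(strip=True) if name_tag else None
print(company_name)
```

If select_one() also returns None on the real page, the div is probably not in wd.page_source at all (the question notes it is missing from the printed soup), and in that case no selector will find it.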
If you still want to use the find() method, it can be made to work as well.
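The original find() snippet is missing as well, so this is one possible way to do it rather than what the commenter posted. A plain string passed as class_ is compared against the exact value of the class attribute, so it stops matching as soon as the class order differs. The sketch below uses hypothetical markup with the two classes swapped to show that, then passes find() a function that checks for both classes. The helper name has_block_mt2 is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: same classes as in the question, but in a different order.
html = (
    '<div class="mt2 block">'
    '<h1 title="Pacific Retail Capital Partners">'
    '<span dir="ltr">Pacific Retail Capital Partners</span>'
    '</h1></div>'
)
soup = BeautifulSoup(html, "html.parser")

# A plain string is compared against the exact class attribute value,
# so "block mt2" does not match class="mt2 block".
exact = soup.find("div", class_="block mt2")


def has_block_mt2(tag):
    """Return True for a div whose class list contains both 'block' and 'mt2'."""
    return tag.name == "div" and {"block", "mt2"} <= set(tag.get("class", []))


# find() accepts a function that receives each tag and returns True on a match.
info_div = soup.find(has_block_mt2)
company_name = info_div.h1["title"] if info_div else None
print(exact)
print(company_name)
```

With the markup exactly as shown in the question (class="block mt2"), the plain string form would match too; the function form is just less sensitive to class order and extra classes.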