Selenium (Python) WebDriver and JavaScript (noscript)
I am trying to scrape data from a site that publishes students' grades so that I can analyze them.
This is what I tried:

    from selenium import webdriver
    # set chromedriver.exe path
    driver = webdriver.Chrome(executable_path="C:\\chromedriver.exe")
    # set page load timeout
    # launch URL
    driver.get("https://amatti.education.gov.dz/")

The first thing that happens when I run this code is that the site opens normally:
[the site opens normally][1]
After the site opens, the browser is redirected to this page:
[the page it goes to after opening][2]
I noticed the following in the site's HTML. It means that if the browser does not support JavaScript, it is redirected to google.com:

    <noscript>
    <meta http-equiv="refresh" content="0; url=http://www.google.com/" />
    </noscript>

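To check which redirect a page's `<noscript>` fallback declares, the HTML can be parsed with the standard library. This is only a minimal sketch: the names `NoscriptRefreshFinder` and `noscript_redirects` are made up for illustration, and it only handles the simple `content="N; url=..."` form shown above.

```python
from html.parser import HTMLParser


class NoscriptRefreshFinder(HTMLParser):
    """Collects meta-refresh targets that appear inside <noscript> elements."""

    def __init__(self):
        super().__init__()
        self._noscript_depth = 0  # > 0 while we are inside a <noscript>
        self.redirects = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._noscript_depth += 1
        elif tag == "meta" and self._noscript_depth:
            attr = dict(attrs)
            if (attr.get("http-equiv") or "").lower() != "refresh":
                return
            # content looks like "0; url=http://example.com/"
            _, _, target = (attr.get("content") or "").partition(";")
            target = target.strip()
            if target.lower().startswith("url="):
                self.redirects.append(target[len("url="):])

    def handle_endtag(self, tag):
        if tag == "noscript" and self._noscript_depth:
            self._noscript_depth -= 1


def noscript_redirects(html):
    """Return the list of redirect URLs declared in <noscript> meta refreshes."""
    finder = NoscriptRefreshFinder()
    finder.feed(html)
    return finder.redirects


snippet = """
<noscript>
<meta http-equiv="refresh" content="0; url=http://www.google.com/" />
</noscript>
"""
print(noscript_redirects(snippet))
```

With Selenium, the same helper could be given `driver.page_source`. It only reports what the fallback says; it does not tell you whether the fallback is what actually triggered the redirect.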
Is there any way to automate this site?
[1]: https://i.sstatic.net/ay7QJ.png
[2]: https://i.sstatic.net/NWvEa.png
Comments (1)
I found the solution.
The problem comes from WebDriver:
the site detects that a bot is scraping its data.
So I used this argument
and it works fine.
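The argument itself did not survive in this post. As a rough sketch only, assuming it was Chrome's `--disable-blink-features=AutomationControlled` switch (a common choice for hiding the automation flag, not confirmed by this answer), the setup could look like the following. The names `ANTI_DETECTION_ARGS` and `build_driver` are hypothetical.

```python
# Assumption: the answer does not say which argument was used. This switch is a
# common way to stop Chrome from advertising that it is being automated.
ANTI_DETECTION_ARGS = ["--disable-blink-features=AutomationControlled"]


def build_driver(arguments=ANTI_DETECTION_ARGS):
    """Start Chrome with the given command-line arguments applied."""
    # Imported here so the argument list can be inspected without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in arguments:
        options.add_argument(argument)
    # With Selenium 4.6 or later, Selenium Manager locates a matching
    # chromedriver, so no executable path is passed here.
    return webdriver.Chrome(options=options)


# Usage (opens a real browser window):
#   driver = build_driver()
#   driver.get("https://amatti.education.gov.dz/")
#   print(driver.current_url)
#   driver.quit()
```

Whether this is enough depends on how the site detects automation; other checks may still recognize the browser as a bot.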