As suggested, you should run this with a debugger, or add some print statements in your code so you can see what is happening at each line/part of the code.
If you do that, you will see that after `link = content['href']` runs, the string `'https://chhouk-krohom.com/%E1%9E%98%E1%9E%A0%E1%9E%B6%E1%9E%9C%E1%9E%B7%E1%9E%97%E1%9E%84%E1%9F%92%E1%9E%82%E1%9F%A1/'` is stored as `link`.
You are then iterating over that string with `for l in link:`. On the first iteration, `l` is `'h'` (the first character in the string), so the code tries to do `s = bs(requests.get('h').content, 'html.parser')`, and `'h'` isn't a valid URL. On the following iterations `l` is `'t'`, then `'t'`, then `'p'`, `'s'`, `':'`, `'/'`, `'/'`, `'c'`, `'h'`, and so on, one character at a time.
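To make the difference concrete, here is a minimal sketch of what a `for` loop sees when it is given a string versus a list that contains that string. The URL is a hypothetical stand-in for the value stored in `link`, and `first_chars` and `links` are names introduced only for this illustration.

```python
# Hypothetical stand-in for the value stored in `link`.
link = "https://example.com/page/"

# Iterating over a string yields its characters one at a time.
first_chars = [l for l in link][:5]
print(first_chars)

# Wrapping the string in a list yields the whole URL, once.
links = [link]
for l in links:
    print(l)
```

The first loop is what the original code was doing; the second is the shape you want, where each item is a complete URL that can be passed to `requests.get`.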
What you want to do is first get the `contents`. From each `contents`, find all the `<a>` tags with an `href` (those are the links). Then iterate through that list of `links`.
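The steps above could be sketched roughly as follows. To keep the example runnable without a network connection or third-party packages, it swaps in the standard library's `html.parser` instead of BeautifulSoup and parses a hypothetical HTML snippet; the `LinkCollector` class, the sample HTML, and the `example.com` URLs are all assumptions made for illustration. With BeautifulSoup, the equivalent of the collection step is `soup.find_all('a', href=True)`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag that has one (illustrative helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip <a> tags that have no href
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


# Hypothetical page content; in the real script this would be the downloaded HTML.
contents = """
<div>
  <a href="/first/">First</a>
  <a name="anchor-only">No link here</a>
  <a href="https://example.com/second/">Second</a>
</div>
"""

collector = LinkCollector("https://example.com")
collector.feed(contents)
links = collector.links

# Each item is now a full URL, not a single character.
for link in links:
    print(link)
```

Inside the final loop, each `link` is a complete URL, so that is where a call such as `requests.get(link)` would go.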