Hitting the rate limit after a single search call to the Twitter API
I'm running a full-archive search for tweets containing a keyword, restricted to a list of users. I loop over the user IDs, issuing one search query per user for the keyword 'republican'. The script gets through a decent number of users before hitting a rate limit, but from then on every additional user search triggers another rate-limit wait instead of the quota ever refreshing completely. Why does a single search call force a full wait once I reach that point, and what can I do to avoid it?
import csv
import datetime

import pandas as pd
from twarc import Twarc2
from twarc.expansions import ensure_flattened

df = pd.read_csv("RealIdMasterList.csv")
id_str_df = df['id_str'].tolist()
theta_df = df['theta'].tolist()
accounts_followed_df = df['accounts_followed'].tolist()

# Your bearer token here
t = Twarc2(bearer_token="<token>")

# Start and end times must be in UTC
start_time = datetime.datetime(2010, 3, 21, 0, 0, 0, 0, datetime.timezone.utc)
end_time = datetime.datetime(2022, 3, 22, 0, 0, 0, 0, datetime.timezone.utc)

# CSV column names matching the writerow() call below (the constant-0
# column is labeled 'label' here as a guess)
header = ['id', 'author_id', 'text', 'theta', 'label', 'accounts_followed', 'created_at']

pings = 0
with open('realtweets.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    for i in range(len(id_str_df)):
        if pings > 10:
            break
        userid = str(int(id_str_df[i]))
        print(userid)
        q = "republican lang:en -is:retweet from:" + userid
        try:
            # search_all returns a generator of result pages; max_results is
            # tweets per page (500 max for full-archive search). list() pulls
            # every page before the loop below runs.
            search_results = list(t.search_all(query=q, start_time=start_time,
                                               end_time=end_time, max_results=100))
            if search_results:
                pings += 1
                # Only keep the first page of results for this user.
                for tweet in ensure_flattened(search_results[0]):
                    writer.writerow([tweet['id'], tweet['author_id'], tweet['text'],
                                     theta_df[i], 0, accounts_followed_df[i],
                                     tweet['created_at']])
        except Exception as e:
            print(e)
output:
757877990969659520
2848529739
rate limit exceeded: sleeping 909.0393960475922 secs
902416406771244928
rate limit exceeded: sleeping 909.7210428714752 secs
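Edit: one workaround I've been trying (a minimal untested sketch; it assumes each page yielded by search_all costs one API request, and that the full-archive endpoint allows roughly 300 requests per 15-minute window plus 1 request per second) is to pull only the first page per user instead of materializing the whole generator with list(), so a prolific user can't burn many requests in a single loop iteration:

import datetime

from twarc import Twarc2
from twarc.expansions import ensure_flattened

t = Twarc2(bearer_token="<token>")
start_time = datetime.datetime(2010, 3, 21, tzinfo=datetime.timezone.utc)
end_time = datetime.datetime(2022, 3, 22, tzinfo=datetime.timezone.utc)

def first_page_tweets(userid):
    # Same per-user query as above.
    q = "republican lang:en -is:retweet from:" + userid
    pages = t.search_all(query=q, start_time=start_time,
                         end_time=end_time, max_results=100)
    # next() fetches exactly one page (one request); dropping the generator
    # here means no further pages are requested for this user.
    page = next(pages, None)
    return ensure_flattened(page) if page is not None else []

With list(t.search_all(...)), the generator keeps paginating until a user's results run out, so the request count per user is unbounded; capping at one page keeps it at one request per user.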
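Another idea (also an untested sketch; it assumes full-archive queries can be up to 1024 characters, and the batched_queries helper below is something I made up for illustration) is to OR several from: operators into one query, so a batch of users costs one request instead of one request each:

def batched_queries(user_ids, batch_size=20):
    # Each query covers batch_size users; 20 "from:" clauses of roughly 25
    # characters apiece stay comfortably under the assumed 1024-char limit.
    for start in range(0, len(user_ids), batch_size):
        froms = " OR ".join("from:" + u for u in user_ids[start:start + batch_size])
        yield "republican lang:en -is:retweet (" + froms + ")"

Tweets returned by a batched query would then have to be matched back to the right theta/accounts_followed row via tweet['author_id'] rather than loop position.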