Accessing LinkedIn from Python without a browser
I'm writing a command-line application that accesses LinkedIn, using the python-linkedin API.
Things work as I expected, but I have a big gripe with the authentication process. Currently, I need to:
1. Start my application and wait for it to print an authentication URL
2. Go to that URL with my browser
3. Give my blessing to the application and wait for it to redirect me to a URL
4. Extract the access token from the URL
5. Input that access token into my application
6. Do what I need to do with LinkedIn
I don't like doing steps 2 to 5 manually so I would like to automate them. What I was thinking of doing was:
- Use a headless client like mechanize to access the URL from step 1 above
- Scrape the screen and give my blessing automatically (may be required to input username and password -- I know these, so it's OK)
- Wait to be redirected and grab the redirection URL
- Extract the token from the URL
- PROFIT!
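The "extract the token from the URL" step of the plan above can be sketched with the standard library alone. This is a minimal sketch, assuming Python 3 and assuming the redirect carries the value in an `oauth_verifier` query parameter, as OAuth 1.0a callbacks normally do. The function name `extract_verifier` and the example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def extract_verifier(redirect_url):
    """Return the oauth_verifier query parameter from a redirect URL.

    Assumption: the provider appends ?oauth_verifier=... to the callback
    URL, as OAuth 1.0a specifies. Raises ValueError if it is absent.
    """
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("oauth_verifier")
    if not values:
        raise ValueError("no oauth_verifier in redirect URL: %r" % redirect_url)
    return values[0]


if __name__ == "__main__":
    # Hypothetical callback URL, for illustration only.
    url = "http://localhost/callback?oauth_token=abc&oauth_verifier=12345"
    print(extract_verifier(url))
```

Whatever headless client fetches the authorization page, only the final URL it lands on needs to be passed to this function.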
Question time:
- Looking around, this guy right here on SO tried to do something similar but was told that it's impossible. Why?
- Then, this guy here does it in Jython and HtmlUnit. Should be possible with straight Python and mechanize, right?
- Finally, has anybody seen a solution with straight Python and mechanize (or any other headless browser alternative)? I don't want to reinvent the wheel, but will code it up if necessary.
EDIT:
Code to initialize tokens (using the approach of the accepted answer):
    import pickle
    import sys

    # Python 2 excerpt from inside a function body, hence the bare returns.
    api = linkedin.LinkedIn(KEY, SECRET, RETURN_URL)
    result = api.request_token()
    if not result:
        print 'Initialization error:', api.get_error()
        return

    # Manual step: open the URL in a browser, authorize, then paste the verifier.
    print 'Go to URL:', api.get_authorize_url()
    print 'Enter verifier: ',
    verifier = sys.stdin.readline().strip()

    result = api.access_token(verifier=verifier)
    if not result:
        print 'Initialization error:', api.get_error()
        return

    # Pickle needs a binary-mode file; the four values are dumped in a fixed order.
    with open('tokens.pickle', 'wb') as fout:
        for t in (api._request_token, api._request_token_secret,
                  api._access_token, api._access_token_secret):
            pickle.dump(t, fout)
    print 'Initialization complete.'
Code to use tokens:
    # Python 2 excerpt from inside a function body, hence the bare return.
    api = linkedin.LinkedIn(KEY, SECRET, RETURN_URL)
    tokens = tokens_fname()
    try:
        # Load the four values in the same order they were dumped.
        with open(tokens, 'rb') as fin:
            api._request_token = pickle.load(fin)
            api._request_token_secret = pickle.load(fin)
            api._access_token = pickle.load(fin)
            api._access_token_secret = pickle.load(fin)
    except IOError as ioe:
        print ioe
        print 'Please run `python init_tokens.py\' first'
        return
    profiles = api.get_search({'name': name})
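The two snippets above depend on dumping and loading four values in exactly the same order. One possible alternative, shown here only as a sketch, is to pickle a single dict so each value is looked up by name. It assumes Python 3; the helper names `save_tokens` and `load_tokens` and the dict keys are hypothetical and not part of python-linkedin.

```python
import pickle


def save_tokens(path, tokens):
    """Write a dict of token strings to `path` as one pickle (binary mode)."""
    with open(path, "wb") as fout:
        pickle.dump(tokens, fout)


def load_tokens(path):
    """Return the dict written by save_tokens, or None if the file is missing."""
    try:
        with open(path, "rb") as fin:
            return pickle.load(fin)
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    save_tokens("tokens.pickle", {
        "access_token": "TOKEN",
        "access_token_secret": "SECRET",
    })
    print(load_tokens("tokens.pickle"))
```

A caller would copy the loaded values onto the API object before making requests, and fall back to the interactive authorization when `load_tokens` returns `None`.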
Comments (2)
As you are planning on authorizing yourself just once, and then making calls to the API for your own information, I would just manually retrieve your access token rather than worrying about automating it.
The user access token that LinkedIn generates when you authorize a given application is permanent unless you specify otherwise on the authorization screen. All you need to do is generate the authorization screen with your application, go through the process, and on success echo out and store your user access token (token and secret). Once you have those, you can hard-code them into a file, database, etc., and use them when making calls to the API.
It's in PHP, but this demo does basically this. Just modify the demo.php script to echo out your token as needed.
I have not tried it myself, but I believe in theory it should be possible with Selenium WebDriver with PyVirtualDisplay. This idea is described here.