Translating a URL with Google Translate from a Python script

Posted 2024-11-04 23:54:54

I'm trying to use google translate from a python script:

#!/usr/bin/env python
from urllib2 import urlopen
from urllib import urlencode

base_url = "http://www.google.com/translate?"
params = (('langpair','en|es'), ('u','http://asdf.com'),)
url = base_url+urlencode(params)
print "Encoded URL: %s" % url 
print urlopen(url).read()

When I run it, I get an HTTP 403 error:

# ./1.py 
Encoded URL: http://www.google.com/translate?langpair=en%7Ces&u=http%3A%2F%2Fasdf.com
Traceback (most recent call last):
...
urllib2.HTTPError: HTTP Error 403: Forbidden

However, the same URL works fine when accessed from a browser. Can anyone spot the error? Or does Google simply not allow this type of usage?

Comments (6)

梨涡 2024-11-11 23:54:54

If Google doesn't let you do this, you could programmatically translate the website's source via Google's APIs.

I wrote a function for this a little while back:

import json
import urllib.parse
import urllib.request

def translate(text, src='', to='en'):
    parameters = {'langpair': '{0}|{1}'.format(src, to), 'v': '1.0'}
    translated = ''

    # send the text in 4500-character chunks so long inputs are not rejected
    for chunk in (text[index:index + 4500] for index in range(0, len(text), 4500)):
        parameters['q'] = chunk
        response = json.loads(urllib.request.urlopen(
            'http://ajax.googleapis.com/ajax/services/language/translate',
            data=urllib.parse.urlencode(parameters).encode('utf-8')).read().decode('utf-8'))

        try:
            translated += response['responseData']['translatedText']
        except (KeyError, TypeError):
            # skip chunks the service could not translate
            pass

    return translated
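
A quick usage sketch (hypothetical: the function targets the old AJAX language API, which may no longer respond):

# hypothetical call to the function above
print(translate('Hello, world!', src='en', to='es'))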
半世晨晓 2024-11-11 23:54:54

You should be using the Google API. I found and tested this code; it works:

#!/usr/bin/env python
from urllib2 import urlopen
from urllib import urlencode
import sys

lang1 = sys.argv[1]
lang2 = sys.argv[2]
langpair = '%s|%s' % (lang1, lang2)
text = ' '.join(sys.argv[3:])
base_url = 'http://ajax.googleapis.com/ajax/services/language/translate?'
params = urlencode((('v', 1.0), ('q', text), ('langpair', langpair),))
url = base_url + params
content = urlopen(url).read()
start_idx = content.find('"translatedText":"') + 18
translation = content[start_idx:]
end_idx = translation.find('"}, "')
translation = translation[:end_idx]
print translation
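
Assuming the script above is saved as translate.py and made executable, a hypothetical invocation would be:

./translate.py en es Hello world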

source

萌逼全场 2024-11-11 23:54:54

Your problem is that you send no headers (which tell Google what your browser and compatibility are).

I ran into this error before when I made my own Google Translate API wrapper; you can find it here: https://github.com/mouuff/Google-Translate-API
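
A minimal sketch of the asker's request with a browser-style User-Agent header added (the header value is illustrative, and whether Google still accepts the request is not guaranteed):

#!/usr/bin/env python
# python 2, matching the question's code
from urllib2 import Request, urlopen
from urllib import urlencode

base_url = "http://www.google.com/translate?"
params = (('langpair', 'en|es'), ('u', 'http://asdf.com'),)
# a browser-like User-Agent so the request is not rejected outright
req = Request(base_url + urlencode(params),
              headers={'User-Agent': 'Mozilla/5.0'})
print urlopen(req).read()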

绿阴红影里的.如风往事 2024-11-11 23:54:54

You can use much better Python code for translating with Google:

SOURCE: https://neculaifantanaru.com/en/python-code-text-google-translate-website-translation-beautifulsoup-new.html

from bs4 import BeautifulSoup
from bs4.formatter import HTMLFormatter
import requests
import sys
import os

class UnsortedAttributes(HTMLFormatter):
    def attributes(self, tag):
        for k, v in tag.attrs.items():
            yield k, v

files_from_folder = r"c:\Folder2"
use_translate_folder = True
destination_language = 'vi'  # change the target language here
extension_file = ".html"
directory = os.fsencode(files_from_folder)

def translate(text, target_language):
    url = "https://translate.google.com/translate_a/single"
    headers = {
        "Host": "translate.google.com",
        "Accept": "*/*",
        "Cookie": "",
        "User-Agent": "GoogleTranslate/5.9.59004 (iPhone; iOS 10.2; ja; iPhone9,1)",
        "Accept-Language": "fr",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "keep-alive",
        }
    sentence = text
    params = {
        "client": "it",
        "dt": ["t", "rmt", "bd", "rms", "qca", "ss", "md", "ld", "ex"],
        "otf": "2",
        "dj": "1",
        "q": sentence,
        "hl": "ja",
        "ie": "UTF-8",
        "oe": "UTF-8",
        "sl": "en",
        "tl": target_language,
        }

    res = requests.get(
        url=url,
        headers=headers,
        params=params,
        )

    res = res.json()

    paragraph = ''
    for i in range(0, len(res["sentences"])):
        paragraph += res["sentences"][i]["trans"]

    return paragraph

def recursively_translate(node):
    for x in range(len(node.contents)):
        if isinstance(node.contents[x], str):
            if node.contents[x].strip() != '':
                try:
                    node.contents[x].replaceWith(translate(text=node.contents[x], target_language=destination_language))
                except Exception:
                    # keep the original text if the request fails
                    pass
        elif node.contents[x] is not None:
            recursively_translate(node.contents[x])

for file in os.listdir(directory):
    filename = os.fsdecode(file)
    print(filename)
    if filename == 'y_key_e479323ce281e459.html' or filename == 'directory.html':  # ignore these 2 files
        continue
    if filename.endswith(extension_file):
        with open(os.path.join(files_from_folder, filename), encoding='utf-8') as html:
            soup = BeautifulSoup('<pre>' + html.read() + '</pre>', 'html.parser')
            for title in soup.findAll('title'):
                recursively_translate(title)

            for meta in soup.findAll('meta', {'name':'description'}):
                try:
                    meta['content'] = translate(text=meta['content'], target_language=destination_language)
                except:
                    pass

            # each block below followed the same pattern, so it is expressed
            # once: translate a tag only if it sits between the
            # <!-- ARTICOL START --> and <!-- ARTICOL FINAL --> comments
            targets = [
                ('h1', {'itemprop': 'name', 'class': 'den_articol'}),
                ('p', {'class': 'text_obisnuit'}),
                ('p', {'class': 'text_obisnuit2'}),
                ('span', {'class': 'text_obisnuit2'}),
                ('li', {'class': 'text_obisnuit'}),
                ('a', {'class': 'linkMare'}),
                ('h4', {'class': 'text_obisnuit2'}),
                ('h5', {'class': 'text_obisnuit2'}),
                ('h1', {'itemprop': 'name', 'class': 'den_webinar'}),
            ]
            for tag_name, attrs in targets:
                for tag in soup.findAll(tag_name, attrs):
                    # recompute on every pass: translating mutates the tree
                    begin_comment = str(soup).index('<!-- ARTICOL START -->')
                    end_comment = str(soup).index('<!-- ARTICOL FINAL -->')
                    if begin_comment < str(soup).index(str(tag)) < end_comment:
                        recursively_translate(tag)

        print(f'{filename} translated')
        soup = soup.encode(formatter=UnsortedAttributes()).decode('utf-8')
        new_filename = f'{filename.split(".")[0]}_{destination_language}.html'
        if use_translate_folder:
            translated_dir = os.path.join(files_from_folder, 'translated')
            os.makedirs(translated_dir, exist_ok=True)  # create the output folder if missing
            with open(os.path.join(translated_dir, new_filename), 'w', encoding='utf-8') as new_html:
                new_html.write(soup[5:-6])  # drop the <pre>/</pre> wrapper added when the file was read
        else:
            with open(os.path.join(files_from_folder, new_filename), 'w', encoding='utf-8') as html:
                html.write(soup[5:-6])
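
A note on usage, under the assumptions baked into the script: point files_from_folder at a folder of .html files, and translated copies are written either next to the originals or into a translated subfolder. The tag names, classes, and ARTICOL START/FINAL markers are specific to the source site, so they would need adjusting for other pages.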
那些过往 2024-11-11 23:54:54

# if you use a socks5 proxy (python 3.11; requires the PySocks package)
import socket
import socks

# route every new socket through the local socks5 proxy
socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 10808)
socket.socket = socks.socksocket  # make connections through socks
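
After the monkey-patch above, standard-library HTTP calls are tunnelled through the proxy; a hypothetical check:

# verify that requests now flow through the socks5 proxy
import urllib.request
print(urllib.request.urlopen('https://translate.google.com').status)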
