How do I send special characters by email from a shell script?

Posted 2024-09-07 14:03:26

I have a script that runs on cron that outputs some text which we send to the 'mail' program. The general line is like this:

./command.sh | mail -s "My Subject" [email protected] -- -F "Sender Name" -f [email protected]

The problem is that the text generated by the script contains some special characters (é, ã, ç) because it is not in English. When the email is received, each of these characters is replaced by ??.

I understand that this is most likely because the encoding is not set correctly. What is the easiest way to fix this?

Comments (9)

鹊巢 2024-09-14 14:03:26

My /usr/bin/mail is symlinked to /etc/alternatives/mail which is also symlinked to /usr/bin/bsd-mailx

I had to specify the encoding myself in the mail header. (The -S option is not supported here.)

cat myutf8-file | mail -a "Content-Type: text/plain; charset=UTF-8" -s "My Subject" [email protected]
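
Since the right option depends on which program `mail` actually is, it can help to resolve the symlink chain first. The following is a minimal sketch; `which_mail_impl` is a hypothetical helper name, and it assumes a `readlink` that supports `-f` (GNU coreutils and recent BSD/macOS versions do).

```shell
# Print the real binary behind a command name, following all symlinks.
# which_mail_impl is a hypothetical helper, not part of any mail package.
which_mail_impl() {
  cmd="${1:-mail}"
  # command -v prints the path of the executable found on PATH.
  path="$(command -v "$cmd")" || { echo "not found: $cmd" >&2; return 1; }
  # readlink -f follows every symlink in the chain (assumes -f is supported).
  readlink -f "$path"
}
```

Calling `which_mail_impl mail` on the system described above would be expected to print /usr/bin/bsd-mailx.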

夏末的微笑 2024-09-14 14:03:26

You're right in assuming this is a charset issue. You need to set the appropriate environment variables at the beginning of your crontab.

Something like this should work:

LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8

Optionally use LC_ALL in place of LC_CTYPE.

Reference: http://opengroup.org/onlinepubs/007908799/xbd/envvar.html

Edit: The reason it displays fine when you run it in your shell is probably because the above env vars are set in your shell.

To verify, execute 'locale' in your shell, then compare to the output of a cronjob that runs the same command.
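
As a rough way to approximate that comparison without waiting for a cron run, the sketch below runs `locale charmap` in an almost empty environment. `cron_like_charmap` is a hypothetical helper name, and a real cron environment may differ from what `env -i` produces, so treat the result as a hint only.

```shell
# Print the character map seen in a nearly empty, cron-like environment.
# Optional arguments are VAR=value assignments to add, e.g. LANG=en_US.UTF-8.
# cron_like_charmap is a hypothetical helper name used for illustration.
cron_like_charmap() {
  # env -i starts from an empty environment; PATH is set so locale can be found.
  env -i PATH=/usr/bin:/bin "$@" locale charmap
}
```

Comparing `cron_like_charmap` with `cron_like_charmap LANG=en_US.UTF-8` shows whether setting the variable changes the character map; that only happens if the named locale is installed on the system.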

Re-Edit: Ok, so it's not an env var problem.

I am assuming you're using mailx, as it is the most common nowadays. Its man page says:

The character set for outgoing messages is not necessarily the same as the one used on the terminal. If an outgoing text message contains characters not representable in US-ASCII, the character set being used must be declared within its header. Permissible values can be declared using the sendcharsets variable,

So try adding the following arguments when calling mail:

-S sendcharsets=utf-8,iso-8859-1
陌若浮生 2024-09-14 14:03:26

I've written a bash function to send an email to one or more recipients. The function sends UTF-8 encoded mail and handles UTF-8 characters in the subject and contents by base64-encoding them.

To send a plain text email:

send_email "plain" "[email protected]" "subject" "contents" "[email protected]" "[email protected]" "[email protected]" ...

To send a HTML email:

send_email "html" "[email protected]" "subject" "contents" "[email protected]" "[email protected]" "[email protected]" ...

Here is the function code.

# Send an email to one or more recipients.
#
# @param string $content_type Email content MIME subtype: 'html' or 'plain'.
# @param string $from_address Sender email.
# @param string $subject Email subject.
# @param string $contents Email contents.
# @param array $recipients Email recipients.
function send_email() {
  # At least one recipient is required; return (not exit) so the caller's shell survives.
  [[ ${#} -lt 5 ]] && return 1

  local content_type="${1}"
  local from_address="${2}"
  local subject="${3}"
  local contents="${4}"

  # Remove all args but recipients.
  shift 4

  # Use printf for the subject: a here-string would append a newline to the encoded text.
  local encoded_contents encoded_subject recipient
  encoded_contents="$(printf '%s\n' "${contents}" | base64)"
  encoded_subject="=?utf-8?B?$(printf '%s' "${subject}" | base64 --wrap=0)?="

  # Quote "$@" so each recipient stays one word.
  for recipient in "${@}"; do
    if [[ -n "${recipient}" ]]; then
      sendmail -f "${from_address}" "${recipient}" \
        <<< "Subject: ${encoded_subject}
MIME-Version: 1.0
From: ${from_address}
To: ${recipient}
Content-Type: text/${content_type}; charset=\"utf-8\"
Content-Transfer-Encoding: base64
Content-Disposition: inline

${encoded_contents}"
    fi
  done

  return 0
} # send_email()
夜深人未静 2024-09-14 14:03:26

Just to add to KumZ's answer:
if you need to specify more headers with the -a switch, simply repeat it, like this (note that -a is used more than once).

cat /path/to/file | mail -s "Some subject" [email protected] -a "From: Human Name <[email protected]>" -a "Content-Type: text/plain; charset=UTF-8"
明明#如月 2024-09-14 14:03:26

You can use the sendmail command directly, without the mail wrapper/helper.
This lets you generate all the headers required for a "raw" UTF-8 body
(UTF-8 is mentioned in the asker's comments).

WARNING-1:
Non-7-bit (non-ASCII) characters in headers (e.g. Subject:, From:, To:) require special encoding.
WARNING-2:
sendmail may break long lines (>990 bytes).

SENDER_ADDR=[email protected]
SENDER_NAME="Sender Name"
RECIPIENT_ADDR=[email protected]
(
# BEGIN of mail generation chain of commands
# "HERE" document with all headers and the blank line separating headers from body
cat << END
Subject: My Subject
From: $SENDER_NAME <$SENDER_ADDR>
To: $RECIPIENT_ADDR
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

END
# custom script to generate email body
./command.sh
# END   of mail generation chain of commands
) | /usr/sbin/sendmail -i -f"$SENDER_ADDR" -F"$SENDER_NAME" "$RECIPIENT_ADDR"
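
For WARNING-1, one common approach is to wrap a non-ASCII header value in an RFC 2047 base64 "encoded-word". The sketch below assumes the input is already UTF-8 and that a `base64` utility is available; `encode_header` is a hypothetical helper name. It does not split long values, so a long subject could exceed the 75-character limit that RFC 2047 places on a single encoded-word.

```shell
# Wrap a UTF-8 string in an RFC 2047 base64 encoded-word.
# encode_header is a hypothetical helper name; the input is assumed to be UTF-8.
encode_header() {
  # printf '%s' avoids a trailing newline; tr removes any line wrapping
  # that base64 may add to long output.
  printf '=?UTF-8?B?%s?=' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}
```

In the script above it could be used as `Subject: $(encode_header "My Subject")` inside the here-document.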
放血 2024-09-14 14:03:26

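RFC 2045, rule (5) (Soft Line Breaks): the Quoted-Printable encoding REQUIRES that encoded lines be no more than 76 characters long. Encoded header values have a similar length limit, so the bash script below splits the subject into several short base64 encoded-words: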

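#!/bin/bash
# Encode a UTF-8 string as one or more base64 encoded-words (=?UTF-8?B?...?=),
# starting a new continuation line before an encoded-word gets too long.
subject_encoder(){
  # xxd prints the input as hex, 3 bytes (6 hex digits) per line.
  # -Wposix appears to be needed so that gawk treats "0x.." strings as numbers.
  echo -n "$1" | xxd -ps -c3 |awk -Wposix 'BEGIN{
    BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    printf " =?UTF-8?B?"; bli=8
  }
  function encodeblock (strin){
    b1=sprintf("%d","0x" substr(strin,1,2))
    b2=sprintf("%d","0x" substr(strin,3,2))
    b3=sprintf("%d","0x" substr(strin,5,2))
    # int() keeps the table index integral regardless of how awk rounds.
    o=substr(BASE64,int(b1/4) + 1,1) substr(BASE64,(b1%4)*16 + int(b2/16) + 1,1)
    # len counts hex digits (2, 4 or 6), so pad when fewer than 3 bytes remain.
    len=length(strin)
    if(len>2) o=o substr(BASE64,(b2%16)*4 + int(b3/64) + 1,1); else o=o"="
    if(len>4) o=o substr(BASE64,b3%64 +1 ,1); else o=o"="
    return o
  }{
    bs=encodeblock($0)
    bl=length(bs)
    if((bl+bli)>64){
      printf "?=\n =?UTF-8?B?"
      bli=bl
    }
    printf "%s", bs
    bli+=bl
  }END{
    printf "?=\n"
  }'
}
SUBJECT="Relatório de utilização"
SUBJECT=$(subject_encoder "${SUBJECT}")
echo '<html>test</html>'| mail -a "Subject:${SUBJECT}" -a "MIME-Version: 1.0" -a "Content-Type: text/html; charset=UTF-8" [email protected]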
旧城空念 2024-09-14 14:03:26

This is probably not a command line issue, but a character set problem. Usually when sending E-Mails, the character set will be iso-8859-1. Most likely the text you are putting into the process is not iso-8859-1 encoded. Check out what the encoding is of whatever data source you are getting the text from.

Obligatory "good reading" link: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Re your update: In that case, if you enter the special characters manually, your terminal may be using UTF-8 encoding. You should be able to convert the file's character set using iconv for example. The alternative would be to tell mail to use UTF-8 encoding, but IIRC that is not entirely trivial.
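
The iconv conversion mentioned above could look like the following minimal sketch. `to_utf8` is a hypothetical helper name, and the source charset passed to it (ISO-8859-1 in the usage example) is an assumption that has to match the actual data.

```shell
# Convert text on stdin from the given charset to UTF-8.
# to_utf8 is a hypothetical helper; $1 is the source charset, e.g. ISO-8859-1.
to_utf8() {
  iconv -f "$1" -t UTF-8
}
```

For example, `./command.sh | to_utf8 ISO-8859-1` would produce UTF-8 output that can then be piped to mail together with a matching charset declaration.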

一张白纸 2024-09-14 14:03:26

With the sendemail tool, use the option -o message-charset="utf-8", like this:

sendemail -f your_email -t destination_email -o message-charset="utf-8" -u "Subject" -m "Message" -s smtp-mail.outlook.com:587 -xu your_mail -xp your_password
相守太难 2024-09-14 14:03:26

I'm a bit late, but none of the previous solutions worked for me.

Locating the mail command (CentOS):

# locate mail | grep -v www | grep -v yum | grep -v share
# ls -l /bin/mail
lrwxrwxrwx. 1 root root 22 jul 21  2016 /bin/mail -> /etc/alternatives/mail
# ls -l /etc/alternatives/mail
lrwxrwxrwx. 1 root root 10 jul 21  2016 /etc/alternatives/mail -> /bin/mailx
# ls -l /bin/mailx
-rwxr-xr-x. 1 root root 390744 dic 16  2014 /bin/mailx

So the mail command is in fact mailx. Knowing this helped with the search that finally took me to this answer on Unix & Linux Stack Exchange, which states:

Mailx expects input text to be in Unix format, with lines separated by newline (^J, \n) characters only. Non-Unix text files that use carriage return (^M, \r) characters in addition will be treated as binary data; to send such files as text, strip these characters e. g. by tr -d '\015'

That passage is from the man page. The answer continues:

If there are other control characters in the file they will result on mailx treating the data as binary and will then attach it instead of using it as the body. The following will strip all special characters and place the contents of the file into the message body

So the solution is to use the tr command to remove those special characters. Something like this:

./command.sh \
| tr -cd "[:print:]\n" \
| mail -s "My Subject" [email protected] -- -F "Sender Name" -f [email protected]

I've used this solution with my own command:

grep -v "pattern" "$file" \
| grep -v "another pattern" \
| ... several greps more ... \
| tr -cd "[:print:]\n" \
| mail -s "$subject" -a "$file" -r "$sender" "$destination_email"
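
Depending on the locale and the tr implementation, `tr -cd "[:print:]\n"` may also delete accented characters, because their UTF-8 bytes are not printable ASCII. A narrower variant, sketched below under that assumption, removes only ASCII control characters (including carriage returns) and leaves every other byte alone. `strip_controls` is a hypothetical helper name.

```shell
# Delete ASCII control characters except tab and newline from stdin.
# strip_controls is a hypothetical helper name used for illustration.
strip_controls() {
  # Octal ranges: 000-010 and 013-037 cover the control characters apart from
  # tab (011) and newline (012); 177 is DEL. LC_ALL=C makes tr work on bytes,
  # so multi-byte UTF-8 characters pass through unchanged.
  LC_ALL=C tr -d '\000-\010\013-\037\177'
}
```

It would replace the tr stage in the pipelines above, for example `./command.sh | strip_controls | mail ...`.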