How can we reliably trap Sybase bcp errors?
We're using named pipes with Sybase bcp so that we can compress output on-the-fly.
The Sybase bcp utility does not return much information in its exit code.
Sybase documentation directs the user to inspect error messages written by the process.
This is a paraphrase of the error-handling idiom we use; some error checking in the non-bcp parts of the script has been removed to shorten the example.
while :
do
    {
        rm -f $fifo
        mkfifo $fifo
        cat $fifo &
        CatPid=$!
        bcp $db.$owner.$table out $fifo -c $db_creds >$log 2>&1
        grep -qi deadlock $log || break
        # Must have been a deadlock, clean up.
        kill $CatPid
    } > $output
done
Basically, if the word 'deadlock' appears in bcp output messages, we try again.
Two questions:
- Does this approach look reasonable?
- What bcp errors other than deadlock might we need to worry about?
I'm specifically interested in detecting transient bcp errors, where we can try again.
We use a compound statement so that we can insert headers and footers around the bcp data before the compression, but I've omitted that to simplify the example.
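For illustration, a minimal sketch of what that header/footer wrapping and on-the-fly compression might look like; gzip and the $header/$footer variables are placeholders here, not our real script.

# Sketch only: wrap the bcp data with a header and footer and compress on the fly.
# $header and $footer are illustrative placeholders.
rm -f $fifo
mkfifo $fifo
{
    printf '%s\n' "$header"   # header record(s) before the bcp data
    cat $fifo                 # stream the bcp output from the named pipe
    printf '%s\n' "$footer"   # footer record(s) after the bcp data
} | gzip -c > $output.gz &
ReaderPid=$!
bcp $db.$owner.$table out $fifo -c $db_creds > $log 2>&1
wait $ReaderPid               # make sure the reader/compressor has drained the pipe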
I had the idea of using a named pipe for bcp out to compress data from Sybase ASE to a file, and then loading it into Sybase IQ with the LOAD TABLE statement. Unfortunately there was a big performance loss: LOAD TABLE from a named pipe was 10x slower than LOAD TABLE from a file on HP-UX :-( I vote for implementing a simple compression algorithm directly in the OC utils (bcp, isql).
So all you need is just a reliable fail-fast bcp. bcp in some Sybase versions has a command argument controlling the maximum error count.
1) If you set the error count to 1, then it will fail more reliably.
2) The problem then boils down to trapping the exit code of the bcp process, launched in the background with &. I don't know exactly what shell syntax should be used for this, but there might be some common shell technique for it.
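A minimal sketch of one such technique, assuming a Bourne-style shell: background the bcp, remember its PID, and use the shell's wait builtin to recover its exit status afterwards (variable names and the gzip reader are illustrative).

# Sketch: recover the exit status of a backgrounded bcp with wait.
rm -f $fifo
mkfifo $fifo
# -m 1: abort after the first error (the max-error-count argument mentioned in point 1)
bcp $db.$owner.$table out $fifo -c $db_creds -m 1 > $log 2>&1 &
BcpPid=$!
gzip -c < $fifo > $output.gz      # foreground reader drains the pipe
wait $BcpPid                      # returns bcp's exit status
BcpStatus=$?
if [ $BcpStatus -ne 0 ]; then
    echo "bcp exited with status $BcpStatus" >&2
fi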
This one is easy.

Does this approach look reasonable?

Not really. First, the shell script is not very good; it does not need all that work, but I will leave that alone, as that is not the question.

Second, bcp will not deadlock, even on an active database (unless you are doing something very strange in the database, or running multiple parallel bcp streams); it waits for shared locks to clear, so there is no need to check for that.

Third, bcp provides complete and full error messages. Use the -e bcp_err_file invocation parameter. Then grep ... bcp_err_file for errors or patterns ("^[Ee]rr" and "^Msg" are typical). I trap separately for errors, exceptions, and other messages.

Fourth, I would never retry forever like that within a shell script: potential infinite loop, waste of resources. Have it execute once and produce "success" xor "failure" plus the list of errors. Any loop should be only over the list of tables to be exported.
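A minimal sketch of the error-file check just described; the output file names and grep patterns are illustrative only.

# Sketch: route bcp messages to a dedicated error file, then grep it.
bcp $db.$owner.$table out $datafile -c $db_creds -e $bcp_err_file > $bcp_log_file 2>&1

# Trap errors and exceptions separately from informational/progress messages.
if [ -s $bcp_err_file ] && grep -q -e '^[Ee]rr' -e '^Msg' $bcp_err_file; then
    echo "bcp reported errors for $table; see $bcp_err_file" >&2
    exit 1
fi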
bcp is written as a proper Unix utility/command. If you do not specify an error file, sure, all error messages go to $stdout and they are mixed up with progress messages. You can trap for errors in that stream, but that is legless; specify a separate error file. It is normal to capture $stdout to a bcp_log_file, but that is separate from the bcp_err_file.

Inspecting exit status on Unix is a different thing. If bcp ran successfully (whether it produced error messages or not), it exits as "success"; you will get a non-zero exit status only if the Unix program itself fails. It will tolerate any number of errors, unless you limit that with -m max_errors.

What bcp errors other than deadlock might we need to worry about?

Any and all errors; they cannot be predicted (it is an online server with fixed resources). Capture all of them, and then inspect the bcp_err_file. You can automate the inspection via grep, as detailed above.

I'm specifically interested in detecting transient bcp errors, where we can try again.

No problem at all. Detailed above. Transient errors are few and far between.

You should also worry about hard errors and resource errors, and not retry those (because they will fail again).

Most of us worry about errors that result in missing rows, which means the bcp_data_file is incomplete or unusable.
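A hedged sketch of that run-once, success-xor-failure shape, combining the Unix exit status with inspection of the error file; file names and the deadlock pattern are illustrative.

# Sketch: execute once, then report success or failure (no retry loop).
bcp $db.$owner.$table out $datafile -c $db_creds -e $bcp_err_file > $bcp_log_file 2>&1
bcp_status=$?

if [ $bcp_status -ne 0 ]; then
    echo "FAILURE: bcp itself failed (exit status $bcp_status)" >&2
    exit 1
fi

if [ -s $bcp_err_file ]; then
    if grep -qi deadlock $bcp_err_file; then
        echo "FAILURE: transient error (deadlock); the export can be rerun" >&2
    else
        echo "FAILURE: hard or resource error; rerunning will likely fail again" >&2
    fi
    exit 1
fi

echo "success"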
Is that really going to do what you want? My understanding of the bcp command-line tool is that there is no transaction; i.e. if you are loading M rows and inserting row N fails for any reason (constraints etc.), the first N-1 rows have already been inserted. So restarting the whole file isn't a great idea.
You can use the -m X option to allow bcp to carry on in the face of up to X errors, and then try to identify which rows failed to insert and retry them.
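A hedged sketch of that approach for bcp in, assuming -m and -e behave as described (file names and the error limit are illustrative).

# Sketch: let bcp in carry on past up to 100 errors and capture rejected rows in -e file.
bcp $db.$owner.$table in $datafile -c $db_creds -m 100 -e $rejected_rows > $log 2>&1

if [ -s $rejected_rows ]; then
    echo "Some rows were rejected; inspect $rejected_rows, correct them, and retry" >&2
    # After fixing the data, a second pass can load only the rejects:
    # bcp $db.$owner.$table in $rejected_rows -c $db_creds > $log.retry 2>&1
fi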
You could also look into Michael Peppler's Sybase::BCP Perl module, but our investigations suggest that it may have issues with ASE 15.