How to recover a broken Python "cPickle" dump?
I am using rss2email for converting a number of RSS feeds into mail for easier consumption. That is, I was using it, because it broke in a horrible way today: on every run, it only gives me this backtrace:
Traceback (most recent call last):
File "/usr/share/rss2email/rss2email.py", line 740, in <module>
elif action == "list": list()
File "/usr/share/rss2email/rss2email.py", line 681, in list
feeds, feedfileObject = load(lock=0)
File "/usr/share/rss2email/rss2email.py", line 422, in load
feeds = pickle.load(feedfileObject)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
The only helpful fact that I have been able to construct from this backtrace is that the file ~/.rss2email/feeds.dat, in which rss2email keeps all its configuration and runtime state, is somehow broken. Apparently, rss2email reads its state and dumps it back using cPickle on every run.
I have even found the line containing that 'sxOYAAuyzSx0WqN3BVPjE+6pgPU' string mentioned above in the giant (>12MB) feeds.dat file. To my untrained eye, the dump does not appear to be truncated or otherwise damaged.
What approaches could I try in order to reconstruct the file?
The Python version is 2.5.4 on a Debian/unstable system.
EDIT
Peter Gibson and J.F. Sebastian have suggested directly loading from the pickle file, and I had tried that before. Apparently, a Feed class that is defined in rss2email.py is needed, so here's my script:
#!/usr/bin/python
import sys
# import pickle
import cPickle as pickle
sys.path.insert(0,"/usr/share/rss2email")
from rss2email import Feed
feedfile = open("feeds.dat", 'rb')
feeds = pickle.load(feedfile)
The "plain" pickle variant produces the following traceback:
Traceback (most recent call last):
File "./r2e-rescue.py", line 8, in <module>
feeds = pickle.load(feedfile)
File "/usr/lib/python2.5/pickle.py", line 1370, in load
return Unpickler(file).load()
File "/usr/lib/python2.5/pickle.py", line 858, in load
dispatch[key](self)
File "/usr/lib/python2.5/pickle.py", line 1133, in load_reduce
value = func(*args)
TypeError: 'str' object is not callable
The cPickle variant produces essentially the same thing as calling r2e itself:
Traceback (most recent call last):
File "./r2e-rescue.py", line 10, in <module>
feeds = pickle.load(feedfile)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
EDIT 2
Following J.F. Sebastian's suggestion, I put "printf debugging" into Feed.__setstate__ in my test script. These are the last few lines of output before Python bails out.
u'http:/com/news.ars/post/20080924-everyone-declares-victory-in-smutfree-wireless-broadband-test.html': u'http:/com/news.ars/post/20080924-everyone-declares-victory-in-smutfree-wireless-broadband-test.html'},
'to': None,
'url': 'http://arstechnica.com/'}
Traceback (most recent call last):
File "./r2e-rescue.py", line 23, in ?
feeds = pickle.load(feedfile)
TypeError: ("'str' object is not callable", 'sxOYAAuyzSx0WqN3BVPjE+6pgPU', ((2009, 3, 19, 1, 19, 31, 3, 78, 0), {}))
The same thing happens on a Debian/etch box using python 2.4.4-2.
How I solved my problem
A Perl port of pickle.py
Following J.F. Sebastian's comment about how simple the pickle format is, I set out to port parts of pickle.py to Perl. A couple of quick regular expressions would have been a faster way to access my data, but I felt that the hack value and an opportunity to learn more about Python would be worth it. Plus, I still feel much more comfortable using (and debugging code in) Perl than Python.
Most of the porting effort (simple types, tuples, lists, dictionaries) was very straightforward. Perl's and Python's different notions of classes and objects have been the only issue so far where a bit more than simple translation of idioms was needed. The result is a module called Pickle::Parse which, after a bit of polishing, will be published on CPAN.
A module called Python::Serialise::Pickle existed on CPAN, but I found its parsing capabilities lacking: it spews debugging output all over the place and doesn't seem to support classes/objects.
Parsing, transforming data, detecting actual errors in the stream
Based upon Pickle::Parse, I tried to parse the feeds.dat file. After a few iterations of fixing trivial bugs in my parsing code, I got an error message that was strikingly similar to pickle.py's original "object not callable" error message:
Ha! Now we're at a point where it's quite likely that the actual data
stream is broken. Plus, we get an idea where it is broken.
It turned out that the first line of the following sequence was wrong:
Position 7724 in the "memo" pointed to that string "sxOYAAuyzSx0WqN3BVPjE+6pgPU". From similar records earlier in the stream, it was clear that a time.struct_time object was needed instead. All later records shared this wrong pointer. With a simple search/replace operation, it was trivial to fix this.
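The memo repair described above can be sketched roughly as follows. This is an illustration for current Python 3, not the Perl code from this answer: the sample stream, the memo indices 1 and 2, and the helper names memo_gets and repoint_gets are all made up for the example. Only pickle.loads and pickletools.genops are real library calls. The sketch assumes a protocol-0 (text) pickle, where a memo lookup is written as "g", the decimal index, and a newline.

```python
import pickle
import pickletools

# Hand-written protocol-0 pickle of ['token', (1+2j), (3+4j)] (illustrative data).
# memo[1] holds the string 'token', memo[2] holds the callable builtins.complex.
GOOD = (
    b"(lp0\n"                    # empty list, stored as memo[0]
    b"Vtoken\np1\na"             # 'token' stored as memo[1], appended to the list
    b"cbuiltins\ncomplex\np2\n"  # builtins.complex stored as memo[2]
    b"(I1\nI2\ntRa"              # call complex(1, 2), append the result
    b"g2\n(I3\nI4\ntRa."         # fetch memo[2] again, call complex(3, 4), append
)

# Simulated damage: the GET now fetches the string instead of the callable.
BAD = GOOD.replace(b"g2\n", b"g1\n")


def memo_gets(data, index):
    """Return the byte offsets of every GET opcode that reads memo[index]."""
    return [pos for op, arg, pos in pickletools.genops(data)
            if op.name == "GET" and arg == index]


def repoint_gets(data, positions, old, new):
    """Rewrite the GET opcodes at the given offsets from memo[old] to memo[new]."""
    old_op = b"g%d\n" % old
    new_op = b"g%d\n" % new
    out = bytearray()
    cursor = 0
    for pos in sorted(positions):
        if data[pos:pos + len(old_op)] != old_op:
            raise ValueError("no GET for memo[%d] at offset %d" % (old, pos))
        out += data[cursor:pos] + new_op
        cursor = pos + len(old_op)
    out += data[cursor:]
    return bytes(out)
```

Here pickle.loads(BAD) fails with the same kind of "'str' object is not callable" TypeError, and repoint_gets(BAD, memo_gets(BAD, 1), 1, 2) yields a stream that loads again. In a real file the string's memo slot may also be referenced legitimately, so each offset returned by memo_gets should be checked by hand before it is rewritten.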
I find it ironic that I found the source of the error by accident
through Perl's feature that tells the user its position in the input
data stream when it dies.
Conclusion
I'll move away from rss2email as soon as I find time to automatically transform its pickled configuration/state mess to another tool's format.
pickle.py needs more meaningful error messages that tell the user about the position in the data stream (not the position in its own code) where things go wrong.
Porting parts of pickle.py to Perl was fun and, in the end, rewarding.
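The wish for an error message that names the position in the data stream can be approximated from the outside. The sketch below is written for current Python 3 and rests on one assumption that should be stated loudly: pickle._Unpickler is a private name for CPython's pure-Python unpickler, which reads from the file object opcode by opcode, so the file position after a failure sits just past the opcode that failed. The helper name load_with_position and the two sample streams are invented for illustration.

```python
import io
import pickle

# Illustrative protocol-0 streams (not taken from feeds.dat).
SAMPLE_GOOD = b"cbuiltins\nlist\n(tR."   # calls list() with no arguments
SAMPLE_BAD = b"Vtoken\n(tR."             # tries to call the string 'token'


def load_with_position(data):
    """Unpickle data; on failure, report how far into the stream the reader got."""
    stream = io.BytesIO(data)
    try:
        # Private CPython name: the pure-Python unpickler reads one opcode
        # at a time, so stream.tell() stays close to the failing opcode.
        return pickle._Unpickler(stream).load()
    except Exception as exc:
        raise ValueError(
            "unpickling failed near byte %d: %r" % (stream.tell(), exc)
        ) from exc
```

Calling load_with_position(SAMPLE_BAD) raises a ValueError whose message carries both the byte offset and the original TypeError. The reported offset is only a starting point for inspecting the stream, since the opcode that fails is not necessarily the one that is wrong.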
Have you tried manually loading the feeds.dat file using both cPickle and pickle? If the output differs it might hint at the error.
Something like (from your home directory):
(you might need to open in binary mode 'rb' if rss2email doesn't pickle in ascii).
Pete
Edit: The fact that cPickle and pickle give the same error suggests that the feeds.dat file is the problem. Probably a change in the Feed class between versions of rss2email as suggested in the Ubuntu bug J.F. Sebastian links to.
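A rough sketch of the comparison suggested in this answer, for current Python 3, where cPickle no longer exists as a separate module. It assumes CPython, in which pickle.loads is normally the C implementation and pickle._loads (a private name) is the pure-Python one. The helper name compare_loaders and the sample data are invented for illustration and are not the code originally posted here.

```python
import pickle


def compare_loaders(data):
    """Load data with both unpicklers and record each outcome."""
    loaders = (
        ("C", pickle.loads),             # accelerated implementation, if available
        ("pure Python", pickle._loads),  # private name in CPython's pickle.py
    )
    outcomes = {}
    for label, loads in loaders:
        try:
            outcomes[label] = ("ok", loads(data))
        except Exception as exc:
            outcomes[label] = ("error", "%s: %s" % (type(exc).__name__, exc))
    return outcomes


# Illustrative data: a small dict pickled as text, and a damaged copy of it.
SAMPLE = pickle.dumps({"url": "http://arstechnica.com/", "to": None}, protocol=0)
TRUNCATED = SAMPLE[:-5]
```

compare_loaders(SAMPLE) reports "ok" for both implementations, while compare_loaders(TRUNCATED) reports an error for both. The two implementations may word their errors differently, and that difference is the kind of hint this answer refers to.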
Sounds like the internals of cPickle are getting tangled up. This thread (http://bytes.com/groups/python/565085-cpickle-problems) looks like it might have a clue..
'sxOYAAuyzSx0WqN3BVPjE+6pgPU' is most probably unrelated to the pickle's problem.
Post an error traceback (to determine which class defines the attribute that can't be called, the one that leads to the TypeError):
EDIT:
Add the following to your code and run it (redirect stderr to a file, then use 'tail -2' on it to print the last 2 lines):
If the above doesn't yield interesting output, then use general troubleshooting tactics:
Confirm that 'feeds.dat' is the problem:
~/.rss2email directory
'feeds.dat' size is greater than the current. Run some tests.
See the "r2e bails out with TypeError" bug on Ubuntu.
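A possible variant of the "log to stderr, then look at the last lines" idea, sketched for current Python 3. It is not the snippet originally posted in this answer, and it swaps in a different technique: instead of instrumenting Feed.__setstate__, it overrides the documented Unpickler.find_class hook so that every class or function the stream asks for is printed before it is resolved. The class name TracingUnpickler, its seen attribute, and the sample stream in the usage note are invented for illustration.

```python
import io
import pickle
import sys


class TracingUnpickler(pickle.Unpickler):
    """Unpickler that logs every global it resolves to stderr."""

    def __init__(self, file):
        super().__init__(file)
        self.seen = []  # (module, name) pairs, in the order they were requested

    def find_class(self, module, name):
        # The last line printed before a crash names the most recently
        # resolved global, which narrows down where the load went wrong.
        print("resolving %s.%s" % (module, name), file=sys.stderr)
        self.seen.append((module, name))
        return super().find_class(module, name)
```

For example, TracingUnpickler(io.BytesIO(b"cbuiltins\nlist\n(tR.")).load() logs one line for builtins.list and returns an empty list. With stderr redirected to a file, 'tail -2' on that file shows the last globals requested before a failure.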