How do I save data with Python?

Posted on 2024-08-04 12:04:30

I am working on a program in Python and want users to be able to save the data they are working on. I have looked into cPickle; it seems like a fast and easy way to save data, but it also seems insecure. Since entire functions, classes, etc. can be pickled, I am worried that a rogue save file could inject harmful code into the program. Is there a way to prevent that, or should I look into other methods of saving data, such as converting directly to a string (which also seems insecure) or creating an XML hierarchy and putting the data in that?

I am new to Python, so please bear with me.

Thanks in advance!

EDIT: As for the type of data I am storing, it is mainly dictionaries and lists. Information such as names, speeds, etc. It is fairly simple right now, but may get more complex in the future.

Comments (7)

壹場煙雨 2024-08-11 12:04:31

From your description, JSON encoding is a secure and fast solution. Python 2.6 and later include a json module, which you can use like this:

import json

obj = {'key1': 'value1', 'key2': [1, 2, 3, 4], 'key3': 1322}
encoded = json.dumps(obj)   # serialize the dictionary to a JSON string
obj = json.loads(encoded)   # parse the string back into Python objects

The JSON format is human-readable and looks very similar to the string representation of a Python dictionary. It also doesn't have the security issues that pickle has. If you don't have Python 2.6, you can install cjson or simplejson.

Unlike pickle, JSON can't save arbitrary Python objects. But you can use it to save strings, dictionaries, lists, ... That is enough for most cases.
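To keep the data between runs, the same module can write to and read from a file. The following is a minimal sketch, assuming Python 3; the file name `save.json` and the helper names `save_data` and `load_data` are made up for illustration.

```python
import json
import os
import tempfile


def save_data(data, path):
    """Write dictionaries/lists (and other JSON-compatible values) to a file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)


def load_data(path):
    """Read the data back; the parser only ever builds plain Python values."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical save file placed in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "save.json")
state = {"name": "Alice", "speeds": [3.5, 4.0], "level": 2}
save_data(state, path)
restored = load_data(path)

# Values without a JSON equivalent are rejected instead of silently stored.
try:
    json.dumps({"callback": len})
    rejected = False
except TypeError:
    rejected = True
```

Keep in mind that JSON has fewer types than Python: tuples come back as lists, and dictionary keys come back as strings.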

To explain why pickle is insecure, here is what the Python docs say:

"Most of the security issues surrounding the pickle and cPickle module involve unpickling. There are no known security vulnerabilities related to pickling because you (the programmer) control the objects that pickle will interact with, and all it produces is a string.

However, for unpickling, it is never a good idea to unpickle an untrusted string whose origins are dubious, for example, strings read from a socket. This is because unpickling can create unexpected objects and even potentially run methods of those objects, such as their class constructor or destructor ... The moral of the story is that you should be really careful about the source of the strings your application unpickles."
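The risk described in that passage can be seen with a deliberately harmless example. This is only a sketch, assuming Python 3 (where cPickle has been folded into `pickle`); the class name `Innocent` is invented for the demonstration, and `eval` is given a trivial arithmetic expression where a real attacker would supply something damaging.

```python
import pickle


class Innocent:
    """Looks harmless, but tells pickle how to 'rebuild' it on load."""

    def __reduce__(self):
        # pickle stores "call eval with this argument"; nothing about the
        # class itself ends up in the byte string.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())

# Loading calls eval("6 * 7") and returns its result, not an Innocent object.
result = pickle.loads(payload)
```

The program that loads `payload` does not need to define `Innocent` at all, which is why a save file from an untrusted source should never be unpickled.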

There are some ways to defend yourself but it is much easier to use JSON in your case.
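One such defence, described in the Python 3 documentation for `pickle`, is to subclass `pickle.Unpickler` and override `find_class` so that the data cannot reference any function or class. The sketch below assumes the save files only ever contain plain containers, strings, and numbers; the names `NoGlobalsUnpickler` and `safe_loads` are hypothetical.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every lookup of a function or class by name."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def safe_loads(data):
    """Unpickle bytes that may only contain plain data."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Plain dictionaries and lists need no global lookups, so they still load.
plain = safe_loads(pickle.dumps({"name": "Alice", "speeds": [3.5, 4.0]}))

# A pickle that refers to a callable (here the builtin len) is rejected.
try:
    safe_loads(pickle.dumps(len))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

This narrows what a hostile file can do, but it is more code to get right than simply switching to JSON.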

囍孤女 2024-08-11 12:04:31

You could do something like:

to write

  • Pickle
  • Sign pickled file
  • Done

to read

  • Check pickled file's signature
  • Unpickle
  • Use
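The write and read steps above could be sketched with an HMAC from the standard library. This assumes Python 3 and that "signing" means a keyed hash; the answer does not say which scheme it has in mind. The function names and the key are placeholders.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_and_pack(obj, key):
    """Pickle obj and prepend an HMAC-SHA256 signature of the pickled bytes."""
    data = pickle.dumps(obj)
    signature = hmac.new(key, data, hashlib.sha256).digest()
    return signature + data


def verify_and_unpack(blob, key):
    """Check the signature first; unpickle only if it matches."""
    signature, data = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch: data was modified or key is wrong")
    return pickle.loads(data)


key = b"hypothetical-secret-key"  # placeholder, not a real secret
blob = sign_and_pack({"name": "Alice", "speed": 4.0}, key)
restored = verify_and_unpack(blob, key)

# Flip the last byte to simulate a tampered save file.
tampered = blob[:-1] + bytes([blob[-1] ^ 0xFF])
try:
    verify_and_unpack(tampered, key)
    detected = False
except ValueError:
    detected = True
```

The signature only helps if the key stays secret. A key shipped inside the application can be read by anyone who has the application, and such a person can sign their own files.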

I wonder, though, what makes you think that the data files are going to be tampered with but your application is not?

故乡的云 2024-08-11 12:04:31

In this answer, I'm only concerned about accidental corruption of the application's integrity.

Pickle is "secure". What might be insecure is accessing code you didn't write, for example in plugins; that is not relevant to pickles though.

When you pickle an object, all its data is saved, but its code and implementation are not. This means that, when unpickled, an updated object might find it has "old-style" data inside (if you have updated the implementation since). This is something you must know about and handle, if applicable.
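One common way to handle old-style data is to store a version number next to the data and upgrade it on load. The sketch below assumes Python 3 and dictionary-based save data like the question describes; the field names and the change from `speed` to `speeds` are invented for illustration. For class instances, the `__setstate__` hook is the analogous place to do the upgrade.

```python
import pickle

CURRENT_VERSION = 2


def dump_state(state):
    """Pickle the state together with the format version that wrote it."""
    return pickle.dumps({"version": CURRENT_VERSION, "state": state})


def upgrade(record):
    """Bring a loaded record up to the current layout."""
    version = record.get("version", 1)
    state = dict(record["state"])
    if version < 2:
        # Hypothetical change: version 1 stored one "speed",
        # version 2 stores a list of "speeds".
        state["speeds"] = [state.pop("speed")]
    return state


def load_state(payload):
    """Only meant for files the program wrote itself."""
    return upgrade(pickle.loads(payload))


# A save file written by the (imaginary) older version of the program.
old_payload = pickle.dumps({"version": 1, "state": {"name": "Alice", "speed": 4.0}})
migrated = load_state(old_payload)
current = load_state(dump_state({"name": "Bob", "speeds": [1.0, 2.0]}))
```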

Pickling strings, lists, numbers, and dicts is very easy and works perfectly, comparably to JSON. The pickle magic is that -- sometimes without adjustment -- even complex Python objects can be pickled. But only data is pickled; the instances are reconstructed simply from the saved module name and type name of the object.
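For data the program wrote itself, a round trip through a file takes only a few lines. This sketch assumes Python 3 and a made-up file name, `save.pkl`; it also runs the same data through JSON to show one practical difference between the two formats.

```python
import json
import os
import pickle
import tempfile

state = {"name": "Alice", "position": (3, 4), "scores": {1: 9.5, 2: 7.0}}

# Hypothetical save file in a temporary directory; pickle needs binary mode.
path = os.path.join(tempfile.mkdtemp(), "save.pkl")
with open(path, "wb") as f:
    pickle.dump(state, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

# The same data through JSON: the tuple becomes a list, int keys become strings.
via_json = json.loads(json.dumps(state))
```

Pickle preserves the tuple and the integer keys exactly, whereas JSON maps them onto its smaller set of types.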

莫多说 2024-08-11 12:04:31

You need to give us more context before we can answer: what type of data are you saving, how much is there, how do you want to access it?

As for pickles: they do not store code. When you pickle a function or class, it is the name that is stored, not the actual code itself.
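This is easy to check by looking at the bytes pickle produces. A small sketch, assuming Python 3 and using a builtin function and a standard-library class so that it does not depend on where the script is run from:

```python
import pickle
from collections import OrderedDict

# Pickling a function or a class stores a reference: module name plus object name.
func_payload = pickle.dumps(len)
class_payload = pickle.dumps(OrderedDict)

# Loading looks the names up again and returns the very same objects.
same_function = pickle.loads(func_payload) is len
same_class = pickle.loads(class_payload) is OrderedDict
```

`pickletools.dis(func_payload)` prints the individual opcodes for anyone who wants to see the stored names in context. Storing only names also means the loading side decides which code those names refer to, which is why a pickle can still trigger calls into anything importable.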

愚人国度 2024-08-11 12:04:31

You should use a database of some kind. Storing data in pickle format isn't a good idea (in most cases). You may consider:

  • SQLite - (included in Python 2.5+) fast and simple, but requires knowledge of SQL and the DB-API
  • buzhug - non-SQL, file-based database with Pythonic syntax
  • SQL database - you may use an interface to a DBMS (such as MySQL, PostgreSQL, etc.), but that only pays off for larger amounts of data (thousands of records).
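For the SQLite option, a minimal sketch using the standard `sqlite3` module could look like this. It assumes Python 3; the table and column names are invented, and an in-memory database stands in for a real save file.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runners (name TEXT PRIMARY KEY, speed REAL)")

# Parameter placeholders (?) keep user-supplied values out of the SQL text.
conn.executemany(
    "INSERT INTO runners (name, speed) VALUES (?, ?)",
    [("Alice", 4.0), ("Bob", 3.5)],
)
conn.commit()

rows = conn.execute(
    "SELECT name, speed FROM runners WHERE speed > ? ORDER BY name", (3.7,)
).fetchall()
conn.close()
```

Because the file only ever holds rows of text and numbers, loading it cannot run code the way unpickling can.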

You may find some other solutions here.

筱武穆 2024-08-11 12:04:31

Who -- specifically -- is the sociopath who's going through the effort to break a program by hacking the pickled file?

It's Python. The sociopath has your source. They don't need to fool around hacking your pickle file. They can just edit your source and do all the "damage" they want.

Don't worry about "insecurity" unless you're involved in litigation with organized crime syndicates.

Don't worry about "a rogue save file could inject harmful code into the program". No one will bother with a rogue save file when they have the source.

半窗疏影 2024-08-11 12:04:31

You might enjoy working with the y_serial module over at
http://yserial.sourceforge.net

which reads like a tutorial but operationally offers
working code for serialization and persistence.
The commentary discusses some of the pros and cons
relevant to issues raised here.

It's designed to be a general solution to
warehousing compressed Python objects with SQLite
(with almost no SQL fuss ;-)

Hope this helps.
