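Using type() information to convert values stored as strings
In my application I generate a number of values (three columns, of type int, str and datetime; see the example below), which are stored in a flat file as comma-separated strings. In addition, I store a file containing the types of the values (see below). How can I use this information to convert the values from the flat file to the correct data types in Python? Is this possible, or do I need to do something else?
Data file: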
#id,value,date
1,a,2011-09-13 15:00:00
2,b,2011-09-13 15:10:00
3,c,2011-09-13 15:20:00
4,d,2011-09-13 15:30:00
Type file:
id,<type 'int'>
value,<type 'str'>
date,<type 'datetime.datetime'>
As I understand it, you have already parsed the file and now just need to get the right type. So let's say id_, type_ and value are three strings that contain the values in the file. (Note that type_ should contain 'int', for example, not "<type 'int'>".) Then you can use type_ to convert value.
Unfortunately this doesn't work for datetime, as it cannot simply be initialized from its string representation.
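A minimal sketch of that idea, not the answerer's own code: the bare type name is looked up in a small table and the resulting type is called on the string. The table NAMED_TYPES and the helper convert() are hypothetical names chosen for illustration. The second half shows why datetime needs separate handling.

```python
from datetime import datetime

# Hypothetical lookup table from bare type names to constructors.
NAMED_TYPES = {"int": int, "str": str, "float": float}

def convert(type_name, value):
    """Look up a type by its bare name and call it on the string value."""
    return NAMED_TYPES[type_name](value)

number = convert("int", "1")
text = convert("str", "a")

# The datetime constructor expects year, month, day, ... as numbers,
# so handing it the whole string raises TypeError.
try:
    datetime("2011-09-13 15:00:00")
    datetime_from_string_failed = False
except TypeError:
    datetime_from_string_failed = True

# An explicit format string is needed instead.
parsed = datetime.strptime("2011-09-13 15:00:00", "%Y-%m-%d %H:%M:%S")
```

A fixed table is used here rather than evaluating the name, so that a type file cannot name arbitrary callables.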
Your types file can be simpler:
Then in your main program you can
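What the simpler types file and the main program looked like is not shown, so the following is only a guess at the general shape. It assumes a types file holding bare names (id,int and so on) and a hypothetical KNOWN table that maps those names to converter functions; all names in the block are illustrative.

```python
import csv
import io
from datetime import datetime

# Assumed simplified types file: bare names instead of "<type 'int'>".
TYPES_TEXT = "id,int\nvalue,str\ndate,datetime\n"

DATA_TEXT = """#id,value,date
1,a,2011-09-13 15:00:00
2,b,2011-09-13 15:10:00
"""

def parse_datetime(text):
    """Parse the timestamp format used in the data file."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

# Hypothetical table from type names to converter functions.
KNOWN = {"int": int, "str": str, "datetime": parse_datetime}

def load_converters(types_text):
    """Return one converter per column, in column order."""
    reader = csv.reader(io.StringIO(types_text))
    return [KNOWN[type_name] for _, type_name in reader]

def load_rows(data_text, converters):
    """Convert every data row, skipping blank lines and the '#' header."""
    rows = []
    for fields in csv.reader(io.StringIO(data_text)):
        if not fields or fields[0].startswith("#"):
            continue
        rows.append(tuple(conv(field) for conv, field in zip(converters, fields)))
    return rows

converters = load_converters(TYPES_TEXT)
rows = load_rows(DATA_TEXT, converters)
```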
Follow these steps: split() each line with "," as the separator, then take the date field apart (e.g. using slices) and make a datetime object from it.
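The steps above can be sketched as follows, assuming the fixed-width timestamp layout from the sample data (YYYY-MM-DD HH:MM:SS). The function name parse_line is hypothetical.

```python
from datetime import datetime

def parse_line(line):
    """Split one data line and build typed values from the pieces."""
    id_text, value, date_text = line.strip().split(",")
    # "2011-09-13 15:20:00" has fixed-width fields, so slices pick them out.
    when = datetime(
        int(date_text[0:4]),    # year
        int(date_text[5:7]),    # month
        int(date_text[8:10]),   # day
        int(date_text[11:13]),  # hour
        int(date_text[14:16]),  # minute
        int(date_text[17:19]),  # second
    )
    return int(id_text), value, when

record = parse_line("3,c,2011-09-13 15:20:00\n")
```

Slicing only works while the timestamp layout stays fixed; datetime.strptime is the more forgiving alternative.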
I had to deal with a similar situation in a recent program that had to convert many fields. I used a list of tuples, where one element of each tuple was the conversion function to use. Sometimes it was int or float; sometimes it was a simple lambda; and sometimes it was the name of a function defined elsewhere.
Instead of having a separate "type" file, take your list of tuples of (id, value, date) and just pickle it. Otherwise you'll have to solve the problem of storing your string-to-type converters as text (in your "type" file), which might be a fun problem to solve, but if you're just trying to get something done, go with pickle or cPickle.
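For illustration, a round trip through pickle keeps the Python types without any separate type file. The sample records are taken from the question; the sketch serializes to bytes in memory, whereas a real program would write them to a file opened in binary mode.

```python
import pickle
from datetime import datetime

records = [
    (1, "a", datetime(2011, 9, 13, 15, 0, 0)),
    (2, "b", datetime(2011, 9, 13, 15, 10, 0)),
]

# Serialize the list of tuples, then restore it with types intact.
blob = pickle.dumps(records)
restored = pickle.loads(blob)
```

In Python 3 the standalone cPickle module no longer exists, so plain pickle is used here. The trade-off is that the stored data is no longer human-readable, and pickled data should only be loaded from trusted sources.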
First, you cannot write a "universal" or "smart" conversion that magically handles anything.
Second, trying to summarize a string-to-data conversion in anything other than code never seems to work out well. So rather than write a string that names the conversion, just write the conversion.
Finally, trying to write a configuration file in a domain-specific language is silly. Just write Python code. It's not much more complicated than trying to parse some configuration file.
Don't waste time trying to create a "type file" that's not simply Python. It doesn't help. It is simpler to write the conversion as a Python function. You can import that function as if it were your "type file".
That's all you have in your "type file"
Now you can read (and process) your input like this.
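A minimal sketch of this approach, not the answerer's own code. In practice convert() would live in its own module (a hypothetical my_types.py) and be imported; it is defined inline here so the example is self-contained. The column layout is assumed from the question's data file.

```python
import csv
import io
from datetime import datetime

def convert(row):
    """The whole 'type file': one explicit conversion per column."""
    return (
        int(row[0]),
        row[1],
        datetime.strptime(row[2], "%Y-%m-%d %H:%M:%S"),
    )

def read_records(source):
    """Yield converted rows, skipping blank lines and the '#' header."""
    for row in csv.reader(source):
        if not row or row[0].startswith("#"):
            continue
        yield convert(row)

# Stand-in for an open data file.
sample = io.StringIO("#id,value,date\n1,a,2011-09-13 15:00:00\n")
records = list(read_records(sample))
```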
This means you are doomed.
You must have an actual definition of the file content or you cannot do any processing.
You don't know if "23507" should be an integer, a string, a postal code, or a floating-point (which omitted the period), a duration (in days or seconds) or some other more complex thing. You can't hope and you can't guess.
After getting a definition, you need to write an explicit conversion function based on the actual definition.
After writing the conversion, you need to (a) test the conversion with a simple unit test, and (b) test the data to be sure it really converts.
Then you can process the file.
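Both kinds of test can be sketched briefly, assuming the three-column layout from the question. The helper find_bad_lines and the sample lines are hypothetical and only illustrate the idea of checking the data before processing it.

```python
from datetime import datetime

def convert(fields):
    """Explicit conversion based on the file's actual definition."""
    id_text, value, date_text = fields
    return int(id_text), value, datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")

def find_bad_lines(lines):
    """Return (line number, error message) for every line that fails to convert."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue
        try:
            convert(line.rstrip("\n").split(","))
        except ValueError as error:
            bad.append((number, str(error)))
    return bad

# (a) A simple unit-test style check of the conversion itself.
assert convert(["1", "a", "2011-09-13 15:00:00"]) == (1, "a", datetime(2011, 9, 13, 15, 0, 0))

# (b) A check of the data: the third line has a bad id, the fourth a missing column.
problems = find_bad_lines([
    "#id,value,date\n",
    "1,a,2011-09-13 15:00:00\n",
    "x,b,2011-09-13 15:10:00\n",
    "3,c\n",
])
```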
You might want to look at the xlrd module. If you can load your data into Excel, and Excel knows what type is associated with each column, xlrd will give you the type when you read the Excel file. Of course, if the data is given to you as a CSV, then someone would have to go into the Excel file and change the column types by hand.
Not sure this gets you all the way to where you want to go, but it might help.