How do I read a file containing newlines and tabs into a string in Python?
I am trying to read a file that contains tabs, newlines, etc.; the data is in JSON format.
When I read it using file.read(), readlines(), etc., all the newlines and tabs are read as well.
I have tried rstrip(), split(), etc., but in vain; maybe I am missing something:
Here is essentially what I am doing:
f = open('/path/to/file.txt')
line = f.readlines()
line.split('\n')
This is the data (including the raw tabs, hence the poor formatting):
{
"foo": [ {
"id1" : "1",
"blah": "blah blah",
"id2" : "5885221122",
"bar" : [
{
"name" : "Joe JJ",
"info": [ {
"custid": "SSN",
"type" : "String", } ]
} ] } ] }
I was wondering whether the tabs and newlines can be ignored elegantly.
I am also hoping to use json.dumps().
Comments (6)
Why not just use json.load() if the data is JSON?
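A minimal sketch of that suggestion, assuming the file holds valid JSON. The sample text and the temporary file are hypothetical stand-ins for the real file. The JSON parser skips whitespace between tokens, so the tabs and newlines need no manual stripping, and json.dumps() can then emit a compact string:

```python
import json
import os
import tempfile

# Hypothetical sample: valid JSON padded with raw tabs and newlines.
raw = '{\n\t"foo": [ {\n\t\t"id1" : "1",\n\t\t"blah": "blah blah"\n\t} ]\n}\n'

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for '/path/to/file.txt' so the sketch runs on its own.
    path = os.path.join(tmp, "file.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(raw)

    # json.load() parses straight from the file object and ignores
    # the whitespace between tokens.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

# Re-serialize without any padding.
compact = json.dumps(data, separators=(",", ":"))
print(compact)
```

This only works when the contents really are valid JSON; malformed input raises json.JSONDecodeError.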
A little hack, inefficient I guess:
Where did that structure come from? My condolences. Anyway, as a start you might try this:
That's a brute-force removal of newline and tab characters. What it returns might be suitable for feeding into json.loads. That depends greatly on whether the contents of the file are actually valid JSON once you clear out the extra whitespace and line breaks.
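One way such a brute-force removal might look, as a sketch rather than the exact code the answer had in mind. The sample text is hypothetical and deliberately valid JSON; str.translate() is just one of several ways to delete the characters:

```python
import json

# Hypothetical file contents: valid JSON with raw tabs and newlines.
text = '{\n\t"name" : "Joe JJ",\n\t"info": [ {\n\t\t"custid": "SSN" } ]\n}\n'

# Delete every newline, carriage return, and tab character.
flat = text.translate(str.maketrans("", "", "\n\r\t"))

# The flattened string parses only if it is valid JSON.
data = json.loads(flat)
print(flat)
```

Spaces inside string values such as "Joe JJ" are untouched, because only the three listed characters are deleted.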
If you want to loop over each line, you can just:
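A sketch of what such a loop could look like. Here io.StringIO stands in for the opened file so the example runs on its own, and the sample text is hypothetical:

```python
import io

# Stand-in for open('/path/to/file.txt'); a real file object iterates the same way.
f = io.StringIO('{\n\t"id1" : "1",\n\t"blah": "blah blah"\n}\n')

stripped = []
for line in f:
    # strip() drops the leading tabs and the trailing newline of each line.
    stripped.append(line.strip())

joined = "".join(stripped)
print(joined)
```

Iterating over the file object yields one line at a time, so the whole file never has to be split by hand.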
What about the usage of the json module?
The comma in "type" : "String", is causing the JSON decoder to choke. If it weren't for that problem, you could use json.load() to load the file directly.
In other words, you have malformed JSON, meaning you'll need to perform a replacement operation before feeding it to json.loads(). Since you'll need to read the file into a string completely to do the replacement anyway, use json.loads(jsonstr) instead of json.load(jsonfilep):
I only used the re module because a trailing comma like this could follow any value, number or string.
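The replacement step described above could be sketched as follows. The regular expression and the variable names are assumptions for illustration, not necessarily the answer's original code; the sample string is the data from the question, inlined here in place of reading the file:

```python
import json
import re

# The data from the question, including the trailing comma after "String".
jsonstr = """{
"foo": [ {
"id1" : "1",
"blah": "blah blah",
"id2" : "5885221122",
"bar" : [
{
"name" : "Joe JJ",
"info": [ {
"custid": "SSN",
"type" : "String", } ]
} ] } ] }"""

# Assumed pattern: drop a comma that is followed only by whitespace
# and then a closing brace or bracket.
fixed = re.sub(r",\s*([}\]])", r"\1", jsonstr)

data = json.loads(fixed)
print(json.dumps(data))
```

Note that this pattern is not string-aware: it would also rewrite a comma followed by a closing brace or bracket inside a string value, so it suits data like the sample rather than arbitrary input.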