How to loop over a dictionary of dictionaries to organize data from a CSV in a specified way
I made a script that:
- takes data from a CSV file
- sorts it by matching values in the first column of the data file
- inserts the sorted data at a specified line in a separate template text file
- saves as many copies of the file as there are distinct values in the first column of the data file
The picture below shows how it works:
But there are two more things I need to do. When, within one of the separate files shown above, the same value from the second column of the data file occurs more than once, the file should insert the value from the third column instead of repeating the same value from the second column. The picture below shows how it should look:
I also need to add, somewhere in the file, the parts of the first-column value obtained by splitting it on "_".
Here is the data file:
111_0,3005,QWE
111_0,3006,SDE
111_0,3006,LFR
111_1,3005,QWE
111_1,5345,JTR
112_0,3103,JPP
112_0,3343,PDK
113_0,2137,TRE
113_0,2137,OMG
And here is the code I made:
import shutil
with open("data.csv") as f:
    contents = f.read()
    contents = contents.splitlines()
values_per_baseline = dict()
for line in contents:
    key = line.split(',')[0]
    values = line.split(',')[1:]
    if key not in values_per_baseline:
        values_per_baseline[key] = []
    values_per_baseline[key].append(values)
for file in values_per_baseline.keys():
    x = 3
    shutil.copyfile("of.txt", (f"of_%s.txt" % file))
    filename = f"of_%s.txt" % file
    for values in values_per_baseline[file]:
        with open(filename, "r") as f:
            contents = f.readlines()
            contents.insert(x, ' o = ' + values[0] + '\n ' + 'a = ' + values[1] +'\n')
        with open(filename, "w") as f:
            contents = "".join(contents)
            f.write(contents)
            f.close()
I have been trying to make something like a dictionary of dictionaries of lists, but I can't implement it correctly so that it works.
Comments (1)
When I run your code, I get an error.
Let's think about where this error is coming from. It is an IndexError on a list. The only list used on the failing line is values, so that seems like a good place to start looking. To debug, you can consider adding something like a print of values before the line that raises the error.
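As a sketch of that kind of check (the row is taken from the data file in the question, and the print calls are only illustrative), printing the list and its length just before the failing line shows which indices are valid:

```python
# One row from the data file in the question.
line = "111_0,3005,QWE"

# Same split the script performs: everything after the first column.
values = line.split(',')[1:]

# Debugging aid: show the list and how many items it holds.
print(values, len(values))

# Valid indices run from 0 to len(values) - 1, so values[3] is out of range.
valid_indices = list(range(len(values)))
print(valid_indices)
```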
The output shows that the problem is with values[3], which makes sense since len(values) == 2 and so the indices need to be 0 and 1. If we change values[3] to values[1], then I think you get what you want.
To get to the next step in your problem, I would suggest you change your first loop to:
That makes your dictionary look like:
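One possible shape for such a loop, sketched here under the assumption that the goal is the "dictionary of dictionaries of lists" mentioned in the question: the outer key is the first column, the inner key is the second column, and each inner value is a list of third-column entries. The rows are copied from the question's data file; the variable names second and third are illustrative.

```python
# Rows copied from the data file in the question.
rows = """111_0,3005,QWE
111_0,3006,SDE
111_0,3006,LFR
111_1,3005,QWE
111_1,5345,JTR
112_0,3103,JPP
112_0,3343,PDK
113_0,2137,TRE
113_0,2137,OMG""".splitlines()

values_per_baseline = {}
for line in rows:
    key, second, third = line.split(',')
    # setdefault creates the inner dict (and then the list) the first
    # time a given key is seen, and returns the existing one afterwards.
    by_second = values_per_baseline.setdefault(key, {})
    by_second.setdefault(second, []).append(third)

print(values_per_baseline["111_0"])
# {'3005': ['QWE'], '3006': ['SDE', 'LFR']}
```

With this layout, a repeated second-column value such as 3006 appears once as a key, and its third-column values accumulate in the list.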
Then when writing to the file, you would need to change your loop to:
And your file now looks like:
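A sketch of what the writing step and its result could look like, assuming the nested layout first column -> second column -> list of third-column values. The four template lines are hypothetical, since the real of.txt is not shown in the question, and the sketch works on a list in memory rather than on a file.

```python
# Hypothetical template lines; the real of.txt is not shown in the question.
template = ["header\n", "line 2\n", "line 3\n", "footer\n"]

# Nested data for one output file: second column -> third-column values.
grouped = {"3005": ["QWE"], "3006": ["SDE", "LFR"]}

x = 3  # insertion index used in the question's script
block = []
for second, thirds in grouped.items():
    block.append(f" o = {second}\n")     # written once per second-column value
    for third in thirds:
        block.append(f" a = {third}\n")  # one line per third-column value

# Slice assignment inserts the whole block at index x in one step.
contents = template[:]
contents[x:x] = block
print("".join(contents), end="")
```

For this input the printed text is the first three template lines, then " o = 3005", " a = QWE", " o = 3006", " a = SDE", " a = LFR", and finally the last template line.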
Other things you could do
Now, there are a couple of things you can do to streamline your code while keeping it readable.*
- You call line.split twice. Just add a line that has something like split_line = line.split(',') and then have key = split_line[0] and values = split_line[1:]. (You could do away with key and values altogether and just reference split_line[0] and split_line[1:], but that would make your code less readable.)
- You define x in every loop. Just take it out of the loop.
- You write out (f"of_%s.txt" % file) and then define it as filename on the next line. I suggest you define filename first and then just have shutil.copyfile("of.txt", filename). Also, you are using f-strings incorrectly. You could just write filename = f"of_{file}.txt".
- You could change the insert command to an f-string (if you find it more readable). For example, assuming sp holds a single space: contents.insert(x, f'{6*sp}o = {values[0]}\n{10*sp}a = {values[1]}\n')
- Inside the for file in values_per_baseline.keys() loop, you are opening and closing files way more than you need to. You can reorder your operations so that each file is read and written only once.
*For a short script like this, I would argue that making sure it is readable is more important than making sure it is efficient, since you will want to be able to come back in 3 weeks or 3 years and understand what you did. For that reason, I would also recommend you comment what you did.
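A sketch of one possible reordering, combining the suggestions above: x is defined once, the template is held in memory, and each output file is opened exactly once. The template lines, the sample data, and the " base = ..., suffix = ..." line (which shows the first-column value split on "_") are all hypothetical, and a temporary directory is used so the sketch does not touch real files.

```python
import os
import tempfile

# Hypothetical nested data and template; names and contents are illustrative.
values_per_baseline = {
    "111_0": {"3005": ["QWE"], "3006": ["SDE", "LFR"]},
    "113_0": {"2137": ["TRE", "OMG"]},
}
template_lines = ["line 1\n", "line 2\n", "line 3\n", "line 4\n"]
x = 3  # insertion index, defined once outside the loops

out_dir = tempfile.mkdtemp()  # temporary directory for the output files
written = {}
for key, by_second in values_per_baseline.items():
    # Split the first-column value on "_" (e.g. "111_0" -> "111", "0").
    base, suffix = key.split("_")
    block = [f" base = {base}, suffix = {suffix}\n"]
    for second, thirds in by_second.items():
        block.append(f" o = {second}\n")
        block.extend(f" a = {third}\n" for third in thirds)

    # Build the full contents in memory, then write the file in one go.
    contents = template_lines[:x] + block + template_lines[x:]
    filename = os.path.join(out_dir, f"of_{key}.txt")
    with open(filename, "w") as f:
        f.writelines(contents)
    written[key] = filename
```

Because the template is copied in memory rather than with shutil.copyfile, no output file ever needs to be reopened for reading.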