How to use csv.reader in Python with French characters such as é, à, ç, ê, ë

Posted on 2024-09-11 11:06:43


I have a CSV file of about 120 columns by 4500 rows.
I read the "customer name" field in the first column of the first row.
I then look for this field in a second CSV file containing "customer name" and "customer ID".
I write a new CSV file with "customer name", "customer ID", and all the remaining 119 columns, and continue until the end of the first file.

This works, but there are special characters everywhere in the first two CSV files.
I don't want 'Montr\xe9al-Nord' instead of 'Montréal-Nord'
or 'Val\xe9rie Lamarche' instead of 'Valérie Lamarche' in the resulting CSV file.

Here is a test code example:

# -*- coding: utf-8 -*-


import  types
import  wx
import sys
import os, os.path
import win32file
import shutil
import string
import  wx.lib.dialogs
import re
import EmailAttache
import StringIO,csv
import time
import csv

outputfile=open(os.path.join(u"c:\\transales","Resultat-second_contact_act.csv"), "wb")

resultat = csv.writer (outputfile )

def Writefile ( info1, info2 ):
    print info1, info2
    resultat.writerow( [ `info1`,`info2` ,`line[1]`,`line[2]`,`line[3]`,`line[4]`,`line[5]`,`line[6]`,`line[7]`,`line[8]`,`line[9]`,`line[10]`,`line[11]`,`line[12]`,`line[13]`,`line[14]`,`line[15]`,`line[16]`,`line[17]` ] )


data = open(os.path.join(u"c:\\transales","SECONDARY_CONTACTS.CSV"),"rb")
data2 = open(os.path.join(u"c:\\transales","AccountID+ContactID.csv"),"rb")

source1 = csv.reader(data)
source2 = csv.reader(data2)



for line in source1:
    name= line[0]
    data2.seek(0)
    for line2 in source2:
        if line[0] == line2[0]:    
            Writefile(line[0],line2[1])
            break

outputfile.close()

Any help?

Regards, Francois


银河中√捞星星 2024-09-18 11:06:43


Although I am not familiar with csv.reader or csv.writer, I have been dealing with UTF-8 file reading recently, and the codecs module might help you out.

Instead of

data = open(..., "rb")

try

import codecs

and then, for all your UTF-8 files, use

data = codecs.open(..., "rb", "utf-8")

This decodes your files from UTF-8 into unicode as they are read, and might let them be written to your output file correctly.
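
For readers on Python 3, a rough equivalent of this suggestion is sketched below. It is not the asker's Python 2 setup: it assumes Python 3, where the built-in open() takes an encoding argument, and it assumes the files are UTF-8 (a Windows export may well be cp1252 instead). The helper names read_rows and write_rows and the sample data are made up for illustration. On Python 2, the csv module documentation states that it does not support Unicode input, so a codecs.open stream handed to csv.reader is likely to fail on accented characters.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Return all rows of a CSV file, decoding it with the given encoding."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return list(csv.reader(handle))


def write_rows(path, rows, encoding="utf-8"):
    """Write rows to a CSV file, encoding text with the given encoding."""
    with open(path, "w", encoding=encoding, newline="") as handle:
        csv.writer(handle).writerows(rows)


# Round trip through a temporary file (hypothetical sample data).
sample = [["Valérie Lamarche", "Montréal-Nord"]]
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
write_rows(tmp_path, sample)
round_trip = read_rows(tmp_path)
```

If the accented characters come back garbled with this sketch, the encoding guess is probably wrong; passing encoding="cp1252" or encoding="latin-1" is the usual next thing to try.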


空心↖ 2024-09-18 11:06:43


The problem is in this line:

resultat.writerow( [ `info1`,`info2` ,`line[1]`,`line[2]`,`line[3]`,`line[4]`,`line[5]`,`line[6]`,`line[7]`,`line[8]`,`line[9]`,`line[10]`,`line[11]`,`line[12]`,`line[13]`,`line[14]`,`line[15]`,`line[16]`,`line[17]` ] )

Wrapping an expression in "back-ticks" aka "grave accents" is an old-fashioned and deprecated way of saying repr(expression).

Please consider the following:

>>> s = "Montréal"
>>> print s
Montréal
>>> print repr(s)
'Montr\xe9al'
>>> ord(s[5])
233
>>> hex(233)
'0xe9'
>>> s == "Montr\xe9al"
True
>>> `s` == repr(s)
True
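
The session above is Python 2. As a minimal sketch of the same effect on Python 3 (an assumption, since the question uses Python 2), the built-in ascii() escapes non-ASCII characters the way Python 2's repr() did, so writing its result to a CSV reproduces the unwanted 'Montr\xe9al' output, while writing the plain string does not. The in-memory buffer stands in for the real output file.

```python
import csv
import io

name = "Montréal"

# ascii() escapes non-ASCII characters, much like repr() did on Python 2.
escaped = ascii(name)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([escaped])  # roughly what the back-ticks caused to be written
writer.writerow([name])     # the plain value, written unchanged
lines = buffer.getvalue().splitlines()
```

Here lines[0] holds the quoted, escaped form and lines[1] holds the name exactly as typed.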

The offending (in 3 ways) line should simply be replaced by

resultat.writerow([info1, info2] + [line[1:18]]) # WRONG (sorry!)
resultat.writerow([info1, info2] + line[1:18]) # RIGHT
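
Putting the corrected line in context, the whole lookup could be sketched as follows. This assumes Python 3 rather than the asker's Python 2, uses in-memory buffers and invented sample rows in place of the real files, and introduces a hypothetical helper named join_contacts. It also swaps the nested loop with seek(0) for a dictionary built once from the second file, which is a different technique from the original code, though it keeps the same "first match wins" behaviour.

```python
import csv
import io


def join_contacts(contact_rows, id_rows):
    """Yield [name, customer_id] plus the remaining columns for matched names."""
    # Build the name -> ID lookup once; setdefault keeps the first match,
    # like the break in the original nested loop.
    ids_by_name = {}
    for row in id_rows:
        if len(row) >= 2:
            ids_by_name.setdefault(row[0], row[1])
    for row in contact_rows:
        if row and row[0] in ids_by_name:
            # No repr() or back-ticks: the values are written as they are.
            yield [row[0], ids_by_name[row[0]]] + row[1:]


# Hypothetical sample data standing in for the two input files.
contacts_csv = "Valérie Lamarche,Montréal-Nord,555-0100\nUnknown Person,Laval,555-0199\n"
ids_csv = "Valérie Lamarche,C-001\n"

out = io.StringIO()
csv.writer(out).writerows(
    join_contacts(csv.reader(io.StringIO(contacts_csv)),
                  csv.reader(io.StringIO(ids_csv)))
)
result = out.getvalue()
```

With real files, the io.StringIO objects would be replaced by files opened with an explicit encoding and newline="". The sketch copies every remaining column with row[1:]; the slice line[1:18] from the answer above would keep only the first 17 of them.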
