How can I make my CSV/Excel file compile data from multiple outputs?
I'm writing code that scans my emails for a certain pattern. My goal is to produce a CSV file that lists all the occurrences in one file, but my code adds ONLY the last email to the CSV. Here's the relevant code:
pattern = re.compile(
    r"([a-zA-Z]+[0-9]+) Line ([0-9]+) Seq ([0-9]) ([0-9]+/[0-9]+/[0-9]+)")
matches = pattern.finditer(body)
with open("data.csv", "w") as f_out:
    writer = csv.writer(f_out)
    writer.writerows(map(lambda m: m.groups(), matches))
The emails it processes are the following:
First email:
PUU128378 Line 20 Seq 1 5/22/2023
PUN102939 Line 100 Seq 8 11/1/2024
PUU012939 Line 120 Seq 4 1/1/2025
Second email:
PUU128377 Line 20 Seq 1 5/22/2023
PUN102938 Line 100 Seq 8 11/1/2024
PUU012938 Line 120 Seq 4 1/1/2025
The Excel file looks like:
I would like it to look like:
The rest of my code:
for i in range(messages, messages-N, -1):
    # fetch the email message by ID
    res, msg = imap.fetch(str(i), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            # parse a bytes email into a message object
            msg = email.message_from_bytes(response[1])
            # decode the email subject
            subject, encoding = decode_header(msg["Subject"])[0]
            if isinstance(subject, bytes):
                # if it's a bytes, decode to str
                subject = subject.decode(encoding)
            # decode email sender
            From, encoding = decode_header(msg.get("From"))[0]
            if isinstance(From, bytes):
                From = From.decode(encoding)
            print("Subject:", subject)
            print("From:", From)
            # if the email message is multipart
            if msg.is_multipart():
                # iterate over email parts
                for part in msg.walk():
                    # extract content type of email
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))
                    try:
                        # get the email body
                        body = part.get_payload(decode=True).decode()
                    except:
                        pass
                    if content_type == "text/plain" and "attachment" not in content_disposition:
                        # print text/plain emails and skip attachments
                        pattern = re.compile(r"([a-zA-Z]+[0-9]+) Line ([0-9]+) Seq ([0-9]) ([0-9]+/[0-9]+/[0-9]+)")
                        matches = pattern.finditer(body)
                        with open("data.csv", "w") as f_out:
                            writer = csv.writer(f_out)
                            writer.writerows(map(lambda m: m.groups(), matches))
                        for match in matches:
                            print(match)
New Edit
with open("data.csv", "w") as f_out:
    writer = csv.writer(f_out)
    for i in range(messages, messages-N, -1):
        # fetch the email message by ID
        res, msg = imap.fetch(str(i), "(RFC822)")
        for response in msg:
            if isinstance(response, tuple):
                # parse a bytes email into a message object
                msg = email.message_from_bytes(response[1])
                # decode the email subject
                subject, encoding = decode_header(msg["Subject"])[0]
                if isinstance(subject, bytes):
                    # if it's a bytes, decode to str
                    subject = subject.decode(encoding)
                # decode email sender
                From, encoding = decode_header(msg.get("From"))[0]
                if isinstance(From, bytes):
                    From = From.decode(encoding)
                print("Subject:", subject)
                print("From:", From)
                # iterate over email parts
                for part in msg.walk():
                    # extract content type of email
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))
                    payload = part.get_payload(decode=True)
                    if payload is None:
                        continue
                    body = payload.decode()
                    pattern = re.compile(
                        r"([a-zA-Z]+[0-9]+) Line ([0-9]+) Seq ([0-9]) ([0-9]+/[0-9]+/[0-9]+)")
                    matches = pattern.finditer(body)
                    writer.writerows(map(lambda m: m.groups(), matches))
Comments (1)
The reason only the last email ends up in the CSV file is that your code overwrites any existing .csv file each time it processes a message. The solution is to open the file for writing only once, outside the message retrieval loop. Below is an outline of what I'm suggesting based on the code you added to your question:
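The outline itself did not survive on this page, so what follows is only a minimal sketch of the idea, not the answerer's original code. It assumes the decoded text/plain bodies are already available as strings; the helper name write_matches and its bodies parameter are hypothetical stand-ins for the IMAP loop in the question. The regular expression and the two sample emails are taken from the question.

```python
import csv
import re

# The pattern from the question, compiled once instead of once per message part.
PATTERN = re.compile(
    r"([a-zA-Z]+[0-9]+) Line ([0-9]+) Seq ([0-9]) ([0-9]+/[0-9]+/[0-9]+)")


def write_matches(bodies, path="data.csv"):
    """Write every pattern match from every body into a single CSV file.

    `bodies` is a hypothetical stand-in for the decoded email bodies that
    the IMAP loop in the question produces (one string per email).
    """
    # Open the file ONCE, before looping over the messages. Opening it with
    # mode "w" inside the loop truncates it again for every email, which is
    # why only the last email's rows were left in the file.
    with open(path, "w", newline="") as f_out:
        writer = csv.writer(f_out)
        for body in bodies:
            writer.writerows(m.groups() for m in PATTERN.finditer(body))


# The two sample emails from the question.
first_email = (
    "PUU128378 Line 20 Seq 1 5/22/2023\n"
    "PUN102939 Line 100 Seq 8 11/1/2024\n"
    "PUU012939 Line 120 Seq 4 1/1/2025\n"
)
second_email = (
    "PUU128377 Line 20 Seq 1 5/22/2023\n"
    "PUN102938 Line 100 Seq 8 11/1/2024\n"
    "PUU012938 Line 120 Seq 4 1/1/2025\n"
)

write_matches([first_email, second_email])
```

With the file opened outside the loop, the rows from both emails land in the same data.csv. Opening the file in append mode ("a") inside the loop would also keep earlier rows, but it would keep rows from previous runs of the script as well, so opening once with "w" is usually the simpler choice.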