Building an email database with pywin32
I have to iterate through many shared folders within my company, and I need to store the text of every message in a database for a machine learning project.
I read in some threads that I should not iterate through all items using pywin32 and a line like this:
messages = mapi.Folders(str(key)).Folders(str(i)).Items
We are talking about 10 shared inboxes with more than 10,000 mails each. When I ran my code for the first inbox, it crashed after reaching 95% (according to tqdm) with this error:
File ~\.conda\envs\myenv\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

File c:\users\user\desktop\untitled0.py:206 in <module>
    for m in tqdm(messages):

File ~\.conda\envs\myenv\lib\site-packages\tqdm\std.py:1195 in __iter__
    for obj in iterable:

File ~\AppData\Roaming\Python\Python39\site-packages\win32com\client\dynamic.py:324 in __getitem__
    return self._get_good_object_(self._enum_.__getitem__(index))

File ~\AppData\Roaming\Python\Python39\site-packages\win32com\client\util.py:41 in __getitem__
    return self.__GetIndex(index)

File ~\AppData\Roaming\Python\Python39\site-packages\win32com\client\util.py:62 in __GetIndex
    result = self._oleobj_.Next(1)

com_error: (-459013867, 'OLE error 0xe4a40115', None, None)
I don't understand this error, and searching for it turns up nothing. So I thought I should slice the messages into smaller parts and run the code on each part. But I can't slice them, because I get this error:
TypeError: Objects of type 'slice' can not be converted to a COM VARIANT
Is there another way, or even a more efficient way, to do this task? Should I use a library other than pywin32? It seems to be the most capable library in this field, as it is from Microsoft and should have the best compatibility with Outlook. But getting through 48,000 of 50,000 emails and then hitting this crash was not the plan. How can I save my progress along the way? Can I sort the messages first, then iterate and commit the retrieved data to an SQL database every 5,000 emails?
Comments (1)
You must release an Outlook object when you have finished using it. This is particularly important if your code enumerates more than 256 Outlook items in a collection that is stored on a Microsoft Exchange Server. That number was increased in recent Exchange versions, but the strategy remains valid. If you do not release these objects in a timely manner, you can reach the limit Exchange imposes on the maximum number of items open at any one time. You can read more about Exchange limits in the "Managed Store Limits in Exchange Server" article.
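To illustrate the release-as-you-go idea together with saving progress, here is a minimal sketch rather than a definitive implementation. It assumes `items` is an Outlook `Items` collection obtained through win32com, which exposes `Count` and a 1-based `Item(index)` method; that also sidesteps slicing, which COM collections do not support. The function names, the `mails` table, and the batch size are hypothetical. It also assumes that dropping the Python reference is enough for pywin32 to release the underlying COM object, and that the folder contents do not change between runs, since otherwise the indexes shift.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) table that holds the extracted strings."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mails "
        "(idx INTEGER PRIMARY KEY, subject TEXT, body TEXT)"
    )
    conn.commit()


def next_index(conn):
    """Return the 1-based index to resume from after a crash."""
    row = conn.execute("SELECT COALESCE(MAX(idx), 0) + 1 FROM mails").fetchone()
    return row[0]


def store_messages(items, conn, batch_size=500, start_index=1):
    """Copy Subject and Body of each item into SQLite, one item at a time.

    Only one item is referenced at any moment, and the work is committed
    every `batch_size` items so a crash loses at most one batch.
    """
    count = items.Count
    for i in range(start_index, count + 1):
        item = items.Item(i)  # Outlook collections are 1-based
        try:
            # Not every item is a mail; fall back to an empty string.
            row = (i, str(getattr(item, "Subject", "")), str(getattr(item, "Body", "")))
        finally:
            item = None  # drop the reference so the COM object can be released
        conn.execute("INSERT OR REPLACE INTO mails VALUES (?, ?, ?)", row)
        if i % batch_size == 0:
            conn.commit()
    conn.commit()
    return count
```

After a crash you would reopen the same database file and call `store_messages(items, conn, start_index=next_index(conn))` to continue where the last commit left off.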
I'd recommend processing items in batches, for example scoping each batch to a day or a week, to stay under the limits introduced by the Exchange server. The Find/FindNext or Restrict methods of the Items class can help you with such tasks. More detail on them is available in the Outlook object model documentation.
Be aware that the low-level API on which Outlook is based (Extended MAPI) allows running secondary threads, so you could run your work on a secondary thread and leave the Outlook UI responsive. If that API is too complex for you, consider using a wrapper around it, such as Redemption.
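As a rough sketch of the batching suggestion, the code below builds one Restrict filter per week and applies each to the collection. It assumes `items` is an Outlook `Items` collection obtained through win32com. The helper names and the seven-day window are hypothetical, and the date format in the filter string is an assumption: Outlook interprets dates in filters according to the machine's regional settings, so you may need to adjust `DATE_FMT` for your locale.

```python
from datetime import datetime, timedelta

# Assumed US-style date format; adjust to the regional settings Outlook uses.
DATE_FMT = "%m/%d/%Y %I:%M %p"


def window_filters(start, end, days=7):
    """Yield Restrict filter strings covering [start, end) in fixed windows."""
    current = start
    while current < end:
        upper = min(current + timedelta(days=days), end)
        yield "[ReceivedTime] >= '{}' AND [ReceivedTime] < '{}'".format(
            current.strftime(DATE_FMT), upper.strftime(DATE_FMT)
        )
        current = upper


def iter_windows(items, start, end, days=7):
    """Yield (filter, restricted_items) pairs, one small collection per window."""
    for flt in window_filters(start, end, days):
        yield flt, items.Restrict(flt)
```

Each restricted collection is small enough to process and commit on its own, so a failure in one window does not cost you the others, and the last completed window tells you where to resume.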