Regular expression and csv problems in Python 2.7
I used the following to fix the problems (for the remaining issues, I will rework my code). Sorry for the improper code formatting in my initial post.
import csv, re, mechanize
# br is assumed to be the mechanize.Browser() opened earlier in the script (not shown)
htmlML = br.response().read()
# escaping ? (and writing the digits as \d+) fixed the regex match
patMemberName = re.compile(r'<a href=/foo\.php\?XID=(\d+) ><font color=#000000><b>(.*) </b>')
searchMemberName = re.findall(patMemberName, htmlML)
MembersCsv = 'path-to-csv'
MemberWriter = csv.writer(open(MembersCsv, 'wb'))  # adding b fixed the \n in csv
for i in searchMemberName:
    MemberWriter.writerow(i)
    print (i)
Thank you for your time.
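For anyone reading this on Python 3, here is a minimal, self-contained sketch of the same idea. The HTML sample and the member names are made up, and an in-memory buffer stands in for the real file. In Python 2.7 on Windows, text mode turns the csv module's "\r\n" row terminator into "\r\r\n", which is why 'wb' helped; in Python 3 the counterpart is opening the file with newline=''.

```python
import csv
import io
import re

# Made-up HTML standing in for br.response().read(); the names are hypothetical.
html = ('<a href=/foo.php?XID=123 ><font color=#000000><b>Alice </b>\n'
        '<a href=/foo.php?XID=456 ><font color=#000000><b>Bob </b>\n')

# Same pattern as in the post, with ? and . escaped and \d+ for the digits.
pattern = re.compile(r'<a href=/foo\.php\?XID=(\d+) ><font color=#000000><b>(.*) </b>')
rows = pattern.findall(html)  # list of (XID, name) tuples

# io.StringIO(newline='') plays the role of open(path, 'w', newline='') here,
# so the csv module's own line endings are written unchanged.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
for row in rows:
    writer.writerow(row)

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replacing the buffer with open(MembersCsv, 'w', newline='') inside a with block should behave the same way.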
Comments (2)
Unfortunately, I can't find a matching escape sequence for Python right now. In other regex flavors you would wrap text whose metacharacters should not be interpreted in "\Q...\E".
In Python, try passing your string through re.escape(string) instead.
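A minimal sketch of that suggestion, assuming the literal part of the link is known in advance. The sample line and the value 123 are made up for illustration; re.escape() backslash-escapes the metacharacters so the text is matched literally.

```python
import re

# Hypothetical literal prefix from the page markup; '?' and '.' are regex metacharacters.
literal_prefix = '<a href=/foo.php?XID='
sample = '<a href=/foo.php?XID=123 ><font color=#000000><b>Some Name </b>'

# Without escaping, 'p?' means "an optional p", so the literal '?' is never matched.
unescaped = re.search(literal_prefix + r'(\d+)', sample)

# re.escape() makes the prefix literal; the capture group is appended afterwards.
pattern = re.compile(re.escape(literal_prefix) + r'(\d+)')
xid = pattern.search(sample).group(1)

print(unescaped)
print(xid)
```

Only the literal part goes through re.escape(); the capture group stays outside so that it keeps its regex meaning.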
For question 1), you have to escape the ? in the pattern. Then the 123 can be extracted from the string.
Question 2a): you can use (.*?) in place of some string; the ? means a non-greedy match.
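To illustrate the difference, here is a small sketch with a made-up sample in which two members sit on the same line. The names and XID values are hypothetical; the pattern follows the one in the question.

```python
import re

# Made-up sample: two member links on one line, where greedy matching goes wrong.
sample = ('<a href=/foo.php?XID=123 ><font color=#000000><b>Alice </b>'
          '<a href=/foo.php?XID=456 ><font color=#000000><b>Bob </b>')

prefix = r'<a href=/foo\.php\?XID=(\d+) ><font color=#000000><b>'

# Greedy (.*) runs to the LAST ' </b>' on the line, swallowing the second link.
greedy = re.findall(prefix + r'(.*) </b>', sample)

# Non-greedy (.*?) stops at the FIRST ' </b>', giving one tuple per member.
lazy = re.findall(prefix + r'(.*?) </b>', sample)

print(greedy)
print(lazy)
```

When every link is on its own line the two forms give the same result, because . does not match a newline by default.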