python如何写sql语句通过pymysql导入?
日志格式如下:
111.206.221.23 - - [14/Apr/2017:23:59:22 +0800] "POST /baidutj/tj HTTP/1.1" 302 -
207.46.13.93 - - [14/Apr/2017:23:59:21 +0800] "GET /article/412525 HTTP/1.1" 200 9418
68.180.228.253 - - [14/Apr/2017:23:59:22 +0800] "GET /article/513370 HTTP/1.1" 200 12765
180.163.255.198 - - [14/Apr/2017:23:59:23 +0800] "GET /article/225705 HTTP/1.1" 200 9704
207.46.13.93 - - [14/Apr/2017:23:59:23 +0800] "GET /article/3790 HTTP/1.1" 200 8610
180.163.255.199 - - [14/Apr/2017:23:59:23 +0800] "GET /article/67312 HTTP/1.1" 200 8200
180.163.255.195 - - [14/Apr/2017:23:59:25 +0800] "GET /article/251676 HTTP/1.1" 200 10269
180.163.255.197 - - [14/Apr/2017:23:59:27 +0800] "GET /article/381091 HTTP/1.1" 200 8223
111.206.221.104 - - [14/Apr/2017:23:59:27 +0800] "POST /baidutj/tj HTTP/1.1" 302 -
180.163.255.195 - - [14/Apr/2017:23:59:27 +0800] "GET /article/5928 HTTP/1.1" 200 10455
40.77.167.58 - - [14/Apr/2017:23:59:29 +0800] "GET /article/100364 HTTP/1.1" 200 12909
180.163.255.197 - - [14/Apr/2017:23:59:29 +0800] "GET /article/135051 HTTP/1.1" 200 8640
用正则提取这些数据并处理,然后通过pymysql存入mysql
但是运行后并没有存入mysql
python也没有报错,
这个要怎么改?
# 定义日志格式,利用非贪婪匹配和分组匹配,需要严格参照日志定义中的分隔符和引号
log_pattern = r'^(?P<remote_addr>.*?) - \[(?P<time_local>.*?)\] "(?P<request>.*?)"' \
r' (?P<status>.*?) (?P<body_bytes_sent>.*?) (?P<request_time>.*?)' \
r' "(?P<http_referer>.*?)" "(?P<http_user_agent>.*?)" - (?P<http_x_forwarded_for>.*)$'
# request的正则,其实是由 "request_method request_uri server_protocol"三部分组成
request_uri_pattern = r'^(?P<request_method>(GET|POST|HEAD|DELETE)?) (?P<request_uri>.*?) (?P<server_protocol>HTTP.*)$'
# 日志目录
log_dir = '/data/wwwlogs/baiwenjie/'
# 要处理的站点(可随需要想list中添加)
todo = ['baiwenjie']
# MySQL相关设置
mysql_host = 'xxxx'
mysql_user = 'xxxxxx'
mysql_passwd = 'xxx'
mysql_port = 3306
mysql_database = 'xxx'
# 表结构
creat_table = "CREATE TABLE IF NOT EXISTS {} (\
id bigint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,\
server char(11) NOT NULL DEFAULT '',\
# uri_abs varchar(200) NOT NULL DEFAULT '' COMMENT '对$uri做uridecode,然后做抽象化处理',\
# uri_abs_crc32 bigint unsigned NOT NULL DEFAULT '0' COMMENT '对上面uri_abs字段计算crc32',\
# args_abs varchar(200) NOT NULL DEFAULT '' COMMENT '对$args做uridecode,然后做抽象化处理',\
# args_abs_crc32 bigint unsigned NOT NULL DEFAULT '0' COMMENT '对上面args字段计算crc32',\
time_local timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',\
response_code smallint NOT NULL DEFAULT '0',\
bytes_sent int NOT NULL DEFAULT '0' COMMENT '发送给客户端的响应大小',\
request_time float(6,3) NOT NULL DEFAULT '0.000',\
# user_ip varchar(40) NOT NULL DEFAULT '',\
# cdn_ip varchar(15) NOT NULL DEFAULT '' COMMENT 'CDN最后节点的ip:空字串表示没经过CDN; - 表示没经过CDN和F5',\
request_method varchar(7) NOT NULL DEFAULT '',\
uri varchar(255) NOT NULL DEFAULT '' COMMENT '$uri,已做uridecode',\
args varchar(255) NOT NULL DEFAULT '' COMMENT '$args,已做uridecode',\
referer varchar(255) NOT NULL DEFAULT '' COMMENT '',\
# KEY time_local (time_local),\
# KEY uri_abs_crc32 (uri_abs_crc32),\
# KEY args_abs_crc32 (args_abs_crc32)\
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 row_format=compressed"
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论