免费注册 查看新帖 |

Chinaunix

  平台 论坛 博客 文库
最近访问板块 发新帖
查看: 4068 | 回复: 2
打印 上一主题 下一主题

python正则多行匹配 [复制链接]

论坛徽章:
0
跳转到指定楼层
1 [收藏(0)] [报告]
发表于 2015-06-30 14:45 |只看该作者 |倒序浏览
本人因工作需要,需要写一个python脚本定时检测263邮箱并查看mysql的slowquery邮件,脚本代码如下:
#!/usr/bin/env python
# -*- coding: utf-8 -*-  
import imaplib
import email
def extract_body(payload):
    if isinstance(payload,str):
        return payload
    else:
        return '\n'.join([extract_body(part.get_payload()) for part in payload])

conn = imaplib.IMAP4("imap.263.net",143)
conn.login("ops@tech.com", "xxxxxxxxx"
conn.select()
typ, data = conn.search(None,’FROM’,’”mysql”’,’UNSEEN')
try:
    for num in data[0].split():
        typ, msg_data = conn.fetch(num, '(RFC822)')
        for response_part in msg_data:
            if isinstance(response_part, tuple):
                msg = email.message_from_string(response_part[1])
                From=msg['from']
                date=msg['date']
                subject=msg['subject']
                payload=msg.get_payload()
                body=extract_body(payload)
                print(subject)               
                print(body)

        #typ, response = conn.store(num, '+FLAGS', r'(\Seen)')  
finally:
    try:
        conn.close()
    except:
        pass
    conn.logout()





输出内容如下:
mysql@tech.com (发件人)

# 839s user time, 1.1s system time, 66.83M rss, 247.18M vsz (正文)
# Current date: Sat Jun 27 04:14:06 2015
# Hostname: mysqldb
# Files: /home/dba/slowlog/db-slow-150626.log
# Overall: 3.19M total, 891 unique, 36.92 QPS, 0.61x concurrency _________
# Time range: 2015-06-26 00:00:03 to 23:59:58
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 52727s 57us 421s 17ms 14ms 385ms 366us
# Lock time 1004s 0 4s 314us 159us 5ms 31us
# Rows sent 140.35M 0 901.28k 46.14 51.63 519.96 26.08
# Rows examine 3.17G 0 13.31M 1.04k 793.42 25.83k 51.63
# Rows affecte 5.23M 0 4.50M 1.72 0 2.57k 0
# Bytes sent 18.82G 0 1005.37M 6.19k 6.01k 550.16k 3.19k
# Query size 284.84M 6 174.03k 93.64 223.14 180.86 54.21
# Time 2.25G 147.10k 147.10k 147.10k 147.10k 0 147.10k


# Query 1: 0.17 QPS, 69.11x concurrency, ID 0x79D92C01B791F7D2 at byte 653180000
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.20
# Time range: 2015-06-26 19:51:22 to 19:51:34
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 2
# Exec time 1 829s 408s 421s 415s 421s 9s 415s
# Lock time 0 364us 175us 189us 182us 189us 9us 182us
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 0 11.05M 5.46M 5.59M 5.52M 5.59M 96.52k 5.52M
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 0 0 0 0 0 0 0
# Query size 0 408 204 204 204 204 0 204
# String:
# Databases xxxxxxx
# Hosts 172.16.0.166 (1/50%), 172.16.0.90 (1/50%)
# Last errno 1160
# Users xxxxxxx
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s
# 10s+ ################################################################
# Tables
# SHOW TABLE STATUS FROM `xxxx` LIKE 'xxxx'\G
# SHOW CREATE TABLE `xxxxx`.`xxxxx`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT `xxxxx`.* FROM `xxxxxxxxxx` WHERE (xxxxxxx NOT IN (266)) AND (xxxxxxx = '1179') AND (xxxxxx != AND (xxxxxx IN (2, 3, 4, 5, 6))

ORDER BY `xxxxxx`.`xxxxxxxxx` DESC LIMIT 50\G

# Query 2: 0 QPS, 0x concurrency, ID 0xC28655C387745DC1 at byte 116383530
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2015-06-26 16:05:00
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 1
# Exec time 0 74s 74s 74s 74s 74s 0 74s
# Lock time 0 6ms 6ms 6ms 6ms 6ms 0 6ms
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 0 4.50M 4.50M 4.50M 4.50M 4.50M 0 4.50M
# Rows affecte 86 4.50M 4.50M 4.50M 4.50M 4.50M 0 4.50M
# Bytes sent 0 59 59 59 59 59 0 59
# Query size 0 85 85 85 85 85 0 85
# String:
# Databases testdb
# Hosts 172.16.1.194
# Last errno 0
# Users xxxxxxxx
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s
# 10s+ ################################################################
# Tables
# SHOW TABLE STATUS FROM `xxxxxxxx` LIKE 'xxxxxxxxx'\G
# Converted for EXPLAIN
# EXPLAIN /*!50100 PARTITIONS*/
select * from xxxxxxx.xxxxxx\G

# Query 3: 0.03 QPS, 0.51x concurrency, ID 0x625F7392919EF414 at byte 30988
# This item is included in the report because it matches --limit.
# Scores: V/M = 2.89
# Time range: 2015-06-26 03:05:22 to 03:24:33
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 0 31
# Exec time 1 582s 15s 39s 19s 35s 7s 15s
# Lock time 0 847us 19us 55us 27us 44us 8us 23us
# Rows sent 0 14.65k 0 500 483.87 487.09 86.06 487.09
# Rows examine 6 200.22M 5.27M 13.31M 6.46M 11.87M 2.43M 5.18M
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 267.82k 137 8.92k 8.64k 8.89k 1.55k 8.89k
# Query size 0 5.53k 181 183 182.65 174.84 0 174.84
# Time 0 4.45M 147.10k 147.10k 147.10k 147.10k 0 147.10k
# String:
# Databases xxxxxxx
# Hosts 172.16.2.133
# Last errno 0
# Users xxxxxxxx
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s
# 10s+ ################################################################
# Tables
# SHOW TABLE STATUS FROM `xxxxxxx` LIKE 'xxxxxxxxxxxxx'\G
# SHOW CREATE TABLE `xxxxxxxx`.`xxxxxxx`\G
# EXPLAIN /*!50100 PARTITIONS*/
select xxxxxx,count(1) as total_count from xxxxxxx where xxxxxx >=1433952000 and xxxxxx < 1435248000 and xxxxxx=7 and xxxxxxx=0 group by

xxxxxx limit 5500 , 500\G






这其中内容太多,有些内容是我不需要的,想把正文内容输出为以下形式:
# Hostname: mysqldb
# Databases xxxxxxx
# Users xxxxxxxx
select xxxxxx,count(1) as total_count from xxxxxxx where xxxxxx >=1433952000 and xxxxxx < 1435248000 and xxxxxx=7 and xxxxxxx=0 group by
xxxxxx limit 5500 , 500\G

基本思路已经有了,就是在定义body的那一行对变量进行正则匹配多个字符串,输出多个字符串的所在行,但是不知道具体的匹配方法,请问各位大神如何才能达到需求? 匹配规则如下:
# Hostname所在行,#Databases所在行,#Users所在行,# EXPLAIN的下一行,脚本中定义body的相关代码如下:
payload=msg.get_payload()
body=extract_body(payload)

请问对body做何种处理才能满足我的需求呢? 谢谢!


论坛徽章:
1
2015年辞旧岁徽章
日期:2015-03-03 16:54:15
2 [报告]
发表于 2015-07-01 21:19 |只看该作者
sql语句的行。
  1. import re
  2. data = re.findall("EXPLAIN.*?\*/(.*?)G",body,re.S)
  3. for line in data:
  4. print line + "G"
复制代码
调试通过

论坛徽章:
4
白羊座
日期:2013-11-05 10:26:09冥斗士
日期:2015-11-17 14:19:55白银圣斗士
日期:2015-11-17 15:13:0815-16赛季CBA联赛之新疆
日期:2016-04-01 09:10:58
3 [报告]
发表于 2015-07-02 11:06 |只看该作者
回复 1# 1034136669
还是用正则吧:
  1. import re
  2. result = re.findall(r"^(?:# (?:Hostname|Databases|Users).*?$)|select.+?\\G", content, re.MULTILINE | re.IGNORECASE | re.DOTALL)
  3. print '\n'.join(result)
复制代码
其中,content是保存内容的变量。
您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

  

北京盛拓优讯信息技术有限公司. 版权所有 京ICP备16024965号-6 北京市公安局海淀分局网监中心备案编号:11010802020122 niuxiaotong@pcpop.com 17352615567
未成年举报专区
中国互联网协会会员  联系我们:huangweiwei@itpub.net
感谢所有关心和支持过ChinaUnix的朋友们 转载本站内容请注明原作者名及出处

清除 Cookies - ChinaUnix - Archiver - WAP - TOP