
Regular Expressions (regular): Summary Notes

2023-05-25


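To use regular expressions in Python, import the re module (re is short for "regular expression"). Regular expressions operate on strings. A string often contains information we want to extract, and knowing these string-processing functions makes many tasks much easier.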

    Regular expressions (regular): a way of processing strings.

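    Regular expressions are used all the time. File processing is very common in Python, files contain text, and regular expressions are one of the main tools for working with that text, so they are worth learning well. Let's go through the functions the re module provides: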

    (1)match(pattern, string, flags=0)

        def match(pattern, string, flags=0):
            """Try to apply the pattern at the start of the string, returning
            a match object, or None if no match was found."""
            return _compile(pattern, flags).match(string)

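    The docstring above says: "Try to apply the pattern at the start of the string, returning a match object, or None if no match was found." In other words, match() tries the pattern at the beginning of the string and returns a match object, or None if the pattern does not match there.

    Key points: (1) matching starts at the beginning of the string; (2) if there is no match, None is returned.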

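    Here are a few examples:

        import re

        string = "abcdef"

        m = re.match("abc", string)     # pattern occurs at the start of the string
        print(m)                        # (1) the returned match object
        print(m.group())                # (2) the matched text

        n = re.match("abcf", string)    # pattern does not occur in the string
        print(n)                        # (3)

        l = re.match("bcd", string)     # pattern occurs only in the middle of the string
        print(l)                        # (4)

    Output:

        <_sre.SRE_Match object; span=(0, 3), match='abc'>    (1)
        abc                                                  (2)
        None                                                 (3)
        None                                                 (4)

    Output (1) shows that match() returns a match object (on Python 3.7 and later its repr reads <re.Match object; ...>). To get the matched text out of it, call group(), as in (2). If the pattern does not occur in the string, None is returned, as in (3). match(pattern, string, flags) matches only at the start of the string, so a pattern that occurs only in the middle also yields None, as in (4).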
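    Because match() returns None on failure, calling group() directly on the result raises AttributeError when nothing matched. The sketch below shows one way to guard against that; the helper name match_prefix is made up for this illustration and is not part of the re module.

```python
import re

text = "abcdef"


def match_prefix(pattern, s):
    """Return the text matched at the start of s, or None if there is no match.

    Hypothetical helper for illustration only.
    """
    m = re.match(pattern, s)
    # m is None when the pattern does not match at position 0,
    # so test it before calling group().
    return m.group() if m is not None else None


prefix_hit = match_prefix("abc", text)    # pattern matches at position 0
middle_miss = match_prefix("bcd", text)   # "bcd" occurs, but not at the start
span = re.match("abc", text).span()       # (start, end) offsets of the match

print(prefix_hit, middle_miss, span)      # abc None (0, 3)
```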

    (2)fullmatch(pattern, string, flags=0)

        def fullmatch(pattern, string, flags=0):
            """Try to apply the pattern to all of the string, returning
            a match object, or None if no match was found."""
            return _compile(pattern, flags).fullmatch(string)

    The docstring above says: "Try to apply the pattern to all of the string, returning a match object, or None if no match was found"...
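    The article gives no example for fullmatch(), so here is a small sketch contrasting it with match() on the same "abcdef" string used earlier. The variable names are chosen for this illustration only.

```python
import re

text = "abcdef"

prefix_only = re.match("abc", text)          # matches: "abc" is a prefix
whole_short = re.fullmatch("abc", text)      # None: "def" is left unmatched
whole_exact = re.fullmatch("abcdef", text)   # matches the entire string
whole_class = re.fullmatch("[a-f]+", text)   # a pattern can also cover it all

print(prefix_only is not None)   # True
print(whole_short)               # None
print(whole_exact.group())       # abcdef
print(whole_class.span())        # (0, 6)
```

    fullmatch() is convenient for validating input, where the whole string, not just a prefix, has to fit the pattern.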

    (3)search(pattern, string, flags=0)

        def search(pattern, string, flags=0):
            """Scan through string looking for a match to the pattern, returning
            a match object, or None if no match was found."""
            return _compile(pattern, flags).search(string)

    The docstring of search(pattern, string, flags=0) says: "Scan through string looking for a match to the pattern, returning a match object, or None if no match was found." search() looks for the pattern at any position in the string. It returns a match object for the first position where the pattern matches, and None if it matches nowhere.

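    Key points: (1) the match may start at any position in the string, whereas match() only tries the beginning; (2) if there is no match, None is returned.

        import re

        string = "ddafsadadfadfafdafdadfasfdafafda"

        m = re.search("a", string)    # the first "a" is in the middle of the string
        print(m)                      # (1)
        print(m.group())              # (2)

        n = re.search("N", string)    # pattern does not occur
        print(n)                      # (3)

    Output:

        <_sre.SRE_Match object; span=(2, 3), match='a'>    (1)
        a                                                  (2)
        None                                               (3)

    Result (1) shows that search(pattern, string, flags=0) can match at any position, which makes it more widely applicable than match(), and that a successful search also returns a match object. As (2) shows, the matched text is obtained with group(). As (3) shows, None is returned when nothing is found.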
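    To make the difference from match() concrete, the following sketch runs both functions on the same string and reads the position of the hit from the match object. The pattern "fa+f" is an extra example introduced here, not one from the article.

```python
import re

text = "ddafsadadfadfafdafdadfasfdafafda"

at_start = re.match("a", text)     # None: the string starts with "d"
anywhere = re.search("a", text)    # finds the first "a"

first_index = anywhere.start()     # index where the match begins
first_span = anywhere.span()       # (start, end) pair

# search() returns only the leftmost match, even if there are several.
run = re.search("fa+f", text)

print(at_start)                    # None
print(first_index, first_span)     # 2 (2, 3)
print(run.group(), run.span())     # faf (12, 15)
```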

    (4)sub(pattern,repl,string,count=0,flags=0)

        def sub(pattern, repl, string, count=0, flags=0):
            """Return the string obtained by replacing the leftmost
            non-overlapping occurrences of the pattern in string by the
            replacement repl. repl can be either a string or a callable;
            if a string, backslash escapes in it are processed. If it is
            a callable, it's passed the match object and must return
            a replacement string to be used."""
            return _compile(pattern, flags).sub(repl, string, count)

    sub(pattern, repl, string, count=0, flags=0) is search and replace. It looks for pattern in string; repl is what each piece of matched text is replaced with; count limits how many matches are replaced (the default 0 means all of them). For example:

        import re

        string = "ddafsadadfadfafdafdadfasfdafafda"

        m = re.sub("a", "A", string)             # (1) no replacement count given
        print(m)

        n = re.sub("a", "A", string, count=2)    # (2) replace at most two matches
        print(n)

        l = re.sub("F", "B", string)             # (3) pattern does not occur
        print(l)

    Output:

        ddAfsAdAdfAdfAfdAfdAdfAsfdAfAfdA    (1)
        ddAfsAdadfadfafdafdadfasfdafafda    (2)
        ddafsadadfadfafdafdadfasfdafafda    (3)

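    In (1) no count is given, so every match is replaced by default. In (2) a count is given, so only that many matches are replaced, starting from the left. In (3) the pattern does not occur in the string, so the original string is returned unchanged.

    Key points: (1) count limits the number of replacements, and all matches are replaced when it is omitted; (2) if nothing matches, the original string is returned.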
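    The docstring also mentions that repl may contain backslash escapes or be a callable, which the example above does not exercise. Here is a minimal sketch of both forms; the function name upper is an assumption made for this illustration.

```python
import re

text = "ddafsadadfadfafdafdadfasfdafafda"

# String replacement with a backreference: \1 stands for the text
# captured by the first group, here wrapped in square brackets.
bracketed = re.sub(r"(a)", r"[\1]", text, count=2)


# Callable replacement: it receives the match object and must
# return the replacement string.
def upper(m):
    return m.group().upper()


upper_fa = re.sub("fa", upper, text)

print(bracketed)   # dd[a]fs[a]dadfadfafdafdadfasfdafafda
print(upper_fa)    # ddafsadadFAdFAfdafdadFAsfdaFAfda
```

    A callable is useful when the replacement depends on what was matched, which a fixed replacement string cannot express.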

    (5)subn(pattern,repl,string,count=0,flags=0)

        def subn(pattern, repl, string, count=0, flags=0):
            """Return a 2-tuple containing (new_string, number).
            new_string is the string obtained by replacing the leftmost
            non-overlapping occurrences of the pattern in the source
            string by the replacement repl. number is the number of
            substitutions that were made. repl can be either a string or a
            callable; if a string, backslash escapes in it are processed.
            If it is a callable, it's passed the match object and must
            return a replacement string to be used."""
            return _compile(pattern, flags).subn(repl, string, count)

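    The docstring says "Return a 2-tuple containing (new_string, number)": subn() returns a tuple (new_string, number) that holds the string after replacement and the number of replacements that were made.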

        import re

        string = "ddafsadadfadfafdafdadfasfdafafda"

        m = re.subn("a", "A", string)             # (1) replace every match
        print(m)

        n = re.subn("a", "A", string, count=3)    # (2) replace only some matches
        print(n)

        l = re.subn("F", "A", string)             # (3) pattern does not occur
        print(l)

    Output:

        ('ddAfsAdAdfAdfAfdAfdAdfAsfdAfAfdA', 11)    (1)
        ('ddAfsAdAdfadfafdafdadfasfdafafda', 3)     (2)
        ('ddafsadadfadfafdafdadfasfdafafda', 0)     (3)

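    As the output shows, sub() and subn(pattern, repl, string, count=0, flags=0) perform exactly the same replacement and differ only in what they return: sub() returns the new string, while subn() returns a tuple holding the new string and the number of replacements made.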
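    Since subn() returns a 2-tuple, the usual way to consume it is tuple unpacking. The sketch below does that for the three cases above; the variable names are illustrative only.

```python
import re

text = "ddafsadadfadfafdafdadfasfdafafda"

# Unpack the (new_string, number) tuple into two names.
new_text, replaced = re.subn("a", "A", text)
capped_text, capped = re.subn("a", "A", text, count=3)
same_text, none_replaced = re.subn("F", "A", text)

print(replaced)        # 11
print(capped)          # 3
print(none_replaced)   # 0

# A count of 0 means nothing matched and the string is unchanged.
if none_replaced == 0:
    print("no replacement was made")
```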

    (6)split(pattern,string,maxsplit=0,flags=0)   

        def split(pattern, string, maxsplit=0, flags=0):
            """Split the source string by the occurrences of the pattern,
            returning a list containing the resulting substrings. If
            capturing parentheses are used in pattern, then the text of all
            groups in the pattern are also returned as part of the resulting
            list. If maxsplit is nonzero, at most maxsplit splits occur,
            and the remainder of the string is returned as the final element
            of the list."""
            return _compile(pattern, flags).split(string, maxsplit)

    split(pattern, string, maxsplit=0, flags=0) splits a string. It cuts the string wherever pattern matches and, in the words of the docstring, ends by "returning a list containing the resulting substrings". For example:

        import re

        string = "ddafsadadfadfafdafdadfasfdafafda"

        m = re.split("a", string)                # (1) split on every match
        print(m)

        n = re.split("a", string, maxsplit=3)    # (2) limit the number of splits
        print(n)

        l = re.split("F", string)                # (3) pattern does not occur
        print(l)

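    Output:

        ['dd', 'fs', 'd', 'df', 'df', 'fd', 'fd', 'df', 'sfd', 'f', 'fd', '']    (1)
        ['dd', 'fs', 'd', 'dfadfafdafdadfasfdafafda']                            (2)
        ['ddafsadadfadfafdafdadfasfdafafda']                                     (3)

    (1) shows that when the string begins or ends with the separator, the list gets an empty string "" at that end; here the string ends with "a", so the last element is "". (2) shows that the number of splits can be limited, with the rest of the string returned as the final element. (3) shows that when the pattern does not occur, the list contains just the original string.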
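    The docstring mentions two behaviours the example above does not show: a pattern can be more than one literal character, and capturing parentheses cause the separators themselves to be kept in the result. A short sketch follows; the sample strings are made up for this illustration.

```python
import re

record = "2023-05-25 10:30:00"   # hypothetical sample input

# A character class splits on any one of several separators.
plain = re.split(r"[-: ]", record)

# With capturing parentheses, the matched separators are returned too.
kept = re.split(r"([-: ])", "a-b:c")

# maxsplit stops after two splits; the remainder stays in one piece.
limited = re.split(r"[-: ]", record, maxsplit=2)

print(plain)     # ['2023', '05', '25', '10', '30', '00']
print(kept)      # ['a', '-', 'b', ':', 'c']
print(limited)   # ['2023', '05', '25 10:30:00']
```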

    (7)findall(pattern, string, flags=0)

        def findall(pattern, string, flags=0):
            """Return a list of all non-overlapping matches in the string.
            If one or more capturing groups are present in the pattern, return
            a list of groups; this will be a list of tuples if the pattern
            has more than one group.
            Empty matches are included in the result."""
            return _compile(pattern, flags).findall(string)

    findall(pattern, string, flags=0) returns a list containing every match of the pattern in the string. For example:

        import re

        string = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"

        m = re.findall("[a-z]", string)    # (1) every lowercase letter, as a list
        print(m)

        n = re.findall("[0-9]", string)    # (2) every digit, as a list
        print(n)

        l = re.findall("[ABC]", string)    # (3) pattern does not occur
        print(l)

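    Output:

        ['d', 'd', 'a', 'd', 'f', 'a', 'd', 'f', 'a', 'f', 'd', 'a', 'f', 'd',
         'a', 'd', 'f', 'a', 's', 'f', 'd', 'a', 'f', 'a', 'f', 'd', 'a']         (1)
        ['1', '2', '3', '2', '4', '6', '4', '6', '5', '1', '6', '4', '8', '1',
         '5', '6', '4', '1', '2', '7', '1', '1', '3', '0', '0', '2', '5', '8']    (2)
        []                                                                        (3)

    In (1) every letter in the string is matched, one character per match. In (2) every digit in the string is matched and returned in a list. In (3) nothing matches, so an empty list is returned.

    Key points: (1) when nothing matches, an empty list is returned; (2) without a quantifier in the pattern, each match is a single character.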
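    Key point (2) suggests the natural next step: add a quantifier so that runs of characters are returned as single matches. The docstring also says that capturing groups change what the list contains. Both are sketched below on the same input string; the patterns are examples added here, not taken from the article.

```python
import re

text = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"

# "+" makes each match cover a whole run of digits, not one digit.
numbers = re.findall(r"[0-9]+", text)

# With two capturing groups, each list element is a tuple of the groups.
pairs = re.findall(r"([a-z]+)([0-9]+)", text)

print(numbers)
# ['12', '32', '46465', '1648', '1564', '127', '11', '30', '02', '58']
print(pairs[:3])
# [('dd', '12'), ('a', '32'), ('d', '46465')]
```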

    (8)finditer(pattern,string,flags=0)

        def finditer(pattern, string, flags=0):
            """Return an iterator over all non-overlapping matches in the
            string. For each match, the iterator returns a match object.
            Empty matches are included in the result."""
            return _compile(pattern, flags).finditer(string)

    finditer(pattern, string) also finds every match. Its docstring says: "Return an iterator over all non-overlapping matches in the string. For each match, the iterator returns a match object."

    The code:

        import re

        string = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"

        m = re.finditer("[a-z]", string)
        print(m)                            # (1)

        n = re.finditer("AB", string)       # pattern does not occur
        print(n)                            # (2)

    Output:

        <callable_iterator object at 0x...>    (1)
        <callable_iterator object at 0x...>    (2)

    As the output shows, finditer(pattern, string, flags=0) returns an iterator object, even when nothing matches; in that case the iterator is simply empty.
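    Printing the iterator says little about the matches themselves; to see them, loop over it. The sketch below collects the text and position of each match object the iterator yields. The digit pattern is an example added here.

```python
import re

text = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"

# Each item produced by the iterator is a match object, so group()
# and span() are available for every match.
matches = [(m.group(), m.span()) for m in re.finditer(r"[0-9]+", text)]

# When nothing matches, the iterator yields no items at all.
no_matches = list(re.finditer("AB", text))

print(matches[:2])    # [('12', (2, 4)), ('32', (5, 7))]
print(len(matches))   # 10
print(no_matches)     # []
```

    Compared with findall(), finditer() gives access to positions as well as text, and it produces matches lazily, one at a time.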

    (9)compile(pattern,flags=0)
