Shell Script Regular Expressions: The First of the "Three Musketeers" (grep, egrep)
Updated by: HHH  Date: 2023-1-7

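Regular Expressions in Shell Scripts

I. The first of the regular expression "three musketeers": grep

1. Before studying regular expressions, we prepare a throwaway configuration file to practice on.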

[root@localhost ~]# vim chen.txt

#version=DEVEL
# System authorization information
auth --enableshadow --passalgo=sha512
# Use CDROM installation media
cdrom
thethethe
THE
THEASDHAS
# Use graphical install
graphical
# Run the Setup Agent on first boot
firstboot --enable
ignoredisk --only-use=sda
wood
wd
wod
woooooooood
124153
3234
342222222
faasd11
2
ZASASDNA
short
shirt

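2. Searching for specific characters

"-vn" inverts the selection: to find the lines that do not contain "the", use grep's "-v" option (combined here with "-n").
"-n" displays line numbers.
"-i" makes the match case-insensitive.
After the command runs, the matching characters are shown in red when grep's color output is enabled.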

[root@localhost ~]# grep -n 'the' chen.txt
6:thethethe
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -in 'the' chen.txt
6:thethethe
7:THE
8:THEASDHAS
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -vn 'the' chen.txt
1:#version=DEVEL
2:# System authorization information
3:auth --enableshadow --passalgo=sha512
4:# Use CDROM installation media
5:cdrom
7:THE
8:THEASDHAS
9:# Use graphical install
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:124153
19:3234
20:342222222
21:faasd11
22:2
23:ZASASDNA
24:short
25:shirt
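The three options can be compared side by side on a much smaller input. The sketch below is only an illustration: the temporary file and its four lines are assumptions, not part of chen.txt.

```shell
# Hypothetical four-line sample file (name and contents are assumptions).
sample=$(mktemp)
printf '%s\n' 'thethethe' 'THE' 'cdrom' '# Run the Setup Agent' > "$sample"

with_the=$(grep -n 'the' "$sample")    # case-sensitive match, with line numbers
any_case=$(grep -in 'the' "$sample")   # -i also accepts "THE"
without=$(grep -vn 'the' "$sample")    # -v keeps only the non-matching lines

printf '%s\n' "$with_the" '--' "$any_case" '--' "$without"
rm -f "$sample"
```

Note that "-v" is still case-sensitive here, so the line "THE" counts as not containing "the".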

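3. Using brackets "[ ]" to match a set of characters
To find both "shirt" and "short", note that both strings contain "sh" and "rt". The following command finds the two strings at once. However many characters appear inside "[]", the set stands for exactly one character, so "[io]" matches either "i" or "o".

[root@localhost ~]# grep -n 'sh[io]rt' chen.txt  # matches "short" or "shirt" through the [io] set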
24:short
25:shirt
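To make the "one character per set" rule concrete, here is a small sketch on an assumed word list. The words "shart" and "shrt" are made up to show what the set rejects.

```shell
# Assumed word list; only "short" and "shirt" have exactly one of i/o between "sh" and "rt".
words=$(printf '%s\n' 'short' 'shirt' 'shart' 'shrt')

# [io] consumes exactly one character, so "shrt" (no character) and "shart" (wrong character) fail.
matched=$(printf '%s\n' "$words" | grep 'sh[io]rt')

printf '%s\n' "$matched"
```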

To find lines containing the repeated character "oo", run the following command.

[root@localhost ~]# grep -n 'oo' chen.txt 
11:# Run the Setup Agent on first boot
12:firstboot --enable
14:wood
17:woooooooood

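To find "oo" that is not preceded by "w", use the negated character set "[^]". For example, "grep -n '[^w]oo' chen.txt" searches chen.txt for "oo" whose preceding character is not "w".

[root@localhost ~]# grep -n '[^w]oo' chen.txt  # "oo" preceded by any character other than "w"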
11:# Run the Setup Agent on first boot
12:firstboot --enable
17:woooooooood

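The output above includes "woooooooood", even though that line begins with "w". The highlighted match explains why: the matched text is "ooo", in which the character in front of "oo" is an "o" rather than a "w", so the rule is satisfied.
To require that "oo" is not preceded by a lowercase letter, use "grep -n '[^a-z]oo' chen.txt", where "a-z" stands for the lowercase letters; uppercase letters are written "A-Z". From this example on, the outputs come from a chen.txt in which two lines, "dfsjdjoooooof" and "Foofddd", have been added after line 17.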

[root@localhost ~]# grep -n '[^a-z]oo' chen.txt 
19:Foofddd
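One detail worth checking by hand is that a negated set still has to consume a character. The sketch below uses an assumed word list, and sets LC_ALL=C as an assumption so that "a-z" means exactly the ASCII lowercase letters.

```shell
# Assumption: C locale, so the range a-z covers only ASCII lowercase letters.
export LC_ALL=C
lines=$(printf '%s\n' 'wood' 'boot' 'woooood' 'Foo' 'oops')

# "oo" preceded by some character that is not "w".
not_w=$(printf '%s\n' "$lines" | grep '[^w]oo')

# "oo" preceded by some character that is not a lowercase letter.
not_lower=$(printf '%s\n' "$lines" | grep '[^a-z]oo')

printf '%s\n' "$not_w" '--' "$not_lower"
```

"oops" is rejected by both patterns: its "oo" sits at the start of the line, so there is no preceding character for the set to match.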

Lines containing a digit can be found with "grep -n '[0-9]' chen.txt".

[root@localhost ~]# grep -n '[0-9]' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2

Matching the start of a line "^" and the end of a line "$"

[root@localhost ~]# grep -n '^the' chen.txt
6:thethethe

To find lines that begin with a lowercase letter, filter with the "^[a-z]" rule.

[root@localhost ~]# grep -n '^[a-z]' chen.txt
3:auth --enableshadow --passalgo=sha512
5:cdrom
6:thethethe
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:dfsjdjoooooof
23:faasd11
26:short
27:shirt

To find lines that begin with an uppercase letter:

[root@localhost ~]# grep -n '^[A-Z]' chen.txt
7:THE
8:THEASDHAS
19:Foofddd
25:ZASASDNA

To find lines that do not begin with a letter, use the "^[^a-zA-Z]" rule.

[root@localhost ~]# grep -n '^[^a-zA-Z]' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
9:# Use graphical install
11:# Run the Setup Agent on first boot
20:124153
21:3234
22:342222222
24:2
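The two roles of "^" can be seen together in one short sketch. The five sample lines are assumptions, and LC_ALL=C is set so the letter ranges behave as plain ASCII ranges.

```shell
# Assumption: C locale for predictable a-z / A-Z ranges.
export LC_ALL=C
lines=$(printf '%s\n' 'the end' '# the comment' 'THE' '42 items' 'cdrom')

starts_the=$(printf '%s\n' "$lines" | grep -n '^the')            # ^ outside []: start of line
starts_lower=$(printf '%s\n' "$lines" | grep -n '^[a-z]')        # first character is lowercase
starts_other=$(printf '%s\n' "$lines" | grep -n '^[^a-zA-Z]')    # ^ inside []: negation

printf '%s\n' "$starts_the" '--' "$starts_lower" '--' "$starts_other"
```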

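The "^" symbol acts differently inside and outside the bracket set "[]": inside "[]" it negates the set, while outside it anchors the match to the start of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, the following command finds the lines that end with a period (.). Because the period is itself a regular expression metacharacter (covered below), it must be escaped with the backslash "\" to turn it into an ordinary character. In the output below, a trailing period has been appended to several lines of chen.txt.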

[root@localhost ~]# grep -n '\.$' chen.txt
5:cdrom.
6:thethethe.
9:# Use graphical install.
10:graphical.
11:# Run the Setup Agent on first boot.

To find blank lines, run "grep -n '^$' chen.txt".
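A quick way to see why the backslash matters is to run the escaped and unescaped patterns on the same input. The helper name "sample" and its five lines are assumptions made for this sketch.

```shell
# Hypothetical helper that prints five sample lines; the third one is empty.
sample() { printf '%s\n' 'cdrom.' 'graphical' '' 'a.b' 'end.'; }

dot_end=$(sample | grep -n '\.$')     # literal period at the end of the line
any_end=$(sample | grep -c '.$')      # unescaped: any character at the end of the line
blank=$(sample | grep -n '^$')        # start immediately followed by end: an empty line

printf '%s\n' "$dot_end" "non-empty lines: $any_end" "blank: $blank"
```

Without the escape, ".$" counts every non-empty line, which is why the pattern in the text uses "\.".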

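Matching any single character "." and repetition "*"
In regular expressions the period (.) is also a metacharacter and stands for any single character. For example, the following command finds strings of the form "w??d": four characters in total, starting with "w" and ending with "d".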

[root@localhost ~]# grep -n 'w..d' chen.txt
14:wood

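In the result above, the string "wood" matches the rule "w..d". To search for oo, ooo, ooooo and so on, use the asterisk (*) metacharacter. Note that "*" repeats the preceding single character zero or more times. "o*" therefore means zero "o" (the empty string) or one or more "o"; because the empty string is allowed, "grep -n 'o*' chen.txt" prints every line of the file. With "oo*", the first "o" must be present and the second may occur zero or more times, so anything containing o, oo, ooo and so on matches. Likewise, to find strings containing at least two "o", run "grep -n 'ooo*' chen.txt".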

[root@localhost ~]# grep -n 'ooo*' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd
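The claim that "o*" matches every line is easy to verify by counting matching lines. This is a minimal sketch on an assumed four-line input.

```shell
# Assumed input: zero, one, and two "o" characters, plus a line with none at all.
lines=$(printf '%s\n' 'wd' 'wod' 'wood' 'xyz')

zero_or_more=$(printf '%s\n' "$lines" | grep -c 'o*')    # empty match allowed: every line counts
one_or_more=$(printf '%s\n' "$lines" | grep -c 'oo*')    # at least one "o"
two_or_more=$(printf '%s\n' "$lines" | grep 'ooo*')      # at least two consecutive "o"

printf '%s\n' "$zero_or_more" "$one_or_more" "$two_or_more"
```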

To find strings that start with "w", end with "d", and have one or more "o" and nothing else in between, run the following command.

[root@localhost ~]# grep -n 'woo*d' chen.txt
14:wood
16:wod
17:woooooooood

To find strings that start with "w" and end with "d", with any characters, or none, in between:

[root@localhost ~]# grep -n 'w.*d' chen.txt
14:wood
15:wd
16:wod
17:woooooooood

To find lines containing any digits:

[root@localhost ~]# grep -n '[0-9][0-9]*' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2
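The difference between "woo*d" and "w.*d", and the fact that "[0-9][0-9]*" selects the same lines as "[0-9]", can be checked with a short sketch. The sample lines are assumptions, and LC_ALL=C is set as an assumption for the digit range.

```shell
# Assumption: C locale; the sample lines are made up for illustration.
export LC_ALL=C
lines=$(printf '%s\n' 'wd' 'wod' 'woooood' 'w-x-d' 'a1')

only_o=$(printf '%s\n' "$lines" | grep 'woo*d')      # only "o" (one or more) between w and d
anything=$(printf '%s\n' "$lines" | grep 'w.*d')     # any characters, or none, between w and d

# Both digit patterns select a line as soon as it holds one digit.
digit_a=$(printf '%s\n' "$lines" | grep -c '[0-9]')
digit_b=$(printf '%s\n' "$lines" | grep -c '[0-9][0-9]*')

printf '%s\n' "$only_o" '--' "$anything" '--' "$digit_a $digit_b"
```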

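Matching a repetition range with "{}"
"." and "*" express anything from zero to unlimited repetitions. To restrict the repetition to a range, for example three to five consecutive "o", use the interval expression "{}" of basic regular expressions. In a basic regular expression an unescaped "{" or "}" is an ordinary character, so the interval has to be written with the backslash escape, as "\{" and "\}".

To find lines containing two or more consecutive "o" (the pattern matches exactly two, and any longer run contains such a pair):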

[root@localhost ~]# grep -n 'o\{2\}' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

To find strings that start with "w", end with "d", and have 2 to 5 "o" in between:

[root@localhost ~]# grep -n 'wo\{2,5\}d' chen.txt
14:wood

To find strings that start with "w", end with "d", and have two or more "o" in between:

[root@localhost ~]# grep -n 'wo\{2,\}d' chen.txt
14:wood
17:woooooooood
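The effect of the surrounding "w" and "d" on the interval is easiest to see on a small input. The four words below are assumptions chosen to sit below, inside, and above the 2 to 5 range.

```shell
# Assumed words with 1, 2, 5, and more than 5 consecutive "o".
lines=$(printf '%s\n' 'wod' 'wood' 'woooood' 'woooooooood')

exactly_two=$(printf '%s\n' "$lines" | grep 'wo\{2\}d')     # w and d pin the count to exactly 2
two_to_five=$(printf '%s\n' "$lines" | grep 'wo\{2,5\}d')   # between 2 and 5
two_or_more=$(printf '%s\n' "$lines" | grep 'wo\{2,\}d')    # open-ended upper bound
loose=$(printf '%s\n' "$lines" | grep -c 'o\{2\}')          # unanchored: any run of 2 or more counts

printf '%s\n' "$exactly_two" '--' "$two_to_five" '--' "$two_or_more" '--' "$loose"
```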

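II. Extended regular expressions

To simplify commands, use the broader extended regular expressions. For example, with basic regular expressions, showing every line except blank lines and lines starting with "#" (commonly used to view the effective part of a configuration file) takes "grep -v '^$' chen.txt | grep -v '^#'", which needs a pipe and two searches. With extended regular expressions this shortens to "egrep -v '^$|^#' chen.txt", where the pipe symbol inside the single quotes means "or".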
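Note that grep interprets patterns as basic regular expressions by default. Extended regular expressions require egrep (equivalent to "grep -E") or awk. awk is covered in a later section; here we use egrep directly. Its usage is essentially the same as grep's: egrep searches one or more files for a pattern, and the pattern may be a single character, a string, a word, or a sentence.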
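The two ways of stripping blank lines and comments can be compared directly. This sketch uses "grep -E", which is equivalent to egrep, and a temporary file whose name and contents are assumptions.

```shell
# Hypothetical config file: two comment lines, one blank line, two effective lines.
conf=$(mktemp)
printf '%s\n' '# comment' '' 'cdrom' '#version=DEVEL' 'graphical' > "$conf"

# Basic regular expressions: two searches joined by a shell pipe.
two_pass=$(grep -v '^$' "$conf" | grep -v '^#')

# Extended regular expressions: one search, with "|" inside the pattern meaning "or".
one_pass=$(grep -Ev '^$|^#' "$conf")

printf '%s\n' "$two_pass" '--' "$one_pass"
rm -f "$conf"
```

Both commands should print the same two effective lines; the second simply avoids the extra process.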
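Common extended regular expression metacharacters include the following:

"+" example (one or more of the preceding character): "egrep -n 'wo+d' chen.txt" finds strings such as "wod", "wood", and "woooooood".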

[root@localhost ~]# egrep -n 'wo+d' chen.txt
14:wood
16:wod
17:woooooooood

"?" example (zero or one of the preceding character): "egrep -n 'bes?t' chen.txt" finds the two strings "bet" and "best".

[root@localhost ~]# egrep -n 'bes?t' chen.txt
11:best
12:bet

"|" example (alternation, "or"): "egrep -n 'of|is|on' chen.txt" finds lines containing "of", "is", or "on".

[root@localhost ~]# egrep -n 'of|is|on' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
13:# Run the Setup Agent on first boot.
15:ignoredisk --only-use=sda
20:dfsjdjoooooof
21:Foofddd

"()" example (grouping): "egrep -n 't(a|e)st' chen.txt". The words "tast" and "test" share "t" and "st", so "a" and "e" are placed inside "()" and separated by "|" to find either "tast" or "test".

[root@localhost ~]# egrep -n 't(a|e)st' chen.txt
12:test
13:tast

"()+" example (one or more repetitions of a group): "egrep -n 'A(xyz)+C' chen.txt" finds strings that start with "A", end with "C", and contain one or more "xyz" in between.

[root@localhost ~]# egrep -n 'A(xyz)+C' chen.txt
14:AxyzxyzxyzC
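The extended metacharacters above can be tried together on one word list. This is a sketch under assumptions: the words are invented to include near misses, and "grep -E" stands in for egrep.

```shell
# Assumed word list with near misses for each pattern.
words=$(printf '%s\n' 'wd' 'wod' 'wood' 'bet' 'best' 'besst' \
                      'test' 'tast' 'tost' 'AC' 'AxyzC' 'AxyzxyzC')

plus=$(printf '%s\n' "$words" | grep -E 'wo+d')        # + : one or more "o", so "wd" fails
opt=$(printf '%s\n' "$words" | grep -E 'bes?t')        # ? : zero or one "s", so "besst" fails
alt=$(printf '%s\n' "$words" | grep -E 't(a|e)st')     # (a|e): either letter, so "tost" fails
group=$(printf '%s\n' "$words" | grep -E 'A(xyz)+C')   # (xyz)+: at least one group, so "AC" fails

printf '%s\n' "$plus" '--' "$opt" '--' "$alt" '--' "$group"
```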

