这里是收集自国外正则表达式库中关于资源定位的第二部分,各位慢慢研究享用。可以在线进行正则表达式测试:http://www.zhongguosou.com/computer_question_tools/test_regex.aspx
域名
表达式:
^[a-zA-Z0-9]+([a-zA-Z0-9\-\.]+)?\.(com|org|net|mil|edu|COM|ORG|NET|MIL|EDU)$
匹配:
my.domain.com | regexlib.com | big-reg.com
不匹配:
.mydomain.com | regexlib.comm | -bigreg.com
网络协议地址
表达式:^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$
匹配:
http://www.blah.com/~joe | ftp://ftp.blah.co.uk:2828/blah%20blah.gif | https://blah.gov/blah-blah.as
不匹配:
www.blah.com | http://www.blah"blah.com/I have spaces! | ftp://blah_underscore/[nope]
网络地址:
(?:(?<protocol>http(?:s?)|ftp)(?:\:\/\/)) (?:(?<usrpwd>\w+\:\w+)(?:\@))? (?<domain>[^/\r\n\:]+)? (?<port>\:\d+)? (?<path>(?:\/.*)*\/)? (?<filename>.*?\.(?<ext>\w{2,4}))? (?<qrystr>\??(?:\w+\=[^\#]+)(?:\&?\w+\=\w+)*)* (?<bkmrk>\#.*)?
匹配:
https://192.168.0.2:80/users/~fname.lname/file.ext | ftp://user1:pwd@www.domain.com | http://www.dom
不匹配:
www.domain.com | user1:pwd@domain.com | 192.168.0.2/folder/file.ext
IP地址:
表达式:
(^\d{3}\x2E\d{3}\x2E\d{3}\x2D\d{2}$)
匹配:
123.123.123-12
不匹配:
123.123.103.32 | 123 123 123 12 | sa3.332.322-12
URL匹配(有无协议均可)
([\d\w-.]+?\.(a[cdefgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmnoz]|e[ceghrst]|f[ijkmnor]|g[abdefghilmnpqrstuwy]|h[kmnrtu]|i[delmnoqrst]|j[emop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrstwy]|qa|r[eouw]|s[abcdeghijklmnortuvyz]|t[cdfghjkmnoprtvwz]|u[augkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|aero|arpa|biz|com|coop|edu|info|int|gov|mil|museum|name|net|org|pro)(\b|\W(?<!&|=)(?!\.\s|\.{3}).*?))(\s|$)
匹配:
http://www.website.com/index.html | www.website.com | website.com
域名匹配:
^[a-zA-Z0-9]+([a-zA-Z0-9\-\.]+)?\.(aero|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly| ma|mc|md|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk| pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr| st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|um|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zr|zw|AERO|BIZ|COM|COOP|EDU|GOV|INFO|INT|MIL|MUSEUM|NAME|NET|ORG|AC|
匹配:
mydomain.com | my-domain.info | mydomain.aero
不匹配:
-mydomain.com | mydomain.aaa | .mydomain.com
文件路径(支持空格文件名)
^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w ]*.*))+\.(txt|TXT)$
匹配:
C:\Documents and Settings\roman.lukyanenko\Desktop\stuff\b_card2.txt
不匹配:
C:\file.doc
网址+端口:
^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$
匹配:
www.blah.com:8103 | www.blah.com/blah.asp?sort=ASC | www.blah.com/blah.htm#blah
不匹配:
www.state.ga | http://www.blah.ru
特别匹配:
表达式:
^DOMAIN\\\w+$
匹配:
DOMAIN\ssmith | DOMAIN\a | DOMAIN\username
不匹配:
ssmith | username | DOMAIN\
IPV4地址和端口
表达式:
^((?:2[0-5]{2}|1\d{2}|[1-9]\d|[1-9])\.(?:(?:2[0-5]{2}|1\d{2}|[1-9]\d|\d)\.){2}(?:2[0-5]{2}|1\d{2}|[1-9]\d|\d)):(\d|[1-9]\d|[1-9]\d{2,3}|[1-5]\d{4}|6[0-4]\d{3}|654\d{2}|655[0-2]\d|6553[0-5])$
匹配:
127.0.0.1:80 | 255.255.255.0:21 |
不匹配:
URL检查
表达式:
^(ht|f)tp(s?)\:\/\/[a-zA-Z0-9\-\._]+(\.[a-zA-Z0-9\-\._]+){2,}(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$
匹配:
http://www.thedaddy.org | http://forum.thedaddy.org/index.html | ftp://hows.it.going_buddy/checkit/o
不匹配:
www.thedaddy.org | http://hello | ftp://check.it
jpg/JPG文件路径匹配
表达式:
^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w ]*.*))+\.(jpg|JPG)$
匹配:
C:\Documents and Settings\roman.lukyanenko\Desktop\stuff\b_card2.jpg | C:\b_card.jpg | \\network\fol
不匹配:
C:\file.xls
URL链接串检查
表达式:
^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)?((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.[a-zA-Z]{2,4})(\:[0-9]+)?(/[^/][a-zA-Z0-9\.\,\?\'\\/\+&%\$#\=~_\-@]*)*$
匹配:
http://www.sysrage.net | https://64.81.85.161/site/file.php?cow=moo's | ftp://user:pass@host.com:123
不匹配:
sysrage.net
文件路径检查
表达式:
^([a-zA-Z]\:)(\\[^\\/:*?<>"|]*(?<![ ]))*(\.[a-zA-Z]{2,6})$
匹配:
C:\di___r\fi_sysle.txt | c:\dir\filename.txt
不匹配:
c:\dir\file?name.txt
邮箱地址等检查
表达式:
(mailto\:|(news|(ht|f)tp(s?))\://)(([^[:space:]]+)|([^[:space:]]+)( #([^#]+)#)?)
匹配:
http://www.domain.com | http://www.domain.com/index%20page.htm #linktext# | mailto://user@domai
不匹配:
<a href="http://www.domain.com">real html link</a> | http://www.without_space_
URL串格式检查
表达式:
^(?=[^&])(?:(?<scheme>[^:/?#]+):)?(?://(?<authority>[^/?#]*))?(?<path>[^?#]*)(?:\?(?<query>[^#]*))?(?:#(?<fragment>.*))?
匹配:
http://regexlib.com/REDetails.aspx?regexp_id=x#Details
不匹配:
&
文档路径
表达式:
^([a-zA-Z]\:|\\)\\([^\\]+\\)*[^\/:*?"<>|]+\.htm(l)?$
匹配:
x:\test\testing.htm | x:\test\test#$ ing.html | \\test\testing.html
不匹配:
x:\test\test/ing.htm | x:\test\test*.htm | \\test?<.htm