C# regex: excluding a string
I have a collection of strings, and I want a regular expression that collects all the ones that start with http..
href="http://www.test.com/cat/1-one_piece_episodes/" href="http://www.test.com/cat/2-movies_english_subbed/" href="http://www.test.com/cat/3-english_dubbed/" href="http://www.exclude.com"
This is my regex pattern..
href="(.*?)[^#]"
and it returns this:
href="http://www.test.com/cat/1-one_piece_episodes/" href="http://www.test.com/cat/2-movies_english_subbed/" href="http://www.xxxx.com/cat/3-english_dubbed/" href="http://www.exclude.com"
What pattern would exclude the last match.. or exclude any match that contains the excluded domain, such as href="http://www.exclude.com"?
Edit: for multiple exclusions
href="((?:(?!"|\bexclude\b|\bxxxx\b).)*)[^#]"
@ridgerunner and I would change the regex to:
href="((?:(?!\bexclude\b)[^"])*)[^#]"
It matches all href attributes as long as they don't end in # and don't contain the word exclude.
Explanation:
href="        # Match href="
(             # Capture...
 (?:          # the following group:
  (?!         # Look ahead to check that the next part of the string isn't...
   \b         # the entire word
   exclude    # exclude
   \b         # (\b are word boundary anchors)
  )           # End of lookahead
  [^"]        # If successful, match any character except for a quote
 )*           # Repeat as often as possible
)             # End of capturing group 1
[^#]"         # Match a non-# character and the closing quote.
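To see the pattern in action from C#, here is a minimal sketch. The class name `HrefFilter` and the method `FindAllowed` are made up for illustration, and the input is the sample string from the question with its quotes escaped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefFilter
{
    // The pattern from the answer. Quotes are doubled because this is a verbatim string.
    private const string Pattern = @"href=""((?:(?!\bexclude\b)[^""])*)[^#]""";

    // Hypothetical helper: returns every complete href="..." attribute the pattern accepts.
    public static List<string> FindAllowed(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Value); // m.Value is the whole attribute, not just group 1
        }
        return results;
    }

    public static void Main()
    {
        string input =
            "href=\"http://www.test.com/cat/1-one_piece_episodes/\" " +
            "href=\"http://www.test.com/cat/2-movies_english_subbed/\" " +
            "href=\"http://www.test.com/cat/3-english_dubbed/\" " +
            "href=\"http://www.exclude.com\"";

        // Prints the three test.com attributes; the exclude.com one is skipped.
        foreach (string href in FindAllowed(input))
        {
            Console.WriteLine(href);
        }
    }
}
```

One thing to watch: capturing group 1 stops one character short of the closing quote, because the final `[^#]` consumes the last character of the URL outside the group. Use `m.Value` (or move the `[^#]` inside the group) if you need the full URL.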
To allow multiple "forbidden words":
href="((?:(?!\b(?:exclude|this|too)\b)[^"])*)[^#]"
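If the forbidden words live in a list, the alternation can be assembled at run time. This is only a sketch under that assumption: `ForbiddenWords`, `BuildPattern` and `FindAllowed` are hypothetical names, and `Regex.Escape` is applied in case a word contains regex metacharacters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ForbiddenWords
{
    // Hypothetical helper: builds href="((?:(?!\b(?:w1|w2|...)\b)[^"])*)[^#]" from a word list.
    public static string BuildPattern(IEnumerable<string> words)
    {
        string alternation = string.Join("|", words.Select(Regex.Escape));
        return @"href=""((?:(?!\b(?:" + alternation + @")\b)[^""])*)[^#]""";
    }

    // Hypothetical helper: returns the href attributes that contain none of the words.
    public static List<string> FindAllowed(string input, IEnumerable<string> words)
    {
        return Regex.Matches(input, BuildPattern(words))
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        string input =
            "href=\"http://www.test.com/cat/1/\" " +
            "href=\"http://www.this.com/\" " +
            "href=\"http://www.test.com/toolbox/\" " +
            "href=\"http://www.exclude.com\"";

        // "toolbox" survives: \btoo\b only rejects "too" as a whole word.
        foreach (string href in FindAllowed(input, new[] { "exclude", "this", "too" }))
        {
            Console.WriteLine(href);
        }
    }
}
```

Because of the `\b` anchors, only whole words are rejected, so a URL containing `toolbox` is still returned even though `too` is on the list.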
Your input doesn't look like a valid string (unless you escape the quotes inside it), but you can also do this without a regex:
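// requires: using System; using System.Collections.Generic;
string input = "href=\"http://www.test.com/cat/1-one_piece_episodes/\"href=\"http://www.test.com/cat/2-movies_english_subbed/\"href=\"http://www.test.com/cat/3-english_dubbed/\"href=\"http://www.exclude.com\"";
List<string> matches = new List<string>();
// Split on "href" and drop the empty piece before the first attribute.
foreach (var match in input.Split(new string[] { "href" }, StringSplitOptions.RemoveEmptyEntries))
{
    // Keep every piece that does not mention the excluded domain,
    // putting back the "href" prefix that Split removed.
    if (!match.Contains("exclude.com"))
        matches.Add("href" + match);
}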
Would this work?
href="(?!http://[^/"]+exclude.com)(.*?)[^#]"
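One way to check is to run it against the sample input from the question. The sketch below does just that; `DomainLookahead` and `FindAllowed` are made-up names, and the pattern is copied unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DomainLookahead
{
    // The pattern proposed above, as a verbatim string (quotes doubled).
    private const string Pattern = @"href=""(?!http://[^/""]+exclude.com)(.*?)[^#]""";

    // Hypothetical helper: returns every complete match of the pattern.
    public static List<string> FindAllowed(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string input =
            "href=\"http://www.test.com/cat/1-one_piece_episodes/\" " +
            "href=\"http://www.test.com/cat/2-movies_english_subbed/\" " +
            "href=\"http://www.test.com/cat/3-english_dubbed/\" " +
            "href=\"http://www.exclude.com\"";

        // On this input the three test.com attributes are printed
        // and the exclude.com one is rejected by the lookahead.
        foreach (string href in FindAllowed(input))
        {
            Console.WriteLine(href);
        }
    }
}
```

On the sample input it does exclude the last href. Note that the dot in `exclude.com` is unescaped, so it matches any character; writing `exclude\.com` would be stricter.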