如何加速循环? 是否有一个课程来替换多个术语?
循环:
var pattern = _dict[key]; string before; do { before = pattern; foreach (var pair in _dict) if (key != pair.Key) pattern = pattern.Replace(string.Concat("{", pair.Key, "}"), string.Concat("(", pair.Value, ")")); } while (pattern != before); return pattern;
它只是在一堆键上重复查找和替换。 字典只是 。
我可以看到2个改进。
- 每次我们执行
pattern.Replace
它再次从字符串的开头搜索。 如果它碰到第一个{
,它只会查看匹配的键列表(可能使用二进制搜索),然后替换适当的一个,那会更好。 -
pattern != before
bitpattern != before
是我如何检查在迭代期间是否有任何替换。 如果pattern.Replace
函数返回了实际发生的实际替换次数,我不需要这个。
但是……我真的不想写一个很讨厌的东西来做这一切。 这必须是一个相当普遍的情况? 有没有现成的解决方案?
全class
感谢Elian Ebbing和ChrisWue 。
class FlexDict : IEnumerable<KeyValuePair> { private Dictionary _dict = new Dictionary(); private static readonly Regex _re = new Regex(@"{([_a-z][_a-z0-9-]*)}", RegexOptions.Compiled | RegexOptions.IgnoreCase); public void Add(string key, string pattern) { _dict[key] = pattern; } public string Expand(string pattern) { pattern = _re.Replace(pattern, match => { string key = match.Groups[1].Value; if (_dict.ContainsKey(key)) return "(" + Expand(_dict[key]) + ")"; return match.Value; }); return pattern; } public string this[string key] { get { return Expand(_dict[key]); } } public IEnumerator<KeyValuePair> GetEnumerator() { foreach (var p in _dict) yield return new KeyValuePair(p.Key, this[p.Key]); } IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); } }
示例用法
class Program { static void Main(string[] args) { var flex = new FlexDict { {"h", @"[0-9a-f]"}, {"nonascii", @"[\200-\377]"}, {"unicode", @"\\{h}{1,6}(\r\n|[ \t\r\n\f])?"}, {"escape", @"{unicode}|\\[^\r\n\f0-9a-f]"}, {"nmstart", @"[_a-z]|{nonascii}|{escape}"}, {"nmchar", @"[_a-z0-9-]|{nonascii}|{escape}"}, {"string1", @"""([^\n\r\f\\""]|\\{nl}|{escape})*"""}, {"string2", @"'([^\n\r\f\\']|\\{nl}|{escape})*'"}, {"badstring1", @"""([^\n\r\f\\""]|\\{nl}|{escape})*\\?"}, {"badstring2", @"'([^\n\r\f\\']|\\{nl}|{escape})*\\?"}, {"badcomment1", @"/\*[^*]*\*+([^/*][^*]*\*+)*"}, {"badcomment2", @"/\*[^*]*(\*+[^/*][^*]*)*"}, {"baduri1", @"url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}"}, {"baduri2", @"url\({w}{string}{w}"}, {"baduri3", @"url\({w}{badstring}"}, {"comment", @"/\*[^*]*\*+([^/*][^*]*\*+)*/"}, {"ident", @"-?{nmstart}{nmchar}*"}, {"name", @"{nmchar}+"}, {"num", @"[0-9]+|[0-9]*\.[0-9]+"}, {"string", @"{string1}|{string2}"}, {"badstring", @"{badstring1}|{badstring2}"}, {"badcomment", @"{badcomment1}|{badcomment2}"}, {"baduri", @"{baduri1}|{baduri2}|{baduri3}"}, {"url", @"([!#$%&*-~]|{nonascii}|{escape})*"}, {"s", @"[ \t\r\n\f]+"}, {"w", @"{s}?"}, {"nl", @"\n|\r\n|\r|\f"}, {"A", @"a|\\0{0,4}(41|61)(\r\n|[ \t\r\n\f])?"}, {"C", @"c|\\0{0,4}(43|63)(\r\n|[ \t\r\n\f])?"}, {"D", @"d|\\0{0,4}(44|64)(\r\n|[ \t\r\n\f])?"}, {"E", @"e|\\0{0,4}(45|65)(\r\n|[ \t\r\n\f])?"}, {"G", @"g|\\0{0,4}(47|67)(\r\n|[ \t\r\n\f])?|\\g"}, {"H", @"h|\\0{0,4}(48|68)(\r\n|[ \t\r\n\f])?|\\h"}, {"I", @"i|\\0{0,4}(49|69)(\r\n|[ \t\r\n\f])?|\\i"}, {"K", @"k|\\0{0,4}(4b|6b)(\r\n|[ \t\r\n\f])?|\\k"}, {"L", @"l|\\0{0,4}(4c|6c)(\r\n|[ \t\r\n\f])?|\\l"}, {"M", @"m|\\0{0,4}(4d|6d)(\r\n|[ \t\r\n\f])?|\\m"}, {"N", @"n|\\0{0,4}(4e|6e)(\r\n|[ \t\r\n\f])?|\\n"}, {"O", @"o|\\0{0,4}(4f|6f)(\r\n|[ \t\r\n\f])?|\\o"}, {"P", @"p|\\0{0,4}(50|70)(\r\n|[ \t\r\n\f])?|\\p"}, {"R", @"r|\\0{0,4}(52|72)(\r\n|[ \t\r\n\f])?|\\r"}, {"S", @"s|\\0{0,4}(53|73)(\r\n|[ \t\r\n\f])?|\\s"}, {"T", @"t|\\0{0,4}(54|74)(\r\n|[ \t\r\n\f])?|\\t"}, {"U", @"u|\\0{0,4}(55|75)(\r\n|[ \t\r\n\f])?|\\u"}, {"X", @"x|\\0{0,4}(58|78)(\r\n|[ \t\r\n\f])?|\\x"}, {"Z", @"z|\\0{0,4}(5a|7a)(\r\n|[ \t\r\n\f])?|\\z"}, {"Z", @"z|\\0{0,4}(5a|7a)(\r\n|[ \t\r\n\f])?|\\z"}, {"CDO", @""}, {"INCLUDES", @"~="}, {"DASHMATCH", @"\|="}, {"STRING", @"{string}"}, {"BAD_STRING", @"{badstring}"}, {"IDENT", @"{ident}"}, {"HASH", @"#{name}"}, {"IMPORT_SYM", @"@{I}{M}{P}{O}{R}{T}"}, {"PAGE_SYM", @"@{P}{A}{G}{E}"}, {"MEDIA_SYM", @"@{M}{E}{D}{I}{A}"}, {"CHARSET_SYM", @"@charset\b"}, {"IMPORTANT_SYM", @"!({w}|{comment})*{I}{M}{P}{O}{R}{T}{A}{N}{T}"}, {"EMS", @"{num}{E}{M}"}, {"EXS", @"{num}{E}{X}"}, {"LENGTH", @"{num}({P}{X}|{C}{M}|{M}{M}|{I}{N}|{P}{T}|{P}{C})"}, {"ANGLE", @"{num}({D}{E}{G}|{R}{A}{D}|{G}{R}{A}{D})"}, {"TIME", @"{num}({M}{S}|{S})"}, {"PERCENTAGE", @"{num}%"}, {"NUMBER", @"{num}"}, {"URI", @"{U}{R}{L}\({w}{string}{w}\)|{U}{R}{L}\({w}{url}{w}\)"}, {"BAD_URI", @"{baduri}"}, {"FUNCTION", @"{ident}\("}, }; var testStrings = new[] { @"""str""", @"'str'", "5", "5.", "5.0", "a", "alpha", "url(hello)", "url(\"hello\")", "url(\"blah)", @"\g", @"/*comment*/", @"/**/", @"", @"~=", "|=", @"#hash", "@import", "@page", "@media", "@charset", "!/*iehack*/important"}; foreach (var pair in flex) { Console.WriteLine("{0}\n\t{1}\n", pair.Key, pair.Value); } var sw = Stopwatch.StartNew(); foreach (var str in testStrings) { Console.WriteLine("{0} matches: ", str); foreach (var pair in flex) { if (Regex.IsMatch(str, "^(" + pair.Value + ")$", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture)) Console.WriteLine(" {0}", pair.Key); } } Console.WriteLine("\nRan in {0} ms", sw.ElapsedMilliseconds); Console.ReadLine(); } }
目的
用于构建可能相互扩展的复杂正则表达式。 也就是说,我正在尝试实现css规范 。
我认为如果你使用正则表达式查找{foo}
任何出现会更快,然后如果foo
恰好是字典中的键,则使用替换{foo}
的MatchEvaluator 。
我目前没有视觉工作室,但我想这在function上与您的代码示例相同:
var pattern = _dict[key]; bool isChanged = false; do { isChanged = false; pattern = Regex.Replace(pattern, "{([^}]+)}", match => { string matchKey = match.Groups[1].Value; if (matchKey != key && _dict.ContainsKey(matchKey)) { isChanged = true; return "(" + _dict[matchKey] + ")"; } return match.Value; }); } while (isChanged);
我可以问你为什么需要do / while循环吗? 字典中键的值是否可以再次包含必须替换的{placeholders}
? 你能确定你不会陷入无限循环,其中键"A"
包含"Blahblah {B}"
而键"B"
包含"Blahblah {A}"
吗?
编辑:进一步的改进将是:
- 使用预编译的正则表达式。
- 使用递归而不是循环(参见ChrisWue的评论)。
- 使用
_dict.TryGetValue()
,就像在Guffa的代码中一样。
你最终会得到一个O(n)算法,其中n是输出的大小,所以你不能做得比这更好。
您应该能够使用正则表达式来查找匹配项。 然后,您还可以使用快速查找字典,而不仅仅将其用作列表。
var pattern = _dict[key]; bool replaced = false; do { pattern = Regex.Replace(pattern, @"\{([^\}]+)\}", m => { string k = m.Groups[1].Value; string value; if (k != key && _dict.TryGetValue(k, out value) { replaced = true; return "(" + value + ")"; } else { return "{" + k + "}"; } }); } while (replaced); return pattern;
您可以实现以下算法:
- 在源字符串中搜索
{
- 将所有内容复制到
{
StringBuilder - 找到匹配
}
(搜索从最后的喜好位置完成) - 将
{
和}
之间的值与字典中的键进行比较 -
- 如果它匹配复制到字符串构建器
(
+Value
+)
- 另外从源字符串复制
- 如果它匹配复制到字符串构建器
- 如果未到达源字符串结尾,请转到步骤1
你可以使用PLINQ吗?
有点像:
var keys = dict.KeyCollection.Where(k => k != key); bool replacementMade = keys.Any(); foreach(var k in keys.AsParallel(), () => {replacement code})