Split on commas only when the comma is not between two double quotes
I want to split a string like this on commas:
field1:"value1", field2:"value2", field3:"value3,value4"
into a string[] that looks like:
0 field1:"value1"
1 field2:"value2"
2 field3:"value3,value4"
I am trying to do this with Regex.Split, but I can't seem to work out the regular expression.
It is much easier to do this with Matches than with Split, for example:
string[] asYouWanted = Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();
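To make the Matches approach easy to try, here is a minimal, self-contained sketch of it. The class name FieldSplitter and the method name SplitFields are hypothetical, chosen only for illustration; the pattern is the one from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FieldSplitter
{
    // Returns each name:"value" pair as one string.
    // Commas inside the quoted value are kept, because the lazy .*? runs up to the next quote.
    public static string[] SplitFields(string input)
    {
        return Regex.Matches(input, @"[A-Za-z0-9]+:"".*?""")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";
        foreach (string part in SplitFields(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

With the input from the question this should yield three elements, the last one being field3:"value3,value4" with its comma intact.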
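That said, if there is any chance your values (or fields!) contain escaped quotes (or anything similarly tricky), you are probably better off with a proper CSV parser.
If you do have escaped quotes in your values, I think the following regex works; give it a test: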
@"field3:""value3\\"",value4""", @"[A-Za-z0-9]+:"".*?(?<=(?
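The added (?<=(? should make sure that the " it stops matching on is preceded by only an even number of backslashes, since an odd number of backslashes means the quote is escaped.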
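As a separate sketch for the escaped-quote case, the version below does not use the lookbehind from the answer. It swaps in a different, commonly used technique: inside the quotes it consumes either an ordinary character or a backslash followed by any character, so an escaped quote can never end the match. The class and method names are hypothetical, and the assumption is that a backslash is the escape character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedFieldSplitter
{
    // Regex (after C# verbatim-string unescaping): [A-Za-z0-9]+:"(?:[^"\\]|\\.)*"
    //   [^"\\]  any character that is neither a quote nor a backslash
    //   \\.     a backslash together with the character it escapes
    private const string Pattern = @"[A-Za-z0-9]+:""(?:[^""\\]|\\.)*""";

    public static string[] SplitFields(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // The value of field3 contains an escaped quote followed by a comma.
        string input = @"field3:""value3\"",value4"", field4:""x""";
        foreach (string part in SplitFields(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

Because each escape sequence is consumed as a unit, a value ending in an escaped backslash followed by the closing quote is also handled without counting backslashes.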
Untested, but this should be fine:
string[] parts = input.Split(new string[] { "\"," }, StringSplitOptions.None);
Remember to add the " back on the end where you need it.
string[] arr = str.Split(new string[] { "\"," }, StringSplitOptions.None).Select(s => s.Trim()).Select(s => s.EndsWith("\"") ? s : s + "\"").ToArray();
Split on ", as webnoob mentioned, then use Select to put the trailing " back on each piece, and convert the result to an array.
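The split-and-restore idea can be sketched as a small runnable program. The names QuoteCommaSplitter and SplitOnQuoteComma are hypothetical. The sketch assumes that every value is quoted and that the sequence ", never appears inside a value; if either assumption fails, the pieces will come out wrong.

```csharp
using System;
using System.Linq;

public static class QuoteCommaSplitter
{
    public static string[] SplitOnQuoteComma(string input)
    {
        return input.Split(new string[] { "\"," }, StringSplitOptions.None)
                    // Drop the space that followed each separator.
                    .Select(s => s.Trim())
                    // The separator swallowed the closing quote of every piece but the last; restore it.
                    .Select(s => s.EndsWith("\"", StringComparison.Ordinal) ? s : s + "\"")
                    .ToArray();
    }

    public static void Main()
    {
        string input = "field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";
        foreach (string part in SplitOnQuoteComma(input))
        {
            Console.WriteLine(part);
        }
    }
}
```

The EndsWith check keeps the last piece from getting a second closing quote. An empty value such as field1:"" would defeat that check, which is one more reason to prefer the Matches approach when the input is not tightly controlled.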
Try this:
// (\w.+?):"(\w.+?)"
//
// Match the regular expression below and capture its match into backreference number 1 «(\w.+?)»
//    Match a single character that is a "word character" (letters, digits, and underscores) «\w»
//    Match any single character that is not a line break character «.+?»
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
// Match the characters ":"" literally «:"»
// Match the regular expression below and capture its match into backreference number 2 «(\w.+?)»
//    Match a single character that is a "word character" (letters, digits, and underscores) «\w»
//    Match any single character that is not a line break character «.+?»
//       Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
// Match the character """ literally «"»
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
try
{
    Regex regObj = new Regex(@"(\w.+?):""(\w.+?)""");
    // Collect matches in a list, since the number of matches is not known up front.
    List<string> results = new List<string>();
    Match matchResults = regObj.Match(sourceString);
    while (matchResults.Success)
    {
        results.Add(matchResults.Value);
        matchResults = matchResults.NextMatch();
    }
    string[] arr = results.ToArray();
}
catch (ArgumentException ex)
{
    // Syntax error in the regular expression
}
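Since that pattern captures the field name and the value into groups 1 and 2, the two parts can also be read separately instead of taking the whole match. The following is only a sketch; FieldParser and ParseFields are hypothetical names, and storing the result in a dictionary is a choice made here, not something the answer prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // Maps each field name (group 1) to its unquoted value (group 2).
    public static Dictionary<string, string> ParseFields(string input)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(input, @"(\w.+?):""(\w.+?)"""))
        {
            // A repeated field name overwrites the earlier value.
            fields[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return fields;
    }

    public static void Main()
    {
        string input = "field1:\"value1\", field2:\"value2\", field3:\"value3,value4\"";
        foreach (KeyValuePair<string, string> pair in ParseFields(input))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Note that \w.+? needs at least two characters, so a one-character field name or value would not be matched by this pattern.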
The simplest built-in way is here. I tried it and it works fine. It splits "Hai,\"Hello,World\"" into {"Hai","Hello,World"}.
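The link in that answer did not survive, so which built-in facility it pointed to is unknown. One built-in class that behaves as described is TextFieldParser in Microsoft.VisualBasic.FileIO; treating it as the intended one is purely an assumption. The sketch below uses hypothetical names (QuotedCsvDemo, SplitLine) and needs a reference to the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsvDemo
{
    // Splits one comma-delimited line, treating quoted fields as single values.
    public static string[] SplitLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    public static void Main()
    {
        foreach (string field in SplitLine("Hai,\"Hello,World\""))
        {
            Console.WriteLine(field);
        }
    }
}
```

This handles fields that are quoted as a whole, as in the example from the answer. It is not a drop-in solution for the name:"value" format in the question, where the quote does not start the field.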