Split a string by commas, ignoring any punctuation (including ',') inside quotation marks

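How can I split a string (taken from a textbox) by commas, without splitting inside double-quoted substrings (and without removing the quotes), while also leaving other possible punctuation (e.g. '.', ';', '-') alone?

For example, suppose someone enters the following in the textbox:

    apple, orange, "baboons, cows", rainbow, "unicorns, gummy bears"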

How can I split the string above into the following strings (for example, into a List)?

    apple
    orange
    "baboons, cows"
    rainbow
    "Unicorns, gummy bears..."

Thanks for your help!

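You can try the regex below, which uses positive lookahead:

    using System;
    using System.Text.RegularExpressions;

    string value = @"apple, orange, ""baboons, cows"", rainbow, ""unicorns, gummy bears""";

    // Split on ", " only when it is followed by an opening double quote,
    // or by an unquoted item that runs up to the next comma or the end of the string.
    string[] lines = Regex.Split(value, @", (?=(?:""[^""]*?(?: [^""]*)*))|, (?=[^"",]+(?:,|$))");

    foreach (string line in lines)
    {
        Console.WriteLine(line);
    }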

Output:

    apple
    orange
    "baboons, cows"
    rainbow
    "unicorns, gummy bears"


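Try this:

    using System;
    using System.Text.RegularExpressions;

    // input is the string to split.
    // \s* skips the space after each comma, so a quoted item that follows ", "
    // is still recognised as quoted instead of being cut at its inner comma.
    Regex str = new Regex("(?:^|,)\\s*(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);

    foreach (Match m in str.Matches(input))
    {
        // Group 1 holds the item without the leading comma and whitespace.
        Console.WriteLine(m.Groups[1].Value);
    }

You could also take a look at FileHelpers.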

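As a CSV parser would, rather than using a regex you can walk through the string character by character, like this:

    using System.Collections.Generic;

    public List<string> ItemStringToList(string inputString)
    {
        var itemList = new List<string>();
        var currentItem = "";
        var quotesOpen = false;

        for (int i = 0; i < inputString.Length; i++)
        {
            // A double quote toggles quoted mode; the quote itself is not kept.
            if (inputString[i] == '"')
            {
                quotesOpen = !quotesOpen;
                continue;
            }

            // A comma outside quotes ends the current item.
            if (inputString[i] == ',' && !quotesOpen)
            {
                itemList.Add(currentItem);
                currentItem = "";
                continue;
            }

            // Skip leading spaces of an item.
            if (currentItem == "" && inputString[i] == ' ') continue;

            currentItem += inputString[i];
        }

        if (currentItem != "") itemList.Add(currentItem);
        return itemList;
    }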

Example test usage:

    var test1 = ItemStringToList("one, two, three");
    var test2 = ItemStringToList("one, \"two\", three");
    var test3 = ItemStringToList("one, \"two, three\"");
    var test4 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
    var test5 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
    var test6 = ItemStringToList("one, \"two, three\", four, \"five six, seven\"");
    var test7 = ItemStringToList("\"one, two, three\", four, \"five six, seven\"");

If you want faster character concatenation, you can change it to use a StringBuilder.
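The StringBuilder suggestion could look roughly like the sketch below. The class name `QuotedSplitter`, the method `Split`, and the `keepQuotes` parameter are hypothetical names introduced here for illustration; they are not part of the answer above. Unlike the loop above, this version trims whitespace on both sides of each item and, by default, keeps the surrounding quotes, as the question asked.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitter
{
    // Hypothetical helper: splits on commas that are outside double quotes.
    // keepQuotes is an illustrative option; when true, the quote characters
    // stay in the returned items.
    public static List<string> Split(string input, bool keepQuotes = true)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        bool quotesOpen = false;

        foreach (char c in input)
        {
            if (c == '"')
            {
                // Toggle quoted mode, optionally keeping the quote itself.
                quotesOpen = !quotesOpen;
                if (keepQuotes) current.Append(c);
                continue;
            }

            if (c == ',' && !quotesOpen)
            {
                // A comma outside quotes closes the current item.
                items.Add(current.ToString().Trim());
                current.Clear();
                continue;
            }

            current.Append(c);
        }

        // Add the last item unless it is empty.
        string last = current.ToString().Trim();
        if (last.Length > 0) items.Add(last);

        return items;
    }
}
```

Calling `QuotedSplitter.Split(text)` on the example from the question should give five items with the quoted ones left intact; passing `keepQuotes: false` drops the quotes the way the loop above does.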

Try this; it shows several ways to split the strings in an array. If you want to split on spaces, just put a space inside the (' ').

    using System;
    using System.Collections.Generic;
    using System.Linq;

    namespace LINQExperiment1
    {
        class Program
        {
            static void Main(string[] args)
            {
                string[] sentence = new string[] { "apple", "orange", "baboons cows", " rainbow", "unicorns gummy bears" };

                Console.WriteLine("option 1:");
                Console.WriteLine("-------------");

                // option 1: Select returns one string[] per element of sentence.
                IEnumerable<string[]> words1 = sentence.Select(w => w.Split(' '));

                // to get each word, we have to use two foreach loops
                foreach (string[] segment in words1)
                    foreach (string word in segment)
                        Console.WriteLine(word);

                Console.WriteLine();
                Console.WriteLine("option 2:");
                Console.WriteLine("-------------");

                // option 2: SelectMany flattens the per-element arrays
                // into a single sequence of strings.
                IEnumerable<string> words2 = sentence.SelectMany(segment => segment.Split(','));

                // with SelectMany we have every string individually
                foreach (var word in words2)
                    Console.WriteLine(word);

                // option 3: the same SelectMany flattening written in
                // query expression syntax (multiple froms), splitting on ' '
                IEnumerable<string> words3 = from segment in sentence
                                             from word in segment.Split(' ')
                                             select word;
            }
        }
    }

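This turned out to be trickier than I expected; I think it is a good practical question.

Below is the solution I came up with. One thing I don't like about it is having to add the double quotes back; another is the variable names :p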

    using System;
    using System.Collections.Generic;

    internal class Program
    {
        private static void Main(string[] args)
        {
            string searchString = @"apple, orange, ""baboons, cows. dogs- hounds"", rainbow, ""unicorns, gummy bears"", abc, defghj";
            char delimeter = ',';
            char excludeSplittingWithin = '"';

            // Splitting on the quote character first separates quoted from unquoted segments.
            string[] splittedByExcludeSplittingWithin = searchString.Split(excludeSplittingWithin);
            List<string> splittedSearchString = new List<string>();

            for (int i = 0; i < splittedByExcludeSplittingWithin.Length; i++)
            {
                if (i == 0 || splittedByExcludeSplittingWithin[i].StartsWith(delimeter.ToString()))
                {
                    // Unquoted segment: split it on the delimiter.
                    string[] splitttedByDelimeter = splittedByExcludeSplittingWithin[i].Split(delimeter);
                    for (int j = 0; j < splitttedByDelimeter.Length; j++)
                    {
                        splittedSearchString.Add(splitttedByDelimeter[j].Trim());
                    }
                }
                else
                {
                    // Quoted segment: keep it whole and put the quotes back.
                    splittedSearchString.Add(excludeSplittingWithin + splittedByExcludeSplittingWithin[i] + excludeSplittingWithin);
                }
            }

            foreach (string s in splittedSearchString)
            {
                if (s.Trim() != string.Empty)
                {
                    Console.WriteLine(s);
                }
            }

            Console.ReadKey();
        }
    }

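Another Regex solution:

    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    private static IEnumerable<Match> Parse(string input)
    {
        // if used frequently, should be instantiated with Compiled option
        Regex regex = new Regex(@"(?<=^|,\s)(\""(?:[^\""]|\""\"")*\""|[^,\s]*)");

        // Cast<Match>() lets LINQ run over the MatchCollection.
        return regex.Matches(input).Cast<Match>().Where(m => m.Success);
    }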
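To show how the matches from this pattern can be turned into plain strings, here is a minimal self-contained sketch. The class name `RegexItemParser` and the method `ParseToStrings` are hypothetical names used only for illustration; the pattern is the one from the answer above. Note that `[^,\s]*` stops at whitespace, so an unquoted item that contains a space would be cut short; the sketch does not try to fix that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexItemParser
{
    // Same pattern as in the answer above, compiled once and reused.
    private static readonly Regex ItemRegex = new Regex(
        @"(?<=^|,\s)(\""(?:[^\""]|\""\"")*\""|[^,\s]*)",
        RegexOptions.Compiled);

    // Hypothetical helper: returns the matched items as strings,
    // dropping empty matches.
    public static List<string> ParseToStrings(string input)
    {
        return ItemRegex.Matches(input)
            .Cast<Match>()               // MatchCollection to IEnumerable<Match>
            .Select(m => m.Value)
            .Where(v => v.Length > 0)
            .ToList();
    }
}
```

For the example string from the question, `RegexItemParser.ParseToStrings(text)` should return the five items with the quoted ones kept whole, quotes included.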