How do I count the occurrences of each word in a string?

I use the code below to extract words from a string input. How can I get the number of occurrences of each word?

    var words = Regex.Split(input, @"\W+")
        .AsEnumerable()
        .GroupBy(w => w)
        .Where(g => g.Count() > 10)
        .Select(g => g.Key);

You can use string.Split instead of Regex.Split and get the count of each word like this:

    string str = "Some string with Some string repeated";
    var result = str.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries)
        .GroupBy(r => r)
        .Select(grp => new { Word = grp.Key, Count = grp.Count() });

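If you only want the words that are repeated at least 10 times, add a condition before the Select, such as Where(grp => grp.Count() >= 10). Note that Count() is a method call on the grouping, so the parentheses are required.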

To print the result:

    foreach (var item in result)
    {
        Console.WriteLine("Word: {0}, Count:{1}", item.Word, item.Count);
    }

Output:

    Word: Some, Count:2
    Word: string, Count:2
    Word: with, Count:1
    Word: repeated, Count:1

For case-insensitive grouping, replace the current GroupBy with:

    .GroupBy(r => r, StringComparer.InvariantCultureIgnoreCase)

So your query becomes:

    var result = str.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries)
        .GroupBy(r => r, StringComparer.InvariantCultureIgnoreCase)
        .Where(grp => grp.Count() >= 10)
        .Select(grp => new { Word = grp.Key, Count = grp.Count() });
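For anyone who wants to run the query above end to end, here is a minimal sketch that wraps it in a small program. The class name `WordCountSketch`, the helper `CountWords`, and its `minCount` parameter are made up for illustration and are not part of the answer. It also returns a dictionary rather than anonymous objects, which is just a convenience so the result can leave the method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCountSketch
{
    // Hypothetical helper: splits on single spaces, groups words ignoring case,
    // and keeps only the words that occur at least minCount times.
    public static Dictionary<string, int> CountWords(string text, int minCount)
    {
        return text
            .Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(w => w, StringComparer.InvariantCultureIgnoreCase)
            .Select(grp => new { Word = grp.Key, Count = grp.Count() })
            .Where(x => x.Count >= minCount)
            // The same comparer is passed here so lookups are case-insensitive too.
            .ToDictionary(x => x.Word, x => x.Count,
                          StringComparer.InvariantCultureIgnoreCase);
    }

    public static void Main()
    {
        var counts = CountWords("Some string with some STRING repeated", 2);
        foreach (var pair in counts)
        {
            Console.WriteLine("Word: {0}, Count:{1}", pair.Key, pair.Value);
        }
    }
}
```

With a threshold of 2, only the two repeated words survive the filter; "with" and "repeated" are dropped. Here the projection runs before the filter so Count() is evaluated once per group, but filtering first, as in the query above, gives the same result.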

Try this:

    var words = Regex.Split(input, @"\W+")
        .AsEnumerable()
        .GroupBy(w => w)
        .Select(g => new { key = g.Key, count = g.Count() });

Remove the Select statement to keep the IGrouping. You can then read both the key and the count from each group.

    var words = Regex.Split(input, @"\W+")
        .AsEnumerable()
        .GroupBy(w => w)
        .Where(g => g.Count() > 10);

    foreach (var wordGrouping in words)
    {
        var word = wordGrouping.Key;
        var count = wordGrouping.Count();
    }
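One thing to watch with the Regex.Split approach: when the input starts or ends with a non-word character (a leading space, a trailing period), the result contains empty strings, and those get grouped and counted like any other word. A minimal sketch of one way to drop them follows. The class `RegexSplitSketch` and the helper `ExtractWords` are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // Hypothetical helper: splits on runs of non-word characters and drops
    // the empty entries produced when the input begins or ends with one.
    public static string[] ExtractWords(string input)
    {
        return Regex.Split(input, @"\W+")
            .Where(w => w.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        var groups = ExtractWords("  foo, bar! foo.").GroupBy(w => w);
        foreach (var g in groups)
        {
            Console.WriteLine("Word: {0}, Count:{1}", g.Key, g.Count());
        }
    }
}
```

The string.Split answer avoids the same problem with StringSplitOptions.RemoveEmptyEntries, but it splits only on spaces, so punctuation stays attached to the words.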

You can build a dictionary like this:

    var words = Regex.Split(input, @"\W+")
        .GroupBy(w => w)
        .Where(g => g.Count() > 10) // Where, not Select: the groups must be filtered, not projected to bool
        .ToDictionary(g => g.Key, g => g.Count());

Or, if you want to avoid computing the count twice:

    var words = Regex.Split(input, @"\W+")
        .GroupBy(w => w)
        .Select(g => new { g.Key, Count = g.Count() })
        .Where(g => g.Count > 10)
        .ToDictionary(g => g.Key, g => g.Count);

Now you can get a word's count like this (assuming the word "foo" occurs more than 10 times in input):

    var fooCount = words["foo"];
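The indexer throws a KeyNotFoundException when the word did not pass the filter, so if you cannot make that assumption, TryGetValue is the safer lookup. A minimal sketch follows. `LookupSketch`, `BuildCounts`, `CountOf`, and the `threshold` parameter are hypothetical names, and treating a missing word as a count of 0 is a choice made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookupSketch
{
    // Hypothetical helper: builds a word -> count dictionary, keeping only
    // the words that occur more than threshold times.
    public static Dictionary<string, int> BuildCounts(string input, int threshold)
    {
        return Regex.Split(input, @"\W+")
            .Where(w => w.Length > 0)
            .GroupBy(w => w)
            .Select(g => new { g.Key, Count = g.Count() })
            .Where(g => g.Count > threshold)
            .ToDictionary(g => g.Key, g => g.Count);
    }

    // Hypothetical helper: returns 0 instead of throwing when the word is absent.
    public static int CountOf(Dictionary<string, int> words, string word)
    {
        int count;
        return words.TryGetValue(word, out count) ? count : 0;
    }

    public static void Main()
    {
        var words = BuildCounts("foo bar foo baz foo", 1);
        Console.WriteLine(CountOf(words, "foo"));
        Console.WriteLine(CountOf(words, "bar"));
    }
}
```

Note that a result of 0 here means the word is not in the dictionary, either because it never appeared or because it fell at or below the threshold.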