Convert a string to title case

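I need to convert a string to title case as follows:

  1. The first word in the phrase is capitalized;

  2. Every other word in the same phrase is capitalized if its length is greater than minLength.

I looked at ToTitleCase, but the result is not what I expected.

So, with minLength = 2, the phrase "the car is very fast" should become "The Car is Very Fast".

I was able to make the first letter uppercase using: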

    Char[] letters = source.ToCharArray();
    letters[0] = Char.ToUpper(letters[0]);

And to get the words I am using:

    Regex.Matches(source, @"\b(\w|['-])+\b");

But I am not sure how to put all of this together.

Thank you, Miguel

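Sample code:

    // requires: using System.Text.RegularExpressions;
    string input = "i have the car which is very fast";
    int minLength = 2;
    // {{{0}}} expands to {2}: a word's first letter matches only when at least
    // minLength more word characters follow it.
    string regexPattern = string.Format(@"^\w|\b\w(?=\w{{{0}}})", minLength);
    string output = Regex.Replace(input, regexPattern, m => m.Value.ToUpperInvariant());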

Update (for the case of multiple sentences in a single string):

    string input = "i have the car which is very fast. me is slow.";
    int minLength = 2;
    // The lookbehind also matches the first letter after a full stop.
    string regexPattern = string.Format(@"(?<=(^|\.)\s*)\w|\b\w(?=\w{{{0}}})", minLength);
    string output = Regex.Replace(input, regexPattern, m => m.Value.ToUpperInvariant());

Output:

 I Have The Car Which is Very Fast. Me is Slow. 

You may also want to handle !, ? and other symbols, in which case you can use the following. Add as many sentence-terminating symbols to the character class as you need.

    string input = "i have the car which is very fast! me is slow.";
    int minLength = 2;
    string regexPattern = string.Format(@"(?<=(^|[.!?])\s*)\w|\b\w(?=\w{{{0}}})", minLength);
    string output = Regex.Replace(input, regexPattern, m => m.Value.ToUpperInvariant());

Update (2): convert e-marketing to E-Marketing (treating - as a valid word character):

    string input = "i have the car which is very fast! me is slow. it is very nice to learn e-marketing these days.";
    int minLength = 2;
    // [-\w] lets a hyphen count towards the length that follows the first letter.
    string regexPattern = string.Format(@"(?<=(^|[.!?])\s*)\w|\b\w(?=[-\w]{{{0}}})", minLength);
    string output = Regex.Replace(input, regexPattern, m => m.Value.ToUpperInvariant());
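The three patterns above differ only in the sentence terminators and in whether the hyphen counts as a word character, so they could be folded into one helper. What follows is only a minimal sketch: the class name TitleCaser and the method name Capitalize are hypothetical names introduced here, not part of the answer, and the terminator set is fixed to ., ! and ?.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleCaser
{
    // Hypothetical helper: upper-cases the first letter of every sentence and
    // of every word longer than minLength ('-' counts as a word character).
    public static string Capitalize(string input, int minLength)
    {
        if (string.IsNullOrEmpty(input)) return input;
        if (minLength < 0) throw new ArgumentOutOfRangeException(nameof(minLength));

        string pattern =
            @"(?<=(^|[.!?])\s*)\w" +               // first letter of each sentence
            @"|\b\w(?=[-\w]{" + minLength + "})";  // first letter with at least minLength more characters after it

        return Regex.Replace(input, pattern, m => m.Value.ToUpperInvariant());
    }
}
```

Calling TitleCaser.Capitalize("the car is very fast", 2) should give "The Car is Very Fast". Building the pattern by concatenation avoids the triple braces that string.Format needs.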

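English title casing is very complicated. It is not computable. Period.

The best you can get is to change all the small words according to a list of your own preferences. That will still be wrong for all verbal expressions. An extended list of variants can catch many of them, but some remain undecidable without semantic analysis. Two examples:

  • Running On/on Empty
  • Working On/on a Building

The latter is indeed clear from context; the former is not. The two meanings are clearly different, but a computer cannot decide which one is right.

(Sometimes even humans can't. I asked about the first example on a StackExchange forum and did not get an acceptable answer..)

Here is the replacement list I like; some of the four-letter words (no pun intended) are a matter of personal choice. Some may also argue that all kinds of numerals, such as any, all and few, should be capitalized.

This is anything but elegant; in fact, it is an embarrassment of sorts. But it works well for me, so I use it regularly and have fed 100k+ titles through it..: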

    // requires: using System.Globalization;
    public string ETC(string title) // English title capitalization
    {
        if (title == null) return "";
        string s = title.Trim().Replace('`', '\''); // change backtick to apostrophe
        TextInfo UsaTextInfo = new CultureInfo("en-US", false).TextInfo;
        s = UsaTextInfo.ToTitleCase(s); // caps for all words

        // a list of exceptions one way or the other..
        s = s.Replace(" A ", " a ");
        s = s.Replace(" also ", " Also ");
        s = s.Replace(" An ", " an ");
        s = s.Replace(" And ", " and ");
        s = s.Replace(" as ", " As ");
        s = s.Replace(" At ", " at ");
        s = s.Replace(" be ", " Be ");
        s = s.Replace(" But ", " But "); // no-op as written
        s = s.Replace(" By ", " by ");
        s = s.Replace(" For ", " for ");
        s = s.Replace(" From ", " from ");
        s = s.Replace(" if ", " If ");
        s = s.Replace(" In ", " in ");
        s = s.Replace(" Into ", " into ");
        s = s.Replace(" he ", " He ");
        s = s.Replace(" has ", " Has ");
        s = s.Replace(" had ", " Had ");
        s = s.Replace(" is ", " Is ");
        s = s.Replace(" my ", " My ");
        s = s.Replace("   ", " ");    // no triple spaces
        s = s.Replace("'N'", "'n'");  // Rock 'n' Roll
        s = s.Replace("'N ", "'n ");  // Rock 'n Roll
        s = s.Replace(" no ", " No ");
        s = s.Replace(" Nor ", " nor ");
        s = s.Replace(" Not ", " not ");
        s = s.Replace(" Of ", " of ");
        s = s.Replace(" Off ", " off ");
        s = s.Replace(" On ", " on ");
        s = s.Replace(" Onto ", " onto ");
        s = s.Replace(" Or ", " or ");
        s = s.Replace(" O'c ", " O'C ");
        s = s.Replace(" Over ", " over ");
        s = s.Replace(" so ", " So ");
        s = s.Replace(" To ", " to ");
        s = s.Replace(" that ", " That ");
        s = s.Replace(" this ", " This ");
        s = s.Replace(" thus ", " Thus ");
        s = s.Replace(" The ", " the ");
        s = s.Replace(" Too ", " too ");
        s = s.Replace(" when ", " When ");
        s = s.Replace(" With ", " with ");
        s = s.Replace(" Up ", " up ");
        s = s.Replace(" Yet ", " yet ");

        // a few(!) verbal expressions
        s = s.Replace(" Get up ", " Get Up ");
        s = s.Replace(" Give up ", " Give Up ");
        s = s.Replace(" Givin' up ", " Givin' Up ");
        s = s.Replace(" Grow up ", " Grow Up ");
        s = s.Replace(" Hung up ", " Hung Up ");
        s = s.Replace(" Make up ", " Make Up ");
        s = s.Replace(" Wake Me up ", " Wake Me Up ");
        s = s.Replace(" Mixed up ", " Mixed Up ");
        s = s.Replace(" Shut up ", " Shut Up ");
        s = s.Replace(" Stand up ", " Stand Up ");
        s = s.Replace(" Wind up ", " Wind Up ");
        s = s.Replace(" Wake up ", " Wake Up ");
        s = s.Replace(" Come up ", " Come Up ");
        s = s.Replace(" Working on ", " Working On ");
        s = s.Replace(" Waiting on ", " Waiting On ");
        s = s.Replace(" Turn on ", " Turn On ");
        s = s.Replace(" Move on ", " Move On ");
        s = s.Replace(" Keep on ", " Keep On ");
        s = s.Replace(" Bring It on ", " Bring It On ");
        s = s.Replace(" Hold on ", " Hold On ");
        s = s.Replace(" Hang on ", " Hang On ");
        s = s.Replace(" Go on ", " Go On ");
        s = s.Replace(" Coming on ", " Coming On ");
        s = s.Replace(" Come on ", " Come On ");
        s = s.Replace(" Call on ", " Call On ");
        s = s.Replace(" Trust in ", " Trust In ");
        s = s.Replace(" Fell in ", " Fell In ");
        s = s.Replace(" Falling in ", " Falling In ");
        s = s.Replace(" Fall in ", " Fall In ");
        s = s.Replace(" Faith in ", " Faith In ");
        s = s.Replace(" Come in ", " Come In ");
        s = s.Replace(" Believe in ", " Believe In ");

        return s.Trim();
    }

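Note that there are still quite a few rules that cannot be implemented this way.

Some basic rules are not that hard: capitalize the first and the last word. Capitalize all verbs (Is), adjectives (Red), pronouns (He), nouns (Ace) and numerals (One), even if they have fewer than 3 (or 4) letters.

But the exceptions are hard, for example: lowercase prepositions, except when they are part of a verbal expression…

Example 1: 'Working On a Building'. You have to know that this is a gospel song to decide that it is 'On'.

Example 2: 'Running On/on Empty'. It could mean 'Running On' or 'Running (with the gas indicator) on Empty'.

So in the end you will have to live with compromises.

An alternative (and naive) solution that does not need regular expressions is to use the String.Split method and LINQ's Select function to map the complex conditions: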
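The basic rules named above (capitalize the first and last word, keep a list of small words in lower case) can be sketched in a table-driven form. This is an illustration under stated assumptions, not the method used in the answer: the class SimpleTitleCase, its Apply method and the contents of the small-word list are all hypothetical, and the invariant culture is used instead of en-US. It deliberately knows nothing about verbal expressions, which is exactly where it goes wrong.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SimpleTitleCase
{
    // Illustrative list only; which words belong here is a matter of style.
    private static readonly HashSet<string> SmallWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "a", "an", "and", "at", "by", "for", "from", "in", "into", "nor",
            "of", "off", "on", "onto", "or", "over", "the", "to", "up", "with", "yet"
        };

    public static string Apply(string title)
    {
        if (string.IsNullOrWhiteSpace(title)) return "";

        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        string[] words = title.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < words.Length; i++)
        {
            // The first and the last word are always capitalized.
            bool isFirstOrLast = i == 0 || i == words.Length - 1;
            words[i] = !isFirstOrLast && SmallWords.Contains(words[i])
                ? words[i].ToLowerInvariant()
                : textInfo.ToTitleCase(words[i]);
        }
        return string.Join(" ", words);
    }
}
```

With this sketch "working on a building" becomes "Working on a Building" and "turn it on" becomes "Turn It On", so the gospel-song reading of the first title is lost: a word list alone cannot tell a preposition from part of a verbal expression.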

    // requires: using System; using System.Linq;
    var text = @"i have the car which is very fast. me is slow.";
    var length = 2;
    var first = true;        // first word in the sentence
    var containsDot = false; // previous word contains a dot
    var result = text
        .Split(' ')
        .ToList()
        .Select(p =>
        {
            if (first)
            {
                p = FirstCharToUpper(p);
                first = false;
            }
            if (containsDot)
            {
                p = FirstCharToUpper(p);
                containsDot = false;
            }
            containsDot = p.Contains(".");
            if (p.Length > length)
            {
                return FirstCharToUpper(p);
            }
            return p;
        })
        .Aggregate((h, t) => h + " " + t);
    Console.WriteLine(result);

The output is:

 I Have The Car Which is Very Fast. Me is Slow. 

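The FirstCharToUpper method comes from this SO post:

    public static string FirstCharToUpper(string input)
    {
        if (String.IsNullOrEmpty(input))
            throw new ArgumentException("ARGH!");
        return input.First().ToString().ToUpper() + String.Join("", input.Skip(1));
    }

The drawback of this solution is that the more complex the conditions get, the more complex and unreadable the Select statement becomes. Still, it is an alternative to regex.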
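One way to limit that readability cost is to move the state out of the lambda and into a plain loop. The sketch below is an assumption-laden variant, not the answer's code: WordCapitalizer, Capitalize and UpperFirst are hypothetical names, and a sentence end is detected with EndsWith(".") rather than Contains(".").

```csharp
using System;
using System.Collections.Generic;

public static class WordCapitalizer
{
    public static string Capitalize(string text, int minLength)
    {
        var result = new List<string>();
        bool sentenceStart = true; // the next non-empty word opens a sentence

        foreach (string word in text.Split(' '))
        {
            // As in the Select version, trailing punctuation counts towards the length.
            bool capitalize = sentenceStart || word.Length > minLength;
            result.Add(capitalize ? UpperFirst(word) : word);

            // Empty entries (from repeated spaces) leave the flag untouched.
            if (word.Length > 0)
                sentenceStart = word.EndsWith(".", StringComparison.Ordinal);
        }
        return string.Join(" ", result);
    }

    private static string UpperFirst(string word)
    {
        if (word.Length == 0) return word;
        return char.ToUpperInvariant(word[0]) + word.Substring(1);
    }
}
```

For the input used above, WordCapitalizer.Capitalize("i have the car which is very fast. me is slow.", 2) should produce the same "I Have The Car Which is Very Fast. Me is Slow." while keeping each rule on its own line.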

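Here is an approach that uses a StringBuilder and plain string methods. It needs no regex, so it should be quite efficient:

    // requires: using System; using System.Globalization; using System.Text;
    public static string ToTitleCase(string input, int minLength = 0)
    {
        TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
        string titleCaseDefault = ti.ToTitleCase(input);
        if (minLength == 0)
            return titleCaseDefault;

        StringBuilder sb = new StringBuilder(titleCaseDefault.Length);
        int wordCount = 0;
        char[] wordSeparatorChars = " \t\n.,;-:".ToCharArray();
        for (int i = 0; i < titleCaseDefault.Length; i++)
        {
            char c = titleCaseDefault[i];
            // A word starts at any character that is neither whitespace nor a
            // separator. Checking the full separator list keeps '.', ',' etc.
            // from producing an empty word and an endless loop.
            bool startsWord = !char.IsWhiteSpace(c) && Array.IndexOf(wordSeparatorChars, c) < 0;
            if (startsWord)
            {
                wordCount++;
                int firstSeparator = titleCaseDefault.IndexOfAny(wordSeparatorChars, i);
                int endIndex = firstSeparator >= 0 ? firstSeparator : titleCaseDefault.Length;
                string word = titleCaseDefault.Substring(i, endIndex - i);
                if (wordCount == 1) // first word stays upper
                    sb.Append(word);
                else
                    sb.Append(word.Length < minLength ? word.ToLower() : ti.ToTitleCase(word));
                i = endIndex - 1; // continue at the separator that ended the word
            }
            else
                sb.Append(c); // copy separators unchanged
        }
        return sb.ToString();
    }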

With your sample data:

    string text = "the car is very fast";
    string output = ToTitleCase(text, 3); // "The Car is Very Fast"