Regular expression to extract initials from a name

E.g. if the Name is John Deer, the Initials should be JD.

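I can do this check on the Initials field using substrings, but I'm wondering whether I can write a regular expression for it. And is a regular expression a better choice here than the string methods?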

Personally, I prefer this regex:

    Regex initials = new Regex(@"(\b[a-zA-Z])[a-zA-Z]* ?");
    string init = initials.Replace(nameString, "$1");
    // init = "JD"

This takes care of both the initials and the whitespace removal (that is what the ' ?' at the end is for).

The only thing you need to worry about is titles and punctuation such as Jr., Sr. or Mrs. Some people do include those in their full names.
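To make that caveat concrete, here is a minimal sketch that runs the same pattern over a plain name and over two names carrying a suffix or a title. The class name `SimpleInitialsSketch` and the helper `Extract` are made up for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimpleInitialsSketch
{
    // Same pattern as in the answer above: capture the first letter of each
    // word, consume the rest of the word and one optional trailing space.
    private static readonly Regex Initials = new Regex(@"(\b[a-zA-Z])[a-zA-Z]* ?");

    // Hypothetical helper name, used only for this illustration.
    public static string Extract(string name)
    {
        return Initials.Replace(name, "$1");
    }

    public static void Main()
    {
        Console.WriteLine(Extract("John Deer"));      // plain two-word name
        Console.WriteLine(Extract("John Deer Jr."));  // the suffix and its period leak through
        Console.WriteLine(Extract("Mrs. Jane Doe"));  // so do the title and its punctuation
    }
}
```

Anything that is not an ASCII letter or the single space directly after a word is left untouched, so "John Deer Jr." comes out as "JDJ." and "Mrs. Jane Doe" as "M. JD". Names like these probably need cleaning before the regex is applied.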

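Here is my solution. My goal was not to provide the simplest solution, but one that can take a variety of (sometimes weird) name formats and produce a best guess at the first-name and last-name initials (or, for a single-word name, just one initial).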

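I also tried to write it in a relatively international-friendly way, using Unicode regexes. I have no experience generating initials for many kinds of foreign names (e.g. Chinese), but it should at least produce something usable to represent the person in two characters. For example, feeding it a Korean name like "행운의 복숭아" yields 행복, as you might expect (although that may not be the right way to do it in Korean culture).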

    /// <summary>
    /// Given a person's first and last name, we'll make our best guess to extract up to two initials, hopefully
    /// representing their first and last name, skipping any middle initials, Jr/Sr/III suffixes, etc. The letters
    /// will be returned together in ALL CAPS, e.g. "TW".
    ///
    /// The way it parses names for many common styles:
    ///
    /// Mason Zhwiti                -> MZ
    /// mason lowercase zhwiti      -> MZ
    /// Mason G Zhwiti              -> MZ
    /// Mason G. Zhwiti             -> MZ
    /// John Queue Public           -> JP
    /// John Q. Public, Jr.         -> JP
    /// John Q Public Jr.           -> JP
    /// Thurston Howell III         -> TH
    /// Thurston Howell, III        -> TH
    /// Malcolm X                   -> MX
    /// A Ron                       -> AR
    /// AA Ron                      -> AR
    /// Madonna                     -> M
    /// Chris O'Donnell             -> CO
    /// Malcolm McDowell            -> MM
    /// Robert "Rocky" Balboa, Sr.  -> RB
    /// 1Bobby 2Tables              -> BT
    /// Éric Ígor                   -> ÉÍ
    /// 행운의 복숭아                 -> 행복
    /// </summary>
    /// <param name="name">The full name of a person.</param>
    /// <returns>One to two uppercase initials, without punctuation.</returns>
    public static string ExtractInitialsFromName(string name)
    {
        // First remove all: punctuation, separator chars, control chars, and numbers (Unicode-style regexes).
        string initials = Regex.Replace(name, @"[\p{P}\p{S}\p{C}\p{N}]+", "");

        // Replace all possible whitespace/separator characters (Unicode style) with a single, regular ASCII space.
        initials = Regex.Replace(initials, @"\p{Z}+", " ");

        // Remove all Sr, Jr, I, II, III, IV, V, VI, VII, VIII, IX at the end of names.
        initials = Regex.Replace(initials.Trim(), @"\s+(?:[JS]R|I{1,3}|I[VX]|VI{0,3})$", "", RegexOptions.IgnoreCase);

        // Extract up to 2 initials from the remaining cleaned name.
        initials = Regex.Replace(initials, @"^(\p{L})[^\s]*(?:\s+(?:\p{L}+\s+(?=\p{L}))?(?:(\p{L})\p{L}*)?)?$", "$1$2").Trim();

        if (initials.Length > 2)
        {
            // Worst case scenario, everything failed, just grab the first two letters of what we have left.
            initials = initials.Substring(0, 2);
        }

        return initials.ToUpperInvariant();
    }
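To see what the clean-up stage does on its own, the sketch below isolates the first three `Regex.Replace` calls from the method above. `NameCleanupSketch` and `CleanName` are hypothetical names introduced here; the patterns are copied unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameCleanupSketch
{
    // Hypothetical helper: only the first three steps of the method above
    // (strip noise, normalise whitespace, drop a trailing suffix).
    public static string CleanName(string name)
    {
        // Drop punctuation, symbols, control characters and digits.
        string cleaned = Regex.Replace(name, @"[\p{P}\p{S}\p{C}\p{N}]+", "");

        // Collapse any run of Unicode separator characters into one ASCII space.
        cleaned = Regex.Replace(cleaned, @"\p{Z}+", " ");

        // Remove a trailing Jr/Sr or a Roman numeral I..IX, ignoring case.
        return Regex.Replace(cleaned.Trim(), @"\s+(?:[JS]R|I{1,3}|I[VX]|VI{0,3})$", "",
            RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(CleanName("John Q. Public, Jr."));
        Console.WriteLine(CleanName("Thurston Howell, III"));
        Console.WriteLine(CleanName("Malcolm X"));
        Console.WriteLine(CleanName("1Bobby 2Tables"));
    }
}
```

One thing to watch for: because the suffix pattern is case-insensitive, a short final word that happens to look like a Roman numeral (for example a surname "Vi") is stripped as well.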

How about this?

    var initials = Regex.Replace("John Deer", "[^A-Z]", "");
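The intent here appears to be "delete everything that is not an uppercase ASCII letter", i.e. the character class `[^A-Z]`. A small sketch of how that behaves on differently capitalised input (`CapitalsOnlySketch` and `KeepCapitals` are illustrative names, not part of the answer):

```csharp
using System;
using System.Text.RegularExpressions;

public static class CapitalsOnlySketch
{
    // Hypothetical helper: keeps only the characters A-Z.
    public static string KeepCapitals(string name)
    {
        return Regex.Replace(name, "[^A-Z]", "");
    }

    public static void Main()
    {
        Console.WriteLine(KeepCapitals("John Deer"));            // well-capitalised input
        Console.WriteLine(KeepCapitals("John Clark MacDonald")); // inner capital is kept too
        Console.WriteLine(KeepCapitals("harold mcDonald"));      // lowercase first letters are lost
    }
}
```

The approach relies entirely on the input's capitalisation: "John Clark MacDonald" gives "JCMD" and "harold mcDonald" gives just "D".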

Here is an alternative that puts the emphasis on keeping it simple:

    /// <summary>
    /// Gets initials from the supplied names string.
    /// </summary>
    /// <param name="names">Names separated by whitespace</param>
    /// <param name="separator">Separator between initials (e.g. "", "." or ". ")</param>
    /// <returns>Upper case initials (with separators in between)</returns>
    public static string GetInitials(string names, string separator)
    {
        // Extract the first character out of each block of non-whitespace
        Regex extractInitials = new Regex(@"\s*([^\s])[^\s]*\s*");
        return extractInitials.Replace(names, "$1" + separator).ToUpper();
    }

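There is the question of what to do when the supplied name is not in the expected form. Personally, I think it should simply return the first character of each block of non-whitespace text. For example: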

    1Steve 2Chambers               => 12
    harold mcDonald                => HM
    David O'Leary                  => DO
    David O' Leary                 => DOL
    Ronnie "the rocket" O'Sullivan => R"RO

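Some will argue for more sophisticated or complex techniques (e.g. to handle the last one better), but IMO this is really a data cleansing problem.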
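Note that the `Replace` call appends the separator after every initial, including the last one. If the separator should appear only between initials, one possible variant is sketched below. `JoinInitials` is a made-up name, and the sketch swaps the replace for `Regex.Matches` plus `string.Join` while keeping the same "first character of each non-whitespace block" rule.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InitialsJoinSketch
{
    // Hypothetical variant: take the first character of every block of
    // non-whitespace and join them, so no separator trails the last initial.
    public static string JoinInitials(string names, string separator)
    {
        var firstChars = Regex.Matches(names, @"\S+")
            .Cast<Match>()
            .Select(m => m.Value.Substring(0, 1));

        return string.Join(separator, firstChars).ToUpperInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(JoinInitials("John Deer", "."));
        Console.WriteLine(JoinInitials("David O' Leary", ""));
    }
}
```

With "." as the separator, "John Deer" becomes "J.D" rather than "J.D.", so append a final separator yourself if you want one.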

Try this:

 (^| )([^ ])([^ ])*','\2')

Or this:

    public static string ToInitials(this string str)
    {
        // Handles both "Last, First" (first alternative) and "First Last" (second alternative).
        return Regex.Replace(str,
            @"^(?'b'\w)\w*,\s*(?'a'\w)\w*$|^(?'a'\w)\w*\s*(?'b'\w)\w*$",
            "${a}${b}", RegexOptions.Singleline);
    }

http://www.kewney.com/posts/software-development/using-regular-expressions-to-get-initials-from-a-string-in-c-sharp

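Yes, use a regular expression. You can use Regex.Match and the Groups property of the returned Match to find a match and then pull out the value you need, in this case the initials. Finding and extracting happen in the same step.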
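A minimal sketch of that idea, assuming a name made of exactly two ASCII-letter words. The pattern, the group names `first` and `last`, and the helper `FirstLastInitials` are my own illustration, not something the answer specifies.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchGroupsSketch
{
    // Illustrative pattern (an assumption): two ASCII-letter words separated
    // by whitespace, with the first letter of each captured in a named group.
    private static readonly Regex TwoWords =
        new Regex(@"^(?<first>[a-zA-Z])[a-zA-Z]*\s+(?<last>[a-zA-Z])[a-zA-Z]*$");

    // Hypothetical helper: returns the two initials, or "" if the name
    // does not have the expected shape.
    public static string FirstLastInitials(string name)
    {
        Match match = TwoWords.Match(name);
        if (!match.Success)
        {
            return string.Empty;
        }
        return match.Groups["first"].Value + match.Groups["last"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstLastInitials("John Deer"));
        Console.WriteLine(FirstLastInitials("Madonna").Length); // no match, empty result
    }
}
```

Compared with the `Replace`-based answers, a non-matching name is reported explicitly through `Match.Success` instead of being passed through unchanged.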

How about this:

    string name = "John Clark MacDonald";
    var parts = name.Split(' ');
    string initials = "";

    foreach (var part in parts)
    {
        // Regex.Match returns the first uppercase letter of each part.
        initials += Regex.Match(part, "[A-Z]");
        Console.WriteLine(part + " --> " + Regex.Match(part, "[A-Z]"));
    }

    Console.WriteLine("Final initials: " + initials);
    Console.ReadKey();

This allows for optional middle names, and it copes with names that contain more than one capital letter, as shown above.

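Removing everything matched by [a-z]+[a-z]+\b will net you the leading letter of each name...

where name = 'Greg Henry' = 'G H' or 'James Smith' = 'J S'

Then you can split on ' ' and join on ''.

This even works on names like

'James Henry George Michael' = 'J H G M'

'James Henry George Michael III The second' = 'J H G M III'

If you want to avoid the split, use [a-z]+[a-z]+\b ? instead.

But then a name like Jon Michael Jr. The 3rd will be = JMJr.T3, whereas the option above lets you pick out 'The', 'the' and '3rd' if you want.

If you really want to be fancy, you can use (\b[a-zA-Z])[a-zA-Z]* ? to match just the parts of the name and then replace each match with its captured first letter.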
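A rough sketch of the "remove the lowercase tail" idea, assuming the patterns are `[a-z]+[a-z]+\b` and `[a-z]+[a-z]+\b ?` and that matches are replaced with an empty string. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LowercaseTailSketch
{
    // Remove the run of two or more lowercase letters that ends each word.
    public static string StripTails(string name)
    {
        return Regex.Replace(name, @"[a-z]+[a-z]+\b", "");
    }

    // Split on ' ' and join on "" to drop the spaces left between initials.
    public static string SplitAndJoin(string name)
    {
        return string.Join("", StripTails(name).Split(' '));
    }

    // Variant that also swallows one trailing space, so no split is needed.
    public static string StripTailsAndSpace(string name)
    {
        return Regex.Replace(name, @"[a-z]+[a-z]+\b ?", "");
    }

    public static void Main()
    {
        Console.WriteLine(StripTails("Greg Henry"));
        Console.WriteLine(SplitAndJoin("James Henry George Michael"));
        Console.WriteLine(StripTailsAndSpace("Greg Henry"));
    }
}
```

Since only lowercase runs are removed, whatever else is in the input (capitals, digits, punctuation) survives. Running the sketch on a few real names from your data is the easiest way to check what a given pattern leaves behind.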
