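Regex to split a string on spaces while preserving quoted text
I need to split the string below using a space as the delimiter, but any spaces inside quotes should be preserved.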
research library "not available" author:"Bernard Shaw"
to
research
library
"not available"
author:"Bernard Shaw"
I want to do this in C#, and I have this regex: @"(?<="")|\w[\w\s]*(?="")|\w+|""[\w\s]*"""
It comes from another post on SO, and it splits the string into
research
library
"not available"
author
"Bernard Shaw"
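Unfortunately, this does not meet my exact requirement: author:"Bernard Shaw" should stay together as a single token.
I'm looking for any regex that would do the trick.
Any help is appreciated.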
As long as there are no escaped quotes inside the quoted strings, the following should work:
splitArray = Regex.Split(subjectString, "(?<=^[^\"]*(?:\"[^\"]*\"[^\"]*)*) (?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
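This regex splits on a space character only if that space is both preceded and followed by an even number of quotes.
Here is the regex without all the escaped quotes, explained: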
(?<=      # Assert that it's possible to match this before the current position (positive lookbehind):
 ^        # The start of the string
 [^"]*    # Any number of non-quote characters
 (?:      # Match the following group...
  "[^"]*  # a quote, followed by any number of non-quote characters
  "[^"]*  # the same
 )*       # ...zero or more times (so 0, 2, 4, ... quotes will match)
)         # End of lookbehind assertion.
[ ]       # Match a space
(?=       # Assert that it's possible to match this after the current position (positive lookahead):
 (?:      # Match the following group...
  [^"]*"  # see above
  [^"]*"  # see above
 )*       # ...zero or more times.
 [^"]*    # Match any number of non-quote characters
 $        # Match the end of the string
)         # End of lookahead assertion
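To see the split in action, here is a minimal sketch of a small console program around the same pattern. The class name QuoteAwareSplit and the helper Split are made up for illustration; only Regex.Split and the pattern itself come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the name is an assumption for this sketch.
public static class QuoteAwareSplit
{
    // The pattern from the answer, unchanged: split on a space only when an
    // even number of quotes lies both before it and after it.
    private const string Pattern =
        "(?<=^[^\"]*(?:\"[^\"]*\"[^\"]*)*) (?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static string[] Split(string input)
    {
        // The pattern has no capturing groups, so Regex.Split returns
        // only the pieces between the matched spaces.
        return Regex.Split(input, Pattern);
    }

    public static void Main()
    {
        string subject = "research library \"not available\" author:\"Bernard Shaw\"";
        foreach (string token in Split(subject))
        {
            Console.WriteLine(token);
        }
        // Prints four lines:
        // research
        // library
        // "not available"
        // author:"Bernard Shaw"
    }
}
```

Two things to keep in mind with this approach: consecutive spaces outside quotes produce empty entries in the result, and a string with an unbalanced (odd) number of quotes is not split at all, because no space can then have an even quote count on both sides.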
Here you go:
C#:
Regex.Matches(subject, @"([^\s]*""[^""]+""[^\s]*)|\w+")
Regex:
([^\s]*\"[^\"]+\"[^\s]*)|\w+
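For comparison, this second pattern matches the tokens themselves instead of the separators. A rough sketch of how it might be used follows; the class QuoteAwareMatch and the helper Tokenize are hypothetical names, while the pattern and the Regex.Matches call are the ones shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the name is an assumption for this sketch.
public static class QuoteAwareMatch
{
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        // First alternative: optional non-space prefix, a quoted section,
        // optional non-space suffix. Second alternative: a plain word.
        foreach (Match m in Regex.Matches(input, @"([^\s]*""[^""]+""[^\s]*)|\w+"))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        string subject = "research library \"not available\" author:\"Bernard Shaw\"";
        foreach (string token in Tokenize(subject))
        {
            Console.WriteLine(token);
        }
        // Prints four lines:
        // research
        // library
        // "not available"
        // author:"Bernard Shaw"
    }
}
```

Because unquoted tokens are matched with \w+, anything that is not a word character is dropped from them: an input such as foo-bar comes back as the two tokens foo and bar. An empty pair of quotes ("") is not matched either, since [^"]+ requires at least one character between the quotes.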