Character count minus HTML characters in C#

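I'm trying to figure out a way to count the number of characters in a string, truncate the string, and then return it. However, I need this function to NOT count HTML tags. The problem is that if it does count HTML tags and the truncation point falls in the middle of a tag, the page will appear broken.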

This is what I have so far...

    public string Truncate(string input, int characterLimit, string currID)
    {
        string output = input;

        // Check if the string is longer than the allowed amount;
        // otherwise do nothing
        if (output.Length > characterLimit && characterLimit > 0)
        {
            // Cut the string down to the maximum number of characters
            output = output.Substring(0, characterLimit);

            // Check if the character right after the truncate point was a space;
            // if not, we are in the middle of a word and need to remove the rest of it
            if (input.Substring(output.Length, 1) != " ")
            {
                int LastSpace = output.LastIndexOf(" ");

                // If we found a space, cut back to that space
                if (LastSpace != -1)
                {
                    output = output.Substring(0, LastSpace);
                }
            }

            // End any anchors
            if (output.Contains("<a href"))
            {
                output += "</a>";
            }

            // Finally, add the "..." and end the paragraph
            output += "\n\n...see more\n";
        }
        return output;
    }

But I'm not happy with this. Is there a better way to do it? If you can offer a new solution, or suggest what to add to what I have so far, that would be great.

Disclaimer: I have never worked with C#, so I'm not familiar with the concepts of the language... I'm doing this because I have to, not by choice.

Thanks, Hristo

Use the right tool for the job.

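HTML is not a simple format to parse. I suggest using an existing, well-tested parser rather than writing your own. If you know you will only ever be parsing XHTML, you can use an XML parser.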

These are the only reliable ways to manipulate HTML while preserving its semantic structure.
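To illustrate the XML-parser route, here is a minimal sketch using LINQ to XML (System.Xml.Linq) from the .NET base library. It is not a definitive implementation. The class name XhtmlTruncator, the suffix parameter, and the dummy root wrapper are all made up for this example. It assumes the input is well-formed XHTML with no HTML-only named entities such as &nbsp;, which the XML parser would reject. Only text nodes count toward the limit, and everything after the cut point is removed from the tree, so every tag that was opened is still closed.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XhtmlTruncator
{
    // Hypothetical helper: keeps at most `limit` characters of visible text.
    // Assumption: `xhtml` is a well-formed XHTML fragment.
    public static string Truncate(string xhtml, int limit, string suffix = "...")
    {
        // Wrap in a dummy root so a fragment with several top-level nodes parses.
        XElement root = XElement.Parse("<root>" + xhtml + "</root>",
                                       LoadOptions.PreserveWhitespace);

        // XElement.Value concatenates descendant text only, so tags are not counted.
        if (limit <= 0 || root.Value.Length <= limit)
            return xhtml;

        int remaining = limit;

        // Snapshot the text nodes first, because the tree is modified below.
        foreach (XText text in root.DescendantNodes().OfType<XText>().ToList())
        {
            if (text.Value.Length <= remaining)
            {
                remaining -= text.Value.Length;   // this node fits entirely
                continue;
            }

            string kept = text.Value.Substring(0, remaining);

            // If the cut lands mid-word, back up to the previous space (if any).
            if (text.Value[remaining] != ' ')
            {
                int lastSpace = kept.LastIndexOf(' ');
                if (lastSpace != -1)
                    kept = kept.Substring(0, lastSpace);
            }
            text.Value = kept + suffix;

            // Remove everything that follows the cut point in document order:
            // later siblings of the text node and of each of its ancestors.
            for (XNode n = text; n != null; n = n.Parent)
                n.NodesAfterSelf().Remove();
            break;
        }

        // Serialize the children of the dummy root, without the wrapper itself.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

For example, calling XhtmlTruncator.Truncate on `<p>Hello <a href="x.html">wonderful world</a> and more</p>` with a limit of 15 should yield `<p>Hello <a href="x.html">wonderful...</a></p>`, with the anchor and paragraph still closed. For real-world HTML that is not valid XML, an HTML parser library such as Html Agility Pack would be needed in place of XElement.Parse.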

Do not try to use regular expressions. HTML is not a regular language, and you will only cause yourself grief and pain.
