iTextSharp: replace text in an existing PDF without losing formatting

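I have been searching the internet for two weeks and have found some interesting solutions for my problem, but none of them gives me an answer.

My goal is the following:

I want to find a text in a static PDF file and replace it with another text, while keeping the design of the content. Is it really that hard?

I found one approach, but with it I lose all the formatting information: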

    using (PdfReader reader = new PdfReader(path))
    {
        StringBuilder text = new StringBuilder();
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // Extracts plain text only; fonts, positions and layout are discarded
            text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
        }
        // Replace once, after the text of all pages has been collected
        text.Replace(txt_SuchenNach.Text, txt_ErsetzenMit.Text);
        return text.ToString();
    }

My second attempt was better, but it requires form fields whose inner text I can change:

    string fileNameExisting = path;
    string fileNameNew = @"C:\TEST.pdf";

    using (FileStream existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
    using (FileStream newFileStream = new FileStream(fileNameNew, FileMode.Create))
    {
        // Open the PDF
        PdfReader pdfReader = new PdfReader(existingFileStream);
        PdfStamper stamper = new PdfStamper(pdfReader, newFileStream);

        var form = stamper.AcroFields;
        var fieldKeys = form.Fields.Keys;
        foreach (string fieldKey in fieldKeys)
        {
            var value = pdfReader.AcroFields.GetField(fieldKey);
            form.SetField(fieldKey, value.Replace(txt_SuchenNach.Text, txt_ErsetzenMit.Text));
        }

        // Make the text fields non-editable (they then look like normal text)
        stamper.FormFlattening = true;

        stamper.Close();
        pdfReader.Close();
    }

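This keeps the formatting of the rest of the text and changes only the text I searched for. What I need is a solution for text that is not inside a text field.

Thanks for all your answers and your help.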

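The general problem is that a text object may use an embedded font in which specific glyphs are assigned to specific letters. For example, if a text object contains the text "abcdef", the embedded font may contain glyphs only for those letters ("abcdef") and for no others. So if you replace "abcdef" with "xyz", the PDF will not display "xyz", because there are no glyphs available to draw those letters.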
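The subset problem described above can be illustrated without any PDF library. The following is only a minimal sketch: `GlyphCoverage` and `MissingGlyphs` are hypothetical names introduced here, and the set of embedded glyphs is assumed to be known already (reading it out of a real PDF font is not shown and depends on the library you use).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GlyphCoverage
{
    // Returns the distinct characters of 'replacement' that have no glyph
    // in the (assumed, already known) set of glyphs of a subset font.
    public static IReadOnlyList<char> MissingGlyphs(ISet<char> embeddedGlyphs, string replacement)
    {
        if (embeddedGlyphs == null) throw new ArgumentNullException(nameof(embeddedGlyphs));
        if (replacement == null) throw new ArgumentNullException(nameof(replacement));

        return replacement
            .Where(c => !embeddedGlyphs.Contains(c))
            .Distinct()
            .ToList();
    }
}

// Usage (hypothetical values): a font subset built only for "abcdef"
//   var subset = new HashSet<char>("abcdef");
//   var missing = GlyphCoverage.MissingGlyphs(subset, "xyz");
//   if (missing.Count > 0) { /* fall back to a font that is embedded anew */ }
```

If the returned list is not empty, reusing the original embedded font for the replacement text is not safe, which is why the workflow in this answer adds new text objects with a font that the PDF tool embeds itself.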

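So I would consider the following workflow:

Iterate through all text objects;
Add a new text object, created from scratch, on top of the PDF page, with the same properties (font, position, etc.) but with different text; this step may require that the font used in the original PDF is installed, but you can check the installed fonts and use another font for the new text object. This way iTextSharp or another PDF tool will embed a new font object for the new text object.
Delete the original text object once the duplicate text object has been created;
Process every text object with the workflow above;
Save the modified PDF document to a new file.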

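I have worked on the same requirement, and I was able to achieve it with the following steps.

Step 1: Locate the source PDF file and the destination file path.

Step 2: Read the source PDF file and search for the location of the string we want to replace.

Step 3: Replace the string with the new string.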

    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using iTextSharp.text.pdf.parser;
    using PDFExtraction;
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Web;
    using System.Web.UI;
    using System.Web.UI.WebControls;

    namespace PDFReplaceTextUsingItextSharp
    {
        public partial class ExtractPdf : System.Web.UI.Page
        {
            static iTextSharp.text.pdf.PdfStamper stamper = null;

            protected void Page_Load(object sender, EventArgs e)
            {
            }

            protected void Replace_Click(object sender, EventArgs e)
            {
                string ReplacingVariable = txtReplace.Text;
                string sourceFile = "Source File Path";
                string descFile = "Destination File Path";

                PdfReader pReader = new PdfReader(sourceFile);
                stamper = new iTextSharp.text.pdf.PdfStamper(pReader, new System.IO.FileStream(descFile, System.IO.FileMode.Create));
                PDFTextGetter("ExistingVariableinPDF", ReplacingVariable, StringComparison.CurrentCultureIgnoreCase, sourceFile, descFile);
                stamper.Close();
                pReader.Close();
            }

            /// <summary>
            /// This method is used to search for the location of words in the PDF and update them with the words given in the replacingText variable
            /// </summary>
            /// <param name="pSearch">Searchable String</param>
            /// <param name="replacingText">Replacing String</param>
            /// <param name="SC">Case Ignorance</param>
            /// <param name="SourceFile">Path of the source file</param>
            /// <param name="DestinationFile">Path of the destination file</param>
            public static void PDFTextGetter(string pSearch, string replacingText, StringComparison SC, string SourceFile, string DestinationFile)
            {
                try
                {
                    iTextSharp.text.pdf.PdfContentByte cb = null;
                    iTextSharp.text.pdf.PdfContentByte cb2 = null;
                    iTextSharp.text.pdf.BaseFont bf = null;

                    if (System.IO.File.Exists(SourceFile))
                    {
                        PdfReader pReader = new PdfReader(SourceFile);

                        for (int page = 1; page <= pReader.NumberOfPages; page++)
                        {
                            // myLocationTextExtractionStrategy is a custom class that is not shown here
                            // (presumably it comes from the PDFExtraction namespace imported above)
                            myLocationTextExtractionStrategy strategy = new myLocationTextExtractionStrategy();
                            cb = stamper.GetOverContent(page);
                            cb2 = stamper.GetOverContent(page);

                            //Send some data contained in PdfContentByte, looks like the first is always zero for me and the second 100,
                            //but I'm not sure if this could change in some cases
                            strategy.UndercontentCharacterSpacing = (int)cb.CharacterSpacing;
                            strategy.UndercontentHorizontalScaling = (int)cb.HorizontalScaling;

                            //It's not really needed to get the text back, but we have to call this line ALWAYS,
                            //because it triggers the process that will get all chunks from the PDF into our strategy object
                            string currentText = PdfTextExtractor.GetTextFromPage(pReader, page, strategy);

                            //The real getter process starts in the following line
                            List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(pSearch, SC);

                            //Set the fill color of the shapes, I don't use a border because it would make the rect bigger,
                            //but maybe using a thin border could be a solution if you see the current rect is not big enough to cover all the text it should cover
                            cb.SetColorFill(BaseColor.WHITE);

                            //MatchesFound contains all text with locations, so do whatever you want with it;
                            //here each match is covered with a white rectangle and the new text is drawn on top:
                            foreach (iTextSharp.text.Rectangle rect in MatchesFound)
                            {
                                //width is hard-coded to 60
                                cb.Rectangle(rect.Left, rect.Bottom, 60, rect.Height);
                                cb.Fill();

                                cb2.SetColorFill(BaseColor.BLACK);
                                bf = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
                                cb2.SetFontAndSize(bf, 9);
                                cb2.BeginText();
                                cb2.ShowTextAligned(0, replacingText, rect.Left, rect.Bottom, 0);
                                cb2.EndText();
                                cb2.Fill();
                            }
                        }

                        pReader.Close();
                    }
                }
                catch (Exception)
                {
                    // Exceptions are silently swallowed here; log or rethrow them in real code
                }
            }
        }
    }
