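Parsing XML into lists

I have a very detailed XML file. I have been able to parse most of it, but I have hit one subtree that is giving me grief, and I'm afraid I'm working harder than I need to. Here is the XML I'm referring to.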

    7200 General Supplies | 7200 General Supplies | 7200 General Supplies
    T University | T University | T University
    360806 National Institutes of Health | 360903 National Institutes of Health | 360957 National Institutes of Health
    02 Research | 02 Research | 02 Research
    015 Biology - Life Science | 015 Biology - Life Science | 015 Biology - Life Science
    04400 TUSM:Neuroscience | 04400 TUSM:Neuroscience | 04400 TUSM:Neuroscience

I'm trying to end up with lists similar to this.

    Account  distributionType  Activity  distributionValue  Fund
    7200     PercentOfPrice    ""        10                 360806
    7200     PercentOfPrice    ""        45                 360903
    7200     PercentOfPrice    ""        45                 360957

And so on…

The code I wrote looks like this. It is only a snippet. Note that I think I have overcomplicated it.

    if (tagName == "Codes")
    {
        // Create another reader that contains just the accounting elements.
        XmlReader inner = reader.ReadSubtree();
        //inner.ReadToDescendant("Codes");
        //printOutXML(inner);
        while (inner.Read())
        {
            switch (inner.NodeType)
            {
                // Walk down the XML hierarchy, then simply fill in the values.
                case XmlNodeType.Element:
                    switch (reader.Name)
                    {
                        case "CustomFieldValueSet":
                            // Get the attribute we are currently working with, such as Account.
                            innerTagName = inner.GetAttribute("name");
                            // Activity and Location can potentially be blank, so check for that here;
                            // if they are, immediately fill the list with empty strings.
                            if (innerTagName == "Activity")
                            {
                                if (inner.IsEmptyElement)
                                {
                                    // Quickly put fillers in.
                                    for (int i = 0; i < thisInvoice.account.Count; i++)
                                    {
                                        thisInvoice.activity.Add("");
                                    }
                                }
                            }
                            if (innerTagName == "Location")
                            {
                                if (inner.IsEmptyElement)
                                {
                                    // Quickly put fillers in.
                                    for (int i = 0; i
                            // ... (part of the snippet is missing here) ...
                            thisInvoice.distributionType.Count)
                            {
                                for (int i = 0; i < thisInvoice.distributionValue.Count - thisInvoice.distributionType.Count; i++)
                                {
                                    thisInvoice.distributionType.Add(distType);
                                }
                            }
                            break;
                        case "Value":
                            // XmlNodeType.Text
                            if (innerTagName == "Account"/*&& inner.NodeType ==XmlNodeType.Text*/)
                            {
                                inner.MoveToContent(); // move to the text
                                inner.Read();
                                thisInvoice.account.Add(inner.Value);
                            }
                            if (innerTagName == "Activity")
                            {
                                // Activity is not a mandatory field, so it could be empty. Check whether
                                // it is a self-closing tag and, if so, add an empty string instead.
                                if (inner.IsEmptyElement)
                                {
                                    thisInvoice.activity.Add("");
                                }
                                else
                                {
                                    inner.MoveToContent(); // move to the text
                                    inner.Read();
                                    thisInvoice.activity.Add(inner.Value);
                                }
                            }
                            if (innerTagName == "Location")
                            {
                                if (inner.IsEmptyElement)
                                {
                                    thisInvoice.location.Add("");
                                }
                                else
                                {
                                    inner.MoveToContent(); // move to the text
                                    inner.Read();
                                    thisInvoice.location.Add(inner.Value);
                                }
                            }
                            if (innerTagName == "Fund")
                            {
                                inner.MoveToContent(); // move to the text
                                inner.Read();
                                thisInvoice.fund.Add(inner.Value);
                            }
                            if (innerTagName == "Organization")
                            {
                                inner.MoveToContent(); // move to the text
                                inner.Read();
                                thisInvoice.org.Add(inner.Value);
                            }
                            if (innerTagName == "Program")
                            {
                                inner.MoveToContent(); // move to the text
                                inner.Read();
                                thisInvoice.prog.Add(inner.Value);
                            }
                            break;
                    } // end switch
                    break; // break the outside case
                case XmlNodeType.EndElement:
                    if (inner.Name == "CustomFieldValueSet" || inner.Value == "CustomFieldValueSet")
                    {
                        distributionSwitch = true;
                        Console.WriteLine(reader.Value);
                        Console.WriteLine(reader.Name);
                    }
                    if (inner.Name == "Codes")
                    {
                        distributionSwitch = false;
                        distType = null;
                        inner.Close();
                    }
                    break;
            } // end switch
        } // end while
    } // end if

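In the case of distributionType, I need to make its list as long as the account list. In other words, once I have the value in a variable, I need to use it as a filler so that the distribution type list has the same length as the account list. I can't imagine there isn't an easier way to do this. I keep looking at LINQ to XML, but it doesn't make much sense to me. I'd love to hear how some of you experts would tackle this. I'm trying to put together an elegant solution with as little code as possible. Any help would be greatly appreciated.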

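As mentioned in the comments, as an alternative to Mihai's LINQ to XML solution you can also deserialize the XML into typed classes and properties using a predefined class structure.

The advantage is that you end up with an object that (hopefully) represents the XML and lets you work with its data more easily.

Using the provided XML sample and the Edit -> Paste Special -> Paste XML As Classes menu option in Visual Studio, you get a class structure similar to the one below (the classes have been cleaned up a little to make them easier to read).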

    using System.Collections.Generic;
    using System.Xml.Serialization;

    [XmlTypeAttribute(AnonymousType = true)]
    [XmlRootAttribute(Namespace = "", IsNullable = false)]
    public partial class Codes
    {
        // One entry per <CustomFieldValueSet> element.
        [XmlElementAttribute("CustomFieldValueSet")]
        public List<CodesCustomFieldValueSet> CustomFieldValueSet { get; set; }
    }

    [XmlTypeAttribute(AnonymousType = true)]
    public partial class CodesCustomFieldValueSet
    {
        // One entry per <CustomFieldValue> child element.
        [XmlElementAttribute("CustomFieldValue")]
        public List<CodesCustomFieldValueSetCustomFieldValue> CustomFieldValue { get; set; }

        [XmlAttributeAttribute(AttributeName = "name")]
        public string Name { get; set; }

        [XmlAttributeAttribute(AttributeName = "label")]
        public string Label { get; set; }

        [XmlAttributeAttribute(AttributeName = "distributionType")]
        public string DistributionType { get; set; }
    }

    [XmlTypeAttribute(AnonymousType = true)]
    public partial class CodesCustomFieldValueSetCustomFieldValue
    {
        public string Value { get; set; }

        public string Description { get; set; }

        [XmlAttributeAttribute(AttributeName = "distributionValue")]
        public decimal DistributionValue { get; set; }

        [XmlAttributeAttribute(AttributeName = "splitindex")]
        public byte SplitIndex { get; set; }
    }

With this class structure in place, you can deserialize the XML with the following lines
(where txtInput.Text is a TextBox I used to hold the sample XML data).

    // Requires: using System.IO; using System.Xml.Serialization;
    XmlSerializer serializer = new XmlSerializer(typeof(Codes));
    Codes codesInput = serializer.Deserialize(new StringReader(txtInput.Text)) as Codes;
    if (codesInput != null)
    {
        // Do something with the data
    }

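Note:
Given your desired output and the structure of the sample XML you provided, you will still need to transform the information in the deserialized object into the shape you want. For that I suggest creating an additional class structure, combined with a List<T>, to hold all the information shown in your desired output.

Better still, if you control the structure of the XML, restructure it so that it is more self-explanatory than it is now. As it stands, the only link between the CustomFieldValueSet elements appears to be splitindex, which is an attribute on a child node, and that complicates things a lot.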
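The transformation described in the note might look something like the following. This is only a minimal sketch under several assumptions that the post does not confirm: every CustomFieldValue carries a splitindex attribute, rows are joined on that attribute, and the distribution type and value are taken from the set whose name is Account. SplitRow and SplitRowBuilder are hypothetical names, and Codes, ValueSet and FieldValue are trimmed stand-ins for the generated classes so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Trimmed stand-ins for the generated classes (only the members used here).
[XmlRoot("Codes")]
public class Codes
{
    [XmlElement("CustomFieldValueSet")]
    public List<ValueSet> Sets { get; set; } = new List<ValueSet>();
}

public class ValueSet
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlAttribute("distributionType")]
    public string DistributionType { get; set; }

    [XmlElement("CustomFieldValue")]
    public List<FieldValue> Values { get; set; } = new List<FieldValue>();
}

public class FieldValue
{
    [XmlAttribute("splitindex")]
    public int SplitIndex { get; set; }

    [XmlAttribute("distributionValue")]
    public decimal DistributionValue { get; set; }

    public string Value { get; set; }
}

// Hypothetical output type: one row per splitindex.
public class SplitRow
{
    public int SplitIndex { get; set; }
    public string DistributionType { get; set; }
    public decimal DistributionValue { get; set; }

    // Maps a set name (Account, Fund, ...) to its value for this split.
    public Dictionary<string, string> Fields { get; } = new Dictionary<string, string>();
}

public static class SplitRowBuilder
{
    public static List<SplitRow> Build(Codes codes)
    {
        // Keyed by splitindex so the rows come out in split order.
        var rows = new SortedDictionary<int, SplitRow>();

        foreach (var valueSet in codes.Sets)
        {
            foreach (var fieldValue in valueSet.Values)
            {
                if (!rows.TryGetValue(fieldValue.SplitIndex, out var row))
                {
                    row = new SplitRow { SplitIndex = fieldValue.SplitIndex };
                    rows.Add(fieldValue.SplitIndex, row);
                }

                // Empty or missing <Value> elements become empty strings.
                row.Fields[valueSet.Name] = fieldValue.Value ?? "";

                // Assumption: the Account set carries the distribution data.
                if (valueSet.Name == "Account")
                {
                    row.DistributionType = valueSet.DistributionType;
                    row.DistributionValue = fieldValue.DistributionValue;
                }
            }
        }

        return rows.Values.ToList();
    }
}
```

After deserializing with new XmlSerializer(typeof(Codes)), calling SplitRowBuilder.Build(codesInput) would give one SplitRow per split. Because each row keeps its own distribution type, no list has to be padded afterwards to match the length of the account list.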

Further reading on XML serialization:
MSDN: Introducing XML Serialization
XmlSerializer Class

You can use LINQ to XML.

    using System;
    using System.IO;
    using System.Linq;
    using System.Xml;
    using System.Xml.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            // This txt file contains your xml.
            var xml_sample = File.ReadAllText("xml_sample.txt");
            var doc = XDocument.Parse(xml_sample);

            // Get all <CustomFieldValueSet> elements that have the label attribute `Account`.
            // The string cast returns null instead of throwing when the attribute is absent.
            var accounts = from item in doc.Descendants("Codes").Descendants("CustomFieldValueSet")
                           where (string)item.Attribute("label") == "Account"
                           select item;

            // Create an anonymous type containing the value of the
            // distributionValue attribute and the <Value> node.
            var accountValue = from el in accounts.Descendants("CustomFieldValue")
                               let distAttribute = el.Attribute("distributionValue")
                               select new
                               {
                                   distValue = distAttribute != null ? distAttribute.Value : "0",
                                   value = el.Descendants("Value").First().Value,
                               };

            // Display stuff here just to make sure we got it right.
            accounts.ToList().ForEach(el => Console.WriteLine(el.Name + " " + (string)el.Attribute("distributionType")));
            accountValue.ToList().ForEach(el => Console.WriteLine(el.distValue + ":" + el.value));
        }
    }

You should be able to adapt these ideas to parse your XML file as needed.
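To get from these queries to rows like the ones in the question, the same LINQ to XML ideas could be extended to group every CustomFieldValue by its splitindex. The sketch below is hedged accordingly: it assumes the element and attribute names used elsewhere in this thread (CustomFieldValueSet with name and distributionType, CustomFieldValue with splitindex and distributionValue, and a Value child), and SplitTable is a hypothetical helper, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SplitTable
{
    // Returns one dictionary per splitindex. Keys are the set names
    // (Account, Fund, ...) plus distributionType and distributionValue
    // when those attributes are present.
    public static List<Dictionary<string, string>> Build(string xml)
    {
        var doc = XDocument.Parse(xml);

        // Flatten the tree into one cell per <CustomFieldValue>.
        var cells =
            from valueSet in doc.Descendants("CustomFieldValueSet")
            from fieldValue in valueSet.Elements("CustomFieldValue")
            select new
            {
                // Assumption: a missing splitindex is treated as split 0.
                Split = (int?)fieldValue.Attribute("splitindex") ?? 0,
                Name = (string)valueSet.Attribute("name") ?? "",
                Value = (string)fieldValue.Element("Value") ?? "",
                DistributionType = (string)valueSet.Attribute("distributionType"),
                DistributionValue = (string)fieldValue.Attribute("distributionValue")
            };

        // Regroup the cells so that each splitindex becomes one row.
        return cells
            .GroupBy(cell => cell.Split)
            .OrderBy(group => group.Key)
            .Select(group =>
            {
                var row = new Dictionary<string, string>();
                foreach (var cell in group)
                {
                    row[cell.Name] = cell.Value;
                    // If several sets carry these attributes, the last one wins.
                    if (cell.DistributionType != null)
                        row["distributionType"] = cell.DistributionType;
                    if (cell.DistributionValue != null)
                        row["distributionValue"] = cell.DistributionValue;
                }
                return row;
            })
            .ToList();
    }
}
```

Calling SplitTable.Build(xml_sample) would then return a list whose entries can be read as row["Account"], row["Fund"], and so on. An empty Value element simply becomes an empty string, so optional fields such as Activity need no special handling.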