How do I deserialize only part of a large XML file into C# classes?

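I have read some posts and articles on how to deserialize XML, but I still haven't figured out how to write the code for my needs, so I apologize for yet another question about deserializing XML.

I have a large (50 MB) XML file that I need to deserialize. I used xsd.exe to get the XSD schema of the document and then auto-generated a C# classes file, which I added to my project. I want to get some (not all) of the data from this XML file and put it into my SQL database.

This is the hierarchy of the file (simplified, the XSD is very large):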

    public class yml_catalog
    {
        public yml_catalogShop[] shop { /*realization*/ }
    }

    public class yml_catalogShop
    {
        public yml_catalogShopOffersOffer[][] offers { /*realization*/ }
    }

    public class yml_catalogShopOffersOffer
    {
        // here goes all the data (properties) I want to obtain
    }

Here is my code.

First approach:

    yml_catalogShopOffersOffer catalog;
    var serializer = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
    var reader = new StreamReader(@"C:\div_kid.xml");
    catalog = (yml_catalogShopOffersOffer) serializer.Deserialize(reader); // exception occurs
    reader.Close();

I get an InvalidOperationException: There is an error in XML document (3, 2).

Second approach:

    XmlSerializer ser = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
    yml_catalogShopOffersOffer result;
    using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
    {
        result = (yml_catalogShopOffersOffer)ser.Deserialize(reader); // exception occurs
    }

InvalidOperationException: There is an error in XML document (0, 0).

Third: I tried to deserialize the entire file:

    XmlSerializer ser = new XmlSerializer(typeof(yml_catalog)); // exception occurs
    yml_catalog result;
    using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
    {
        result = (yml_catalog)ser.Deserialize(reader);
    }

I get the following:

    error CS0030: The conversion of type "yml_catalogShopOffersOffer[]" into "yml_catalogShopOffersOffer" is not possible.
    error CS0029: The implicit conversion of type "yml_catalogShopOffersOffer" into "yml_catalogShopOffersOffer[]" is not possible.

So, how can I fix (or rewrite) the code to avoid the exceptions?

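Edit: when I write:

    XDocument doc = XDocument.Parse(@"C:\div_kid.xml");

an XmlException occurs: Data at the root level is invalid. Line 1, position 1.

Here is the first line of the XML file:

Edit 2: a short sample of the XML file: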

     OZON.ru ?????? "???????????????? ??????????????" http://www.ozon.ru/     base category bla bla bla // here goes all the categories       // other offers

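P.S. I have already accepted the answer (it is perfect). But now I need to find the "base category" for each offer using its categoryId. The data is hierarchical, and a base category is a category that has no "parentId" attribute. So I wrote a recursive method to find the base category, but it never finishes. It seems the algorithm is not very fast.
Here is my code (in the Main() method):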

    var doc = XDocument.Load(@"C:\div_kid.xml");
    var offers = doc.Descendants("shop").Elements("offers").Elements("offer");
    foreach (var offer in offers.Take(2))
    {
        var category = GetCategory(categoryId, doc);
        // here goes other code
    }

Helper method:

    public static string GetCategory(int categoryId, XDocument document)
    {
        var tempId = categoryId;
        var categories = document.Descendants("shop").Elements("categories").Elements("category");
        foreach (var category in categories)
        {
            if (category.Attribute("id").ToString() == categoryId.ToString())
            {
                if (category.Attributes().Count() == 1)
                {
                    return category.ToString();
                }
                tempId = Convert.ToInt32(category.Attribute("parentId"));
            }
        }
        return GetCategory(tempId, document);
    }

Can I use recursion in this case? If not, how else can I find the "base category"?

Try LINQ to XML:

    XElement result = XElement.Load(@"C:\div_kid.xml");
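Note that `Load` takes a file path (or URI), while `Parse` expects the XML text itself. Passing a path to `XDocument.Parse`, as in the question's edit, makes the parser treat the path string as markup, which is where "Data at the root level is invalid" comes from. A minimal sketch of the difference; the class and method names here are made up for illustration:

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class ParseVersusLoad
{
    // Returns true when the text itself is well-formed XML,
    // false when XDocument.Parse rejects it with an XmlException.
    public static bool ParsesAsXml(string text)
    {
        try
        {
            XDocument.Parse(text);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // XML text: accepted by Parse.
        Console.WriteLine(ParsesAsXml("<shop><name>Demo</name></shop>")); // True

        // A file path is not XML text, so Parse fails on the very first character.
        // Use XDocument.Load / XElement.Load for paths instead.
        Console.WriteLine(ParsesAsXml(@"C:\div_kid.xml")); // False
    }
}
```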

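Querying with LINQ is great, but it is admittedly a bit strange at first. You select nodes from the document using either the SQL-like query syntax or lambda expressions. Then you create anonymous objects (or use existing classes) that contain the data you are interested in.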
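To make the two styles concrete, here is a small sketch that runs the same selection both ways. The inline XML is a made-up sample (only the `shop`, `offer` and `price` element names follow this thread), and the class and method names are hypothetical:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LinqSyntaxDemo
{
    // Hypothetical sample document, not the real file from the question.
    public const string SampleXml =
        "<shop><name>Demo</name><offers>" +
        "<offer id=\"1\"><price>10</price></offer>" +
        "<offer id=\"2\"><price>25</price></offer>" +
        "</offers></shop>";

    // SQL-like query syntax.
    public static string[] PricesQuerySyntax(XElement shop)
    {
        return (from offer in shop.Descendants("offer")
                select (string)offer.Element("price")).ToArray();
    }

    // The same query written with lambda expressions (method syntax).
    public static string[] PricesMethodSyntax(XElement shop)
    {
        return shop.Descendants("offer")
                   .Select(offer => (string)offer.Element("price"))
                   .ToArray();
    }

    public static void Main()
    {
        var shop = XElement.Parse(SampleXml);
        Console.WriteLine(string.Join(",", PricesQuerySyntax(shop)));
        Console.WriteLine(string.Join(",", PricesMethodSyntax(shop)));
    }
}
```

Both forms compile to the same calls, so which one to use is mostly a matter of readability.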

It is best to see it in action.

Various examples of LINQ to XML
Simple example using xquery and lambdas
Sample denoting namespaces
There is a lot more on MSDN. Search for LINQ to XML.

Based on your sample XML and code, here is a concrete example:

    var element = XElement.Load(@"C:\div_kid.xml");
    var shopsQuery =
        from shop in element.Descendants("shop")
        select new
        {
            Name = (string) shop.Descendants("name").FirstOrDefault(),
            Company = (string) shop.Descendants("company").FirstOrDefault(),
            Categories =
                from category in shop.Descendants("category")
                select new
                {
                    Id = category.Attribute("id").Value,
                    // The cast yields null for base categories, which have no parentId;
                    // calling .Value on the missing attribute would throw instead.
                    Parent = (string) category.Attribute("parentId"),
                    Name = category.Value
                },
            Offers =
                from offer in shop.Descendants("offer")
                select new
                {
                    Price = (string) offer.Descendants("price").FirstOrDefault(),
                    Picture = (string) offer.Descendants("picture").FirstOrDefault()
                }
        };

    foreach (var shop in shopsQuery)
    {
        Console.WriteLine(shop.Name);
        Console.WriteLine(shop.Company);
        foreach (var category in shop.Categories)
        {
            Console.WriteLine(category.Name);
            Console.WriteLine(category.Id);
        }
        foreach (var offer in shop.Offers)
        {
            Console.WriteLine(offer.Price);
            Console.WriteLine(offer.Picture);
        }
    }
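`XElement.Load` reads the whole document into memory, which is usually fine for 50 MB. If that ever becomes a concern, one possible variant (a sketch, not something the question requires) is to stream the file with an `XmlReader` and materialize one `offer` element at a time via `XNode.ReadFrom`. The helper name `StreamElements` and the inline sample are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class OfferStreaming
{
    // Lazily yields every element with the given name, one at a time,
    // so only the current element is held in memory as an XElement.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader on the
                    // node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    public static void Main()
    {
        // Hypothetical sample; for the real file pass File.OpenText(path) instead.
        var xml = "<shop><offers>" +
                  "<offer><price>10</price></offer>" +
                  "<offer><price>25</price></offer>" +
                  "</offers></shop>";

        foreach (var offer in StreamElements(new StringReader(xml), "offer"))
        {
            Console.WriteLine((string)offer.Element("price"));
        }
    }
}
```

Each yielded `XElement` can be queried with the same LINQ to XML calls as above, or handed to an `XmlSerializer` through `offer.CreateReader()` if you prefer the generated classes.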

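As a bonus: here is how to deserialize the category tree from the flat category elements. You need a proper class to hold them, because the list of children must have a type: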

    class Category
    {
        public int Id { get; set; }
        public int? ParentId { get; set; }
        public List<Category> Children { get; set; }

        // This category together with everything below it in the tree.
        public IEnumerable<Category> Descendants
        {
            get
            {
                return (from child in Children select child.Descendants)
                    .SelectMany(x => x)
                    .Concat(new Category[] { this });
            }
        }
    }

To create a list containing all the distinct categories in the document:

    var categories = (from category in element.Descendants("category")
                      orderby int.Parse(category.Attribute("id").Value)
                      select new Category()
                      {
                          Id = int.Parse(category.Attribute("id").Value),
                          ParentId = category.Attribute("parentId") == null
                              ? null as int?
                              : int.Parse(category.Attribute("parentId").Value),
                          Children = new List<Category>()
                      })
                     // Note: Distinct() compares by reference unless Category overrides Equals/GetHashCode.
                     .Distinct()
                     .ToList();

Then organize them into a tree (borrowed from "flat list to hierarchy"):

    var lookup = categories.ToLookup(cat => cat.ParentId);
    foreach (var category in categories)
    {
        category.Children = lookup[category.Id].ToList();
    }
    var rootCategories = lookup[null].ToList();

To find the root that contains theCategory:

    var root = (from cat in rootCategories
                where cat.Descendants.Contains(theCategory)
                select cat).FirstOrDefault();
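As for the recursive `GetCategory` in the question: `category.Attribute("id").ToString()` returns the whole attribute (for example `id="5"`), not its value, so the comparison with `categoryId.ToString()` never matches, `tempId` never changes, and the method calls itself with the same argument forever. If all you need is the base category id, an alternative to building the full tree is to walk up the parent chain in a loop. This is only a sketch under assumptions: the names `BuildParentMap` and `FindBaseCategoryId` are made up, category ids are assumed to be unique, and every `parentId` is assumed to refer to an existing category:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class BaseCategoryFinder
{
    // Maps each category id to its parent id (null for a base category).
    // Built once, so each lookup afterwards does not rescan the document.
    public static Dictionary<int, int?> BuildParentMap(XContainer document)
    {
        return document.Descendants("category").ToDictionary(
            c => (int)c.Attribute("id"),
            c => (int?)c.Attribute("parentId")); // null when the attribute is missing
    }

    // Follows parentId links upward until a category without a parent is reached.
    public static int FindBaseCategoryId(IReadOnlyDictionary<int, int?> parentOf, int categoryId)
    {
        var visited = new HashSet<int>();
        var current = categoryId;
        while (parentOf[current] is int parent)
        {
            // Guard against malformed data where the parent links form a loop.
            if (!visited.Add(current))
                throw new InvalidOperationException("Cycle detected in category hierarchy.");
            current = parent;
        }
        return current;
    }

    public static void Main()
    {
        // Hypothetical sample data in the shape described in the question.
        var doc = XDocument.Parse(
            "<shop><categories>" +
            "<category id=\"1\">Books</category>" +
            "<category id=\"2\" parentId=\"1\">Fiction</category>" +
            "<category id=\"3\" parentId=\"2\">Fantasy</category>" +
            "</categories></shop>");

        var parentOf = BuildParentMap(doc);
        Console.WriteLine(FindBaseCategoryId(parentOf, 3)); // 1
    }
}
```

Recursion would work just as well here once the comparison uses `.Value`; the loop simply avoids deep call stacks and makes the cycle check easy to add.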
