I have never used LINQ, but now I have to, and I'm having a bit of trouble with it. Below is an example of how I would normally compare two lists and retrieve the bad and good price records. How can I do the same thing with LINQ?
static void FindItemswithInvalidPackages()
{
    List<Item> items = new List<Item>()
    {
        new Item { Id = 1, Name = "Item 01", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } },
        new Item { Id = 2, Name = "Item 02", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } },
        new Item { Id = 3, Name = "Item 03", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } },
        new Item { Id = 4, Name = "Item 04", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } },
        new Item { Id = 5, Name = "Item 04", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } },
        new Item { Id = 6, Name = "Item 06", Packages = new List<Package> { new Package { Id = 1, Name = "SINGLE" }, new Package { Id = 2, Name = "PACK" } } }
    };

    List<RetailPrice> prices = new List<RetailPrice>()
    {
        new RetailPrice { Item = 1, Package = 1, Price = 1 },
        new RetailPrice { Item = 2, Package = 2, Price = 2 },
        new RetailPrice { Item = 3, Package = 1, Price = 3 },
        new RetailPrice { Item = 4, Package = 2, Price = 4 },
        new RetailPrice { Item = 5, Package = 3, Price = 5 }, // Bad one
        new RetailPrice { Item = 6, Package = 2, Price = 6 }
    };

    // How would I do this in linq?
    HashSet<string> itemPackages = new HashSet<string>();
    foreach (var item in items)
    {
        foreach (var package in item.Packages)
        {
            itemPackages.Add($"{item.Id}|{package.Id}");
        }
    }

    List<RetailPrice> badPrices = new List<RetailPrice>();
    List<RetailPrice> goodPrices = new List<RetailPrice>();
    foreach (var price in prices)
    {
        if (!itemPackages.Contains($"{price.Item}|{price.Package}"))
        {
            badPrices.Add(price);
        }
        else
        {
            goodPrices.Add(price);
        }
    }
}
If your collections are large, using a HashSet to speed up the matching process is a good idea. It avoids the inefficiency of the nested loop that a .Where(... .Any()) solution performs, reducing an operation with O(N × M) complexity to one closer to O(N + M): one pass through the lookup table to build the hash table, and one pass through the data table to check for matches. Each hash table lookup is O(1) on average, assuming few collisions. Big savings.
The following builds a hash set of Item.Id and Package.Id pairs (as an anonymous type) and uses that hash set to efficiently check each price object.
var itemPackages = items
    .SelectMany(item =>
        item.Packages
            .Select(package => new { Item = item.Id, Package = package.Id })
    )
    .ToHashSet();

List<RetailPrice> goodPrices = prices
    .Where(price => itemPackages.Contains(new { price.Item, price.Package }))
    .ToList();

List<RetailPrice> badPrices = prices
    .Where(price => !itemPackages.Contains(new { price.Item, price.Package }))
    .ToList();
Note that the Item-Id/Package-Id pairs above are represented as an anonymous type instead of a formatted string. (A string would still work, but adds extra overhead.)
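The reason an anonymous type works as a hash set key is that the compiler generates value-based Equals and GetHashCode for it, so a freshly created pair with the same property names, types, order, and values matches one already in the set. A minimal sketch of that behavior, using made-up values rather than the question's classes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two anonymous objects with the same property names, types, and order
// share one compiler-generated type with value-based Equals/GetHashCode.
var a = new { Item = 1, Package = 2 };
var b = new { Item = 1, Package = 2 };

bool sameValue = a.Equals(b);               // compares property values
bool sameReference = ReferenceEquals(a, b); // they are still distinct objects

// Because hashing is value-based, a lookup with a fresh instance succeeds.
var set = new[] { a }.ToHashSet();
bool found = set.Contains(new { Item = 1, Package = 2 });
bool missing = set.Contains(new { Item = 5, Package = 3 });

Console.WriteLine($"{sameValue} {sameReference} {found} {missing}");
// prints "True False True False"
```

The property names matter here: new { price.Item, price.Package } infers the names Item and Package, which is why it lines up with new { Item = item.Id, Package = package.Id }.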
The above is LINQ method syntax. The same can also be performed using (mostly) LINQ query syntax as follows.
var itemPackages = (
    from item in items
    from package in item.Packages
    select new { Item = item.Id, Package = package.Id }
).ToHashSet();

var goodPrices = (
    from price in prices
    where itemPackages.Contains(new { price.Item, price.Package })
    select price
).ToList();

var badPrices = (
    from price in prices
    where !itemPackages.Contains(new { price.Item, price.Package })
    select price
).ToList();
LINQ method syntax is good for most simple operations involving the manipulation of one or a few collections. LINQ query syntax may be more intuitive to some programmers, especially when joining multiple data sources. Some operations (like .ToList()) can only be done using LINQ method syntax, so it is not uncommon to find a mix (like in the above code). It is good to learn both and then use whichever syntax you find easier to write and more readable for others in a given situation.
As you start using LINQ, you will often be working with anonymous types that are implicitly defined with an untyped new { ... } expression. You may also often deal with complex nested generic types (even if you don't know it). To allow convenient declaration of variables to hold these types, C# introduced the var keyword, which infers the variable's type from the assigned value. An anonymous type has no name you can write in source code, so a declaration like HashSet<anonymous type: int Item, int Package> itemPackages = ... is not possible; you simply use var itemPackages = ... instead. The var keyword can also be used in place of simpler known types like int or List<RetailPrice>. Whether or not you choose to use var for these cases is a question of style (readability vs simplicity).
Results:
Good prices 1:
{"Item":1,"Price":1,"Package":1}
{"Item":2,"Price":2,"Package":2}
{"Item":3,"Price":3,"Package":1}
{"Item":4,"Price":4,"Package":2}
{"Item":6,"Price":6,"Package":2}
Bad prices 1:
{"Item":5,"Price":5,"Package":3}
See this .NET Fiddle for a demo.
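As a side note, the good and bad lists can also be produced in a single pass over prices by keying each price on the result of the Contains check with ToLookup. This is only a sketch: to keep it self-contained it uses value tuples as stand-ins for the Item and RetailPrice classes (an assumption, not the question's types) and rebuilds the same six items with packages 1 and 2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data (assumption): every item 1..6 has packages 1 (SINGLE) and 2 (PACK).
var validPairs = Enumerable.Range(1, 6)
    .SelectMany(itemId => new[] { (Item: itemId, Package: 1), (Item: itemId, Package: 2) })
    .ToHashSet();

// Stand-in for List<RetailPrice>: (Item, Package, Price) tuples.
var prices = new List<(int Item, int Package, decimal Price)>
{
    (1, 1, 1m), (2, 2, 2m), (3, 1, 3m), (4, 2, 4m), (5, 3, 5m), (6, 2, 6m)
};

// One pass: group each price under true (valid pair) or false (invalid pair).
var byValidity = prices.ToLookup(p => validPairs.Contains((p.Item, p.Package)));

var goodPrices = byValidity[true].ToList();
var badPrices = byValidity[false].ToList();

Console.WriteLine($"good: {goodPrices.Count}, bad: {badPrices.Count}");
// prints "good: 5, bad: 1"
```

A lookup returns an empty sequence for a key that never occurred, so badPrices is simply empty when every price is valid. Whether this reads better than two Where calls is a matter of taste; for lists this size the difference in speed is negligible.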
You can filter the prices list down to the entries for which an item exists whose Id matches the price's Item and which has a package whose Id matches the price's Package.
The badPrices criterion is the negation of the goodPrices criterion.
goodPrices = prices
    .Where(price => items.Any(item =>
        price.Item == item.Id
        && item.Packages.Any(package => price.Package == package.Id)))
    .ToList();

badPrices = prices
    .Where(price => !items.Any(item =>
        price.Item == item.Id
        && item.Packages.Any(package => price.Package == package.Id)))
    .ToList();
Demo @ .NET Fiddle