var someHugeListOfIds = myData.Pages
    .Select(p => p.ParentSessionId);
var someSessions = myData.Sessions
    .Where(s => someHugeListOfIds.Contains(s.Id));
foreach (var session in someSessions)
    DoSomething(session);
I noticed that once someHugeListOfIds grew past about one hundred thousand items (possibly fewer, but I didn't test), the foreach loop became incredibly slow. The solution? Convert that huge list of Ids into a HashSet<T>. You can do this by hand, or you can write an extension method like the one below:
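// Extension methods must be declared inside a static class.
// Note: System.Linq ships its own ToHashSet on .NET Framework 4.7.2+ and .NET Core 2.0+,
// so this helper is only needed on older versions.
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> enumerable)
{
    // The HashSet<T> constructor copies the sequence and ignores duplicates,
    // so no explicit Contains check is needed before adding.
    return new HashSet<T>(enumerable);
}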
Now I can rewrite the query as shown below, and instead of taking tens of seconds, it completes in a few milliseconds!
var someHugeListOfIds = myData.Pages
    .Select(p => p.ParentSessionId)
    .ToHashSet();
var someSessions = myData.Sessions
    .Where(s => someHugeListOfIds.Contains(s.Id));
foreach (var session in someSessions)
    DoSomething(session);
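My best guess at why, assuming myData is plain in-memory collections (LINQ to Objects): without ToHashSet, someHugeListOfIds is just a lazy sequence, so every Contains call walks it from the start, and the total work grows with (number of sessions) x (number of ids). A HashSet<T> answers each lookup in roughly constant time. Here is a rough sketch you can run to see the gap yourself. The names ContainsDemo and CountMatches and the generated data are made up for illustration, and the timings will vary by machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class ContainsDemo
{
    // Counts how many session ids are found in the id source.
    // Enumerable.Contains scans a lazy sequence linearly, but delegates to
    // the collection's own Contains when the source is an ICollection<T>
    // such as HashSet<T>.
    public static int CountMatches(IEnumerable<int> ids, IEnumerable<int> sessionIds)
    {
        return sessionIds.Count(id => ids.Contains(id));
    }

    public static void Main()
    {
        // Hypothetical data: 10,000 even "parent session ids" and
        // 10,000 session ids, about half of which overlap.
        IEnumerable<int> pageParentIds = Enumerable.Range(0, 10000).Select(i => i * 2);
        List<int> sessionIds = Enumerable.Range(0, 10000).ToList();

        var timer = Stopwatch.StartNew();
        int viaLazySequence = CountMatches(pageParentIds, sessionIds);
        Console.WriteLine("lazy sequence: " + timer.ElapsedMilliseconds + " ms");

        timer.Restart();
        var idSet = new HashSet<int>(pageParentIds);
        int viaHashSet = CountMatches(idSet, sessionIds);
        Console.WriteLine("HashSet:       " + timer.ElapsedMilliseconds + " ms");

        // Both approaches must agree on the result; only the cost differs.
        Console.WriteLine("same result:   " + (viaLazySequence == viaHashSet));
    }
}
```

Calling ToList() instead of ToHashSet() would at least stop the Select from being re-run on every lookup, but each Contains would still be a linear scan, so the HashSet is the better fit here.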
Wicked. HashSet is hugely faster.