Comparing two lists that contain a large number of objects (part 3): "the objects are of different types"

Posted on 2024-11-08 23:00:02

How can I speed up this LINQ query?

It takes a long time, and when I put a lot of objects in the lists I get a memory exception.

List<DirectoryInfo> directoriesThatWillBeCreated = new List<DirectoryInfo>();
// some code to fill the list
// ..
// ..

List<FileInfo> FilesThatWillBeCopied = new List<FileInfo>();
// some code to fill the list
//....

directoriesThatWillBeCreated = (from a in FilesThatWillBeCopied
                                from b in directoriesThatWillBeCreated
                                where a.FullName.Contains(b.FullName)
                                select b).ToList();

I was hoping to do something like the previous solution, but I don't know how to do that when dealing with objects of different types. Do I have to create a new class, convert all the FileInfo and DirectoryInfo objects to that class, and then perform the query? Moreover, the FileInfo and DirectoryInfo classes are sealed, so I cannot inherit from them; I would have to create a new class, and that would not be too efficient. Still, it would be more efficient than this query, because this query takes forever.

初见你 2024-11-15 23:00:02

One thing you could do is change the Contains to a StartsWith. StartsWith will fail faster in the event of a failed match.

directoriesThatWillBeCreated = (from a in FilesThatWillBeCopied
                                from b in directoriesThatWillBeCreated
                                where a.FullName.StartsWith(b.FullName)
                                select b).ToList();

This isn't a complete solution, though. If FilesThatWillBeCopied has M items and directoriesThatWillBeCreated has N items, the query still performs M×N string comparisons.

Another Option

Another optimization to try: iterate through directoriesThatWillBeCreated first, then keep those that match any FileInfo in FilesThatWillBeCopied. By checking whether any file matches, you can stop testing files as soon as a match is found. That could be done like this: (warning, notepad code follows)

directoriesThatWillBeCreated = directoriesThatWillBeCreated
    // Where (not Select) keeps the directories themselves; Any stops at the first matching file
    .Where(b => FilesThatWillBeCopied
        .Any(a => a.FullName.StartsWith(b.FullName)))
    .ToList();
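One caveat with a bare prefix test: a directory path such as `...\foo` is also a prefix of a file under `...\foobar`. The following is a minimal sketch of a stricter check, not part of the original answer. The helper name `IsUnder` is hypothetical, and the case-insensitive ordinal comparison is an assumption that suits Windows-style paths.

```csharp
using System;
using System.IO;

// Hypothetical helper: true when filePath lies somewhere below dirPath.
// Appending a separator stops "foo" from matching a sibling named "foobar".
static bool IsUnder(string filePath, string dirPath)
{
    string prefix = dirPath.TrimEnd(Path.DirectorySeparatorChar)
                    + Path.DirectorySeparatorChar;
    // Assumption: case-insensitive ordinal comparison is what the file system needs.
    return filePath.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
}

// Hypothetical sample paths built under the temp directory.
string root = Path.Combine(Path.GetTempPath(), "demo");
string fooDir = Path.Combine(root, "foo");

Console.WriteLine(IsUnder(Path.Combine(root, "foo", "a.txt"), fooDir));    // True
Console.WriteLine(IsUnder(Path.Combine(root, "foobar", "a.txt"), fooDir)); // False
```

Passing an explicit `StringComparison` also avoids the culture-sensitive comparison that `StartsWith(string)` uses by default.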


毅然前行 2024-11-15 23:00:02

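It's slow because the code does a linear search of the directory list for each file. Try this:

// Group the files by their immediate parent directory: one group per distinct directory path
var dirlist = FilesThatWillBeCopied
    .Select(f => Directory.GetParent(f.FullName))
    .GroupBy(d => d.FullName);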

You may need to play with the syntax a little bit but hopefully you see the point.
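One way the idea above might be carried through is to collect the distinct parent paths into a hash set and then filter the directory list with constant-time lookups. This is only a sketch: the sample data is hypothetical, `FileInfo.DirectoryName` is swapped in for `Directory.GetParent`, and the case-insensitive comparer is an assumption. Unlike the original query, it only matches a file's immediate parent directory, not every ancestor.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical sample data; neither the files nor the directories need to exist on disk.
string root = Path.Combine(Path.GetTempPath(), "demo");
var FilesThatWillBeCopied = new List<FileInfo>
{
    new FileInfo(Path.Combine(root, "a", "1.txt")),
    new FileInfo(Path.Combine(root, "a", "2.txt")),
};
var directoriesThatWillBeCreated = new List<DirectoryInfo>
{
    new DirectoryInfo(Path.Combine(root, "a")),
    new DirectoryInfo(Path.Combine(root, "b")),
};

// One pass over the files: collect each distinct parent directory path.
var parentPaths = new HashSet<string>(
    FilesThatWillBeCopied.Select(f => f.DirectoryName),
    StringComparer.OrdinalIgnoreCase);

// One pass over the directories: each lookup is a hash probe, not a scan of the file list.
var needed = directoriesThatWillBeCreated
    .Where(d => parentPaths.Contains(d.FullName))
    .ToList();

Console.WriteLine(needed.Count); // 1: only the "a" directory holds files
```

The work drops from M×N comparisons to roughly M + N hash operations.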


三岁铭 2024-11-15 23:00:02

I would suggest using HashSet<DirectoryInfo> for comparisons, but unfortunately DirectoryInfo doesn't implement proper equality comparison, so strings will have to do. (Another option would be to implement your own IEqualityComparer<DirectoryInfo>, which is the interface HashSet<T> accepts.) Also, you should use StringComparer.InvariantCultureIgnoreCase on the names unless you are sure that both collections use the same casing.

var dirs = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase);
// fill dirs

var files = new List<FileInfo>();
// fill files

var result = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase);

foreach (var file in files)
{
    var dir = file.Directory;
    while (dir != null && !result.Contains(dir.FullName))
    {
        if (dirs.Contains(dir.FullName))
            result.Add(dir.FullName);
        dir = dir.Parent;
    }
}

This solution doesn't use LINQ at all, but that's often the case when you're after performance and the most straightforward LINQ solution is too slow.
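Since the question wants a List<DirectoryInfo> back rather than a set of strings, the loop above could be adapted along these lines. This is a sketch, not the answerer's code: `FilterDirectories` is a hypothetical name, a dictionary keyed by path replaces the `dirs` set, and a separate `visited` set is used so each ancestor is walked only once. It assumes the directory paths carry no trailing separator.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: returns the directories that are an ancestor of at least one file.
static List<DirectoryInfo> FilterDirectories(List<DirectoryInfo> directories, List<FileInfo> files)
{
    // Index the candidate directories by full path for constant-time lookup.
    var byPath = new Dictionary<string, DirectoryInfo>(StringComparer.InvariantCultureIgnoreCase);
    foreach (var d in directories)
        byPath[d.FullName] = d;

    var visited = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase);
    var result = new List<DirectoryInfo>();

    foreach (var file in files)
    {
        // Walk up from the file's directory; stop at the root or at a path already handled.
        var dir = file.Directory;
        while (dir != null && visited.Add(dir.FullName))
        {
            if (byPath.TryGetValue(dir.FullName, out var match))
                result.Add(match);
            dir = dir.Parent;
        }
    }
    return result;
}

// Hypothetical sample data; nothing here has to exist on disk.
string root = Path.Combine(Path.GetTempPath(), "demo");
var directories = new List<DirectoryInfo>
{
    new DirectoryInfo(Path.Combine(root, "a")),
    new DirectoryInfo(Path.Combine(root, "a", "b")),
    new DirectoryInfo(Path.Combine(root, "c")),
};
var files = new List<FileInfo>
{
    new FileInfo(Path.Combine(root, "a", "b", "1.txt")),
    new FileInfo(Path.Combine(root, "a", "b", "2.txt")),
};

var kept = FilterDirectories(directories, files);
Console.WriteLine(kept.Count); // 2: "a\b" and its ancestor "a"; "c" holds no files
```

Stopping at an already visited path is safe because every ancestor of a visited directory was visited during the same earlier walk.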
