Automatically compiling LINQ queries

Posted 2024-07-29 23:11:16

We've found that precompiling our LINQ queries is much, much faster than having them compiled on every execution, so we would like to start using compiled queries. The problem is that it makes the code harder to read, because the actual syntax of the query lives in some other file, away from where it's being used.

It occurred to me that it might be possible to write a method (or extension method) that uses reflection to determine what queries are being passed in and cache the compiled versions automatically for use in the future.

var foo = (from f in db.Foo where f.ix == bar select f).Cached();

Cached() would have to reflect on the query object passed in and determine the table(s) selected from and the parameter types of the query. Obviously, reflection is a bit slow, so it might be better to use names for the cached objects (but you'd still have to use reflection the first time to compile the query).

var foo = (from f in db.Foo where f.ix == bar select f).Cached("Foo.ix");

Does anyone have any experience with doing this, or know if it's even possible?

UPDATE: For those who have not seen it, you can compile LINQ to SQL queries with the following code:

public static class MyCompiledQueries
{
    public static Func<DataContext, int, IQueryable<Foo>> getFoo =
        CompiledQuery.Compile(
            (DataContext db, int ixFoo) => (from f in db.Foo
                                            where f.ix == ixFoo
                                            select f)
        );
}

What I am trying to do is have a cache of these Func<> objects that I can call into after automatically compiling the query the first time around.

Comments (4)

策马西风 2024-08-05 23:11:21

For posterity: .NET Framework 4.5 will do this by default (according to a slide in a presentation I just watched).

吹泡泡o 2024-08-05 23:11:20

Since nobody else has attempted this, I'll give it a shot. Maybe we can work it out together. Here is my attempt.

I set this up using a dictionary. I am not using a DataContext here, although I believe adding one would be trivial.

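using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class CompiledExtensions
{
    // Compiled delegates keyed by the caller-supplied name.
    private static readonly Dictionary<string, object> _dictionary = new Dictionary<string, object>();

    public static IEnumerable<TResult> Cache<TArg, TResult>(
        this IEnumerable<TArg> list,
        string name,
        Expression<Func<IEnumerable<TArg>, IEnumerable<TResult>>> expression)
    {
        Func<IEnumerable<TArg>, IEnumerable<TResult>> compiled;

        // The dictionary is static and shared, so guard it against concurrent callers.
        lock (_dictionary)
        {
            if (_dictionary.TryGetValue(name, out object cached))
            {
                // Cache hit: the expression argument is ignored; only the name matters.
                compiled = (Func<IEnumerable<TArg>, IEnumerable<TResult>>)cached;
            }
            else
            {
                // Cache miss: compile once and remember the delegate under this name.
                compiled = expression.Compile();
                _dictionary.Add(name, compiled);
            }
        }

        return compiled(list);
    }
}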

This now allows me to write:

  List<string> list = typeof(string).GetMethods().Select(x => x.Name).ToList();

  IEnumerable<string> results = list.Cache("To",x => x.Where( y => y.Contains("To")));
  IEnumerable<string> cachedResult = list.Cache("To", x => x.Where(y => y.Contains("To")));
  IEnumerable<string> anotherCachedResult = list.Cache("To", x => from item in x where item.Contains("To") select item);

I'm looking forward to some discussion to develop this idea further.

囚我心虐我身 2024-08-05 23:11:19

I had to rescue a project more than 15 years old that was built with LINQ to SQL and was too CPU-hungry.

Benchmarking showed that compiled queries are 7x faster for complex queries and 2x faster for simple ones (the time spent running the query itself is negligible here; the gain is in the throughput of query compilation).

Caching is NOT done automatically by the .NET Framework, whatever the version. That only happens for Entity Framework, not for LINQ to SQL, and the two are different technologies.

Using compiled queries is tricky, so here are two important points:

  • You MUST compile the query including its materialization call (FirstOrDefault/First/Any/Take/Skip/ToList); otherwise you risk pulling your whole database into memory. See: LINQ to SQL *compiled* queries and when they execute
  • You cannot iterate TWICE over a compiled query's result (if it is an IQueryable), but this is essentially solved once you take the previous point into account.

With that in mind, I came up with this cache class. The static approach proposed in other comments has maintainability drawbacks (mainly, it is less readable), and it makes migrating an existing huge codebase harder.

                LinqQueryCache<VCDataClasses>
                    .KeyFromQuery()
                    .Cache(
                        dcs.CurrentContext, 
                        (ctx, courseId) => 
                            (from p in ctx.COURSEs where p.COURSEID == courseId select p).FirstOrDefault(), 
                        5);

In very tight loops, using a cache key derived from the call site instead of from the query text yielded about 10% better performance:

                LinqQueryCache<VCDataClasses>
                    .KeyFromStack()
                    .Cache(
                        dcs.CurrentContext, 
                        (ctx, courseId) => 
                            (from p in ctx.COURSEs where p.COURSEID == courseId select p).FirstOrDefault(), 
                        5);
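
The call-site key can be illustrated in isolation. The following is a minimal sketch, not the author's code: `CallSiteKey` and `CallSiteKeyDemo` are hypothetical names, and the key is left as a plain string rather than hashed. It shows that the compiler substitutes the file path and line number at each call site, so repeated calls from the same line share one key without any stack walk or expression-to-string conversion at run time.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper for illustration only.
public static class CallSiteKey
{
    // The compiler fills in both optional arguments at every call site.
    public static string Create(
        [CallerFilePath] string sourceFilePath = "",
        [CallerLineNumber] int sourceLineNumber = 0)
    {
        return sourceFilePath + "::" + sourceLineNumber;
    }
}

public static class CallSiteKeyDemo
{
    public static string[] KeysFromLoop()
    {
        var keys = new string[3];
        for (int i = 0; i < keys.Length; i++)
        {
            // Same source line on every iteration, so the same key each time.
            keys[i] = CallSiteKey.Create();
        }
        return keys;
    }

    public static string KeyFromElsewhere()
    {
        // A different source line, so a different key.
        return CallSiteKey.Create();
    }
}
```

One consequence of this design: two different queries written on the same source line would receive the same key, so each cached query needs a line of its own.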

Here is the code for LinqQueryCache. As a safety measure, the cache refuses to compile a query whose result is still an IQueryable (the check runs only while a debugger is attached).

using System;
using System.Collections.Concurrent;
using System.Data.Linq;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;

public class LinqQueryCache<TContext>
    where TContext : DataContext
{
    // Compiled delegates shared by all instances for this context type.
    protected static readonly ConcurrentDictionary<string, Delegate> CacheValue = new ConcurrentDictionary<string, Delegate>();

    protected string KeyValue = null;

    // The key can be assigned only once per instance.
    protected string Key
    {
        get => this.KeyValue;

        set
        {
            if (this.KeyValue != null)
            {
                throw new Exception("This object cannot be reused for another key.");
            }

            this.KeyValue = value;
        }
    }

    private LinqQueryCache(string key)
    {
        this.Key = key;
    }

    // Key derived from the call site; the compiler fills in both arguments.
    public static LinqQueryCache<TContext> KeyFromStack(
        [System.Runtime.CompilerServices.CallerFilePath] string sourceFilePath = "",
        [System.Runtime.CompilerServices.CallerLineNumber] int sourceLineNumber = 0)
    {
        // Encryption.GetMd5 is the author's own helper; it is not shown in this post.
        return new LinqQueryCache<TContext>(Encryption.GetMd5(sourceFilePath + "::" + sourceLineNumber));
    }

    // Key left unset here; Cache() fills it in from the query text.
    public static LinqQueryCache<TContext> KeyFromQuery()
    {
        return new LinqQueryCache<TContext>(null);
    }

    public T Cache<T>(TContext db, Expression<Func<TContext, T>> q)
    {
        // Reject results that are still an IQueryable (checked only under a debugger).
        if (Debugger.IsAttached && typeof(IQueryable).IsAssignableFrom(typeof(T)))
        {
            throw new Exception("Cannot compile queries with an IQueryable result.");
        }

        if (this.Key == null)
        {
            this.Key = q.ToString();
        }

        // Compile on first use; later calls reuse the cached delegate.
        if (!CacheValue.TryGetValue(this.Key, out var result))
        {
            result = CompiledQuery.Compile(q);
            CacheValue.TryAdd(this.Key, result);
        }

        return ((Func<TContext, T>)result)(db);
    }

    public T Cache<T, TArg1>(TContext db, Expression<Func<TContext, TArg1, T>> q, TArg1 param1)
    {
        // Reject results that are still an IQueryable (checked only under a debugger).
        if (Debugger.IsAttached && typeof(IQueryable).IsAssignableFrom(typeof(T)))
        {
            throw new Exception("Cannot compile queries with an IQueryable result.");
        }

        if (this.Key == null)
        {
            this.Key = q.ToString();
        }

        // Compile on first use; later calls reuse the cached delegate.
        if (!CacheValue.TryGetValue(this.Key, out var result))
        {
            result = CompiledQuery.Compile(q);
            CacheValue.TryAdd(this.Key, result);
        }

        return ((Func<TContext, TArg1, T>)result)(db, param1);
    }
}

卸妝后依然美 2024-08-05 23:11:19

You can't invoke extension methods on anonymous lambda expressions, so you'll want to use a Cache class. To cache a query properly you'll also need to 'lift' any parameters (including your DataContext) into parameters of the lambda expression. This results in very verbose usage like:

var results = QueryCache.Cache((MyModelDataContext db) => 
    from x in db.Foo where !x.IsDisabled select x);

In order to clean that up, we can instantiate a QueryCache on a per-context basis if we make it non-static:

public class FooRepository
{
    readonly QueryCache<MyModelDataContext> q = 
        new QueryCache<MyModelDataContext>(new MyModelDataContext());
}

Then we can write a Cache method that will enable us to write the following:

var results = q.Cache(db => from x in db.Foo where !x.IsDisabled select x);

Any arguments in your query will also need to be lifted:

var results = q.Cache((db, bar) => 
    from x in db.Foo where x.id != bar select x, localBarValue);

Here's the QueryCache implementation I mocked up:

public class QueryCache<TContext> where TContext : DataContext
{
    private readonly TContext db;

    public QueryCache(TContext db)
    {
        this.db = db;
    }

    // Compiled delegates keyed by the text of the query expression.
    private static readonly Dictionary<string, Delegate> cache = new Dictionary<string, Delegate>();

    public IQueryable<T> Cache<T>(Expression<Func<TContext, IQueryable<T>>> q)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache)
        {
            if (!cache.TryGetValue(key, out result))
            {
                // First use of this query text: compile it and remember the delegate.
                result = cache[key] = CompiledQuery.Compile(q);
            }
        }
        return ((Func<TContext, IQueryable<T>>)result)(db);
    }

    public IQueryable<T> Cache<T, TArg1>(Expression<Func<TContext, TArg1, IQueryable<T>>> q, TArg1 param1)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache)
        {
            if (!cache.TryGetValue(key, out result))
            {
                result = cache[key] = CompiledQuery.Compile(q);
            }
        }
        return ((Func<TContext, TArg1, IQueryable<T>>)result)(db, param1);
    }

    public IQueryable<T> Cache<T, TArg1, TArg2>(Expression<Func<TContext, TArg1, TArg2, IQueryable<T>>> q, TArg1 param1, TArg2 param2)
    {
        string key = q.ToString();
        Delegate result;
        lock (cache)
        {
            if (!cache.TryGetValue(key, out result))
            {
                result = cache[key] = CompiledQuery.Compile(q);
            }
        }
        return ((Func<TContext, TArg1, TArg2, IQueryable<T>>)result)(db, param1, param2);
    }
}

This can be extended to support more arguments. The great bit is that by passing the parameter values into the Cache method itself, you get implicit typing for the lambda expression.
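
Why lifting matters can be seen without a database. The sketch below is an in-memory analogue under stated assumptions: it uses Expression.Compile rather than CompiledQuery.Compile, and `ExpressionCache` and `Run` are hypothetical names. Because the argument is a lambda parameter rather than a captured local, the expression text is identical on every call, so one compiled delegate can serve every argument value. Prefixing the key with the delegate type is an addition of this sketch, meant to keep apart lambdas that have the same text but different types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Hypothetical in-memory analogue of the query cache, for illustration only.
public static class ExpressionCache
{
    private static readonly ConcurrentDictionary<string, Delegate> Compiled =
        new ConcurrentDictionary<string, Delegate>();

    // Number of distinct expressions compiled so far.
    public static int Count => Compiled.Count;

    public static TResult Run<TArg, TResult>(Expression<Func<TArg, TResult>> q, TArg arg)
    {
        // Key = delegate type + expression text. The argument value is not part
        // of the text because it is passed as a lambda parameter.
        string key = typeof(Func<TArg, TResult>).FullName + "|" + q.ToString();

        if (!Compiled.TryGetValue(key, out Delegate compiled))
        {
            // First use: compile the expression tree and remember the delegate.
            compiled = q.Compile();
            Compiled.TryAdd(key, compiled);
        }

        return ((Func<TArg, TResult>)compiled)(arg);
    }
}
```

Calling `ExpressionCache.Run((int x) => x * 2, 21)` and then `ExpressionCache.Run((int x) => x * 2, 5)` compiles the lambda once and reuses the delegate for the second call.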

EDIT: Note that you cannot apply new operators to the compiled queries. Specifically, you cannot do something like this:

var allresults = q.Cache(db => from f in db.Foo select f);
var page = allresults.Skip(currentPage * pageSize).Take(pageSize);

So if you plan on paging a query, you need to do it inside the compiled operation instead of afterwards. This not only avoids an exception, it is also in keeping with the whole point of Skip/Take (to avoid returning all rows from the database). This pattern would work:

public IQueryable<Foo> GetFooPaged(int currentPage, int pageSize)
{
    return q.Cache((db, cur, size) => (from f in db.Foo select f)
        .Skip(cur*size).Take(size), currentPage, pageSize);
}

Another approach to paging would be to return a Func:

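public Func<int, int, IQueryable<Foo>> GetPageableFoo()
{
    // c and s exist only inside the query lambda; the values passed to Cache
    // are the outer lambda's cur and size.
    return (cur, size) => q.Cache((db, c, s) => (from f in db.Foo select f)
        .Skip(c*s).Take(s), cur, size);
}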

This pattern is used like:

var results = GetPageableFoo()(currentPage, pageSize);