Entity Framework Include OrderBy random generates duplicate data

Posted 2024-12-13 01:09:11 · 5497 characters · 0 views · 0 comments

When I retrieve a list of items from a database, including some children (via .Include), and order them randomly, EF gives me an unexpected result: it creates/clones additional items.

To explain myself better, I've created a small and simple EF Code First project that reproduces the problem.
First I shall give you the code for this project.

The project

Create a basic MVC3 project and add the EntityFramework.SqlServerCompact package via Nuget.
That adds the latest versions of the following packages:

EntityFramework v4.3.0
SqlServerCompact v4.0.8482.1
EntityFramework.SqlServerCompact v4.1.8482.2
WebActivator v1.5

The Models and DbContext

using System.Collections.Generic;
using System.Data.Entity;

namespace RandomWithInclude.Models
{
    public class PeopleContext : DbContext
    {
        public DbSet<Person> Persons { get; set; }
        public DbSet<Address> Addresses { get; set; }
    }

    public class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }

        public virtual ICollection<Address> Addresses { get; set; }
    }

    public class Address
    {
        public int ID { get; set; }
        public string AdressLine { get; set; }

        public virtual Person Person { get; set; }
    }
}

The DB Setup and Seed data: EF.SqlServerCompact.cs

using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using RandomWithInclude.Models;

[assembly: WebActivator.PreApplicationStartMethod(typeof(RandomWithInclude.App_Start.EF), "Start")]

namespace RandomWithInclude.App_Start
{
    public static class EF
    {
        public static void Start()
        {
            Database.DefaultConnectionFactory = new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");
            Database.SetInitializer(new DbInitializer());
        }
    }
    public class DbInitializer : DropCreateDatabaseAlways<PeopleContext>
    {
        protected override void Seed(PeopleContext context)
        {
            var address1 = new Address {AdressLine = "Street 1, City 1"};
            var address2 = new Address {AdressLine = "Street 2, City 2"};
            var address3 = new Address {AdressLine = "Street 3, City 3"};
            var address4 = new Address {AdressLine = "Street 4, City 4"};
            var address5 = new Address {AdressLine = "Street 5, City 5"};
            context.Addresses.Add(address1);
            context.Addresses.Add(address2);
            context.Addresses.Add(address3);
            context.Addresses.Add(address4);
            context.Addresses.Add(address5);
            var person1 = new Person {Name = "Person 1", Addresses = new List<Address> {address1, address2}};
            var person2 = new Person {Name = "Person 2", Addresses = new List<Address> {address3}};
            var person3 = new Person {Name = "Person 3", Addresses = new List<Address> {address4, address5}};
            context.Persons.Add(person1);
            context.Persons.Add(person2);
            context.Persons.Add(person3);
        }
    }
}

The controller: HomeController.cs

using System;
using System.Data.Entity;
using System.Linq;
using System.Web.Mvc;
using RandomWithInclude.Models;

namespace RandomWithInclude.Controllers
{
    public class HomeController : Controller
    {
        public ActionResult Index()
        {
            var db = new PeopleContext();
            var persons = db.Persons
                                .Include(p => p.Addresses)
                                .OrderBy(p => Guid.NewGuid());

            return View(persons.ToList());
        }
    }
}

The View: Index.cshtml

@using RandomWithInclude.Models
@model IList<Person>

<ul>
    @foreach (var person in Model)
    {
        <li>
            @person.Name
        </li>
    }
</ul>

That should be all, and your application should compile :)

The problem

As you can see, we have 2 straightforward models (Person and Address), and a Person can have multiple Addresses.
We seed the generated database with 3 persons and 5 addresses.
If we get all the persons from the database, including their addresses, randomize the results and just print out the names of those persons, that's where it all goes wrong.

As a result, I sometimes get 4 persons, sometimes 5 and sometimes 3, and I expect 3. Always.
e.g.:

Person 1
Person 3
Person 1
Person 3
Person 2

So... it's copying/cloning data! And that's not cool.
It seems that EF loses track of which addresses are children of which person.

The generated SQL query is this:

SELECT 
    [Project1].[ID] AS [ID], 
    [Project1].[Name] AS [Name], 
    [Project1].[C2] AS [C1], 
    [Project1].[ID1] AS [ID1], 
    [Project1].[AdressLine] AS [AdressLine], 
    [Project1].[Person_ID] AS [Person_ID]
FROM ( SELECT 
    NEWID() AS [C1], 
    [Extent1].[ID] AS [ID], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[ID] AS [ID1], 
    [Extent2].[AdressLine] AS [AdressLine], 
    [Extent2].[Person_ID] AS [Person_ID], 
    CASE WHEN ([Extent2].[ID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [People] AS [Extent1]
    LEFT OUTER JOIN [Addresses] AS [Extent2] ON [Extent1].[ID] = [Extent2].[Person_ID]
)  AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[ID] ASC, [Project1].[C2] ASC

Workarounds

If I remove the .Include(p => p.Addresses) from the query, everything goes fine. But of course the addresses aren't loaded, and accessing that collection will make a new call to the database every time.
I can first get the data from the database and randomize later by adding a .ToList() before the .OrderBy, like this: var persons = db.Persons.Include(p => p.Addresses).ToList().OrderBy(p => Guid.NewGuid());

Does anybody have any idea why this is happening?
Might this be a bug in the SQL generation?

你丑哭了我 2024-12-20 01:09:12

tl;dr: There's a leaky abstraction here. To us, Include is a simple instruction to stick a collection of things onto each single returned Person row. But EF's implementation of Include is done by returning a whole row for each Person-Address combo, and reassembling at the client. Ordering by a volatile value causes those rows to become shuffled, breaking apart the Person groups that EF is relying on.

When we have a look at ToTraceString() for this LINQ:

 var people = c.People.Include("Addresses");
 // Note: no OrderBy in sight!

we see

SELECT 
[Project1].[Id] AS [Id], 
[Project1].[Name] AS [Name], 
[Project1].[C1] AS [C1], 
[Project1].[Id1] AS [Id1], 
[Project1].[Data] AS [Data], 
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[Id] AS [Id1], 
    [Extent2].[PersonId] AS [PersonId], 
    [Extent2].[Data] AS [Data], 
    CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
    FROM  [Person] AS [Extent1]
    LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
)  AS [Project1]
ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

So we get one row for each A (n rows for a P with n As), plus 1 row for each P without any As.
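
To make that row count concrete, here is a minimal sketch that counts the rows such a LEFT OUTER JOIN returns. The names `JoinRowCountSketch` and `CountJoinedRows` are hypothetical and exist only for this illustration; they are not part of EF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinRowCountSketch
{
    // addressCounts holds, for each person, how many addresses that person has.
    // A LEFT OUTER JOIN yields one row per address, and a single row
    // (with NULL address columns) for a person without any address.
    public static int CountJoinedRows(IEnumerable<int> addressCounts)
    {
        return addressCounts.Sum(count => Math.Max(1, count));
    }
}
```

With the seed data from the question (2, 1 and 2 addresses), `CountJoinedRows(new[] { 2, 1, 2 })` gives 5 rows for 3 persons, which matches the largest number of "persons" the question reports seeing.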

Adding an OrderBy clause, however, puts the thing-to-order-by at the start of the ordered columns:

var people = c.People.Include("Addresses").OrderBy(p => Guid.NewGuid());

gives

SELECT 
[Project1].[Id] AS [Id], 
[Project1].[Name] AS [Name], 
[Project1].[C2] AS [C1], 
[Project1].[Id1] AS [Id1], 
[Project1].[Data] AS [Data], 
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT 
    NEWID() AS [C1], 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent2].[Id] AS [Id1], 
    [Extent2].[PersonId] AS [PersonId], 
    [Extent2].[Data] AS [Data], 
    CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [Person] AS [Extent1]
    LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
)  AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[Id] ASC, [Project1].[C2] ASC

So in your case, where the ordered-by thing is not a property of a P but is instead volatile, and can therefore differ between the P-A rows of the same P, the whole thing falls apart.
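
The effect can be reproduced without a database. The sketch below is a simplified model, not EF's actual materializer: it assumes the client starts a new Person whenever the person id changes between two consecutive rows. `RowGroupingSketch` and `GroupConsecutive` are hypothetical names.

```csharp
using System.Collections.Generic;

public static class RowGroupingSketch
{
    // Collapses consecutive rows that share a person id into one "person",
    // the way a streaming reader that trusts the ORDER BY might do.
    public static List<(int PersonId, List<int> AddressIds)> GroupConsecutive(
        IEnumerable<(int PersonId, int AddressId)> rows)
    {
        var result = new List<(int PersonId, List<int> AddressIds)>();
        foreach (var row in rows)
        {
            // Start a new group whenever the person id differs from the previous row's.
            if (result.Count == 0 || result[result.Count - 1].PersonId != row.PersonId)
            {
                result.Add((row.PersonId, new List<int>()));
            }
            result[result.Count - 1].AddressIds.Add(row.AddressId);
        }
        return result;
    }
}
```

Fed the five seed rows ordered by person id, this yields 3 groups. Fed the same rows in the order person 1, 3, 1, 3, 2, it yields 5 groups, which is the duplicated list shown in the question.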

I'm not sure where on the working-as-intended ~~~ cast-iron bug continuum this behaviour falls. But at least now we know about it.

得不到的就毁灭 2024-12-20 01:09:12

I don't think there is an issue in query generation, but there is definitely an issue when EF tries to convert rows into objects.

It looks like there is an inherent assumption here that the data for the same person in a joined statement will be returned grouped together, whether or not there is an ORDER BY.

For example, the result of a joined query will always be

P.Id  P.Name    A.Id  A.StreetLine
1     Person 1  10    ---
1     Person 1  11
2     Person 2  12
3     Person 3  13
3     Person 3  14

Even if you order by some other column, rows for the same person would always appear one after the other.

This assumption is mostly true for any joined query.

But there is a deeper issue here, I think. OrderBy is for when you want data in a certain order (as opposed to random), so that assumption does seem reasonable.

I think you should really get the data out and then randomize it by some other means in your code.
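
One possible "other means" is an in-memory Fisher-Yates shuffle applied after the query has been materialized. This is a sketch under that assumption, not something the answer itself prescribes; `ShuffleSketch` and `Shuffle` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ShuffleSketch
{
    // In-place Fisher-Yates shuffle: walks the list from the end and swaps
    // each element with a randomly chosen element at or before its position.
    public static void Shuffle<T>(IList<T> items, Random rng)
    {
        for (int i = items.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            T tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }
}
```

In the question's controller this could be used as `var persons = db.Persons.Include(p => p.Addresses).ToList(); ShuffleSketch.Shuffle(persons, new Random());`, so the database returns the rows in its usual grouped order and the randomization only touches fully built Person objects.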

烟酉 2024-12-20 01:09:12

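From a theoretical point of view:
To sort a list of items, the comparison function must be stable with respect to the items; that is, for any 2 items x and y, the result of x < y must be the same no matter how many times it is queried (called).

I think the problem is related to how the documentation of the OrderBy method is understood:
keySelector - A function to extract a key from an element.

The EF documentation does not state explicitly whether the supplied function must return the same value for the same object when it is called multiple times (in your case it returns different/random values), but I think the term "key" used in the documentation implicitly suggests it.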
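
A minimal in-memory sketch of a key that behaves like a key: each element gets one random value, drawn once and frozen, and the sort only ever reads that stored value. This illustrates the contract discussed here and says nothing about how EF translates queries; `StableKeySketch` and `OrderByFixedRandomKey` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StableKeySketch
{
    public static List<T> OrderByFixedRandomKey<T>(IEnumerable<T> items, Random rng)
    {
        return items
            .Select(item => new { Key = rng.Next(), Item = item }) // one key per element
            .ToList()                                               // freeze the keys before sorting
            .OrderBy(pair => pair.Key)                              // the key no longer changes
            .Select(pair => pair.Item)
            .ToList();
    }
}
```

Because the key is attached to the element before sorting, asking for it again always returns the same value, which is the property a call to Guid.NewGuid() inside the key selector does not have.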

谁与争疯 2024-12-20 01:09:12

When you define a query path to define the query results, (use Include), the query path is only valid on the returned instance of ObjectQuery. Other instances of ObjectQuery and the object context itself are not affected. This functionality lets you chain multiple "Includes" for eager loading.

Therefore, your statement translates into

from person in db.Persons.Include(p => p.Addresses).OrderBy(p => Guid.NewGuid())
select person

instead of what you intended.

from person in db.Persons.Include(p => p.Addresses)
select person
.OrderBy(p => Guid.NewGuid())

Hence your second workaround works fine :)

Reference: Loading Related Objects While Querying a Conceptual Model in Entity Framework - http://msdn.microsoft.com/en-us/library/bb896272.aspx

寄居人 2024-12-20 01:09:12

I also ran into this problem, and solved it by adding a Randomizer Guid property to the main class I was fetching. I then set the column's default value to NEWID() like this (using EF Core 2)

builder.Entity<MainClass>()
    .Property(m => m.Randomizer)
    .HasDefaultValueSql("NEWID()");

When fetching, it gets a bit more complicated. I created two random integers to function as my order-by indexes, then ran the query like this

var rand = new Random();
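// Random.Next's upper bound is exclusive; a GUID string without dashes
// has 32 characters (indexes 0-31), so 32 lets every position be picked.
var randomIndex1 = rand.Next(0, 32);
var randomIndex2 = rand.Next(0, 32);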
var taskSet = await DbContext.MainClasses
    .Include(m => m.SubClass1)
        .ThenInclude(s => s.SubClass2)
    .OrderBy(m => m.Randomizer.ToString().Replace("-", "")[randomIndex1])
        .ThenBy(m => m.Randomizer.ToString().Replace("-", "")[randomIndex2])
    .FirstOrDefaultAsync();

This seems to be working well enough, and should provide enough entropy for even a large dataset to be fairly randomized.

一直在等你来 2024-12-20 01:09:11

As one can work out by reading AakashM's answer and Nicolae Dascalu's answer, it strongly seems that LINQ OrderBy requires a stable ranking function, which NewID/Guid.NewGuid is not.

So we have to use another random generator that is stable within a single query.

To achieve this, before each query, use a .NET Random generator to get a random number. Then combine this random number with a unique property of the entity to get a random sort order. And to 'randomize' the result a bit, checksum it. (checksum is a SQL Server function that computes a hash; the original idea is based on this blog.)

Assuming Person Id is an int, you could write your query this way:

// Random instances should be stored and reused, not instantiated at each usage.
// But beware, it is not thread safe. If you want to share it between threads, you
// would have to use locks, see its documentation.
// https://learn.microsoft.com/en-us/dotnet/api/system.random.
// But using locks is a bad idea for scalability, especially in a Web context.
var randomGenerator = new Random();
// ...

var rnd = randomGenerator.NextDouble();
var persons = db.Persons
    .Include(p => p.Addresses)
    .OrderBy(p => SqlFunctions.Checksum(p.Id * rnd));

Like the NewGuid hack, this is very probably not a good random generator with a good distribution and so on. But it does not cause entities to get duplicated in results.

Beware:
If your query ordering does not guarantee a unique ranking for your entities, you must complement it to guarantee one. For example, if you use a non-unique property of your entities for the checksum call, then add something like .ThenBy(p => p.Id) after the OrderBy.
If the ranking is not unique for your queried root entity, its included children may get mixed with the children of other entities that have the same ranking, and the bug will remain.
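
The idea of a per-query seed plus a tie-breaker can be sketched in memory as follows. This is an illustration only: `Rank` is a simple integer mix chosen for the example, not SQL Server's CHECKSUM, and nothing here is translated to SQL. `SeededOrderSketch`, `Rank` and `OrderIds` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SeededOrderSketch
{
    // Deterministic for a given (id, seed) pair, so every row of the same
    // entity gets the same rank within one query.
    public static uint Rank(int id, int seed)
    {
        unchecked
        {
            uint h = (uint)id ^ (uint)seed;
            h *= 0x9E3779B1u;
            h ^= h >> 16;
            return h;
        }
    }

    // Orders ids by their rank and then by the id itself, mirroring the
    // OrderBy(...).ThenBy(p => p.Id) advice so the ordering is unique per id.
    public static List<int> OrderIds(IEnumerable<int> ids, int seed)
    {
        return ids.OrderBy(id => Rank(id, seed)).ThenBy(id => id).ToList();
    }
}
```

A new seed per query (for example from a shared Random) changes the order between queries, while within one query each id always maps to the same rank.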

Note:
I would prefer to use the .Next() method to get an int and then combine it through a xor (^) with a unique int property of the entity, rather than using a double and multiplying. But SqlFunctions.Checksum unfortunately does not provide an overload for the int data type, though the SQL Server function is supposed to support it. You may use a cast to overcome this, but to keep it simple I finally chose to go with the multiplication.
