从 List中取出 n 个随机元素?

发布于 2024-10-12 02:02:10 字数 96 浏览 7 评论 0原文

如何从 ArrayList中获取 n 个随机元素?理想情况下,我希望能够连续调用 take() 方法来获取另一个 x 元素,而无需替换。

How can I take n random elements from an ArrayList<E>? Ideally, I'd like to be able to make successive calls to the take() method to get another x elements, without replacement.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(13

罗罗贝儿 2024-10-19 02:02:10

主要有两种方式。

  1. 使用 Random#nextInt(int)

    列表;列表 = createItSomehow();
    随机随机 = new Random();
    Foo foo = list.get(random.nextInt(list.size()));
    

    但是,不能保证连续的 n 调用返回唯一的元素。

  2. 使用 Collections#shuffle()

    列表;列表 = createItSomehow();
    Collections.shuffle(列表);
    Foo foo = list.get(0);
    

    它使您能够通过递增索引获取n 个唯一元素(假设列表本身包含唯一元素)。


如果您想知道是否有 Java 8 Stream 方法;不,没有内置的。标准 API 中还没有 Comparator#randomOrder() 之类的东西(还没有?)。您可以尝试类似下面的方法,同时仍然满足严格的 Comparator 契约(尽管分布非常糟糕):

List<Foo> list = createItSomehow();
int random = new Random().nextInt();
Foo foo = list.stream().sorted(Comparator.comparingInt(o -> System.identityHashCode(o) ^ random)).findFirst().get();

最好使用 Collections#shuffle() 代替。

Two main ways.

  1. Use Random#nextInt(int):

    List<Foo> list = createItSomehow();
    Random random = new Random();
    Foo foo = list.get(random.nextInt(list.size()));
    

    It's however not guaranteed that successive n calls returns unique elements.

  2. Use Collections#shuffle():

    List<Foo> list = createItSomehow();
    Collections.shuffle(list);
    Foo foo = list.get(0);
    

    It enables you to get n unique elements by an incremented index (assuming that the list itself contains unique elements).


In case you're wondering if there's a Java 8 Stream approach; no, there isn't a built-in one. There's no such thing as Comparator#randomOrder() in standard API (yet?). You could try something like below while still satisfying the strict Comparator contract (although the distribution is pretty terrible):

List<Foo> list = createItSomehow();
int random = new Random().nextInt();
Foo foo = list.stream().sorted(Comparator.comparingInt(o -> System.identityHashCode(o) ^ random)).findFirst().get();

Better use Collections#shuffle() instead.

冰火雁神 2024-10-19 02:02:10

到目前为止,大多数提出的解决方案都建议通过检查唯一性并在需要时重试来进行完整列表洗牌或连续随机挑选。

但是,我们可以利用 Durstenfeld 算法(当今最流行的 Fisher-Yates 算法)。

Durstenfeld 的解决方案是将“敲击”的数字移至末尾
通过将每个数字与最后一个未敲击的数字交换来列出列表
迭代。

由于上述原因,我们不需要打乱整个列表,而是运行循环的步骤与需要返回的元素数量一样多。如果我们使用完美的随机函数,该算法可确保列表末尾的最后 N 个元素是 100% 随机的。

在许多现实场景中,我们需要从数组/列表中选择预定(最大)数量的随机元素,这种优化方法对于各种纸牌游戏非常有用,例如德州扑克,您先验地知道数字每场比赛使用的卡牌数量;通常,牌组中只需要有限数量的牌。

public static <E> List<E> pickNRandomElements(List<E> list, int n, Random r) {
    int length = list.size();

    if (length < n) return null;

    //We don't need to shuffle the whole list
    for (int i = length - 1; i >= length - n; --i)
    {
        Collections.swap(list, i , r.nextInt(i + 1));
    }
    return list.subList(length - n, length);
}

public static <E> List<E> pickNRandomElements(List<E> list, int n) {
    return pickNRandomElements(list, n, ThreadLocalRandom.current());
}

Most of the proposed solutions till now suggest either a full list shuffle or successive random picking by checking uniqueness and retry if required.

But, we can take advantage of the Durstenfeld's algorithm (the most popular Fisher-Yates variant in our days).

Durstenfeld's solution is to move the "struck" numbers to the end of
the list by swapping them with the last unstruck number at each
iteration.

Due to the above, we don't need to shuffle the whole list, but run the loop for as many steps as the number of elements required to return. The algorithm ensures that the last N elements at the end of the list are 100% random if we used a perfect random function.

Among the many real-world scenarios where we need to pick a predetermined (max) amount of random elements from arrays/lists, this optimized method is very useful for various card games, such as Texas Poker, where you a-priori know the number of cards to be used per game; only a limited number of cards is usually required from the deck.

public static <E> List<E> pickNRandomElements(List<E> list, int n, Random r) {
    int length = list.size();

    if (length < n) return null;

    //We don't need to shuffle the whole list
    for (int i = length - 1; i >= length - n; --i)
    {
        Collections.swap(list, i , r.nextInt(i + 1));
    }
    return list.subList(length - n, length);
}

public static <E> List<E> pickNRandomElements(List<E> list, int n) {
    return pickNRandomElements(list, n, ThreadLocalRandom.current());
}
衣神在巴黎 2024-10-19 02:02:10

如果您想从列表中连续选取 n 个元素,并且能够在不需要一遍又一遍地替换的情况下执行此操作,那么最好的方法可能是随机排列元素,然后将 n 块中的块取出。如果您随机排列列表,则可以保证您挑选的每个块的统计随机性。也许最简单的方法是使用 Collections.shuffle

If you want to successively pick n elements from the list and be able to do so without replacement over and over and over again, you are probably best of randomly permuting the elements, then taking chunks off in blocks of n. If you randomly permute the list you guarantee statistical randomness for each block you pick out. Perhaps the easiest way to do this would be to use Collections.shuffle.

人间☆小暴躁 2024-10-19 02:02:10

一个公平的方法是遍历列表,在第 n 次迭代时计算是否选择第 n 个元素的概率,这本质上是您仍然需要选择的项目数除以元素数的分数在列表的其余部分中可用。例如:(

public static <T> T[] pickSample(T[] population, int nSamplesNeeded, Random r) {
  T[] ret = (T[]) Array.newInstance(population.getClass().getComponentType(),
                                    nSamplesNeeded);
  int nPicked = 0, i = 0, nLeft = population.length;
  while (nSamplesNeeded > 0) {
    int rand = r.nextInt(nLeft);
    if (rand < nSamplesNeeded) {
      ret[nPicked++] = population[i];
      nSamplesNeeded--;
    }
    nLeft--;
    i++;
  }
  return ret;
}

这段代码是从我不久前写的 从一个列表。)

A fair way to do this is to go through the list, on the nth iteration calculating the probability of whether or not to pick the nth element, which is essentially the fraction of the number of items you still need to pick over the number of elements available in the rest of the list. For example:

public static <T> T[] pickSample(T[] population, int nSamplesNeeded, Random r) {
  T[] ret = (T[]) Array.newInstance(population.getClass().getComponentType(),
                                    nSamplesNeeded);
  int nPicked = 0, i = 0, nLeft = population.length;
  while (nSamplesNeeded > 0) {
    int rand = r.nextInt(nLeft);
    if (rand < nSamplesNeeded) {
      ret[nPicked++] = population[i];
      nSamplesNeeded--;
    }
    nLeft--;
    i++;
  }
  return ret;
}

(This code copied from a page I wrote a while ago on picking a random sample from a list.)

爱给你人给你 2024-10-19 02:02:10

简单明了

   // define ArrayList to hold Integer objects
    ArrayList<Integer> arrayList = new ArrayList<>();

    for (int i = 0; i < maxRange; i++) {
        arrayList.add(i + 1);
    }

    // shuffle list
    Collections.shuffle(arrayList);

    // adding defined amount of numbers to target list
    ArrayList<Integer> targetList = new ArrayList<>();
    for (int j = 0; j < amount; j++) {
        targetList.add(arrayList.get(j)); 
    }

    return targetList;

Simple and clear

   // define ArrayList to hold Integer objects
    ArrayList<Integer> arrayList = new ArrayList<>();

    for (int i = 0; i < maxRange; i++) {
        arrayList.add(i + 1);
    }

    // shuffle list
    Collections.shuffle(arrayList);

    // adding defined amount of numbers to target list
    ArrayList<Integer> targetList = new ArrayList<>();
    for (int j = 0; j < amount; j++) {
        targetList.add(arrayList.get(j)); 
    }

    return targetList;
路弥 2024-10-19 02:02:10

正如其他答案中所述,当源列表很大时,由于复制,Collections.shuffle 的效率不是很高。这里有一个 Java 8 的单行代码:

  • 如果您不需要源中的许多元素,则在像 ArrayList 这样的随机访问列表上足够高效
  • 不会修改源
  • 不保证唯一性,如果它对您来说不是超级重要的话。如果您从一百个元素中挑选五个,那么这些元素很有可能是独一无二的。

代码:

private static <E> List<E> pickRandom(List<E> list, int n) {
  return new Random().ints(n, 0, list.size()).mapToObj(list::get).collect(Collectors.toList());
}

但是,对于无法快速随机访问的列表(例如 LinkedList),复杂度将为 n*O(list_size)

As noted in other answers, Collections.shuffle is not very efficient when the source list is big, because of the copying. Here is a Java 8 one-liner that:

  • efficient enough on random access lists like ArrayList if you don't need many elements from the source
  • doesn't modify the source
  • DOES NOT guarantee uniqueness, if it's not super important for you. If you pick 5 from a hundred, there's a very good chance the elements will be unique.

Code:

private static <E> List<E> pickRandom(List<E> list, int n) {
  return new Random().ints(n, 0, list.size()).mapToObj(list::get).collect(Collectors.toList());
}

However, for a list with no quick random access (like LinkedList) the complexity would be n*O(list_size).

坚持沉默 2024-10-19 02:02:10

使用以下类:

import java.util.Enumeration;
import java.util.Random;

public class RandomPermuteIterator implements Enumeration<Long> {
    int c = 1013904223, a = 1664525;
    long seed, N, m, next;
    boolean hasNext = true;

    public RandomPermuteIterator(long N) throws Exception {
        if (N <= 0 || N > Math.pow(2, 62)) throw new Exception("Unsupported size: " + N);
        this.N = N;
        m = (long) Math.pow(2, Math.ceil(Math.log(N) / Math.log(2)));
        next = seed = new Random().nextInt((int) Math.min(N, Integer.MAX_VALUE));
    }

    public static void main(String[] args) throws Exception {
        RandomPermuteIterator r = new RandomPermuteIterator(100);
        while (r.hasMoreElements()) System.out.print(r.nextElement() + " ");
    }

    @Override
    public boolean hasMoreElements() {
        return hasNext;
    }

    @Override
    public Long nextElement() {
        next = (a * next + c) % m;
        while (next >= N) next = (a * next + c) % m;
        if (next == seed) hasNext = false;
        return  next;
    }
}

Use the following class:

import java.util.Enumeration;
import java.util.Random;

public class RandomPermuteIterator implements Enumeration<Long> {
    int c = 1013904223, a = 1664525;
    long seed, N, m, next;
    boolean hasNext = true;

    public RandomPermuteIterator(long N) throws Exception {
        if (N <= 0 || N > Math.pow(2, 62)) throw new Exception("Unsupported size: " + N);
        this.N = N;
        m = (long) Math.pow(2, Math.ceil(Math.log(N) / Math.log(2)));
        next = seed = new Random().nextInt((int) Math.min(N, Integer.MAX_VALUE));
    }

    public static void main(String[] args) throws Exception {
        RandomPermuteIterator r = new RandomPermuteIterator(100);
        while (r.hasMoreElements()) System.out.print(r.nextElement() + " ");
    }

    @Override
    public boolean hasMoreElements() {
        return hasNext;
    }

    @Override
    public Long nextElement() {
        next = (a * next + c) % m;
        while (next >= N) next = (a * next + c) % m;
        if (next == seed) hasNext = false;
        return  next;
    }
}
绮烟 2024-10-19 02:02:10

继续选择一个随机元素,并确保不会再次选择相同的元素:

public static <E> List<E> selectRandomElements(List<E> list, int amount)
{
    // Avoid a deadlock
    if (amount >= list.size())
    {
        return list;
    }

    List<E> selected = new ArrayList<>();
    Random random = new Random();
    int listSize = list.size();

    // Get a random item until we got the requested amount
    while (selected.size() < amount)
    {
        int randomIndex = random.nextInt(listSize);
        E element = list.get(randomIndex);

        if (!selected.contains(element))
        {
            selected.add(element);
        }
    }

    return selected;
}

理论上,这可能会无限运行,但实际上没有问题。你越接近整个原始列表,运行时间显然就越慢,但这不是选择随机子列表的重点,不是吗?

Keep selecting a random element and make sure you do not choose the same element again:

public static <E> List<E> selectRandomElements(List<E> list, int amount)
{
    // Avoid a deadlock
    if (amount >= list.size())
    {
        return list;
    }

    List<E> selected = new ArrayList<>();
    Random random = new Random();
    int listSize = list.size();

    // Get a random item until we got the requested amount
    while (selected.size() < amount)
    {
        int randomIndex = random.nextInt(listSize);
        E element = list.get(randomIndex);

        if (!selected.contains(element))
        {
            selected.add(element);
        }
    }

    return selected;
}

In theory this could run endlessly but in practise it is fine. The closer you get the whole original list the slower the runtime of this gets obviously but that is not the point of selecting a random sublist, is it?

丑疤怪 2024-10-19 02:02:10

下面的类从任何类型的列表中检索 N 个项目。如果您提供种子,那么每次运行时它将返回相同的列表,否则,新列表的项目将在每次运行时更改。您可以通过运行主要方法来检查其行为。

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class NRandomItem<T> {
    private final List<T> initialList;

    public NRandomItem(List<T> list) {
        this.initialList = list;
    }

    /**
     * Do not provide seed, if you want different items on each run.
     * 
     * @param numberOfItem
     * @return
     */
    public List<T> retrieve(int numberOfItem) {
        int seed = new Random().nextInt();
        return retrieve(seed, numberOfItem);
    }

    /**
     * The same seed will always return the same random list.
     * 
     * @param seed,
     *            the seed of random item generator.
     * @param numberOfItem,
     *            the number of items to be retrieved from the list
     * @return the list of random items
     */
    public List<T> retrieve(int seed, int numberOfItem) {
        Random rand = new Random(seed);

        Collections.shuffle(initialList, rand);
        // Create new list with the number of item size
        List<T> newList = new ArrayList<>();
        for (int i = 0; i < numberOfItem; i++) {
            newList.add(initialList.get(i));
        }
        return newList;
    }

    public static void main(String[] args) {
        List<String> l1 = Arrays.asList("Foo", "Bar", "Baz", "Qux");
        int seedValue = 10;
        NRandomItem<String> r1 = new NRandomItem<>(l1);

        System.out.println(String.format("%s", r1.retrieve(seedValue, 2)));
    }
}

The following class retrieve N items from a list of any type. If you provide a seed then on each run it will return the same list, otherwise, the items of the new list will change on each run. You can check its behaviour my running the main methods.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class NRandomItem<T> {
    private final List<T> initialList;

    public NRandomItem(List<T> list) {
        this.initialList = list;
    }

    /**
     * Do not provide seed, if you want different items on each run.
     * 
     * @param numberOfItem
     * @return
     */
    public List<T> retrieve(int numberOfItem) {
        int seed = new Random().nextInt();
        return retrieve(seed, numberOfItem);
    }

    /**
     * The same seed will always return the same random list.
     * 
     * @param seed,
     *            the seed of random item generator.
     * @param numberOfItem,
     *            the number of items to be retrieved from the list
     * @return the list of random items
     */
    public List<T> retrieve(int seed, int numberOfItem) {
        Random rand = new Random(seed);

        Collections.shuffle(initialList, rand);
        // Create new list with the number of item size
        List<T> newList = new ArrayList<>();
        for (int i = 0; i < numberOfItem; i++) {
            newList.add(initialList.get(i));
        }
        return newList;
    }

    public static void main(String[] args) {
        List<String> l1 = Arrays.asList("Foo", "Bar", "Baz", "Qux");
        int seedValue = 10;
        NRandomItem<String> r1 = new NRandomItem<>(l1);

        System.out.println(String.format("%s", r1.retrieve(seedValue, 2)));
    }
}
怪异←思 2024-10-19 02:02:10

此解决方案不会修改原始列表,也不会随着列表大小而增加复杂性。

要从 7 个列表中获取 4 个样本,我们只需从所有 7 个元素中随机选择一个元素,然后从剩余 6 个元素中随机选择一个元素,依此类推。如果我们已经选择了索引 4, 0, 3,接下来我们从 0, 1, 2, 3 中生成一个随机数,分别代表索引 1, 2, 5, 6。

static Random rand = new Random();

static <T> List<T> randomSample(List<T> list, int size) {
    List<T> sample = new ArrayList<>();

    for (int sortedSampleIndices[] = new int[size], i = 0; i < size; i++) {
        int index = rand.nextInt(list.size() - i);

        int j = 0;
        for (; j < i && index >= sortedSampleIndices[j]; j++)
            index++;
        sample.add(list.get(index));

        for (; j <= i; j++) {
            int temp = sortedSampleIndices[j];
            sortedSampleIndices[j] = index;
            index = temp;
        }
    }

    return sample;
}

This solution doesn't modify the original list or otherwise scale in complexity with the list size.

To get a sample of 4 from a list of 7, we just select a random element out of all 7, then select a random element out of the remaining 6, and so on. If we've already selected indices 4, 0, 3, next we generate a random number out of 0, 1, 2, 3, respectively representing index 1, 2, 5, 6.

static Random rand = new Random();

static <T> List<T> randomSample(List<T> list, int size) {
    List<T> sample = new ArrayList<>();

    for (int sortedSampleIndices[] = new int[size], i = 0; i < size; i++) {
        int index = rand.nextInt(list.size() - i);

        int j = 0;
        for (; j < i && index >= sortedSampleIndices[j]; j++)
            index++;
        sample.add(list.get(index));

        for (; j <= i; j++) {
            int temp = sortedSampleIndices[j];
            sortedSampleIndices[j] = index;
            index = temp;
        }
    }

    return sample;
}
旧伤还要旧人安 2024-10-19 02:02:10

所有这些答案都需要可修改的列表,否则会遇到性能问题。

这是一个快速片段,需要 O(k) 额外空间,并保证在 O(k) 时间内运行,并且不需要可修改数组。 (在地图中进行随机播放)

  func getRandomElementsFrom(array: [Int], count: Int = 8) -> [Int] {
    if array.count <= count {
        return array
    }

    var mapper = [Int: Int]()
    var results = [Int]()

    for i in 0..<count {
        let randomIndex = Int.random(in: 0..<array.count - i)

        if let existing = mapper[randomIndex] {
            results.append(array[existing])
        } else {
            let element = array[randomIndex]
            results.append(element)
        }

        let targetIndex = array.count - 1 - i
        mapper[randomIndex] = mapper[targetIndex] ?? targetIndex 
    }

    return results
}

All of these answers require a modifiable list or run into performance issued

Here's a swift snippet that required O(k) additional space and is guaranteed to run in O(k) time and doesn't need a modifiable array. (Performs shuffles in a map)

  func getRandomElementsFrom(array: [Int], count: Int = 8) -> [Int] {
    if array.count <= count {
        return array
    }

    var mapper = [Int: Int]()
    var results = [Int]()

    for i in 0..<count {
        let randomIndex = Int.random(in: 0..<array.count - i)

        if let existing = mapper[randomIndex] {
            results.append(array[existing])
        } else {
            let element = array[randomIndex]
            results.append(element)
        }

        let targetIndex = array.count - 1 - i
        mapper[randomIndex] = mapper[targetIndex] ?? targetIndex 
    }

    return results
}
鹊巢 2024-10-19 02:02:10

我对此进行了基准测试,重点关注给出 k 个独特结果的方法。我比较了本页中的 3 种方法:

  1. Kostas Kryptos 的交换方法
  2. 只是简单的迭代,如果已经选择了该项目,则再次绘制,如 BullyWiiPlaza 所示,
  3. 使用 aykutfirat 的枚举器创建要从列表中检索的索引。

我没有添加随机播放方法,因为我猜测它显然会比方法 1 或 2 慢。

的列表中获取 3 个项目的结果

从 10: 1:150 ns
2:145 纳秒
3:260 ns

从 1000 个列表中获取 5 个项目的结果:

1:1362 ns
2:209 纳秒
3: 312 ns

看来,对于较大的原始列表,普通方法并没有真正表现出明显的性能下降,而交换方法却如此。显然,随着列表的增加,交换会变得昂贵。

因此,最佳性能只是简单地选择每个随机项目,如果已​​经选择了该项目,则再次进行绘制。

I've benchmarked this, focusing on the methods which give k unique results. I compared 3 methods from this page:

  1. the swapping method by Kostas Kryptos
  2. just plain iterating, and drawing again if the item has already been chosen, as shown by BullyWiiPlaza
  3. use aykutfirat's Enumerator to create the indexes to retrieve from the list.

I didn't add the shuffle approach, because I guessed it would clearly be slower than method 1 or 2.

Results with getting 3 items out of a list of 10:

1: 150 ns
2: 145 ns
3: 260 ns

Results with getting 5 items out of a list of 1000:

1: 1362 ns
2: 209 ns
3: 312 ns

It appears that the plain method doesn't really show significantly slower performances with bigger original lists, whereas the swap method does. Apparently the swapping gets expensive with bigger lists.

So best performance is just plain selecting each random item, and doing the draw again if the item was already selected.

情魔剑神 2024-10-19 02:02:10

以下方法返回从参数 List 列表中获取的新 Min(n, list.size()) 随机元素列表。请记住,每次调用后都会修改列表列表。因此,每次调用都将“消耗”原始列表,从中返回 n 个随机元素:

public static <T> List<T> nextRandomN(List<T> list, int n) {
  return Stream
    .generate(() -> list.remove((int) (list.size() * Math.random())))
    .limit(Math.min(list.size(), n))
    .collect(Collectors.toList());
}

示例用法:

List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));

System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());

示例输出:

[8, 2, 3]
[4, 10, 7]
[1, 5, 9]
[6]

The following method returns a new List of Min(n, list.size()) random elements taken from the paramenter List list. Have in mind that the List list is being modified after each call. Therefore, each call will be "consuming" the original list returning n random elements from it:

public static <T> List<T> nextRandomN(List<T> list, int n) {
  return Stream
    .generate(() -> list.remove((int) (list.size() * Math.random())))
    .limit(Math.min(list.size(), n))
    .collect(Collectors.toList());
}

Sample usage:

List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));

System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());
System.out.println(nextRandomN(list, 3).toString());

Sample output:

[8, 2, 3]
[4, 10, 7]
[1, 5, 9]
[6]
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文