API design for idiot-proof iteration without generics

Posted 2024-11-15 08:28:51

When you're designing the API for a code library, you want it to be easy to use well, and hard to use badly. Ideally you want it to be idiot proof.

You might also want to make it compatible with older systems that can't handle generics, like .Net 1.1 and Java 1.4. But you don't want it to be a pain to use from newer code.

I'm wondering about the best way to make things easily iterable in a type-safe way... Remembering that you can't use generics so Java's Iterable<T> is out, as is .Net's IEnumerable<T>.

You want people to be able to use the enhanced for loop in Java (for Item i : items), and the foreach / For Each loop in .Net, and you don't want them to have to do any casting. Basically you want your API to be now-friendly as well as backwards compatible.

The best type-safe option that I can think of is arrays. They're fully backwards compatible and they're easy to iterate in a typesafe way. But arrays aren't ideal because you can't make them immutable. So, when you have an immutable object containing an array that you want people to be able to iterate over, to maintain immutability you have to provide a defensive copy each and every time they access it.

In Java, doing (MyObject[]) myInternalArray.clone(); is super-fast. I'm sure that the equivalent in .Net is super-fast too. If you have something like:

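class Schedule {
    private Appointment[] internalArray;

    // Returns a defensive copy so callers cannot modify the internal array.
    public Appointment[] appointments() {
        // The cast is required before Java 5, where clone() is declared to return Object.
        return (Appointment[]) internalArray.clone();
    }
}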

people can do something like:

for (Appointment a : schedule.appointments()) {
    a.doSomething();
}

and it will be simple, clear, type-safe, and fast.

But they could do something like:

for (int i = 0; i < schedule.appointments().length; i++) {
    Appointment a = schedule.appointments()[i];
}

And then it would be horribly inefficient because the entire array of appointments would get cloned twice for every iteration (once for the length test, and once to get the object at the index). Not such a problem if the array is small, but pretty horrible if the array has thousands of items in it. Yuk.
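
To make the cost concrete, here is a small sketch that counts how often the array gets cloned by each loop style. The Appointment stub, the cloneCount field, and the CloneCountDemo class are assumptions added purely for illustration; they are not part of the library being discussed.

```java
// Hypothetical stand-ins for the question's types.
class Appointment {
    void doSomething() { }
}

class Schedule {
    private final Appointment[] internalArray;
    int cloneCount = 0; // instrumentation for this sketch only

    Schedule(int size) {
        internalArray = new Appointment[size];
        for (int i = 0; i < size; i++) {
            internalArray[i] = new Appointment();
        }
    }

    public Appointment[] appointments() {
        cloneCount++;
        return (Appointment[]) internalArray.clone();
    }
}

class CloneCountDemo {
    // Calls appointments() in the loop condition and again in the body.
    static int clonesWithRepeatedCalls(int size) {
        Schedule schedule = new Schedule(size);
        for (int i = 0; i < schedule.appointments().length; i++) {
            Appointment a = schedule.appointments()[i];
            a.doSomething();
        }
        return schedule.cloneCount;
    }

    // Calls appointments() once and indexes into the local copy.
    static int clonesWithLocalCopy(int size) {
        Schedule schedule = new Schedule(size);
        Appointment[] copy = schedule.appointments();
        for (int i = 0; i < copy.length; i++) {
            copy[i].doSomething();
        }
        return schedule.cloneCount;
    }

    public static void main(String[] args) {
        System.out.println(clonesWithRepeatedCalls(1000)); // prints 2001
        System.out.println(clonesWithLocalCopy(1000));     // prints 1
    }
}
```

With n items, the repeated-call loop clones n + 1 times for the length check and n times for the indexed reads, 2n + 1 in total. Each clone copies n references, so the total work grows quadratically with the array size.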

Would anyone actually do that? I'm not sure... I guess that's largely my question here.

You could call the method toAppointmentArray() instead of appointments(), and that would probably make it less likely that anyone would use it the wrong way. But it would also make it harder for people to find when they just want to iterate over the appointments.

You would, of course, document appointments() clearly, to say that it returns a defensive copy. But a lot of people won't read that particular bit of documentation.

Although I'd welcome suggestions, it seems to me that there's no perfect way to make it simple, clear, type-safe, and idiot proof. Have I failed if a minority of people are unwittingly cloning arrays thousands of times, or is that an acceptable price to pay for simple, type-safe iteration for the majority?

NB I happen to be designing this library for both Java and .Net, which is why I've tried to make this question applicable to both. And I tagged it language-agnostic because it's an issue that could arise for other languages too. The code samples are in Java, but C# would be similar (albeit with the option of making the Appointments accessor a property).

UPDATE: I did a few quick performance tests to see how much difference this made in Java. I tested:

1. cloning the array once, and iterating over it using the enhanced for loop
2. iterating over an ArrayList using the enhanced for loop
3. iterating over an unmodifiable ArrayList (from Collections.unmodifiableList) using the enhanced for loop
4. iterating over the array the bad way (cloning it repeatedly in the length check and when getting each indexed item)

The relative timings (median over multiple repeats; lower is faster) were roughly:

Approach                                    10 objects   100 objects  1000 objects 10000 objects
1. Clone once, enhanced for                      1,000         1,300         6,400        68,000
2. ArrayList, enhanced for                       1,300         4,900        51,700       445,000
3. Unmodifiable list, enhanced for               1,300         6,300        56,200       651,000
4. Clone on every access (bad way)               5,000        85,500     7,000,300   655,180,000

Rough figures for sure, but enough to convince me of two things:

1. Cloning, then iterating is definitely not a performance issue. In fact it's consistently faster than using a List. (This is why Java's enum.values() method returns a defensive copy of an array instead of an immutable list.)
2. If you repeatedly call the method, repeatedly cloning the array unnecessarily, performance becomes more and more of an issue the larger the arrays in question. It's pretty horrible. No surprises there.
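
For anyone wanting to repeat a comparison like this, a rough timing sketch might look as follows. It is not the code used for the figures in this post: the Item class, the method names, and the repeat count are all assumptions, and it uses Java 8 features for brevity. A dedicated harness such as JMH would give more trustworthy numbers than System.nanoTime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;

// Rough timing sketch; every name in here is hypothetical.
class IterationTimingSketch {
    static final class Item {
        final int value;
        Item(int value) { this.value = value; }
    }

    static Item[] makeItems(int n) {
        Item[] items = new Item[n];
        for (int i = 0; i < n; i++) items[i] = new Item(i);
        return items;
    }

    // 1. Clone once, then use the enhanced for loop.
    static long sumCloneOnce(Item[] internal) {
        long sum = 0;
        for (Item item : internal.clone()) sum += item.value;
        return sum;
    }

    // 2 and 3. Enhanced for loop over a list (modifiable or unmodifiable view).
    static long sumList(List<Item> list) {
        long sum = 0;
        for (Item item : list) sum += item.value;
        return sum;
    }

    // 4. The bad way: clone in the length check and again for each indexed read.
    static long sumCloneEveryAccess(Item[] internal) {
        long sum = 0;
        for (int i = 0; i < internal.clone().length; i++) sum += internal.clone()[i].value;
        return sum;
    }

    // Median wall-clock time in nanoseconds over the given number of repeats.
    static long medianNanos(LongSupplier task, int repeats) {
        long[] times = new long[repeats];
        long sink = 0;
        for (int r = 0; r < repeats; r++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            times[r] = System.nanoTime() - start;
        }
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keeps the results in use
        Arrays.sort(times);
        return times[repeats / 2];
    }

    public static void main(String[] args) {
        int repeats = 101;
        Item[] internal = makeItems(1000);
        List<Item> list = new ArrayList<>(Arrays.asList(internal));
        List<Item> readOnly = Collections.unmodifiableList(list);
        System.out.println("clone once:         " + medianNanos(() -> sumCloneOnce(internal), repeats));
        System.out.println("ArrayList:          " + medianNanos(() -> sumList(list), repeats));
        System.out.println("unmodifiable list:  " + medianNanos(() -> sumList(readOnly), repeats));
        System.out.println("clone every access: " + medianNanos(() -> sumCloneEveryAccess(internal), repeats));
    }
}
```

The absolute numbers will vary with the JVM, warm-up, and hardware, so only the relative ordering and the growth with array size are worth reading from a sketch like this.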


何时共饮酒 2024-11-22 08:28:51

clone() is fast, but not what I would describe as super-fast.

If you don't trust people to write loops efficiently, I would not let them write a loop (which also avoids the need for a clone())

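interface AppointmentHandler {
    // Called once per appointment, in array order.
    public void onAppointment(Appointment appointment);
}

class Schedule {
    private Appointment[] internalArray;

    // The schedule runs the loop itself, so the internal array never leaves the class.
    public void forEachAppointment(AppointmentHandler ah) {
        // Note: the enhanced for loop needs Java 5+; a Java 1.4 build would use an indexed loop here.
        for (Appointment a : internalArray)
            ah.onAppointment(a);
    }
}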
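
For illustration, a call site for this callback approach might look like the sketch below. The title field, the Schedule constructor, and the CallbackDemo class are assumptions made for the example. The anonymous inner class, the indexed loop, and StringBuffer are used so that the code would also suit a pre-Java 5 target.

```java
// Hypothetical appointment type with a single field for the demo.
class Appointment {
    private final String title;
    Appointment(String title) { this.title = title; }
    String title() { return title; }
}

interface AppointmentHandler {
    void onAppointment(Appointment appointment);
}

class Schedule {
    private final Appointment[] internalArray;

    Schedule(Appointment[] appointments) {
        // Copy on the way in so the caller cannot change our state later.
        this.internalArray = (Appointment[]) appointments.clone();
    }

    // No copy on the way out: the array itself is never handed to callers.
    public void forEachAppointment(AppointmentHandler handler) {
        for (int i = 0; i < internalArray.length; i++) {
            handler.onAppointment(internalArray[i]);
        }
    }
}

class CallbackDemo {
    // Collects the titles into one comma-separated string.
    static String joinTitles(Schedule schedule) {
        final StringBuffer titles = new StringBuffer();
        schedule.forEachAppointment(new AppointmentHandler() {
            public void onAppointment(Appointment appointment) {
                if (titles.length() > 0) titles.append(", ");
                titles.append(appointment.title());
            }
        });
        return titles.toString();
    }

    public static void main(String[] args) {
        Schedule schedule = new Schedule(new Appointment[] {
            new Appointment("Dentist"), new Appointment("Standup")
        });
        System.out.println(joinTitles(schedule)); // prints "Dentist, Standup"
    }
}
```

The trade-off is that callers cannot use the enhanced for loop the question asks for, and they cannot break out of the iteration early unless the handler interface is extended to signal that.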
