There is a fairly common pattern used in .NET to test for the capabilities of a class. Here I'll use the Stream class as an example, but the issue applies to all classes that use this pattern.
The pattern is to supply a boolean property called CanXXX to indicate that capability XXX is available on the class. For example, the Stream class has CanRead, CanWrite and CanSeek properties to indicate that the Read, Write and Seek methods can be called. If the property's value is false, then calling the respective method will result in a NotSupportedException being thrown.
From the MSDN documentation on the Stream class:
Depending on the underlying data source or repository, streams might support only some of these capabilities. An application can query a stream for its capabilities by using the CanRead, CanWrite, and CanSeek properties.
And documentation for the CanRead property:
When overridden in a derived class, gets a value indicating whether the current stream supports reading.
If a class derived from Stream does not support reading, calls to the Read, ReadByte, and BeginRead methods throw a NotSupportedException.
I see a lot of code written along the lines of the following:
    if (stream.CanRead)
    {
        stream.Read(…);
    }
Note that there is no synchronisation code, say, to lock the stream object in any manner — other threads may be accessing it or objects that it references. There is also no code to catch a NotSupportedException.
The MSDN documentation does not state that the property value cannot change over time. In fact, the CanSeek property changes to false when the stream is closed, demonstrating the dynamic nature of these properties. As such, there is no contractual guarantee that the call to Read() in the above code snippet will not throw a NotSupportedException.
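To make the concern concrete, here is a minimal sketch using MemoryStream, whose CanRead returns false once the stream has been disposed. The class and method names (CapabilityCheckDemo, TryReadByte, CanReadBeforeAndAfterDispose) are hypothetical and not part of the framework. The helper treats the CanRead check as a hint only and still guards the call against the exceptions a racing close could produce.

```csharp
using System;
using System.IO;

public static class CapabilityCheckDemo
{
    // Hypothetical helper: CanRead is treated as a hint, so the call is still guarded.
    // Returns false if the stream could not be read; on success, value is the byte
    // read, or -1 at end of stream.
    public static bool TryReadByte(Stream stream, out int value)
    {
        value = -1;
        if (!stream.CanRead)
        {
            return false;
        }
        try
        {
            value = stream.ReadByte();
            return true;
        }
        catch (NotSupportedException)
        {
            // The capability changed between the check and the call.
            return false;
        }
        catch (ObjectDisposedException)
        {
            // The stream was closed between the check and the call.
            return false;
        }
    }

    // Shows that the property value is not fixed for the lifetime of the object.
    public static (bool before, bool after) CanReadBeforeAndAfterDispose()
    {
        var stream = new MemoryStream(new byte[] { 42 });
        bool before = stream.CanRead;
        stream.Dispose();
        bool after = stream.CanRead;
        return (before, after);
    }
}
```

This does not make the check-then-act sequence atomic; it only ensures that a capability change surfaces as a return value rather than an unhandled exception.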
I expect that there is a lot of code out there that suffers from this potential problem. I wonder how those who have identified this issue have addressed it. What design patterns are appropriate here?
I'd also appreciate comments on the validity of this pattern (the CanXXX, XXX() pairs). To me, at least in the case of the Stream class, this represents a class/interface that is trying to do too much and should be split into more fundamental pieces. The lack of a tight, documented contract makes testing impossible and implementation even harder!
Okay, here's another attempt which will hopefully be more useful than my other answer...
It's unfortunate that MSDN doesn't give any specific guarantees about how CanRead/CanWrite/CanSeek may change over time. I think it would be reasonable to assume that if a stream is readable it will continue to be readable until it is closed - and the same will hold for the other properties.
In some cases I think it would be reasonable for a stream to become seekable later - for instance, it might buffer everything it reads until it reaches the end of the underlying data, and then allow seeking within it afterwards to let clients reread the data. I think it would be reasonable for an adapter to ignore that possibility, however.
This should take care of all but the most pathological cases. (That is, streams pretty much designed to cause havoc!) Adding these requirements to the existing documentation is a theoretically breaking change, even though I suspect that 99.9% of implementations obey them already. Still, it might be worth suggesting on Connect.
Now, as for the discussion between whether to use a "capability-based" API (like Stream) and an interface-based one... the fundamental problem I see is that .NET doesn't provide the ability to specify that a variable has to be a reference to an implementation of more than one interface. For example, I can't declare a variable whose type is "both this interface and that one".
If it did allow this, it might be reasonable - but without that, you end up with an explosion of potential interfaces, one for each combination of capabilities.
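A sketch of what that explosion might look like, assuming three hypothetical capability interfaces (IReadable, IWritable and ISeekable; none of them exist in the framework). The ByteArraySource class is also purely illustrative and is there only to show a type that has to pick one of the combined interfaces.

```csharp
using System;

// Hypothetical single-capability interfaces.
public interface IReadable { int Read(byte[] buffer, int offset, int count); }
public interface IWritable { void Write(byte[] buffer, int offset, int count); }
public interface ISeekable { long Seek(long offset); }

// Without a way to declare a variable as "IReadable and ISeekable",
// every combination needs its own named interface.
public interface IReadableWritable : IReadable, IWritable { }
public interface IReadableSeekable : IReadable, ISeekable { }
public interface IWritableSeekable : IWritable, ISeekable { }
public interface IReadableWritableSeekable : IReadable, IWritable, ISeekable { }

// Illustrative implementation of one combination.
public sealed class ByteArraySource : IReadableSeekable
{
    private readonly byte[] _data;
    private int _position;

    public ByteArraySource(byte[] data) { _data = data; }

    // Copies up to count bytes from the current position; returns the number copied.
    public int Read(byte[] buffer, int offset, int count)
    {
        int n = Math.Min(count, _data.Length - _position);
        Array.Copy(_data, _position, buffer, offset, n);
        _position += n;
        return n;
    }

    // Moves to an absolute offset, clamped to the bounds of the data.
    public long Seek(long offset)
    {
        _position = (int)Math.Max(0, Math.Min(offset, _data.Length));
        return _position;
    }
}
```

Three capabilities already need four combined interfaces on top of the three basic ones, and every additional capability roughly doubles the total.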
I think that's messier than the current situation - although I think I would support the idea of just IReadable and IWritable in addition to the existing Stream class. That would make it easier for clients to declaratively express what they need.
With Code Contracts, APIs can declare what they provide and what they require, admittedly.
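A minimal sketch of such declarations, using Contract.Requires and Contract.Ensures from System.Diagnostics.Contracts. The ContractDemo class and its two methods are hypothetical. Note the assumption in play: these calls are only enforced when the Code Contracts tooling is enabled (the CONTRACTS_FULL symbol); otherwise they compile away and the methods behave as if the contracts were absent.

```csharp
using System;
using System.Diagnostics.Contracts;
using System.IO;

public static class ContractDemo
{
    // Hypothetical provider: declares that the returned stream is readable.
    public static Stream OpenReadable(byte[] data)
    {
        Contract.Ensures(Contract.Result<Stream>() != null);
        Contract.Ensures(Contract.Result<Stream>().CanRead);
        return new MemoryStream(data, writable: false);
    }

    // Hypothetical consumer: declares that it requires a readable stream.
    public static int ReadFirstByte(Stream stream)
    {
        Contract.Requires(stream != null);
        Contract.Requires(stream.CanRead);
        return stream.ReadByte();
    }
}
```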
I don't know how much the static checker can help with that - or how it copes with the fact that streams do become unreadable/unwritable when they're closed.
Without knowing the internals of an object, you must assume that a "flag" property is too volatile to rely on when the object is being modified in multiple threads.
I have seen this question more commonly asked about read-only collections than streams, but I feel it's another example of the same design pattern, and the same arguments apply.
To clarify, the ICollection interface in .NET has the property IsReadOnly, which is intended to be used as an indicator of whether the collection supports methods to modify its contents. Just like streams, this property can change at any time and will cause InvalidOperationException or NotSupportedException to be thrown.
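For illustration, a short sketch of the same check-then-act shape applied to ICollection&lt;T&gt;. The ReadOnlyDemo class and TryAdd method are hypothetical names. As with streams, IsReadOnly is used only as a hint and the Add call is still guarded.

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyDemo
{
    // Hypothetical helper: returns true if the item was added,
    // false if the collection refused the modification.
    public static bool TryAdd<T>(ICollection<T> collection, T item)
    {
        if (collection.IsReadOnly)
        {
            return false;
        }
        try
        {
            collection.Add(item);
            return true;
        }
        catch (NotSupportedException)
        {
            // The collection rejected the change despite the earlier check.
            return false;
        }
    }
}
```

A List&lt;int&gt; accepts the item, while the wrapper returned by List&lt;T&gt;.AsReadOnly() and a plain array both report IsReadOnly as true through ICollection&lt;T&gt; and are rejected.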
The discussions around this usually boil down to:
Modes are rarely a good thing, as you are forced to deal with more than one "set" of behaviour; having something which can switch modes at any time is considerably worse, as your application now has to deal with more than one "set" of behaviour too. However, just because it's possible to break something down into more discrete functionality does not necessarily mean you always should, particularly when breaking it apart does nothing to reduce the complexity of the task at hand.
My personal opinion is that you have to pick the pattern which is closest to the mental model you perceive the consumers of your class will understand. If you are the only consumer, pick whichever model you like most. In the case of Stream and ICollection, I think that having a single definition of these is much closer to the mental model built up by years of development in similar systems. When you talk about streams, you talk about file streams and memory streams, not whether they're readable or writeable. Similarly, when you talk about collections, you rarely refer to them in terms of "writeability".
My rule of thumb on this one: Always look for a way to break down the behaviours into more specific interfaces, rather than having "modes" of operation, as long as it complements a simple mental model. If it's hard to think of the separate behaviours as separate things, use a mode-based pattern and document it very clearly.
Stream.CanRead just checks whether the underlying stream is capable of being read. It does not say anything about whether an actual read will succeed (e.g. there may be a disk error).
There is no need to catch NotSupportedException if you used any of the *Reader classes, since they all support reading. Only a *Writer will have CanRead=False and throw that exception. If you know that the stream supports reading (e.g. you used StreamReader), IMHO there is no need to make an additional check.
You still need to catch exceptions, since any error during a read will throw them (e.g. disk error).
Also notice that any code that is not documented as thread-safe is not thread-safe. Usually static members are thread-safe but instance members aren't; however, you need to check the documentation for each class.
From your question and all the subsequent commentary, I'm guessing that your issue is with the clarity and "correctness" of the stated contract. The stated contract being what is in the MSDN online documentation.
What you've pointed out is that there is something missing from the documentation which forces one to make assumptions about the contract. More specifically, because there is nothing said about the volatility of the readability property of a stream, the only assumption that one can make is that it is possible for a NotSupportedException to be thrown, regardless of what the value of the corresponding CanRead property was a few milliseconds (or more) prior.
I think that one needs to go on the intent of this interface in this case, that is:
- CanRead is invariant.
- Notwithstanding the above, Read* methods may potentially throw a NotSupportedException.
The same argument can be applied to all the other Can* properties.
When I see an instance of this pattern, I would generally expect this:
- A parameter-less CanXXX member will always return the same value, unless…
- …in the presence of a CanXXXChanged event, where a parameter-less CanXXX may return a different value before and after an occurrence of that event; but it will not change without triggering the event.
- A parameterized CanXXX(…) member may return different values for different arguments; but for the same arguments, it is likely to return the same value. That is, CanXXX(constValue) is likely to remain constant.
  I'm being cautious here: if stream.CanWriteToDisk(largeConstObject) returns true now, is it reasonable to assume that it will always return true in the future? Probably not, so whether a parameterized CanXXX(…) will return the same value for the same arguments probably depends on the context.
- A call to XXX(…) can succeed only if CanXXX returns true.
That being said, I agree that Stream's use of this pattern is somewhat problematic. At least in theory, if perhaps not so much in practice.
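The expectation about a parameter-less CanXXX paired with a CanXXXChanged event could be sketched as follows. The Channel class, its CanSend property and its CanSendChanged event are hypothetical names chosen for illustration; the choice of InvalidOperationException for the failing call is likewise an assumption of this sketch.

```csharp
using System;

// Hypothetical class: CanSend only ever changes together with
// a CanSendChanged notification.
public sealed class Channel
{
    private bool _canSend;

    public event EventHandler CanSendChanged;

    public bool CanSend
    {
        get { return _canSend; }
        private set
        {
            if (_canSend == value)
            {
                return; // No change, so no event.
            }
            _canSend = value;
            CanSendChanged?.Invoke(this, EventArgs.Empty);
        }
    }

    public void Open() { CanSend = true; }

    public void Close() { CanSend = false; }

    // Can succeed only while CanSend returns true.
    public void Send(string message)
    {
        if (!CanSend)
        {
            throw new InvalidOperationException("Channel is not open for sending.");
        }
        // Actual sending is omitted in this sketch.
    }
}
```

A caller that subscribes to CanSendChanged can rely on the last observed value of CanSend until the event fires again, at least in single-threaded use.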
This sounds more like a theoretical problem than a practical one. I can't really think of any situations in which a stream would become unreadable/unwritable other than due to it being closed.
There may well be corner cases, but I wouldn't expect them to show up often at all. I don't think the vast majority of code needs to worry about this.
It's an interesting philosophical problem though.
EDIT: Addressing the question of whether or not CanRead etc are useful, I believe they still are - mostly for argument validation. For example, just because a method takes a stream which it's going to want to read at some point doesn't mean it wants to read it right at the start of the method, but that's where the argument validation should ideally be performed. This is really no different to checking whether a parameter is null and throwing ArgumentNullException instead of waiting for a NullReferenceException to be thrown when you first happen to dereference it.
Also, CanSeek is slightly different: in some cases your code may well cope with both seekable and non-seekable streams, but with more efficiency in the seekable case.
This does rely on the "seekability" etc remaining consistent - but as I've said, that appears to be true in real life.
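A minimal sketch of both uses, assuming a hypothetical helper named ReadToEnd: CanRead is checked up front as argument validation, and CanSeek is used only as an optimisation (pre-sizing the buffer, capped here at an arbitrary 1 MiB) while the non-seekable path still works.

```csharp
using System;
using System.IO;

public static class StreamArguments
{
    // Hypothetical helper: reads the remainder of the stream into a byte array.
    public static byte[] ReadToEnd(Stream input)
    {
        // Validate arguments at the start, before any work is done.
        if (input == null)
        {
            throw new ArgumentNullException(nameof(input));
        }
        if (!input.CanRead)
        {
            throw new ArgumentException("Stream must be readable.", nameof(input));
        }

        // Seekable streams expose Length and Position, so the buffer can be pre-sized.
        // The 1 MiB cap is an arbitrary limit chosen for this sketch.
        int capacity = 0;
        if (input.CanSeek)
        {
            long remaining = input.Length - input.Position;
            capacity = (int)Math.Max(0, Math.Min(remaining, 1 << 20));
        }

        using (var buffer = new MemoryStream(capacity))
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```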
Okay, let's try putting this another way...
Unless you're reading/seeking within memory and you've already made sure there's enough data, or you're writing within a preallocated buffer, there's always a chance things will go wrong. Disks fail or fill up, networks collapse etc. These things do happen in real life, so you always need to code in a way which will survive failure (or consciously choose to ignore the problem when it doesn't really matter).
If your code can do the right thing in the case of a disk failure, chances are it can survive a FileStream turning from writable to non-writable.
If Stream did have firm contracts, they'd have to be incredibly weak - you couldn't use static checking to prove that your code will always work. The best you could do is to prove that it did the right thing in the face of failure.
I don't believe Stream is going to change any time soon. While I certainly accept that it could be better documented, I don't accept the idea that it is "completely broken." It would be more broken if we couldn't actually use it in real life... and if it could be more broken than it is now, it's logically not completely broken.
I have far bigger issues with the framework, such as the relatively poor state of date/time APIs. They've become a lot better in the last couple of versions, but they're still missing a lot of the functionality of (say) Joda Time. The lack of built-in immutable collections, poor support for immutability in the language etc - these are real problems which cause me actual headaches. I'd rather see them addressed than spend ages on Stream which seems to me to be a somewhat intractable theoretical problem which causes few issues in real life.