Unit testing downloads
I am writing a Java program that downloads and then processes many webpages. What is the best practice for testing a component of the program that downloads a page without hitting the remote servers?
Comments (4)
One thought would be to use an InputStream as the object you pass to your processing code. I believe HttpClient (or an equivalent class for reading data over HTTP) gives you some sort of stream for reading the response. For testing, you can substitute a different kind of stream, such as a FileInputStream over a local file.
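A minimal sketch of that idea, assuming Java 9 or later for `readAllBytes`. The class `PageProcessor` and the method `extractTitle` are made-up names for illustration; the point is only that the processing code sees an `InputStream` and never knows whether it came from HTTP or from memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical processing component: it only ever sees an InputStream.
class PageProcessor {
    private static final Pattern TITLE =
            Pattern.compile("<title>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Reads the whole stream as UTF-8 and returns the trimmed page title, or "" if there is none. */
    public static String extractTitle(InputStream in) throws IOException {
        String html = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) throws IOException {
        // In a test, an in-memory stream stands in for the HTTP response body.
        InputStream fake = new ByteArrayInputStream(
                "<html><head><title> Hello </title></head></html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(extractTitle(fake));
    }
}
```

In production the same method would be handed the stream from the HTTP response; in a test it can be handed a `ByteArrayInputStream` or a `FileInputStream` over a saved copy of a page.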
So the component that does the download and the component that processes the page should be separate. Any time you are having trouble unit testing a piece of code, that's a sign that you may be trying to do too much in one component.
Once you've done that, you can test the processing part in whatever way makes the most sense. Have the processor component take an InputStream or even just a String as input.
As for the download part, you probably need an integration test. Integration tests are often a lot more involved and would require setting up a local web server (Maven can do this), or at the very least using a file: URL.
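The file: URL option can be sketched like this, assuming the downloader accepts a `java.net.URL` and Java 11 or later for `Files.writeString`. `UrlDownloader` and `download` are hypothetical names; a real downloader built on an HTTP client library would need the local web server variant instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical download component: works for http:, https: and file: URLs alike.
class UrlDownloader {
    /** Opens the URL, reads everything, and returns it as UTF-8 text. */
    public static String download(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a canned page to a temp file and "download" it through a file: URL.
        Path page = Files.createTempFile("page", ".html");
        try {
            Files.writeString(page, "<html><title>Local</title></html>");
            URL fileUrl = page.toUri().toURL();
            System.out.println(download(fileUrl));
        } finally {
            Files.deleteIfExists(page);
        }
    }
}
```

This exercises the stream handling without any network access, but it does not cover HTTP-specific behaviour such as status codes, redirects, or headers; those still need a test against a real (local) server.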
If your code supports an HTTP proxy, you could use an off-network cache that functions as a proxy. Run the code once with the proxy in caching mode, so that it saves the data, network delays, and so on. After that, you can run the code with the proxy just returning the saved data. Switching between the two is just a matter of configuring the HTTP proxy.
The advantage of this approach is that you can unit test against an arbitrary number of sites. Your network cache/HTTP proxy would also be reusable in the future.
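On the client side, switching to such a proxy might look like the sketch below, assuming the code uses `java.net.http.HttpClient` (Java 11 or later). The host `127.0.0.1` and port `3128` are placeholders for wherever a record/replay proxy happens to listen; the proxy itself is not shown and no request is sent here.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;

// Sketch: build an HttpClient whose requests all go through one fixed HTTP proxy.
class ProxiedClientSketch {
    /** Returns a client that routes http and https requests via the given proxy address. */
    public static HttpClient clientVia(String proxyHost, int proxyPort) {
        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(proxyHost, proxyPort)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder address of a caching proxy (an assumption, not a real service).
        HttpClient client = clientVia("127.0.0.1", 3128);
        // Show which proxy would be chosen for a sample URL; nothing is sent over the network.
        client.proxy().ifPresent(selector ->
                System.out.println(selector.select(URI.create("http://example.com/"))));
    }
}
```

Code that uses the older `HttpURLConnection` can usually be pointed at a proxy through the `http.proxyHost` and `http.proxyPort` system properties instead, which keeps the switch entirely in configuration.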
Check out Dependency Injection.
It's a technique where you "inject" the different "dependencies" into your functions instead of having them inside your functions to begin with (simplified explanation).
Read Martin Fowler's article about DI:
http://martinfowler.com/articles/injection.html
Hope it helps
/Jonas
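Applied to the downloading question, constructor injection could look roughly like this. `PageAnalyzer`, `PageFetcher`, and `countLinks` are invented for the example, and the link counting is deliberately naive; the point is that the test injects a fake fetcher so nothing touches the network.

```java
import java.io.IOException;

// Hypothetical component that needs pages but does not decide how they are fetched.
class PageAnalyzer {
    /** The injected dependency: anything that can turn a URL into page text. */
    interface PageFetcher {
        String fetch(String url) throws IOException;
    }

    private final PageFetcher fetcher;

    /** The fetcher is passed in (injected) rather than created inside this class. */
    public PageAnalyzer(PageFetcher fetcher) {
        this.fetcher = fetcher;
    }

    /** Counts occurrences of "<a " in the fetched page (a naive stand-in for real processing). */
    public int countLinks(String url) throws IOException {
        String html = fetcher.fetch(url);
        int count = 0;
        int idx = 0;
        while ((idx = html.indexOf("<a ", idx)) != -1) {
            count++;
            idx += 3;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In a test, inject a fake that returns canned HTML instead of downloading.
        PageFetcher fake = url -> "<a href='x'>x</a> <p>text</p> <a href='y'>y</a>";
        System.out.println(new PageAnalyzer(fake).countLinks("http://example.test/"));
    }
}
```

In production the same constructor would receive an implementation of `PageFetcher` that performs the real HTTP download.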