Convincing product managers to change the design of a long-running synchronous process
In our web application, we have a feature which:
- Gets a list of products
- Writes them to an Excel file
- Returns the Excel file to the user for download
Depending on the number of products, this process often takes more than 2 minutes. Some requests take more than 5 minutes! On average, users download 100-500 products, and the request takes somewhere between 1 and 5 minutes.
I think 1 minute is too long for any web server thread to be active on a single request. Aside from taking so long, the process itself causes out-of-memory errors on our server and makes it crash.
I would like to convince the product managers that this is bad practice and that the design must change. To do that, I want to cite articles, books, or studies by software architects that say so and that recommend what to do in this situation.
Does anyone know of such books, articles, or studies?
If you disagree with my assumption that 1 minute is too long for any web server thread to be active on a single request, kindly let me know why.
Comments (2)
Have you tried rethinking the logic that generates the xls? If it is a business requirement, they might be really reluctant to change it (been there, done that).
I have used apache-poi to generate xls files and create reports, and the performance was satisfactory (a few seconds at most to produce a report). We used server-side caching to cache the data, and then we would just pull out the xls.
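To illustrate the caching idea, here is a minimal sketch in Python rather than Java. Everything in it is an assumption made for illustration: `ReportCache`, `get_or_build`, and `build_report` are hypothetical names, the time-to-live policy is one possible choice, and the CSV writer only stands in for a real Excel writer such as apache-poi.

```python
import csv
import io
import time


class ReportCache:
    """Tiny in-memory cache for generated report bytes with a time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock        # injectable so expiry can be tested without sleeping
        self._entries = {}        # key -> (expires_at, report_bytes)

    def get_or_build(self, product_ids, build):
        # The same products requested in any order share one cache entry.
        key = tuple(sorted(product_ids))
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]       # cache hit: nothing is regenerated
        report = build(key)
        self._entries[key] = (now + self.ttl_seconds, report)
        return report


def build_report(product_ids):
    """Stand-in for the real Excel writer: emits CSV bytes instead."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["product_id"])
    for product_id in product_ids:
        writer.writerow([product_id])
    return buffer.getvalue().encode("utf-8")
```

With something like this in place, only the first request for a given product list within the TTL pays the generation cost. Whether that helps here depends on how often users ask for the same list.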
If the list of Excel files being downloaded is fixed, you can generate them in a background thread and return a direct download link. Even if the data changes once every hour or two, it's worth doing that instead of generating on demand.
Another approach: when users select the product list and request the Excel file, offer them the option of receiving an email with the download link, or even the file itself as an attachment. If they accept, submit each request to a queue and run a batch job that generates the Excel sheet and sends it as a mail attachment. This way you would not tie up the web server.
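The queue idea could look roughly like the sketch below, written in Python with only the standard library. All names (`request_export`, `generate_file`, `sent_emails`, and so on) are hypothetical, the file generation is a placeholder, and the mail step is only recorded in a list. A real system would more likely use a persistent job queue and an actual mail service.

```python
import queue
import threading
import uuid

export_queue = queue.Queue()
finished_exports = {}   # job_id -> generated file bytes
sent_emails = []        # (address, job_id) pairs; stands in for a real mail call


def request_export(product_ids, email):
    """Called from the web request: enqueue the job and return immediately."""
    job_id = uuid.uuid4().hex
    export_queue.put((job_id, list(product_ids), email))
    return job_id


def generate_file(product_ids):
    """Placeholder for the slow Excel generation."""
    return "\n".join(str(p) for p in product_ids).encode("utf-8")


def worker():
    """Runs outside the web server's request threads."""
    while True:
        job = export_queue.get()
        if job is None:               # sentinel: shut the worker down
            export_queue.task_done()
            break
        job_id, product_ids, email = job
        try:
            finished_exports[job_id] = generate_file(product_ids)
            sent_emails.append((email, job_id))   # hypothetical "send mail" step
        finally:
            export_queue.task_done()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()
```

The request handler only calls `request_export` and responds right away with the job ID, so no web server thread waits for the file to be built.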
In addition, my main question would be: why is it taking more than a minute, and which part of the process takes so long? It's worth investigating each area (DB connectivity {pooling, co-locating the servers}, huge tables {partitioning}, Excel generation).
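One cheap way to find out which part is slow is to time each stage separately. The following is a minimal Python sketch. The `timed` helper, the stage names, and the `export_products` function with its `fetch` and `write_file` callables are all hypothetical, so substitute your real database query and Excel writer.

```python
import time
from contextlib import contextmanager

timings = {}   # stage name -> accumulated seconds


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


def export_products(fetch, write_file):
    """Run the export in two measured stages: data access, then file writing."""
    with timed("fetch"):
        products = fetch()
    with timed("write"):
        data = write_file(products)
    return data
```

After a few real requests, `max(timings, key=timings.get)` names the stage that dominates, which tells you whether to look at the database side or at the Excel generation first.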
Are you adding fixed graphics to each Excel file? If so, use templates that already have the headers, footers, etc.
It's worth re-examining the pieces that create the bottlenecks rather than blindly saying it's a bad design or approach.
Investigation would probably fix the current issues, or at least keep you from carrying the same mistakes forward into future designs.