How to scale a web application that currently uses Django and has long response times
I am writing a web application with Django on the server side. It takes about 4 seconds for the server to generate a response to the user. It uses a weather API, and my application has to make about 50 queries to that API for each user request.
The server side uses Python's urllib to call the weather API. I used Python's threading to speed up the process because urllib is synchronous. I am using WSGI with Apache. The problem is that the WSGI stack is fully synchronous, and when many users use my application, they have to wait for one another's requests to finish. Since each request takes about 4 seconds, this is unacceptable.
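A minimal sketch of the kind of threaded fan-out described above, for readers following along. It is not the asker's actual code: `fetch_all`, `fetch_one`, and the worker count are hypothetical, and `fetch_one` stands in for whatever function performs a single blocking urllib call to the weather API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(locations, fetch_one, max_workers=10):
    """Call fetch_one(location) for every location on a pool of threads.

    fetch_one is a hypothetical callable that performs one blocking
    request (for example with urllib.request.urlopen) and returns its result.
    Results come back in the same order as locations.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map overlaps the blocking calls and preserves input order.
        return list(pool.map(fetch_one, locations))
```

This overlaps the waits within one request, but it does nothing about how many user requests the server can handle at once, which is the problem the question is about.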
I am kind of stuck. What can I do?
Thanks
Comments (2)
If you are using mod_wsgi in a multithreaded configuration, or even a multi-process configuration, one request should not block another from doing its work. They should be able to run concurrently. If you are using a multithreaded configuration, are you sure that you aren't using a locking mechanism on some resource within your own application that prevents requests from running through the same section of code at the same time? Another possibility is that you have configured the Apache MPM and/or mod_wsgi daemon mode poorly, so that concurrent requests are precluded.
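To illustrate the locking point, here is a small self-contained sketch (not taken from the asker's application) of how one shared lock can serialize otherwise concurrent request threads. The names `_shared_lock`, `handle_request`, and `time_concurrent_requests` are hypothetical, and `time.sleep` stands in for the slow weather lookups.

```python
import threading
import time

_shared_lock = threading.Lock()  # hypothetical application-level lock


def handle_request(hold_lock, work_seconds=0.1):
    """Simulate one request whose slow part is a blocking wait."""
    if hold_lock:
        with _shared_lock:           # only one request at a time gets past here
            time.sleep(work_seconds)
    else:
        time.sleep(work_seconds)     # requests wait independently


def time_concurrent_requests(hold_lock, n_requests=4, work_seconds=0.1):
    """Run n_requests simulated requests on threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=handle_request, args=(hold_lock, work_seconds))
        for _ in range(n_requests)
    ]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```

With the lock held around the slow section, the total time grows roughly in proportion to the number of requests. Without it, the requests overlap and finish in about the time of a single one.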
Anyway, as mentioned in another answer, you are much better off looking at caching strategies to avoid the weather lookups in the first place, or offloading the work to the client.
50 queries to an outside resource per request is probably a bad place to be, and probably not necessary at all.
The weather doesn't change all that quickly, so you can probably benefit enormously by just caching results for a while. Then it doesn't matter how many requests you're getting; you don't need to do more than a few queries per day.
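A minimal sketch of that idea, assuming each weather lookup can be identified by a key such as a location. `TTLCache`, its `fetch` argument, and the 30-minute default are hypothetical choices for illustration. In a real Django deployment, the built-in cache framework (`django.core.cache`) with a shared backend would be the more usual tool, since this in-memory version is not shared between server processes.

```python
import time


class TTLCache:
    """Cache the results of a slow lookup for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=1800, clock=time.monotonic):
        self._fetch = fetch        # hypothetical function doing the real API call
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no API call
        value = self._fetch(key)   # missing or expired: query the API once
        self._entries[key] = (now + self._ttl, value)
        return value
```

With something like this in front of the API, repeated requests for the same location within the time-to-live are answered from memory, so the number of outside queries depends on how many distinct locations are requested rather than on how many users are asking.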
If that's not your situation, you might be able to get the client to do the work for you. Refactor the code so that the weather API aggregation happens on the client in JavaScript, rather than funneling it all through the server.
Edit: Based on the comments you've posted, what you are asking for probably cannot be optimized within the constraints of the API you are using. The problem is that the service is doing a good job of abstracting away the differences between the many sources of weather information that it aggregates into a nearest-location query. After all, weather stations provide only point data.
If you talk directly to the technical support people who provide the API, you might find that they are willing to support more complex queries (such as a bounding box), for which they will give you instructions. More likely, though, they abstract that away because they don't want to reveal the resolution that their API actually provides, or because there is some technical reason, in the way they model their data or perform their calculations, that would make such queries too difficult to support.
Without that or caching, you are just out of luck.