Twisted threads: how to avoid deepcopy

Posted 2024-11-30 04:19:21

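I have a Twisted server that performs a "long" task for each request, so I defer each call to a thread. Each request accesses a common resource, which is altered during processing. Every request should start from the original data, so I deepcopy the common resource (while holding a lock). It works, but I don't think it is fast enough. I have a feeling that deepcopy is slowing things down.

What suggestions do you have for dealing with resource mutation in a threaded Twisted server?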


离去的眼神 2024-12-07 04:19:21

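Try to operate on as little data as possible in your worker threads. Pass all the data they need in as arguments, and take all their output as the return value (the value the Deferred fires with) rather than as mutations of their inputs.

Then integrate the results into the common data structure in the reactor thread.

This lets you reason about the work in isolation and avoids any additional locking (which causes contention, slowing things down as well as making them more confusing).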
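A minimal sketch of this pattern follows. It uses the standard library's `ThreadPoolExecutor` as a stand-in for Twisted's thread pool so that it runs without Twisted installed. In a Twisted server, `deferToThread(long_task, values)` with a callback added to the returned Deferred would play the same role, with the callback running in the reactor thread. The names `long_task`, `integrate`, `handle_requests` and `shared_totals` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared state, only ever touched by the coordinating thread.
shared_totals = {"requests": 0, "sum": 0}

def long_task(values):
    """Worker: reads only its arguments and returns a new result."""
    # 'values' is an immutable tuple, so the worker cannot mutate shared state.
    return sum(v * v for v in values)

def integrate(result):
    """Runs in the coordinating thread, so no lock is needed."""
    shared_totals["requests"] += 1
    shared_totals["sum"] += result

def handle_requests(batches):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each worker gets its own small, immutable input.
        futures = [pool.submit(long_task, tuple(b)) for b in batches]
        for future in futures:
            # Results are merged here, in one thread, in submission order.
            integrate(future.result())

handle_requests([[1, 2], [3], [4, 5, 6]])
print(shared_totals)
```

Because the workers never see `shared_totals`, there is nothing to copy and nothing to lock. The only data crossing the thread boundary is the small input tuple and the returned number.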


送舟行 2024-12-07 04:19:21

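If you like, you could synchronize access to the shared resource with threading.Lock, just as you would in any other threaded program, rather than copying it.

Regardless, I think it's worth benchmarking your code with and without the deepcopy, and otherwise measuring how good or bad the performance really is, before making optimizations. Perhaps the reason it is slow has nothing to do with deepcopy.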
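As a rough sketch of such a measurement, the snippet below times a function with and without `copy.deepcopy` using `timeit`. The `resource` dictionary is a made-up stand-in, so the numbers only become meaningful once it is replaced with the real shared object and the real per-request work.

```python
import copy
import timeit

# Hypothetical stand-in for the shared resource; substitute the real object.
resource = {"users": [{"id": i, "tags": ["a", "b"]} for i in range(1000)]}

def with_deepcopy():
    # Copy the whole structure first, as the question describes.
    local = copy.deepcopy(resource)
    return len(local["users"])

def without_deepcopy():
    # Same read, performed directly on the shared object.
    return len(resource["users"])

runs = 20
copy_time = timeit.timeit(with_deepcopy, number=runs)
plain_time = timeit.timeit(without_deepcopy, number=runs)
print(f"with deepcopy:    {copy_time / runs * 1e3:.3f} ms per call")
print(f"without deepcopy: {plain_time / runs * 1e3:.3f} ms per call")
```

Comparing the per-call copy cost with the total time of one request shows whether deepcopy is a significant share of the work or a distraction.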

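EDIT regarding locking: what I mean is that you can use finer-grained locking around this resource. I assume that your threads are doing more than accessing a shared resource. You can try to benefit from multiple threads doing work and then synchronize access to just the one "critical section" that involves writing to the shared resource. You might also investigate making your shared resource thread-safe. For example, if you have a shared object, SillyExampleFriendsList: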

import threading

class SillyExampleFriendsList(object):
    """Just manipulates a couple of lists."""
    def __init__(self):
        self._lock = threading.RLock()
        self._friends = []
        self._enemies = []

    def unfriend(self, x):
        # Hold the lock for both updates so that other threads using this
        # lock never observe the lists in an inconsistent intermediate state.
        # The 'with' block releases the lock even if remove() raises
        # ValueError because 'x' was not a friend.
        with self._lock:
            self._friends.remove(x)
            self._enemies.append(x)

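The point here is just that the above object could potentially be shared between multiple threads without deepcopy through careful use of locks. It is not trivial to identify all the cases where locking is necessary, and fine-grained locking strategies can be more difficult to debug and still introduce overhead.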
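To illustrate the "critical section" idea, here is a small sketch in which each thread does its expensive work without holding a lock and takes the lock only for the write to shared state. The `Results` class and `worker` function are hypothetical, and reads go through `snapshot()` so that they also respect the lock.

```python
import threading

class Results:
    """Hypothetical shared resource guarded by its own lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def add(self, item):
        # Critical section: only the write to shared state is locked.
        with self._lock:
            self._items.append(item)

    def snapshot(self):
        # Return a shallow copy so callers never iterate the live list.
        with self._lock:
            return list(self._items)

def worker(results, n):
    # The expensive part runs without holding any lock.
    value = sum(range(n))
    results.add((n, value))

results = Results()
threads = [threading.Thread(target=worker, args=(results, n))
           for n in (10, 100, 1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results.snapshot()))
```

The shallow copy in `snapshot()` is much cheaper than a deepcopy, but it is only safe when the stored items themselves are immutable, as the tuples are here.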

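That said, you may not need threads, locks, or deepcopy at all, and without benchmarking your code it is not clear whether you have a performance problem that needs to be solved. I'm curious: what makes you think that your code should be, or needs to be, faster?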


