Performance improvement costs memory in a multithreaded distributed application
I've managed to improve a web application's performance so that it is 10% faster than it used to be. With this, I've noticed that memory usage has doubled!
The test application does the following: call a web service, then perform some complicated business action, * [number of users] * [number of times].
I checked my changed code, but nothing looked suspicious of using more memory. (All I did was remove the lines of code that serialized a DataSet into a byte[] and saved it in the cache.)
I checked again and again in a multithreaded test:
- As I skipped more and more code (performance improved, memory went up)
- As I repeated the bad code in a loop (performance was bad, memory went down)
Can anyone explain why?
Code below:
Before: (Cycle Time : 100% Memory 100%)
outStream = new MemoryStream();
new BinaryFormatter().Serialize(outStream, stateData);
outStream.Close();
SessionSettings.StateData_Set(stateId, outStream.ToArray());
outStream.Dispose();
After option 1: (Cycle Time: 200% Memory 50%)
for (int i = 0; i < 20; i++)
{
outStream = new MemoryStream();
new BinaryFormatter().Serialize(outStream, stateData);
}
outStream.Close();
SessionSettings.StateData_Set(stateId, outStream.ToArray());
outStream.Dispose();
After option 2: (Cycle Time: 90% Memory 200%)
//outStream = new MemoryStream();
//new BinaryFormatter().Serialize(outStream, stateData);
//outStream.Close();
SessionSettings.StateData_Set(stateId, null);
//outStream.Dispose();
SessionSettings.StateData_Set puts an object into a
Dictionary<string, Dictionary<string, object>>
which means
<Sessions<DataKey, DataObject>>
At the end of each cycle the inner dictionary removes the entry,
and at the end of each user session the entire inner dictionary is removed from the outer dictionary.
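For readers following along, here is a minimal sketch of what a SessionSettings class matching that description might look like. This is a reconstruction from the description above, not the real code: the method names beyond StateData_Set, the "StateData" key, and the lock are all assumptions (some synchronization is presumably needed, since the test is multithreaded).

```csharp
using System.Collections.Generic;

// Hypothetical reconstruction of SessionSettings: an outer dictionary keyed
// by session/state id, an inner dictionary keyed by data key. The inner
// data key name and the locking strategy are assumptions.
public static class SessionSettings
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<string, Dictionary<string, object>> Sessions =
        new Dictionary<string, Dictionary<string, object>>();

    public static void StateData_Set(string stateId, object data)
    {
        lock (Sync)
        {
            Dictionary<string, object> session;
            if (!Sessions.TryGetValue(stateId, out session))
            {
                session = new Dictionary<string, object>();
                Sessions[stateId] = session;
            }
            session["StateData"] = data; // key name is an assumption
        }
    }

    // End of cycle: the inner dictionary drops the entry.
    public static void StateData_Remove(string stateId)
    {
        lock (Sync)
        {
            Dictionary<string, object> session;
            if (Sessions.TryGetValue(stateId, out session))
                session.Remove("StateData");
        }
    }

    // End of user session: the whole inner dictionary is removed.
    public static void Session_Remove(string stateId)
    {
        lock (Sync)
        {
            Sessions.Remove(stateId);
        }
    }
}
```

Note that even after removal, the byte[] payloads stay alive until the GC decides to collect them, which matters for the answers below.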
Answers (2)
Another guess: if your application allocates too much memory (possibly through frequent serialization), the CLR will trigger a GC much more often. By watching the "% Time in GC" performance counter you will notice that the GC eats up a lot of CPU; I have seen scenarios well above 40%. This is especially true if your DataSets are large and the byte[] storage ends up on the LOH (large object heap).
By limiting those allocations, the GC will be triggered far less often, resulting in better application performance. A reason for the observed increase in memory could be that the managed heap now operates in a healthier region.
To find a more reliable explanation, please post some performance counter measurements from before and after your optimization. Of interest would be: overall heap size and time spent in GC.
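One lightweight way to capture numbers like these without attaching a profiler is to sample GC statistics around a test run with the built-in System.GC API (GC.CollectionCount and GC.GetTotalMemory are standard .NET calls; the RunTestCycle method is a placeholder for your own serialize/cache cycle). The "% Time in GC" counter itself lives under the ".NET CLR Memory" performance counter category.

```csharp
using System;

// Sketch: sample GC collection counts and heap size around a test run.
// Objects of roughly 85,000 bytes or more are allocated on the LOH and are
// only reclaimed during gen-2 collections, so a rising gen-2 count under
// the serializing version would support the LOH theory above.
class GcSnapshot
{
    static void Main()
    {
        int gen0 = GC.CollectionCount(0);
        int gen2 = GC.CollectionCount(2);
        long heapBefore = GC.GetTotalMemory(false);

        RunTestCycle(); // placeholder: serialize DataSet, store byte[], etc.

        Console.WriteLine("Gen0 collections during run: {0}", GC.CollectionCount(0) - gen0);
        Console.WriteLine("Gen2 collections during run: {0}", GC.CollectionCount(2) - gen2);
        Console.WriteLine("Heap delta (bytes): {0}", GC.GetTotalMemory(false) - heapBefore);
    }

    static void RunTestCycle()
    {
        // stand-in for the cycle under test
    }
}
```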
Since you provided only a little code and a brief explanation, it's more of a riddle than a question.
So, perhaps the cached DataSet was being accessed by 10 users at roughly the same time, and was locked and viewed by the users synchronously (one at a time). This would perform poorly in a multithreaded test but would use little memory.
Perhaps the replicated (uncached) DataSet was being accessed by 10 users at roughly the same time, with each user holding their own copy. With no locking/synchronized access and multiple copies of the DataSet, memory would increase but performance would improve.
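The two access patterns this answer contrasts can be sketched as follows. This is illustrative only, not the asker's code: the class, field, and method names are invented, and only DataSet.Copy (a real ADO.NET method that returns an independent copy of both schema and data) is taken from the framework.

```csharp
using System;
using System.Data;

// Illustrative sketch of the trade-off: one shared instance behind a lock
// (low memory, threads queue up) versus a private copy per caller
// (no contention, N concurrent users hold N copies).
class StateAccess
{
    static readonly object CacheLock = new object();
    static DataSet cachedSet = new DataSet(); // single shared instance

    // Option A: shared cached DataSet, serialized access.
    public static void UseShared(Action<DataSet> work)
    {
        lock (CacheLock)
        {
            work(cachedSet); // all users take turns on one copy
        }
    }

    // Option B: each caller gets an independent copy and works lock-free.
    public static DataSet GetPrivateCopy()
    {
        lock (CacheLock)
        {
            return cachedSet.Copy(); // copies schema + data
        }
    }
}
```

Under option B the lock is held only for the duration of the copy, which is why throughput improves while total memory grows with the number of concurrent users.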