Handling large (object) datasets in PHP

Published 2024-08-26 14:55:20


I am currently working on a project that relies extensively on the EAV model. Both entities and their attributes are individually represented by a model, sometimes extending other models (or at least, base models).

This has worked quite well so far since most areas of the application only rely on filtered sets of entities, and not the entire dataset.

Now, however, I need to parse the entire dataset (i.e., all entities and all their attributes) in order to provide a sorting/filtering algorithm based on the attributes.

The application currently consists of approximately 2200 entities, each with approximately 100 attributes. Every entity is represented by a single model (for example Client_Model_Entity) and has a protected property called $_attributes, which is an array of Attribute objects.

Each entity object is about 500KB, which results in an incredible load on the server. With 2000 entities, this means a single task would take 1GB of RAM (and a lot of CPU time) in order to work, which is unacceptable.

Are there any patterns or common approaches to iterating over such large datasets? Paging is not really an option, since everything has to be taken into account in order to provide the sorting algorithm.

EDIT: a code example to hopefully make things clearer:

// code from the resource model
for ($i = 0, $n = count($rowset); $i < $n; ++$i)
{
    $clientEntity = new Client_Model_Entity($rowset[$i]);
    // getAttributes() fetches all possible attributes from the DB and creates models for them;
    // this is the big resource hog, as one client can have 100 attributes
    $clientEntity->getAttributes();
    $this->_rows[$i] = $clientEntity;
    // memory usage has now increased by roughly 500KB
    echo $i . ' : ' . memory_get_usage() . '<br />';
}


Comments (2)

2024-09-02 14:55:20


If there's a lot of commonality between the attributes, you could take a look at the Flyweight pattern: http://en.wikipedia.org/wiki/Flyweight_pattern. This might significantly reduce the number of objects required to represent your model.
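A minimal sketch of what a Flyweight might look like here, under the assumption that most of an Attribute object's weight is shared metadata (code, data type, label) rather than the per-entity value. The class and method names (`AttributeType`, `AttributeTypeFactory::get()`) are hypothetical, not part of the questioner's codebase:

```php
<?php
// Flyweight sketch (hypothetical names): the attribute *definition* is
// shared across all entities; each entity keeps only its own value plus
// a reference to the shared definition.

class AttributeType
{
    public function __construct(
        public readonly string $code,
        public readonly string $dataType
    ) {}
}

class AttributeTypeFactory
{
    /** @var array<string, AttributeType> */
    private static array $types = [];

    public static function get(string $code, string $dataType): AttributeType
    {
        // Reuse one shared instance per attribute code instead of
        // building ~100 definition objects for every single entity.
        return self::$types[$code] ??= new AttributeType($code, $dataType);
    }
}

// Per-entity state becomes a cheap [shared type, scalar value] pair.
$a = AttributeTypeFactory::get('color', 'varchar');
$b = AttributeTypeFactory::get('color', 'varchar');
var_dump($a === $b); // true: both entities point at the same definition
```

With ~100 attribute definitions shared across ~2200 entities, the object count for definitions drops from roughly 220,000 to 100; only the scalar values remain per-entity.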

反目相谮 2024-09-02 14:55:20


One solution could be to implement the Iterator interface and parse one object at a time.
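A sketch of that idea against the questioner's own loop, assuming `Client_Model_Entity` and `getAttributes()` behave as shown in the question; the `EntityIterator` class itself is hypothetical. The point is that only the current row is hydrated into a full entity, so peak memory stays near one entity (~500KB) instead of all 2200:

```php
<?php
// Hypothetical Iterator sketch: hydrate one entity per step and let it
// go out of scope before the next iteration, keeping memory flat.

class EntityIterator implements Iterator
{
    private int $position = 0;

    public function __construct(private array $rowset) {}

    public function current(): Client_Model_Entity
    {
        // Only the current row becomes a full-blown entity object.
        $entity = new Client_Model_Entity($this->rowset[$this->position]);
        $entity->getAttributes();
        return $entity;
    }

    public function key(): int { return $this->position; }
    public function next(): void { ++$this->position; }
    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return isset($this->rowset[$this->position]); }
}

// Usage: each $entity is eligible for garbage collection as soon as the
// loop advances, provided nothing (like $this->_rows) keeps a reference.
foreach (new EntityIterator($rowset) as $i => $entity) {
    // ...inspect $entity's attributes for the sorting/filtering pass...
}
```

Note this only helps if the sorting pass can work on a reduced projection (e.g., collecting just the sort keys per entity) rather than retaining every hydrated object.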
