Sharing static data outside a class

Posted on 2024-07-25 18:17:41

First, here is a motivating example:

public class Algorithm
{
    public static void compute(Data data)
    {
        List<Task> tasks = new LinkedList<Task>();
        Client client = new Client();
        int totalTasks = 10;

        for(int i = 0; i < totalTasks; i++)
            tasks.add(new Task(data));

        client.submit(tasks);
    }
}

// AbstractTask implements Serializable
public class Task extends AbstractTask
{
    private final Data data;

    public Task(Data data)
    {
        this.data = data;
    }

    public void run()
    {
        // Do some stuff with the data.
    }
}

I am doing some parallel programming and have a method that creates a large number of tasks. The tasks share the data they operate on, but I am having trouble giving each task a reference to that data: when the tasks are serialized, a copy of the data is made for each task. I could make a static reference to the data in the task class so that it is only stored once, but that doesn't really make sense in the context of the task class. My idea is to store the object as a static in another, external class and have the tasks request the object from that class. This could be set up before the tasks are sent, most likely in the compute method in the example above. Do you think this is appropriate? Can anyone offer alternative solutions or tips regarding this idea? Thanks!

Comments (5)

温柔戏命师 2024-08-01 18:17:41

Can you explain more about this serialization situation you're in? How do the Tasks report a result, and where does it go -- do they modify the Data? Do they produce some output? Do all tasks need access to all the Data? Are any of the Tasks written to the same ObjectOutputStream?

Abstractly, I guess I can see two classes of solutions.

  1. If the Tasks don't all need access to all the Data, I would try to give each Task only the data that it needs.
  2. If they do all need all of it, then instead of having the Task contain the Data itself, I would have it contain an ID of some kind that it can use to get the data. How to get just one copy of the Data transferred to each place a Task could run, and give the Task access to it, I'm not sure, without better understanding the overall situation. But I would suggest trying to manage the Data separately.
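The second option, where a Task carries only an ID and looks the Data up separately, could be sketched roughly as follows. Everything here is hypothetical: `REGISTRY`, `dataId`, and the `int[]` payload are stand-ins, not part of the question's code. The sketch also assumes that each worker JVM populates its own registry by some other means, since that transfer step is exactly the part left open above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DataRegistryDemo {
    // Hypothetical per-JVM registry. It is NOT shared between JVMs:
    // every machine that runs tasks has to load the data into it itself.
    static final Map<String, int[]> REGISTRY = new ConcurrentHashMap<>();

    // Stand-in for the question's Task: it holds a small ID, not the data.
    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        final String dataId;

        Task(String dataId) {
            this.dataId = dataId;
        }

        // Looks the data up at run time and sums it as placeholder work.
        int run() {
            int[] data = REGISTRY.get(dataId);
            if (data == null) {
                throw new IllegalStateException("No data registered for " + dataId);
            }
            int sum = 0;
            for (int v : data) {
                sum += v;
            }
            return sum;
        }
    }

    // Returns the number of bytes the default serialized form occupies.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        REGISTRY.put("dataset-1", new int[100_000]);
        Task task = new Task("dataset-1");
        // The task stays small no matter how large the registered data is.
        System.out.println(serializedSize(task) < 1000);
        System.out.println(task.run());
    }
}
```

The trade-off is that a task is no longer self-contained: it fails at run time if the data was never registered on the machine where it executes.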
再浓的妆也掩不了殇 2024-08-01 18:17:41

I'm not sure I fully understand the question, but it sounds to me as though Tasks are actually serialized for later execution.

If this is the case, an important question would be whether all of the Task objects are written to the same ObjectOutputStream. If so, the Data will only be serialized the first time it is encountered. Later "copies" will just reference the same object handle from the stream.

Perhaps one could take advantage of that to avoid static references to the data (which can cause a number of problems in OO design).
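A minimal sketch of that behavior, assuming all tasks really can be written to one ObjectOutputStream (the `Data` and `Task` classes below are simplified stand-ins for the ones in the question, and the in-memory byte array stands in for whatever transport is actually used):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SharedStreamDemo {
    // Simplified stand-ins for the question's Data and Task classes.
    static class Data implements Serializable {
        private static final long serialVersionUID = 1L;
        final int[] values = new int[1000];
    }

    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        final Data data;

        Task(Data data) {
            this.data = data;
        }
    }

    // Writes every task to ONE stream, then reads them all back.
    static List<Task> roundTrip(List<Task> tasks) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (Task t : tasks) {
                    out.writeObject(t);
                }
            }
            List<Task> copies = new ArrayList<>();
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                for (int i = 0; i < tasks.size(); i++) {
                    copies.add((Task) in.readObject());
                }
            }
            return copies;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Data data = new Data();
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            tasks.add(new Task(data));
        }
        List<Task> copies = roundTrip(tasks);
        // The Data was written once; later tasks carry a back-reference,
        // so every deserialized task points at the same Data instance.
        System.out.println(copies.get(0).data == copies.get(9).data); // true
    }
}
```

Note that calling `reset()` on the stream, or opening a new stream per task, discards the handle table and brings the per-task copies back.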

如日中天 2024-08-01 18:17:41

Edit: The answer below is not actually relevant, due to a misunderstanding about what was being asked. Leaving it here pending more details from the question's author.


This is precisely why the transient keyword was invented.

"Declares that an instance field is not part of the default serialized form of an object. When an object is serialized, only the values of its non-transient instance fields are included in the default serial representation. When an object is deserialized, transient fields are initialized only to their default value."

public class Task extends AbstractTask {
    private final transient Data data;

    public Task(Data data) {
        this.data = data;
    }

    public void run() {
        // Do some stuff with the data.
    }
}
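
One consequence worth seeing concretely: after deserialization a transient field is simply null, so the receiving side has to obtain the data some other way. The sketch below uses a simplified stand-in `Task` (an `int[]` payload plus an `id` field added for contrast; neither is from the question) and an in-memory round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Simplified stand-in for the Task above.
    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        final transient int[] data; // excluded from the default serialized form
        final int id;               // serialized normally, for contrast

        Task(int[] data, int id) {
            this.data = data;
            this.id = id;
        }
    }

    // Serializes the task to a byte array and reads it back.
    static Task roundTrip(Task task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Task) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Task copy = roundTrip(new Task(new int[] {1, 2, 3}, 7));
        System.out.println(copy.data == null); // true: the transient field was skipped
        System.out.println(copy.id);           // 7
    }
}
```

Because the field is also final, it cannot be reassigned after deserialization by ordinary code, so in practice the task would need a non-final field or a lookup at run time.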
め七分饶幸 2024-08-01 18:17:41

Have you considered making a singleton instead of making it static?

看海 2024-08-01 18:17:41

"My idea is to store the object as a static in another external class and have the tasks request the object from the class."

Forget about this idea. When the tasks are serialized and sent over the network, that object will not be sent; static data is not (and cannot be) shared in any way between JVMs.

Basically, if your Tasks are serialized separately, the only way to share the data is to send it separately, or send it only in one task and somehow have the others acquire it on the receiving machine. This could happen via a static field that the one task that has the data sets and the others query, but of course that requires that one task to be run first. And it could lead to synchronization problems.
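
The "one task carries the data, the others acquire it" variant might look roughly like the sketch below. It is only an illustration under assumptions: the static holder is a `CompletableFuture` rather than a plain field (a swapped-in technique, so that waiting tasks block instead of reading null), the names `SHARED`, `publish`, and `sumWhenAvailable` are hypothetical, and a local thread pool stands in for the receiving machine.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class CarrierTaskDemo {
    // Hypothetical per-JVM holder, completed by whichever task carries the data.
    static final CompletableFuture<int[]> SHARED = new CompletableFuture<>();

    // Called by the single "carrier" task once it arrives with the data.
    static void publish(int[] data) {
        SHARED.complete(data);
    }

    // Called by every other task: blocks until the carrier has published.
    static int sumWhenAvailable() {
        int[] data = SHARED.join();
        int sum = 0;
        for (int v : data) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        // Two threads, so the waiting task cannot starve the carrier task.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Integer> waiter = CarrierTaskDemo::sumWhenAvailable;
            Runnable carrier = () -> publish(new int[] {1, 2, 3});

            Future<Integer> result = pool.submit(waiter); // submitted first, must wait
            pool.submit(carrier);
            System.out.println(result.get(5, TimeUnit.SECONDS)); // 6
        } finally {
            pool.shutdown();
        }
    }
}
```

The ordering problem is visible here: with a single worker thread the waiting task would occupy it and the carrier would never run, which is one form of the synchronization trouble mentioned above.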

But actually, it sounds like you are using some sort of processing queue that assumes tasks to be self-contained. By trying to have them share data, you are going against that concept. How big is your data anyway? Is it really absolutely necessary to share the data?
