Concurrency and the Calendar class

Posted on 2024-11-16 20:36:17

I have a task class (AnalyzeTree, which implements Runnable) organised around a hash map (ConcurrentMap<Double, List<Coordinates>> slicesMap). The class goes through the data (called trees here) in a large text file and parses the geographical coordinates from it into the map. The idea is to process one tree at a time and add or grow the values according to the key (which is just a Double value representing time).

The relevant part of code looks like this:

// grow map entry if key exists
if (slicesMap.containsKey(sliceTime)) {

    double[] imputedLocation = imputeValue(
        location, parentLocation, sliceHeight,
        nodeHeight, parentHeight, rate,
        useTrueNoise, currentTreeNormalization,
        precisionArray);

    slicesMap.get(sliceTime).add(new Coordinates(imputedLocation[1], imputedLocation[0], 0.0));

    // start new entry if no such key in the map
} else {

    List<Coordinates> coords = new ArrayList<Coordinates>();

    double[] imputedLocation = imputeValue(
        location, parentLocation, sliceHeight,
        nodeHeight, parentHeight, rate,
        useTrueNoise, currentTreeNormalization,
        precisionArray);

    coords.add(new Coordinates(imputedLocation[1], imputedLocation[0], 0.0));

    slicesMap.putIfAbsent(sliceTime, coords);
    // slicesMap.put(sliceTime, coords);

}// END: key check

The class is called like this (executor is ExecutorService executor = Executors.newFixedThreadPool(NTHREDS)):

mrsd = new SpreadDate(mrsdString);
int readTrees = 1;
while (treesImporter.hasTree()) {

    currentTree = (RootedTree) treesImporter.importNextTree();

    executor.submit(new AnalyzeTree(currentTree,
        precisionString, coordinatesName, rateString,
        numberOfIntervals, treeRootHeight, timescaler,
        mrsd, slicesMap, useTrueNoise));

    // new AnalyzeTree(currentTree, precisionString,
    // coordinatesName, rateString, numberOfIntervals,
    // treeRootHeight, timescaler, mrsd, slicesMap,
    // useTrueNoise).run();

    readTrees++;

}// END: while has trees

This runs into trouble when executed in parallel (the commented-out part, which runs sequentially, is fine). I thought it might throw a ConcurrentModificationException, but apparently the problem is in mrsd (an instance of SpreadDate, which is simply a class for date-related calculations).

The SpreadDate class looks like this:

public class SpreadDate {

    private Calendar cal;
    private SimpleDateFormat formatter;
    private Date stringdate;

    public SpreadDate(String date) throws ParseException {

        // if no era specified assume current era
        String line[] = date.split(" ");
        if (line.length == 1) {
            StringBuilder properDateStringBuilder = new StringBuilder();
            date = properDateStringBuilder.append(date).append(" AD").toString();
        }

        formatter = new SimpleDateFormat("yyyy-MM-dd G", Locale.US);
        stringdate = formatter.parse(date);

        cal = Calendar.getInstance();
    }

    public long plus(int days) {
        cal.setTime(stringdate);
        cal.add(Calendar.DATE, days);
        return cal.getTimeInMillis();
    }// END: plus

    public long minus(int days) {
        cal.setTime(stringdate);
        cal.add(Calendar.DATE, -days); //line 39
        return cal.getTimeInMillis();
    }// END: minus

    public long getTime() {
        cal.setTime(stringdate);
        return cal.getTimeInMillis();
    }// END: getDate
}

This is the stack trace from when the exception is thrown:

java.lang.ArrayIndexOutOfBoundsException: 58
at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:454)
at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2098)
at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2013)
at java.util.Calendar.setTimeInMillis(Calendar.java:1126)
at java.util.GregorianCalendar.add(GregorianCalendar.java:1020)
at utils.SpreadDate.minus(SpreadDate.java:39)
at templates.AnalyzeTree.run(AnalyzeTree.java:88)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

If I move the part initializing mrsd into the AnalyzeTree class, it runs without any problems. However, it is not very memory efficient to initialize the class each time this thread runs, hence my concern. How can this be remedied?

Comments (2)

鹿港小镇 2024-11-23 20:36:17

Calendar and Date are two examples of mutable classes. You seem to be sharing them across the tasks run by the ExecutorService, and the results, as you can see, are unexpected. I would suggest creating a new instance of the SpreadDate object for each thread.
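
As a rough illustration of the "one instance per thread" idea, here is a minimal sketch in which every submitted task builds its own date helper, so no Calendar is ever shared. DateOffsets and PerTaskInstanceDemo are hypothetical stand-ins, not the asker's classes, and the UTC time zone and pool size of 4 are assumptions made only to keep the example deterministic and self-contained.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.TimeZone;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for SpreadDate: it owns a mutable Calendar,
// so a single instance must not be used by several threads at once.
class DateOffsets {
    private final Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
    private final long baseMillis;

    DateOffsets(long baseMillis) {
        this.baseMillis = baseMillis;
    }

    long minus(int days) {
        cal.setTimeInMillis(baseMillis);
        cal.add(Calendar.DATE, -days);
        return cal.getTimeInMillis();
    }
}

class PerTaskInstanceDemo {

    // Submits `tasks` jobs; job i computes "base date minus i days".
    static List<Long> run(final long baseMillis, int tasks) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (int i = 0; i < tasks; i++) {
                final int days = i;
                futures.add(executor.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        // Each task creates its own helper, so the Calendar
                        // inside it is confined to the thread running the task.
                        return new DateOffsets(baseMillis).minus(days);
                    }
                }));
            }
            List<Long> results = new ArrayList<Long>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(0L, 5));
    }
}
```

In the code from the question, the equivalent change would presumably be to construct the SpreadDate inside each AnalyzeTree (or pass mrsdString instead of mrsd), which is what the asker reports already works.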

眸中客 2024-11-23 20:36:17

As mentioned in the previous answers, classes like Calendar and SimpleDateFormat are not thread-safe, so you cannot access them concurrently from multiple threads. (JavaDocs often specify explicitly which classes are thread-safe and which aren't.)

One option is to create different instances for different threads (in your case, different instances of SpreadDate).

Another option is to use Java's ThreadLocal mechanism. It allows creating an instance per thread - if the same thread performs several tasks, it will use the same instance over and over again. This can provide a nice balance - your code is thread-safe, but you're not allocating massive amounts of objects and not waiting on synchronization.

As with any optimization, I suggest considering carefully if you actually need it - judging by the code above, I'm not sure you have much to gain. Should you choose to use it, it would look something like:

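import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class SpreadDate {

    // One Calendar per thread, created lazily on first use. It is initialized
    // once here rather than in the constructor, so that constructing another
    // SpreadDate does not replace the shared static field.
    private static final ThreadLocal<Calendar> calThreadLocal = new ThreadLocal<Calendar>() {

        @Override
        protected Calendar initialValue() {
            return Calendar.getInstance();
        }
    };
    private SimpleDateFormat formatter;
    private Date stringdate;

    public SpreadDate(String date) throws ParseException {
        // ...skipped...
    }

    public long plus(int days) {
        // Fetch the Calendar that belongs to the calling thread
        Calendar cal = calThreadLocal.get();
        cal.setTime(stringdate);
        cal.add(Calendar.DATE, days);
        return cal.getTimeInMillis();
    }// END: plus

    // ...skipped...
}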
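
To show how the ThreadLocal variant might be used, the sketch below shares a single helper object across all tasks of a fixed thread pool. SharedDate and ThreadLocalDemo are hypothetical names, and the UTC time zone, epoch-millisecond base date, and pool size of 4 are assumptions chosen only to keep the example self-contained. It is not the asker's SpreadDate.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.TimeZone;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper that is safe to share: its only mutable state
// (the Calendar) lives in a ThreadLocal, one copy per worker thread.
class SharedDate {
    private static final ThreadLocal<Calendar> CAL = new ThreadLocal<Calendar>() {
        @Override
        protected Calendar initialValue() {
            return new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        }
    };

    private final long baseMillis; // immutable, so safe to read from any thread

    SharedDate(long baseMillis) {
        this.baseMillis = baseMillis;
    }

    long plus(int days) {
        Calendar cal = CAL.get(); // the calling thread's own Calendar
        cal.setTimeInMillis(baseMillis);
        cal.add(Calendar.DATE, days);
        return cal.getTimeInMillis();
    }
}

class ThreadLocalDemo {

    // Submits `tasks` jobs that all use the same SharedDate instance.
    static List<Long> run(int tasks) throws Exception {
        final SharedDate shared = new SharedDate(0L);
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (int i = 0; i < tasks; i++) {
                final int days = i;
                futures.add(executor.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        return shared.plus(days);
                    }
                }));
            }
            List<Long> results = new ArrayList<Long>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(5));
    }
}
```

With a fixed pool, at most one Calendar is created per worker thread no matter how many tasks are submitted, which is the balance between allocation and synchronization described above.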