3

I want to perform parallel calculation for several sets of input values. Do I need to synchronize the calculate(a, b, inputIndex) method?

private static final String FORMULA = "(#{a} + #{b}) * (#{a} + #{b} * #{b} - #{a})";
private List<Pair<Integer, Integer>> input = Arrays.asList(
        new ImmutablePair<>(1, 2),
        new ImmutablePair<>(2, 2),
        new ImmutablePair<>(3, 1),
        new ImmutablePair<>(4, 2),
        new ImmutablePair<>(1, 5)
);
private List<String> output = new ArrayList<>(Arrays.asList("", "", "", "", ""));

public void calculate() {
    IntStream.range(0, input.size()).forEach(idx -> {
        Pair<Integer, Integer> pair = input.get(idx);

        Thread threadWrapper = new Thread(
                () -> this.calculate(pair.getLeft(), pair.getRight(), idx)
        );
        threadWrapper.start();
    });

    try {
        Thread.sleep(4000); // waiting for threads to finish execution just in case
        System.out.println("Calculation result => " + output);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

private void calculate(Integer a, Integer b, int inputIndex) {
    System.out.println("Thread with index " + inputIndex + " started calculation.");
    Evaluator eval = new Evaluator();
    eval.putVariable("a", a.toString());
    eval.putVariable("b", b.toString());

    try {
        String result = eval.evaluate(FORMULA);
        Thread.sleep(3000);
        output.set(inputIndex, result);
        System.out.println("Thread with index " + inputIndex + " done.");
    } catch (EvaluationException | InterruptedException e) {
        e.printStackTrace();
    }
}

Because if the code of calculate method was inside run method of Runnable I wouldn't need to do that. (also I think I don't need synchronized collections there because for input I only get data by index and for output I put element into the certain position)

lapots
  • 12,553
  • 32
  • 121
  • 242
  • Why do you think `ArrayList.add` is thread-safe? Have you read the documentation for [`ArrayList`](http://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html)? As a side note, `Runnable r = () -> {}; r.run();` doesn't start a new thread, so your code is single-threaded. https://stackoverflow.com/q/8579657/2891664 – Radiodef Jun 07 '17 at 19:56
  • 1
    @Radiodef, I got that after couple of runs. I know it is unsafe. Oh! My mistake I thought I used `set` method there. Fixed. – lapots Jun 07 '17 at 19:59
  • 1
    yes , you'r right . you don't need synchronize calculate as you do read operation and put by index (set) . also think about use thread pool for calculation. also you can just pass only idx as income param to calculate and get a and b inside calculate. – xyz Jun 07 '17 at 20:14
  • if you will use CompletableFuture.get() you can avoid Thread.sleep() – xyz Jun 07 '17 at 20:16
  • @sbjavateam you mean that `sleep` that inside `calculate(a, b, inputIndex)`? – lapots Jun 07 '17 at 20:21
  • yes , and for Thread.sleep(4000); // waiting for threads to finish execution just in case . looks like you use it to wait calulation result. – xyz Jun 07 '17 at 20:23
  • execution might finish in 1 sec but you should wait 3/4 , or 34 not enoguh (not shure that it's your case , but in common) time – xyz Jun 07 '17 at 20:25
  • Well I put random 4 value. I don't really care about consumed time to wait – lapots Jun 07 '17 at 20:30
  • 1
    Without some form of thread synchronization, the writes to `output` from one thread very well might not be visible to the reads from `output` from another thread. You at least should declare `output` as `final` and synchronize on that object. Read up on _happens-before_. – Lew Bloch Jun 07 '17 at 21:08
  • @LewBloch in that case isn't declaring output as volatile enough ? – Adonis Jun 08 '17 at 06:55
  • @LewBloch but does it have matter in my case? I set elements on the certain position so I do not really care about the state of other list of elements and the order in which the result was set – lapots Jun 08 '17 at 07:06
  • Just a side note: what’s the point of `new ArrayList<>(Arrays.asList("", "", "", "", ""))`? You can use either, `Arrays.asList("", "", "", "", "")` directly or `new ArrayList<>(Collections .nCopy(input.size(), ""))`, which is much cleaner. – Holger Jun 08 '17 at 12:53
  • Like I said, study up on _happens-before_. It does matter, @user1432980. – Lew Bloch Jun 08 '17 at 13:51

1 Answers1

2

It’s important to emphasize that trying the code and getting the correct output is not sufficient to prove the correctness of a program, especially not when multi-threading is involved. In your case it may happen to work by accident, for two reasons:

  • You have debug output statements, i.e. System.out.println(…), in your code, which is introducing thread synchronization, as in the reference implementation, PrintStream synchronizes internally

  • Your code is simple and doesn’t run long enough to get deeply optimized by the JVM

Obviously, both reasons may be gone if you use similar code in a production environment.


For getting a correct program, even changing calculate(Integer a, Integer b, int inputIndex) to a synchronized method is not sufficient. Synchronization is only sufficient for establishing happens-before relationships regarding threads synchronizing on the same object.

Your initiating method calculate() does not synchronize on the this instance and it also doesn’t perform any other action that could be sufficient to establish a happens-before relationship with the calculating threads (like calling Thread.join(). It only invokes Thread.sleep(4000), which obviously doesn’t even guaranty that the other threads completed within that time. Further, the Java Language Specification states explicitly:

It is important to note that neither Thread.sleep nor Thread.yield have any synchronization semantics. In particular, the compiler does not have to flush writes cached in registers out to shared memory before a call to Thread.sleep or Thread.yield, nor does the compiler have to reload values cached in registers after a call to Thread.sleep or Thread.yield.

For example, in the following (broken) code fragment, assume that this.done is a non-volatile boolean field:

while (!this.done)
    Thread.sleep(1000);

The compiler is free to read the field this.done just once, and reuse the cached value in each execution of the loop. This would mean that the loop would never terminate, even if another thread changed the value of this.done.

Note that what has been said in the example about this.done applies to the array elements of the backing array of your list as well. Weren’t you using immutable String instances, the effects could be even worse.


But it isn’t necessary to make the entire method synchronized, only the data exchange must be thread safe. The cleanest solution is to make the entire method side effect free, i.e. turn the signature to String calculate(Integer a, Integer b) and let the method return the result instead of manipulating a shared data structure. If the method is side effect free, it doesn’t need any synchronization.

The caller has to assemble the result values to a List then, but since you are already using the Stream API, this operation comes for free:

private static final String FORMULA = "(#{a} + #{b}) * (#{a} + #{b} * #{b} - #{a})";
private List<Pair<Integer, Integer>> input = Arrays.asList(
        new ImmutablePair<>(1, 2),
        new ImmutablePair<>(2, 2),
        new ImmutablePair<>(3, 1),
        new ImmutablePair<>(4, 2),
        new ImmutablePair<>(1, 5)
);

public void calculate() {
    List<String> output = input.parallelStream()
            .map(pair -> this.calculate(pair.getLeft(), pair.getRight()))
            .collect(Collectors.toList());
    System.out.println("Calculation result => " + output);
}

private String calculate(Integer a, Integer b) {
    System.out.println(Thread.currentThread()+" does calculation of ("+a+","+b+")");
    Evaluator eval = new Evaluator();
    eval.putVariable("a", a.toString());
    eval.putVariable("b", b.toString());

    try {
        String result = eval.evaluate(FORMULA);
        Thread.sleep(3000);
        System.out.println(Thread.currentThread()+" with ("+a+","+b+") done.");
        return result;
    } catch (EvaluationException | InterruptedException e) {
        throw new RuntimeException(e);
    }
}
Holger
  • 285,553
  • 42
  • 434
  • 765
  • this has caught my eye: `In particular, the compiler does not have to flush writes cached in registers out to shared memory`. Since when java-docs explain what a *particular CPU* architecture might do? Also `flush writes cached in registers out to shared memory` seems misleading if shared memory is RAM, since it is not this that happens in real cpus... I wonder why not the usual happens before explanations here – Eugene Jun 08 '17 at 18:25
  • @Eugene: this is not about a particular CPU architecture, it is you, who is taking the words in the narrow sense of a particular CPU architecture. Just consider “registers”==“unshared memory” and “shared memory==shared memory” and “flush”==“an explicit action inserted by the compiler”. It is a general pattern of almost every architecture to have faster unshared memory, be it registers, be it caches without coherence, or local memory nodes and the compiler will try to utilize this unshared memory as efficient as possible. Note how the cited text uses “cache” and “registers” synonymously. – Holger Jun 09 '17 at 08:05