Wednesday, October 14, 2009

GParallelizer Performance

GParallelizer is a Groovy wrapper for the new Java concurrency library. It lets you perform list and map operations on parallel threads, which in theory leverages the full power of multiprocessor machines. Here I want to check whether that holds in practice. I ran the following tests on my dual-core MacBook:

import static org.gparallelizer.Parallelizer.*
import org.gparallelizer.ParallelEnhancer
import org.junit.Before
import org.junit.Test

class GParsTest {

    def list = []

    @Before void setUp() {
        1000000.times {
            list << (float) Math.random()
        }
    }

    @Test void sequential() {
        def start = System.currentTimeMillis()
        list.findAll { it < 0.4 }
        def duration = System.currentTimeMillis() - start

        println "Sequential: ${duration}ms"
    }

    @Test void parallel_with_enhancer() {
        ParallelEnhancer.enhanceInstance list

        def start = System.currentTimeMillis()
        list.findAllAsync { it < 0.4 }
        def duration = System.currentTimeMillis() - start

        println "Parallel with enhancer: ${duration}ms"
    }

    @Test void parallel_with_parallelizer_2() {
        parallelWithParallelizer 2
    }

    @Test void parallel_with_parallelizer_3() {
        parallelWithParallelizer 3
    }

    @Test void parallel_with_parallelizer_5() {
        parallelWithParallelizer 5
    }

    @Test void parallel_with_parallelizer_10() {
        parallelWithParallelizer 10
    }

    def parallelWithParallelizer(threads) {
        def start = System.currentTimeMillis()
        withParallelizer(threads) {
            list.findAllAsync { it < 0.4 }
        }
        def duration = System.currentTimeMillis() - start

        println "Parallel with parallelizer (${threads}): ${duration}ms"
    }
}

And here is the output:

Sequential: 774ms
Parallel with enhancer: 9311ms
Parallel with parallelizer (2): 1785ms
Parallel with parallelizer (3): 769ms
Parallel with parallelizer (5): 500ms
Parallel with parallelizer (10): 722ms

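Something strange happened with the mixed-in ParallelEnhancer, which was about 12 times slower than the sequential run, but with Parallelizer performance did improve once the pool was large enough. With 2 threads the parallel run was more than twice as slow as the sequential one, and with 3 threads it only broke even. With the best pool size I tried (5 threads), parallel processing took about 35% less time than sequential: 500ms versus 774ms.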
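To make the comparison explicit, here is a small sketch that recomputes the relative time saved for each pool size from the timings printed above. The helper name percentSaved is mine, not part of GParallelizer, and the numbers are simply copied from the output.

```groovy
// Timings copied from the test output above, in milliseconds.
def sequentialMs = 774
def parallelMs = [2: 1785, 3: 769, 5: 500, 10: 722]

// Hypothetical helper: percentage of the sequential time that was saved.
// A negative value means the parallel run was slower than the sequential one.
def percentSaved = { seq, par -> (seq - par) / seq * 100 }

parallelMs.each { threads, ms ->
    def saved = Math.round(percentSaved(sequentialMs, ms) as double)
    println "${threads} threads: ${saved}% less time than sequential"
}
```

Only the 5-thread run shows a clear gain, at roughly 35%. The 2-thread run comes out strongly negative.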

Conclusion: use GPars methods if you need to process a large amount of data. Try different configuration parameters, such as the thread pool size, to find the best setup for your particular problem.
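As a rough illustration of trying different pool sizes, the sketch below sweeps a few thread counts over the same kind of filter. It deliberately uses a plain java.util.concurrent fixed thread pool instead of GParallelizer, so it runs without the library. The method parallelFindAll, the list size, and the pool sizes are my own assumptions for illustration, not the library's implementation.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical helper (not GParallelizer API): filters `items` on a fixed pool
// of `threads` workers, one contiguous chunk per task, keeping the input order.
def parallelFindAll(List items, int threads, Closure predicate) {
    def pool = Executors.newFixedThreadPool(threads)
    try {
        int chunkSize = Math.max(1, (int) Math.ceil(items.size() / (double) threads))
        def tasks = items.collate(chunkSize).collect { chunk ->
            ({ chunk.findAll(predicate) } as Callable)
        }
        // invokeAll returns futures in task order, so the chunks stay ordered.
        pool.invokeAll(tasks).collectMany { it.get() }
    } finally {
        pool.shutdown()
    }
}

// Smaller data set than in the post, with a fixed seed for repeatable input.
def random = new Random(42)
def data = (1..200000).collect { random.nextFloat() }
def expected = data.findAll { it < 0.4 }

[1, 2, 3, 5, 10].each { threads ->
    parallelFindAll(data, threads) { it < 0.4 }   // warm-up run, not timed

    long start = System.currentTimeMillis()
    def result = parallelFindAll(data, threads) { it < 0.4 }
    long duration = System.currentTimeMillis() - start

    assert result == expected   // the parallel result must match the sequential one
    println "Fixed pool (${threads}): ${duration}ms"
}
```

The timings will differ from machine to machine, so treat the printed numbers as a starting point for your own measurements rather than a ranking.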

Resources

• Brian Goetz on new concurrency library
• Vaclav Pech on GPars
