Friday, August 28, 2009

MPJ Express on Infiniband

Courtesy: Zafar Gillani, who worked on porting MPJ Express to InfiniBand.

MPJ Express can be executed on InfiniBand in a couple of ways (so far):

Use IPoIB: For this, simply list in the "machines" file the IP addresses assigned to the InfiniBand HCAs, then run MPJ Express exactly as you would on GigE or FE.

Over SDP: First create a configuration file somewhere on the system (for example in the ~/mpj-user/ directory). Let's say we created "sdp.conf". There are two approaches: write either a bind rule or a connect rule in the "sdp.conf" file (as explained below). Comments are introduced with a # sign.

example: bind <hca-ip-address> *

example: connect <hca-ip-address> *

The * in the port field means any free port can be used at runtime. The port can also be given as a range, such as 15000-*. A separate bind rule has to be defined for each compute node that has a configured IB HCA, so the file ends up looking rather like a machines file:

bind <node1-hca-ip> *
bind <node2-hca-ip> *
bind <node3-hca-ip> *
and so on

When to use a bind rule? The SDP transport is used when a TCP socket binds to an address and port that match the rule. In Java this corresponds to socket.bind(SocketAddress). The bind rule is recommended, since it explicitly binds an unbound socket to an IP address using the SDP protocol.

When to use a connect rule? The SDP transport is used when an unbound TCP socket attempts to connect to an address and port that match the rule.
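Putting the two kinds of rules together, a complete sdp.conf might look like the sketch below. The addresses are placeholders, and the rule syntax follows the JDK's com.sun.sdp.conf format:

```
# Use SDP when a socket binds to this local IB HCA address, on any free port
bind 192.168.10.1 *

# Use SDP when an unbound socket connects to this address on ports 15000 and up
connect 192.168.10.2 15000-*
```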

To execute MPJ Express over SDP, pass the configuration file to the JVM when launching, for example: mpjrun.sh -Dcom.sun.sdp.conf=sdp.conf -np 2 Application.

The -Djava.net.preferIPv4Stack=true switch is optional but recommended, since it explicitly tells the JVM to use IPv4. This prevents Java from using IPv6 if IPv6 is enabled on the IB HCAs (InfiniBand Host Channel Adapters, analogous to NICs).

Tuesday, July 14, 2009

Nested Parallelism Using MPJ Express

These are the steps to write a nested parallel Java application using MPJ Express and Java OpenMP (JOMP). JOMP is available from EPCC.

Step 1: Write

  • Write HybridApp.jomp. Here we rely on MPJ Express for node-level parallelism and JOMP for thread-level parallelism.
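A minimal sketch of what HybridApp.jomp might look like, assuming the standard JOMP directive syntax (//omp parallel) and the jomp.runtime.OMP runtime class; the exact original source is not reproduced here:

```java
import mpi.MPI;

public class HybridApp {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                       // MPJ Express: node-level parallelism
        int rank = MPI.COMM_WORLD.Rank();

        //omp parallel
        {
            // JOMP translates this block; each thread reports its own id
            int tid = jomp.runtime.OMP.getThreadNum();
            System.out.println("Hello from process <" + rank + ":" + tid + ">");
        }

        MPI.Finalize();
    }
}
```

With 2 processes and -Djomp.threads=2, each process prints two lines, one per thread, as in the output shown in Step 3.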

Step 2: Translate and Compile

Once we have written our application, we need to translate the .jomp file to a .java file, which is then compiled with the Java compiler.

  • The code is translated from .jomp to .java by using the command:

    aamir@barq:~/tmp/jomp> java -cp $MPJ_HOME/lib/mpj.jar:jomp1.0b.jar jomp.compiler.Jomp HybridApp

  • Compile the Java application:

    aamir@barq:~/tmp/jomp> javac -cp $MPJ_HOME/lib/mpj.jar:.:jomp1.0b.jar HybridApp.java

Step 3: Execute

  • Write machines file:


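A machines file is simply a list of compute node hostnames (or IP addresses), one per line; for example (the names below are placeholders):

```
node1
node2
```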

  • Start MPJ Express daemons:

    aamir@barq:~/tmp/jomp> mpjboot machines

  • Run the hybrid code:

    aamir@barq:~/tmp/jomp> mpjrun.sh -wdir ~/tmp/jomp -cp $MPJ_HOME/lib/mpj.jar:.:jomp1.0b.jar -np 2 -Djomp.threads=2 -dport 11050 HybridApp

    Hello from process <1:0>
    Hello from process <1:1>
    Hello from process <0:0>
    Hello from process <0:1>

  • Halt MPJ Express daemons:

    aamir@barq:~/tmp/jomp> mpjhalt machines

Tuesday, May 12, 2009

Communicating Multi-dimensional arrays using MPJ Express

Many scientific applications use multi-dimensional arrays for storing data. Naturally, their parallel versions frequently need to communicate these multi-dimensional arrays.

Currently MPJ Express "directly" supports communicating basic datatypes only to and from single-dimension arrays, but it is still possible to communicate multi-dimensional arrays. Here we'll see how 2D arrays can be communicated using the MPJ Express software.

There are mainly two ways of communicating 2D arrays. The first is to communicate the data using the MPI.OBJECT datatype. The second is to map (or flatten) the 2D array onto a 1D array and communicate it normally. If you are looking for performance, the second option is recommended: the first is severely hampered by the cost of Java's object serialization.

The MultidimMatrix class communicates data from the 2D array using the MPI.OBJECT datatype.
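A minimal sketch of this first approach, using the mpiJava-style API that MPJ Express implements (the matrix size, tag, and initialization are illustrative; the original listing is not reproduced here):

```java
import mpi.MPI;

public class MultidimMatrix {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        final int N = 4;

        if (rank == 0) {
            double[][] a = new double[N][N];
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    a[i][j] = i * N + j;
            // the whole 2D array travels as a single serialized object
            MPI.COMM_WORLD.Send(new Object[] { a }, 0, 1, MPI.OBJECT, 1, 99);
        } else if (rank == 1) {
            Object[] buf = new Object[1];
            MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.OBJECT, 0, 99);
            double[][] a = (double[][]) buf[0];
            System.out.println("received " + a.length + " rows");
        }

        MPI.Finalize();
    }
}
```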

The MultiAsSingleDimMatrix class stores the 2D array as a 1D array. By doing this, the data can be communicated using the datatype of the array---in our example, the MPI.DOUBLE datatype.
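A sketch of the second approach, again with illustrative sizes and names. Element (i, j) of the logical N x N matrix lives at index i*N + j of the flat array, so the data can be sent with MPI.DOUBLE and no serialization:

```java
import mpi.MPI;

public class MultiAsSingleDimMatrix {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        final int N = 4;

        if (rank == 0) {
            double[] a = new double[N * N];
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    a[i * N + j] = i * N + j;   // (i,j) -> i*N + j
            MPI.COMM_WORLD.Send(a, 0, N * N, MPI.DOUBLE, 1, 99);
        } else if (rank == 1) {
            double[] a = new double[N * N];
            MPI.COMM_WORLD.Recv(a, 0, N * N, MPI.DOUBLE, 0, 99);
            System.out.println("a[1][2] = " + a[1 * N + 2]);
        }

        MPI.Finalize();
    }
}
```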

Friday, May 8, 2009

Overhead of using Multi-dimensional arrays in Java

As discussed many times in the related literature, Java's multi-dimensional arrays introduce a fair amount of overhead because of the way they are stored in memory: a 2D array is an array of row arrays, each a separate heap object, so every access pays an extra indirection. The purpose of this post is to quantify this performance overhead.

Let's first look at the matrix multiplication implementation that uses the multidimensional arrays (the 2D version).
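The original listing is not shown; a sketch of the kind of 2D-array matrix multiply being timed (the size and initialization values are illustrative):

```java
public class MultidimMatrix {
    // classic triple loop over 2D arrays: c = a * b, all n x n
    static double[][] multiply(double[][] a, double[][] b, int n) {
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    public static void main(String[] args) {
        int n = 1000;
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { a[i][j] = 1.0; b[i][j] = 2.0; }

        long start = System.nanoTime();
        double[][] c = multiply(a, b, n);
        System.out.println("time => " + (System.nanoTime() - start) / 1e9);
    }
}
```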

Now let's look at the same code implemented with a single-dimension array: the two-dimensional array is mapped onto a one-dimensional array (the 1D version).
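A corresponding sketch of the 1D version, with the same illustrative size, where element (i, j) is stored at index i*n + j:

```java
public class Matrix {
    // same triple loop, but each n x n matrix is flattened: (i,j) -> i*n + j
    static double[] multiply(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aik * b[k * n + j];
            }
        return c;
    }

    public static void main(String[] args) {
        int n = 1000;
        double[] a = new double[n * n];
        double[] b = new double[n * n];
        java.util.Arrays.fill(a, 1.0);
        java.util.Arrays.fill(b, 2.0);

        long start = System.nanoTime();
        double[] c = multiply(a, b, n);
        System.out.println("time -> " + (System.nanoTime() - start) / 1e9);
    }
}
```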

Here are some performance results:

aamir@barq:~/tmp> java MultidimMatrix
time => 19.821604495
aamir@barq:~/tmp> java Matrix
time -> 11.928243662

The 1D code is 1.66 times faster than the 2D code.

Tuesday, May 5, 2009

Parallel Programming with Java

I recently gave a talk on "Parallel Programming with Java" to Masters students at the University of La Coruna, Spain.

This talk gives a good introduction on how to get started with the MPJ Express software.

Monday, May 4, 2009

Ports Used by the MPJ Express Software

There are three kinds of ports used by the MPJ Express software.
  1. Daemon ports (where MPJ Express daemons listen on compute nodes). The value can be changed in two steps:
  • Edit $MPJ_HOME/conf/wrapper.conf and search for the daemon port property. Change it to whatever you want - let's call it X.
  • Now, when you start your parallel application, use the -dport switch to specify X. Something like: ... -dport X ...
  2. Ports used by the MPJ Express runtime on the head node. These are used to ship the code across to the compute nodes. The default value is 15000. It can be changed using the -sport Y switch, where `Y' is the port that you choose.
  3. Each MPJ Express process uses a port for communication with its peers. This can be changed using the -mpjport Z switch; the default value is 20000.
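Putting the three switches together, a launch line might look like this (mpjrun.sh is the usual MPJ Express launcher; the port values and application name are illustrative):

```
# daemon port (X), head-node runtime port (Y), and inter-process port (Z)
mpjrun.sh -np 4 -dport 11050 -sport 15000 -mpjport 20000 MyApp
```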