Friday, December 14, 2012

On Shaping Arrays

The best thing about Fortran is its handling of arrays, in particular the multi-dimensional ones. As someone who models astrophysical fluid flows for a living, I tend use 4-dimensional arrays (3 for space and 1 for variables) in most of my relevant codes.

In defining arrays, there are three methods by which we can use to define them:
   (1) nested do loops
   (2) forall assignment
   (3) nested implied-do loops

I presume every Fortran user knows of the first of these methods very well, as it is the most used and most versatile method of defining an array. I think most people have heard of the forall assignment, which was once part of the now-defunct High Performance Fortran but added to Fortran 95 update. It is used as follows,
forall(i=1:iMax, j=1:jMax, k=1:kMax) A(i,j,k)=i*k+j/jMax
I can pass the i,j,k indices to a function that returns a value, so it can work much like the nested do loop. The original point of the forall assigment was for optimization in defining arrays, but since most compilers are pretty good at optimizing loops for memory purposes, the use of this command is declining. For an array of size (750,750,750), it takes about half a second to define the array with the above definition using either of the two methods.

The last method, the nested implied do loop, is probably about as useful as the forall assignment, but it introduces the command reshape, which has some useful applications for clusters and file I/O. Suppose we have a 2D array defined in the nested do loop,
do j=1,jMax
   do i=1,iMax
     B(i,j) = i*j
we can collapse the inner loop as,
do j=1,jMax
   B(1:iMax,j) = (/ (i*j, i=1,iMax) /)
But in collapsing this other do loop,
B =(/ ((/ (i*j, i=1,iMax) /), j=1,jMax) /)
we would get an "incompatible ranks" error. This is because B is defined as a 2-dimensional array, B(iMax,jMax), and the above doubly-collapsed loop is a 1-dimensional array with dimensions of (iMax*jMax). To get into the 2D array from the 1D array, we can use the reshape command. For ease, I have written the (iMax*jMax) array as a separate line:
tempArray=(/ ((/ (i*j, i=1,iMax) /), j=1,jMax) /)
B =reshape( tempArray, (/iMax,jMax/) )
Extending this to a 3D array should be straight-forward.

Nested do loops should be the primary method for defining your arrays as it is the most clear method (you just have to remember that Fortran is column-major). The forall assignment, while sometimes convenient, just is not as useful as it was thought to be, probably should be dropped from codes that currently employ it, a should be marked as obsolescent in future updates to Fortran. The nested implied do loops probably should not be used for defining arrays, but it is useful for understanding how reshape works for use in other areas of codes.

Comments are always welcome!

No comments:

Post a Comment