Wednesday, September 1, 2010

2-D Array Memory Layout

Two kinds of multi-dimensional arrays.

There are two basic ways of implementing 2-dimensional arrays: rectangular, sequential, two-dimensional arrays and arrays of arrays. Some languages use one method, some another, and some both. C++ allows both methods, and each has its advantages and disadvantages.
  • rectangular sequential arrays. In this case the rows are laid out sequentially in memory. This provides simplicity in memory management and a slight speed advantage. This is the kind of array C/C++ creates by default.
  • Arrays of arrays. In this style each element of a one-dimensional array is a pointer to another array. This is a common way to create an array of C-strings (a zero-terminated array of characters). This is also useful if the rows are uneven length or are dynamically allocated. The disadvantage is that memory management is more complex, often requiring dynamic allocation and deallocation. This is the only kind of multi-dimensional array in Java. See Arrays of Arrays.

Rectangular arrays allocated in memory by row

Let's see how the memory is allocated for this array.
char ttt[3][3] = {{'x', 'x', 'o'},
{'o', 'o', 'x'},
{'x', 'o', ' '}
};
The memory for this array could be visualized as in the diagram to the right, which identifies a few cells by their subscripts.
2D array memory representation
2D array memory representation
Because memory is addressed linearly, a better representation is like the diagram to the left.

Computing the address of an array element

C++ must compute the memory address of each array element that it accesses. C++ does this automatically, but it helps to understand what's going on "under the hood". Assume the following declaration:
char a[ROWS][COLS];  // assume ROWS and COLS are const ints
Because arrays are laid out in memory by row, each row length is COLS (the number of columns is the size of a row). Let's assume that you want to find the address of a[r][c]. The baseAddress of the array is the address of the first element. The rowSize is COLS in the above example. The elementSize is the number of bytes required to represent the data (typically 1 for char, 4 for int, 4 for float, and 8 for double.
address = baseAddress + elementSize * (r*rowSize + c);
Note
  • The number of rows (ROWS) is not used in the computation.
  • Because the number of rows is not used, there is no need to pass it when declaring a formal array parameter for a two-dimension array.

No comments:

Post a Comment