As I promised before, not all of these lessons will result in pretty code, and this one, I believe, is the worst offender. Efficiency-wise, though, it is the most beneficial. So let's follow up on an example from the last lesson!
function C = batchMul(A,B)
% Arrange so that the sum of products (the inner-product step)
% will be taken along the rows.
A = permute(A,[2 1 4 3]);
B = permute(B,[1 4 2 3]);
% Replicate the dimensions such that the "resulting"
% matrix now resides in the 2nd and 3rd dimensions.
A = repmat(A,[1 1 size(B,3)]);
B = repmat(B,[1 size(A,2)]);
% Sum of products along the rows
C = sum(A.*B,1);
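For orientation, the intended call looks roughly like this (the sizes below are my own arbitrary choice; the last dimension indexes the batch):
A = rand(3,4,10);    % a batch of ten 3x4 matrices
B = rand(4,5,10);    % a batch of ten 4x5 matrices
C = batchMul(A,B);   % squeeze(C(1,:,:,k)) equals A(:,:,k)*B(:,:,k)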
Notice the matrix replication lines. Before discussing how to make them more efficient, let's take a short detour into the back-end of the Matlab/Octave platforms. While doing some MEX (Matlab EXecutable) programming back in the day, I noticed that every operation allocates fresh memory for its output matrix. Naturally, there are cases in which this is redundant. For example, take the simple addition operation
A = rand(100);
B = rand(100);
A = addTwoVars(A,B);
where addTwoVars is implemented by the following C source code:
#include <mex.h>
#include <matrix.h>

void mexFunction(int nOutput, mxArray ** Output, int nInput, const mxArray ** Input)
{
    // This example shows a per-element addition of two arbitrarily sized matrices
    // Input[0]  - Matrix A, "double" class
    // Input[1]  - Matrix B, "double" class
    // Output[0] - Per-element addition of matrices A and B. Same as writing A + B, "double" class

    // Verify that there are exactly two inputs to this function
    if(nInput != 2)
        mexErrMsgTxt("There must be exactly two inputs\n");
    // Verify that both are of "double" class
    if(!mxIsDouble(Input[0]) || !mxIsDouble(Input[1]))
        mexErrMsgTxt("Both inputs must be of 'double' class\n");
    // Verify that the number of dimensions of both matrices is the same
    if(mxGetNumberOfDimensions(Input[0]) != mxGetNumberOfDimensions(Input[1]))
        mexErrMsgTxt("Both inputs must consist of the same number of dimensions\n");
    // For each dimension, verify that the sizes of the two matrices match
    for(mwSize dimIdx = 0 ; dimIdx < mxGetNumberOfDimensions(Input[0]) ; dimIdx++)
        if(mxGetDimensions(Input[0])[dimIdx] != mxGetDimensions(Input[1])[dimIdx])
            mexErrMsgTxt("Dimension mismatch between inputs\n");
    // Verify that there is exactly one output
    if(nOutput != 1)
        mexErrMsgTxt("There must be one output argument");

    /***************
     * Start adding *
     ***************/
    // Get number of elements
    size_t nElements = mxGetNumberOfElements(Input[0]);
    // Create output matrix
    Output[0] = mxCreateNumericArray( mxGetNumberOfDimensions(Input[0]),
                                      mxGetDimensions(Input[0]),
                                      mxDOUBLE_CLASS,
                                      mxREAL);
    // Get pointers to all data arrays
    double *outData = mxGetPr(Output[0]),
           *A       = mxGetPr(Input[0]),
           *B       = mxGetPr(Input[1]);
    // Per-element addition
    for(size_t idx = 0 ; idx < nElements ; idx++)
        outData[idx] = A[idx] + B[idx];
}
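To try it out, here is a minimal compile-and-check sketch (assuming the code above is saved as addTwoVars.c on the path):
% Compile the MEX file (the file name addTwoVars.c is assumed)
mex addTwoVars.c
% Sanity check against the built-in addition
A = rand(100);
B = rand(100);
C = addTwoVars(A,B);
max(abs(C(:) - A(:) - B(:)))   % should print 0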
Notice that the output variable was allocated from scratch in this code, even though in the call A = addTwoVars(A,B) the result simply overwrites A. Such re-allocations are not always avoidable, but the redundant ones can definitely be trimmed down.
So, going back to the repmat example: instead of replicating the matrices beforehand, I suggest using the Binary Singleton eXpansion FUNction, a.k.a. the renowned bsxfun. Whereas repmat physically allocates and fills the replicated copies before the arithmetic even starts, bsxfun expands the singleton dimensions virtually, so those intermediate copies are never materialized. The usage rule is simple: if one of the input variables has a singleton dimension that needs to be replicated, use bsxfun. For example, the lines
% Replicate the dimensions such that the "resulting"
% matrix now resides in the 2nd and 3rd dimensions.
A = repmat(A,[1 1 size(B,3)]);
B = repmat(B,[1 size(A,2)]);
% Sum of products along the rows
C = sum(A.*B,1);
can (and should) be replaced with
% Per-element multiplication
C = bsxfun(@times,A,B);
% Sum of products along the rows
C = sum(C,1);
Or, even better, with
% Sum of products along the rows
C = sum(bsxfun(@times,A,B),1);
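For completeness, here is how the whole batchMul helper from the top of the post might look once the repmat lines are gone (a sketch; the final permute, which drops the leading singleton dimension, is my own addition and depends on the output shape you actually want):
function C = batchMul(A,B)
% Arrange so that the sum of products will be taken along the rows
A = permute(A,[2 1 4 3]);
B = permute(B,[1 4 2 3]);
% bsxfun expands the singleton dimensions on the fly - no replication needed
C = sum(bsxfun(@times,A,B),1);
% Drop the leading singleton so that C(:,:,k) equals A(:,:,k)*B(:,:,k) of the original inputs
C = permute(C,[2 3 4 1]);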
Not half bad, actually. The result in this case is quite pleasing to the eye! In most cases, however, bsxfun decreases readability. For example, I found a gem in my thesis work: I constantly had to validate whether some coordinates are within a specific range, so I performed a simple Boolean operation.
% Now check if within bounds, define as regular
% inSeg = (cxListRT < segLengths) & (cxListRT >= 0);
inSeg = bsxfun(@and,bsxfun(@lt,cxListRT,segLengths),cxListRT >= 0);
Notice the difference in readability between the line that describes the operation plainly (the second comment line) and the way it actually had to be written. The mere fact that I had to document the line with its own plain-comparison counterpart says everything about its readability.
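One mild remedy, purely my own suggestion and not part of the original thesis code, is to hide the bsxfun calls behind tiny anonymous-function handles so the intent stays visible:
% Hypothetical wrappers: comparison and logical AND with singleton expansion
lt_  = @(x,y) bsxfun(@lt ,x,y);
and_ = @(x,y) bsxfun(@and,x,y);
% Same bounds check as above, slightly easier on the eye
inSeg = and_(lt_(cxListRT,segLengths), cxListRT >= 0);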
What about acceleration? Well, the Octave results I got were a bit exaggerated. For the following test script,
clc
clear all
close all
Ms = [10 25 50 100 250 500 1000 2500];
N = 100;
nIters = 100;
t_repmat = zeros(size(Ms));
t_bsxfun = t_repmat;
t_repmat_curr = 0;
t_bsxfun_curr = 0;
for mIdx = 1:numel(Ms)
M = Ms(mIdx);
A = rand([M 1]);
B = rand([1 M]);
for i = 1:nIters
tic
C = repmat(A,[1 size(B,2)]) + repmat(B,[size(A,1) 1]);
t_repmat_curr = toc;
t_repmat(mIdx) = t_repmat(mIdx) + t_repmat_curr;
end
for i = 1:nIters
tic
C = bsxfun(@plus,A,B);
t_bsxfun_curr = toc;
t_bsxfun(mIdx) = t_bsxfun(mIdx) + t_bsxfun_curr;
end
end
% Average the timings to diminish effects of "first run"
t_repmat = t_repmat/nIters;
t_bsxfun = t_bsxfun/nIters;
figure('position',[250 214 1188 421]);
subplot(1,2,1);
semilogx(Ms,t_repmat*1000,'-b');
hold on;
semilogx(Ms,t_bsxfun*1000,'-r');
hold off;
xlabel('Number of rows\cols','fontsize',17);
ylabel('t [msec]','fontsize',17);
set(gca,'fontsize',15)
hdl = legend({'repmat','bsxfun'});
set(hdl,'fontsize',15);
subplot(1,2,2);
semilogx(Ms,t_repmat./t_bsxfun);
xlabel('Number of rows\cols','fontsize',17);
ylabel('Speedup: t_{repmat}/t_{bsxfun}','fontsize',17);
set(gca,'fontsize',15)
I got insane accelerations. This, again, is probably due to the unpredictable nature of the Octave implementation. But until I have paid good cash for the superior Matlab, I am not allowed to complain. Or at least until I have found a collaborator who has…

Cheers!