The MATLAB programming system


MATLAB combines the features of FORTRAN 90 and BASIC.  Is the result a
heroic centaur or a sterile mule?


1.  Background
2.  Beginning usage
3.  FORTRAN 90 
4.  M files
5.  Examples
6.  Evaluation
7.  Local Installation Details


1.  Background


The author of MATLAB told me that he first wrote the program (in FORTRAN!) as 
an easy interface to the EISPACK and LINPACK routines.
Let's recall that EISPACK and LINPACK are two of the oldest, most
venerable and most influential software packages written.  EISPACK and
LINPACK have been around for at least twenty years, worked their way into 
SCILIB, and only now are being replaced by the new LAPACK library.

Why put up an interface between the programmer and the software?  I didn't
ask the author that question, but in my old job I wrote several programs
that did the same thing, so I feel qualified to an opinion.  

The main reason I was asked to write such programs was to enable students in
linear algebra classes to test out algorithms without having to write
programs.  Most students don't know FORTRAN, and the Math Department found
itself having to teach FORTRAN on top of linear algebra.  And then students
would come to the instructor with programming errors and ask for help.

So it was decided to write a program that accepted simple input statements
from the user, that could define data structures, solve a problem using
a particular algorithm, and display the results.  The program was very
successful, at least from the instructor's point of view.

An interesting sidelight is that some of the instructors use the program
for their own work.  They claim they would never be able to do this kind of
work themselves.  Even though the program necessary to solve a simple
matrix problem is easy to write, the effort isn't worth it to them.  But
when an interactive program is available, they enjoy setting up and solving
problems.  Just as supercomputing allows you to solve bigger problems you
never would have attempted otherwise, an interactive program can allow you
to solve simple problems that would normally be too tedious to set up.

Before we go much further, let me here specify what's necessary to set up
and solve just a three by three matrix system in FORTRAN, using LINPACK:

      DIMENSION A(3,3)
      DIMENSION IPIV(3)
      DIMENSION X(3)
      DATA A /1.0, 4.0, 7.0, 2.0, 5.0, 8.0, 3.0, 6.0, 9.0/
      DATA X /1.0, 2.0, 3.0/
      LDA=3
      N=3
      JOB=0
      CALL SGEFA(A,LDA,N,IPIV,INFO,JOB)
      CALL SGESL(A,LDA,N,IPIV,X)
      WRITE(*,*)(X(I),I=1,N)
      STOP
      END
  
  $ FORTRAN/NOLIST MYPROG.FOR
  $ LINK/NOMAP MYPROG, PSC$LIB:LINPACKS/LIB
  $ RUN MYPROG

Now this is bad enough, but let's go over some of the simple questions an
intermediate user of FORTRAN might ask:

  What's that DATA statement doing?  Why in that order?  How come the
  DATA statement puts the entries in the wrong place when I change A to be 
  of size (4,3)?

  Why do I have to set up IPIV if I never use it?

  How come I put the right hand side in X, but all of a sudden it's the
  answer instead?

  What's the difference between LDA and N?

  What's JOB doing?

  What does NOLIST do?  NOMAP?  

  How come on the Cray you don't have to specify the LINPACK library?
  
  If I change my source code, I have to recompile my program, and relink it.
  What if I relink it first and then recompile?

  Why don't I have to recompile the LINPACK library?

  Do I have to keep around this file MYPROG.EXE?

  How come my matrix A changed after I called this routine?

  It says my matrix A is overwritten by the L, U factors.  Explain
  to me how these two matrices are stored in A.  

Unfortunately, there's another problem here.

  How come I get "floating overflow"?

These questions have to do with the fact that FORTRAN is not a very good
model of the way people think.  It requires much too much structure, and
has lots of rules that people find unnatural.  For instance, the fact that
you must declare the type and size of an object before using it, and that
an object of one type can't be used when an object of another type is 
expected.

But, for example, it's fairly easy to write down some "pseudocode" that
defines the above problem, containing all the information, and in an
natural way:

  n=3
  a is of size n by n
  a(i,j)=j+(i-1)*n
  b(i)=i
  solve a*x=b for x

A friendly language ought not to ask much more than this.  In my program,
the commands you had to use were:

  $ RUN MTH:LINGO
  MATRIX 
  A
  3,3
  1,2,3
  4,5,6
  7,8,9
  VECTOR
  B
  3
  1,2,3
  VECTOR
  X
  3
  0,0,0
  SOLVE
  A
  B
  X
  TYPE
  X
  STOP

Well this isn't so friendly, but the point is it's a lot less structured.
You can make a mistake and simply reenter the information.  If you put
the wrong values into B, you can easily change them.  At any time, you
can print out the values of any of the variables you like.  The syntax
is not quite as clumsy as it looks, since the program prompts you for
everything.  On the other hand, you soon get sick of having the program
printing out menus for you when you know exactly what you want.  Now,
for comparison, here is the MATLAB computation:
  
  matlab
  a = [ 1 2 3; 4 5 6; 7 8 9];
  b = [ 1 2 3];
  x=a \ b
  quit

It can't get simpler than that!  I think, in this case, at least, the language
of MATLAB is able to exactly fit the problem.  You don't have to say any
more than is necessary.  You don't have to include DIMENSION or REAL statements.
There just aren't many opportunities for typing mistakes.  

This simplicity is a tremendous advantage for MATLAB.  There are larger,
more complicated problems that MATLAB can solve in just as few statements.
This means that MATLAB wins big in terms of friendliness and power.  The
question we must ask is, what is the cost?  $10,000 on the Cray YMP, for
one thing.  But for another, is this a program that BELONGS on the YMP?

We can open up a religious debate at this point.  Historically, as 
supercomputers have gotten cheaper and more widespread, more users and
more applications have jumped onto them.  Many of these users bring their
habits from minicomputers or other machines where MegaFLOP envy is not
a problem.  They expect the same programs to be available on the Cray
that they had elsewhere, and it does not matter to them in the least
whether these programs are efficient.

So one question is: is it any of our damn business?  If a user has a problem,
and wants to solve it with a given program, when can we say no?  (I propose
that we say no at the time of the initial grant proposal, or not at all).

Another question is: is MATLAB one of these inefficient programs?  Assuming,
that is, that we can estimate efficiency.  To be fair, we probably should not
insist that MATLAB solve all problems efficiently, but should require that
it do so on big problems at least.


2.  Beginning Usage


The fundamental object in MATLAB is the array.  The number of dimensions,
and the extent of the array, are not explicitly declared by the user, but
determined from the context of assignment statements.  Moreover, all
arrays are assumed to be complex, although MATLAB will notice that an
object is essentially real, or essentially integer.

Matlab has some use for scalars, particularly constants such as 2.0,
and defined values such as "pi".  

MATLAB can determine the number of dimensions and the extent of any
object.  It may adjust its behavior depending on whether it is operating
on a scalar, vector, or matrix.  In many cases, MATLAB will treat a
matrix as a set of column vectors.  You can override this behavior in
some cases, by specifying a range of index values, or by specifying that
a matrix is temporarily to be treated as a vector.  If A is a matrix,
for example, you can treat it as a vector by writing A(:).


MATLAB expects as input a series of statements which are either expressions,
or statements assigning the value of an expression to a variable.  MATLAB
will normally print the value of the expression unless you request it not
to do so with a trailing semicolon.


The simplest assignment statement sets a scalar, a vector, or a matrix
to a set of constant values:

  s = x
  v = [ v1 v2 v3]
  m = [ v1 v2 v3 ; v4 v5 v6; v7 v8 v9]

In each case, no previous statement about the dimension of the object
is required.  MATLAB determines that from context, and keeps track of
the information when needed later.

You can build matrices from other matrices using the [ ] notation.
Assuming x and y are column vectors, [x y] is the matrix whose first
column is X and second column is Y.  In this regard, here is a
definition of the Hadamard matrix in MATLAB language:

  H0 = [1]
  H1 = [H0 H0 ; H0 -H0]
  H2 = [H1 H1 ; H1 -H1]
  ...

Can you see what is going on here?

[1]  [1  1]  [1  1  1  1] ...
     [1 -1]  [1 -1  1 -1]
             [1  1 -1 -1]
             [1 -1 -1  1]


More ambitious statements involve simple arithmetic operators.  We can issue
statements like:

  A = [1 2 3; 4 5 6; 7 8 9] 
  B = A + 1
  C = 2 * A * B

Note that MATLAB correctly determines that A + 1 and 2 * A are matrix-scalar
operations, but that A * B is standard matrix multiplication.  

MATLAB includes ways of quickly building certain standard matrices.  To
set A to the 5 by 5 identity matrix, the commands are

  n = 5
  A = eye(n)

Similarly, "zeros(n)", "ones(n)" and rand(n) specify N by N arrays of zeroes,
ones, or random values.  The diag(x) command creates a matrix whose entries
are all zero, except that the diagonal entries are equal to X.  However,
if A is a matrix, diag(A) creates a vector containing the diagonal entries
of A.  What happens when you type diag(diag(A))?

Interestingly you can also assign "zeros(m,n)".  This means that MATLAB allows 
you to call a function with only a partial argument list, if you like, and it 
can tell how many arguments you supplied, and make up for the missing 
arguments by supplying defaults.  This means you can write a single function 
that can be used by an idiot and a sage.


The full set of operators available is +, -, *, ^, ' (complex conjugate
transpose), \ (left division) and / (right division).

A \ b means essentially INVERSE(A) * b, so to solve A*x=b for x, you
write x = A \ b.  

A / b means essentially b * INVERSE(A), which allows you to solve x*A=b.


In some cases, the obvious interpretation of a matrix operation may not be 
what is desired, or it may be necessary to state whether an operation is to be 
carried out in the matrix or scalar sense.  MATLAB chooses the matrix 
interpreation by default, but you can specify you want to do an element 
by element operation by prefixing the operator with a period.  Thus, using the 
same matrix A above,

  A * A = [  30  36  42 ]   A .* A = [  1  4  9 ]
          [  66  81  96 ]            [ 16 25 36 ]
          [ 102 126 150 ]            [ 49 64 81 ]


MATLAB has a wide set of built in functions, with names familiar to FORTRAN
programmers, including 

  abs, sqrt, real, imag, fix, exp, log, sin, cos, tan, max, min

All of the functions can take scalar, vector or matrix arguments.  It's
easy to see what this means for ABS; if B is a matrix, ABS(B) is a matrix
of the same size, with each entry the absolute value of the entry in B.

SQRT and EXP may raise a question in your mind.  Isn't there a matrix
square root and a matrix exponential?  Indeed there is, but instead of
resolving the problem with a period, MATLAB assumes that SQRT(B) means
the matrix whose entries are the square roots of the entries in B, and
provides a separate function name, SQRTM, for the matrix square root,
as well as EXPM and LOGM.

MAX and MIN also have features that may be unexpected.  If V is a vector,
we know what to expect from MAX(V); a scalar value that is the maximum
in the vector.  But what is MAX(A) where A is an array?  The authors
decided that the most useful definition was to define this to be a vector,
consisting of the maximum entries in each column of A.  We could always
get the maximum value of the whole matrix either by typing

  max(max(a))        <-- max(a) is a vector of values, and max(max(a)) is
                         a single value.

or

  max(a(:))          <--  The A(:) states that A is to treated as a vector


3.  FORTRAN 90


FORTRAN 90 and MATLAB were both developed to try to improve the handling of
arrays.  Both have come up with some features that are identical.

In both, for example, you may say "A=B" or "A=2*B+1" where A and B are
arrays.  In FORTRAN 90, you may write array expression such as SQRT(A),
F/G, and so on, but unfortunately, A*B is interpreted as element-wise
multiplication!


Now let us compare other features.


First of all, let us address the question of subscripting.  MATLAB provides
a way of generating indices identical to the FORTRAN DO loop conventions.
For example, the expression 1:5 is analogous to the DO loop limits 
in the expression "DO 10 I=1,5".  MATLAB actually expands 1:5 to be the
vector [1 2 3 4 5].  

Tragically, the MATLAB implementors picked the wrong way to define an
increment.  Rather than LO:HI:INC, they chose LO:INC:HI!  Thus, the expression 
"1:3:10" expands to [1 4 7 10], but the expression "1:10:3" expands to [1].  
You can use such an expression to initialize a vector:

  x = 5:5:100

The values do not have to be integers.

  x = 0:pi:pi/20
  y = sin(x)


To access a single entry of an array, you simply use the usual notation
of parentheses.  Thus it is legal to say

  A(3,5) = A(5,3) + 17

But to access a submatrix, you use the colon notation above:

A(1:5, 1:5) describes the 5 by 5 submatrix of A consisting of the
first five rows and columns.  Similarly, A(1:2:N, :) describes
the matrix made up of the even rows of A and all the columns of A.
(Just like in FORTRAN 90, the occurrence of a lone colon in an index
position indicates that index extends over the full range).

You can also index an array indirectly.  X(V) is the vector [X(V(1)), 
X(V(2)), ..., X(V(N))], again, just like in FORTRAN 90.


A FORTRAN 90 construct of some use is the ability to write conditional
statements of the form  

      WHERE (X.LT.1.0) 
        X=X+1.0
      ELSEWHERE
        X=X-1.0
      ENDWHERE

MATLAB provides the FIND statement, which returns the indices for which
a given condition is true.  The statements

      i=find(X < 1)
      j=find(X >= 1)
      x(i)=x(i)+1
      x(j)=x(j)-1

do the same operation as the FORTRAN 90 WHERE did.

Another MATLAB control construction is the IF statement.  MATLAB
allows you to compare vectors or matrices for equality or inequality.
Note, however, that the condition must be true for all entries to be true.
For instance:

  if A == B 
    disp('Matrices are equal')
    end

However, the statement

  if A ~= B
    disp('Matrices are not equal')
    end

is NOT going to work if only a few entries of A and B are not equal.  This is 
because the way MATLAB interprets a matrix conditional is that it must be the 
case for every entry.  Thus two matrices are equal if every entry is equal, 
and not equal if EVERY entry is not equal.  This is not quite what one would 
expect!

MATLAB includes two logical operators, ANY and ALL, that allow you to
control the behavior of IF statements.  Both ANY and ALL are "vector"
operators.  That is, they prefer to operate on a vector, and return
a scalar, but can operate on a matrix (as a set of column vectors)
returning a vector.  For example,

  if all( all (A < 0.0))
    disp('Matrix was all negative!  Reversing!')
    A = -A
    end

  if any ( any(A < 0.0))
    disp('Matrix has at least one negative entry!')
    end

  if all ( any (A < 0.0))
    disp('Every column has at least one negative entry!')
    end

  if any (all (A < 0.0))
    disp('Some row is entirely negative!')
    end


4.  M files


Although MATLAB is thought of as an interactive system, you wouldn't
get very far if you had to start it up from scratch every time you ran it.
Although MATLAB can dump its variable memory into a file, and read that back 
in, you probably want to store repeated sequences of commands and invoke
them by a brief name.  

In the simplest case, if there are just a sequence of commands with
no dummy variables, you can just type those commands into a file whose
extension is ".m", and then invoke the commands by naming the file.
Thus, to compute machine epsilon, I could create a file called "epsilon.m"
containing the lines

tiny=1.0
while (tiny + 1 >= 1.0)
  tiny = tiny/2
  end
tiny=2*tiny
fprintf('Machine epsilon is %g\n',tiny)

and then, in my program, typing "epsilon" would cause these lines to execute,
with exactly the same effect as if I had typed them in directly to MATLAB.


More to the point, however, are operations which I want to "parameterize".
As a simple example, I could want to define the area of a circle through
a function statement, so that in matlab I could say

  x = area(r)

To do this, I again create a file, with the name "area.m", containing
lines like the following:

  function a = area(r)
  area = pi * r * r

Most M files are more elaborate than this.  In particular, the file
can check the number of arguments input with the NARGIN function, and
the dimensions of the input object with the SIZE function.  It is possible
to return one or more scalars, vectors or arrays as output.  The variables
in a function M file are local, in the same way that FORTRAN subroutine
local variables are hidden from the calling program.


It is in the area of M files that MATLAB makes itself most valuable.
There are many people churning out such files, and it is possible to
grab them from NETLIB.  These files encapsulate procedures for geophysical
data analysis, chemometrics, signal analysis, and other fields which the
writers of MATLAB never touched.  
  

5.  Examples


The "MATLAB" example shows how to set up a simple matrix and ask some
questions about it.

The "MATMUL" example shows an attempt to benchmark MATLAB by having it
multiply two 256 by 256 matrices.

The "POWER" example shows my attempt to write the power method for
finding the dominant eigenvalue of a matrix.

The "PLOT" example shows some sample plots.

The "CHAOS" example shows an ambitious plot, prepared using some
M files received over the network.


6.  Evaluation


There are a number of programs available for symbolic calculations.
MACSYMA is unsupported, and written in LISP.  REDUCE seems to be under
active development, but is also written in LISP.  We have versions of
REDUCE on VMS and UNICOS.  MAPLE is not available for the Cray, although
we have a copy on VMS.  MATHEMATICA wants a quarter of a million dollars
for a Cray version, ha ha ha.  And now I've read reviews saying THEORIST
is better than MATHEMATICA.

I have no experience with these programs.  I have some vague impressions.
Perhaps one of the biggest differences is that MATLAB is NOT a symbolic 
calculation package.  If you wish to integrate X*SIN(X), MATLAB does not
have a table of rules that includes an entry for the sine function.  
MATLAB solves the problem in the same way that a FORTRAN program would,
by weighted summation of the values at various points.  MATHEMATICA and the 
other packages look up rules for determining the integral from the given
formula with no actual computation.  Thus those programs are doing "artificial
intelligence" for this problem, whereas MATLAB is carrying out a much
more straightforward calculation.  MATLAB is not really solving your problem
in a new way, but rather is carrying out all the bookwork and I/O to get
your problem through a fairly standard algorithm.

Graphics is good...a big barrier for most people.

Easy to improve performance by hooking up to system libraries.

Thus, I conclude that MATLAB

  is appropriate for supercomputing, because it focusses on vectors
    rather than on symbolic formulas,
  has the potential for higher performance, 
    (using Standard C and SCILIB where possible),
  is compatible in philosophy with the FORTRAN 90 array syntax, 
  is sufficiently friendly to attract users,
    and powerful enough to keep them,
  hides exactly those parts of a calculation that are of no interest
    to the programmer,
  makes it easy for programmers of moderate skill to do things
    they wouldn't have tried on their own.


7.  Local Installation Details


MATLAB has been given to us for a two month trial period, 1 July to 31 August.

MATLAB is only available on UNICOS.  It is invoked simply by typing "matlab".

There is online help available for MATLAB, on both VMS and UNICOS, by typing
"pschelp matlab".  Inside of MATLAB, you can type "help".

There are several online documents: 

  MATLAB.DOC contains local details and a list of all commands.
  MATLABPRIMER.TEX is a nice primer, with exercises and explanations.
  MATLABPRIMER.DOC is a plain text version of the primer.

In the MATLAB examples directory (DOC$ROOT1:[EXAMPLES.MATLAB] on VMS,
and /usr/local/examples/matlab on UNICOS) there is a sample NQS job containing 
the text of a few interactions with MATLAB.  Obviously it is unnatural to try 
to use an interactive program in batch, but it does give you something to look 
over.  

Also in the MATLAB examples directory are a number of subdirectories containing
various M files.  These cover such topics as chemometrics, signal analysis,
and fancy special matrices.  These M files are also available online,
and so MATLAB knows about them and can access the functions they define.

There are a couple copies of the MATLAB User's Guide available for inspection.
There is also the book "Handbook for Matrix Computations" by Coleman and
Van Loan, which describes and compares LINPACK, EISPACK and MATLAB.

There is a MATLAB directory in NETLIB, the electronic software distribution
service.  It contains subdirectories of M files.

There is also a MATLAB users group, which has a news digest distributed
electonically.