Multiprocessor Diagnostics

William W. Collier, collier@acm.org.
13 Gary Place, Wappingers Falls, New York 12590
Tel: 845.297.5901.

ARCHTEST is a program which tests the logical behavior of a shared memory multiprocessor (SMMP) when two or more processors simultaneously access the same shared data.

Background

When SMMP systems first appeared, Leslie Lamport defined sequential consistency (SC), the standard of behavior the systems were expected to exhibit:

The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program. [LAMP79]
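
The definition can be made concrete by brute force: enumerate every interleaving of the processors' programs that preserves each program's order, execute each one, and collect the results. The sketch below does this for two processors. The function names, the `(destination, source)` encoding of an assignment, and the two-line test programs are illustrative assumptions, not part of ARCHTEST.

```python
def interleavings(p, q):
    """Yield every merge of p and q that keeps each program's own order."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest


def run(schedule, initial):
    """Execute assignments one at a time against a single shared memory."""
    mem = dict(initial)
    for dst, src in schedule:
        # A string source names a variable to read; anything else is a constant.
        mem[dst] = mem[src] if isinstance(src, str) else src
    return mem


def sc_outcomes(p1, p2, initial, watch):
    """Set of final values of the watched variables over all SC executions."""
    return {
        tuple(run(schedule, initial)[name] for name in watch)
        for schedule in interleavings(p1, p2)
    }


# Hypothetical test programs: each processor writes one operand, then reads the other.
P1 = [("A", 1), ("X", "B")]   # A = 1; X = B;
P2 = [("B", 1), ("Y", "A")]   # B = 1; Y = A;
INITIAL = {"A": 0, "B": 0, "X": 0, "Y": 0}

print(sorted(sc_outcomes(P1, P2, INITIAL, ("X", "Y"))))
# prints [(0, 1), (1, 0), (1, 1)]
```

The result X = 0, Y = 0 never appears, so a machine that produces it is not SC. That result becomes possible when a processor performs its read before its own logically preceding write.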

SC implies that two strong rules are obeyed:

In "Reasoning About Parallel Architectures" [COLL92] I exhibited programs which detect a failure of a machine to obey SC. Shortly afterwards I founded Multiprocessor Diagnostics and began offering these programs under the name of ARCHTEST.

At about the same time engineers were pointing out that:

What is the "Right" standard of behavior for SMMP systems?

Most machines today, as evidenced by the sample of those tested by ARCHTEST, are not SC. Some perform read operations before logically preceding write operations. Others do this and also perform write operations nonatomically.

Almost all machines are claimed to be cache coherent. This phrase has had different meanings over time.

If machines need not be SC, then of what use is ARCHTEST today?

It is important to recognize that there are still some very basic rules which SMMPs must obey. Here are three elementary examples of basic rules being violated.

Example 1. The machine must compute.

  Initially, A = 0.
  
      P1
    A = 1;
  
  Terminally, A = 23.

Example 2. The rules followed by a uniprocessor must be obeyed.

  Initially, A = X = 0.
  
      P1
    A = 1;
    X = A;

  Terminally, A = 1, X = 0.

Example 3. The machine must be cache coherent (in the CC3 sense).

  Initially, A = U = V = X = Y = 0.
  
      P1          P2
    A = 1;      A = 2; 
    U = A;      X = A; 
    V = A;      Y = A; 

  Terminally, A = either 1 or 2, U = 1, V = 2, X = 2, Y = 1.
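
A check for the kind of violation in Example 3 can be sketched as follows. The sketch assumes one common reading of cache coherence, namely that all processors must agree on a single order of the writes to A; it is not claimed to be the exact definition of CC3 or the analysis ARCHTEST performs. The function name `coherent_order` is hypothetical, each written value is assumed to be distinct, and each processor's list of observed values is assumed to start with its own write.

```python
from itertools import permutations


def coherent_order(initial, writes, observations):
    """Return one order of the writes that every processor's observations
    agree with, or None if no such order exists."""
    known = {initial, *writes}
    # A value that nobody wrote (as in Example 1) can never be explained.
    if any(value not in known for seq in observations for value in seq):
        return None
    for order in permutations(writes):
        rank = {initial: 0}
        for position, value in enumerate(order, start=1):
            rank[value] = position
        # Each processor must see the values of A move forward in the order, never back.
        if all(rank[a] <= rank[b]
               for seq in observations
               for a, b in zip(seq, seq[1:])):
            return order
    return None


# Example 3: the first processor sees A as 1, 1, 2 (its write, then U, then V);
# the second sees A as 2, 2, 1 (its write, then X, then Y).
print(coherent_order(0, [1, 2], [[1, 1, 2], [2, 2, 1]]))   # prints None
# A legal run: both processors agree that the write of 1 came before the write of 2.
print(coherent_order(0, [1, 2], [[1, 1, 2], [2, 2, 2]]))   # prints (1, 2)
```

The search tries every permutation of the writes, which is adequate for a handful of writes but grows factorially; a larger test would need a smarter search.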

Testing for violations of such basic rules can be very valuable. Some customers have used ARCHTEST in simulation and have thereby found design flaws early in the design process. (They don't run all of ARCHTEST in simulation, of course. They run the basic test programs in assembler language and save the output in a file; the file is then fed into ARCHTEST for analysis.)

At the other end of the spectrum some have used ARCHTEST to verify the behavior of a completed system. See [PHIL05] for an example.

Finally, ARCHTEST provides performance information in several forms, including:

Current development efforts for ARCHTEST

ARCHTEST is being improved on several fronts.

The analysis routines now provide more explanatory information describing instances where a machine has violated a rule.

The output routines now produce HTML. This will make it easier to annotate, compare, and cross-reference performance information on different machines in a new round of testing that is about to begin.

Recently Jens Ramsey of Freescale Semiconductor and I independently discovered a bug in the analysis of the results from Test 3 in ARCHTEST. The bug could cause Test 3 to fail to see that a machine did not obey write order. Because of this bug I will update the copy of ARCHTEST, at no fee, held by each current licensee.

Future Development

For years I have looked for new tests that differ in a logically significant way from the current tests in ARCHTEST, but I have found none.

The tests in ARCHTEST currently involve 2-4 threads and 2-4 operands. When there were only 2-4 processors in a system to be tested, this was sufficient. Today it is not.

Some time ago another fellow and I wrote code to test a new system. I wrote what I thought were very subtle and clever programs. The other fellow wrote programs that tried something simple. If that succeeded, he doubled one of the parameters. If that succeeded, he doubled another one. And so on. When production time came around, the other fellow's programs found far more bugs than mine did.

I propose to use the other fellow's approach in extending ARCHTEST. ARCHTEST2 will have no new logical tests. However, it will have the capability of running many threads, operating on many operands. Plans are still not definite. Ideally, the new code will be available in the summer of 2009.
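
The doubling strategy can be sketched as a small driver. Everything here is a hypothetical illustration, not ARCHTEST2 code: `run_test` stands for any callable that runs one test at the given sizes and returns True on success, and the parameter names `threads` and `operands` and the limit of 64 are placeholders.

```python
def escalate(run_test, params, limit):
    """Run the test, doubling one parameter at a time, until a run fails.

    Returns the first failing configuration, or None if every configuration
    up to the limit succeeds.
    """
    params = dict(params)
    while True:
        if not run_test(**params):
            return dict(params)
        growable = [name for name in params if params[name] * 2 <= limit]
        if not growable:
            return None
        # Double whichever growable parameter is currently smallest, so the
        # parameters take turns growing.
        smallest = min(growable, key=lambda name: params[name])
        params[smallest] *= 2


# Stand-in for a real test: pretend the machine fails once the load is large.
def fake_test(threads, operands):
    return threads * operands < 64


print(escalate(fake_test, {"threads": 2, "operands": 2}, limit=64))
# prints {'threads': 8, 'operands': 8}
```

The driver needs no new logical tests; it only rescales an existing one, which is the point of the approach.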

References


Last updated June 15, 2008.