Using the Tester

This note summarises the use of the student solutions tester. See the comments in the code for further detail.

Underlying Assumptions

There are some basic assumptions underlying the specification of the tester.

First, it is assumed that the lecturer wishes to be able to test student solutions as automatically as possible; our own research has shown that testing each solution individually can be one of the most time-consuming jobs when marking assignments.

Second, it is assumed that the lecturer wishes to be able to demonstrate that objective rather than subjective criteria have been used in allocating marks to students.

In order to support the lecturer in both these aims, the tester is designed to check that student solutions perform to specification. By assuming a reasonably tight specification, we are able first to write tests which check whether or not the student has fulfilled the assigned task, and second to provide the lecturer with objective data showing whether or not the student has done so.

The tester is therefore very suitable for providing consistent data across a whole class's solutions. It is not, however, suitable for marking free-form assignments, where every submission must be assessed individually and assessing the inventiveness of the solution is likely to feature more highly in the marking process.

Output from the Tester

For each student's solution, the tester will generate two output files (in HTML). The first is a log of the tests carried out (each of which, of course, must be defined by the lecturer when setting the assignment). The second is a brief "scorecard", which summarises the success or failure of the programmer in meeting the specification. See the sample output data for examples of these.

Using the Tester

In order for the tester to know which symbols form part of the student's solution, and so on, some information must be provided at the start of each solution file. We suggest that the lecturer give the students access to the definition of a suitable defining form - we have provided such a form as define-exercise (see the sample student solution for an example).
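
Purely as a sketch - the actual syntax of define-exercise is given by the definition supplied with the tester, and the slot names below are invented for illustration - the start of a student's file might look something like this:

    ;; Illustrative only: the slot names here are invented, not part of
    ;; the tester as described; see the sample student solution for the
    ;; real syntax of DEFINE-EXERCISE.
    (define-exercise assignment-3
      :student "A. N. Other"       ; who submitted this file
      :definitions (foo bar))      ; the symbols the tester should test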

Before testing the students' solutions, the lecturer must provide various materials. First, sample (working!) solutions are needed for each definition the student is asked to submit. These should be stored one per file, so that the tester can test each of the current student's definitions in isolation from the rest. For example, suppose the student were asked to define a function foo which called another function bar, and both definitions formed part of the assignment. If the student made an error in bar, then tests on foo might fail even if the definition of foo was correct. It is therefore advisable to be able to shadow the student's definition of bar when testing foo, by loading in the sample solutions, which are known to be correct.
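
For instance (a sketch, with invented file names), because Common Lisp simply replaces a function's definition when a new defun for it is loaded, loading the lecturer's bar after the student's file leaves only the student's foo under test:

    ;; samples/bar.lisp holds the lecturer's known-good definition:
    (defun bar (x) (* x x))

    ;; To test the student's FOO in isolation:
    (load "student-solution.lisp")   ; defines the student's FOO and BAR
    (load "samples/bar.lisp")        ; known-good BAR shadows the student's BAR
    ;; tests of FOO now depend only on the student's own FOO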

Second, the lecturer must provide any library files which are needed in order to test the solutions. These should be pre-compiled.
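
In Common Lisp this is simply a matter of calling compile-file on each library and loading the result; the file name below is illustrative:

    ;; COMPILE-FILE returns the pathname of the compiled (fasl) file,
    ;; which can be handed straight to LOAD:
    (load (compile-file "matcher.lisp"))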

Third, the lecturer must define suitable tests for the definitions the students are required to write. Return values can easily be checked by the tester; however, there may be other checks which require a human - for example, is the output clear and readable? Accordingly, the lecturer may, where appropriate, note any manual or visual checks which the person testing the solutions should carry out. These notes will be printed in the tester output.
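
The exact shape of a test is determined by the lecturer. As a sketch only (define-test and its slots are invented here, not part of the tester as described), a test pairing a call with its expected value, together with a manual check, might look like:

    ;; Invented syntax, for illustration only.
    (define-test foo
      :form   (foo '(1 2 3))
      :expect '(3 2 1)          ; compared automatically, e.g. with EQUAL
      :manual "Is the printed report clear and readable?")  ; shown to the marker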

Before kicking the tester off to test all the students' solutions, the person running the tests will need to be sure that each solution both loads and compiles. (We put "load" before "compile" deliberately here, as it is quite common for a student's file to load correctly but not to compile cleanly - or even at all, if the mistakes are serious enough.) This check is difficult to automate. Unless the lecturer opts to make it a condition of submission (and therefore of being assigned a grade!) that the solution both loads and compiles without signalling errors, the marking process usually requires the person running the tests to make small edits to at least some of the students' solutions so that they can be tested fairly. The define-exercise form therefore gives the person running the tests a place where any problems in loading and compiling can be noted (through editing a copy of the student's solution); these notes will then appear in the tester output.
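
For example (the :notes slot below is invented, in keeping with the sketch given earlier), an edited copy of a student's solution might record such a fix as follows:

    ;; Hypothetical :notes slot recording an edit made so the file would load.
    (define-exercise assignment-3
      :student "A. N. Other"
      :definitions (foo bar)
      :notes "Added the missing closing parenthesis at the end of BAR.")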

Finding Suitable Programs for Setting as Assignments

Assuming, again, that the lecturer wishes to be able to test solutions as automatically as possible, care must be taken when selecting suitable material for assignments. As far as possible, the results required should be well specified: it is much easier to test automatically if students are graded on values returned, or on other clearly demonstrable aspects of a specification, rather than on whether output appears prettily on the screen (or indeed at all).

Writing the specification and creating the sample solution can be another very time-consuming activity. The small sample programs on this website may therefore be of some use in setting assignments: the majority of these have detailed function-by-function specifications, and students can of course be asked to implement these in particular ways. Some of the sample programs include example lists of tasks suitable for use as assignments; see the validating-links and design-patterns programs, for example.