
3.2.3 Assessment

Assessment measures students' abilities in the various modules that they take. It must take place against published criteria that are appropriate for the work in hand, and it must reflect what modules and programmes at specific SCQF levels are intended to deliver.

Note that not every element of every programme has to deliver on all that the programme sets out to achieve: the properties of the various modules combine to deliver the programme as a whole.

Assessment can be diagnostic, formative or summative (and in some instances will be two or three of these – they are not mutually exclusive). Diagnostic assessment can be used to determine standards of pre-existing knowledge or competency at the start of a class; formative assessment is used to help determine how students are progressing, without the marks being used as a formal judgment. Summative assessment – that which counts towards module grades – is generally also formative: effective feedback on performance in summative examinations should help students improve their performance in future tests.


3.2.3.1 Key features of the University of St Andrews Assessment Strategy

If Schools identify processes that ensure that assessment is transparent, reliable, valid and objective, then the University and students can have confidence that the marks assigned are appropriate.

  • Transparent: There are clear criteria against which the work is being judged, and students are informed of all assessment procedures at the beginning of each phase of the programme.
  • Reliable: Assessment provides an accurate estimate of student performance such that, if the work were assessed again by a different examination or examiner, the same outcome would be reached.
  • Valid: Does the examination test what it should? Does the assessment match the learning objectives or exam blueprint?
  • Objective: The ideal for assessment – both content and procedures – is that it should be sufficiently clear and free from bias that two independent, properly informed markers would reach the same mark.

3.2.3.2 Standard setting

Standard setting can be understood as a simple question: how is it determined that a particular element of work is worth the mark given? Standard setting at the University of St Andrews does not involve relative (norm-referenced) methodology, which requires fitting marks to a predetermined, normally distributed grade curve such that a fixed proportion of students achieve particular grades (and such that the proportions of students achieving those grades can be standardized across disciplines). That approach would require homogeneity of student abilities and numbers for every module in every subject in every academic year. It would also cut across the independence of different Schools and disciplines to determine their own standards. In most academic Schools standard setting is relatively straightforward.

  • Many tests of complex calculation or knowledge allow for an accumulation of marks on an objective basis. All that is normally required here is that the questions set show an incremental level of difficulty such that there is sufficient challenge to discriminate between students with different aptitudes and abilities.
  • For more qualitative work (such as essays, dissertations, reports) the normal standard-setting methodology is that every student's work is assessed individually using criterion-referenced standards. This approach determines whether a student knows enough for a particular purpose, such as passing a module, or is achieving a level of performance consistent with certain degree classification levels. The assessor determines the level of performance required of students. Effective marking must properly reflect the intended learning outcomes of the teaching in question.

There are, however, particular requirements in some disciplines, such as Medicine, where issues of professional competency require a more formal approach to standard setting. Traditionally, pass marks for assessments have been set at an arbitrary level. However, since assessments are likely to vary in difficulty, some mechanism must be adopted by which an appropriate pass mark is determined. In order to achieve this, the Bute Medical School applies standard setting to individual components of every assessment. Here, standard setting is a procedure that estimates the degree of difficulty of an assessment. It ensures consistency of results between different forms of assessment and between different modules, and requires that specific levels of competency be shown in order to pass a test. This requires methods based on judgments about test questions, such as the Angoff (or a modified Angoff). The Medical School draws the test judges from those who taught the material in the modules and those who have particular expertise in certain disciplines within the programme. These judges meet to consider and standard set the exam paper.
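
To illustrate the principle only (this is a sketch of an Angoff-style calculation in general, not the Bute Medical School's actual procedure), the Python fragment below computes a cut score from hypothetical judge ratings. Each judge estimates the probability that a borderline (minimally competent) candidate would answer each item correctly; the pass mark is the average, across judges, of the expected borderline score.

    # Hypothetical ratings: rows are judges, columns are items; each value
    # is the judged probability that a borderline candidate answers the
    # item correctly.
    ratings = [
        [0.6, 0.4, 0.8, 0.5, 0.7],  # judge 1
        [0.5, 0.5, 0.7, 0.4, 0.6],  # judge 2
        [0.7, 0.3, 0.9, 0.5, 0.8],  # judge 3
    ]

    n_items = len(ratings[0])

    # Each judge's expected score for a borderline candidate.
    judge_cut_scores = [sum(row) for row in ratings]

    # The cut score (pass mark) is the mean across judges, here also
    # expressed as a percentage of the items on the paper.
    cut_score = sum(judge_cut_scores) / len(judge_cut_scores)
    pass_mark_percent = 100 * cut_score / n_items

    print(f"Cut score: {cut_score:.2f} of {n_items} items "
          f"({pass_mark_percent:.1f}%)")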

External Examiners and Deans have a critical role in standard setting. Both have a role in approving programmes of study and modules; examination and coursework formats; and exam questions; and in reviewing the performance of students in examinations. The External Examiners provide a discipline-based reference point for work in St Andrews, while the Deans provide Faculty-wide perspectives that allow them to monitor examination outcomes year-on-year and to require review of module outcomes if there is variance from expected outcomes (expressed, for example, as a disproportionate number of failing grades, the absence of grades at the highest levels, or distributions of marks that are skewed in some way); see below under "grade adjustment".


3.2.3.3 Marking

The intention here is to deliver in systematic form a representative mark for each piece of work within a module that can be accumulated with other marks to form a single composite grade for that module. It is important to make clear the difference between marks and grades.

Marks are given to pieces of work – essays, dissertations, examination questions, oral presentations and so on. This marking will often be out of 20 (that is, directly onto the Common Reporting Scale) but need not be. Equally permissible is the use of percentage scales to mark particular types of work, or cumulative scores out of any number (the total number correct from a 55-item multiple-choice questionnaire, for example).

Grades are expressed on the 20-point Common Reporting Scale and give a final standardized outcome for work done. Grade conversion refers to the process by which marks are converted to grades on the 20-point Common Reporting Scale. The grades used across all assessed work must be scaled identically in order that reliable comparisons can be made of students' abilities across different modules, and so that a reliable overall grade point average (GPA) can be calculated if required. The GPA can be used either as a summary statistic of students' performance in and of itself, or it can be used as the basis of Honours Degree Classification.
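
As a concrete illustration of grade conversion, the Python sketch below maps percentage marks onto integer grades on the 20-point Common Reporting Scale using a band table. The band boundaries are hypothetical (each School defines its own conversion), apart from the two anchor points borrowed from footnote [3] below (40% as a 7, 75% as a 17).

    # Band table for converting percentage marks to integer grades on the
    # 20-point Common Reporting Scale. Boundaries are hypothetical except
    # for the anchors 40% -> 7 and 75% -> 17, taken from footnote [3].
    # Pairs are (lower bound of the percentage band, grade), highest first.
    BANDS = [
        (92, 20), (86, 19), (80, 18), (75, 17),
        (71, 16), (68, 15), (64, 14), (61, 13),
        (57, 12), (54, 11), (50, 10), (47, 9),
        (43, 8), (40, 7), (33, 6), (27, 5),
        (20, 4), (13, 3), (7, 2), (1, 1), (0, 0),
    ]

    def mark_to_grade(percent: float) -> int:
        """Convert a percentage mark to an integer grade (0-20)."""
        for lower_bound, grade in BANDS:
            if percent >= lower_bound:
                return grade
        return 0

    print(mark_to_grade(68))  # 15
    print(mark_to_grade(41))  # 7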

The grades for individual modules, and an overall summary statistic taken at the end of any year of study, are of value to students as indices of academic progress; they are of value to academic Schools in determining admission to different programmes and levels of study; and they are of value to employers in that they provide a reliable and valid indicator of ability in specific subjects.

The key to marking does not lie in the 20-point Common Reporting Scale, nor in a percentage scale, nor even in grade conversion and/or adjustment. Rather, the critical parts are the construction of good questions and marking that is transparent, reliable, valid and as objective as possible.

All grades for all credit-bearing modules, across Faculties and levels of study (SCQF 7–11 [1000–5000]), are reported on the 20-point Common Reporting Scale, allowing comparability of outcomes across all modules. The 20-point Common Reporting Scale was introduced (in academic year 1994-1995) in order to have consistency of module results across Schools. At the time of introduction there was a need, in the new modular degree structure, for a common approach to module grade reporting in order for Schools to run joint degrees effectively. The introduction of degree programmes that cross multiple Schools (such as the undergraduate Sustainable Development programme) and the development of interdisciplinary degree programmes reinforce the need for a common reporting mechanism.

As well as comparability across all credit-bearing modules, the use of a Common Reporting Scale has merit in that it allows for flexibility of marking strategies across Schools. Different marking strategies, appropriate to particular disciplines or types of assessment, can all be accommodated under a Common Reporting Scale. Such flexibility is essential in a multi-Faculty University where very different types of examination and marking strategies are required by different disciplines.

As with most forms of test measurement, there are points that require clarification. In this context, it is important to understand that theories of psychometric measurement are not immutable and that there has been, and still is, debate about them. [1] The 20-point Common Reporting Scale can potentially be used and understood in different ways. For example:

  • The 20-point Common Reporting Scale could be thought of as an ordinal scale (following the terminology of Stevens [1951] [2]). That is, it could be used in categorical terms, in which the numbers themselves have no meaning beyond the fact that they can be ordered from smallest to largest. If marking is done on another scale (for example, percentages) then conversion can be made by systematically interpreting particular bands of marks as belonging to particular grade categories. If this procedure is used properly, the only grades appearing would be integers from 0 to 20, these being discrete categories. An ordinal scale measures rank order – biggest to smallest; first, second, third, and so on. Decimal points between ranks cannot be used because the distances between categories are not necessarily constant – the distance from first to second will likely be different to that between second and third. It is not possible to have "first and a half".

  • The 20-point Common Reporting Scale can be thought of as an interval scale (again following Stevens) in which the intervals between integers are equal (and, if zero is taken as an absolute reference point, it could be thought of as a ratio scale). In this case the 20-point Common Reporting Scale is not one of rank-ordered categories but a quantitative measure in which decimal places are meaningfully used (implying that the intervals between numbers are equal). Following conversion from the marking scale, grades between 0 and 20 (with decimal places used) are assigned to work, and from these, simple descriptive statistics such as means and medians can be meaningfully calculated. [3]

  • The 20-point Common Reporting Scale is used in many Schools as a marking scale as well as a reporting scale: that is, marking is done directly onto the 20-point scale. If this is the case for all elements that contribute to the overall grade for a module then there is no distinction between marks and grades, and at the point of reporting marks on the 20-point scale become grades.

  • Regardless of the process by which different Schools arrive at the grades for modules, only module grades on the 20-point Common Reporting Scale are used for degree classification. For degree classification, credit-weighted grade point averages (GPAs) and credit-weighted medians are calculated, with decimal places used (a sketch of these calculations follows this list). Since its introduction in 1994, the 20-point scale has been used as an interval scale that permits generation of statistics (means and medians) that have meaning to students, staff and External Examiners. It is commonplace for staff and students to track GPAs over time in order to monitor academic progress, and all Schools use the 20-point Common Reporting Scale for Honours Degree Classification, using decimal places and treating it as an interval scale.
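
The sketch below shows how such credit-weighted statistics might be computed. It is an illustration under assumed conventions (hypothetical module results, and one of several possible weighted-median definitions), not the University's published algorithm.

    # Hypothetical module results: (grade on the 20-point scale, credits).
    results = [(16.5, 30), (14.0, 15), (17.2, 30), (12.8, 15)]

    # Credit-weighted GPA: each grade counts in proportion to its credits.
    total_credits = sum(credits for _, credits in results)
    gpa = sum(grade * credits for grade, credits in results) / total_credits

    # Credit-weighted median: the grade at which at least half of the
    # credits lie at or below. (Other weighted-median conventions exist,
    # e.g. interpolating when the halfway point falls between grades.)
    ordered = sorted(results)
    half = total_credits / 2
    running = 0
    for grade, credits in ordered:
        running += credits
        if running >= half:
            weighted_median = grade
            break

    print(f"Credit-weighted GPA:    {gpa:.2f}")             # 15.70
    print(f"Credit-weighted median: {weighted_median:.1f}")  # 16.5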


[1] See for example Hand DJ (1996) Statistics and the Theory of Measurement. Journal of the Royal Statistical Society, Series A (Statistics in Society) 159: 445-492.

[2] Stevens SS (1951) Mathematics, measurement and psychophysics. In Handbook of Experimental Psychology (ed. SS Stevens). Wiley, NY. It is notable that in discussing problems of measurement, and in formulating the differences between nominal, ordinal, interval and ratio scales, Stevens noted that in practice it is often not possible to decide "…. into which category a given scale falls" (p. 30).

[3] Note that while, in principle, translation from a 20-point interval scale to a percentage scale could be straightforward, it is made more difficult because there are criteria for passing and for achieving particular categories in a percentage scale. For example, if 40% is set as a 7 and 75% as a 17, the translation between the 20-point Common Reporting Scale and the percentage scale will not be linear.
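
A minimal sketch of what such a non-linear translation might look like, treating the mapping as piecewise linear between the illustrative anchors above (40% as a 7, 75% as a 17) together with assumed endpoints at 0 and 100:

    # Piecewise-linear translation between a percentage scale and the
    # 20-point Common Reporting Scale, using the footnote's illustrative
    # anchors (40% -> 7, 75% -> 17) plus assumed endpoints. The segments
    # have different slopes, so the overall mapping is not linear.
    ANCHORS = [(0.0, 0.0), (40.0, 7.0), (75.0, 17.0), (100.0, 20.0)]

    def percent_to_grade(percent: float) -> float:
        """Linearly interpolate between the anchor points."""
        for (p0, g0), (p1, g1) in zip(ANCHORS, ANCHORS[1:]):
            if p0 <= percent <= p1:
                return g0 + (g1 - g0) * (percent - p0) / (p1 - p0)
        raise ValueError("percent must lie between 0 and 100")

    print(percent_to_grade(40.0))  # 7.0
    print(percent_to_grade(57.5))  # 12.0 (midway between the anchors)
    print(percent_to_grade(75.0))  # 17.0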
