Annex E - Laboratory Quality Assurance

Codified File

Validation Principle

OIV-MA-AS1-05 Principle of validation of routine methods with respect to reference methods

 

The OIV acknowledges that, in addition to the methods described in the Summary of International Methods of Analysis of Wines and Musts, there are commonly used methods of wine analysis, most often automated.  These methods are economically and commercially important because they make it possible to maintain a complete and efficient analytical framework around the production and marketing of wine.  Moreover, they allow the use of modern means of analysis and the development and adaptation of analytical techniques.

In order to allow laboratories to use these methods and to ensure their linkage to the methods described in the Summary, the OIV has decided to establish a plan for the evaluation and validation, by a laboratory, of an alternative common method, mechanized or not, with respect to a reference method described in the Summary of International Methods of Analysis of Wines and Musts.

This principle, which will be adapted to the particular situation of the analysis of wines and musts, will take its inspiration from international standards in current use and allow the laboratory to assess and validate its alternative method in two ways:

 

Collaborative Study

OIV-MA-AS1-07 Collaborative study

The purpose of the collaborative study is to give a quantified indication of the precision of a method of analysis, expressed as its repeatability r and reproducibility R.

 

Repeatability: the value below which the absolute difference between two single test results obtained using the same method on identical test material, under the same conditions (same operator, same apparatus, same laboratory and a short period of time) may be expected to lie within a specified probability.

 

Reproducibility: the value below which the absolute difference between two single test results obtained using the same method on identical test material, under different conditions (different operators, different apparatus and/or different laboratories and/or different time) may be expected to lie within a specified probability.

The term "individual result" is the value obtained when the standardized trial method is applied, once and fully, to a single sample. Unless otherwise stated, the probability is 95%.

 

General Principles

  • The method subjected to trial must be standardized, that is, chosen from the existing methods as the method best suited for subsequent general use.
  • The protocol must be clear and precise.
  • The number of laboratories participating must be at least ten.
  • The samples used in the trials must be taken from homogeneous batches of material.
  • The levels of the analyte to be determined must cover the concentrations generally encountered.
  • Those taking part must have a good experience of the technique employed.
  • For each participant, all analyses must be conducted within the same laboratory by the same analyst.
  • The method must be followed as strictly as possible.  Any departure from the method described must be documented.
  • The experimental values must be determined under strictly identical conditions: on the same type of apparatus, etc.
  • They must be determined independently of each other and immediately after each other.
  • The results must be expressed by all laboratories in the same units, to the same number of decimal places.
  • Five replicate experimental values must be determined, free from outliers.  If an experimental value is an outlier according to the Grubbs test, three additional measurements must be taken.

Statistical Model

The statistical methods set out in this document are given for one level (concentration, sample).  If there are a number of levels, the statistical evaluation must be made separately for each.  If a linear relationship (y = bx or y = a + bx) is found between the repeatability (r) or reproducibility (R) and the concentration ($\bar{\bar{x}}$), a regression of r (or R) may be run as a function of $\bar{\bar{x}}$.

The statistical methods given below suppose normally‑distributed random values.

The steps to be followed are as follows:

A/ Elimination of outliers within a single laboratory by Grubbs test.  Outliers are values which depart so far from the other experimental values that these deviations cannot be regarded as random, assuming the causes of such deviations are not known.

B/ Examine whether all laboratories are working to the same precision, by comparing variances by the Bartlett test and Cochran test.  Eliminate those laboratories for which statistically deviant values are obtained.

C/ Track down the systematic errors from the remaining laboratories by a variance analysis and identify the extreme outlier values by a Dixon test.  Eliminate those laboratories for which the outlier values are significant.

D/ From the remaining figures, calculate the standard deviation of repeatability $s_r$ and the repeatability r, and the standard deviation of reproducibility $s_R$ and the reproducibility R.

Notation:

The following designations have been chosen:

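$m$ Number of laboratories

$i$ ($i = 1, 2, \ldots, m$) Index (No. of the laboratory)

$n_i$ Number of individual values from the $i$th laboratory

$N = \sum_{i=1}^{m} n_i$ Total number of individual values

$x_{ij}$ ($j = 1, 2, \ldots, n_i$) Individual value of the $i$th laboratory

$\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}$ Mean value of the $i$th laboratory

$\bar{\bar{x}} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n_i} x_{ij}$ Total mean value

$s_i = \sqrt{\frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}$ Standard deviation of the $i$th laboratory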

A/ Verification of outlier values within one laboratory

 

After determining five individual values $x_{ij}$, a Grubbs test is performed at the laboratory to identify outlier values.

Test the null hypothesis whereby the experimental value with the greatest absolute deviation from the mean is not an outlier observation.

Calculate $PG = \frac{|x_i^* - \bar{x}_i|}{s_i}$

$x_i^*$ = suspect value

Compare PG with the corresponding value shown in Table 1 for P = 95%.

If PG < value as read, $x_i^*$ is not an outlier and $s_i$ can be calculated.

If PG > value as read, $x_i^*$ probably is an outlier; therefore make a further three determinations.

Calculate the Grubbs test for $x_i^*$ with the eight determinations.

If PG > corresponding value for P = 99%, regard $x_i^*$ as a deviant value and calculate $s_i$ without $x_i^*$.
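The two-stage Grubbs check described above can be sketched in Python as follows. This is a minimal illustration rather than an official implementation: it assumes the usual form of the statistic (absolute deviation of the suspect value from the laboratory mean, divided by the sample standard deviation), uses the critical values of Table 1, and takes the values reported for laboratory 3 in Table 6 as input. The names `GRUBBS_CRITICAL` and `grubbs_statistic` are hypothetical.

```python
from statistics import mean, stdev

# Critical values copied from Table 1: n -> (P = 95%, P = 99%).
GRUBBS_CRITICAL = {
    3: (1.155, 1.155), 4: (1.481, 1.496), 5: (1.715, 1.764),
    6: (1.887, 1.973), 7: (2.020, 2.139), 8: (2.126, 2.274),
    9: (2.215, 2.387), 10: (2.290, 2.482), 11: (2.355, 2.564),
    12: (2.412, 2.636),
}

def grubbs_statistic(values):
    """Return (PG, suspect); suspect is the value farthest from the mean."""
    centre = mean(values)
    suspect = max(values, key=lambda v: abs(v - centre))
    # stdev() uses the n - 1 denominator, as assumed for s_i.
    return abs(suspect - centre) / stdev(values), suspect

# Stage 1: five determinations (laboratory 3 in Table 6), tested at 95%.
first_five = [567, 558, 563, 532, 560]
pg5, suspect5 = grubbs_statistic(first_five)
needs_more = pg5 > GRUBBS_CRITICAL[5][0]

# Stage 2: three further determinations, eight values tested at 99%.
all_eight = first_five + [560, 563, 567]
pg8, suspect8 = grubbs_statistic(all_eight)
is_outlier = pg8 > GRUBBS_CRITICAL[8][1]

print(suspect5, round(pg5, 3), needs_more, round(pg8, 3), is_outlier)
```

In this sketch the suspect value is kept unless it also fails the 99% test on eight determinations, which mirrors the procedure above.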

B/ Comparison of variances among laboratories

 

Bartlett Test

The Bartlett test allows us to examine both major and minor variances.  It serves to test the null hypothesis of the equality of variances in all laboratories, as against the alternative hypothesis whereby the variances are not equal in the case of some laboratories.

At least five individual values are required per laboratory.

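Calculate the test statistic:

$PB = \frac{1}{C} \left[ (N - m) \ln s_r^2 - \sum_{i=1}^{m} (n_i - 1) \ln s_i^2 \right]$

where $s_r^2 = \frac{1}{N - m} \sum_{i=1}^{m} (n_i - 1) s_i^2$ and $C = 1 + \frac{1}{3(m - 1)} \left[ \sum_{i=1}^{m} \frac{1}{n_i - 1} - \frac{1}{N - m} \right]$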

Compare PB with the value indicated in table 2 at m - 1 degrees of freedom.

If PB > the value in the table, there are differences among the variances.
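As an illustration, the Bartlett statistic can be computed from the laboratory variances and sample sizes. The sketch below assumes the standard form of Bartlett's statistic (pooled variance with a correction factor C). The input is the set of nine variances listed in Table 6 once laboratory 6 is left out, an assumption inferred from the f = 8 quoted with the worked example. The function name is hypothetical.

```python
import math

def bartlett_statistic(variances, sizes):
    """Bartlett statistic PB for m laboratories (standard form, assumed)."""
    m = len(variances)
    dof = [n - 1 for n in sizes]              # n_i - 1 for each laboratory
    total_dof = sum(dof)                      # N - m
    pooled = sum(f * v for f, v in zip(dof, variances)) / total_dof
    correction = 1 + (sum(1 / f for f in dof) - 1 / total_dof) / (3 * (m - 1))
    spread = total_dof * math.log(pooled) - sum(
        f * math.log(v) for f, v in zip(dof, variances)
    )
    return spread / correction

# Variances s_i^2 and sizes n_i of laboratories 1-5 and 7-10 (Table 6).
variances = [41.8, 14.7, 12.3, 17.3, 34.7, 31.7, 39.5, 31.7, 30.2]
sizes = [5, 5, 7, 5, 5, 5, 5, 5, 5]

pb = bartlett_statistic(variances, sizes)
critical_95 = 15.51                           # Table 2, f = m - 1 = 8
print(round(pb, 2), pb > critical_95)
```

A PB below the tabulated value means the hypothesis of equal variances is not rejected.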

Cochran Test

The Cochran test is used to confirm that the variance from one laboratory is greater than that from the other laboratories.

Calculate the test statistic:

$PC = \frac{s_{i,\max}^2}{\sum_{i=1}^{m} s_i^2}$

Compare PC with the value shown in table 3 for m and at P = 99%.

If PC > the table value, the variance is significantly greater than the others.

If there is a significant result from the Bartlett or Cochran tests, eliminate the outlier variance and calculate the statistical test again.
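A short sketch of the Cochran check, assuming the usual definition of the statistic (largest variance divided by the sum of all variances) and using the ten variances listed in Table 6. The laboratories do not all have the same number of values, so reading the critical value from the $n_i$ = 5 column of Table 3 is an assumption made for illustration only.

```python
def cochran_statistic(variances):
    """Largest variance as a fraction of the sum of all variances."""
    return max(variances) / sum(variances)

# s_i^2 for laboratories 1 to 10, as listed in Table 6.
lab_variances = {
    1: 41.8, 2: 14.7, 3: 12.3, 4: 17.3, 5: 34.7,
    6: 222.6, 7: 31.7, 8: 39.5, 9: 31.7, 10: 30.2,
}

pc = cochran_statistic(list(lab_variances.values()))
suspect_lab = max(lab_variances, key=lab_variances.get)
critical_99 = 0.393   # Table 3, m = 10, n_i = 5, P = 99% (assumed column)

print(suspect_lab, round(pc, 3), pc > critical_99)
```

If the statistic is significant, the laboratory with the largest variance is set aside and the tests are run again on the remaining laboratories.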

In the absence of a statistical method appropriate to a simultaneous test of several outlier values, the repeated application of the tests is permitted, but should be used with caution.

If the laboratories produce variances that differ sharply from each other, an investigation must be made to find the causes and to decide whether the experimental values found by those laboratories are to be eliminated or not.  If they are, the coordinator will have to consider how representative the remaining laboratories are.

If statistical analysis shows that there are differing variances, this shows that the laboratories have operated the methods at varying precisions.  This may be due to inadequate practice or to lack of clarity or inadequate description in the method.

C/ Systematic errors

Systematic errors made by laboratories are identified using either Fisher's variance analysis or Dixon's test.

R. A. Fisher variance analysis

This test is applied to the remaining experimental values from the laboratories with an identical variance.

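The test is used to identify whether the spread of the mean values from the laboratories, expressed by the variance among the laboratories ($s_Z^2$), is very much greater than that of the individual values, expressed by the variance within the laboratories ($s_I^2$).

Calculate the test statistic:

$s_Z^2 = \frac{1}{m - 1} \sum_{i=1}^{m} n_i (\bar{x}_i - \bar{\bar{x}})^2$

$s_I^2 = \frac{1}{N - m} \sum_{i=1}^{m} (n_i - 1) s_i^2$

$PF = \frac{s_Z^2}{s_I^2}$

Compare PF with the corresponding value shown in table 4 (distribution of F), where $f_1 = f_Z = m - 1$ and $f_2 = f_I = N - m$ degrees of freedom.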

If PF > the table value, it can be concluded that there are differences among the means, that is, there are systematic errors.
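The variance analysis can be sketched as a one-way analysis of variance on the individual values. The example assumes the usual between-laboratory and within-laboratory mean squares and uses the individual values of the eight laboratories that appear to remain in the worked example of Table 6 (an inference from the 7 and 34 degrees of freedom quoted there). It illustrates the calculation and is not expected to reproduce the printed PF exactly, because it starts from the individual values rather than the tabulated variances.

```python
from statistics import mean, variance

def anova_f_statistic(groups):
    """Return (PF, f1, f2) for a list of per-laboratory value lists."""
    m = len(groups)
    total = sum(len(g) for g in groups)                     # N
    grand_mean = sum(sum(g) for g in groups) / total
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (m - 1)
    within = sum((len(g) - 1) * variance(g) for g in groups) / (total - m)
    return between / within, m - 1, total - m

# Individual values from Table 6 (laboratories 1, 3, 4, 5, 7, 8, 9, 10;
# the value 532 of laboratory 3 is left out as an outlier).
groups = [
    [548, 556, 558, 553, 542],
    [567, 558, 563, 560, 560, 563, 567],
    [557, 550, 555, 560, 551],
    [569, 575, 565, 560, 572],
    [557, 560, 560, 552, 547],
    [548, 543, 560, 551, 548],
    [558, 563, 551, 555, 560],
    [554, 559, 551, 545, 557],
]

pf, f1, f2 = anova_f_statistic(groups)
print(round(pf, 2), f1, f2)
```

The resulting PF is then compared with the value of table 4 for the degrees of freedom returned alongside it.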

Dixon test

This test enables us to confirm that the mean from one laboratory is greater or smaller than that from the other laboratories.

Take a data series Z(h), h = 1,2,3...H, ranged in increasing order.

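Calculate the test statistic according to the number of values H:

For H = 3 to 7:

$Q = \frac{Z(2) - Z(1)}{Z(H) - Z(1)}$ or $Q = \frac{Z(H) - Z(H-1)}{Z(H) - Z(1)}$

For H = 8 to 12:

$Q = \frac{Z(2) - Z(1)}{Z(H-1) - Z(1)}$ or $Q = \frac{Z(H) - Z(H-1)}{Z(H) - Z(2)}$

For H = 13 or more:

$Q = \frac{Z(3) - Z(1)}{Z(H-2) - Z(1)}$ or $Q = \frac{Z(H) - Z(H-2)}{Z(H) - Z(3)}$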

Compare the greatest value of Q with the critical values shown in table 5.

If the test statistic is > the table value at P = 95%, the mean in question can be regarded as an outlier.

If there is a significant result in the R. A. Fisher variance analysis or the Dixon test, eliminate one of the extreme values and calculate the test statistics again with the remaining values. As regards repeated application of the tests, see the explanations in paragraph (B).
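The Dixon ratios can be sketched as below, following the test criteria listed in Table 5 for the three ranges of H. The input is the list of laboratory means from Table 6 without laboratory 6 (assumed here to have been removed already by the Cochran test). The function name is hypothetical, and the sketch assumes the values are not all identical.

```python
def dixon_statistic(values):
    """Return (Q, suspect): the greater of the two Dixon ratios and its value."""
    z = sorted(values)
    h = len(z)
    if h < 3:
        raise ValueError("the Dixon test needs at least three values")
    # (i, j) select the order statistics for H = 3-7, 8-12 and 13 or more.
    i, j = (1, 0) if h <= 7 else (1, 1) if h <= 12 else (2, 2)
    q_low = (z[i] - z[0]) / (z[h - 1 - j] - z[0])           # smallest value suspect
    q_high = (z[h - 1] - z[h - 1 - i]) / (z[h - 1] - z[j])  # largest value suspect
    return (q_low, z[0]) if q_low >= q_high else (q_high, z[h - 1])

# Laboratory means from Table 6, laboratories 1-5 and 7-10.
means = [551, 302, 563, 555, 568, 555, 550, 556, 553]

q, suspect = dixon_statistic(means)
critical_95 = 0.564   # Table 5, m = 9, P = 95%
print(round(q, 3), suspect, q > critical_95)
```

Here the lowest mean stands out clearly, so the corresponding laboratory would be eliminated and the test repeated on the remaining means.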

If systematic errors are found, the corresponding experimental values must not be included in subsequent computations; the cause of the systematic error must be investigated.

D/Calculating repeatability (r) and reproducibility (R).

From the results remaining after elimination of outliers, calculate the standard deviation of repeatability sr and repeatability r, and the standard deviation of reproducibility sR and reproducibility R, which are shown as characteristic values of the method of analysis.

If there is no difference between the means from the laboratories, then there is no difference between $s_r$ and $s_R$ or between r and R.  But if we find differences among the laboratory means, although these may be tolerated for practical considerations, we have to show $s_r$ and $s_R$, and r and R.
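The text above does not spell out how r and R are obtained from the variance analysis. The sketch below is an assumption based on conventional interlaboratory practice (as in ISO 5725): $s_r$ is the within-laboratory standard deviation, the between-laboratory variance component is $(s_Z^2 - s_I^2)/\bar{n}$, and r and R are $2\sqrt{2}$ times $s_r$ and $s_R$. The inputs are the within- and between-laboratory figures quoted with Table 6; the function name is hypothetical.

```python
import math

def precision_limits(s_within, s_between, sizes):
    """Return (s_r, r, s_R, R) from ANOVA standard deviations (assumed formulas)."""
    m, total = len(sizes), sum(sizes)
    # Effective number of values per laboratory for unequal n_i.
    n_bar = (total - sum(n * n for n in sizes) / total) / (m - 1)
    # Between-laboratory variance component, floored at zero.
    var_lab = max((s_between ** 2 - s_within ** 2) / n_bar, 0.0)
    s_r = s_within
    s_R = math.sqrt(s_r ** 2 + var_lab)
    factor = 2 * math.sqrt(2)        # about 2.83, for P = 95% (assumed)
    return s_r, factor * s_r, s_R, factor * s_R

# Within- and between-laboratory figures quoted with Table 6 (8 laboratories).
sizes = [5, 7, 5, 5, 5, 5, 5, 5]
s_r, r, s_R, R = precision_limits(5.37, 13.97, sizes)
print(round(s_r, 2), round(r), round(s_R, 2), round(R))
```

Under these assumptions the sketch gives figures close to those quoted with Table 6.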

Bibliography

  • AFNOR, norme NFX06041, Fidélité des méthodes d'essai.  Détermination de la répétabilité et de la reproductibilité par essais interlaboratoires.
  • DAVIES O. L., GOLDSMITH P. L., Statistical Methods in Research and Production, Oliver and Boyd, Edinburgh, 1972.
  • GOETSCH F. H., KRÖNERT W., OLSCHIMKE D., OTTO U., VIERKÖTTER S., Meth. An., 1978, No 667.
  • GOTTSCHALK G., KAISER K. E., Einführung in die Varianzanalyse und Ringversuche, B‑I Hochschultaschenbücher, Band 775, 1976.
  • GRAF, HENNING, WILRICH, Statistische Methoden bei textilen Untersuchungen, Springer Verlag, Berlin, Heidelberg, New York, 1974.
  • GRUBBS F. E., Sample Criteria for Testing Outlying Observations, The Annals of Mathematical Statistics, 1950, vol. 21, p 27‑58.
  • GRUBBS F. E., Procedures for Detecting Outlying Observations in Samples, Technometrics, 1969, vol. 11, No 1, p 1‑21.
  • GRUBBS F. E. and BECK G., Extension of Sample Sizes and Percentage Points for Significance Tests of Outlying Observations, Technometrics, 1972, vol. 14, No 4, p 847‑854.
  • ISO, norme 5725.
  • KAISER R., GOTTSCHALK G., Elementare Tests zur Beurteilung von Messdaten, B‑I Hochschultaschenbücher, Band 774, 1972.
  • LIENERT G. A., Verteilungsfreie Verfahren in der Biostatistik, Band I, Verlag Anton Hain, Meisenheim am Glan, 1973.
  • NALIMOV V. V., The Application of Mathematical Statistics to Chemical Analysis, Pergamon Press, Oxford, London, Paris, Frankfurt, 1963.
  • SACHS L., Statistische Auswertungsmethoden, Springer Verlag, Berlin, Heidelberg, New York, 1968.

 

Table 1 - Critical values for the Grubbs test

 n   P = 95%   P = 99%
 3    1.155     1.155
 4    1.481     1.496
 5    1.715     1.764
 6    1.887     1.973
 7    2.020     2.139
 8    2.126     2.274
 9    2.215     2.387
10    2.290     2.482
11    2.355     2.564
12    2.412     2.636

Table 2 – Critical values for the Bartlett test (P = 95%)

f (m - 1)    χ²      f (m - 1)    χ²
    1       3.84        21       32.7
    2       5.99        22       33.9
    3       7.81        23       35.2
    4       9.49        24       36.4
    5      11.07        25       37.7
    6      12.59        26       38.9
    7      14.07        27       40.1
    8      15.51        28       41.3
    9      16.92        29       42.6
   10      18.31        30       43.8
   11      19.68        35       49.8
   12      21.03        40       55.8
   13      22.36        50       67.5
   14      23.69        60       79.1
   15      25.00        70       90.5
   16      26.30        80      101.9
   17      27.59        90      113.1
   18      28.87       100      124.3
   19      30.14
   20      31.41

Table 3 – Critical values for the Cochran test

      ni = 2        ni = 3        ni = 4        ni = 5        ni = 6
 m    99%    95%    99%    95%    99%    95%    99%    95%    99%    95%
 2     -      -    0.995  0.975  0.979  0.939  0.959  0.906  0.937  0.877
 3   0.993  0.967  0.942  0.871  0.883  0.798  0.834  0.746  0.793  0.707
 4   0.968  0.906  0.864  0.768  0.781  0.684  0.721  0.629  0.676  0.590
 5   0.928  0.841  0.788  0.684  0.696  0.598  0.633  0.544  0.588  0.506
 6   0.883  0.781  0.722  0.616  0.626  0.532  0.564  0.480  0.520  0.445
 7   0.838  0.727  0.664  0.561  0.568  0.480  0.508  0.431  0.466  0.397
 8   0.794  0.680  0.615  0.516  0.521  0.438  0.463  0.391  0.423  0.360
 9   0.754  0.638  0.573  0.478  0.481  0.403  0.425  0.358  0.387  0.329
10   0.718  0.602  0.536  0.445  0.447  0.373  0.393  0.331  0.357  0.303
11   0.684  0.570  0.504  0.417  0.418  0.348  0.366  0.308  0.332  0.281
12   0.653  0.541  0.475  0.392  0.392  0.326  0.343  0.288  0.310  0.262
13   0.624  0.515  0.450  0.371  0.369  0.307  0.322  0.271  0.291  0.246
14   0.599  0.492  0.427  0.352  0.349  0.291  0.304  0.255  0.274  0.232
15   0.575  0.471  0.407  0.335  0.332  0.276  0.288  0.242  0.259  0.220
16   0.553  0.452  0.388  0.319  0.316  0.262  0.274  0.230  0.246  0.208
17   0.532  0.434  0.372  0.305  0.301  0.250  0.261  0.219  0.234  0.198
18   0.514  0.418  0.356  0.293  0.288  0.240  0.249  0.209  0.223  0.189
19   0.496  0.403  0.343  0.281  0.276  0.230  0.238  0.200  0.214  0.181
20   0.480  0.389  0.330  0.270  0.265  0.220  0.229  0.192  0.205  0.174
21   0.465  0.377  0.318  0.261  0.255  0.212  0.220  0.185  0.197  0.167
22   0.450  0.365  0.307  0.252  0.246  0.204  0.212  0.178  0.189  0.160
23   0.437  0.354  0.297  0.243  0.238  0.197  0.204  0.172  0.182  0.155
24   0.425  0.343  0.287  0.235  0.230  0.191  0.197  0.166  0.176  0.149
25   0.413  0.334  0.278  0.228  0.222  0.185  0.190  0.160  0.170  0.144
26   0.402  0.325  0.270  0.221  0.215  0.179  0.184  0.155  0.164  0.140
27   0.391  0.316  0.262  0.215  0.209  0.173  0.179  0.150  0.159  0.135
28   0.382  0.308  0.255  0.209  0.202  0.168  0.173  0.146  0.154  0.131
29   0.372  0.300  0.248  0.203  0.196  0.164  0.168  0.142  0.150  0.127
30   0.363  0.293  0.241  0.198  0.191  0.159  0.164  0.138  0.145  0.124
31   0.355  0.286  0.235  0.193  0.186  0.155  0.159  0.134  0.141  0.120
32   0.347  0.280  0.229  0.188  0.181  0.151  0.155  0.131  0.138  0.117
33   0.339  0.273  0.224  0.184  0.177  0.147  0.151  0.127  0.134  0.114
34   0.332  0.267  0.218  0.179  0.172  0.144  0.147  0.124  0.131  0.111
35   0.325  0.262  0.213  0.175  0.168  0.140  0.144  0.121  0.127  0.108
36   0.318  0.256  0.208  0.172  0.165  0.137  0.140  0.119  0.124  0.106
37   0.312  0.251  0.204  0.168  0.161  0.134  0.137  0.116  0.121  0.103
38   0.306  0.246  0.200  0.164  0.157  0.131  0.134  0.113  0.119  0.101
39   0.300  0.242  0.196  0.161  0.154  0.129  0.131  0.111  0.116  0.099
40   0.294  0.237  0.192  0.158  0.151  0.126  0.128  0.108  0.114  0.097

Table 4 – Critical values for the F-Test (P=99%)

Columns: f1; rows: f2

 f2     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
  1  4052  4999  5403  5625  5764  5859  5928  5981  6023  6056  6083  6106  6126  6143  6157
  2  98.5  99.0  99.2  99.3  99.3  99.3  99.4  99.4  99.4  99.4  99.4  99.4  99.4  99.4  99.4
  3  34.1  30.8  29.4  28.7  28.2  27.9  27.7  27.5  27.3  27.2  27.1  27.1  27.0  26.9  26.9
  4  21.2  18.0  16.7  16.0  15.5  15.2  15.0  14.8  14.7  14.5  14.5  14.4  14.3  14.2  14.2
  5  16.3  13.3  12.1  11.4  11.0  10.7  10.5  10.3  10.2  10.1  9.96  9.89  9.82  9.77  9.72
  6  13.7  10.9  9.78  9.15  8.75  8.47  8.26  8.10  7.98  7.87  7.79  7.72  7.66  7.60  7.56
  7  12.2  9.55  8.45  7.85  7.46  7.19  6.99  6.84  6.72  6.62  6.54  6.47  6.41  6.36  6.31
  8  11.3  8.65  7.59  7.01  6.63  6.37  6.18  6.03  5.91  5.81  5.73  5.67  5.61  5.56  5.52
  9  10.6  8.02  6.99  6.42  6.06  5.80  5.61  5.47  5.35  5.26  5.18  5.11  5.05  5.01  4.96
 10  10.0  7.56  6.55  5.99  5.64  5.39  5.20  5.06  4.94  4.85  4.77  4.71  4.65  4.60  4.56
 11  9.64  7.20  6.21  5.67  5.31  5.07  4.88  4.74  4.63  4.54  4.46  4.39  4.34  4.29  4.25
 12  9.33  6.93  5.95  5.41  5.06  4.82  4.64  4.50  4.39  4.30  4.22  4.16  4.10  4.05  4.01
 13  9.07  6.70  5.74  5.21  4.86  4.62  4.44  4.30  4.19  4.10  4.02  3.96  3.90  3.86  3.82
 14  8.86  6.51  5.56  5.04  4.69  4.46  4.28  4.14  4.03  3.94  3.86  3.80  3.75  3.70  3.66
 15  8.68  6.36  5.42  4.89  4.56  4.32  4.14  4.00  3.89  3.80  3.73  3.67  3.61  3.56  3.52
 16  8.53  6.23  5.29  4.77  4.44  4.20  4.03  3.89  3.78  3.69  3.62  3.55  3.50  3.45  3.41
 17  8.40  6.11  5.18  4.67  4.34  4.10  3.93  3.79  3.68  3.59  3.52  3.46  3.40  3.35  3.31
 18  8.29  6.01  5.09  4.58  4.25  4.01  3.84  3.71  3.60  3.51  3.43  3.37  3.32  3.27  3.23
 19  8.18  5.93  5.01  4.50  4.17  3.94  3.77  3.63  3.52  3.43  3.36  3.30  3.24  3.19  3.15
 20  8.10  5.85  4.94  4.43  4.10  3.87  3.70  3.56  3.46  3.37  3.29  3.23  3.18  3.13  3.09
 21  8.02  5.78  4.87  4.37  4.04  3.81  3.64  3.51  3.40  3.31  3.24  3.17  3.12  3.07  3.03
 22  7.95  5.72  4.82  4.31  3.99  3.76  3.59  3.45  3.35  3.26  3.18  3.12  3.07  3.02  2.98
 23  7.88  5.66  4.76  4.26  3.94  3.71  3.54  3.41  3.30  3.21  3.14  3.07  3.02  2.97  2.93
 24  7.82  5.61  4.72  4.22  3.90  3.67  3.50  3.36  3.26  3.17  3.09  3.03  2.98  2.93  2.89
 25  7.77  5.57  4.68  4.18  3.85  3.63  3.46  3.32  3.22  3.13  3.06  2.99  2.94  2.89  2.85
 26  7.72  5.53  4.64  4.14  3.82  3.59  3.42  3.29  3.18  3.09  3.02  2.96  2.90  2.86  2.81
 27  7.68  5.49  4.60  4.11  3.78  3.56  3.39  3.26  3.15  3.06  2.99  2.93  2.87  2.82  2.78
 28  7.64  5.45  4.57  4.07  3.75  3.53  3.36  3.23  3.12  3.03  2.96  2.90  2.84  2.79  2.75
 29  7.60  5.42  4.54  4.04  3.73  3.50  3.33  3.20  3.09  3.00  2.93  2.87  2.81  2.77  2.73
 30  7.56  5.39  4.51  4.02  3.70  3.47  3.30  3.17  3.07  2.98  2.91  2.84  2.79  2.74  2.70
 40  7.31  5.18  4.31  3.83  3.51  3.29  3.12  2.99  2.89  2.80  2.73  2.66  2.61  2.56  2.52
 50  7.17  5.06  4.20  3.72  3.41  3.19  3.02  2.89  2.78  2.70  2.62  2.56  2.51  2.46  2.42
 60  7.07  4.98  4.13  3.65  3.34  3.12  2.95  2.82  2.72  2.63  2.56  2.50  2.44  2.39  2.35
 70  7.01  4.92  4.07  3.60  3.29  3.07  2.91  2.78  2.67  2.59  2.51  2.45  2.40  2.35  2.31
 80  6.96  4.88  4.04  3.56  3.25  3.04  2.87  2.74  2.64  2.55  2.48  2.42  2.36  2.31  2.27
 90  6.92  4.85  4.01  3.53  3.23  3.01  2.84  2.72  2.61  2.52  2.45  2.39  2.33  2.29  2.24
100  6.89  4.82  3.98  3.51  3.21  2.99  2.82  2.69  2.59  2.50  2.43  2.37  2.31  2.27  2.22
200  6.75  4.71  3.88  3.41  3.11  2.89  2.73  2.60  2.50  2.41  2.34  2.27  2.22  2.17  2.13
500  6.69  4.65  3.82  3.36  3.05  2.84  2.68  2.55  2.44  2.36  2.29  2.22  2.17  2.12  2.07
  ∞  6.63  4.61  3.78  3.32  3.02  2.80  2.64  2.51  2.41  2.32  2.25  2.18  2.13  2.08  2.04

Table 4 – Critical values for the F-Test (P=99%) [Continued]

Columns: f1; rows: f2

 f2    16    17    18    19    20    30    40    50    60    70    80   100   200   500     ∞
  1  6169  6182  6192  6201  6209  6261  6287  6303  6313  6320  6326  6335  6350  6361  6366
  2  99.4  99.4  99.4  99.4  99.5  99.5  99.5  99.5  99.5  99.5  99.5  99.5  99.3  99.5  99.5
  3  26.8  26.8  26.8  26.7  26.7  26.5  26.4  26.4  26.3  26.3  26.3  26.2  26.2  26.1  26.1
  4  14.2  14.1  14.1  14.0  14.0  13.8  13.7  13.7  13.7  13.6  13.6  13.6  13.5  13.5  13.5
  5  9.68  9.64  9.61  9.58  9.55  9.38  9.29  9.24  9.20  9.18  9.16  9.13  9.08  9.04  9.02
  6  7.52  7.48  7.45  7.42  7.40  7.23  7.14  7.09  7.06  7.03  7.01  6.99  6.93  6.90  6.88
  7  6.28  6.24  6.21  6.18  6.16  5.99  5.91  5.86  5.82  5.80  5.78  5.75  5.70  5.67  5.65
  8  5.48  5.44  5.41  5.38  5.36  5.20  5.12  5.07  5.03  5.01  4.99  4.96  4.91  4.88  4.86
  9  4.92  4.89  4.86  4.83  4.81  4.65  4.57  4.52  4.48  4.46  4.44  4.41  4.36  4.33  4.31
 10  4.52  4.49  4.46  4.43  4.41  4.25  4.17  4.12  4.08  4.06  4.04  4.01  3.96  3.93  3.91
 11  4.21  4.18  4.15  4.12  4.10  3.94  3.86  3.81  3.77  3.75  3.73  3.70  3.65  3.62  3.60
 12  3.97  3.94  3.91  3.88  3.86  3.70  3.62  3.57  3.54  3.51  3.49  3.47  3.41  3.38  3.36
 13  3.78  3.74  3.72  3.69  3.66  3.51  3.42  3.37  3.34  3.32  3.30  3.27  3.22  3.19  3.17
 14  3.62  3.59  3.56  3.53  3.51  3.35  3.27  3.22  3.18  3.16  3.14  3.11  3.06  3.03  3.00
 15  3.49  3.45  3.42  3.40  3.37  3.21  3.13  3.08  3.05  3.02  3.00  2.98  2.92  2.89  2.87
 16  3.37  3.34  3.31  3.28  3.26  3.10  3.02  2.97  2.93  2.91  2.89  2.86  2.81  2.78  2.75
 17  3.27  3.24  3.21  3.19  3.16  3.00  2.92  2.87  2.83  2.81  2.79  2.76  2.71  2.68  2.65
 18  3.19  3.16  3.13  3.10  3.08  2.92  2.84  2.78  2.75  2.72  2.70  2.68  2.62  2.59  2.57
 19  3.12  3.08  3.05  3.03  3.00  2.84  2.76  2.71  2.67  2.65  2.63  2.60  2.55  2.51  2.49
 20  3.05  3.02  2.99  2.96  2.94  2.78  2.69  2.64  2.61  2.58  2.56  2.54  2.48  2.44  2.42
 21  2.99  2.96  2.93  2.90  2.88  2.72  2.64  2.58  2.55  2.52  2.50  2.48  2.42  2.38  2.36
 22  2.94  2.91  2.88  2.85  2.83  2.67  2.58  2.53  2.50  2.47  2.45  2.42  2.36  2.33  2.31
 23  2.89  2.86  2.83  2.80  2.78  2.62  2.54  2.48  2.45  2.42  2.40  2.37  2.32  2.28  2.26
 24  2.85  2.82  2.79  2.76  2.74  2.58  2.49  2.44  2.40  2.38  2.36  2.33  2.27  2.24  2.21
 25  2.81  2.78  2.75  2.72  2.70  2.54  2.45  2.40  2.36  2.34  2.32  2.29  2.23  2.19  2.17
 26  2.78  2.75  2.72  2.69  2.66  2.50  2.42  2.36  2.33  2.30  2.28  2.25  2.19  2.16  2.13
 27  2.75  2.71  2.68  2.66  2.63  2.47  2.38  2.33  2.29  2.27  2.25  2.22  2.16  2.12  2.10
 28  2.72  2.68  2.65  2.63  2.60  2.44  2.35  2.30  2.26  2.24  2.22  2.19  2.13  2.09  2.06
 29  2.69  2.66  2.63  2.60  2.57  2.41  2.33  2.27  2.23  2.21  2.19  2.16  2.10  2.06  2.03
 30  2.66  2.63  2.60  2.57  2.55  2.39  2.30  2.25  2.21  2.18  2.16  2.13  2.07  2.03  2.01
 40  2.48  2.45  2.42  2.39  2.37  2.20  2.11  2.06  2.02  1.99  1.97  1.94  1.87  1.85  1.80
 50  2.38  2.35  2.32  2.29  2.27  2.10  2.01  1.95  1.91  1.88  1.86  1.82  1.76  1.71  1.68
 60  2.31  2.28  2.25  2.22  2.20  2.03  1.94  1.88  1.84  1.81  1.78  1.75  1.68  1.63  1.60
 70  2.27  2.23  2.20  2.18  2.15  1.98  1.89  1.83  1.78  1.75  1.73  1.70  1.62  1.57  1.54
 80  2.23  2.20  2.17  2.14  2.12  1.94  1.85  1.79  1.75  1.71  1.69  1.65  1.58  1.53  1.49
 90  2.21  2.17  2.14  2.11  2.09  1.92  1.82  1.76  1.72  1.68  1.66  1.62  1.55  1.50  1.46
100  2.19  2.15  2.12  2.09  2.07  1.89  1.80  1.74  1.69  1.66  1.63  1.60  1.52  1.47  1.43
200  2.09  2.06  2.03  2.00  1.97  1.79  1.69  1.63  1.58  1.55  1.52  1.48  1.39  1.33  1.28
500  2.04  2.00  1.97  1.94  1.92  1.74  1.63  1.56  1.52  1.48  1.45  1.41  1.31  1.23  1.16
  ∞  2.00  1.97  1.93  1.90  1.88  1.70  1.59  1.52  1.47  1.43  1.40  1.36  1.25  1.15  1.00

Table 5 – Critical values for the Dixon test

Test criteria (the greater of the two values is used in each case):

m = 3 to 7:    [Z(2) – Z(1)] / [Z(H) – Z(1)]  or  [Z(H) – Z(H – 1)] / [Z(H) – Z(1)]
m = 8 to 12:   [Z(2) – Z(1)] / [Z(H – 1) – Z(1)]  or  [Z(H) – Z(H – 1)] / [Z(H) – Z(2)]
m = 13 to 40:  [Z(3) – Z(1)] / [Z(H – 2) – Z(1)]  or  [Z(H) – Z(H – 2)] / [Z(H) – Z(3)]

Critical values:

 m    95%    99%
 3   0.970  0.994
 4   0.829  0.926
 5   0.710  0.821
 6   0.628  0.740
 7   0.569  0.680
 8   0.608  0.717
 9   0.564  0.672
10   0.530  0.635
11   0.502  0.605
12   0.479  0.579
13   0.611  0.697
14   0.586  0.670
15   0.565  0.647
16   0.546  0.627
17   0.529  0.610
18   0.514  0.594
19   0.501  0.580
20   0.489  0.567
21   0.478  0.555
22   0.468  0.544
23   0.459  0.535
24   0.451  0.526
25   0.443  0.517
26   0.436  0.510
27   0.429  0.502
28   0.423  0.495
29   0.417  0.489
30   0.412  0.483
31   0.407  0.477
32   0.402  0.472
33   0.397  0.467
34   0.393  0.462
35   0.388  0.458
36   0.384  0.454
37   0.381  0.450
38   0.377  0.446
39   0.374  0.442
40   0.371  0.438

Table 6 – Results of the collaborative study

Analysis:

Sample:

         Individual values $x_i$
Lab nº    1    2    3    4     5    6    7    8    $n_i$   $\bar{x}_i$   $s_i$   $s_i^2$
  1      548  556  558  553   542                    5      551      6.47    41.8
  2      300  299  304  308   300                    5      302      3.83    14.7
  3      567  558  563  532*  560  560  563  567     7      563      3.51    12.3
  4      557  550  555  560   551                    5      555      4.16    17.3
  5      569  575  565  560   572                    5      568      5.89    34.7
  6      550  546  549  557   588  570  576  568     8      563     14.92   222.6
  7      557  560  560  552   547                    5      555      5.63    31.7
  8      548  543  560  551   548                    5      550      6.28    39.5
  9      558  563  551  555   560                    5      556      5.63    31.7
 10      554  559  551  545   557                    5      553      5.5     30.2

Statistical Figures:

Bartlett Test:

PB = 3.16 < 15.51  (95%; f = 8)

Analysis of variance:

Within laboratory: $s_I$ = 5.37

Between laboratory: $s_Z$ = 13.97; $f_Z$ = 7

PF = 6.76 > 3.21 (99%; $f_1$ = 7; $f_2$ = 34)

$s_r$ = 5.37

r = 15

$s_R$ = 7.78

R = 22

Reliability of methods

OIV-MA-AS1-08 Reliability of analytical results

Data concerning the reliability of analytical methods, as determined by collaborative studies, are applicable in the following cases:

  1. Verifying the results obtained by a laboratory with a reference method
  2. Evaluating analytical results which indicate a legal limit has been exceeded
  3. Comparing results obtained by two or more laboratories and comparing those results with a reference value
  4. Evaluating results obtained from a non-validated method

1. Verification of the acceptability of results obtained with a reference method

 

The validity of analytical results depends on the following:

  • the laboratory should perform all analyses within the framework of an appropriate quality control system which includes the organization, responsibilities, procedures, etc.
  • as part of the quality control system, the laboratory should operate according to an internal Quality Control Procedure
  • results should be obtained in accordance with the acceptability criteria described in the internal Quality Control Procedure

Internal quality control shall be established in accordance with internationally recognized standards, such as those of the IUPAC document titled "Harmonized Guidelines for Internal Quality Control in Analytical Laboratories."

Internal Quality Control implies an analysis of the reference material.

Reference samples should consist of a template of the samples to be analyzed and should contain an appropriate, known concentration of the substance analyzed which is similar to that found in the sample.

To the extent possible, reference material shall be certified by an internationally recognized organization.

However, for many types of analysis, there are no certified reference materials.  In this case, one could use, for example, material analyzed by several laboratories in a competence test, taking the average of the results as the value assigned to the substance analyzed.

One could also prepare reference material by formulation (model solution with known components) or by adding a known quantity of the substance analyzed to a sample which does not contain (or does not yet contain) the substance, by means of a recovery test (dosed addition) on one of the samples to be analyzed.

Quality Control is assured by adding reference material to each series of samples, and analyzing these pairs (test samples and reference material).  This verifies correct implementation of the method and should be independent of the analytical calibration and protocol as its goal is to verify the aforementioned.

Series means a number of samples analyzed under repeatable conditions.  Internal controls serve to ensure the appropriate level of uncertainty is not exceeded.

If the analytical results are considered to be part of a normal population whose mean is m and standard deviation is s, only around 0.3% of the results will be outside the limits m ± 3s.  When aberrant results are obtained (outside these limits), the system is considered to be outside statistical control (unreliable data).

The control is graphically represented using Shewhart Control Graphs.  To produce these graphs, the measured values obtained from the reference material are placed on the vertical axis while the series numbers are placed on the horizontal axis.  The graph also includes horizontal lines representing the mean, m, m ± 2s (warning limits) and m ± 3s (action limits) (Figure 1).

To estimate the standard deviation, a control should be analyzed, in pairs, in at least 12 trials.  Each analytical pair shall be analyzed under repeatable conditions and randomly inserted in a sample series.  Analyses will be duplicated on different days to reflect reasonable changes from one series to another.  Variations can have several causes: modification of the composition of the reagents, instrument re-calibration and even different operators.  After eliminating aberrant data using the Grubbs test, calculate the standard deviation to construct the Shewhart graphs.  This standard deviation is compared to that of the reference method.  If the published precision level of the reference method is not obtained, the causes should be investigated.

The precision limits of the laboratory should be periodically revised by repeating the indicated procedure.

Once the Quality Control graph is constructed, graph the results obtained from each series for the control material.

A series is considered outside statistical control if:

(I) a value is outside the action limits,

(II) the current and previous values are both situated outside the warning limits, even if within the action limits,

(III) nine successive values lie on the same side of the mean.
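The three rules can be checked mechanically, as in the sketch below. It assumes warning limits at m ± 2s and action limits at m ± 3s, and it treats rule (II) as triggered when the current and previous values are both beyond a warning limit on either side, a point the text leaves open. The function name, the rule labels and the numbers in the examples are illustrative only.

```python
def control_violations(values, mean, sd):
    """Return a list of (series index, rule) for each rule violation found."""
    flags = []
    run_side, run_length = 0, 0
    for i, value in enumerate(values):
        deviation = value - mean
        if abs(deviation) > 3 * sd:
            flags.append((i, "I"))            # outside the action limits
        elif (i > 0 and abs(deviation) > 2 * sd
              and abs(values[i - 1] - mean) > 2 * sd):
            flags.append((i, "II"))           # two successive values beyond warning
        # Track how many successive values lie on the same side of the mean.
        side = (deviation > 0) - (deviation < 0)
        if side != 0 and side == run_side:
            run_length += 1
        else:
            run_length = 1 if side != 0 else 0
        run_side = side
        if run_length >= 9:
            flags.append((i, "III"))          # nine in a row on one side
    return flags

# Illustrative control values for a material with m = 100 and s = 2.
rule_one = control_violations([100, 101, 99, 107], 100, 2)
rule_two = control_violations([100, 104.5, 104.2], 100, 2)
rule_three = control_violations([101] * 9, 100, 2)
print(rule_one, rule_two, rule_three)
```

Any flagged series would be rejected and investigated, as described in the following paragraph.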

The laboratory response to "outside control" conditions is to reject the results for the series and perform tests to determine the cause, then take action to remedy the situation.

A Shewhart Control Graph can also be produced for the differences between analytical pairs in the same sample, especially when reference material does not exist.  In this case, the absolute difference between two analyses of the same sample is graphed.  The graph's lower line is 0, the warning limit is 1.128 $S_w$ and the action limit is 3.686 $S_w$, where $S_w$ = the standard deviation of a series.

This type of graph only accounts for repeatability.  It should be no greater than the published repeatability limit for the method.

In the absence of control material, it sometimes becomes necessary to verify that the reproducibility limit of the reference method is not exceeded by comparing the results obtained to those obtained by an experimental laboratory using the same sample.

Each laboratory performs two tests and the following formula is used:

$|\bar{y}_1 - \bar{y}_2| \le CrD_{95} = \sqrt{R^2 - \frac{r^2}{2}}$

$CrD_{95}$ = Critical difference (P = 0.95)

$\bar{y}_1$ = Mean of 2 results obtained by lab 1

$\bar{y}_2$ = Mean of 2 results obtained by lab 2

R = Reproducibility of reference method

r = Repeatability of reference method

If the critical difference has been exceeded, the underlying reason is to be found and the test is to be repeated within one month.

2. Evaluation of analytical results indicating that a legal limit has been exceeded

When analytical results indicate that a legal limit has been exceeded, the following procedure should be followed:

In the case of an individual result, conduct a second test under repeatable conditions.  If it is not possible to conduct a second test under repeatable conditions, conduct a double analysis under repeatable conditions and use these data to evaluate the critical difference.

Determine the absolute value of the difference between the mean of the results obtained under repeatable conditions and the legal limit.  An absolute value of the difference which is greater than the critical difference indicates that the sample does not fit the specifications.

Critical difference is calculated by the formula:

= Mean of results obtained

= Limit

n=Number of analyses

R=reproducibility

r=repeatability

In other words, this is a maximal limit where the average of the results obtained should not be greater than:

If the limit is a minimum, the average of the results obtained should not be less than:

  1. Comparing results obtained using two or more laboratories and comparing these results to a reference value

 

To determine whether or not data originating in two laboratories are in agreement, calculate the absolute difference between the two results and compare it to the critical difference:

$$CD = \sqrt{R^2 - r^2\left(1 - \frac{1}{2n_1} - \frac{1}{2n_2}\right)}$$

$\bar{y}_1$ = Mean of the results obtained by lab 1

$\bar{y}_2$ = Mean of the results obtained by lab 2

$n_1$ = Number of analyses in lab 1 sample

$n_2$ = Number of analyses in lab 2 sample

R = Reproducibility of reference method

r = Repeatability of reference method

If the result is the average of two tests, the equation can be simplified to:

$$CD = \sqrt{R^2 - \frac{r^2}{2}}$$

If the data are individual results, the critical difference is R.

If the critical difference is not exceeded, the conclusion is that the results of the two laboratories are in agreement.
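As a minimal sketch of this comparison, the Python function below evaluates the critical difference in the form commonly given in ISO 5725-6, CD = sqrt(R^2 - r^2 (1 - 1/(2 n1) - 1/(2 n2))), which reduces to sqrt(R^2 - r^2/2) for two results per laboratory and to R for individual results. The function names and the numerical values are illustrative assumptions, not part of the method.

```python
import math

def critical_difference(R, r, n1, n2):
    """Critical difference (P = 0.95) between the means of two laboratories.

    R, r: reproducibility and repeatability limits of the reference method.
    n1, n2: number of analyses averaged in laboratory 1 and laboratory 2.
    """
    return math.sqrt(R**2 - r**2 * (1 - 1 / (2 * n1) - 1 / (2 * n2)))

def labs_agree(mean1, mean2, R, r, n1=2, n2=2):
    """True when the absolute difference does not exceed the critical difference."""
    return abs(mean1 - mean2) <= critical_difference(R, r, n1, n2)

# Hypothetical values: R = 0.30, r = 0.10, two results per laboratory.
print(round(critical_difference(0.30, 0.10, 2, 2), 4))
print(labs_agree(12.10, 12.35, 0.30, 0.10))
```

With n1 = n2 = 1 the term in parentheses vanishes and the function returns R, in line with the statement above for individual results.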

Comparing results obtained by several laboratories with a reference value:

Suppose p laboratories have each made $n_i$ determinations, whose mean for each laboratory is $\bar{y}_i$ and whose total mean is:

$$\bar{\bar{y}} = \frac{1}{p}\sum_{i=1}^{p}\bar{y}_i$$

The mean of all laboratories is compared with the reference value.  If the absolute difference exceeds the critical difference, as calculated using the following formula, we conclude the results are not in agreement with the reference value:

)

=Critical difference, calculated as indicated in point 2, for the reference method.

For example, the reference value can be the value assigned to a reference material or the value obtained by the same laboratory or by a different laboratory with a different method.

 

  1. Evaluating analytical results obtained using non-validated methods

A provisional reproducibility value can be assigned to a non-validated method by comparing it to that of a second laboratory:

= Mean of 2 results obtained by lab 1

= Mean of 2 results obtained by lab 2

r = Repeatability of reference method

Provisional reproducibility can be used to calculate critical difference.

If provisional reproducibility is less than twice the value of repeatability, it should be set to 2r.

A reproducibility value greater than three times repeatability or twice the value calculated using the Horwitz equation is not acceptable.

Horwitz equation:

$$RSD_R\,(\%) = 2^{(1 - 0.5\log_{10} C)}$$

$RSD_R$ (%) = Standard deviation for reproducibility (expressed as a percentage of the mean)

C = concentration, expressed as a decimal fraction (for example, 10 g/100 g = 0.1)

This equation was empirically obtained from more than 3000 collaborative studies including a diverse group of analyzed substances, matrices and measurement techniques.  In the absence of other information, RSDR values that are lower or equal to the RSDR values calculated using the Horwitz equation can be considered acceptable.

RSDR values calculated by the Horwitz equation:

Concentration    RSDR (%)
10^-9            45
10^-8            32
10^-7            23
10^-6            16
10^-5            11
10^-4            8
10^-3            5.6
10^-2            4
10^-1            2.8
1                2
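The tabulated values can be reproduced from the commonly cited form of the Horwitz equation, RSDR (%) = 2^(1 - 0.5 log10 C). The short Python sketch below does this; the function name is an illustrative assumption.

```python
import math

def horwitz_rsd_percent(c):
    """Predicted reproducibility RSD (%) for a concentration c given as a decimal fraction."""
    return 2 ** (1 - 0.5 * math.log10(c))

# Concentrations from 10^-9 up to 1, as in the table.
for exponent in range(-9, 1):
    c = 10.0 ** exponent
    print(f"1e{exponent:+d}  {horwitz_rsd_percent(c):.1f}")
```

An observed RSDR can then be compared directly with `horwitz_rsd_percent(c)` at the concentration of interest.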

If the result obtained using a non-validated method is close to the limit specified by legislation, the decision limit shall be determined as follows (for upper limits):

and, for lower limits,

S = decision limit

= legal limit

= provisional reproducibility for non-validated method

=reproducibility for reference method

= critical difference, calculated as indicated in point 2, for the reference method

The result which exceeds the decision limit should be replaced with a final result obtained using the reference method.

Critical differences for probability levels other than 95%

This difference can be determined by multiplying the critical differences at the 95% level by the coefficients shown in Table 1.

Table 1 - Multiplicative coefficients allowing the calculation of critical differences for probability levels other than 95%

Probability level P (%)    Multiplicative coefficient
90                         0.82
95                         1.00
98                         1.16
99                         1.29
99.5                       1.40
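Applying Table 1 amounts to a lookup and a multiplication. The following Python sketch illustrates this under the assumption that only the tabulated probability levels are needed; the names are hypothetical.

```python
# Multiplicative coefficients from Table 1, keyed by probability level (%).
COEFFICIENTS = {90: 0.82, 95: 1.00, 98: 1.16, 99: 1.29, 99.5: 1.40}

def critical_difference_at(cd_95, probability):
    """Scale a 95% critical difference to another tabulated probability level."""
    if probability not in COEFFICIENTS:
        raise ValueError(f"no coefficient tabulated for P = {probability}%")
    return cd_95 * COEFFICIENTS[probability]

# Hypothetical 95% critical difference of 0.29 converted to the 99% level.
print(critical_difference_at(0.29, 99))
```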

Shewhart control graph


 

Bibliography

  • "Harmonized Guidelines for Internal Quality Control in Analytical Chemistry Laboratories". IUPAC. Pure and Appl. Chem., Vol. 67, nº 4, 649-666, 1995.
  • "Shewhart Control Charts". ISO 8258, 1991.
  • "Precision of test methods - Determination of repeatability and reproducibility for a standard test method by inter-laboratory tests". ISO 5725, 1994.
  • "Draft Commission Regulation of establishing rules for the application of reference and routine methods for the analysis and quality evaluation of milk and milk products". Commission of the European Communities, 1995.
  • "Harmonized protocols for the adoption of standardized analytical methods and for the presentation of their performance characteristics". IUPAC. Pure and Appl. Chem., Vol. 62, nº 1, 149-162, 1990.

Protocol for the design, conduct and interpretation of collaborative studies

OIV-MA-AS1-09 Protocol for the design, conduct and interpretation of collaborative studies

Introduction

After a number of meetings and workshops, a group of representatives from 27 organizations adopted by consensus a "Protocol for the design, conduct and interpretation of collaborative studies", which was published in Pure & Appl. Chem. 60, 855-864, 1988. A number of organizations have accepted and used this protocol. As a result of their experience and the recommendations of the Codex Committee on Methods of Analysis and Sampling (Joint FAO/WHO Food Standards Programme, Report of the Eighteenth Session, 9-13 November, 1992; FAO, Rome, Italy, ALINORM 93/23, Sections 34-39), three minor revisions were recommended for incorporation into the original protocol. These are: (1) Delete the double split level design, because the interaction term it generates depends upon the choice of levels and, if it is statistically significant, the interaction cannot be physically interpreted. (2) Amplify the definition of "material". (3) Change the outlier removal criterion from 1% to 2.5%.

The revised protocol incorporating the changes is reproduced below. Some minor editorial revisions to improve readability have also been made. The vocabulary and definitions of the document 'Nomenclature of Interlaboratory Studies (Recommendations 1994)' [published in Pure Appl. Chem., 66, 1903-1911 (1994)] have been incorporated into this revision, which also uses, as far as possible, the appropriate terms of the International Organization for Standardization (ISO), modified to be applicable to analytical chemistry.

Protocol

  1. Preliminary work

Method-performance (collaborative) studies require considerable effort and should be conducted only on methods that have received adequate prior testing. Such within-laboratory testing should include, as applicable, information on the following:

1.1.  Preliminary estimates of precision

Estimates of the total within-laboratory standard deviation of the analytical results over the concentration range of interest, as a minimum at the upper and lower limits of the concentration range, with particular emphasis on any standard or specification value.

Note 1: The total within-laboratory standard deviation is a more inclusive measure of imprecision than the ISO repeatability standard deviation, §3.3 below. This standard deviation is the largest of the within-laboratory type precision variables to be expected from the performance of a method; it includes at least variability from different days and preferably from different calibration curves. It includes between-run (between-batch) as well as within-run (within-batch) variations. In this respect it can be considered as a measure of within-laboratory reproducibility. Unless this value is well within acceptable limits, it cannot be expected that the between-laboratory standard deviation (reproducibility standard deviation) will be any better. This precision term is not estimated from the minimum study described in this protocol.

Note 2: The total within-laboratory standard deviation may also be estimated from ruggedness trials that indicate how tightly controlled the experimental factors must be and what their permissible ranges are. These experimentally determined ranges should be incorporated into the description of the method.

1.2.  Systematic error (bias)

Estimates of the systematic error of the analytical results over the concentration range and in the substances of interest, as a minimum at the upper and lower limits of the concentration range, with particular emphasis on any standard or specification value.

The results obtained by applying the method to relevant reference materials should be noted.

1.3.  Recoveries

The recoveries of "spikes" added to real materials and to extracts, digests, or other treated solutions thereof.

1.4.  Applicability

The ability of the method to identify and measure the physical and chemical forms of the analyte likely to be present in the materials, with due regard to matrix effects.

1.5.  Interference

The effect of other constituents that are likely to be present at appreciable concentrations in matrices of interest and which may interfere in the determination.

1.6.  Method comparison

The results of comparison of the application of the method with existing tested methods intended for similar purposes.

1.7.  Calibration Procedures

The procedures specified for calibration and for blank correction must not introduce important bias into the results.

1.8.  Method description

The method must be clearly and unambiguously written.

1.9.  Significant figures

The initiating laboratory should indicate the number of significant figures to be reported, based on the output of the measuring instrument.

Note: In making statistical calculations from the reported data, the full power of the calculator or computer is to be used with no rounding or truncating until the final reported mean and standard deviations are achieved. At this point the standard deviations are rounded to 2 significant figures and the means and related standard deviations are rounded to accommodate the significant figures of the standard deviation. For example, if $s_R$ = 0.012, $\bar{x}$ is reported as 0.147, not as 0.1473 or 0.15, and RSDR is reported as 8.2%. (Symbols are defined in Appendix 1.) If standard deviation calculations must be conducted manually in steps, with the transfer of intermediate results, the number of significant figures to be retained for squared numbers should be at least 2 times the number of figures in the data plus 1.
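The rounding rule in this note can be sketched in Python as follows. This is one possible reading of the rule (round the standard deviation to 2 significant figures, then round the mean to the same decimal place); the helper names `round_sig` and `report` are assumptions introduced for illustration.

```python
import math

def round_sig(value, sig):
    """Round value to sig significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)

def report(mean, sd):
    """Round sd to 2 significant figures and the mean to the same decimal place."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    sd_rounded = round_sig(sd, 2)
    # Decimal position of the second significant figure of the rounded sd.
    decimals = 1 - math.floor(math.log10(sd_rounded))
    return round(mean, decimals), sd_rounded

mean, sd = report(0.1473, 0.012)
print(mean, sd, round_sig(100 * sd / mean, 2))
```

All intermediate calculations are carried at full precision; rounding is applied only to the values that are finally reported.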

  2. Design of the method-performance study

 

2.1.  Number of materials

For a single type of substance, at least 5 materials (test samples) must be used; only when a single level specification is involved for a single matrix may this minimum required number of materials be reduced to 3. For this design parameter, the two portions of a split level and the two individual portions of blind replicates per laboratory are considered as a single material.

Note 1: A material is an 'analyte/matrix/concentration' combination to which the method-performance parameters apply. This parameter determines the applicability of a method. For application to a number of different substances, a sufficient number of matrices and levels should be chosen to include potential interferences and the concentration of typical use.

Note 2: The 2 or more test samples of blind or open replicates are, statistically, a single material (they are not independent).

Note 3: A single split level (Youden pair) statistically analyzed as a pair is a single material; if analyzed statistically and reported as single test samples, they are 2 materials. In addition, the pair can be used to calculate the within-laboratory standard deviation, $s_r$, as

$$s_r = \sqrt{\frac{\sum d_i^2}{2n}}$$ (for duplicates, blind or open)

$$s_r = \sqrt{\frac{\sum (d_i - \bar{d})^2}{2(n-1)}}$$ (for Youden pairs)

where $d_i$ is the difference between the 2 individual values from the split level for each laboratory and n is the number of laboratories. In this special case, $s_R$, the among-laboratories standard deviation, is merely the average of the two $s_R$ values calculated from the individual components of the split level, and it is used only as a check of the calculations.

Note 4: The blank or negative control may be a material or not depending on the usual purpose of the analysis. For example, in trace analysis, where very low levels (near the limit of quantitation) are often sought, the blanks are considered as materials and are necessary to determine certain 'limits of measurement.' However, if the blank is merely a procedural control in macro analysis (e.g., fat in cheese), it would not be considered a material.
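The within-laboratory standard deviation of Note 3 can be sketched in Python as below. The sketch assumes the usual expressions of the harmonized protocol: sqrt(sum(d_i^2) / (2n)) for duplicates and sqrt(sum((d_i - d_mean)^2) / (2(n - 1))) for Youden pairs, with one pair of values per laboratory. The function names and data are hypothetical.

```python
import math

def sr_duplicates(pairs):
    """s_r from blind or open duplicates: sqrt(sum(d_i^2) / (2n))."""
    d = [a - b for a, b in pairs]
    return math.sqrt(sum(x * x for x in d) / (2 * len(d)))

def sr_youden(pairs):
    """s_r from Youden (split-level) pairs: sqrt(sum((d_i - d_mean)^2) / (2(n - 1)))."""
    d = [a - b for a, b in pairs]
    d_mean = sum(d) / len(d)
    return math.sqrt(sum((x - d_mean) ** 2 for x in d) / (2 * (len(d) - 1)))

# One (first value, second value) pair per laboratory; invented data.
duplicates = [(10.0, 10.2), (10.1, 9.9), (9.8, 10.0)]
youden = [(10.0, 10.5), (10.2, 10.6), (9.9, 10.5)]
print(sr_duplicates(duplicates), sr_youden(youden))
```

Subtracting the mean difference in the Youden case removes the deliberate concentration offset between the two levels, so only the random within-laboratory scatter remains.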

2.2.  Number of laboratories

At least 8 laboratories must report results for each material; only when it is impossible to obtain this number (e.g., very expensive instrumentation or specialized laboratories required) may the study be conducted with fewer, but with an absolute minimum of 5 laboratories. If the study is intended for international use, laboratories from different countries should participate. In the case of methods requiring the use of specialized instruments, the study might include the entire population of available laboratories. In such cases, "n" is used in the denominator for calculating the standard deviation instead of "(n - 1)". Subsequent entrants to the field should demonstrate the ability to perform as well as the original participants.

2.3.  Number of Replicates

The repeatability precision parameters must be estimated by using one of the following sets of designs (listed in approximate order of desirability):

2.3.1.      Split Level

For each level that is split and which constitutes only a single material for purposes of design and statistical analysis, use 2 nearly identical test samples that differ only slightly in analyte concentration (e.g., <1-5%). Each laboratory must analyse each test sample once and only once.

Note: The statistical criterion that must be met for a pair of test samples to constitute a split level is that the reproducibility standard deviation of the two parts of the single split level must be equal.

2.3.2.      Combination blind replicates and split level

Use split levels for some materials and blind replicates for other materials in the same study (single values from each submitted test sample).

2.3.3.      Blind replicates

For each material, use blind identical replicates; when data censoring is impossible (e.g., automatic input, calculation, and printout), non-blind identical replicates may be used.

2.3.4.      Known replicates

For each material, use known replicates (2 or more analyses of test portions from the same test sample), but only when it is not practical to use one of the preceding designs.

2.3.5.      Independent analyses

Use only a single test portion from each material (i.e., do not perform multiple analyses) in the study, but rectify the inability to calculate repeatability parameters by quality control parameters or other within-laboratory data obtained independently of the method-performance study.

  3. Statistical analysis (see Flowchart, A.4.1)

For the statistical analysis of the data, the required statistical procedures listed below must be performed and the results reported. Supplemental, additional procedures are not precluded.

3.1.  Valid data

Only valid data should be reported and subjected to statistical treatment. Valid data are those data that would be reported as resulting from the normal performance of laboratory analyses; they are not marred by method deviations, instrument malfunctions, unexpected occurrences during performance, or by clerical, typographical and arithmetical errors.

3.2.  One-way analysis of variance

One-way analysis of variance and outlier treatments must be applied separately to each material (test sample) to estimate the components of variance and repeatability and reproducibility parameters.

3.3.  Initial estimation

Calculate the mean, $\bar{x}$ (= the average of laboratory averages), the repeatability relative standard deviation, RSDr, and the reproducibility relative standard deviation, RSDR, with no outliers removed, but using only valid data.

3.4.  Outlier treatment

The estimated precision parameters that must also be reported are based on the initial valid data purged of all outliers flagged by the harmonized 1994 outlier removal procedure. This procedure essentially consists of sequential application of the Cochran and Grubbs tests (at 2.5% probability (P) level, 1-tail for Cochran and 2-tail for Grubbs) until no further outliers are flagged or until a drop of 22.2% (= 2/9) in the original number of laboratories providing valid data would occur.

Note: Prompt consultation with a laboratory reporting suspect values may result in correction of mistakes or discovery of conditions that lead to invalid data (see 3.1). Recognizing mistakes and invalid data per se is much preferred to relying upon statistical tests to remove deviate values.

3.4.1.      Cochran test

First apply the Cochran outlier test (1-tail test at P = 2.5%) and remove any laboratory whose critical value exceeds the tabular value given in the table, Appendix A.3.1, for the number of laboratories and replicates involved.

3.4.2.      Grubbs tests

Apply the single value Grubbs test (2-tail) and remove any outlying laboratory. If no laboratory is flagged, then apply the pair value tests (2-tail): 2 values at the same end and 1 value at each end, P = 2.5% overall. Remove any laboratory(ies) flagged by these tests whose critical value exceeds the tabular value given in the appropriate column of the table, Appendix A.3.3. Stop removal when the next application of the test would flag as outliers more than 22.2% (2 of 9) of the laboratories.

Note: The Grubbs tests are to be applied one material at a time to the set of replicate means from all laboratories, and not to the individual values from replicated designs because the distribution of all the values taken together is multimodal, not Gaussian, i.e., their differences from the overall mean for that material are not independent.

3.4.3.      Final estimation

Recalculate the parameters as in §3.3 after the laboratories flagged by the preceding procedure have been removed. If no outliers were removed by the Cochran-Grubbs sequence, terminate testing. Otherwise, reapply the Cochran-Grubbs sequence to the data purged of the flagged outliers until no further outliers are flagged or until more than a total of 22.2% (2 of 9 laboratories) would be removed in the next cycle. See flowchart A.4.1.

  4. Final report

The final report should be published and should include all valid data. Other information and parameters should be reported in a format similar (with respect to the reported items) to the following, as applicable:

[x] Method-performance tests carried out at the international level in [year(s)] by [organisation] in which [y and z] laboratories participated, each performing [k] replicates, gave the following statistical results:

Table of method-performance parameters

Analyte; Results expressed in [units]

Material [Description and listed in columns across top of table in increasing order of magnitude of means]

Number of laboratories retained after eliminating outliers

Number of outlying laboratories

Code (or designation) of outlying laboratories

Number of accepted results

Mean

True or accepted value, if known

Repeatability standard deviation ($s_r$)

Repeatability relative standard deviation (RSDr)

Repeatability limit, r (2.8 x $s_r$)

Reproducibility standard deviation ($s_R$)

Reproducibility relative standard deviation (RSDR)

Reproducibility limit, R (2.8 x $s_R$)

4.1.  Symbols

A set of symbols for use in reports and publications is attached as Appendix 1 (A.1.).

4.2.  Definitions

A set of definitions for use in study reports and publications is attached as Appendix 2 (A.2.).

4.3.  Miscellaneous

4.3.1.      Recovery

Recovery of added analyte as a control on method or laboratory bias should be calculated as follows:

[Marginal] Recovery, %=

(Total analyte found - analyte originally present) x 100/(analyte added)

Although the analyte may be expressed as either concentration or amount, the units must be the same throughout. When the quantity of analyte is determined by analysis, it must be determined in the same way throughout.

Analytical results should be reported uncorrected for recovery. Report recoveries separately.
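The recovery formula above translates directly into code. The Python sketch below is illustrative only; the function name and the numbers are assumptions.

```python
def marginal_recovery_percent(total_found, originally_present, added):
    """Marginal recovery (%) of added analyte; all arguments in the same units."""
    if added <= 0:
        raise ValueError("amount of analyte added must be positive")
    return (total_found - originally_present) * 100 / added

# Hypothetical spike: 10.0 units originally present, 5.0 added, 14.6 found in total.
print(marginal_recovery_percent(14.6, 10.0, 5.0))
```

The recovery is reported alongside the analytical result, which itself stays uncorrected.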

4.3.2.      When $s_L^2$ is negative

By definition, $s_R$ is greater than or equal to $s_r$ in method-performance studies; occasionally the estimate of $s_r$ is greater than the estimate of $s_R$ (the range of the replicates is greater than the range of laboratory averages and the calculated $s_L^2$ is then negative). When this occurs, set $s_L$ = 0 and $s_R$ = $s_r$.
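A minimal Python sketch of the one-way analysis of variance for a balanced design, including the rule above for a negative between-laboratory variance estimate, could look as follows. It assumes the standard decomposition (within-laboratory mean square estimates the repeatability variance; between-laboratory mean square estimates the repeatability variance plus k times the between-laboratory variance). The function name and data layout are hypothetical.

```python
import math

def precision_estimates(lab_results):
    """One-way ANOVA for a balanced design: returns (s_r, s_L, s_R).

    lab_results: list of per-laboratory lists, each holding the same number k >= 2 of replicates.
    """
    L = len(lab_results)
    k = len(lab_results[0])
    if k < 2 or any(len(lab) != k for lab in lab_results):
        raise ValueError("balanced design with at least 2 replicates required")
    lab_means = [sum(lab) / k for lab in lab_results]
    grand_mean = sum(lab_means) / L
    # Within-laboratory mean square: estimate of the repeatability variance.
    ms_within = sum(
        (x - m) ** 2 for lab, m in zip(lab_results, lab_means) for x in lab
    ) / (L * (k - 1))
    # Between-laboratory mean square: estimates s_r^2 + k * s_L^2.
    ms_between = k * sum((m - grand_mean) ** 2 for m in lab_means) / (L - 1)
    s_l_sq = (ms_between - ms_within) / k
    if s_l_sq < 0:
        s_l_sq = 0.0  # negative estimate: set s_L = 0, so that s_R = s_r
    return math.sqrt(ms_within), math.sqrt(s_l_sq), math.sqrt(ms_within + s_l_sq)

print(precision_estimates([[9.0, 11.0], [11.0, 9.0], [9.0, 11.0]]))
```

In the example all laboratory averages coincide, so the raw between-laboratory estimate is negative and the function returns equal repeatability and reproducibility standard deviations.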

  5. References
  • Horwitz, W. (1988) Protocol for the design, conduct, and interpretation of method performance studies. Pure & Appl. Chem. 60, 855-864.
  • Pocklington, W.D. (1990) Harmonized protocol for the adoption of standardized analytical methods and for the presentation of their performance characteristics. Pure and Appl. Chem. 62, 149-162.
  • International Organization for Standardization. International Standard 5725-1986. Under revision in 6 parts; individual parts may be available from National Standards member bodies.

Appendices

Appendix 1. - Symbols

Use the following set of symbols and terms for designating parameters developed by a method-performance study.

Mean (of laboratory averages): $\bar{x}$

Standard deviations: s (estimates)

  • Repeatability: $s_r$
  • 'Pure' between-laboratory: $s_L$
  • Reproducibility: $s_R$

Variances: $s^2$ (with subscripts r, L, and R)

Relative standard deviations: RSD (with subscripts r, L, and R)

Maximum tolerable differences (as defined by ISO 5725-1986; see A.2.4 and A.2.5):

Repeatability limit: r = (2.8 x $s_r$)

Reproducibility limit: R = (2.8 x $s_R$)

Number of replicates per laboratory: k (general)

Average number of replicates per laboratory i: k (for a balanced design)

Number of laboratories: L

Number of materials (test samples): m

Total number of values in a given assay: n (= kL for a balanced design)

Total number of values in a given study: N (= kLm for an overall balanced design)

____________________

If other symbols are used, their relationship to the recommended symbols should be explained fully.

Appendix 2. -  Definitions

Use the following definitions. The first three definitions utilize the IUPAC document "Nomenclature of Interlaboratory Studies" (approved for publication 1994). The next two definitions are assembled from components given in ISO 3534-1:1993. All test results are assumed to be independent, i.e., 'obtained in a manner not influenced by any previous result on the same or similar test object. Quantitative measures of precision depend critically on the stipulated conditions. Repeatability and reproducibility conditions are particular sets of extreme stipulated conditions.'

A.2.1 Method-performance studies

An interlaboratory study in which all laboratories follow the same written protocol and use the same test method to measure a quantity in sets of identical test items [test samples, materials]. The reported results are used to estimate the performance characteristics of the method. Usually these characteristics are within-laboratory and among-laboratories precision, and when necessary and possible, other pertinent characteristics such as systematic error, recovery, internal quality control parameters, sensitivity, limit of determination, and applicability.

A.2.2 Laboratory-performance study

An interlaboratory study that consists of one or more analyses or measurements by a group of laboratories on one or more homogeneous, stable test items, by the method selected or used by each laboratory. The reported results are compared with those of other laboratories or with the known or assigned reference value, usually with the objective of evaluating or improving laboratory performance.

A.2.3 Material certification study

An interlaboratory study that assigns a reference value ('true value') to a quantity (concentration or property) in the test item, usually with a stated uncertainty.

A.2.4 Repeatability limit (r)

When the mean of the values obtained from two single determinations with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time, lies within the range of the mean values cited in the Final Report, 4.0, the absolute difference between the two test results obtained should be less than or equal to the repeatability limit (r) [= 2.8 x $s_r$] that can generally be inferred by linear interpolation of $s_r$ from the Report.

Note: This definition, and the corresponding definition for reproducibility limit, has been assembled from five cascading terms and expanded to permit application by interpolation to a test item whose mean is not the same as that used to establish the original parameters, which is the usual case in applying these definitions. The term 'repeatability [and reproducibility] limit' is applied specifically to a probability of 95% and is taken as 2.8 x $s_r$ [or $s_R$]. The general term for this statistical concept applied to any measure of location (e.g., median) and with other probabilities (e.g., 99%) is "repeatability [and reproducibility] critical difference".

A.2.5 Reproducibility limit (R)

When the mean of the values obtained from two single determinations with the same method on identical test items in different laboratories with different operators using different equipment, lies within the range of the mean values cited in the Final Report, 4.0, the absolute difference between the two test results obtained should be less than or equal to the reproducibility limit (R) [= 2.8 x $s_R$] that can generally be inferred by linear interpolation of $s_R$ from the Report.

Note 1: When the results of the interlaboratory test make it possible, the value of r and R can be indicated as a relative value (e.g., as a percentage of the determined mean value) as an alternative to the absolute value.

Note 2: When the final reported result in the study is an average derived from more than a single value, i.e., k is greater than 1, the value for R must be adjusted according to the following formula before using R to compare the results of single routine analyses between two laboratories.

$$R' = \sqrt{R^2 + r^2\left(1 - \frac{1}{k}\right)}$$

Similar adjustments must be made for replicate results constituting the final values for $s_R$ and RSDR, if these will be the reported parameters used for quality control purposes.

Note 3: The repeatability limit, r, may be interpreted as the amount within which two determinations should agree with each other within a laboratory 95% of the time. The reproducibility limit, R, may be interpreted as the amount within which two separate determinations conducted in different laboratories should agree with each other 95% of the time.

Note 4: Estimates of $s_R$ can be obtained only from a planned, organized method-performance study; estimates of $s_r$ can be obtained from routine work within a laboratory by use of control charts. For occasional analyses, in the absence of control charts, within-laboratory precision may be approximated as one half $s_R$ (Pure and Appl. Chem., 62, 149-162 (1990), Sec. 1.3, Note).
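The adjustment described in Note 2 can be sketched in Python as follows, assuming the form R' = sqrt(R^2 + r^2 (1 - 1/k)). That form follows from adding back the repeatability variance that averaging k replicates removes. The function name and values are illustrative.

```python
import math

def adjust_R_for_single_results(R, r, k):
    """Reproducibility limit for single results, from a limit R based on averages of k replicates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return math.sqrt(R**2 + r**2 * (1 - 1 / k))

# Hypothetical study values: R = 0.30 and r = 0.10, reported from duplicate averages (k = 2).
print(adjust_R_for_single_results(0.30, 0.10, 2))
```

For k = 1 the correction term is zero and R is returned unchanged.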

A.2.6 One-way analysis of variance

One-way analysis of variance is the statistical procedure for obtaining the estimates of within laboratory and between-laboratory variability on a material-by-material basis. Examples of the calculations for the single level and single-split-level designs can be found in ISO 5725-1986.

Appendix 3. – Critical values

A.3.1 Critical values for the Cochran maximum variance ratio at the 2.5% (1-tail) rejection level, expressed as the percentage the highest variance is of the total variance; r = number of replicates.

No. of labs    r=2     r=3     r=4     r=5     r=6
4              94.3    81.0    72.5    65.4    62.5
5              88.6    72.6    64.6    58.1    53.9
6              83.2    65.8    58.3    52.2    47.3
7              78.2    60.2    52.2    47.3    42.3
8              73.6    55.6    47.4    43.0    38.5
9              69.3    51.8    43.3    39.3    35.3
10             65.5    48.6    39.9    36.2    32.6
11             62.2    45.8    37.2    33.6    30.3
12             59.2    43.1    35.0    31.3    28.3
13             56.4    40.5    33.2    29.2    26.5
14             53.8    38.3    31.5    27.3    25.0
15             51.5    36.4    29.9    25.7    23.7
16             49.5    34.7    28.4    24.4    22.0
17             47.8    33.2    27.1    23.3    21.2
18             46.0    31.8    25.9    22.4    20.4
19             44.3    30.5    24.8    21.5    19.5
20             42.8    29.3    23.8    20.7    18.7
21             41.5    28.2    22.9    19.9    18.0
22             40.3    27.2    22.0    19.2    17.3
23             39.1    26.3    21.2    18.5    16.6
24             37.9    25.5    20.5    17.8    16.0
25             36.7    24.8    19.9    17.2    15.5
26             35.5    24.1    19.3    16.6    15.0
27             34.5    23.4    18.7    16.1    14.5
28             33.7    22.7    18.1    15.7    14.1
29             33.1    22.1    17.5    15.3    13.7
30             32.5    21.6    16.9    14.9    13.3
35             29.3    19.5    15.3    12.9    11.6
40             26.0    17.0    13.5    11.6    10.2
50             21.6    14.3    11.4    9.7     8.6

Tables A.3.1 and A.3.3 were calculated by R. Albert (October, 1993) by computer simulation involving several runs of approximately 7000 cycles each for each value, and then smoothed. Although Table A.3.1 is strictly applicable only to a balanced design (same number of replicates from all laboratories), it can be applied to an unbalanced design without too much error, if there are only a few deviations.

A.3.2 Calculation of Cochran maximum variance outlier ratio

Compute the within-laboratory variance for each laboratory, divide the largest of these variances by the sum of all of the variances, and multiply by 100. The resulting quotient is the Cochran statistic, which indicates the presence of a removable outlier if it exceeds the critical value listed above in the Cochran table for the number of replicates and laboratories specified.
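The calculation just described can be sketched in a few lines of Python. The function name and the data are hypothetical; the critical value has to be read from the Cochran table for the relevant number of laboratories and replicates.

```python
import statistics

def cochran_statistic(lab_results):
    """Largest within-laboratory variance as a percentage of the sum of all variances."""
    variances = [statistics.variance(lab) for lab in lab_results]
    total = sum(variances)
    if total == 0:
        raise ValueError("all within-laboratory variances are zero")
    return 100 * max(variances) / total

# Four laboratories with duplicate results (invented data).
labs = [[1.0, 3.0], [2.0, 4.0], [0.0, 6.0], [5.0, 7.0]]
statistic = cochran_statistic(labs)
critical = 94.3  # tabulated value for 4 laboratories, r = 2
print(statistic, statistic > critical)
```

Here the third laboratory has the largest variance, but the statistic stays below the tabulated critical value, so no laboratory would be removed.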

A.3.3 Critical values for the Grubbs extreme deviation outlier tests at the 2.5% (2-tail), 1.25% (1-tail) rejection level, expressed as the percent reduction in standard deviations caused by the removal of the suspect value(s).

No. of labs    One highest or lowest    Two highest or two lowest    One highest and one lowest
4              86.1                     98.9                         99.1
5              73.5                     90.9                         92.7
6              64.0                     81.3                         84.0
7              57.0                     73.1                         76.2
8              51.4                     66.5                         69.6
9              46.8                     61.0                         64.1
10             42.8                     56.4                         59.5
11             39.3                     52.5                         55.5
12             36.3                     49.1                         52.1
13             33.8                     46.1                         49.1
14             31.7                     43.5                         46.5
15             29.9                     41.2                         44.1
16             28.3                     39.2                         42.0
17             26.9                     37.4                         40.1
18             25.7                     35.9                         38.4
19             24.6                     34.5                         36.9
20             23.6                     33.2                         35.4
21             22.7                     31.9                         34.0
22             21.9                     30.7                         32.8
23             21.2                     29.7                         31.8
24             20.5                     28.8                         30.8
25             19.8                     28.0                         29.8
26             19.1                     27.1                         28.9
27             18.4                     26.2                         28.1
28             17.8                     25.4                         27.3
29             17.4                     24.7                         26.6
30             17.1                     24.1                         26.0
40             13.3                     19.1                         20.5
50             11.1                     16.2                         17.3

A.3.4 Calculation of the Grubbs test values

To calculate the single Grubbs test statistic, compute the average for each laboratory and then calculate the standard deviation (SD) of these L averages (designate as the original s). Calculate the SD of the set of averages with the highest average removed (sH); calculate the SD of the set of averages with the lowest average removed (sL). Then calculate the percentage decrease in SD for both as follows:

  • 100 x [1 - (sL/s)] and 100 x [1 - (sH/s)].

The higher of these two percentage decreases is the single Grubbs test statistic, which signals the presence of an outlier to be omitted at the P = 2.5% level, 2-tail, if it exceeds the critical value listed in the single value column, Column 2, of Table A.3.3, for the number of laboratory averages used to calculate the original s.

To calculate the paired Grubbs test statistics, calculate the percentage decrease in standard deviation obtained by dropping the two highest averages and also by dropping the two lowest averages, as above. Compare the higher of the percentage changes in standard deviation with the tabular values in column 3 and proceed with (1) or (2): (1) If the tabular value is exceeded, remove the responsible pair. Repeat the cycle again, starting at the beginning with the Cochran extreme variance test again, the Grubbs extreme value test, and the paired Grubbs extreme value test. (2) If no further values are removed, then calculate the percentage change in standard deviation obtained by dropping both the highest extreme value and the lowest extreme value together, and compare with the tabular values in the last column of A.3.3. If the tabular value is exceeded, remove the high-low pair of averages, and start the cycle again with the Cochran test until no further values are removed. In all cases, stop outlier testing when more than 22.2% (2/9) of the averages are removed.
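The three Grubbs statistics described in this section can be sketched in Python as percentage decreases in the standard deviation of the laboratory averages. The function names and data are hypothetical, and the sketch covers only the statistics themselves, not the full Cochran-Grubbs cycle or the 2/9 stopping rule.

```python
import statistics

def percent_sd_decrease(averages, removed):
    """Percentage decrease in SD when the averages at the given indices are dropped."""
    s = statistics.stdev(averages)
    if s == 0:
        raise ValueError("standard deviation of the averages is zero")
    kept = [x for i, x in enumerate(averages) if i not in removed]
    return 100 * (1 - statistics.stdev(kept) / s)

def grubbs_statistics(averages):
    """Return (single, paired same-end, paired high-low) Grubbs statistics in percent."""
    if len(averages) < 4:
        raise ValueError("at least 4 laboratory averages are needed")
    order = sorted(range(len(averages)), key=lambda i: averages[i])
    lo1, lo2, hi2, hi1 = order[0], order[1], order[-2], order[-1]
    single = max(percent_sd_decrease(averages, {hi1}),
                 percent_sd_decrease(averages, {lo1}))
    paired = max(percent_sd_decrease(averages, {hi1, hi2}),
                 percent_sd_decrease(averages, {lo1, lo2}))
    high_low = percent_sd_decrease(averages, {hi1, lo1})
    return single, paired, high_low

# Eight laboratory averages (invented data); the tabulated single-value limit for 8 labs is 51.4.
averages = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95, 10.0, 12.0]
single, paired, high_low = grubbs_statistics(averages)
print(round(single, 1), single > 51.4)
```

Each statistic is compared with the matching column of Table A.3.3 for the number of laboratory averages used to calculate the original s.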

Appendix 4

A.4.1 Flowchart for outlier removal