Practical guide for the Validation
OIV-MA-AS1-12 Practical guide for the validation, quality control, and uncertainty assessment of an alternative oenological analysis method
Contents
4.2 Definition of measurement error
5.2 Section one: Scope of method
5.2.1 Definition of analyzable matrices
5.2.2 Detection and quantification limit
5.2.2.4.1 Determination on blank
5.2.2.4.1.2 Basic protocol and calculations
5.2.2.4.2 Approach by linearity study
5.2.2.4.2.2 Basic protocol and calculations
5.2.2.4.3 Graphic approach based on the background noise of the recording
5.2.2.4.3.2 Basic protocol and calculation
5.2.2.4.4 Checking a predetermined quantification limit
5.2.2.4.4.2 Basic protocol and calculation
5.3 Section two: systematic error study
5.3.1.4 ISO 11095-type approach
5.3.1.4.2 Calculations and results
5.3.1.4.2.1 Defining the regression model
5.3.1.4.2.2 Estimating parameters
5.3.1.4.2.4 Test of the linearity assumption
5.3.1.4.2.4.1 Definitions of errors linked to calibration
5.3.1.4.2.4.2 Fisher-Snedecor test
5.3.1.5 ISO 8466-type approach
5.3.1.5.2 Calculations and results
5.3.1.5.2.1 Defining the linear regression model
5.3.1.5.2.2 Defining the polynomial regression model
5.3.1.5.2.3 Comparing residual standard deviations
5.3.2.3.1 Standard addition test
5.3.2.3.1.3 Calculations and results
5.3.2.3.1.3.1 Study of the regression line r = a + b.v
5.3.2.3.1.3.2 Analysis of the results
5.3.2.3.1.3.3 Overlap line graphics
5.3.2.3.2 Study of the influence of other compounds on the measurement result
5.3.2.3.2.2 Basic protocol and calculations
5.3.3 Study of method accuracy
5.3.3.1 Presentation of the step
5.3.3.2 Comparison of the alternative method with the OIV reference method
5.3.3.2.2 Accuracy of the alternative method compared with the reference method
5.3.3.2.2.3 Basic protocol and calculations
5.3.3.3 Comparison by interlaboratory tests
5.3.3.3.2 Basic protocol and calculations
5.3.3.4 Comparison with reference materials
5.3.3.4.2 Basic protocol and calculations
5.4 Section three: random error study
5.4.3.3 General theoretical case
5.4.3.3.1 Basic protocol and calculations
5.4.3.3.1.1 Calculations with several test materials
5.4.3.3.1.2 Calculations with 1 test material
5.4.3.4.3 Basic protocol and calculations
5.4.3.4.3.2 Particular case applicable to only 1 repetition
5.4.3.4.4 Comparison of repeatability
5.4.3.4.4.1 Determination of the repeatability of each method
5.4.3.4.4.2 Fisher-Snedecor test
5.4.3.5 Intralaboratory reproducibility
5.4.3.5.3 Basic protocol and calculations
6. Quality control of analysis methods (IQC)
6.4 Checking the analytical series
6.4.2 Checking accuracy using reference materials
6.5 Checking the analysis system
6.5.2.2 Presentation of results and definition of limits
6.5.2.3 Using the Shewhart chart
6.5.3 Internal comparison of analysis systems
6.5.4 External comparison of the analysis system
6.5.4.1 Analysis chain of interlaboratory comparisons
6.5.4.2 Comparison with external reference materials
6.5.4.2.1 Standard uncertainty of reference material
6.5.4.2.2 Defining the validity limits of measuring reference material
7. Assessment of measurement uncertainty
7.4.1 Definition of the measurand, and description of the quantitative analysis method
7.4.2 Critical analysis of the measurement process
7.4.3 Estimation calculations of standard uncertainty (intralaboratory approach)
7.4.3.2 Calculating the standard deviation of intralaboratory reproducibility
7.4.3.3.1 Gauging error (or calibration error)
7.4.3.3.1.2 Calculations and results
7.4.3.3.1.3 Estimating the standard uncertainty associated with the gauging line (or calibration line)
7.4.3.3.2.1 Methods adjusted with only one certified reference material
7.4.3.3.2.2 Methods adjusted with several reference materials (gauging ranges etc)
7.4.4 Estimating standard uncertainty by interlaboratory tests
7.4.4.3 Using the standard deviation of interlaboratory and intermethod reproducibility SR_{inter}
7.4.4.4 Other components in the uncertainty budget
7.5 Expressing expanded uncertainty
The purpose of this guide is to assist oenological laboratories carrying out serial analysis as part of their validation, internal quality control and uncertainty assessment initiatives concerning the standard methods they use.
International standard ISO 17025, defining the "General Requirements for the Competence of Testing and Calibration Laboratories", states that accredited laboratories must, when implementing an alternative analytical method, make sure of the quality of the results obtained. To do so, it indicates several steps. The first step consists in defining the customers' requirements concerning the parameter in question, in order to determine, thereafter, whether the method used meets those requirements. The second step includes initial validation for non-standardized, modified or laboratory-developed methods. Once the method is applied, the laboratories must use inspection and traceability methods in order to monitor the quality of the results obtained. Finally, they must assess the uncertainty of the results obtained.
In order to meet these requirements, the laboratories have a significant reference system at their disposal comprising a large number of international guides and standards. However, in practice, applying these texts is difficult: because they address every category of calibration and test laboratory, they remain very general and presuppose, on the part of the reader, in-depth knowledge of the mathematical rules applicable to statistical data processing.
This guide is based on this international reference system, taking into account the specific characteristics of oenology laboratories routinely carrying out analyses on series of must or wine samples. Defining the scope of application in this way enabled a relevant choice of suitable tools to be made, in order to retain only those methods most suitable for that scope. Since it is based on the international reference system, this guide is therefore strictly compliant with it. Readers, however, wishing to study certain points of the guide in greater detail can do so by referring to the international standards and guides, the references for which are given in each chapter.
The authors have chosen to combine the various tools meeting the requirements of the ISO 17025 standard since there is an obvious continuity in their application, and the data obtained with certain tools can often be used with the others. In addition, the mathematical resources used are often similar.
The various chapters include application examples, taken from oenology laboratories using these tools.
It is important to point out that this guide does not claim to be exhaustive. It is only designed to present, in as clear and applicable a way as possible, the contents of the requirements of the ISO 17025 standard and the basic resources that can be implemented in a routine laboratory to meet them. Each laboratory remains perfectly free to supplement these tools or to replace them with others that it considers to be more efficient or more suitable.
Finally, the reader’s attention should be drawn to the fact that the tools presented do not constitute an end in themselves and that their use, as well as the interpretation of the results to which they lead, must always be subject to critical analysis. It is only under these conditions that their relevance can be guaranteed, and laboratories will be able to use them as tools to improve the quality of the analyses they carry out.
The definitions below, used in this document, are taken from the normative references given in the bibliography.
Analyte
Object of the analysis method
Blank
Test carried out in the absence of a matrix (reagent blank) or on a matrix which does not contain the analyte (matrix blank).
Bias
Difference between the expected test results and an accepted reference value.
Uncertainty budget
The list of uncertainty sources and their associated standard uncertainties, established in order to assess the combined standard uncertainty associated with a measurement result.
Gauging (of a measuring instrument)
Material positioning of each reference mark (or certain principal reference marks only) of a measuring instrument according to the corresponding value of the measurand.
NOTE "gauging" and "calibration" are not to be confused.
Repeatability conditions
Conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time.
Reproducibility conditions (intralaboratory)
Conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same or different operator(s) using different gauges on different days.
Experimental standard deviation
For a series of n measurements of the same measurand, the quantity s characterizing the dispersion of the results and given by the formula:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
$x_i$ being the result of the i-th measurement and $\bar{x}$ the arithmetic mean of the n results considered.
Repeatability standard deviation
Standard deviation of many repetitions obtained in a single laboratory by the same operator on the same instrument, i.e. under repeatability conditions.
Internal reproducibility standard deviation (or total intralaboratory variability)
Standard deviation of repetitions obtained in a single laboratory with the same method, using several operators or instruments and, in particular, by taking measurements on different dates, i.e. under reproducibility conditions.
Random error
Result of a measurement minus the mean that would result from an infinite number of measurements of the same measurand carried out under reproducibility conditions.
Measurement error
Result of a measurement minus a true value of the measurand.
Systematic error
Mean error that would result from an infinite number of measurements of the same measurand carried out under reproducibility conditions minus a true value of the measurand.
NOTE Error is a highly theoretical concept in that it calls upon values that are not accessible in practice, in particular the true values of measurands. On principle, the error is unknown.
Mathematical expectation
For a series of n measurements of the same measurand, if n tends towards infinity, the mean $\bar{x}$ tends towards the expectation E(x).
$E(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i$
Calibration
Series of operations establishing under specified conditions the relation between the values of the quantity indicated by a measuring instrument or system, or the values represented by a materialized measurement or a reference material, and the corresponding values of the quantity measured by standards.
Intralaboratory evaluation of an analysis method
Action which consists in submitting an analysis method to an intralaboratory statistical study, based on a standardized and/or recognized protocol, demonstrating that within its scope, the analysis method meets pre-established performance criteria.
Within the framework of this document, the evaluation of a method is based on an intralaboratory study, which includes the comparison with a reference method.
Precision
Closeness of agreement between independent test results obtained under prescribed conditions
Note 1 Precision depends only on the distribution of random errors and does not have any relationship with the true or specified value.
Note 2 The measurement of precision is expressed on the basis of the standard deviation of the test results.
Note 3 The expression "independent test results" refers to results obtained such that they are not influenced by a previous result on the same or a similar test material. Quantitative measurements of precision are critically dependent upon the prescribed conditions. Repeatability and reproducibility conditions are particular sets of extreme conditions.
Quantity (measurable)
An attribute of a phenomenon, body or substance that may be distinguished qualitatively and determined quantitatively.
Uncertainty of measurement
A parameter associated with the result of a measurement, which characterizes the dispersion of the values that could reasonably be attributed to the measurand.
Standard uncertainty (u(xi))
Uncertainty of the result of a measurement expressed in the form of a standard deviation.
Accuracy
Closeness of agreement between the mean value obtained starting from a broad series of test results and an accepted reference value.
Note The measurement of accuracy is generally expressed in terms of bias.
Detection limit
Lowest amount of an analyte to be examined in a test material that can be detected and regarded as different from the blank value (with a given probability), but not necessarily quantified. In fact, two risks must be taken into account:
 the risk α of considering that the substance is present in the test material when its quantity is null;
 the risk β of considering that the substance is absent from the test material when its quantity is not null.
Quantification limit
Lowest amount of an analyte to be examined in a test material that can be quantitatively determined under the experimental conditions described in the method with a defined variability (given coefficient of variation).
Linearity
The ability of a method of analysis, within a certain range, to provide an instrumental response or results proportional to the quantity of analyte to be determined in the laboratory sample.
This proportionality is expressed by an a priori defined mathematical expression.
The linearity limits are the experimental limits of concentrations between which a linear calibration model can be applied with a known level of risk (generally taken to be equal to 1%).
Test material
Material or substance to which a measurement can be applied with the analysis method under consideration.
Reference material
Material or substance one or more of whose property values are sufficiently homogeneous and well established to be used for the calibration of an apparatus, the assessment of a measurement method, or for assigning values to materials.
Certified reference material
Reference material, accompanied by a certificate, one or more of whose property values are certified by a procedure which establishes its traceability to an accurate realization of the unit in which the property values are expressed, and for which each certified value is accompanied by an uncertainty at a stated level of confidence.
Matrix
All the constituents of the test material other than the analyte.
Analysis method
Written procedure describing all the means and procedures required to carry out the analysis of the analyte, i.e.: scope, principle and/or reactions, definitions, reagents, apparatus, procedures, expression of results, precision, test report.
WARNING The expressions "titration method" and "determination method" are sometimes used as synonyms for the expression "analysis method". These two expressions should not be used in this way.
Quantitative analysis method
Analysis method making it possible to measure the analyte quantity present in the laboratory test material.
Reference analysis method (Type I or Type II methods)
Method, which gives the accepted reference value for the quantity of the analyte to be measured.
Non-classified alternative method of analysis
A routine analysis method used by the laboratory and not considered to be a reference method.
NOTE An alternative method of analysis can consist in a simplified version of the reference method.
Measurement
Set of operations having the object of determining a value of a quantity.
Note The operations can be carried out automatically.
Measurand
Particular quantity subject to measurement.
Mean
For a series of n measurements of the same measurand, mean value, given by the formula:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
$x_i$ being the result of the i-th measurement.
Result of a measurement
Value assigned to a measurand, obtained by measurement.
Sensitivity
Ratio between the variation of the information value of the analysis method and the variation of the analyte quantity.
The variation of the analyte quantity is generally obtained by preparing various standard solutions, or by adding the analyte to a matrix.
Note 1 Defining, by extension, the sensitivity of a method as its capacity to detect small quantities should be avoided.
Note 2 A method is said to be "sensitive" if a small variation in the analyte quantity produces a significant variation in the information value.
Measurement signal
Quantity that represents the measurand and is functionally linked to it.
Specificity
Property of an analysis method to respond exclusively to the determination of the quantity of the analyte considered, with the guarantee that the measured signal comes only from the analyte.
Tolerance
Deviation from the reference value, as defined by the laboratory for a given level, within which a measured value of a reference material can be accepted.
Value of a quantity
Magnitude of a particular quantity generally expressed as a unit of measurement multiplied by a number.
True value of a quantity
Value compatible with the definition of a given particular quantity.
Note 1 The value that would be obtained if the measurement was perfect
Note 2 Any true value is by nature indeterminate
Accepted reference value
A value that serves as an agreed-upon reference for comparison and which is derived as:
a) a theoretical or established value, based on scientific principles;
b) an assigned or certified value, based on experimental work of some national or international organization;
c) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group;
Within the particular framework of this document, the accepted reference value (or conventionally true value) of the test material is given by the arithmetic mean of the values of measurements repeated as per the reference method.
Variance
Square of the standard deviation.
4.1. Methodology
When developing a new alternative method, the laboratory implements a protocol that includes several steps. The first step, applied either once at the initial stage or on a regular basis, is the validation of the method. This step is followed by permanent quality control. The data collected during these two steps make it possible to assess the quality of the method and are used to evaluate the measurement uncertainty. The latter, which is regularly assessed, is an indicator of the quality of the results obtained by the method under consideration.

All these steps are interconnected and constitute a global approach that can be used to assess and control measurement errors.
4.2. Definition of measurement error
Any measurement carried out using the method under study gives a result which is inevitably associated with a measurement error, defined as being the difference between the result obtained and the true value of the measurand. In practice, the true value of the measurand is inaccessible and a value conventionally accepted as such is used instead.
The measurement error includes two components:
Analysis result = True value + Systematic error + Random error
Measurement error = Analysis result - True value = Systematic error + Random error
In practice, the systematic error results in a bias in relation to the true value, the random error being all the errors associated with the application of the method.
These errors can be graphically represented in the following way:

The validation and quality control tools are used to evaluate the systematic errors and the random errors, and to monitor their changes over time.
5. Validating a method
5.1. Methodology
Validation comprises 3 steps, each with its own objectives. To meet these objectives, the laboratory has validation tools. Several tools are sometimes available for a given objective, each suited to different situations. It is up to the laboratory to choose the most suitable tools for the method to be validated.
| Steps | Objectives | Tools for validation |
|---|---|---|
| Scope of application | To define the analyzable matrices | |
| | To define the analyzable range | Detection and quantification limit |
| | | Robustness study |
| Systematic error or bias | Linear response in the scale of analyzable values | Linearity study |
| | Specificity of the method | Specificity study |
| | Accuracy of the method | Comparison with a reference method |
| | | Comparison with reference materials |
| | | Interlaboratory comparison |
| Random error | Precision of the method | Repeatability study |
| | | Intralaboratory reproducibility study |
5.2. Section one: Scope of method
5.2.1. Definition of analyzable matrices
The matrix comprises all constituents in the test material other than the analyte.
If these constituents are liable to influence the result of a measurement, the laboratory should define the matrices on which the method is applicable.
For example, in oenology, the determination of certain parameters can be influenced by the various possible matrices (wines, musts, sweet wines, etc.).
In case of doubt about a matrix effect, more in-depth studies can be carried out as part of the specificity study.
5.2.2. Detection and quantification limit
This step is neither applicable nor necessary for methods whose lower limit does not tend towards 0, such as alcoholic strength by volume in wines, total acidity in wines, pH, etc.
5.2.2.1. Normative definition
The detection limit is the lowest amount of analyte that can be detected but not necessarily quantified as an exact value. The detection limit is a parameter of limit tests.
The quantification limit is the lowest quantity of the compound that can be determined using the method.
5.2.2.2. Reference documents
 NF V03-110 Standard, intralaboratory validation procedure for an alternative method in relation to a reference method.
 International compendium of analysis methods – OIV, Assessment of the detection and quantification limit of an analysis method (Oeno resolution 7/2000).
5.2.2.3. Application
In practice, the quantification limit is generally more relevant than the detection limit, the latter being by convention 1/3 of the first.
There are several approaches for assessing the detection and quantification limits:
 Determination on blank
 Approach by the linearity study
 Graphic approach
These methods are suitable for various situations, but in every case they are mathematical approaches giving results of informative value only. It seems crucial, whenever possible, to introduce a check of the value obtained, whether by one of these approaches or estimated empirically, using the checking protocol for a predetermined quantification limit.
5.2.2.4. Procedure
5.2.2.4.1. Determination on blank
5.2.2.4.1.1. Scope
This method can be applied when the blank analysis gives results with a non-zero standard deviation. The operator will judge the advisability of using reagent blanks or matrix blanks.
If the blank, for reasons related to uncontrolled signal pre-processing, is sometimes not measurable or does not offer a recordable variation (standard deviation of 0), the operation can be carried out on a very low concentration of analyte, close to the blank.
5.2.2.4.1.2. Basic protocol and calculations
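Carry out the analysis of n test materials assimilated to blanks, n being equal to or higher than 10.
 Calculate the mean of the results obtained:
$\bar{x}_{blank} = \frac{\sum_{i=1}^{n} x_i}{n}$
 Calculate the standard deviation of the results obtained:
$S_{blank} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x}_{blank})^2}{n - 1}}$
 From these results the detection limit is conventionally defined by the formula:
$DL = \bar{x}_{blank} + 3 \cdot S_{blank}$
 From these results the quantification limit is conventionally defined by the formula:
$QL = \bar{x}_{blank} + 10 \cdot S_{blank}$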
Example: The table below gives some of the results obtained when assessing the detection limit for the usual determination of free sulfur dioxide.
| Test material # | X (mg/l) |
|---|---|
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
| 4 | 1.5 |
| 5 | 0 |
| 6 | 1 |
| 7 | 0.5 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0.5 |
| 11 | 0 |
| 12 | 0 |
The calculated values are as follows:
 Number of test materials: n = 12
 Mean: \bar{x}_{blank} = 0.375 mg/l
 Standard deviation: S_{blank} = 0.528 mg/l
 DL = 1.96 mg/l
 QL = 5.65 mg/l
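The calculation on blanks can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventional definitions DL = mean + 3 × standard deviation and QL = mean + 10 × standard deviation, which reproduce the values of the example above; the function name `limits_from_blanks` is hypothetical and not part of the guide.

```python
import math

def limits_from_blanks(results):
    """Return (mean, standard deviation, DL, QL) for a series of blank results."""
    n = len(results)
    if n < 10:
        raise ValueError("the protocol asks for at least 10 blank measurements")
    mean = sum(results) / n
    # Experimental standard deviation with n - 1 degrees of freedom
    s = math.sqrt(sum((x - mean) ** 2 for x in results) / (n - 1))
    if s == 0:
        raise ValueError("standard deviation is 0: this approach is not applicable")
    # Conventional limits: mean + 3 s (detection), mean + 10 s (quantification)
    return mean, s, mean + 3 * s, mean + 10 * s

# Free sulfur dioxide blank results from the example table (mg/l)
blanks = [0, 1, 0, 1.5, 0, 1, 0.5, 0, 0, 0.5, 0, 0]
mean, s, dl, ql = limits_from_blanks(blanks)
print(f"mean = {mean:.3f}, S = {s:.3f}, DL = {dl:.2f}, QL = {ql:.2f}")
```

The check on a zero standard deviation mirrors the scope condition of this approach: with no recordable variation the limits cannot be estimated this way.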
5.2.2.4.2. Approach by linearity study
5.2.2.4.2.1. Scope
This method can be applied in all cases, and is required when the analysis method does not involve background noise. It uses the data calculated during the linearity study.
Note This statistical approach may be biased and give pessimistic results when linearity is calculated over a very wide range of reference material values whose measurement results have variable standard deviations. In such cases, a linearity study limited to a range of low values, close to 0 and with a more homogeneous distribution, will result in a more relevant assessment.
5.2.2.4.2.2. Basic protocol and calculations
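Use the results obtained during the linearity study, which made it possible to calculate the parameters of the calibration function y = a + b.x
The data to be recovered from the linearity study are (see chapter 5.3.1. linearity study):
 slope of the regression line: b
 residual standard deviation: S_{res}
 standard deviation at the intercept point (to be calculated): S_{a}
$S_a = S_{res} \sqrt{\frac{1}{np} + \frac{\bar{x}^2}{p \sum_{i=1}^{n} (x_i - \bar{x})^2}}$
where n is the number of reference materials, p the number of replicas per reference material, $x_i$ the accepted value of the i-th reference material and $\bar{x}$ the mean of the accepted values.
The estimates of the detection limit DL and the quantification limit QL are calculated using the following formulae:
$DL = \frac{3 \cdot S_a}{b}$ Estimated detection limit
$QL = \frac{10 \cdot S_a}{b}$ Estimated quantification limit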
Example: Estimation of the detection and quantification limits in the determination of sorbic acid by capillary electrophoresis, based on linearity data acquired on a range from 1 to 20 mg.L^{-1}.
| X (ref) | Y1 | Y2 | Y3 | Y4 |
|---|---|---|---|---|
| 1 | 1.9 | 0.8 | 0.5 | 1.5 |
| 2 | 2.4 | 2 | 2.5 | 2.1 |
| 3 | 4 | 2.8 | 3.5 | 4 |
| 4 | 5.3 | 4.5 | 4.7 | 4.5 |
| 5 | 5.3 | 5.3 | 5.2 | 5.3 |
| 10 | 11.6 | 10.88 | 12.1 | 10.5 |
| 15 | 16 | 15.2 | 15.5 | 16.1 |
| 20 | 19.7 | 20.4 | 19.5 | 20.1 |

Number of reference materials n = 8
Number of replicas p = 4
Straight line (y = a + b*x)
b = 0.9972
a = 0.51102
Residual standard deviation: S_{res} = 0.588
Standard deviation on the intercept point: S_{a} = 0.1597
The estimated detection limit is DL = 0.48 mg.L^{-1}
The estimated quantification limit is QL = 1.6 mg.L^{-1}
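The sorbic acid example can be reproduced with a short Python sketch. It assumes an ordinary least-squares fit over all n × p points, a residual standard deviation with np - 2 degrees of freedom, and the limits DL = 3 S_a / b and QL = 10 S_a / b; these assumptions give back the values reported above. The function name `limits_from_linearity` and the dictionary keys are illustrative only.

```python
import math

def limits_from_linearity(x_ref, replicates):
    """Fit y = a + b*x on all replicas and derive DL and QL from S_a."""
    # Flatten the n x p table into paired (x, y) observations
    xs = [x for x, row in zip(x_ref, replicates) for _ in row]
    ys = [y for row in replicates for y in row]
    total = len(xs)
    mean_x = sum(xs) / total
    mean_y = sum(ys) / total
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s_res = math.sqrt(ss_res / (total - 2))                  # residual standard deviation
    s_a = s_res * math.sqrt(1 / total + mean_x ** 2 / sxx)   # standard deviation of the intercept
    return {"a": a, "b": b, "s_res": s_res, "s_a": s_a,
            "DL": 3 * s_a / b, "QL": 10 * s_a / b}

# Sorbic acid linearity data from the example (mg/l)
x_ref = [1, 2, 3, 4, 5, 10, 15, 20]
replicates = [
    [1.9, 0.8, 0.5, 1.5], [2.4, 2, 2.5, 2.1], [4, 2.8, 3.5, 4],
    [5.3, 4.5, 4.7, 4.5], [5.3, 5.3, 5.2, 5.3], [11.6, 10.88, 12.1, 10.5],
    [16, 15.2, 15.5, 16.1], [19.7, 20.4, 19.5, 20.1],
]
fit = limits_from_linearity(x_ref, replicates)
print({key: round(value, 4) for key, value in fit.items()})
```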
5.2.2.4.3. Graphic approach based on the background noise of the recording
5.2.2.4.3.1. Scope
This approach can be applied to analysis methods that provide a graphic recording (chromatography, etc.) with a background noise. The limits are estimated from a study of the background noise.
5.2.2.4.3.2. Basic protocol and calculation
Record a certain number of reagent blanks, using 3 series of 3 injections separated by several days.
Determine the following values:
 h_{max}, the greatest variation in amplitude on the y-axis of the signal observed between two acquisition points, excluding drift, over a distance equal to twenty times the width at mid-height of the peak corresponding to the analyte, centered on the retention time of the compound under study.
 R, the quantity/signal response factor, expressed in height.
The detection limit DL and the quantification limit QL are calculated according to the following formulae:
DL = 3 × h_{max} × R
QL = 10 × h_{max} × R
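As an illustration, the graphic approach reduces to two multiplications once the noise amplitude and the response factor are known. The sketch below assumes that the greatest noise amplitude is denoted h_max and that the limits are 3 × h_max × R and 10 × h_max × R; the numerical values are hypothetical and are not taken from the guide.

```python
def limits_from_noise(h_max, response_factor):
    """Return (DL, QL) from the greatest noise amplitude and the response factor R."""
    if h_max <= 0 or response_factor <= 0:
        raise ValueError("h_max and the response factor must be positive")
    # Assumed convention: 3 times (detection) and 10 times (quantification) the noise
    return 3 * h_max * response_factor, 10 * h_max * response_factor

# Hypothetical values for illustration only
h_max = 0.4   # greatest noise amplitude, in signal height units
R = 0.05      # response factor, in mg/l per signal height unit
dl, ql = limits_from_noise(h_max, R)
print(f"DL = {dl:.2f} mg/l, QL = {ql:.2f} mg/l")
```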
5.2.2.4.4. Checking a predetermined quantification limit
This approach can be used to validate a quantification value obtained by statistical or empirical approach.
5.2.2.4.4.1. Scope
This method can be used to check that a given quantification limit is a priori acceptable. It is applicable when the laboratory can procure at least 10 test materials with known quantities of analyte, at the level of the estimated quantification limit.
In the case of methods with a specific signal, not sensitive to matrix effects, the materials can be synthetic solutions whose reference value is obtained by formulation.
In all other cases, wines (or musts) shall be used whose measurand value as obtained by the reference method is equal to the limit to be studied. Of course, in this case the quantification limit of the reference method must be lower than this value.
5.2.2.4.4.2. Basic protocol and calculation
Analyze n independent test materials whose accepted value is equal to the quantification limit to be checked; n must be at least 10.
 Calculate the mean of the n measurements:
$\bar{x}_{LQ} = \frac{\sum_{i=1}^{n} x_i}{n}$
 Calculate the standard deviation of the n measurements:
$S_{LQ} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x}_{LQ})^2}{n - 1}}$
with $x_i$ the result of the measurement of the i-th test material.
The two following conditions must be met:
a) the measured mean quantity must not be different from the predetermined quantification limit QL:
If $\frac{|QL - \bar{x}_{LQ}|}{S_{LQ} / \sqrt{n}} < 10$ then the quantification limit QL is considered to be valid.
Note 10 is a purely conventional value relating to the QL criterion.
b) the quantification limit must be other than 0:
If $5 \cdot S_{LQ} < QL$ then the quantification limit is other than 0.
A value of 5 corresponds to an approximate value for the spread of the standard deviation, taking into account risk α and risk β, to ensure that the QL is other than 0.
This is equivalent to checking that the coefficient of variation for QL is lower than 20%.
Note 1 Remember that the detection limit is obtained by dividing the quantification limit by 3.
Note 2 A check should be made to ensure that the value of S_{LQ} is not too large (which would produce an artificially positive test), and effectively corresponds to a reasonable standard deviation of the variability of the results for the level under consideration. It is up to the laboratory to make this critical evaluation of the value of S_{LQ}.
Example: Checking the quantification limit of the determination of malic acid by the enzymatic method.
Estimated quantification limit: 0.1 g.L^{-1}
| Wine | Values |
|---|---|
| 1 | 0.1 |
| 2 | 0.1 |
| 3 | 0.09 |
| 4 | 0.1 |
| 5 | 0.09 |
| 6 | 0.08 |
| 7 | 0.08 |
| 8 | 0.09 |
| 9 | 0.09 |
| 10 | 0.08 |
Mean: 0.090
Standard deviation: 0.008
First condition: The quantification limit of 0.1 is considered to be valid.
Second condition: The quantification limit is considered to be significantly different from 0.
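The two conditions of the malic acid example can be checked with the following Python sketch. It assumes that the first condition compares |QL - mean| / (S / √n) with the conventional value 10 and that the second condition is 5 × S < QL; the function name `check_quantification_limit` and the returned keys are illustrative only.

```python
import math

def check_quantification_limit(values, ql):
    """Evaluate the two acceptance conditions for a predetermined QL."""
    n = len(values)
    if n < 10:
        raise ValueError("the protocol asks for at least 10 test materials")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("standard deviation is 0: the test statistic is undefined")
    # Condition a): the mean must not differ from QL (conventional threshold of 10)
    statistic = abs(ql - mean) / (s / math.sqrt(n))
    # Condition b): QL must differ from 0, i.e. coefficient of variation below 20 %
    return {"mean": mean, "s": s, "statistic": statistic,
            "mean_ok": statistic < 10, "nonzero_ok": 5 * s < ql}

# Malic acid results from the example (g/l), checked against QL = 0.1 g/l
wines = [0.1, 0.1, 0.09, 0.1, 0.09, 0.08, 0.08, 0.09, 0.09, 0.08]
result = check_quantification_limit(wines, ql=0.1)
print(result)
```

As Note 2 of the protocol points out, a large standard deviation makes the first condition easier to satisfy, so the value of `s` returned here still needs a critical review by the laboratory.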
5.2.3. Robustness
5.2.3.1. Definition
Robustness is the capacity of a method to give close results in the presence of slight changes in the experimental conditions likely to occur during the use of the procedure.
5.2.3.2. Determination
If there is any doubt about the influence of variations in operational parameters, the laboratory can use experimental designs, which enable these critical operating parameters to be tested within the variation range likely to occur under practical conditions. In practice, these tests are difficult to implement.
5.3. Section two: systematic error study
5.3.1. Linearity study
5.3.1.1. Normative definition
The linearity of a method is its ability (within a given range) to provide an informative value or results proportional to the amount of analyte to be determined in the test material.
5.3.1.2. Reference documents
 NF V03-110 standard, Intralaboratory validation procedure of an alternative method in relation to a reference method.
 ISO 11095 Standard, linear calibration using reference materials.
 ISO 8466-1 Standard, Water quality – Calibration and evaluation of analytical methods and estimation of performance characteristics
5.3.1.3. Application
The linearity study can be used to define and validate a linear dynamic range.
This study is possible when the laboratory has stable reference materials whose accepted values have been acquired with certainty (in theory these values should have an uncertainty equal to 0). These could therefore be internal reference materials titrated with calibrated material, wines or musts whose value is given by the mean of at least 3 repetitions of the reference method, external reference materials or certified external reference materials.
In the last case, and only in this case, this study also enables the traceability of the method. The experiment schedule used here could then be considered as a calibration.
In all cases, it is advisable to ensure that the matrix of the reference material is compatible with the method.
Lastly, calculations must be made with the final result of the measurement and not with the value of the signal.
Two approaches are proposed here:
 An ISO 11095 type of approach, the principle of which consists in comparing the residual error with the experimental error using a Fisher's test. This approach is valid above all for relatively narrow ranges (in which the measurand does not vary by more than a factor of 10). In addition, under experimental conditions generating a low reproducibility error, the test becomes excessively severe. On the other hand, in the case of poor experimental conditions, the test will easily be positive and will also lose its relevance. This approach requires good homogeneity of the number of measurements over the entire range studied.
 An ISO 8466 type of approach, the principle of which consists in comparing the residual error caused by the linear regression with the residual error produced by a polynomial regression (of order 2 for example) applied to the same data. If the polynomial model gives a significantly lower residual error, a conclusion of non-linearity could be drawn. This approach is appropriate in particular when there is a risk of high experimental dispersion at one end of the range. It is therefore naturally well-suited to analysis methods for traces. There is no need to work with a homogeneous number of measurements over the whole range, and it is even recommended to increase the number of measurements at the borders of the range.
5.3.1.4. ISO 11095-type approach
5.3.1.4.1. Basic protocol
It is advisable to use a number n of reference materials. The number must be higher than 3, but there is no need to exceed 10. Each reference material should be measured p times under reproducibility conditions; p must be higher than 3, and 5 is generally recommended. The accepted values of the reference materials are to be regularly distributed over the range of values studied. The number of measurements must be identical for all the reference materials.
Note It is essential that the reproducibility conditions include as many potential sources of variability as possible; otherwise the experimental error is underestimated and the test may indicate nonlinearity too readily.
The results are reported in a table presented as follows:
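| Reference material | Accepted value of the reference material | Measured value, replica 1 | ... | Replica j | ... | Replica p |
|---|---|---|---|---|---|---|
| 1 | x_{1} | y_{11} | ... | y_{1j} | ... | y_{1p} |
| ... | ... | ... | ... | ... | ... | ... |
| i | x_{i} | y_{i1} | ... | y_{ij} | ... | y_{ip} |
| ... | ... | ... | ... | ... | ... | ... |
| n | x_{n} | y_{n1} | ... | y_{nj} | ... | y_{np} |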

5.3.1.4.2. Calculations and results
5.3.1.4.2.1. Defining the regression model
The model to be calculated and tested is as follows:
$$y_{ij} = a + b\,x_{i} + \varepsilon_{ij}$$
where
 y_{ij} is the j^{th} replica of the i^{th} reference material.
 x_{i} is the accepted value of the i^{th} reference material.
 b is the slope of the regression line.
 a is the intercept point of the regression line.
 a + b.x_{i} represents the expectation of the measurement value of the i^{th} reference material.
 ε_{ij} is the difference between y_{ij} and the expectation of the measurement value of the i^{th} reference material.
5.3.1.4.2.2. Estimating parameters
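The parameters of the regression line are obtained using the following formulae:
 mean of the p measurements of the i^{th} reference material
$$\bar{y}_{i} = \frac{1}{p}\sum_{j=1}^{p} y_{ij}$$
 mean of all the accepted values of the n reference materials
$$M_{x} = \frac{1}{n}\sum_{i=1}^{n} x_{i}$$
 mean of all the measurements
$$M_{y} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_{i}$$
 estimated slope b
$$b = \frac{\sum_{i=1}^{n} (x_{i} - M_{x})(\bar{y}_{i} - M_{y})}{\sum_{i=1}^{n} (x_{i} - M_{x})^{2}}$$
 estimated intercept point a
$$a = M_{y} - b\,M_{x}$$
 regression value associated with the i^{th} reference material
$$\hat{y}_{i} = a + b\,x_{i}$$
 residual
$$e_{ij} = y_{ij} - \hat{y}_{i}$$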
5.3.1.4.2.3. Charts
The results can be presented and analyzed in graphic form. Two types of charts are used.
 The first type of graph is the representation of the measured values against the accepted values of the reference materials. The calculated regression line is also plotted.
 The second graph is the representation of the residual values against the estimated values of the reference materials (ŷ_{i}) given by the regression line.
The second graph is a good indicator of deviation from the linearity assumption: the linear dynamic range is valid if the residual values are evenly distributed between positive and negative values.


In case of doubt about the linearity of the regression, a Fisher-Snedecor test can be carried out in addition to the graphic analysis, in order to test the assumption: "the linear dynamic range is not valid".
5.3.1.4.2.4. Test of the linearity assumption
Several error values linked to calibration should be defined first of all: these can be estimated using the data collected during the experiment. A statistical test is then performed on the basis of these results, making it possible to test the assumption of nonvalidity of the linear dynamic range: this is the Fisher-Snedecor test.
5.3.1.4.2.4.1. Definitions of errors linked to calibration
These errors are expressed as standard deviations, each obtained as the square root of the ratio between a sum of squares and a number of degrees of freedom.
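Residual error
The residual error corresponds to the difference between the measured values and the values given by the regression line.
The sum of the squares of the residual error (noted here Q_{res}) is as follows, ŷ_{i} being the regression value associated with the i^{th} reference material:
$$Q_{res} = \sum_{i=1}^{n}\sum_{j=1}^{p} (y_{ij} - \hat{y}_{i})^{2}$$
The number of degrees of freedom is np - 2.
The residual standard deviation is then estimated by the formula:
$$S_{res} = \sqrt{\frac{Q_{res}}{np - 2}}$$
Experimental error
The experimental error corresponds to the reproducibility standard deviation of the experimentation.
The sum of the squares of the experimental error (noted here Q_{exp}) is as follows, ȳ_{i} being the mean of the p measurements of the i^{th} reference material:
$$Q_{exp} = \sum_{i=1}^{n}\sum_{j=1}^{p} (y_{ij} - \bar{y}_{i})^{2}$$
The number of degrees of freedom is np - n.
The experimental standard deviation (reproducibility) is then estimated by the formula:
$$S_{exp} = \sqrt{\frac{Q_{exp}}{np - n}}$$
Note This quantity is sometimes also noted S_{R}.
Adjustment error
The adjustment error is the part of the residual error that is not explained by the experimental error: its sum of squares is the residual sum of squares minus the experimental sum of squares.
The sum of the squares of the adjustment error (noted here Q_{def}) is:
$$Q_{def} = Q_{res} - Q_{exp}$$
Or
$$Q_{def} = p \sum_{i=1}^{n} (\bar{y}_{i} - \hat{y}_{i})^{2}$$
The number of degrees of freedom is n - 2.
The standard deviation of the adjustment error is estimated by the formula:
$$S_{def} = \sqrt{\frac{Q_{def}}{n - 2}}$$
Or
$$S_{def} = \sqrt{\frac{Q_{res} - Q_{exp}}{n - 2}}$$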
5.3.1.4.2.4.2. Fisher-Snedecor test
The ratio F_{obs} = S_{def}^{2} / S_{exp}^{2} obeys the Fisher-Snedecor law with n - 2 and np - n degrees of freedom.
The calculated experimental value F_{obs} is compared with the limit value F_{1-α}(n - 2, np - n), extracted from the Snedecor law table. The value for α used in practice is generally 5%.
If F_{obs} ≥ F_{1-α}, the assumption of the nonvalidity of the linear dynamic range is accepted (with a risk of α error of 5%).
If F_{obs} < F_{1-α}, the assumption of the nonvalidity of the linear dynamic range is rejected.
Example: Linearity study for the determination of tartaric acid by capillary electrophoresis. 9 reference materials are used. These are synthetic solutions of tartaric acid, titrated by means of a scale traceable to standard masses.
| Ref. material | Ti (ref) | Y1 | Y2 | Y3 | Y4 |
|---|---|---|---|---|---|
| 1 | 0.38 | 0.41 | 0.37 | 0.40 | 0.41 |
| 2 | 1.15 | 1.15 | 1.12 | 1.16 | 1.17 |
| 3 | 1.72 | 1.72 | 1.63 | 1.76 | 1.71 |
| 4 | 2.41 | 2.45 | 2.37 | 2.45 | 2.45 |
| 5 | 2.91 | 2.95 | 2.83 | 2.99 | 2.95 |
| 6 | 3.91 | 4.09 | 3.86 | 4.04 | 4.04 |
| 7 | 5.91 | 6.07 | 5.95 | 6.04 | 6.04 |
| 8 | 7.91 | 8.12 | 8.01 | 8.05 | 7.90 |
| 9 | 9.91 | 10.20 | 10.00 | 10.09 | 9.87 |

Regression line
Line (y = a + b*x)
b = 1.01565
a = -0.00798

Errors related to calibration
Residual standard deviation S_{res} = 0.07161
Standard deviation of experimental reproducibility S_{exp} = 0.07536
Standard deviation of the adjustment error S_{def} = 0.0548

Interpretation, Fisher-Snedecor test
F_{obs} = 0.53 < F_{1-α} = 2.37

The assumption of the nonvalidity of the linear dynamic range is rejected 
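The calculation chain of this ISO 11095-type test can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the guide: the function name `linearity_f_test` and the dictionary keys are hypothetical, the adjustment sum of squares is taken as the residual sum of squares minus the experimental one, and the limit value 2.37 is simply copied from the example above rather than computed. Because the tabulated data are rounded, the last digits of S_{res} and S_{exp} may differ slightly from the printed values.

```python
from math import sqrt


def linearity_f_test(x, y):
    """ISO 11095-type linearity check (illustrative sketch).

    x: accepted values of the n reference materials.
    y: list of n lists, each holding the p replicate results of one material.
    """
    n, p = len(x), len(y[0])
    if any(len(row) != p for row in y):
        raise ValueError("the same number of replicates is required at every level")

    y_bar = [sum(row) / p for row in y]            # mean per reference material
    m_x = sum(x) / n                               # mean of accepted values
    m_y = sum(y_bar) / n                           # mean of all measurements
    s_xx = sum((xi - m_x) ** 2 for xi in x)
    b = sum((xi - m_x) * (yb - m_y) for xi, yb in zip(x, y_bar)) / s_xx
    a = m_y - b * m_x
    y_hat = [a + b * xi for xi in x]               # regression values

    q_res = sum((yij - yh) ** 2 for row, yh in zip(y, y_hat) for yij in row)
    q_exp = sum((yij - yb) ** 2 for row, yb in zip(y, y_bar) for yij in row)
    q_def = max(q_res - q_exp, 0.0)                # guard against rounding noise

    s_res = sqrt(q_res / (n * p - 2))
    s_exp = sqrt(q_exp / (n * p - n))
    s_def = sqrt(q_def / (n - 2))
    return {"a": a, "b": b, "s_res": s_res, "s_exp": s_exp,
            "s_def": s_def, "f_obs": s_def ** 2 / s_exp ** 2}


# Tartaric acid example from the text (9 reference materials, 4 replicates).
X = [0.38, 1.15, 1.72, 2.41, 2.91, 3.91, 5.91, 7.91, 9.91]
Y = [
    [0.41, 0.37, 0.40, 0.41],
    [1.15, 1.12, 1.16, 1.17],
    [1.72, 1.63, 1.76, 1.71],
    [2.45, 2.37, 2.45, 2.45],
    [2.95, 2.83, 2.99, 2.95],
    [4.09, 3.86, 4.04, 4.04],
    [6.07, 5.95, 6.04, 6.04],
    [8.12, 8.01, 8.05, 7.90],
    [10.20, 10.00, 10.09, 9.87],
]
F_LIMIT = 2.37  # F(7, 27) at 5 %, value quoted in the example (from a table)

result = linearity_f_test(X, Y)
nonlinear = result["f_obs"] >= F_LIMIT
print(result, "nonlinearity accepted:", nonlinear)
```

The limit value is deliberately left as an input read from a Fisher-Snedecor table, so that the sketch needs nothing beyond the standard library.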
5.3.1.5. ISO 8466type approach
5.3.1.5.1. Basic protocol
It is advisable to use a number n of reference materials. The number must be higher than 3, but there is no need, however, to exceed 10. The reference materials should be measured several times, under reproducibility conditions. The number of measurements may be small at the center of the range studied (minimum = 2) and must be greater at both ends of the range, for which a minimum number of 4 is generally recommended. The accepted values of reference materials must be regularly distributed over the studied range of values.
Note It is vital that the reproducibility conditions use the maximum number of potential sources of variability.
The results are reported in a table presented as follows:
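| Reference material | Accepted value of the reference material | Measured value, replica 1 | Replica 2 | Replica j | ... | Replica p |
|---|---|---|---|---|---|---|
| 1 | x_{1} | y_{11} | y_{12} | y_{1j} | ... | y_{1p} |
| ... | ... | ... | ... | ... | ... | ... |
| i | x_{i} | y_{i1} | y_{i2} |  |  |  |
| ... | ... | ... | ... | ... | ... | ... |
| n | x_{n} | y_{n1} | ... | y_{nj} | ... | y_{np} |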
5.3.1.5.2. Calculations and results
5.3.1.5.2.1. Defining the linear regression model
Calculate the linear regression model using the calculations detailed above for the ISO 11095-type approach.
The residual standard deviation of the linear model, S_{res}, can then be calculated using the formula indicated in § 5.3.1.4.2.4.1.
5.3.1.5.2.2. Defining the polynomial regression model
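The aim is to determine the parameters of the polynomial regression model of order 2 applicable to the data of the experiment schedule:
$$y = a + b\,x + c\,x^{2}$$
The purpose is to determine the parameters a, b and c. This determination can generally be computerized using spreadsheets and statistics software.
The estimation formulae for these parameters are as follows, N being the number of points (x_{i}, y_{i}) used in the regression and all sums running from i = 1 to N:
$$Q_{xx} = \sum x_{i}^{2} - \frac{(\sum x_{i})^{2}}{N} \qquad Q_{xy} = \sum x_{i} y_{i} - \frac{\sum x_{i} \sum y_{i}}{N} \qquad Q_{x^{3}} = \sum x_{i}^{3} - \frac{\sum x_{i} \sum x_{i}^{2}}{N}$$
$$Q_{x^{4}} = \sum x_{i}^{4} - \frac{(\sum x_{i}^{2})^{2}}{N} \qquad Q_{x^{2}y} = \sum x_{i}^{2} y_{i} - \frac{\sum y_{i} \sum x_{i}^{2}}{N}$$
$$c = \frac{Q_{xy}\,Q_{x^{3}} - Q_{x^{2}y}\,Q_{xx}}{(Q_{x^{3}})^{2} - Q_{xx}\,Q_{x^{4}}}$$
$$b = \frac{Q_{xy} - c\,Q_{x^{3}}}{Q_{xx}}$$
$$a = \frac{\sum y_{i} - b \sum x_{i} - c \sum x_{i}^{2}}{N}$$
Once the model has been established, the following values are to be calculated:
 regression value associated with the i^{th} reference material
$$\hat{y}'_{i} = a + b\,x_{i} + c\,x_{i}^{2}$$
 residual
$$e'_{i} = y_{i} - \hat{y}'_{i}$$
Residual standard deviation of the polynomial model
$$S'_{res} = \sqrt{\frac{\sum_{i=1}^{N} (y_{i} - \hat{y}'_{i})^{2}}{N - 3}}$$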
5.3.1.5.2.3. Comparing residual standard deviations
Calculation of
$$DS^{2} = (N - 2)\,S_{res}^{2} - (N - 3)\,S'^{2}_{res}$$
Then
$$PG = \frac{DS^{2}}{S'^{2}_{res}}$$
The value PG is compared with the limit value F_{1-α} given by the Fisher-Snedecor table for a confidence level 1 - α and degrees of freedom 1 and (N - 3).
Note In general the α risk used is 5%. In some cases the test may be optimistic and a risk of 10% will prove more realistic.
If PG ≤ F_{1-α}: the nonlinear calibration function does not result in an improved adjustment; the calibration function can be considered linear.
If PG > F_{1-α}: the working range must be narrowed as far as necessary to obtain a linear calibration function; otherwise, the information values from the analyzed samples must be evaluated using a nonlinear calibration function.
Example: Theoretical case.

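|  | Ti (ref) | Y1 | Y2 | Y3 | Y4 |
|---|---|---|---|---|---|
| 1 | 22.6 | 19.6 | 21.6 | 18.4 |  |
| 2 | 62 | 49.6 | 49.8 | 53 |  |
| 3 | 90 | 105.2 | 103.5 |  |  |
| 4 | 130 | 149 | 149.8 |  |  |
| 5 | 205 | 203.1 | 202.5 | 197.3 |  |
| 6 | 330 | 297.5 | 298.6 | 307.1 | 294.2 |

Linear regression
y = 1.48.x - 0.0015
S_{res} = 13.625
Polynomial regression
y = -0.0015x² + 1.485x - 27.2701
S'_{res} = 7.407
Fisher test
PG = 10.534 > F(5%) = 10.128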
PG>F the linear calibration function cannot be retained
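The comparison of the two models can be sketched as follows. This is a hedged illustration rather than the guide's own procedure: the helper names `fit_quadratic` and `pg_statistic` are hypothetical, the order-2 model is assumed to be written y = a + b.x + c.x², the fit uses the usual closed-form least-squares solution, and PG is assumed to be DS² / S'_{res}² with DS² = (N - 2).S_{res}² - (N - 3).S'_{res}², which reproduces the PG value of the example when N = 6.

```python
def fit_quadratic(x, y):
    """Closed-form least-squares fit of y = a + b*x + c*x**2 (assumed model)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sx2 = sum(v ** 2 for v in x)
    q_xx = sx2 - sx ** 2 / n
    q_xy = sum(u * v for u, v in zip(x, y)) - sx * sy / n
    q_x3 = sum(v ** 3 for v in x) - sx * sx2 / n
    q_x4 = sum(v ** 4 for v in x) - sx2 ** 2 / n
    q_x2y = sum(u * u * v for u, v in zip(x, y)) - sy * sx2 / n
    c = (q_xy * q_x3 - q_x2y * q_xx) / (q_x3 ** 2 - q_xx * q_x4)
    b = (q_xy - c * q_x3) / q_xx
    a = (sy - b * sx - c * sx2) / n
    return a, b, c


def pg_statistic(s_res_linear, s_res_poly, n_points):
    """PG = DS^2 / S'res^2, compared with F(1, N - 3) read from a table."""
    ds2 = (n_points - 2) * s_res_linear ** 2 - (n_points - 3) * s_res_poly ** 2
    return ds2 / s_res_poly ** 2


# Sanity check of the fit on exact, made-up data: y = 1 + 2x + 3x^2.
coeffs = fit_quadratic([0, 1, 2, 3], [1, 6, 17, 34])

# Residual standard deviations quoted in the theoretical example (N = 6).
pg = pg_statistic(13.625, 7.407, 6)
F_LIMIT = 10.128  # F(1, 3) at 5 %, as quoted in the example
print(coeffs, pg, "linear model rejected:", pg > F_LIMIT)
```

Keeping the limit value as a table lookup mirrors the text, where the α risk (5% or 10%) is left to the analyst.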
5.3.2. Specificity
5.3.2.1. Normative definition
The specificity of a method is its ability to measure only the compound being searched for.
5.3.2.2. Application
In case of doubt about the specificity of the tested method, the laboratory can use experiment schedules designed to check its specificity. Two types of complementary experiments are proposed here that can be used in a large number of cases encountered in the field of oenology.
 The first test is the standard addition test. It can be used to check that the method measures all the analyte.
 The second test can be used to check the influence of other compounds on the result of the measurement.
5.3.2.3. Procedures
5.3.2.3.1. Standard addition test
5.3.2.3.1.1. Scope
This test can be used to check that the method measures all the analyte.
The experiment schedule is based on standard additions of the compound being searched for. It can only be applied to methods that are not sensitive to matrix effects.
5.3.2.3.1.2. Basic protocol
This consists in checking that the quantities added to the test materials are recovered to a significant degree, the test materials being analyzed before and after the additions.
Carry out variable standard additions on n test materials. The initial concentration in analyte of test materials, and the standard additions are selected in order to cover the scope of the method. These test materials must consist of the types of matrices called for routine analysis. It is advised to use at least 10 test materials.
The results are reported in a table presented as follows:
| Test material | Quantity before addition (x) | Quantity added (v) | Quantity after addition (w) | Quantity found (r) |
|---|---|---|---|---|
| 1 | x_{1} | v_{1} | w_{1} | r_{1} = w_{1} - x_{1} |
| ... | ... | ... | ... | ... |
| i | x_{i} | v_{i} | w_{i} | r_{i} = w_{i} - x_{i} |
| ... | ... | ... | ... | ... |
| n | x_{n} | v_{n} | w_{n} | r_{n} = w_{n} - x_{n} |
Note 1 An addition is made with a pure standard solution. It is advised to perform an addition of the same order as the quantity of the test material on which it is carried out. This is why the most concentrated test materials must be diluted to remain within the scope of the method.
Note 2 It is advised to prepare the additions using independent standard solutions, in order to avoid any systematic error.
Note 3 The quality of values x and w can be improved by using several repetitions.
5.3.2.3.1.3. Calculations and results
The principle of the measurement of specificity consists in studying the regression line r = a + b.v and checking that slope b is equivalent to 1 and that intercept point a is equivalent to 0.
5.3.2.3.1.3.1. Study of the regression line r = a + b.v
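The parameters of the regression line are obtained using the following formulae:
 mean of the added quantities
$$\bar{v} = \frac{1}{n}\sum_{i=1}^{n} v_{i}$$
 mean of the quantities found
$$\bar{r} = \frac{1}{n}\sum_{i=1}^{n} r_{i}$$
 estimated slope b
$$b = \frac{\sum_{i=1}^{n} (v_{i} - \bar{v})(r_{i} - \bar{r})}{\sum_{i=1}^{n} (v_{i} - \bar{v})^{2}}$$
 estimated intercept point a
$$a = \bar{r} - b\,\bar{v}$$
 regression value associated with the i^{th} test material
$$\hat{r}_{i} = a + b\,v_{i}$$
 residual standard deviation
$$S_{res} = \sqrt{\frac{\sum_{i=1}^{n} (r_{i} - \hat{r}_{i})^{2}}{n - 2}}$$
 standard deviation on the slope
$$S_{b} = S_{res}\sqrt{\frac{1}{\sum_{i=1}^{n} (v_{i} - \bar{v})^{2}}}$$
 standard deviation on the intercept point
$$S_{a} = S_{res}\sqrt{\frac{1}{n} + \frac{\bar{v}^{2}}{\sum_{i=1}^{n} (v_{i} - \bar{v})^{2}}}$$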
5.3.2.3.1.3.2. Analysis of the results
The purpose is to conclude on the absence of any interference and on an acceptable specificity. This is true if the overlap line r = a + bv is equivalent to the line y = x.
To do so, two tests are carried out:
 Test of the assumption that slope b of the overlap line is equal to 1.
 Test of the assumption that intercept point a is equal to 0.
These assumptions are tested using a Student test, generally associated with a risk of error of 1%. A risk of 5% can prove more realistic in some cases.
Let T_{critical, bilateral}[dof; 1%] be a Student bilateral variable associated with a risk of error of 1% for a number of degrees of freedom (dof).
Step 1: calculations
Calculation of the comparison criterion on the slope at 1, S_{b} being the standard deviation on the slope
$$T_{obs} = \frac{|b - 1|}{S_{b}}$$
Calculation of the comparison criterion on the intercept point at 0, S_{a} being the standard deviation on the intercept point
$$T'_{obs} = \frac{|a|}{S_{a}}$$
Calculation of the Student critical value: T_{critical, bilateral}[n - 2; 1%], n being the number of points of the regression line
Step 2: interpretation
 If T_{obs} is lower than T_{critical}, then the slope of the regression line is equivalent to 1.
 If T'_{obs} is lower than T_{critical}, then the intercept point of the regression line is equivalent to 0.
If both conditions are true, then the overlap line is equivalent to the line y = x, and the method is deemed to be specific.
Note 1 Based on these results, a mean overlap rate can be calculated to quantify the specificity. In no case should it be used to "correct" the results. This is because if a significant bias is detected, the alternative method cannot be validated in relation to an efficiency rate of 100%.
Note 2 Since the principle of the test consists in calculating a straight line, at least three levels of addition have to be taken, and their value must be correctly chosen in order to obtain an optimum distribution of the points.
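The two Student tests on the line r = a + b.v can be sketched as below. The sketch makes several assumptions that are not in the guide: the function name `recovery_line_test` is hypothetical, the standard deviations on the slope and intercept are the usual ordinary least-squares expressions, the data are made up and deliberately shorter than the 10 test materials advised above, and the critical value 5.841 (bilateral Student, 3 degrees of freedom, 1% risk) is read from a table rather than computed.

```python
from math import sqrt


def recovery_line_test(v, r, t_critical):
    """Test slope = 1 and intercept = 0 for the line r = a + b*v (sketch)."""
    n = len(v)
    v_bar = sum(v) / n
    r_bar = sum(r) / n
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    b = sum((vi - v_bar) * (ri - r_bar) for vi, ri in zip(v, r)) / s_vv
    a = r_bar - b * v_bar

    # Residual standard deviation with n - 2 degrees of freedom.
    s_res = sqrt(sum((ri - (a + b * vi)) ** 2 for vi, ri in zip(v, r)) / (n - 2))
    s_b = s_res / sqrt(s_vv)                        # standard deviation on slope
    s_a = s_res * sqrt(1 / n + v_bar ** 2 / s_vv)   # standard deviation on intercept

    t_slope = abs(b - 1) / s_b
    t_intercept = abs(a) / s_a
    return {
        "a": a, "b": b,
        "t_slope": t_slope, "t_intercept": t_intercept,
        "specific": t_slope < t_critical and t_intercept < t_critical,
    }


# Hypothetical data: quantities added (v) and quantities found (r = w - x).
added = [1.0, 2.0, 3.0, 4.0, 5.0]
found = [1.1, 1.9, 3.0, 4.1, 4.9]
T_CRITICAL = 5.841  # bilateral Student value, 3 dof, 1 % risk (from a table)

outcome = recovery_line_test(added, found, T_CRITICAL)
print(outcome)
```

A perfectly recovered series (all residuals equal to zero) would make S_{b} and S_{a} null and the ratios undefined; real measurement data do not normally show this, so the sketch does not handle it.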
5.3.2.3.1.3.3. Overlap line graphics
Example of specificity


5.3.2.3.2. Study of the influence of other compounds on the measurement result
5.3.2.3.2.1. Scope
If the laboratory suspects the interaction of compounds other than the analyte, an experiment schedule can be set up to test the influence of various compounds. The experiment schedule proposed here enables a search for the influence of compounds defined a priori: thanks to its knowledge of the analytical process and its knowhow, the laboratory should be able to define a certain number of compounds liable to be present in the wine and to influence the analytical result.
5.3.2.3.2.2. Basic protocol and calculations
Analyze n wines in duplicate, before and after the addition of the compound suspected of having an influence on the analytical result; n must at least be equal to 10.
The mean values Mx_{i} of the 2 measurements x_{i} and x'_{i} made before the addition shall be calculated first, then the mean values My_{i} of the 2 measurements y_{i} and y'_{i} made after the addition, and finally the difference d_{i} between the values Mx_{i} and My_{i}.
The results of the experiment can be reported as indicated in the following table:
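| Sample | x: Before addition, Rep1 | x: Before addition, Rep2 | y: After addition, Rep1 | y: After addition, Rep2 | Mean x | Mean y | Difference d |
|---|---|---|---|---|---|---|---|
| 1 | x_{1} | x'_{1} | y_{1} | y'_{1} | Mx_{1} | My_{1} | d_{1} = Mx_{1} - My_{1} |
| ... | ... | ... | ... | ... | ... | ... | ... |
| i | x_{i} | x'_{i} | y_{i} | y'_{i} | Mx_{i} | My_{i} | d_{i} = Mx_{i} - My_{i} |
| ... | ... | ... | ... | ... | ... | ... | ... |
| n | x_{n} | x'_{n} | y_{n} | y'_{n} | Mx_{n} | My_{n} | d_{n} = Mx_{n} - My_{n} |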
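The mean of the results before addition
$$M_{x} = \frac{1}{n}\sum_{i=1}^{n} Mx_{i}$$
The mean of the results after addition
$$M_{y} = \frac{1}{n}\sum_{i=1}^{n} My_{i}$$
Calculate the mean of the differences
$$M_{d} = \frac{1}{n}\sum_{i=1}^{n} d_{i} = M_{x} - M_{y}$$
Calculate the standard deviation of the differences
$$S_{d} = \sqrt{\frac{\sum_{i=1}^{n} (d_{i} - M_{d})^{2}}{n - 1}}$$
Calculate the Z_{score}
$$Z_{score} = \frac{|M_{d}|}{S_{d}}$$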
5.3.2.3.2.3. Interpretation
If the Z_{score} is ≤ 2, the added compound can be considered to have a negligible influence on the result of analysis with a risk of 5%.
If the Z_{score} is > 2, the added compound can be considered to influence the result of analysis with a risk of 5%.
Note Interpreting the Z_{score} is possible given the assumption that the variations obey a normal law with a 95% confidence rate.
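Example: Study of the interaction of compounds liable to be present in the samples, on the determination of glucose and fructose in wines by Fourier transform infrared spectrophotometry (FTIR). The differences are calculated as d = Mx - My (mean before addition minus mean after addition).

| Wine | Before addition, rep1 | Before addition, rep2 | + 250 mg.L^{-1} potassium sorbate, rep1 | + 250 mg.L^{-1} potassium sorbate, rep2 | + 1 g.L^{-1} salicylic acid, rep1 | + 1 g.L^{-1} salicylic acid, rep2 | Sorbate diff | Salicylic diff |
|---|---|---|---|---|---|---|---|---|
| 1 | 6.2 | 6.2 | 6.5 | 6.3 | 5.3 | 5.5 | -0.2 | 0.8 |
| 2 | 1.2 | 1.2 | 1.3 | 1.2 | 0.5 | 0.6 | -0.05 | 0.65 |
| 3 | 0.5 | 0.6 | 0.5 | 0.5 | 0.2 | 0.3 | 0.05 | 0.3 |
| 4 | 4.3 | 4.2 | 4.1 | 4.3 | 3.8 | 3.9 | 0.05 | 0.4 |
| 5 | 12.5 | 12.6 | 12.5 | 12.7 | 11.5 | 11.4 | -0.05 | 1.1 |
| 6 | 5.3 | 5.3 | 5.4 | 5.3 | 4.2 | 4.3 | -0.05 | 1.05 |
| 7 | 2.5 | 2.5 | 2.6 | 2.5 | 1.5 | 1.4 | -0.05 | 1.05 |
| 8 | 1.2 | 1.3 | 1.2 | 1.1 | 0.5 | 0.4 | 0.1 | 0.8 |
| 9 | 0.8 | 0.8 | 0.9 | 0.8 | 0.2 | 0.3 | -0.05 | 0.55 |
| 10 | 0.6 | 0.6 | 0.5 | 0.6 | 0.1 | 0 | 0.05 | 0.55 |

Potassium sorbate: Md = -0.02, Sd = 0.086, Z_{score} = 0.23 < 2
Salicylic acid: Md = 0.725, Sd = 0.282, Z_{score} = 2.57 > 2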
In conclusion, it can be stated that potassium sorbate does not influence the determination of glucose and fructose by the FTIR calibration studied here. On the other hand, salicylic acid has an influence, and care should be taken to avoid samples containing salicylic acid, in order to remain within the scope of validity of the calibration under study.
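The Z_{score} calculation of this example can be reproduced with a short Python sketch. It is illustrative only: the function name `z_score_of_differences` is hypothetical, the differences are taken as mean before addition minus mean after addition, and Z_{score} is assumed to be |Md| / Sd with Sd computed on n - 1 degrees of freedom, which matches the values printed in the example.

```python
from math import sqrt


def z_score_of_differences(before, after):
    """Return (Md, Sd, Z) for duplicate results before and after an addition.

    before, after: lists of (rep1, rep2) tuples, one tuple per wine.
    """
    d = [(b1 + b2) / 2 - (a1 + a2) / 2
         for (b1, b2), (a1, a2) in zip(before, after)]
    n = len(d)
    m_d = sum(d) / n                                        # mean difference
    s_d = sqrt(sum((di - m_d) ** 2 for di in d) / (n - 1))  # standard deviation
    return m_d, s_d, abs(m_d) / s_d


# Data of the FTIR glucose + fructose example (10 wines, duplicates).
BEFORE = [(6.2, 6.2), (1.2, 1.2), (0.5, 0.6), (4.3, 4.2), (12.5, 12.6),
          (5.3, 5.3), (2.5, 2.5), (1.2, 1.3), (0.8, 0.8), (0.6, 0.6)]
SORBATE = [(6.5, 6.3), (1.3, 1.2), (0.5, 0.5), (4.1, 4.3), (12.5, 12.7),
           (5.4, 5.3), (2.6, 2.5), (1.2, 1.1), (0.9, 0.8), (0.5, 0.6)]
SALICYLIC = [(5.3, 5.5), (0.5, 0.6), (0.2, 0.3), (3.8, 3.9), (11.5, 11.4),
             (4.2, 4.3), (1.5, 1.4), (0.5, 0.4), (0.2, 0.3), (0.1, 0.0)]

sorbate = z_score_of_differences(BEFORE, SORBATE)
salicylic = z_score_of_differences(BEFORE, SALICYLIC)
for name, (m_d, s_d, z) in (("sorbate", sorbate), ("salicylic", salicylic)):
    verdict = "influence" if z > 2 else "negligible influence"
    print(f"{name}: Md={m_d:.3f} Sd={s_d:.3f} Z={z:.2f} -> {verdict}")
```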
5.3.3. Study of method accuracy
5.3.3.1. Presentation of the step
5.3.3.1.1. Definition
Closeness of agreement between the mean value obtained from a large series of test results and an accepted reference value.
5.3.3.1.2. General principles
When the reference value is output by a certified system, the accuracy study can be regarded as a traceability link. This applies to two specific cases in particular:
 Traceability to certified reference materials: in this case, the accuracy study can be undertaken jointly with the linearity and calibration study, using the experiment schedule described for that study.
 Traceability to a certified interlaboratory comparison analysis chain.
The other cases, i.e. which use references that are not based on certified systems, are the most widespread in routine oenological laboratories. These involve comparisons:
 Comparison with a reference method
 Comparison with the results of an uncertified interlaboratory comparison analysis chain.
 Comparison with internal reference materials, or with external uncertified reference materials.
5.3.3.1.3. Reference documents
 NF V03-110 Standard, Intralaboratory validation procedure for an alternative method in relation to a reference method.
 NF V03-115 Standard, Guide for the use of reference materials.
 ISO 11095 Standard, Linear calibration using reference materials.
 ISO 8466-1 Standard, Water quality – Calibration and evaluation of analytical methods and estimation of performance characteristics.
 ISO 57025 Standard, Exactitude of results and methods of measurement
5.3.3.2. Comparison of the alternative method with the OIV reference method
5.3.3.2.1. Scope
This method can be applied if the laboratory uses the OIV reference method, or a traced, validated method, whose performance quality is known and meets the requirements of the laboratory’s customers.
To study the comparative accuracy of the two methods, it is advisable first of all to ensure the quality of the repeatability of the method to be validated, and to compare it with the reference method. The method for carrying out the repeatability comparison is described in the chapter on repeatability.
5.3.3.2.2. Accuracy of the alternative method compared with the reference method
5.3.3.2.2.1. Definition
Accuracy is defined as the closeness of agreement between the values obtained by the reference method and those obtained by the alternative method, independent of the errors of precision of the two methods.
5.3.3.2.2.2. Scope
The accuracy of the alternative method in relation to the reference method is established for a field of application in which the repeatabilities of the two methods are constant.
In practice, it is therefore often advisable to divide the analyzable range of values into several sections or "range levels" (2 to 5), in which we may reasonably consider that the repeatabilities of the methods are comparable to a constant.
5.3.3.2.2.3. Basic protocol and calculations
In each range level, accuracy is based on a series of n test materials with concentration values in analyte covering the range level in question. A minimum number of 10 test materials is required to obtain significant results.
Each test material is to be analyzed in duplicate by the two methods under repeatability conditions.
A calculation is to be made of the mean values Mx_{i} of the 2 measurements x_{i} and x'_{i} made using the alternative method and the mean values My_{i} of the 2 measurements y_{i} and y'_{i} made using the reference method, then the difference d_{i} is to be calculated between the values Mx_{i} and My_{i}.
The results of the experiment can be reported as in the following table:
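| Test material | x: Alternative method, Rep1 | x: Alternative method, Rep2 | y: Reference method, Rep1 | y: Reference method, Rep2 | Mean x | Mean y | Difference d |
|---|---|---|---|---|---|---|---|
| 1 | x_{1} | x'_{1} | y_{1} | y'_{1} | Mx_{1} | My_{1} | d_{1} = Mx_{1} - My_{1} |
| ... | ... | ... | ... | ... | ... | ... | ... |
| i | x_{i} | x'_{i} | y_{i} | y'_{i} | Mx_{i} | My_{i} | d_{i} = Mx_{i} - My_{i} |
| ... | ... | ... | ... | ... | ... | ... | ... |
| n | x_{n} | x'_{n} | y_{n} | y'_{n} | Mx_{n} | My_{n} | d_{n} = Mx_{n} - My_{n} |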
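The following calculations are to be made
 The mean of the results for the alternative method
$$M_{x} = \frac{1}{n}\sum_{i=1}^{n} Mx_{i}$$
 The mean of the results for the reference method
$$M_{y} = \frac{1}{n}\sum_{i=1}^{n} My_{i}$$
 Calculate the mean of the differences
$$M_{d} = \frac{1}{n}\sum_{i=1}^{n} d_{i} = M_{x} - M_{y}$$
 Calculate the standard deviation of the differences
$$S_{d} = \sqrt{\frac{\sum_{i=1}^{n} (d_{i} - M_{d})^{2}}{n - 1}}$$
 Calculate the Z_{score}
$$Z_{score} = \frac{|M_{d}|}{S_{d}}$$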
5.3.3.2.2.4. Interpretation
 If the Z_{score} is lower than or equal to 2.0, it can be concluded that the accuracy of one method in relation to the other is satisfactory, in the range level under consideration, with a risk of error α = 5%.
 If the Z_{score} is higher than 2.0, it can be concluded that the alternative method is not accurate in relation to the reference method, in the range level under consideration, with a risk of error α = 5%.
Note Interpreting the Z_{score} is possible given the assumption that the variations obey a normal law with a 95% confidence rate.
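Example: Study of the accuracy of FTIR calibration to determine glucose and fructose in relation to the enzymatic method. The first range level covers the scale from 0 to 5 g.L^{-1} and the second range level covers a scale from 5 to 20 g.L^{-1}. The differences are calculated as d_{i} = Mx_{i} - My_{i} (FTIR mean minus enzymatic mean).

First range level (0 to 5 g.L^{-1})

| Wine | FTIR 1 | FTIR 2 | Enz 1 | Enz 2 | d_{i} |
|---|---|---|---|---|---|
| 1 | 0 | 0.3 | 0.3 | 0.2 | -0.1 |
| 2 | 0.2 | 0.3 | 0.1 | 0.1 | 0.2 |
| 3 | 0.6 | 0.9 | 0.0 | 0.0 | 0.7 |
| 4 | 0.7 | 1 | 0.8 | 0.7 | 0.1 |
| 5 | 1.2 | 1.6 | 1.1 | 1.3 | 0.2 |
| 6 | 1.3 | 1.4 | 1.3 | 1.3 | 0.0 |
| 7 | 2.1 | 2 | 1.9 | 2.1 | 0.0 |
| 8 | 2.4 | 0 | 1.1 | 1.2 | 0.1 |
| 9 | 2.8 | 2.5 | 2.0 | 2.6 | 0.3 |
| 10 | 3.5 | 4.2 | 3.7 | 3.8 | 0.1 |
| 11 | 4.4 | 4.1 | 4.1 | 4.4 | 0.0 |
| 12 | 4.8 | 5.4 | 5.5 | 5.0 | -0.2 |

Md = 0.13
Sd = 0.23
Z_{score} = 0.55 < 2

Second range level (5 to 20 g.L^{-1})

| Wine | FTIR 1 | FTIR 2 | Enz 1 | Enz 2 | d_{i} |
|---|---|---|---|---|---|
| 1 | 5.1 | 5.4 | 5.1 | 5.1 | 0.1 |
| 2 | 5.3 | 5.7 | 5.3 | 6.0 | -0.2 |
| 3 | 7.7 | 7.6 | 7.2 | 7.0 | 0.6 |
| 4 | 8.6 | 8.6 | 8.3 | 8.5 | 0.2 |
| 5 | 9.8 | 9.9 | 9.1 | 9.3 | 0.6 |
| 6 | 9.9 | 9.8 | 9.8 | 10.2 | -0.1 |
| 7 | 11.5 | 11.9 | 13.3 | 13.0 | -1.4 |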
For the two range levels, the Z_{score} is lower than 2. The FTIR calibration for the determination of glucose and fructose studied here can be considered accurate in relation to the enzymatic method.
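Since the comparison is made separately in each range level, the calculation can be organized as one Z_{score} per level. The sketch below is only an illustration under stated assumptions: the names `z_score_by_range_level` and `LEVELS` are hypothetical, the assignment of wines to levels is entered by hand as in the tables above (not derived from the values), Z_{score} is taken as |Md| / Sd, and the second level uses only the seven wines listed in the table.

```python
from math import sqrt


def z_score_by_range_level(levels):
    """Return {level name: (Md, Sd, Z)} for duplicate results by two methods.

    levels: dict mapping a level name to a list of
            (alt_rep1, alt_rep2, ref_rep1, ref_rep2) tuples.
    """
    summary = {}
    for name, rows in levels.items():
        # Difference between the alternative-method mean and the reference mean.
        d = [(x1 + x2) / 2 - (y1 + y2) / 2 for x1, x2, y1, y2 in rows]
        n = len(d)
        m_d = sum(d) / n
        s_d = sqrt(sum((di - m_d) ** 2 for di in d) / (n - 1))
        summary[name] = (m_d, s_d, abs(m_d) / s_d)
    return summary


# FTIR (alternative) versus enzymatic (reference) results from the example.
LEVELS = {
    "0-5 g/L": [
        (0.0, 0.3, 0.3, 0.2), (0.2, 0.3, 0.1, 0.1), (0.6, 0.9, 0.0, 0.0),
        (0.7, 1.0, 0.8, 0.7), (1.2, 1.6, 1.1, 1.3), (1.3, 1.4, 1.3, 1.3),
        (2.1, 2.0, 1.9, 2.1), (2.4, 0.0, 1.1, 1.2), (2.8, 2.5, 2.0, 2.6),
        (3.5, 4.2, 3.7, 3.8), (4.4, 4.1, 4.1, 4.4), (4.8, 5.4, 5.5, 5.0),
    ],
    "5-20 g/L": [
        (5.1, 5.4, 5.1, 5.1), (5.3, 5.7, 5.3, 6.0), (7.7, 7.6, 7.2, 7.0),
        (8.6, 8.6, 8.3, 8.5), (9.8, 9.9, 9.1, 9.3), (9.9, 9.8, 9.8, 10.2),
        (11.5, 11.9, 13.3, 13.0),
    ],
}

summary = z_score_by_range_level(LEVELS)
for name, (m_d, s_d, z) in summary.items():
    status = "satisfactory" if z <= 2.0 else "not accurate"
    print(f"{name}: Md={m_d:.2f} Sd={s_d:.2f} Z={z:.2f} -> {status}")
```

Keeping each range level as its own list makes it explicit that Sd is estimated within a level, where repeatability can reasonably be considered constant.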
5.3.3.3. Comparison by interlaboratory tests
5.3.3.3.1. Scope
Interlaboratory tests are of two types:
 Collaborative studies relate to a single method. These studies are carried out for the initial validation of a new method, mainly in order to define the standard deviation of interlaboratory reproducibility of the method. The mean m could also be given.
 Interlaboratory comparison analysis chains, or proficiency tests. These tests are carried out for the validation of a method adopted by the laboratory, and for routine quality control (see § 5.3.3.3). The resulting values are the interlaboratory mean m and the interlaboratory and intermethod reproducibility standard deviation S_{Rinter}.
By participating in an analysis chain, or in a collaborative study, the laboratory can exploit the results in order to study the accuracy of a method, in order to ensure its validation first of all, and its routine quality control.
If the interlaboratory tests are carried out within the framework of a certified organization, this comparison work can be used for method traceability.
5.3.3.3.2. Basic protocol and calculations
To obtain a sufficient comparison, it is recommended to use a minimum number of 5 test materials over the period.
For each test material, two results are provided:
 The mean of all the laboratories with significant results, m
 The standard deviation for interlaboratory reproducibility, S_{Rinter}
The test materials are analyzed with p replicas by the laboratory, these replicas being carried out under repeatability conditions. p must at least be equal to 2.
In addition, the laboratory must be able to check that the intralaboratory variability (intralaboratory reproducibility) is lower than the interlaboratory variability (interlaboratory reproducibility) given by the analysis chain.
For each test material, the laboratory calculates the Z_{score}, given by the following formula, m_{lab} being the mean of the p results obtained by the laboratory for that test material:
$$Z_{score} = \frac{|m_{lab} - m|}{S_{Rinter}}$$
The results can be reported as indicated in the following table:
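| Test material | Rep1 | ... | Rep j | ... | Rep p | Lab mean | Chain mean | Standard deviation | Z_{score} |
|---|---|---|---|---|---|---|---|---|---|
| 1 | x_{11} | ... | x_{1j} | ... | x_{1p} | m_{lab(1)} | m_{1} | S_{Rinter(1)} | Z_{score(1)} |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| i | x_{i1} | ... | x_{ij} | ... | x_{ip} | m_{lab(i)} | m_{i} | S_{Rinter(i)} | Z_{score(i)} |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| n | x_{n1} | ... | x_{nj} | ... | x_{np} | m_{lab(n)} | m_{n} | S_{Rinter(n)} | Z_{score(n)} |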
5.3.3.3.3. Interpretation
If all the Z_{score} results are lower than 2, the results of the method being studied can be considered identical to those obtained by the laboratories having produced significant results.
Note Interpreting the Z_{score} is possible given the assumption that the variations obey a normal law with a 95% confidence rate.
Example: An interlaboratory analysis chain outputs the following results for the free sulfur dioxide parameter, on two samples.
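| Sample | x_{1} | x_{2} | x_{3} | x_{4} | Lab mean | Chain mean | Standard deviation | Z_{score} |
|---|---|---|---|---|---|---|---|---|
| 1 | 34 | 34 | 33 | 34 | 33.75 | 32 | 6 | 0.29 < 2 |
| 2 | 26 | 27 | 26 | 26 | 26.25 | 24 | 4 | 0.56 < 2 |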
It can be concluded that on these two samples, the comparison with the analysis chain is satisfactory.
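The per-sample calculation of this example can be sketched as follows. The function name `interlab_z_score` is hypothetical, and the formula is assumed to be Z_{score} = |lab mean - chain mean| / S_{Rinter}, which reproduces the two values of the table above.

```python
def interlab_z_score(replicates, chain_mean, s_r_inter):
    """Return (lab mean, Z score) for one test material of an analysis chain."""
    lab_mean = sum(replicates) / len(replicates)
    return lab_mean, abs(lab_mean - chain_mean) / s_r_inter


# Free sulfur dioxide example: (replicates, chain mean, standard deviation).
SAMPLES = {
    1: ([34, 34, 33, 34], 32, 6),
    2: ([26, 27, 26, 26], 24, 4),
}

scores = {key: interlab_z_score(*values) for key, values in SAMPLES.items()}
all_satisfactory = all(z < 2 for _, z in scores.values())
for key, (lab_mean, z) in scores.items():
    print(f"sample {key}: lab mean={lab_mean:.2f} Z={z:.2f}")
print("comparison satisfactory:", all_satisfactory)
```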
5.3.3.4. Comparison with reference materials
5.3.3.4.1. Scope
In situations where there is no reference method (or any other method) for a given parameter, and the parameter is not processed by the analysis chains, the only remaining possibility is comparison of the results of the method to be validated with accepted internal or external material reference values.
The reference materials, for example, could be synthetic solutions established with classA glassware, and/or calibrated metrology apparatus.
In the case of certified reference materials, the comparison constitutes the traceability link, and can be carried out at the same time as the calibration and linearity study.
5.3.3.4.2. Basic protocol and calculations
It is advisable to have n reference materials for a given range level, in which it can be reasonably estimated that repeatability is comparable to a constant; n must at least be equal to 10.
Analyze each reference material in duplicate.
Calculate the mean values Mx_{i} of the 2 measurements x_{i} and x'_{i} carried out using the alternative method.
Define the accepted value for the reference material.
The results can be reported as indicated in the following table:
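| Reference material | x: Alternative method, Rep1 | x: Alternative method, Rep2 | Mean x | T: Accepted value of the reference material | Difference d |
|---|---|---|---|---|---|
| 1 | x_{1} | x'_{1} | Mx_{1} | T_{1} | d_{1} = Mx_{1} - T_{1} |
| ... | ... | ... | ... | ... | ... |
| i | x_{i} | x'_{i} | Mx_{i} | T_{i} | d_{i} = Mx_{i} - T_{i} |
| ... | ... | ... | ... | ... | ... |
| n | x_{n} | x'_{n} | Mx_{n} | T_{n} | d_{n} = Mx_{n} - T_{n} |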

The mean of the results of the alternative method

The mean of the accepted values of reference materials

Calculate the mean of the differences