Variances and Human Factors in Radiographic Image Interpretation
All nondestructive evaluation (NDE) or examination methods comprise the following steps: the search for and detection of discontinuities; the indication, recording or signal processing of those detected discontinuities; and the human interpretation of that indication, record or signal. Actual discontinuities may go undetected or unidentified for a number of reasons: the technique is insufficiently sensitive to detect the particular discontinuity; the recording or signal is insufficiently sensitive or clear to allow proper interpretation; or the human interpretation step itself fails by not properly discerning an otherwise obvious discontinuity. Any disagreements or variances among qualified NDE inspectors in interpreting NDE (NDT) results therefore reflect, to varying degrees, the extent to which these reasons apply in any particular evaluation. In cases where a facility owner and contractor disagree about specific interpretations during a project, there are obvious concerns about project delays and the costs of possible repairs to the alleged discontinuity. Ultimately, all of the ‘missed’ interpretations trace back to inspectors who are either not properly trained or are incorrectly executing the applied technique.
The Probability of Detection (POD) analysis procedure has been developed over the last 40 or so years to address the complete detection-through-interpretation cycle for various NDE methods. With some simple basics of probability, POD analysis can be applied and extrapolated to estimate the degree of detection variance that might be expected among statistically independent, qualified NDE inspectors. The study presented here deals primarily with radiographic testing (RT) and RTFIs (Radiographic Testing Film Interpreters) involved in the petroleum industry, but the results should be equally applicable to any number of independent inspectors, in any industry, who interpret results from any NDE method for which POD data are available.
In various industries, owners may include specific clauses in their standards that place final authority for NDE interpretation and acceptance with their own company’s inspection organizations. These companies may also emphasize this final interpretation authority by including appropriate wording in their contracts with contractors. For example, petroleum company standards typically may refer to the American Society of Mechanical Engineers (ASME) [i] code, commonly referenced as the ‘Code’, and to American Petroleum Institute (API) [ii] documents, which over many editions have specifically placed final inspection, approval and rejection authority with the owner. This designation is logical when considering that the owner is responsible for accepting a properly designed, safe facility and is likewise responsible for operating that facility in a similarly safe manner. Notwithstanding the contractual clauses vesting final approval/rejection authority for interpretations with the owner, continuing disputes between the owner and contractor regarding radiographic interpretations, the particular case here, unfortunately can lead to arbitration or litigation.
When the contractor and owner/client cannot resolve interpretation disagreements and the dispute continues in spite of contractual designation of the owner as having ‘final say’, resolution of the dispute may be achieved by one of the following:
a. Independent Third Party Resolution – Some owners, primarily for manpower availability reasons, do engage independent contract RTFIs as the owner’s representative during the project; these RTFIs act with the owner’s full authority to determine weld acceptability. However, the authors do not know of any case where an owner has abrogated this responsibility by allowing an independent third-party RTFI to ‘arbitrate’ owner-contractor interpretation disputes during the project. An owner, of course, can decide to ‘compromise’ after the project and allow such technical ‘arbitration’ of interpretations. Required rework for unacceptable welds and/or compensation to the contractor for acceptable welds incorrectly ‘repaired’ during the project may result.
b. Legal – This course of action should be pursued only when the independent resolution described above fails to settle the dispute, notwithstanding the owner’s contractual responsibility to have the ‘final say’ on interpretations. It is a costly and time-consuming process that often ends in a compromise settlement and should be avoided if at all possible.
The POD concept originated in the late 1960s with work done by notable researchers on NASA projects [iii] [iv] [v]. The original analysis was applied to ultrasonic testing of components in aerospace projects. Since the original studies, the POD concept has expanded to address other NDE methods being applied in various industries to varying degrees, including the petroleum industry [vi] [vii] [viii]. The POD graphs show that the probability of detection by an RTFI of a given discontinuity varies significantly with the size of that discontinuity. It is important to note that the derived POD graphs cover the full inspection process, from the search for discontinuities by the applied NDE method through the human interpretation of the method’s results. Note also that the POD analyses and RT interpretation variances, as discussed here, deal with the detection of the existence of a discontinuity, based on size, by an NDE method and qualified interpreter. The variances in identification of the specific type of discontinuity, e.g., incomplete fusion or undercut, by the RTFIs are not considered in this paper.
Variances of RTFI Interpretations Regarding POD
The following radiographic interpretation outcomes are derived from a simple probability analysis of two RTFIs, for example one each from the owner and contractor, each performing interpretations on the same radiographic image. A key premise in this analysis is that the two interpreters are fully qualified and perform their interpretations in a statistically independent manner. In other words, the interpretation by one interpreter has no effect on the interpretation performed by the other interpreter. Thus, there are four possible detection outcomes when these two RTFIs interpret the same radiograph (see Table 1).
Where D = the RTFI detects that a discontinuity exists
ND = the RTFI does not detect that a discontinuity exists
In outcomes #1 and #2, the RTFIs from the owner and contractor reach the same conclusions: both see a discontinuity in outcome #1 and both do not see a discontinuity in outcome #2. In outcomes #3 and #4, one RTFI sees a discontinuity while the other does not.
The POD graphs developed for radiography provide a basis for estimating the interpretation variances to be expected when two independent RTFIs view a radiograph. Solely to provide an example POD graph on which to apply the simple probability exercise, Graph 1 was extracted from work done by NORDTEST on radiographic NDE, which incorporates experimental and XPOSE model data plus human reliability (HR) results [ix].
Graph 1. Example POD Graph
The focus here is on the solid ‘Nordtest Experimental Results’ line in Graph 1, again only to illustrate the probability technique to be applied. Readers are referred to the original paper for details on the radiographic technique and data analysis used in constructing this graph. In practice, readers can use any graph derived from a POD analysis, provided that its derivation and construction are statistically valid.
For illustration purposes, we consider a required radiographic quality level or sensitivity of 2% for the 25.4mm (1 inch) steel plate in Graph 1. In applying a 2% sensitivity level, owners typically select a 2-2T hole requirement when hole-type image quality indicators (IQIs) are used, or the equivalent essential wire when wire-type IQIs are used. This 2% sensitivity is the default value contained in the American Society for Testing & Materials (ASTM) Standard E-1742 and ASME SE-94 [x]. ASME Section V [xi] defines the 2T hole image in a 2% shim-type IQI as the essential image for establishing acceptable quality level unless another level of sensitivity is stipulated in the contract. A 2% sensitivity is a typical level applied in examining pipe welds within the petroleum industry. Note that this sensitivity relates to the smallest discernible detail seen by the RTFI on the radiographic image, not the minimum size discontinuity present in the part being radiographed.
The 2% sensitivity applied to the 25.4mm (1in) plate yields an approximate 0.5mm (0.02in) ‘defect size’ (discontinuity), per Graph 1. At 0.5mm, the POD is approximately 0.2, or 20%, using, for illustration purposes, the ‘Nordtest Experimental Results’ line. Thus, there is a 20% probability that a given RTFI will ‘see’ or detect this discontinuity and an 80% probability that the RTFI will not. The probability of occurrence of two independent events, in our case image interpretation results, is the product of the individual probabilities of each event [xii].
Or, P(RTFI#1, RTFI#2) = P(RTFI#1) x P(RTFI#2)
Where P(RTFI#1, RTFI#2) = event probability of both RTFIs making certain independent image interpretations
P(RTFI#1) = event probability of RTFI#1 making a certain independent image interpretation
P(RTFI#2) = event probability of RTFI#2 making a certain independent image interpretation
This principle can be extended to any number of independent events, such as independent image interpretations.
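As a rough illustration of that extension, the sketch below applies the product rule to any number of interpreters. It assumes, as the two-RTFI example does, that every interpreter has the same POD for the discontinuity and that all interpretations are statistically independent. The function name `unanimity_probabilities` is hypothetical and is not taken from any of the cited references.

```python
def unanimity_probabilities(pod: float, n_interpreters: int) -> dict:
    """Probabilities that n independent interpreters all detect, all miss,
    or split on a discontinuity.

    Assumption: every interpreter has the same probability of detection.
    """
    if not 0.0 <= pod <= 1.0:
        raise ValueError("pod must be between 0 and 1")
    if n_interpreters < 1:
        raise ValueError("n_interpreters must be at least 1")

    # Product rule for independent events, applied n times.
    all_detect = pod ** n_interpreters
    all_miss = (1.0 - pod) ** n_interpreters
    return {
        "all_detect": all_detect,
        "all_miss": all_miss,
        # Anything that is not unanimous is a split decision.
        "split": 1.0 - all_detect - all_miss,
    }


# Illustrative input only: a POD of 0.2 with three interpreters.
three = unanimity_probabilities(0.2, 3)
print({name: round(value, 3) for name, value in three.items()})
```

Under these assumptions, the chance of a split decision grows as interpreters are added, because unanimity requires every one of them to make the same call.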
Therefore, the probability that both RTFIs, say one for the owner and one for the contractor, will ‘see’, or detect, this discontinuity is 0.2 x 0.2 = 0.04, or 4%. This scenario corresponds to detection outcome #1 in Table 1. The probabilities for the other three detection outcomes are calculated similarly, by multiplying the probabilities of each individual interpreter’s result in each detection outcome case. For example, the probability that neither RTFI detects the discontinuity is 0.8 x 0.8 = 0.64, or 64%.
Thus, the probabilities of the various detection outcomes are: (See Table 2.)
In outcomes #1 and #2, the interpretations of the owner and contractor RTFIs agree, although the probability of non-detection by both RTFIs is significantly greater than that of detection because of the relatively small ‘defect’ size. Outcomes #3 and #4, which comprise the variance or disagreement cases between the respective RTFIs, each have a 16% probability (0.2 x 0.8) and total to a 32% probability of disagreement. Thus there is a significant probability, 32%, of disagreement between qualified RTFIs when interpreting radiographs in this particular 25.4mm steel plate, 2% sensitivity case. This equates to an expected 68% agreement between RTFIs (outcomes #1 and #2: 4% + 64%).
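The arithmetic behind these four outcomes can be checked with a short sketch. It assumes a single POD of 0.2 for both interpreters, as in the example above. The function name `two_interpreter_outcomes` is illustrative only, and the mapping of the owner and contractor roles onto the two disagreement outcomes is arbitrary.

```python
from itertools import product


def two_interpreter_outcomes(pod: float) -> dict:
    """Map each (owner, contractor) call to its probability.

    'D' means the RTFI detects the discontinuity, 'ND' means no detection.
    Assumption: both RTFIs share the same POD and interpret independently.
    """
    call_probability = {"D": pod, "ND": 1.0 - pod}
    return {
        (owner, contractor): call_probability[owner] * call_probability[contractor]
        for owner, contractor in product(("D", "ND"), repeat=2)
    }


outcomes = two_interpreter_outcomes(0.2)
agreement = outcomes[("D", "D")] + outcomes[("ND", "ND")]
disagreement = outcomes[("D", "ND")] + outcomes[("ND", "D")]

for (owner, contractor), probability in outcomes.items():
    print(f"owner={owner:<2} contractor={contractor:<2} probability={probability:.0%}")
print(f"agreement={agreement:.0%} disagreement={disagreement:.0%}")
```

The four probabilities sum to 1, which is a quick sanity check that no outcome has been omitted.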
Prior to this probability analysis, the authors, in private communications, informally discussed the question of RTFI detection variance with a number of very experienced NDE professionals from both industry and academia. These professionals advised that they generally would expect approximately 70-95% agreement between qualified RTFIs. This percentage agreement range covers interpretation agreement in three categories: the acceptability of the sensitivity of radiographs alone; the detection or non-detection by the RTFI of discontinuities, assuming acceptable radiographic sensitivity and overall quality; and the identification by the RTFI of specific discontinuities, such as incomplete fusion. Most of those professionals considered that RTFI agreement on radiographic quality concerning sensitivity would be in the higher end of the 70-95% range, with RTFI agreement on discontinuity detection and identification being generally in the lower end. Thus, the informal feedback from these discussions concerning RTFI variance or disagreement, while admittedly not gathered in a rigorous manner, is roughly consistent with the results of the single POD analysis presented here. The main point is that complete agreement between qualified RTFIs in interpreting radiographs should not be expected and that a rough quantification of this non-agreement is possible.
Table 3 provides a simple, generalized correlation between the POD percentages and the expected percentages of the detection outcome possibilities between two RTFIs for this specific example POD graph.
Per the ‘% Probability of Detection Outcome’ columns, the conclusions are that the smallest and largest ‘discontinuity’ sizes, corresponding to the lowest and highest PODs respectively, similarly correspond to the highest levels of detection or non-detection agreement between RTFIs. It intuitively might be expected that, in practice, owner and contractor RTFIs would disagree predominantly in their decisions or interpretations on the smallest ‘discontinuities’. However, probability shows that they mostly will reach, independently, a ‘false negative’ conclusion (64% agreement on ‘ND’) in that instance and agree that ‘no discontinuity’ is present [xiii]. Fortunately, this ‘consensus’ false negative occurs predominantly on the smallest discontinuities, at the left-hand side of the graph and the lower POD values in column 2 of Table 3, which, in general, pose less threat to the integrity of the weld in question. The data appear in graphical form in Graph 2.
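The general pattern described here can be sketched by sweeping the POD from 0 to 1 and applying the same product rule at each value. This is an illustration only and does not reproduce the values in Table 3. The 10% grid of POD values and the function name `agreement_table` are assumptions, and both RTFIs are again taken to share the same POD.

```python
def agreement_table(pod_values):
    """Return (pod, both_detect, both_miss, disagree) rows for two RTFIs.

    Assumption: both interpreters share the same POD and act independently.
    """
    rows = []
    for pod in pod_values:
        both_detect = pod * pod
        both_miss = (1.0 - pod) * (1.0 - pod)
        # Two disagreement outcomes, each with probability pod * (1 - pod).
        disagree = 2.0 * pod * (1.0 - pod)
        rows.append((pod, both_detect, both_miss, disagree))
    return rows


pod_grid = [i / 10 for i in range(11)]  # hypothetical grid: 0%, 10%, ..., 100%
table = agreement_table(pod_grid)

print(f"{'POD':>5} {'both D':>8} {'both ND':>8} {'disagree':>9}")
for pod, both_detect, both_miss, disagree in table:
    print(f"{pod:>5.0%} {both_detect:>8.0%} {both_miss:>8.0%} {disagree:>9.0%}")

# Row with the highest probability of disagreement.
worst = max(table, key=lambda row: row[3])
```

Under these assumptions, disagreement is lowest at the extremes of the POD range and highest at a POD of 50%, where the two interpreters are expected to disagree half the time.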
There are a number of variables that contribute to the ability of RTFIs to identify discontinuities in radiographs. These variables include the RTFI personnel and the conditions under which the RTFI performs the interpretations. The training, experience and integrity of the RTFI, as well as of the radiographer, are all key factors in attaining on-specification radiographic images and their proper interpretation. The established POD procedure provides a search-through-interpretation, or end-to-end, method for determining the probabilities that discontinuities of different sizes will be detected by the RTFI. These probabilities, when applied to different RTFIs, such as those of owners and contractors respectively, highlight the likely levels of disagreement or variance between these RTFIs. Based on the empirical foundation of the POD approach, such variances are shown to be expected. These variances in themselves support the position of the various industry organizations, and owners, that owners must retain final rejection and approval authority over the interpretation of NDE results. It must also be noted that as conventional film radiographic testing continues transitioning to the digital or computed RT process, the POD method of analysis is likely to improve significantly.
[i] Boiler & Pressure Vessel Code Section V Nondestructive Examination, Article 1 General Requirements.
[ii] Recommended Practice (RP) 1104 Welding of Pipelines and Related Facilities, Section 9 Acceptance Standards for Nondestructive Testing.
[iii] Rummel, W.D. “Probability of Detection as a Quantitative Measure of Nondestructive Testing End-To-End Process Capabilities”, Materials Evaluation, ASNT, 1998.
[iv] Berens, A.P. and Hovey, P.W. “Flaw Detection Reliability Criteria, Volume I - Methods and Results”, AFWAL-TR-84-4022, Air Force Wright Aeronautical Laboratories, Wright Patterson Air Force Base, April 1984.
[v] Grills, Robert H. “NDT Solution – Probability of Detection – An NDT Solution”, Materials Evaluation, ASNT, 2011.
[vi] Matzkanin, G., Rummel, W.R. NDE Capabilities Data Book, NTIAC-DB-97-02, Texas Research Institute Austin, Inc., Austin, TX, 1997.
[vii] Forli, O. “How to Develop Acceptance Criteria for Pipeline Girth Weld Defects?”, European-American Workshop Determination of Reliability and Validation Methods of NDE, Berlin, 18-20 June 1997.
[viii] Dijkstra, F.H., de Raad, J. “Why Develop Acceptance Criteria for Pipeline Girth Weld Defects?”, European-American Workshop Determination of Reliability and Validation Methods of NDE, Berlin, 18-20 June 1997.
[ix] Wall, M., Wedgwood, F.A., Burch, S. “Modelling of NDT Reliability (POD) and Applying Corrections for Human Factors”, 7th European Conference on Nondestructive Testing, Copenhagen, 26-29 May 1998.
[x] ASTM E-1742 Table 3, Standard Practice for Radiographic Examination; ASME SE-94, Standard Guide for Radiographic Testing.
[xi] Boiler & Pressure Vessel Code Section V Nondestructive Examination, Article 2 Radiographic Examination.
[xii] Richmond, Samuel B. “Statistical Analysis”, 2nd Ed., The Ronald Press Co.
[xiii] Ginzel, E. “Introduction to the Statistics of NDT”, NDT.net, May 2006, Vol. 11 No. 5.