Human Factors Affecting Disagreements Among Interpreters
In the second part of a two-part series, we look at factors affecting probability of detection.
Editor’s Note: Part one appeared in June: “Variances and Human Factors in Radiographic Image Interpretation.”
Radiographic testing (RT) techniques should be developed to achieve the highest achievable image quality level. The choices of energy or source, film or imaging plates, and processing should be made accordingly. Probability of Detection (POD) analysis cannot be applied effectively where the quality level is marginal or unacceptable because of improper techniques, which are often used to increase productivity. We have seen many questionable RT images produced by “trial and error” rather than by a scientifically developed, optimized technique. When image quality is found to be unsatisfactory, it is critical that the RTFI require the radiographer to retake the exposures so that the images are of acceptable quality for proper interpretation.
It is commonly acknowledged within the NDE industry that the major variable in radiographic interpretation is the person performing the evaluation (the RTFI), whether the inspection is film-based or digital. In our own experience worldwide, we have seen a general lack of consistency in the interpretation process, arising from the interpreter’s background, training, experience, and ability to understand and apply the requirements of the applicable interpretation code. For the POD concept to be applied meaningfully, the variations in interpreter qualifications must be addressed. Many variables related to the interpreter and the interpretation process must be considered, including but not limited to:
- The formal training that the interpreter has received.
- The accumulated experience of the interpreter.
- The visual acuity of the interpreter.
- The specific code(s) to which the interpretation is applicable.
- The quality of the radiographic image.
- The RT procedure/technique being used.
- The viewing facilities and equipment.
Disparities in the interpretation of radiographs are not new. In an early study conducted by a research laboratory, five certified RTFIs who had been trained in a master-apprentice program reviewed 350 radiographs. Their interpretations agreed on the detection and identification of the discontinuity on 238 radiographs, or 68% of the total. This meant that they disagreed on 112 radiographs! Based on these results, a “unified” training program was initiated using welding discontinuity categories. A procedure was developed, and the results of nine certified RTFIs trained under this unified program were compared with those of nine RTFIs trained under the master-apprentice program. The interpretation covered 96 selected radiographs. The unified training group showed a 17% disagreement rate, while the master-apprentice group showed a 44% disagreement rate.
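The agreement figures quoted above can be reproduced with simple arithmetic. The sketch below is only an illustration: it assumes that "agreement" means every interpreter made the same call on a radiograph, which the study summary implies but does not spell out. The function name and the sample calls are hypothetical.

```python
from typing import Sequence


def unanimous_agreement_rate(calls: Sequence[Sequence[str]]) -> float:
    """Fraction of radiographs on which every interpreter made the same call.

    calls[i][j] is the call made by interpreter j on radiograph i.
    Assumption: "agreement" means all interpreters gave an identical call.
    """
    if not calls:
        raise ValueError("at least one radiograph is required")
    unanimous = sum(1 for per_film in calls if len(set(per_film)) == 1)
    return unanimous / len(calls)


# Hypothetical calls from five interpreters on four radiographs.
sample_calls = [
    ["crack", "crack", "crack", "crack", "crack"],                 # unanimous
    ["porosity", "porosity", "porosity", "porosity", "porosity"],  # unanimous
    ["lack of fusion", "slag", "lack of fusion", "none", "slag"],  # split
    ["none", "none", "none", "none", "none"],                      # unanimous
]
sample_rate = unanimous_agreement_rate(sample_calls)

# Figures quoted in the text: 238 of 350 radiographs in full agreement.
study_rate = 238 / 350            # agreement rate, about 68%
study_disagreements = 350 - 238   # radiographs with at least one dissenting call

print(f"sample agreement: {sample_rate:.0%}")
print(f"study agreement: {study_rate:.0%}, disagreements: {study_disagreements}")
```

A unanimous-agreement measure is strict: a single dissenting call counts the whole radiograph as a disagreement, so the rate tends to fall as more interpreters are added.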
What does all this really mean when considering POD? It points out that the consistency and accuracy of radiographic image interpretation are affected by the multiple variables mentioned earlier. This lack of consistency is not limited to industrial radiographic image interpretation. In another early study, a review of the medical radiographs of patients who were diagnosed with tuberculosis indicated an average disagreement in one out of three cases, or only 67% agreement. In a second independent interpretation of the same radiographs, several radiologists disagreed with their own previous interpretations an average of one out of five times, or only 80% agreement with themselves.
There are no reasonable solutions that will always guarantee highly reliable and accurate interpretations. The most logical solution is to combine in-depth extensive training with significant interpretation experience.
Reasons for Disagreements and Inconsistencies
Many of the disagreements encountered in radiographic testing should never happen. Whether the cause is ignorance (such as a failure to understand the requirements), sloppiness, or simple inattention on the part of RTFIs, it should not be tolerated. Incompetence has no place in NDE, and it will undermine the results of any POD analysis. In addition to these ‘technical’ reasons for interpretation disagreement, it should be recognized that plainly unethical behavior is occasionally seen in the radiographing and/or interpretation stages. We have seen instances where budget and/or contract schedule concerns appeared to play at least a partial role in the production of wholly unacceptable interpretation results. It should be obvious that the major source of disagreement in the interpretation of radiographs is the personnel. On a positive note, many interpreters are highly qualified and consistently perform RT image interpretations ethically, reliably, and accurately. The major reasons for disagreement among interpreters include:
Insufficient Sensitivity – The essential hole or wire image should be clearly discernible. In the case of the wire IQI, the essential wire image should be discernible across the entire weld image or area of interest. With the shim-type IQI, the interpreter should not have to use imagination to discern the essential hole image.
Inappropriate Film Density – There are sound technical reasons for Code density requirements. Densities that are too low will not provide the image contrast necessary to discern discontinuities effectively, and densities that are excessively high will make interpretation difficult or impossible. Higher densities that fall within the Code-acceptable range will result in better image contrast.
Type of Discontinuity – While there is high agreement when evaluating volumetric discontinuities such as porosity and slag inclusions, tight, planar, angular discontinuities are often more difficult to discern, even when high-quality images have been produced. This is the area where most disagreement occurs, and unfortunately it involves the most critical types of flaws (tight cracks, angular lack of fusion, narrow slag lines, etc.).
Coverage – This area is sometimes overlooked, but disagreements do occur here. The area of interest is defined as the region in which the required quality level has been demonstrated and the density falls within the acceptable range. Coverage disagreements arise more frequently when smaller-diameter or thicker-wall pipe welds are examined.
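Three of the four reasons above (sensitivity, density, and coverage) concern the radiograph itself rather than the discontinuity, so they can be screened before any interpretation is attempted. The following is a minimal sketch of such a screen, not a Code procedure. The class, field, and function names are hypothetical, and the density limits passed in the example are placeholder values only; the actual limits must come from the governing Code and procedure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RadiographCheck:
    """Hypothetical record of the quality checks made on one radiograph."""
    essential_iqi_visible: bool  # essential hole or wire clearly discernible
    min_density: float           # lowest density measured in the area of interest
    max_density: float           # highest density measured in the area of interest
    coverage_complete: bool      # whole area of interest shown at the required quality


def screening_failures(film: RadiographCheck,
                       density_low: float,
                       density_high: float) -> List[str]:
    """Return the reasons a radiograph should be retaken before interpretation.

    density_low and density_high are supplied by the caller because the
    acceptable range depends on the applicable Code and procedure.
    """
    failures = []
    if not film.essential_iqi_visible:
        failures.append("insufficient sensitivity")
    if film.min_density < density_low:
        failures.append("density too low")
    if film.max_density > density_high:
        failures.append("density too high")
    if not film.coverage_complete:
        failures.append("incomplete coverage")
    return failures


# Placeholder limits for illustration only; not taken from any specific Code.
LOW, HIGH = 2.0, 4.0

good_film = RadiographCheck(True, 2.4, 3.5, True)
bad_film = RadiographCheck(False, 1.2, 3.0, True)

print(screening_failures(good_film, LOW, HIGH))
print(screening_failures(bad_film, LOW, HIGH))
```

A radiograph that returns any failure here would be sent back for re-exposure rather than interpreted. The fourth reason, type of discontinuity, is a matter of interpreter judgment and is deliberately left out of the screen.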
Numerous cases can be cited that emphasize the need for highly qualified interpreters, and for POD, in the area of radiographic interpretation. Several are described here with fictitious references to the specific projects, but all are a matter of public record.
Case #1 – Hazardous Chemical Processing Plant
A prototype facility designed to decommission weapons utilizing highly poisonous gases was constructed on a remote atoll in the South Pacific. After the plant was up and running, substandard radiographs that had been taken during construction were inadvertently discovered. Many of these radiographs were unreadable because their densities were either too low or so high (dark) that they were impossible to evaluate. There were also issues with unacceptable quality levels, artifacts in the areas of interest, and generally inferior techniques. After a costly re-review was completed, the decision was made to shut the plant down, and over 700 welds had to be re-radiographed. The cost of the four-month shutdown and the re-examinations was substantial. After the affected welds were either repaired or replaced, the plant started up and continued operations until it was finally decommissioned. Other major costs arose from the legal actions that were initiated against the contractor and its NDT subcontractor.
Case #2 – ‘Almost’ Completed Nuclear Power Plant
In the 1979 movie “The China Syndrome,” released during the days of nuclear power plant construction, a reporter found what appeared to be a cover-up at a nuclear power plant. In part, the cover-up dealt with the falsification of weld radiographs. Unfortunately, this doesn’t just happen in the movies. While the radiographs of piping were going through their final review at an almost completed nuclear power plant in the U.S. Midwest, it was discovered that the 2-2T hole images in a number of radiographs had been tampered with. These radiographs were taken just prior to the Christmas holidays some years ago, and it was obvious that the image of the 2-2T hole (the essential hole image required by the Code) was not visible. Instead of taking the correct action and re-examining the welds, the radiographers decided to make their own hole image by scratching a circular pattern on the film surface in the exact location where the hole image was supposed to be. This led to the re-evaluation of radiographs at nineteen utilities and caused a number of welds to be re-radiographed during subsequent outages. This is an example of a lack of integrity on the part of the inspectors and the company, and it should never have happened. It is also a case where any POD analysis is rendered useless.
Case #3 – Gas Processing Plant
The piping welds in a gas processing plant undergoing expansion were subjected to extensive radiographic examinations using several subcontractors. There were over 30,000 welds radiographed and evaluated by the subcontractors. Upon a quality control review of the radiographs by the owner, over 3,000 radiographs were interpreted as not meeting the Code and should have been rejected for the following reasons:
- Almost 80% were rejected for film quality and technique problems, including unacceptable image quality indicator (IQI) images, film densities below and above the Code-stipulated range, artifacts in the area of interest, insufficient coverage, and the presence of excessive debris and weld spatter.
- Approximately 50% were rejected for discontinuities in excess of that permitted by the Code.
Note: Some radiographs exhibited both unacceptable film quality and discontinuities that did not meet the Code requirements. Radiographs with unacceptable film quality rightly could and should have been rejected without any attempt at interpretation.
To summarize, out of the original 3,000-plus disputed radiographs, approximately 90% were determined to be unacceptable to the Code and approximately 10% were found to be acceptable. Based on industry trends, the transition to digital radiography should significantly reduce interpreter disagreement and provide for more accuracy in POD.
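The two rejection percentages in this case add up to more than 100% because, as the note explains, some radiographs fell into both categories. The size of that overlap can be estimated by inclusion-exclusion. The sketch below rests on an assumption the article does not state explicitly: that all three rounded figures (almost 80%, approximately 50%, and approximately 90%) are shares of the same set of 3,000-plus disputed radiographs. If the first two are instead shares of the rejected radiographs only, the result changes. The function name is hypothetical.

```python
def overlap_percent(pct_a: int, pct_b: int, pct_either: int) -> int:
    """Inclusion-exclusion on percentages of one population.

    |A and B| = |A| + |B| - |A or B|
    Assumption: all three percentages refer to the same set of radiographs.
    """
    overlap = pct_a + pct_b - pct_either
    if overlap < 0 or overlap > min(pct_a, pct_b):
        raise ValueError("percentages are mutually inconsistent")
    return overlap


# Rounded figures quoted in the text, treated as shares of the disputed set.
film_quality_pct = 80     # rejected for film quality and technique problems
discontinuity_pct = 50    # rejected for discontinuities beyond Code limits
unacceptable_pct = 90     # found unacceptable for either reason

both_pct = overlap_percent(film_quality_pct, discontinuity_pct, unacceptable_pct)
print(f"rejectable on both grounds: about {both_pct}% of the disputed radiographs")
```

Because the source figures are approximate, the result is an order-of-magnitude estimate of the overlap, not a count.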
Minimizing Disagreements and Inconsistencies
Training - Satisfactory completion of high-level film interpretation training courses taught by instructors qualified in RT image interpretation is a good start. But training alone will not guarantee that an interpreter is qualified.
Experience - Experience should be gained by jointly evaluating RT images with qualified interpreters. This requirement is essential and preferably should include reviews of a ‘substantial’ number of radiographs, a number that is determined on a case-by-case basis. Companies and RTFIs-in-training should not be content with reaching any minimum number of reviews established by professional societies or organizations. The fledgling RTFI should be left to work alone only when both the company and the RTFI are convinced that the RTFI is capable of performing quality work. True interpretive competence builds as an increasing number of images are reviewed with a qualified interpreter who has extensive experience.
Audit - Oversight is an essential requirement in NDE. Oversight reviews, or audits, need to be performed by qualified interpreters. This is a key function usually performed on behalf of the owner. There should be absolutely clear owner standards and contract requirements that place the ‘final say’ with the owner’s qualified RT oversight staff, given the owner’s responsibility for the safety of the facility and operating staff.
i Fuksoc, Ferenc, Muller, Christina, Scharmack, Martina. “Human Factors: The NDE Reliability of Routine Radiographic Film Evaluation.” 15th World Congress on NonDestructive Testing, 15-21 October 2000, Rome, Italy.
ii Megling, R.C., and M.L. Abrams. Relative Rules of Experience/Learning and Visual Factors on Radiographic Inspector Performance. Research Report SRR73-22. San Diego, CA: Naval Personnel and Training Research Laboratory, June 1973.
iii Berock, J.F., R.G. Wells, and M.L. Abrams. Development and Validation of an Experimental Radiographic Reading Training Program. Report AD-782-332. San Diego, CA: Navy Personnel Research and Development Center, June 1974.
Lusted, L.B. “Signal Detectability and Medical Decision Making.” Science, Vol. 171 (March 1971), pp. 1217-1219.
iv Mohr, Gregory A., Willems, Peter. “Factors Affecting Probability of Detection with Computed Radiography.” 17th World Congress on NonDestructive Testing, 25-28 October 2008, Shanghai, China.
v Tang, B.P.Y., Hungler, P., Sweetapple, C.P., Bennett, P.G.I. “A Quantitative Evaluation of the Canadian Forces New Radiology Inspection Systems for the Detection of Water Ingress in CF188 Flight Control Surfaces.” 17th World Congress on NonDestructive Testing, 25-28 October 2008, Shanghai, China.