Although CT and MR imaging is now commonplace in the radiology department, few studies have examined complex
interpretative tasks such as the reading of multidimensional brain CT or MRI scans from the observer performance
perspective, especially with reference to stroke. Modality performance studies have demonstrated a similar sensitivity of
less than 50% for both conventional modalities, with neither modality proving superior to the other in stroke observer
performance tasks (Mohr, 1995; Lansberg, 2000; Wintermark, 2007). Visual search studies have not extensively
explored stroke imaging and an in-depth, comparative eye-movement study between CT and MRI has not yet been
conducted. A computer-based, eye-tracking study was designed to assess diagnostic accuracy and interpretation in stroke
CT and MR imagery. Forty-eight predetermined clinical cases, with five images per case, were presented to participants
(novices, trainees and radiologists; n=28). The presence or absence of abnormalities was rated on a four-point Likert
scale and their locations reported. Results highlight differences in visual search patterns amongst novice, trainee and
expert observers; the most marked differences occurred between novice readers and experts. In terms of modality
differences, novice and expert readers spent longer appraising CT images than MR images, whereas trainees spent
longer appraising MR than CT images. Image analysis trends did not appear to differ between modalities, but time spent
within clinical images, accuracy and relative confidence performing the task did differ between CT and MR reader
groups. To date, few studies have explored observer performance in neuroradiology, and the present study examines
multi-slice image appraisal by comparing matched pairs of CT and MRI stroke cases.
Previous work has outlined that certain mammographic appearances feature more prominently in readers' false-negative
responses on a self-assessment scheme. Bi-annually 600 breast-screening film-readers complete at least one round of
the Personal Performance in Mammographic Screening (PERFORMS) self-assessment scheme in the UK. The main
occupational groups in UK Breast Screening can be categorised as Radiologists, Technologists and Symptomatic readers.
Previous work has shown that these groups can vary in their reading 'style' and accuracy on self-assessed cases. These
groups could be said to contain individuals with (arguably) pronounced differences in their real-life reading
experience: Symptomatic readers routinely read a large number of cases with abnormal appearances, and Technologists
(specially trained to read films) do not have the same medical background as breast-screening Radiologists. We aimed
to examine overall (national) and group (occupational) differences in terms of ROC analysis on those mammographic
cases with different mammographic appearance (feature type). Several main feature types were identified, namely Well
Defined Mass (WDM), Ill Defined Mass (IDM), Spiculate Mass (SPIC), Architectural Distortions (AD), Asymmetry
(ASYM) and Calcification (CALC). Results are discussed in light of differences in real-life practice for each of the
occupational groups and how this may impact on accuracy over certain mammographic appearances.
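For readers unfamiliar with ROC analysis of rating data, the following is a minimal sketch of how an area under the ROC curve could be estimated per feature type from confidence ratings. It uses the pairwise (Mann-Whitney) estimate of the area; the abstract does not state which ROC method was used, and the function name and the example ratings are hypothetical.

```python
def auc_from_ratings(abnormal_ratings, normal_ratings):
    """Estimate the ROC area as the probability that a randomly chosen
    abnormal case receives a higher rating than a randomly chosen normal
    case (ties count as half). This is the Mann-Whitney estimate."""
    wins = 0.0
    for a in abnormal_ratings:
        for n in normal_ratings:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_ratings) * len(normal_ratings))


# Hypothetical confidence ratings (1 = definitely normal, 5 = definitely
# abnormal) for one reader group, split by feature type. Illustrative only.
ratings_by_feature = {
    "SPIC": {"abnormal": [5, 4, 5, 3], "normal": [1, 2, 1, 2]},
    "ASYM": {"abnormal": [3, 2, 4, 3], "normal": [2, 3, 1, 3]},
}

for feature, r in ratings_by_feature.items():
    print(feature, round(auc_from_ratings(r["abnormal"], r["normal"]), 3))
```

An area of 0.5 corresponds to chance performance and 1.0 to perfect separation of abnormal from normal cases, so comparing the areas across feature types and occupational groups gives one way of expressing the differences the abstract describes.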
Digital mammography is gradually being introduced across all breast screening centres in the UK during 2010. This
provides increased training opportunities using lower resolution, lower cost and more widely available devices, in
addition to the clinical digital mammography workstations. This study examined how experienced breast screening
personnel performed when they examined sets of difficult DICOM two-view screening cases in three conditions: on GE
digital mammography workstations, on a standard LCD monitor (using a DICOM viewer) and an iPhone (running Osirix
software). In each condition they either viewed the full images unaided or were permitted to use the post-processing
manipulations of pan, zoom and window level/width adjustments. For each case they had to report the feature type, rate
their confidence on the presence of abnormality, classify the case and specify case density. Their visual search behaviour
was recorded throughout using a head-mounted eye tracker. Additionally, aspects of their real-life screening performance
and performance on a national self assessment scheme were examined. Data indicate that screening experience plays a
major role in doing well on the self assessment scheme. Task performance was best on the clinical workstation.
However, the data also suggest that a DICOM viewer running on a PC or laptop with a standard LCD display, which allows
digital images to be viewed in full resolution, can support impressive cancer detection performance. The iPhone is not ideal for
examining full images due to the amount of scrolling and zooming required. Overall, the results indicate that low cost
devices could be used to provide additional tailored training as long as device resolution and HCI aspects are carefully
considered.
Incidence of cancer in the UK NHS Breast Screening Programme (NHSBSP) is relatively low (approximately 7 per
1,000 cases screened). As such, feedback from cancers missed or interval cancers can be a relatively lengthy process
(whereby a woman will not present for corroborating imaging for a further three years). Therefore in order to monitor
their radiological skill, all breast screening radiologists and technologists read a self-assessed, standard set of
challenging mammographic images bi-yearly. This scheme, 'PERFORMS' (Personal Performance in Mammographic
Screening) has been running since near the inception of the NHSBSP in 1991. Although PERFORMS has functioned as
an educational tool for film-readers on the UKBSP for decades, its relation to real life screening in past years has
proven to be somewhat equivocal (Cowley & Gale, 1999). The present study investigated the relationship between
performance measures in real life and their equivalents on the PERFORMS self-assessment scheme, namely: Miss Rate
(FN), Cases Arbitrated and Returned to Routine screening and Incorrect recall (FP), Specificity (TN) and Cancer
Detection (TP). Over 40 individuals from one NHS region in the UK submitted their real life data for comparison with
PERFORMS results from the same time frame. Data from this initial study were taken from the year 2005-2006 and
compared with the relevant PERFORMS set of cases. Results indicated a significant positive correlation between
PERFORMS performance measures and performance measures for real life. These results are discussed in the light of
the legitimacy of self-assessment relative to film-reading skill (during real-life clinical practice).
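As an illustration of the comparison described above, the sketch below derives the named measures from raw decision counts and correlates a test-set measure with its real-life equivalent across readers. The function names, the mapping of measure names to formulas, and all numbers are assumptions made for illustration; they are not taken from the study.

```python
from math import sqrt


def screening_measures(tp, fn, fp, tn):
    """Derive rate measures from counts of true positives (tp), false
    negatives (fn), false positives (fp) and true negatives (tn)."""
    return {
        "cancer_detection": tp / (tp + fn),   # sensitivity
        "miss_rate": fn / (tp + fn),          # 1 - sensitivity
        "specificity": tn / (tn + fp),
        "incorrect_recall": fp / (tn + fp),   # 1 - specificity
    }


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical per-reader cancer detection rates: test set vs real life.
test_set_detection = [0.82, 0.90, 0.75, 0.88, 0.93]
real_life_detection = [0.80, 0.87, 0.78, 0.85, 0.91]

print(screening_measures(tp=8, fn=2, fp=5, tn=85))
print(round(pearson_r(test_set_detection, real_life_detection), 3))
```

A positive coefficient close to 1 would indicate that readers who score well on the test set also tend to score well in real-life screening, which is the kind of relationship the abstract reports.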
In the UK, most mammographic interpretation training needs to be undertaken where there is a mammo-alternator or
other suitable light box, which limits the times and places where training can take place. However, the gradual
introduction of digital mammography is opening up new opportunities of providing such training without the restriction
of current viewing devices. Whilst high-resolution monitors in appropriate viewing environments are de rigueur for actual
reporting, the advantages of the digital image over film lie in the flexibility of training opportunity afforded, e.g. training
whenever and wherever suits the individual. A previous study indicated the potential for reporting mammographic
cases utilising handheld devices with suitable interaction techniques. In a pilot study, a group of mammographers (n=4)
were questioned in semi-structured interviews in order to help establish current UK film-readers' training profile. On the
basis of the pilot study data, 109 Breast Screening Units (601 film readers) were approached to complete a structured
questionnaire in order to establish the potential role of smaller computer devices in mammographic interpretation
training (given the use of digital mammography). Subsequently, a study of radiologists' visual search behaviour in digital
screening has begun. This has highlighted different image manipulations than found in structured experiments in this
area and poses new challenges for visualising the inspection process. Overall the results indicate that using different
display sizes for training is possible but is also a challenging task requiring novel interaction approaches.
A radiographic 'false negative' or a case which has been 'missed' can be categorised in terms of errors of search (where
gaze does not fall upon the abnormality); detection (a perceptual error where the abnormality may be physically 'seen'
but remains undetected) and misinterpretation (a perceptual error whereby an abnormality, although detected, is not
deemed worthy of further assessment). This study aims to investigate perceptual errors in mammographic film-reading
and will focus on the latter two error types, namely errors of non-detection and errors of misinterpretation.
Previous research has shown, on a self-assessment scheme of recent and difficult breast-screening cases, that certain
feature types are susceptible to errors of misinterpretation and others to errors of non-detection. This self assessment
scheme, 'PERFORMS' (Personal Performance in Mammographic Screening), is undertaken by the majority (at present
over 90%) of breast-screening mammographers in the UK Breast Screening Programme. The scheme is completed biannually
and confidentially and participants receive immediate and detailed feedback on their performance. Feedback
from the scheme includes information detailing their false negative decisions including case classifications (benign or
malignant), feature type (masses, calcification, asymmetries, architectural distortions and others) and case perception
error (percentage of misinterpretation and percentage of non-detection). Results from a recent round of PERFORMS
(n=506) revealed that certain feature types had significantly higher percentages of error overall (including architectural
distortion and asymmetries), and that these feature types also showed significant differences for error type. Implications
for real-life screening practice were explored using real-life self-reported data on years of screening experience.
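The three-way error taxonomy described above can be sketched as a simple decision rule. This is a minimal illustration under stated assumptions: the two boolean inputs (whether gaze fell on the abnormality and whether the reader reported it) and both function names are hypothetical simplifications, not the scheme's actual scoring procedure.

```python
from collections import Counter


def classify_miss(gaze_on_abnormality, abnormality_reported):
    """Classify a false-negative decision into one of three error types.

    search:            gaze never fell on the abnormality
    detection:         gaze fell on it, but it was not reported
    misinterpretation: it was reported, but not judged to need recall
    """
    if not gaze_on_abnormality:
        return "search"
    if not abnormality_reported:
        return "detection"
    return "misinterpretation"


def error_percentages(misses):
    """Percentage of each error type over a list of missed cases, where
    each case is a (gaze_on_abnormality, abnormality_reported) pair."""
    counts = Counter(classify_miss(gaze, reported) for gaze, reported in misses)
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Hypothetical missed cases for one feature type. Illustrative only.
missed_cases = [(False, False), (True, False), (True, True), (True, True)]
print(error_percentages(missed_cases))
```

Computing these percentages separately for each feature type would give a breakdown comparable to the case perception error feedback that the abstract describes.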
KEYWORDS: Personal digital assistants, Mammography, Image resolution, Breast, Eye, Human-computer interaction, Breast cancer, Visualization, Pixel resolution, Cancer
In the UK a national self-assessment scheme (PERFORMS) for mammographers is undertaken as part of the National
Health Service Breast Screening Programme. Where appropriate, further training is suggested to improve performance.
Ideally, such training would be on-demand; that is whenever and wherever an individual decides to undertake it. To use
a portable device for such a purpose would be attractive on many levels. However, it is not known whether handheld
technology can be used effectively for viewing mammographic images. Previous studies indicate the potential for
viewing medical images with fairly low spatial resolution (e.g. CT, MRI) on PDAs. In this study, we set out to
investigate factors that might affect the feasibility of using PDAs as a training technology for examining large, high
resolution mammographic images. Two studies are reported. First, 20 mammographers examined a series of mammograms
presented on a PDA, specifying the location of any abnormality. Second, a group of technologists examined a series of
mammograms presented at different sizes and resolutions to mimic presentation on a PDA and their eye movements were
recorded. The results indicate the potential for using PDAs to show such large, high resolution images if suitable
Human-Computer Interaction (HCI) techniques are employed.
PERFORMS (Personal Performance in Mammographic Screening), a self-assessment scheme for film-readers, is
undertaken as an educational tool by mammographers reading breast-screening films in the UK. The scheme has been
running as a bi-annual exercise since its inception in 1991. In addition to completing the scheme each year the majority
of film-readers also choose to complete a questionnaire, administered as part of the scheme, indicating key aspects of
their everyday reading practice. These key aspects include volume of cases read per week, time-on-task reading
screening films, incidence and time of break periods as well as typical number of film-reading sessions per week.
Previous recommendations on best screening practice (significantly the optimum time on task) were considered in the
light of these film-readers' self-reports on a current PERFORMS case set.
In addition we looked at performance accuracy of over 450 film-readers reading PERFORMS cases (60 difficult
mammographic cases). Performance on measures akin to True Positive (Correct Recall Percentages) and True Negative
(Correct Return to Screen Percentages) decisions were investigated. Data presented demonstrate that individual
behaviours in real life screening, for the interpretation of mammographic cases, affect film-reading accuracy on a test
set of mammograms for specificity and sensitivity (namely volume of cases read per week and film-reading
experience). The consequences for best screening practice, in real life, are considered.
In the UK there are two groups of radiologists who routinely read mammographic cases: Symptomatic and Screening Radiologists. We examined the performance of these two film-reading populations, Breast Screening Radiologists and Symptomatic Radiologists, to evaluate if there were group differences in their "style" of reading the same set of cases. Specifically we looked at each group's sensitivity and specificity measures. In addition we investigated if there were any individual group differences apparent in the cases which they found challenging and what (if any) were the characteristics of those cases. Data from 66 Breast Screening Radiologists and a matched group of 66 Symptomatic Radiologists were compared over a number of years (360 cases). Results are presented which demonstrate that whilst the two groups show overall similarities in performance there exist subtle underlying differences which we attribute to the differences in their everyday experience of the types of cases that they read. In conclusion, we argue that these differences are related to the volume of cases which UK Screening Radiologists read in order to maintain skill level.
Each year almost all film readers in the UK Breast Screening Programme voluntarily read a set of difficult mammographic cases as a means of self-assessing their film reading skills. We set out to investigate what case characteristics, if any, actually constituted a 'difficult' or 'easy' case in the opinion of radiological experts. We also examined how UK Breast Screening personnel performed on those cases which the experts deemed difficult, in order to build up a profile of the types of cases that provide film readers with the most problems. We examined two main elements of case diagnosis, case classification and case features, and investigated whether there were any group differences in terms of case difficulty and the percentage of incorrectly reported cases. Data from over 15 radiological experts and approximately 400 film readers were compared on 180 cases. Significant differences were found between the expert and screening populations (p < .05) in terms of these case characteristics. These data contribute to the understanding of just what constitutes a difficult case as considered by experts and other film-readers, with a view to elucidating the type of cases most appropriate for advanced mammographic training.
Prompting is utilised in CAD systems to draw attention to regions of potential abnormality within screening mammograms. The benefit of such systems is under debate. Our previous research found that radiologists’ visual search patterns were significantly altered when mammographic prompts were displayed. Visual attention concentrated upon prompted areas, with significantly less attention to unprompted regions. Additionally, prompts caused a reduction in the number of bilateral visual comparisons between the two breasts. Current CAD systems use a variety of prompts (e.g. circle and triangle) that appear incongruous to the mammogram and may inadvertently draw attention away from unprompted regions. The aim of this experiment was to determine whether subtler prompts would still attract attentional focus without resulting in insufficient search of unprompted areas. A series of paired mediolateral oblique view mammographic cases were presented to participants on a monitor. Images were presented as "unprompted" and "prompted", using various methods to highlight potentially abnormal areas. These included typical prompt shapes and also more novel prompts (e.g. altered brightness and colour). Participants were instructed to scan the images as they normally would when screening for abnormalities and to indicate their confidence that an abnormality was present, using a five-point scale. Eye movements were recorded during the task. Results demonstrated that visual attention was drawn to prompted regions. However, the potentially negative influence of prompts upon normal visual search patterns within mammograms was found to be less pronounced in conditions containing novel prompts. By comparing differing prompts during screening it was possible to establish their consequent impact upon visual search patterns. This research contributes to the establishment of optimal prompt displays in soft copy systems.
In the UK fewer radiologists are now specialising in breast cancer screening. Consequently, a number of technologists have been specially trained to read mammograms so as to double-read with existing radiologists. Each year the majority of these film-readers examine a set of difficult cases as a means of self-assessing their skills. We investigated whether the technologists performed as well as breast-screening radiologists on this difficult test set. We also investigated technologists’ performance over a number of years to compare the performance of those technologists who have read a greater number of breast screening films and those who have had less experience. Finally, we investigated real-life experience and performance on the scheme by comparing volume of cases read, experience, and technologists’ performance over time versus radiologists’ performance. Data for approximately 250 breast screening radiologists and 80 specially trained technologists over three years for six sets of 60 difficult recent screening cases were examined. Overall, those technologists who have not read the same volume of cases as radiologists did not perform as well on this particular task. However, when the group was stratified by volume of cases read in real life and by the number of years reading cases, the technologists performed at a level similar to the radiologists.
KEYWORDS: Breast, Mammography, Cancer, Chromium, Pathology, Statistical analysis, Signal detection, Medical imaging, Behavioral sciences, Breast cancer
UK breast screening radiologists typically read over 5,000 screening cases per annum, whereas in Europe this figure may be lower, as in some countries national breast screening programs are still in development. The PERFORMS scheme in the UK permits radiologists to self-assess their film-reading skills annually. As part of a Bavarian breast-screening training scheme a number of German radiologists have now also read the current PERFORMS case set. We investigated whether real-life case volume affects reading performance by comparing matched groups reading these screening cases. For each case, individuals identified which key mammographic features were present and whether the case was abnormal and should be recalled. For this analysis the participants were matched on volume of cases read and years of experience. Case volume was assessed by questionnaire. The radiologists were compared on several key performance measures: cancers detected, correct recall and correct return to screen, signal detection performance statistics and real-life screening practice. It was found that whilst the performance of the Bavarian radiologists on the current test sets was extremely good, on average they performed less well than their UK counterparts. Reasons for this are considered.