Comparison of informal caregiver and named nurse assessment of symptoms in elderly patients dying in hospital using the palliative outcome scale
Rebecca Dawber,1 Kathy Armour,2 Peter Ferry,3 Bhaskar Mukherjee,4 Christopher Carter,5 Chantal Meystre6

1 Department of Palliative Care, Southend University Hospital NHS Trust, Southend-on-Sea, UK
2 Marie Curie Hospice West Midlands, Solihull, UK
3 Department of Geriatric Medicine, Karin Grech Hospital, Pieta, Malta
4 Department of Care of the Elderly/Stroke, Burton Hospitals NHS Trust, Burton upon Trent, UK
5 Department of Clinical Chemistry, Heart of England NHS Trust, Heartlands Hospital, Birmingham, UK
6 Marie Curie Hospice West Midlands and Heart of England NHS Trust, Solihull, UK

Correspondence to Dr Rebecca Dawber, Department of Palliative Care, Southend University Hospital NHS Trust, 12 Cardigan Avenue, Southend-on-Sea SS0 0SF, UK; r.dawber{at}


Objectives A prospective study of symptom assessments made by a healthcare professional (HCP; named nurse) and an informal caregiver (ICG) compared with those of the patient with a terminal diagnosis. To examine the validity of HCP and ICG as proxies, to identify which symptoms they can reliably assess, and to determine which of HCP and ICG is the better proxy.

Methods A total of 50 triads of patient (>65 years) in the terminal phase, ICG and named nurse on medical wards of an acute general hospital. Assessments were made using the patient and caregiver versions of the palliative outcome scale (POS), all taken within a 24 h period. Agreement between patient-rated, ICG-rated and HCP-rated POS and POS for symptoms (POS-S) was measured using weighted-κ statistics. Demographic and clinical data on each group of participants were collected.

Results ICG assessments have higher agreement with those of the patient than HCP. Better agreement in both groups was found for physical symptoms, and best agreement was for pain. The worst agreements were for psychological symptoms, such as anxiety and depression, and for satisfaction with information given. Psychological symptoms are overestimated by both ICG and HCP.

Conclusions ICGs are more reliable proxies than HCPs. A trend for overestimation of symptoms was found in both groups which may lead to undervaluation of the quality of life by proxy and overtreatment of symptoms. This highlights the need to always use the patient report when possible, and to be aware of the potential flaws in proxy assessment. Reasons for overestimation by proxies deserve further research.

  • Quality of life
  • Symptoms and symptom management
  • Psychological care
  • Education and training
  • Communication
  • Clinical assessment

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:



Palliative care, as defined by WHO, is an approach that improves the quality of life (QOL) of patients and their families facing the problems associated with life-threatening illness.1 In palliative care, QOL is measured as a principal outcome indicator to evaluate the quality of our service provision, to ensure safeguarding and improvement of standards.2 ,3 It is essential, therefore, to have a valid and reliable tool for measurement of QOL.4 No established consensus exists regarding which dimensions should be included in assessments of QOL, but it is generally accepted that they should include physical, psychological and social dimensions;5 reflecting QOL before death, symptom control and family support.6

Neugarten et al 7 said “The individual can be the only proper judge of his wellbeing.” It is accepted that the patient’s report of their symptoms and QOL is the gold standard. Furthermore, the Mental Capacity Act 2005 asserts in law that a capacitous patient is the gold standard for decision-making about their treatment and care. Commonly, as illness progresses, a patient’s ability to report symptoms and make decisions about care diminishes. In this instance, proxy reports of symptoms are used, and family proxies are employed as surrogate decision-makers regarding treatment. The accuracy and reliability of proxies, therefore, have implications for the delivery of appropriate treatment interventions for the individual, for appraising the outcomes of service provision, and for research. Owing to the reliance on proxies, researchers have investigated patient and proxy concordance in assessment of symptoms and QOL to determine whether family or healthcare worker proxies are more accurate, and what characteristics are associated with proxy accuracy.3

Another outcome measured to appraise the success of palliative care services is whether the patient dies in their place of preference. It is commonly reported that most terminally ill patients wish to die at home, and symptom management has been identified by family caregivers as their primary concern in caring for patients with cancer at home.8 Studies have revealed a trend of family caregivers overestimating symptom distress,9–11 which, conceivably, could contribute to increased admissions to acute care services due to perceived inadequacy of symptom control and home care breakdown secondary to carer anxiety. Furthermore, caregiver burden is known to contribute to overestimation of symptoms,3 so investment in support of carers may lead to more accurate proxy assessments, better control of symptoms and fewer emergency hospital admissions.

Existing research into the validity of proxy symptom assessments is heterogeneous with no unanimity as to what represents an acceptable level of agreement between patient and proxy. Methodological weaknesses exist and the sample size for many of the studies in this field is small,12 rendering conclusions questionable and the generalisability of findings limited.12 Direct comparison of studies in this area is further complicated by the fact that tools used for measuring QOL and for symptom assessment vary between studies, and some tools lack evidence and validity. Moreover, statistical methods used in assessing agreement between patient and proxy assessments of QOL are disparate.

Our study uses the palliative outcome scale (POS), a widely accepted, validated tool for measuring symptoms and giving an indication of QOL.4 It is responsive to change, is brief and easy to administer, and takes approximately 10 min to complete.4 POS was developed from extensive review of the literature, and testing with users (patients and caregivers from a range of cultures) and clinicians. Independent validation found that it can usefully reflect practice,4 and that it is an appropriate instrument for assessing patients with cancer and non-cancer diagnoses, including patients with moderately severe dementia.13 POS includes questions relating to pain, other symptoms, and emotional, social, spiritual/existential and communication issues. The scale is not designed to record ‘QOL’, but it reflects the commonly accepted components of QOL and ‘total pain’.4

Despite the heterogeneity of research when considering the validity of proxy assessments of symptoms and QOL, there is some consistency.12 For concrete, observable symptoms (eg, vomiting and immobility) or symptoms that have behavioural clues (eg, appetite) there tends to be greater agreement between patient and proxy.3 ,9 ,12 For subjective symptoms (eg, psychological symptoms) the agreement is poorer.3 ,12 Proxies tend to overestimate psychological symptoms,3 ,14 difficulties with emotional well-being,15 and the distress associated with physical symptoms9–11 which may explain why they tend to underestimate QOL when compared to the patient,3 ,16–19 and the discrepancies in assessment of psychological symptoms.

While some studies20–23 suggest little difference between family caregiver and healthcare professional (HCP) in symptom assessment, there is no consensus. In most,3 ,12 ,16 but not all22 studies, congruence increases over time, with repeated assessment. This suggests that repeated measures may increase proxy accuracy.

The clinical use of proxy assessments at the end of life is unavoidable. The purpose of this study is to ascertain the reliability and nature of proxy symptom assessment using a validated measure: the POS and the additional POS for symptoms (POS-S).4 This will allow us to select the most reliable proxy, interpret the assessments selectively and determine strategies to improve proxy accuracy.



This was a prospective study of POS ratings recorded by patients, their nominated informal caregiver (ICG), and their named nurse within one 24 h period.

For the study to have a power of 0.8 at a significance level of 0.05, with an estimated difference on POS of 0.4,17 the number of triads required was 50. We planned to oversample up to 70 triads to allow for missing data.
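The arithmetic behind a power statement of this kind can be sketched as follows. This is only an illustration using the standard normal-approximation formula for a paired comparison; the article does not state which formula was used or what standard deviation was assumed, so the function name `paired_sample_size` and the value `sd_of_differences=1.0` are hypothetical. Under that assumed standard deviation the approximation gives 50, which matches the stated figure but does not confirm how the authors derived it.

```python
import math
from statistics import NormalDist


def paired_sample_size(difference, sd_of_differences, alpha=0.05, power=0.80):
    """Approximate number of pairs needed to detect a mean paired difference.

    Uses the two-sided normal approximation:
        n = ((z_{1 - alpha/2} + z_{power}) * sd / difference) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    n = ((z_alpha + z_beta) * sd_of_differences / difference) ** 2
    return math.ceil(n)


# ASSUMPTION: an SD of 1.0 POS point for the paired differences is hypothetical;
# only the difference (0.4), alpha (0.05) and power (0.8) come from the text.
triads_needed = paired_sample_size(difference=0.4, sd_of_differences=1.0)
print(triads_needed)
```

A larger assumed standard deviation increases the required number of triads quadratically, which is one reason to oversample when the variability is uncertain.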

Setting and subjects

The study was undertaken on the medical wards of a district general hospital in 2004–2005. Eligible patients were over 65 years of age, terminally ill from any diagnosis, and had been in hospital for more than 7 days with regularly visiting ICGs. We selected the named nurse as the healthcare professional because they provide the most constant care to the patient. The ‘named nurse’ is the nurse designated as responsible for a patient’s nursing care, and is allocated at the start of each nursing shift.

The patient was identified as ‘in the terminal phase of illness’ by the nurse in charge, and this was confirmed through assessment by the palliative medicine consultant. They were then approached by the researcher. After explanation, an information sheet that was either independently read or read out loud to the patient was provided. The researcher answered any questions and took written informed consent. The ethics approval specifically waived the usual 24 h period of consideration to account for the patient's extreme illness and the acknowledged attendant attrition rate of palliative care research. Interpreters were available.

Patients without visitors, with impaired cognition (Abbreviated Mental Test score 6 or below), or refusing consent were excluded.

Study tools

Patients completed the POS and POS-S, and the ICG and HCP completed modified versions for proxy assessments. POS-S is an additional scale that can be used alongside POS to assess key symptoms important in palliative care. The POS and POS-S ask for the effect of symptoms over the past 3 days. Each individual item is scored from 0 (not at all) to 4 (overwhelmingly). The patient version of POS asks directly about symptoms and information needs; the modified version for caregivers asks for their views of the patient’s experiences of the same issues.

Data collection

POS and POS-S questionnaires were distributed to patients on the ward. For those unable to complete the questionnaire themselves, the researcher transcribed their verbal answers to the questions. The named nurse was approached by the researcher and given the information sheet, consent form and questionnaire. Each participant in the triad answered the questionnaire independently. Forms were kept securely on the ward for the researcher to collect.

Demographic and clinical data, including patient age, gender, marital status, diagnosis, length of hospital stay and time to death were collected from the patient’s notes by the researcher. Gender, age and relationship to the patient were recorded for informal carers. Length of time since graduation and number of days looking after the patient were recorded for named nurses.


The mean and median scores of patient, ICG and HCP ratings were calculated for each item. The Wilcoxon signed-rank test was used to test for differences between patient and proxy scores. Agreement between the patient and each proxy rater was measured using a weighted-κ statistic, which measures agreement corrected for chance. Its values were interpreted as follows: none (<0), slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80) and almost perfect agreement (>0.80).24
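A minimal sketch of how a weighted κ for one POS item could be computed and mapped onto the bands above is given here for illustration. The article does not say whether linear or quadratic weights were used, so linear weights are an assumption; the function names `weighted_kappa` and `agreement_label` and the example scores are hypothetical and are not study data. The top band is labelled "almost perfect", following the usual Landis and Koch wording.

```python
def weighted_kappa(patient, proxy, categories=(0, 1, 2, 3, 4)):
    """Cohen's weighted kappa for two raters scoring the same patients.

    ASSUMPTION: linear disagreement weights |i - j| / (k - 1); the source
    does not specify the weighting scheme.
    """
    n = len(patient)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: rows = patient score, columns = proxy score.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(patient, proxy):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    disagree_observed = 0.0
    disagree_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            disagree_observed += weight * observed[i][j]
            disagree_expected += weight * row[i] * col[j]

    if disagree_expected == 0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("kappa is undefined when no disagreement is possible")
    return 1 - disagree_observed / disagree_expected


def agreement_label(kappa):
    """Map a kappa value onto the interpretation bands quoted in the text."""
    if kappa < 0:
        return "none"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical item scores (0 = not at all, 4 = overwhelmingly), not study data.
identical = weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
reversed_scores = weighted_kappa([0, 0, 4, 4], [4, 4, 0, 0])
print(identical, agreement_label(identical))
print(reversed_scores, agreement_label(reversed_scores))
```

Weighting means that a proxy who scores 3 when the patient scores 4 is penalised less than one who scores 0, which suits an ordinal 0 to 4 scale better than unweighted κ.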


Subject characteristics

To achieve 50 triads of patients, ICG and HCP, 65 patients were interviewed, but for 15 of them, either the ICG or HCP results were missing due to their not having been collected within 24 h. All patient, ICG and HCP questionnaires are included in the analysis. Approximately half the patients were men and half women, but more ICG were women than men. Almost all HCP questioned were women (93.8%). The median age of the patient was 78 years, but the median age of the ICG was lower at 56.5 years. Data were not collected on age of HCP. Most ICG were offspring (43%) followed by spouse (24.6%). All patients included died within 1 month of assessment confirming the accuracy of patient selection. Demographic data are displayed in table 1.

Table 1

Demographic variables for each comparison group—patient, ICG and HCP

Intergroup comparisons

Item-specific median scores of patient self-rated and ICG-rated POS and POS-S, as well as significant Wilcoxon signed-rank values are displayed in table 2. For patient versus ICG, there were significant differences (determined by Wilcoxon signed-rank scores) for weakness, drowsiness, immobility, time spent by staff, and patient having felt good about themselves.

Table 2

Median score and Wilcoxon signed-rank (p value) for patient (n=50) and ICG (n=50) for each POS and POS-S item

Item-specific median scores of patient self-rated and HCP-rated POS and POS-S, as well as significant Wilcoxon signed-rank scores are displayed in table 3. For patient versus HCP, there were significant differences (determined by Wilcoxon signed-rank scores) for weakness, drowsiness, satisfaction with information given, depression, time spent by staff, and whether the patient had felt good about themselves.

Table 3

Median score and Wilcoxon signed-rank (p value) for patient (n=50) and HCP (n=50) for each POS and POS-S item

Patient versus informal caregiver

Weighted-κ and percentage agreement for each item on the POS and POS-S for patient versus ICG are found in table 4.

Table 4

Weighted-κ and percentage agreement between patient and ICG and between patient and HCP by POS and POS-S item

Slight agreement

Slight agreement was found between ICG and patient assessment of weakness, anxiety in the patient, whether friends and family were worried or anxious, satisfaction with amount of information given, patients feeling that they are able to share their feelings, depression, feeling enough time has been spent with staff, whether the patient feels good about themselves, and feeling that time has been wasted on appointments.

Fair agreement

ICG and patient assessment agreed fairly for breathlessness, mouth problems, drowsiness, immobility, satisfaction with the standard of facilities and personal issues.

Moderate agreement

ICG and patient assessment agreed moderately for pain, nausea and vomiting, appetite and constipation.

Significant differences on Wilcoxon signed-rank statistic

There is a tendency of ICG to overestimate weakness, drowsiness, immobility, difficulties feeling good about self, nausea, depression, anxiety and satisfaction with information given.

Patient versus healthcare professional

Weighted-κ and percentage agreement for each item on the POS and POS-S for patient versus healthcare professional are found in table 4.

No agreement

HCP assessments showed no agreement with the patient for weakness, the patient’s feeling of being able to share feelings, depression, and the patient’s feeling that staff have spent enough time with them.

Slight agreement

HCPs had only slight agreement with the patient in assessment of nausea, vomiting, appetite, mouth problems, drowsiness, anxiety in the patient, friends and family feeling anxious, standard of help for relatives, satisfaction with amount of information given, standard of facilities, patient feeling good about themselves, feeling of time wasted on appointments and personal issues.

Fair agreement

HCPs showed fair agreement with the patient for pain, breathlessness, constipation and immobility.

Significant differences on Wilcoxon signed-rank statistic

HCPs overestimate weakness, depression, time spent with staff, patient having difficulty feeling good about self, anxiety and dissatisfaction with facilities.

HCPs underestimate problems with drowsiness compared with the patient.


Accuracy of proxy assessments and whose assessment is best

First, it is pertinent to state that symptom assessment should always come from the patient when possible. Proxy assessment of symptoms should only be sought when the patient, the gold standard, is unable to give his or her own report of symptoms. We will discuss who might represent the better proxy, but the question of what level of agreement should be accepted to call the proxy ‘valid’ is moot, as proxy opinion should never replace the gold standard (the patient’s own report) unless that report is unavailable. By definition, the proxy is less accurate than the gold standard, but this cannot be altered, and often proxy assessment is the only one available.

These results demonstrate that agreement on assessment of symptoms was found to be better for ICG than HCP, meaning the ICG is a better proxy than the HCP. Findings confirm the published literature which reports a tendency for better agreement between ICG and patient than HCP and patient.20 ,25 Both ICG and HCP exhibit strengths and weaknesses when considering individual symptom assessment. A combined approach to assessment, taking into account the views of both family and healthcare professionals, has not been studied, but two raters together may give a more accurate assessment of symptoms than one proxy rater alone.

Our results show that the best proxy agreement was for pain as rated by the ICG. Pain is also the best proxy-rated symptom in most,23 ,26–28 but not all,12 ,29 studies. The experience of pain is subjective with varied interpersonal expression, and thus, it might be expected that the ICG would give a more accurate proxy rating than the HCP by virtue of their personal knowledge of the patient. That the HCP comes a close second despite the subjectivity of pain may be explained by the focus on it as a symptom amenable to treatment by the clinical team.

Similarly, ICG and HCP gave closest agreement for constipation, but ICG was better. ICG also had better agreement for appetite and nausea and vomiting (moderate agreement) than HCP (slight agreement). This fits with previous findings that observable physical symptoms, such as nausea and vomiting, constipation and appetite, are best assessed.

A surprising finding was that nurses showed only slight agreement for nausea and vomiting and appetite. Previous studies have shown near-perfect agreement for proxy assessment of vomiting and nausea,30 and it seems intuitive that nurses would be able to report vomiting with reliability as this is a uniquely observable symptom. It may be that healthcare assistants are more involved with the front-line care of the vomiting patient than the nurses. Scrutiny of the validity of proxy assessment of symptoms by healthcare assistants could be of value.

Both proxy groups are poor at estimating the prevalence of psychological symptoms with a tendency for overestimation

HCP and ICG were poor at assessing psychological symptoms, as demonstrated by the κ and the percentage agreement, of anxiety in the patient, feeling able to share feelings, difficulty feeling good about self, anxiety in friends and family, and depression. The Wilcoxon results demonstrate direction, that is, either an overestimation or underestimation of patient symptoms. Taken with the κ and percentage agreement results, the overestimation of patient symptoms by the HCP and the ICG concurs with the published literature, where psychological symptoms are more poorly assessed by proxies than physical symptoms. In our study, HCP and ICG overestimate problems with anxiety, feeling good about self, and depression, compared with the patient.
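How a signed-rank comparison indicates direction can be sketched as below. This is an illustrative calculation of the Wilcoxon signed-rank sums only; it does not compute a p value, and the function names `signed_rank_sums` and `direction` and the example scores are hypothetical rather than taken from the study.

```python
def signed_rank_sums(patient, proxy):
    """Return (W_plus, W_minus) for paired scores, using proxy minus patient.

    Zero differences are dropped and tied absolute differences share the
    average of the ranks they span, as in the standard signed-rank procedure.
    """
    diffs = [q - p for p, q in zip(patient, proxy) if q != p]
    ordered = sorted(abs(d) for d in diffs)

    # Assign an average rank to each distinct absolute difference.
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus


def direction(w_plus, w_minus):
    """Describe which way the proxy tends to differ from the patient."""
    if w_plus > w_minus:
        return "proxy overestimates"
    if w_plus < w_minus:
        return "proxy underestimates"
    return "no direction"


# Hypothetical anxiety scores for five patient and proxy pairs, not study data.
patient_scores = [1, 2, 0, 3, 1]
proxy_scores = [2, 4, 0, 2, 3]
w_plus, w_minus = signed_rank_sums(patient_scores, proxy_scores)
print(w_plus, w_minus, direction(w_plus, w_minus))
```

A large imbalance between the two rank sums is what a significant test result reflects; κ and percentage agreement, by contrast, say how often the raters agree but not which way they differ.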

The concept of caregiver burden influencing the quality of the proxy assessment is interesting. Caregiver burden refers to people’s emotional response to the changes and demands of giving support to another.17 It has been suggested that higher caregiver burden leads to less accurate assessment of QOL,3 ,31 and that proxies who are depressed generally rate the QOL of the patient lower than caregivers who are not depressed.11 ,32 An explanation for why ICGs perform poorly as proxies for psychological symptoms may be that their overestimation of anxiety and depression reflects their own symptoms, making it difficult for them to uncouple their own feelings in order to make an objective assessment of the patient. Perspective taking is described as ‘a person’s ability to understand what the other person is thinking, without experiencing the emotions’.9 It is said to be a dimension of empathy, and it aims not to take on the patient’s feelings but to appreciate the perceptions of the patient. Can it ever be possible for an ICG, who is intrinsically emotionally involved, to do this?

Cues used by ICGs when they are assessing symptoms are also important. Lobchuk and Kristjanson33 looked at this and found that the cues used differed depending on the symptom being assessed. For example, appetite was reported to be assessed largely using impaired functioning as the cue: the inability to eat meals. Another example is bowel assessment, for which ICGs relied heavily on verbal cues. This is important because if symptom cues are principally verbal, then at the point in a person’s illness when communication is lost, the proxy assessment could be flawed. The study in question used Young’s symptom distress score, which has only one question directly relating to psychological welfare (outlook), and its assessment was reported to be reliant mainly on verbal cues. Our results show that the assessment of psychological symptoms, thought to be based on verbal cues, is poor even when the patient is still able to provide those verbal cues, suggesting that the cues are not effective. At the end of life, such a proxy rating will be even less accurate once communication is lost.

Therefore, ICG and HCP should be regarded as poor proxies for psychological symptoms, and clinicians should be aware that these proxies probably underestimate QOL compared to the patient. It may be useful to investigate whether providing training to ICGs about the signs and symptoms of depression and anxiety might improve their assessments.

Poor assessment of satisfaction with information given

Consistent with existing literature, ICG and HCP were found to be poor at assessing satisfaction with the amount of information given. The disagreement (Wilcoxon) achieves significance for the HCP, but not for the ICG. The HCP underestimates problems, while the ICG overestimates them. Meeting the information needs of patients with progressive, life-limiting conditions and their families is a key concern of palliative care, and evidence of how to meet these needs is lacking.34 The literature shows that lack of information on the causes, symptoms, treatment and progression of disease adversely affects patients’ and caregivers’ abilities to cope with serious illness, and that good communication improves outcomes.35 As in our study, existing reports of patient versus proxy assessments of satisfaction with information given suggest that proxy assessors are less satisfied with the amount of information given than the patient.2 ,17 ,26 One study found that patients and families had a similar need for information initially, but this changed as the patient’s illness progressed, with caregivers wanting more information and patients less.36 When considering the disparity between patient and proxy assessments of satisfaction with information received, it is possible that the proxies’ overestimation of dissatisfaction is a reflection of their own dissatisfaction. As with psychological symptom assessment, it could be difficult for the proxy to disentangle what they feel from what they perceive their relative feels, because the information needs of family caregivers may differ from those of the patient. Meeting caregiver information needs directly may contribute to reduced caregiver burden, anxiety and depression which may, in turn, allow a truer perception of the suffering of the patient and a better proxy assessment.


ICGs are better proxy symptom assessors than HCPs. However, if the patient account is available, the proxy assessment should not be used, and should be reserved only for circumstances where the patient cannot answer for themselves. Proxy assessments of psychological symptoms should be interpreted with caution as they are poorly assessed by proxy. Proxy assessments are relied on when patients can no longer communicate. These results further undermine confidence in proxy accuracy when the patient is available to speak for themselves. When verbal cues are lost, proxy ratings may be entirely flawed. Investment in carer training in symptom recognition might resolve some sources of inaccuracy and be supportive to carer well-being. This should be an important future area of research. Further investigation of assessment tools would be required to define one universally accepted tool which would allow the literature to be comparable.


The limitations of this study include the sample size, though it is comparable with, or larger than, other studies of this type.22 ,23 ,26 In addition, considering ICG, we do not know the quality of their relationship with the patient, and how this might affect the results. Furthermore, we do not know the extent to which each nurse had been involved in the care of the patient. Another limitation relates to extrapolation of the results; the study was performed on an acute hospital ward with nurses without specialist training in palliative care. It may be that proxy assessments are better for nurses on a specialist palliative care unit. Finally, though the entry criteria for the study were patients dying of any disease, the majority of the sample had a diagnosis of cancer, likely due to prognostication being easier in cancer.


The authors are pleased to acknowledge the generous financial assistance and continuing helpful liaison for this research from the West Midlands South Joint PCT Research Strategy Board. Dr Chantal Meystre and Dr Kathy Armour's posts are supported by Marie Curie Cancer Care.



  • Contributors CM is responsible for the completeness and accuracy of the data. BM assisted with data collection. Data analyses were co-designed by the listed authors and performed by CC. The manuscript was drafted by RD with inputs from CM and KA. All authors approved the final version. CM is the guarantor.

  • Competing interests None declared.

  • Ethics approval Ethics Warwickshire REC No 532 granted 2002 and extension granted to 2006.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Supplementary data available on request. Please contact CM by email;