Main

The European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 (Aaronson et al, 1993) is one of the most widely used questionnaires for assessing health-related quality of life (HRQOL) in cancer patients. However, the QLQ-C30 does not meet all the needs of HRQOL assessment in cancer patients aged >70 years (Fitzsimmons et al, 2009; Johnson et al, 2010). There are substantial age-related differences in response on the EORTC QLQ-C30 (Hjermstad et al, 1998; Michelson et al, 2000; Schwarz and Hinz, 2001). Older people with cancer have a different HRQOL profile (Wright et al, 2005) and the specific needs of older people are often ignored in the development, validation and use of HRQOL instruments (Fitzsimmons et al, 2009). Similarly, healthy individuals report age-related differences in factors affecting well-being (Bowling, 2011). The QLQ-ELD15 was developed to supplement the QLQ-C30, and to take into account age-specific issues of relevance and importance to older cancer patients (Johnson et al, 2010).

Although older people represent the majority of cancer patients, there has been relatively little consideration for age-specific HRQOL in this population (Lichtman et al, 2007) and, as far as we know, no HRQOL instrument specifically designed for older people with cancer (Fitzsimmons et al, 2009). This may be partly explained by the under-representation of older patients in clinical trials (Scher and Hurria, 2012). Special interest organisations are now actively promoting research in elderly patients with cancer (Lichtman, 2012; Wildiers et al, 2012), so appropriate patient-reported outcome measures are required to assess HRQOL.

Health-related quality of life assessment is also important in routine clinical practice (Greenhalgh, 2009). Elderly cancer patients are more often treated with a non-curative approach and may be vulnerable to treatment toxicities (Wedding et al, 2007). Measurement of HRQOL aids the clinician in deciding whether the benefits of treatment outweigh the associated side effects, provided the instrument used is valid, reliable and responsive. We have previously described the EORTC QLQ-ELD15, a questionnaire designed to supplement the EORTC QLQ-C30, for use in older patients with cancer (Johnson et al, 2010). The aim of the present study was to test and, if necessary, modify the scale structure of the EORTC QLQ-ELD15, and to assess its reliability, responsiveness to change and validity when used in conjunction with the EORTC QLQ-C30 in cancer patients aged >70 years.

Methods

This prospective multi-centre cohort study followed the EORTC Quality of Life Group guidelines for module development (Johnson et al, 2011). The full protocol is available from the authors.

Patients

Patients were recruited from September 2010 to December 2011 in four centres in the UK, three in France, two in the Netherlands and one each in Australia, Austria, Cyprus, Greece, Spain, Sweden and Taiwan. A convenience sample of consecutive inpatients and outpatients who met the inclusion criteria was invited to participate. Eligible patients had a confirmed diagnosis of any primary, recurrent or metastatic cancer, were aged >70 years at study entry and were capable of providing written informed consent and completing HRQOL questionnaires. Patients were excluded if they were participating in other HRQOL investigations, or had a history of any cancer other than the primary cancer or a previous localised skin cancer. Three subgroups were considered: solid tumour, potentially curative (Group A); solid tumour, palliative (Group B) and haematological cancer (Group C).

Recruitment targets

The primary aim of the study was to evaluate the hypothesised scale structure of the EORTC QLQ-ELD15. The target sample size of 450 (225 patients each in groups A and B) was determined by the number of items in the questionnaire (15) and the accepted ‘rule of thumb’ that 15 responses per item are needed (Johnson et al, 2011). Additionally, 50 Group C participants were recruited for comparison with the solid tumour patients.
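The arithmetic behind the recruitment target can be written out explicitly. The symbols below are introduced here for illustration only; the figures are those stated above.

```latex
\[
n_{\text{group}} = 15 \ \text{items} \times 15 \ \text{responses per item} = 225,
\qquad
n_{\text{target}} = n_{A} + n_{B} = 225 + 225 = 450 .
\]
```

The 50 Group C participants were recruited in addition to this target.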

Ethical and research governance approvals were obtained at each centre in accordance with local requirements and all patients provided written informed consent. The EORTC Quality of Life Group approved the protocol. The study was coordinated from Southampton and collaborators met every 6 months at EORTC Quality of Life Study Group meetings.

Questionnaires and data collection

All patients completed the EORTC QLQ-C30 (version 3.0) and QLQ-ELD15 at baseline. A ‘not applicable’ option was added at the request of the UK ethics committee to the three items of the QLQ-ELD15 which mentioned family (questions 35–37). In 29% of baseline questionnaires, the ‘not applicable’ option was omitted in error. A subset of patients, who were predicted to be clinically stable, completed the questionnaires again 1 week later (test–retest analysis) and a different subset, predicted to have a different clinical status, completed the questionnaires again 3 months after baseline (response to change analysis, RCA).

EORTC translation guidelines (Koller et al, 2007) were used to produce questionnaires in all the relevant languages. The QLQ-ELD15 contains 15 items in five scales: mobility (Q31–34), family support (Q35–36), worries about the future (Q37–41), maintaining autonomy and purpose (Q42–43), and burden of illness (Q44–45) (Johnson et al, 2010). All responses were converted to a score between 0 and 100 using a linear transformation following EORTC guidelines (Fayers et al, 2001). High scores indicate poor mobility, good family support, much worry about the future, good maintenance of autonomy and purpose, and high burden of illness.
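As an illustration of the linear transformation, the following is a minimal sketch rather than the scoring syntax used in the study. It assumes four-point items coded 1 to 4, a raw score equal to the mean of the answered items, and a missing-data rule under which a scale is scored only when at least half of its items were answered; the function name `scale_score` is hypothetical.

```python
from typing import Optional, Sequence


def scale_score(items: Sequence[Optional[int]],
                item_min: int = 1,
                item_max: int = 4) -> Optional[float]:
    """Return a 0-100 score from the mean of the answered items.

    Assumptions (not taken from the study text): items are coded
    item_min..item_max, missing answers are None, and a scale is only
    scored when at least half of its items were answered.
    """
    answered = [x for x in items if x is not None]
    if not answered or len(answered) * 2 < len(items):
        return None  # too many missing answers to score the scale
    raw = sum(answered) / len(answered)  # raw score = mean of answered items
    # Linear rescaling so that item_min maps to 0 and item_max maps to 100.
    return (raw - item_min) / (item_max - item_min) * 100.0


if __name__ == "__main__":
    # Hypothetical responses to a three-item scale.
    print(scale_score([2, 3, 1]))
```

With this direct transformation, a higher raw response always gives a higher score, which matches the interpretation of high scores given above.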

At baseline, participants completed a debriefing questionnaire that recorded time for completion, whether any help was needed and whether any of the items were upsetting, confusing or difficult to answer. Additional comments were invited. Sociodemographic and clinical data were recorded at each completion of the questionnaires, along with the Charlson Comorbidity Index (Charlson et al, 1987), Eastern Cooperative Oncology Group (ECOG) Common Toxicity Criteria and Performance Status (Oken et al, 1982), G-8 Geriatric Screening tool (Bellera et al, 2012) and Instrumental Activities of Daily Living (IADL) (Lawton and Brody, 1969).

Statistical analysis

Standard psychometric analyses were employed to evaluate the QLQ-ELD15. All analyses were performed using Stata/IC version 12 statistical software (Stata Corporation, College Station, TX, USA).

Scaling

The construct validity of the QLQ-ELD15, that is whether the individual items composing the questionnaire could be aggregated into the five hypothesised scales described above, was examined using multi-trait scaling. Construct validity comprises convergent validity and discriminant validity. Convergent validity is demonstrated when an item correlates highly with its own hypothesised scale, defined as a correlation of ≥0.40 (corrected for overlap) (Fayers and Machin, 2007). Discriminant validity is demonstrated when an item does not correlate highly with the scales it is not part of. Discriminant validity was supported and scaling success was identified when the correlation between an item and its hypothesised scale (corrected for overlap) was >2 standard errors higher than its correlation with other scales. Scaling failures were identified when an item correlated lower with its hypothesised scale (corrected for overlap) than with other scales.
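The item-scale comparison described above can be sketched as follows. This is an illustrative outline, not the Stata code used in the study. It assumes complete data held as one list of item responses per patient, and it approximates the standard error of a correlation as 1/sqrt(n), which is an assumption made here; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_scale_correlation(responses: List[Sequence[float]],
                           item: int,
                           scale_items: Sequence[int]) -> float:
    """Correlate one item with a scale, corrected for overlap.

    The item itself is dropped from the scale before the scale mean is
    computed, so an item is never correlated with a total containing it.
    The scale must contain at least one other item.
    """
    others = [j for j in scale_items if j != item]
    item_vals = [row[item] for row in responses]
    scale_vals = [sum(row[j] for j in others) / len(others) for row in responses]
    return pearson(item_vals, scale_vals)


def classify(r_own: float, r_other: float, n: int) -> str:
    """Label one item-versus-other-scale comparison."""
    se = 1.0 / math.sqrt(n)  # assumed approximation of the standard error
    if r_own - r_other > 2 * se:
        return "success"
    if r_own < r_other:
        return "failure"
    return "indeterminate"
```

Each item would be compared in this way against every scale other than its own, and an item counts as a scaling failure if any such comparison returns "failure".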

Exploratory factor analysis (EFA), using principal factors and oblique promax rotation, was used to explore the factor structure of the QLQ-ELD15 (Fayers and Machin, 2007). The first model tested was based on the hypothesised five-scale structure described above. Item response theory (IRT) analyses were also used to check the proposed scale structure (Fayers and Machin, 2007).

Reliability

Two types of reliability were assessed: internal reliability, tested by examining the homogeneity of the multi-item scales, and test–retest reliability, tested by checking whether the same responses are given when the instrument is completed on two separate occasions a short time apart. The internal reliability of the QLQ-ELD15 was explored using Cronbach’s α coefficient, with a value of ≥0.70 regarded as adequate (Fayers and Machin, 2007). The test–retest reliability of scales was examined using intraclass correlations (ICC) on the scores from assessments 1 and 2, with an ICC of ≥0.70 regarded as adequate.
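Both reliability coefficients can be illustrated with a short sketch. Cronbach’s α is computed from the item variances and the variance of the summed score. The form of ICC used in the study is not stated, so a one-way random-effects ICC is shown here purely as an assumption; the function names are hypothetical and complete data are assumed.

```python
from statistics import variance
from typing import List, Sequence, Tuple


def cronbach_alpha(responses: List[Sequence[float]]) -> float:
    """Cronbach's alpha for one multi-item scale.

    responses holds one sequence of item responses per patient.
    alpha = k/(k-1) * (1 - sum of item variances / variance of the total).
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_one_way(pairs: List[Tuple[float, float]]) -> float:
    """One-way random-effects ICC for two assessments per patient.

    This ICC form is an assumption made for illustration only.
    """
    n, k = len(pairs), 2
    subject_means = [(a + b) / k for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-patient and within-patient mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, subject_means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Either coefficient would then be compared against the 0.70 threshold described above.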

Convergent validity

To assess scale-convergent validity, correlations between conceptually related scales on the QLQ-ELD15 and QLQ-C30 were examined using Pearson’s product moment correlation. It was expected that those scales that are conceptually related would correlate substantially with one another (Pearson’s r>0.40). These scales were mobility (QLQ-ELD15) vs physical functioning (QLQ-C30), worries about the future (QLQ-ELD15) vs emotional functioning (QLQ-C30), maintaining autonomy and purpose (QLQ-ELD15) vs role functioning (QLQ-C30), and burden of illness (QLQ-ELD15) vs global health/QOL (QLQ-C30).

Known-group comparisons

The extent to which the QLQ-ELD15 differentiates between groups of patients was assessed using the method of known-group comparisons (Fayers and Machin, 2007). The worries about the future scale was predicted to differentiate subgroups based on treatment intention and disease stage. Subgroups based on Charlson comorbidity, ECOG and G-8 score, were predicted to be differentiated by mobility, worries about the future, maintaining autonomy and purpose, and burden of illness scales.
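A known-group comparison of this kind can be illustrated with a one-way analysis of variance, in which the scale scores of two or more clinical subgroups are compared through an F-statistic. The sketch below is an assumption about how such a comparison might be computed, not the study's analysis code; the function name `one_way_f` and the example scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def one_way_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical 0-100 scale scores for two subgroups.
    curative = [25.0, 33.3, 41.7, 16.7]
    palliative = [50.0, 58.3, 66.7, 41.7]
    print(one_way_f([curative, palliative]))
```

The P-value would be obtained from the F distribution with the returned degrees of freedom, which is left to a statistics package here.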

Responsiveness to change analysis

We used t-tests to test for the significance of changes in scores at the two assessment times. We expected changes on all the scales.
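For scores from the same patients at two assessments, a paired t-test is a natural choice. The text does not state which form of t-test was used, so the paired form in the sketch below is an assumption, and the function name `paired_t` is hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(baseline: Sequence[float],
             follow_up: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a paired t-test.

    Each position in the two sequences must refer to the same patient.
    """
    diffs = [b - a for a, b in zip(baseline, follow_up)]
    n = len(diffs)
    # t = mean change divided by the standard error of the mean change.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1
```

The two-sided P-value would then be read from the t distribution with n − 1 degrees of freedom.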

Results

Of 518 patients recruited, 275 were in Group A, 170 were in Group B and 54 were in Group C. Nineteen patients with solid tumours without information on treatment intention were assigned to an additional Group D. Further, 176 patients were from Northern Europe, 147 from Western Europe, 116 from Southern Europe and 79 from the rest of the world. Patient sociodemographic and clinical details are summarised in Table 1. The time taken to complete the QLQ-ELD15 was recorded for 416 participants; 391 took ≤15 min. Help to complete the questionnaire was required by 209 patients, predominantly with reading and/or writing. Forty-five patients reported finding at least one of the questions confusing or difficult to answer and 22 found at least one question upsetting, but no question was found difficult or upsetting by more than 6 patients. A few patients provided additional comments: five patients queried why all the questions referred to the last week, two patients suggested that their answers were predominantly determined by their age and other illnesses, and one patient commented on how his responses were context-dependent.

Table 1 Patient sociodemographic and clinical details

The responses of Groups A and B combined together to the QLQ-ELD15 were compared graphically with those of Group C (data not shown). The distributions of responses were very similar. In addition, differential item functioning confirmed that there were no significant differences in the response probabilities across all items for the two groups (probability range: 0.21–1.00). This suggests that there is no difference in the responses given by patients with solid and haematological cancers who have the same HRQOL. The decision was therefore made that no additional haematological patients were required, and that it was reasonable to combine the data from solid and haematological malignancies.

Item and scale structure review

The authors reviewed the content of any item identified as confusing or upsetting. Item 35, ‘Has your relationship with your family become closer?’, was removed because the wording was problematic (patients who were already close to their family found this difficult to answer) and the time frame was inappropriate (unlikely to be applicable for most people in the last week). This left just one item, item 36, ‘Have you felt able to talk to your family about your illness?’, in the family support scale. This item was retained as a single item because it is important and relevant to elderly patients. An EFA (not shown) suggested that the hypothesised future worries scale should be split into two, with items 37 and 38 forming one scale (worries about others) and items 39–41 forming another scale (future worries). The EFA also indicated that item 32, ‘Have you had trouble with your joints (e.g. stiffness and pain)?’, contributed little to the mobility scale. Item response theory analyses (not shown) also supported removing item 32 from the mobility scale as it showed poor fit with the other items and contributed little additional information to the scale. The authors decided to retain item 32 as a single item because of its clinical relevance. Table 2 shows the items and revised scale structure of the EORTC QLQ-ELD14 (note the change from QLQ-ELD15). The QLQ-ELD14 comprises 14 items, made up of 5 scales and 2 single items. Item response theory analyses of the revised scales did not reveal any poorly performing items, and inspection of the contribution of each item to the respective scale total-information plots supported the proposed scale structure and the retention of items. The results reported below are all based on the QLQ-ELD14.

Table 2 EORTC QLQ-ELD14

Scaling

Results from the multi-trait scaling analyses are shown in Table 3. There were no scaling failures. The EFA supported the proposed scale structure but the single item joint stiffness correlated with the three-item mobility scale (r=0.48). Exploratory factor analysis therefore suggested a single composite score that combined all four items. However, joint stiffness is conceptually different from the other mobility scale items and as the correlation was modest, we decided to retain joint stiffness as a separate single item.

Table 3 Multi-trait scaling analyses and reliability of the scales in the QLQ-ELD14

Reliability

Table 3 indicates that all the scales met the criterion for internal consistency except the maintaining purpose scale, which fell just short of our chosen threshold for adequate internal consistency (0.68 vs 0.70). For the test–retest analysis, ICCs were adequate for four scales and one single item. The low ICCs for burden of illness and family support were reflected by a statistically significant reduction in burden (P=0.003) and increase in family support (P=0.009). Because of the unexpected differences, a similar test–retest analysis was carried out for the QLQ-C30; this showed a significant worsening of physical (P=0.020), role (P=0.015) and social functioning (P=0.014).

Convergent validity

Correlations between the QLQ-C30 and QLQ-ELD14 are shown in Table 4. Three of four scale pairs predicted to be conceptually related did correlate substantially with one another (r>0.4), but the maintaining purpose (QLQ-ELD14) and role functioning (QLQ-C30) scales did not correlate well (r=0.18). Other correlations with r>0.4 that had not been predicted a priori were mobility (QLQ-ELD14) with social and role functioning, and with global health/QOL; burden of illness (QLQ-ELD14) with the physical, social and role functioning scales; the single item joint stiffness with physical functioning, and the future worries scale with social functioning.

Table 4 Pearson’s product moment correlations between QLQ-ELD14 and QLQ-C30 scales

Known-group comparisons

Table 5 shows the significant P-values in the known-group comparisons analyses. For the disease stage and treatment intention analyses, the only differences were on the future worries scale. Mobility, joint stiffness and maintaining purpose discriminated between patients with differing numbers of comorbidities. Patients above and below the cutoff (a score of 14) on the G-8 scored differently on each of the five multi-item scales, but not the two single items, and all five scales and both single items differentiated patients with different ECOG scores.

Table 5 Means, F-statistics and probability values for significant results on the known-groups analysis

Responsiveness to change analysis

Although patients likely to show a change in clinical status were selected for the RCA, many of those included remained stable. We therefore used the ECOG to define groups for the RCA. We predicted that patients who improved on the ECOG would also improve on the mobility scale, and that patients whose performance status declined would have higher scores (worse mobility). Patients with worse ECOG (n=13) had significantly worse scores on the mobility scale (P=0.038). There was no improvement on the mobility scale in 19 patients with improved ECOG (P=0.58).

Discussion

This study examined the reliability, validity and psychometric properties of the EORTC QLQ-ELD15 in an international sample of 518 elderly patients, across 10 countries and in 8 languages. One item was removed from the module, due to problems with wording and content. The revised QLQ-ELD14 comprises five scales (mobility, worries about others, future worries, maintaining purpose and burden of illness) and two single items (joint stiffness and family support). The questionnaire is appropriate for patients with all types of malignancy and provides a patient-reported measure of HRQOL in line with the views expressed by patients during the development process (Johnson et al, 2010). Unlike EORTC site-specific modules, the QLQ-ELD14 has a strong focus on psychosocial issues. It is able to discriminate between groups of patients defined by disease stage, number of comorbidities, treatment intention, performance status and normal or abnormal G-8 score.

Almost all patients completed the questionnaire in less than 15 min, although many needed help with reading the questions and filling out the answers, usually because reading glasses were not available. No major omissions were identified in the debriefing interviews and <1.5% of patients described any item as difficult to understand or upsetting. We conclude that the QLQ-ELD14 is acceptable, quick and easy to complete, and has good content validity. Convergent validity was established by substantial correlations between mobility and physical functioning, worries about the future and emotional functioning, and burden of illness and global health/QOL. Maintaining purpose and role functioning had also been predicted to correlate substantially with each other, but only a weak correlation was observed (r=0.18). This may be explained by the different emphasis of the scales: although both scales ask about hobbies and usual activities, the maintaining purpose scale covers motivation and ‘positive outlook’ while the role functioning scale asks about limitations in ability to work and perform daily activities.

Along with the predicted substantial correlations there were a number of relationships that had not been anticipated. However, all these associations were plausible. For example, mobility correlated with social and role functioning, both of which are concerned with whether physical condition had an impact on everyday life (either family/social life or work/hobbies). It seems reasonable that mobility can have an effect on these activities. Mobility also correlated with the global health and QOL score. These observations emphasise the central importance of mobility to HRQOL in elderly cancer patients. Joint stiffness was retained as a separate item because psychometric analyses indicated that it should not be part of the mobility scale but it had been strongly supported by patients as an important issue in the Phase 1 qualitative interviews (Johnson et al, 2010).

Previous development work (Johnson et al, 2010) had not tested the questionnaire in patients with haematological cancers. Differential item functioning and comparison of the response patterns found no evidence of any differences between patients with solid tumours and those with haematological cancers, suggesting that the QLQ-ELD14 is appropriate for patients with haematological malignancies.

Although the reliability analysis showed that the maintaining purpose scale fell just short of the threshold for adequate internal consistency, this scale has good face validity. The weaker internal consistency suggests differences between the two concepts in this scale (positive outlook and motivation for activities), but it was agreed to retain the scale in its original form.

The test–retest reliability of the instrument was generally good. Unexpectedly, there was a significant improvement in the family support item and a significant reduction in illness burden. All the patients appeared clinically stable, although it was not possible to corroborate this with objective measures. There were also some unexpected changes on the QLQ-C30 between the two time points, with physical, role and social functioning all getting significantly worse. It is possible that these changes in response to the QLQ-C30 and the QLQ-ELD14 were influenced by non-clinical factors such as interaction with family, which may have changed subjective feelings of dependency and functional ability. For example, more family support during the last week may have reduced the patient’s perception of disease burden and need for physical activity.

Responsiveness to change was difficult to assess because many patients selected for RCA did not show a change in their clinical status. We therefore defined groups with an objective measure of change (ECOG worse or better). Patients with a lower function scored worse on the mobility scale on the second administration. There was no improvement in the scale score for patients with improved ECOG category. As in the test–retest analysis, context may have influenced the RCA. In retrospect, it would have been better to select patients for the RCA using documented evidence of change, rather than a prediction. Although the RCA was equivocal, we feel that this relative weakness is outweighed by the strengths of the QLQ-ELD14: content, convergent and face validity are all good. Given that everyday life events have an impact on QOL over a short period in a general sample of elderly participants (Bowling, 2009), it is likely everyday life events will also have a significant role in the HRQOL of elderly cancer patients, and so should be assessed alongside HRQOL.

Potential applications of the QLQ-ELD14 complement physician-rated data and the existing QLQ-C30 and site-specific modules of the EORTC HRQOL questionnaires. The QLQ-C30 is a generic cancer questionnaire; the site-specific modules explore HRQOL issues related to specific tumour types in depth. The QLQ-ELD14 addresses generic issues affecting older people with cancer, not covered by the QLQ-C30 or site-specific modules, and can be used in clinical studies that include older patients, regardless of tumour site. Perhaps its greatest use will be in the evaluation of the effect of changes in cancer services across a range of tumour sites.

Conclusion

The EORTC QLQ-ELD14 in conjunction with the QLQ-C30 is the first age-specific instrument for assessing HRQOL in cancer patients and is suitable, acceptable and validated for patients aged >70 years. Factors other than clinical status may affect elderly patients more than younger patients and future studies should explore this hypothesis.