Chapter 2: Determining the scope of the review and the questions it will address

James Thomas, Dylan Kneale, Joanne E McKenzie, Sue E Brennan, Soumyadeep Bhaumik

Key Points:

Systematic reviews should address answerable questions and fill important gaps in knowledge.
Developing good review questions takes time, expertise and engagement with intended users of the review.
Cochrane Reviews can focus on broad questions, or be more narrowly defined. There are advantages and disadvantages of each.
Logic models are a way of documenting how interventions, particularly complex interventions, are intended to ‘work’, and can be used to refine review questions and the broader scope of the review.
Using priority-setting exercises, involving relevant stakeholders, and ensuring that the review takes account of issues relating to equity can be strategies for ensuring that the scope and focus of reviews address the right questions.

Cite this chapter as: Thomas J, Kneale D, McKenzie JE, Brennan SE, Bhaumik S. Chapter 2: Determining the scope of the review and the questions it will address. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.2 (updated February 2021). Cochrane, 2021. Available from www.training.cochrane.org/handbook.

2.1 Rationale for well-formulated questions

As with any research, the first and most important decision in preparing a systematic review is to determine its focus. This is best done by clearly framing the questions the review seeks to answer. The focus of any Cochrane Review should be on questions that are important to people making decisions about health or health care. These decisions will usually need to take into account both the benefits and harms of interventions (see MECIR Box 2.1.a). Good review questions often take time to develop, requiring engagement not only with the subject area, but also with a wide group of stakeholders (Section 2.4.2).

Well-formulated questions will guide many aspects of the review process, including determining eligibility criteria, searching for studies, collecting data from included studies, structuring the syntheses and presenting findings (Cooper 1984, Hedges 1994, Oliver et al 2017). In Cochrane Reviews, questions are stated broadly as review ‘Objectives’, and operationalized in terms of the studies that will be eligible to answer those questions as ‘Criteria for considering studies for this review’. As well as focusing review conduct, the contents of these sections are used by readers in their initial assessments of whether the review is likely to be directly relevant to the issues they face.

The FINER criteria have been proposed as encapsulating the issues that should be addressed when developing research questions. These state that questions should be Feasible, Interesting, Novel, Ethical, and Relevant (Cummings et al 2007). All of these criteria raise important issues for consideration at the outset of a review and should be borne in mind when questions are formulated.

A feasible review is one that asks a question that the author team is capable of addressing using the evidence available. Issues concerning the breadth of a review are discussed in Section 2.3.1, but in terms of feasibility it is important not to ask a question that will result in retrieving unmanageable quantities of information; up-front scoping work will help authors to define sensible boundaries for their reviews. Likewise, while it can be useful to identify gaps in the evidence base, review authors and stakeholders should be aware of the possibility of asking a question that may not be answerable using the existing evidence (i.e. that will result in an ‘empty’ review, see also Section 2.5.3).

Embarking on a review that authors are interested in is important because reviews are a significant undertaking and review authors need sufficient commitment to see the work through to its conclusion.

A novel review will address a genuine gap in knowledge, so review authors should be aware of any related or overlapping reviews. This reduces duplication of effort, and also ensures that authors understand the wider research context to which their review will contribute. Authors should check for pre-existing syntheses in the published research literature and also for ongoing reviews in the PROSPERO register of systematic reviews before beginning their own review.

Given the opportunity cost involved in undertaking an activity as demanding as a systematic review, authors should ensure that their work is relevant by: (i) involving relevant stakeholders in defining its focus and the questions it will address; and (ii) writing up the review in such a way as to facilitate the translation of its findings to inform decisions. The GRADE framework aims to achieve this, and should be considered throughout the review process, not only when it is being written up (see Chapter 14 and Chapter 15).

Consideration of opportunity costs is also relevant in terms of the ethics of conducting a review, though ethical issues should be considered primarily in terms of the questions that are prioritized for answering and the way that they are framed. Research questions are often not value-neutral, and the way that a given problem is approached can have political implications which can result in, for example, the widening of health inequalities (whether intentional or not). These issues are explored in Section 2.4.3 and Chapter 16.

MECIR Box 2.1.a Relevant expectations for conduct of intervention reviews

C1: Formulating review questions (Mandatory)
Ensure that the review question, and particularly the outcomes of interest, address issues that are important to review users such as consumers, health professionals and policy makers.	Cochrane Reviews are intended to support clinical practice and policy, not just scientific curiosity. The needs of consumers play a central role in Cochrane Reviews and they can play an important role in defining the review question. Qualitative research, i.e. studies that explore the experience of those involved in providing and receiving interventions, and studies evaluating factors that shape the implementation of interventions, might be used in the same way.
C3: Considering potential adverse effects (Mandatory)
Consider any important potential adverse effects of the intervention(s) and ensure that they are addressed.	It is important that adverse effects are addressed in order to avoid one-sided summaries of the evidence. At a minimum, the review will need to highlight the extent to which potential adverse effects have been evaluated in any included studies. Sometimes data on adverse effects are best obtained from non-randomized studies, or qualitative research studies. This does not mean, however, that all reviews must include non-randomized studies.

2.2 Aims of reviews of interventions

Systematic reviews can address any question that can be answered by a primary research study. This Handbook focuses on a subset of all possible review questions: the impact of intervention(s) implemented within a specified human population. Even within these limits, systematic reviews examining the effects of intervention(s) can vary quite markedly in their aims. Some will focus specifically on evidence of an effect of an intervention compared with a specific alternative, whereas others may examine a range of different interventions. Reviews that examine multiple interventions and aim to identify which might be the most effective can be broader and more challenging than those looking at single interventions. These can also be the most useful for end users, where decision making involves selecting from a number of intervention options. The incorporation of network meta-analysis as a core method in this edition of the Handbook (see Chapter 11) reflects the growing importance of these types of reviews.

As well as looking at the balance of benefit and harm that can be attributed to a given intervention, reviews within the ambit of this Handbook might also aim to investigate the relationship between the size of an intervention effect and other characteristics, such as aspects of the population, the intervention itself, how the outcome is measured, or the methodology of the primary research studies included. Such approaches might be used to investigate which components of multi-component interventions are more or less important or essential (and when). While it is not always necessary to know how an intervention achieves its effect for it to be useful, many reviews will aim to articulate an intervention’s mechanisms of action (see Section 2.5.1), either by making this an explicit aim of the review itself (see Chapter 17 and Chapter 21), or when describing the scope of the review. Understanding how an intervention works (or is intended to work) can be an important aid to decision makers in assessing the applicability of the review to their situation. These investigations can be assisted by the incorporation of results from process evaluations conducted alongside trials (see Chapter 21). Further, many decisions in policy and practice are at least partially constrained by the resources available, so review authors often need to consider the economic context of interventions (see Chapter 20).

2.3 Defining the scope of a review question

Studies comparing healthcare interventions, notably randomized trials, use the outcomes of participants to compare the effects of different interventions. Statistical syntheses (e.g. meta-analysis) focus on comparisons of interventions, such as a new intervention versus a control intervention (which may represent conditions of usual practice or care), or the comparison of two competing interventions. Throughout the Handbook we use the terminology experimental intervention versus comparator intervention. This implies a need to identify one of the interventions as experimental, and is used only for convenience since all methods apply to both controlled and head-to-head comparisons. The contrast between the outcomes of two groups treated differently is known as the ‘effect’, the ‘treatment effect’ or the ‘intervention effect’; we generally use the last of these throughout the Handbook.

A statement of the review’s objectives should begin with a precise statement of the primary objective, ideally in a single sentence (MECIR Box 2.3.a). Where possible the style should be of the form ‘To assess the effects of [intervention or comparison] for [health problem] in [types of people, disease or problem and setting if specified]’. This might be followed by one or more secondary objectives, for example relating to different participant groups, different comparisons of interventions or different outcome measures. The detailed specification of the review question(s) requires consideration of several key components (Richardson et al 1995, Counsell 1997) which can often be encapsulated by the ‘PICO’ mnemonic, an acronym for Population, Intervention, Comparison(s) and Outcome. Equal emphasis in addressing, and equal precision in defining, each PICO component is not necessary. For example, a review might concentrate on competing interventions for a particular stage of breast cancer, with stage and severity of the disease being defined very precisely; or alternatively focus on a particular drug for any stage of breast cancer, with the treatment formulation being defined very precisely.

Throughout the Handbook we make a distinction between three different stages in the review at which the PICO construct might be used. This division is helpful for understanding the decisions that need to be made:

The review PICO (planned at the protocol stage) is the PICO on which eligibility of studies is based (what will be included and what excluded from the review).
The PICO for each synthesis (also planned at the protocol stage) defines the question that each specific synthesis aims to answer, determining how the synthesis will be structured and specifying planned comparisons (including intervention and comparator groups, and any grouping of outcomes and population subgroups).
The PICO of the included studies (determined at the review stage) is what was actually investigated in the included studies.
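
The three stages above can be pictured as three uses of the same four-part record. The following is only an illustrative sketch in Python: the class names `PICO` and `ReviewPlan`, the field names, and all of the example values (including the trial label) are hypothetical and are not part of the Handbook or of any Cochrane software. The example values loosely follow the antiplatelet illustration used in Section 2.3.1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PICO:
    """One population/intervention/comparator/outcome specification."""
    population: str
    intervention: str
    comparator: str
    outcomes: Tuple[str, ...]


@dataclass
class ReviewPlan:
    """Hypothetical container for the three stages at which PICO is used."""
    # Protocol stage: defines which studies are eligible for the review.
    review_pico: PICO
    # Protocol stage: one entry per planned synthesis (comparison).
    synthesis_picos: List[PICO] = field(default_factory=list)
    # Review stage: what each included study actually investigated.
    study_picos: Dict[str, PICO] = field(default_factory=dict)


# Illustrative values only; none of these are taken from a real review.
plan = ReviewPlan(
    review_pico=PICO(
        population="people at risk of thrombotic events",
        intervention="any antiplatelet agent",
        comparator="any comparator",
        outcomes=("thrombotic events",),
    )
)

# A broad review PICO can still contain narrow synthesis PICOs.
plan.synthesis_picos.append(
    PICO(
        population="older people with a previous stroke",
        intervention="aspirin",
        comparator="no antiplatelet treatment",
        outcomes=("stroke",),
    )
)

# Recorded during the review, once studies have been included.
plan.study_picos["Hypothetical Trial A"] = PICO(
    population="adults aged 70 and over with a previous stroke",
    intervention="aspirin, daily",
    comparator="placebo",
    outcomes=("stroke", "major bleeding"),
)
```

Keeping the three levels separate in this way makes it easier to see where a planned synthesis is narrower than the review's eligibility criteria, and where the included studies differ from what was planned.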

Reaching the point where it is possible to articulate the review’s objectives in the above form – the review PICO – requires time and detailed discussion between potential authors and users of the review. It is important that those involved in developing the review’s scope and questions have a good knowledge of the practical issues that the review will address as well as the research field to be synthesized. Developing the questions is a critical part of the research process. As such, there are methodological issues to bear in mind, including: how to determine which questions are most important to answer; how to engage stakeholders in question formulation; how to account for changes in focus as the review progresses; and considerations about how broad (or narrow) a review should be.

MECIR Box 2.3.a Relevant expectations for conduct of intervention reviews

C2: Predefining objectives (Mandatory)
Define in advance the objectives of the review, including population, interventions, comparators and outcomes (PICO).	Objectives give the review focus and must be clear before appropriate eligibility criteria can be developed. If the review will address multiple interventions, clarity is required on how these will be addressed (e.g. summarized separately, combined or explicitly compared).

2.3.1 Broad versus narrow reviews

The questions addressed by a review may be broad or narrow in scope. For example, a review might address a broad question regarding whether antiplatelet agents in general are effective in preventing all thrombotic events in humans. Alternatively, a review might address whether a particular antiplatelet agent, such as aspirin, is effective in decreasing the risks of a particular thrombotic event, stroke, in elderly persons with a previous history of stroke. Increasingly, reviews are becoming broader, aiming, for example, to identify which intervention – out of a range of treatment options – is most effective, or to investigate how the effect of an intervention varies depending on implementation and participant characteristics.

Overviews of reviews (see Chapter V), in which multiple reviews are summarized, can be one way of addressing the need for breadth when synthesizing the evidence base, since they can summarize multiple reviews of different interventions for the same condition, or multiple reviews of the same intervention for different types of participants. It may be considered desirable to plan a series of reviews with a relatively narrow scope, alongside an Overview to summarize their findings. Alternatively, it may be more useful – particularly given the growth in support for network meta-analysis – to combine comparisons of different treatment options within the same review (see Chapter 11). When deciding whether or not an overview might be the most appropriate approach, review authors should take account of the breadth of the question being asked and the resources available. Some questions are simply too broad for a review of all relevant primary research to be practicable, and if a field has sufficient high-quality reviews, then the production of another review of primary research that duplicates the others might not be a sensible use of resources.

Some of the advantages and disadvantages of broad and narrow reviews are summarized in Table 2.3.a. While having a broad scope in terms of the range of participants has the potential to increase generalizability, the extent to which findings are ultimately applicable to broader (or different) populations will depend on the participants who have actually been recruited into research studies. Likewise, heterogeneity can be a disadvantage when the expectation is for homogeneity of effects between studies, but an advantage when the review question seeks to understand differential effects (see Chapter 10).

A distinction should be drawn between the scope of a review and the precise questions within, since it is possible to have a broad review that addresses quite narrow questions. In the antiplatelet agents for preventing thrombotic events example, a systematic review with a broad scope might include all available treatments. Rather than combining all the studies into one comparison though, specific treatments would be compared with one another in separate comparisons, thus breaking a heterogeneous set of treatments into narrower, more homogeneous groups. This relates to the three levels of PICO, outlined in Section 2.3. The review PICO defines the broad scope of the review, and the PICO for comparison defines the specific treatments that will be compared with one another; Chapter 3 elaborates on the use of PICOs.

In practice, a Cochrane Review may start (or have started) with a broad scope, and be divided up into narrower reviews as evidence accumulates and the original review becomes unwieldy. This may be done for practical and logistical reasons, for example to make updating easier as well as to make it easier for readers to see which parts of the evidence base are changing. Individual review authors must decide if there are instances where splitting a broader focused review into a series of more narrowly focused reviews is appropriate and implement appropriate methods to achieve this. If a major change is to be undertaken, such as splitting a broad review into a series of more narrowly focused reviews, a new protocol must be written for each of the component reviews that documents the eligibility criteria for each one.

Ultimately, the selected breadth of a review depends upon multiple factors including perspectives regarding a question’s relevance and potential impact; supporting theoretical, biologic and epidemiological information; the potential generalizability and validity of answers to the questions; and available resources. As outlined in Section 2.4.2, authors should consider carefully the needs of users of the review and the context(s) in which they expect the review to be used when determining the optimal scope for their review.

Table 2.3.a Some advantages and disadvantages of broad versus narrow reviews

Choice of population
e.g. corticosteroid injection for shoulder tendonitis (narrow) or corticosteroid injection for any tendonitis (broad)

Broad scope, advantages:
Comprehensive summary of the evidence.
Opportunity to explore consistency of findings (and therefore generalizability) across different types of participants.

Narrow scope, advantages:
Manageability for review team.
Ease of reading.

Broad scope, disadvantages:
Searching, data collection, analysis and writing may require more resources.
Interpretation may be difficult for readers if the review is large and lacks a clear rationale (such as examining consistency of findings) for including diverse types of participants.

Narrow scope, disadvantages:
Evidence may be sparse.
Unable to explore whether an intervention operates differently in other settings or populations (e.g. inability to explore differential effects that could lead to inequity).
Increased burden for decision makers if multiple reviews must be accessed (e.g. if evidence is sparse for the population of interest).
Scope could be chosen by review authors to produce a desired result.

Mode of intervention
e.g. supervised running for depression (narrow) or any exercise for depression (broad)

Broad scope, advantages:
Comprehensive summary of the evidence.
Opportunity to explore consistency of findings across different implementations of the intervention.

Narrow scope, advantages:
Manageability for review team.
Ease of reading.

Broad scope, disadvantages:
Searching, data collection, analysis and writing may require more resources.
Interpretation may be difficult for readers if the review is large and lacks a clear rationale (such as examining consistency of findings) for including different modes of an intervention.

Narrow scope, disadvantages:
Evidence may be sparse.
Unable to explore whether different modes of an intervention modify the intervention effects.
Increased burden for decision makers if multiple reviews must be accessed (e.g. if evidence is sparse for a specific mode).
Scope could be chosen by review authors to produce a desired result.

Choice of interventions and comparators
e.g. oxybutynin compared with desmopressin for preventing bed-wetting (narrow) or interventions for preventing bed-wetting (broad)

Broad scope, advantages:
Comprehensive summary of the evidence.
Opportunity to compare the effectiveness of a range of different intervention options.

Narrow scope, advantages:
Manageability for review team.
Relative simplicity of objectives and ease of reading.

Broad scope, disadvantages:
Searching, data collection, analysis and writing may require more resources.
May be unwieldy, and more appropriate to present as an Overview of reviews (see Chapter V).

Narrow scope, disadvantages:
Increased burden for decision makers if not included in an Overview since multiple reviews may need to be accessed.

2.3.2 ‘Lumping’ versus ‘splitting’

It is important not to confuse the issue of the breadth of the review (determined by the review PICO) with concerns about between-study heterogeneity and the legitimacy of combining results from diverse studies in the same analysis (determined by the PICOs for comparison).

Broad reviews have been criticized as ‘mixing apples and oranges’, and one of the inventors of meta-analysis, Gene Glass, has responded “Of course it mixes apples and oranges… comparing apples and oranges is the only endeavour worthy of true scientists; comparing apples to apples is trivial” (Glass 2015). In fact, the two concepts (‘broad reviews’ and ‘mixing apples and oranges’) are different issues. Glass argues that broad reviews, with diverse studies, provide the opportunity to ask interesting questions about the reasons for differential intervention effects.

The ‘apples and oranges’ critique refers to the inappropriate mixing of studies within a single comparison, where the purpose is to estimate an average effect. In situations where good biologic or sociological evidence suggests that various formulations of an intervention behave very differently or that various definitions of the condition of interest are associated with markedly different effects of the intervention, the uncritical aggregation of results from quite different interventions or populations/settings may well be questionable.

Unfortunately, determining the situations where studies are similar enough to combine with one another is not always straightforward, and it can depend, to some extent, on the question being asked. While the decision is sometimes characterized as ‘lumping’ (where studies are combined in the same analysis) or ‘splitting’ (where they are not) (Squires et al 2013), it is better to consider these issues on a continuum, with reviews that have greater variation in the types of included interventions, settings and populations, and study designs being towards the ‘lumped’ end, and those that include little variation in these elements being towards the ‘split’ end (Petticrew and Roberts 2006).

While specification of the review PICO sets the boundary for the inclusion and exclusion of studies, decisions also need to be made when planning the PICO for the comparisons to be made in the analysis as to whether they aim to address broader (‘lumped’) or narrower (‘split’) questions (Caldwell and Welton 2016). The degree of ‘lumping’ in the comparisons will be primarily driven by the review’s objectives, but will sometimes be dictated by the availability of studies (and data) for a particular comparison (see Chapter 9 for discussion of the latter). The former is illustrated by a Cochrane Review that examined the effects of newer-generation antidepressants for depressive disorders in children and adolescents (Hetrick et al 2012).

Newer-generation antidepressants include multiple different compounds (e.g. paroxetine, fluoxetine). The objectives of this review were to (i) estimate the overall effect of newer-generation antidepressants on depression, (ii) estimate the effect of each compound, and (iii) examine whether the compound type and age of the participants (children versus adolescents) is associated with the intervention effect. Objective (i) addresses a broad, ‘in principle’ (Caldwell and Welton 2016), question of whether newer-generation antidepressants improve depression, where the different compounds are ‘lumped’ into a single comparison. Objective (ii) seeks to address narrower, ‘split’, questions that investigate the effect of each compound on depression separately. Answers to both questions can be identified by setting up separate comparisons for each compound, or by subgrouping the ‘lumped’ comparison by compound (Chapter 10, Section 10.11.2). Objective (iii) seeks to explore factors that explain heterogeneity among the intervention effects, or equivalently, whether the intervention effect varies by the factor. This can be examined using subgroup analysis or meta-regression (Chapter 10, Section 10.11) but, in the case of intervention types, is best achieved using network meta-analysis (see Chapter 11).
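
The difference between the ‘lumped’ comparison of objective (i) and the ‘split’ comparisons of objective (ii) can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the effect estimates and standard errors are invented (they are not taken from Hetrick et al 2012), the function name `pool` is hypothetical, and a simple inverse-variance fixed-effect pooling is used only because it is short. A random-effects model would often be considered instead (see Chapter 10).

```python
import math
from collections import defaultdict

# Invented data for illustration only:
# (compound, effect estimate, standard error of the estimate)
studies = [
    ("fluoxetine", -0.30, 0.10),
    ("fluoxetine", -0.20, 0.10),
    ("paroxetine", -0.10, 0.20),
    ("paroxetine", 0.10, 0.20),
]


def pool(estimates):
    """Inverse-variance fixed-effect pooling.

    `estimates` is a list of (effect, standard_error) pairs.
    Returns (pooled_effect, pooled_standard_error).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Objective (i): one 'lumped' comparison across all compounds.
lumped = pool([(y, se) for _, y, se in studies])

# Objective (ii): 'split' comparisons, one per compound
# (equivalently, the lumped comparison subgrouped by compound).
by_compound = defaultdict(list)
for compound, y, se in studies:
    by_compound[compound].append((y, se))
split = {compound: pool(values) for compound, values in by_compound.items()}

print("lumped:", lumped)
for compound, result in split.items():
    print(compound, result)
```

In this invented example the lumped estimate sits between the two compound-specific estimates and is dominated by the more precise studies, which shows why a single combined effect may not indicate which compound to choose.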

There are various advantages and disadvantages to bear in mind when defining the PICO for the comparison and considering whether ‘lumping’ or ‘splitting’ is appropriate. Lumping allows for the investigation of factors that may explain heterogeneity. Results from these investigations may provide important leads as to whether an intervention operates differently in, for example, different populations (such as in children and adolescents in the example above). Ultimately, this type of knowledge is useful for clinical decision making. However, lumping is likely to introduce heterogeneity, which will not always be explained by a priori specified factors, and this may lead to a combined effect that is clinically difficult to interpret and implement. For example, when multiple intervention types are ‘lumped’ in one comparison (as in objective (i) above), and there is unexplained heterogeneity, the combined intervention effect would not enable a clinical decision as to which intervention should be selected. Splitting comparisons carries its own risk of there being too few studies to yield a useful synthesis. Inevitably, some degree of aggregation across the PICO elements is required for a meta-analysis to be undertaken (Caldwell and Welton 2016).

2.4 Ensuring the review addresses the right questions

Since systematic reviews are intended for use in healthcare decision making, review teams should ensure not only the application of robust methodology, but also that the review question is meaningful for healthcare decision making. Two approaches are discussed below:

Using results from existing research priority-setting exercises to define the review question.
In the absence of, or in addition to, existing research priority-setting exercises, engaging with stakeholders to define review questions and establish their relevance to policy and practice.

2.4.1 Using priority-setting exercises to define review questions

A research priority-setting exercise is a “collective activity for deciding which uncertainties are most worth trying to resolve through research; uncertainties considered may be problems to be understood or solutions to be developed or tested; across broad or narrow areas” (Sandy Oliver, referenced in Nasser 2018). Using research priority-setting exercises to define the scope of a review helps to prevent the waste of scarce resources for research by making the review more relevant to stakeholders (Chalmers et al 2014).

Research priority setting is always conducted in a specific context, setting and population with specific principles, values and preferences (which should be articulated). Different stakeholders’ interpretation of the scope and purpose of a ‘research question’ might vary, resulting in priorities that might be difficult to interpret. Researchers or review teams might find it necessary to translate the research priorities into an answerable PICO research question format, and may find it useful to recheck the question with the stakeholder groups to determine whether it accurately reflects their intentions.

While Cochrane Review teams are in most cases reviewing the effects of an intervention with a global scope, they may find that the priorities identified by important stakeholders (such as the World Health Organization or other organizations or individuals in a representative health system) are informative in planning the review. Review authors may find that differences between different stakeholder groups’ views on priorities and the reasons for these differences can help them to define the scope of the review. This is particularly important for making decisions about excluding specific populations or settings, or being inclusive and potentially conducting subgroup analyses.

Whenever feasible, systematic reviews should be based on priorities identified by key stakeholders such as decision makers, patients/public, and practitioners. Cochrane has developed a list of priorities for reviews led by review groups and networks, in consultation with key stakeholders, which is available on the Cochrane website. Issues relating to equity (see Chapter 16 and Section 2.4.3) need to be taken into account when conducting and interpreting the results from priority-setting exercises. Examples of materials to support these processes are available (Viergever et al 2010, Nasser et al 2013, Tong et al 2017).

The results of research priority-setting exercises can be searched for in electronic databases and via websites of relevant organizations. Examples are: James Lind Alliance, World Health Organization, organizations of health professionals including research disciplines, and ministries of health in different countries (Viergever 2010). Examples of search strategies for identifying research priority-setting exercises are available (Bryant et al 2014, Tong et al 2015).

Other sources of questions are often found in ‘implications for future research’ sections of articles in journals and clinical practice guidelines. Some guideline developers have prioritized questions identified through the guideline development process (Sharma et al 2018), although these priorities will be influenced by the needs of health systems in which different guideline development teams are working.

2.4.2 Engaging stakeholders to help define the review questions

In the absence of a relevant research priority-setting exercise, or when a systematic review is being conducted for a very specific purpose (for example, commissioned to inform the development of a guideline), researchers should work with relevant stakeholders to define the review question. This practice is especially important when developing review questions for studying the effectiveness of health systems and policies, because of the variability between countries and regions; the significance of these differences may only become apparent through discussion with the stakeholders.

The stakeholders for a review could include consumers or patients, carers, health professionals of different kinds, policy decision makers and others (Chapter 1, Section 1.3.1). Identifying the stakeholders who are critical to a particular question will depend on the question, who the answer is likely to affect, and who will be expected to implement the intervention if it is found to be effective (or to discontinue it if not).

Stakeholder engagement should, optimally, be an ongoing process throughout the life of the systematic review, from defining the question to dissemination of results (Keown et al 2008). Engaging stakeholders increases relevance, promotes mutual learning, improves uptake and decreases research waste (see Chapter 1, Section 1.3.1 and Section 1.3.2). However, because such engagement can be challenging and resource intensive, a one-off engagement process to define the review question might be all that is possible. Review questions that are conceptualized and refined by multiple stakeholders can capture much of the complexity that should be addressed in a systematic review.

2.4.3 Considering issues relating to equity when defining review questions

Deciding what should be investigated, who the participants should be, and how the analysis will be carried out can be considered political activities, with the potential for increasing or decreasing inequalities in health. For example, because researchers have chosen to investigate this issue, we now know that well-intended interventions can actually widen inequalities in health outcomes (Lorenc et al 2013). Decision makers can now take account of this knowledge when planning service provision. Authors should therefore consider the potential impact on disadvantaged groups of the intervention(s) that they are investigating, and whether socio-economic inequalities in health might be affected depending on whether or how the intervention(s) are implemented.

Health equity is the absence of avoidable and unfair differences in health (Whitehead 1992). Health inequity may be experienced across characteristics defined by PROGRESS-Plus (Place of residence, Race/ethnicity/culture/language, Occupation, Gender/sex, Religion, Education, Socio-economic status, Social capital, and other characteristics (‘Plus’) such as sexual orientation, age, and disability) (O’Neill et al 2014). Issues relating to health equity should be considered when review questions are developed (MECIR Box 2.4.a). Chapter 16 presents detailed guidance on this issue for review authors.

MECIR Box 2.4.a Relevant expectations for conduct of intervention reviews

C4: Considering equity and specific populations (Highly desirable)
Consider in advance whether issues of equity and relevance of evidence to specific populations are important to the review, and plan for appropriate methods to address them if they are. Attention should be paid to the relevance of the review question to populations such as low socio-economic groups, low- or middle-income regions, women, children and older people.
Where possible reviews should include explicit descriptions of the effect of the interventions not only upon the whole population, but also on the disadvantaged, and/or the ability of the interventions to reduce socio-economic inequalities in health, and to promote use of the interventions to the community.

2.5 Methods and tools for structuring the review

It is important for authors to develop the scope of their review with care: without a clear understanding of where the review will contribute to existing knowledge – and how it will be used – it may be at risk of conceptual incoherence. It may mis-specify critical elements of how the intervention(s) interact with the context(s) within which they operate to produce specific outcomes, and become either irrelevant or possibly misleading. For example, in a systematic review about smoking cessation interventions in pregnancy, it was essential for authors to take account of the way that health service provision has changed over time. The type and intensity of ‘usual care’ in more recent evaluations was equivalent to the interventions being evaluated in older studies, and the analysis needed to take this into account. This review also found that the same intervention can have different effects in different settings depending on whether its materials are culturally appropriate in each context (Chamberlain et al 2017).

In order to protect the review against conceptual incoherence and irrelevance, review authors need to spend time at the outset developing definitions for key concepts and ensuring that they are clear about the prior assumptions on which the review depends. These prior assumptions include, for example, why particular populations should be considered inside or outside the review’s scope; how the intervention is thought to achieve its effect; and why specific outcomes are selected for evaluation. Being clear about these prior assumptions also requires review authors to consider the evidential basis for these assumptions and decide for themselves which they can place more or less reliance on. When considered as a whole, this initial conceptual and definitional work states the review’s conceptual framework. Each element of the review’s PICO raises its own definitional challenges, which are discussed in detail in Chapter 3.

In this section we consider tools that may help to define the scope of the review and the relationships between its key concepts; in particular, articulating how the intervention gives rise to the outcomes selected. In some situations, long sequences of events are expected to occur between an intervention being implemented and an outcome being observed. For example, a systematic review examining the effects of asthma education interventions in schools on children’s health and well-being needed to consider: the interplay between core intervention components and their introduction into differing school environments; different child-level effect modifiers; how the intervention then had an impact on the knowledge of the child (and their family); the child’s self-efficacy and adherence to their treatment regime; the severity of their asthma; the number of days of restricted activity; how this affected their attendance at school; and finally, the distal outcomes of education attainment and indicators of child health and well-being (Kneale et al 2015).

Several specific tools can help authors to consider issues raised when defining review questions and planning their review; these are also helpful when developing eligibility criteria and classifying included studies. These include the following.

Taxonomies: hierarchical structures that can be used to categorize (or group) related interventions, outcomes or populations.
Generic frameworks for examining and structuring the description of intervention characteristics (e.g. TIDieR for the description of interventions (Hoffmann et al 2014), iCAT_SR for describing multiple aspects of complexity in systematic reviews (Lewin et al 2017)).
Core outcome sets for identifying and defining agreed outcomes that should be measured for specific health conditions (described in more detail in Chapter 3).

Unlike these tools, which focus on particular aspects of a review, logic models provide a framework for planning and guiding synthesis at the review level (see Section 2.5.1).

2.5.1 Logic models

Logic models (sometimes referred to as conceptual frameworks or theories of change) are graphical representations of theories about how interventions work. They depict intervention components, mechanisms (pathways of action), outputs, and outcomes as sequential (although not necessarily linear) chains of events. Among systematic review authors, they were originally proposed as a useful tool when working with evaluations of complex social and population health programmes and interventions, to conceptualize the pathways through which interventions are intended to change outcomes (Anderson et al 2011).

In reviews where intervention complexity is a key consideration (see Chapter 17), logic models can be particularly helpful. For example, in a review of psychosocial group interventions for those with HIV, a logic model was used to show how the intervention might work (van der Heijden et al 2017). The review authors depicted proximal outcomes, such as self-esteem, but chose only to include psychological health outcomes in their review. In contrast, Bailey and colleagues included proximal outcomes in their review of computer-based interventions for sexual health promotion, using a logic model to show how outcomes were grouped (Bailey et al 2010). Finally, in a review of slum upgrading, a logic model showed the broad range of interventions and their interlinkages with health and socio-economic outcomes (Turley et al 2013), and enabled the review authors to select a specific intervention category (physical upgrading) on which to focus the review. Other resources provide further examples of logic models, and can help review authors develop and use logic models (Anderson et al 2011, Baxter et al 2014, Kneale et al 2015, Pfadenhauer et al 2017, Rohwer et al 2017).

Logic models can vary in their emphasis, with a distinction sometimes made between system-based and process-oriented logic models (Rehfuess et al 2018). System-based logic models have particular value in examining the complexity of the system (e.g. the geographical, epidemiological, political, socio-cultural and socio-economic features of a system), and the interactions between contextual features, participants and the intervention (see Chapter 17). Process-oriented logic models aim to capture the complexity of causal pathways by which the intervention leads to outcomes, and any factors that may modify intervention effects. However, this is not a crisp distinction: the two types are interrelated, with some logic models depicting elements of both system-based and process-oriented models simultaneously.

The way that logic models can be represented diagrammatically (see Chapter 17 for an example) provides a valuable visual summary for readers and can be a communication tool for decision makers and practitioners. They can aid initially in the development of a shared understanding between different stakeholders of the scope of the review and its PICO, helping to support decisions taken throughout the review process, from developing the research question and setting the review parameters, to structuring and interpreting the results. They can be used in planning the PICO elements of a review as well as for determining how the synthesis will be structured (i.e. planned comparisons, including intervention and comparator groups, and any grouping of outcome and population subgroups). These models may help review authors specify the link between the intervention, proximal and distal outcomes, and mediating factors. In other words, they depict the intervention theory underpinning the synthesis plan.

Anderson and colleagues note the main value of logic models in systematic reviews as (Anderson et al 2011):

refining review questions;
deciding on ‘lumping’ or ‘splitting’ a review topic;
identifying intervention components;
defining and conducting the review;
identifying relevant study eligibility criteria;
guiding the literature search strategy;
explaining the rationale behind surrogate outcomes used in the review;
justifying the need for subgroup analyses (e.g. age, sex/gender, socio-economic status);
making the review relevant to policy and practice;
structuring the reporting of results;
illustrating how harms and feasibility are connected with interventions; and
interpreting results based on intervention theory and systems thinking (see Chapter 17).

Logic models can be useful in systematic reviews when considering whether failure to find a beneficial effect of an intervention is due to a theory failure, an implementation failure, or both (see Chapter 17 and Cargo et al 2018). Making a distinction between implementation and intervention theory can help to determine whether and how the intervention interacts with (and potentially changes) its context (see Chapter 3 and Chapter 17 for further discussion of context). This helps to elucidate situations in which variations in how the intervention is implemented have the potential to affect the integrity of the intervention and intended outcomes.

Given their potential value in conceptualizing and structuring a review, logic models are increasingly published in review protocols. Logic models may be specified a priori and remain unchanged throughout the review; it might be expected, however, that the findings of reviews produce evidence and new understandings that could be used to update the logic model in some way (Kneale et al 2015). Some reviews take a more staged approach, pre-specifying points in the review process where the model may be revised on the basis of (new) evidence (Rehfuess et al 2018). A staged logic model can provide an efficient way to report revisions to the synthesis plan. For example, in a review of portion, package and tableware size for changing selection or consumption of food and other products, the authors presented a logic model that clearly showed changes to their original synthesis plan (Hollands et al 2015).

It is preferable to seek out existing logic models for the intervention and revise or adapt these models in line with the review focus, although this may not always be possible. More commonly, new models are developed starting with the identification of outcomes and theorizing the necessary pre-conditions to reach those outcomes. This process of theorizing and identifying the steps and necessary pre-conditions continues, working backwards from the intended outcomes, until the intervention itself is represented. As many mechanisms of action are invisible and can only be ‘known’ through theory, this process is invaluable in exposing assumptions as to how interventions are thought to work; assumptions that might then be tested in the review. Logic models can be developed with stakeholders (see Section 2.4.2) and it is considered good practice to obtain stakeholder input in their development.
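
The backward-working process described above can be sketched in code. The following is a minimal illustration only, assuming that a logic model can be simplified to a mapping from each element to its immediate pre-conditions. The element names and links are loosely based on the asthma education example in Section 2.5 and are assumptions made for this sketch; the function name `trace_back` is hypothetical and is not part of any Cochrane tool.

```python
# Hypothetical, simplified logic model: each element maps to the
# pre-conditions theorized to be necessary for it. The links are
# assumed for illustration and are not taken from the cited review.
PRECONDITIONS = {
    "educational attainment": ["school attendance"],
    "school attendance": ["fewer days of restricted activity"],
    "fewer days of restricted activity": ["reduced asthma severity"],
    "reduced asthma severity": ["adherence to treatment"],
    "adherence to treatment": ["child self-efficacy", "child and family knowledge"],
    "child self-efficacy": ["child and family knowledge"],
    "child and family knowledge": ["asthma education intervention"],
    "asthma education intervention": [],
}


def trace_back(outcome, preconditions):
    """Work backwards from an outcome, listing each element once,
    in the order in which it is first reached (breadth-first)."""
    ordered = []
    queue = [outcome]
    while queue:
        element = queue.pop(0)
        if element in ordered:
            continue  # already recorded via another pathway
        ordered.append(element)
        queue.extend(preconditions.get(element, []))
    return ordered


# Start from the distal outcome and stop once the intervention is reached.
chain = trace_back("educational attainment", PRECONDITIONS)
for step in chain:
    print(step)
```

Writing the model down in this way forces each assumed link to be stated explicitly, which is where stakeholder input may be most useful. A real logic model would usually also record context, effect modifiers and unintended consequences, which this sketch omits.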

Logic models are representations of how interventions are intended to ‘work’, but they can also provide a useful basis for thinking through the unintended consequences of interventions and identifying potential adverse effects that may need to be captured in the review (Bonell et al 2015). While logic models provide a guiding theory of how interventions are intended to work, critiques exist around their use, including their potential to oversimplify complex intervention processes (Rohwer et al 2017). Here, contributions from different stakeholders to the development of a logic model may help to articulate where complex processes may occur, to theorize unintended intervention impacts, and to represent explicitly any ambiguity within those parts of the causal chain where new theory or explanation is most valuable.

2.5.2 Changing review questions

While questions should be posed in the protocol before initiating the full review, these questions should not prevent exploration of unexpected issues. Reviews are analyses of existing data that are constrained by previously chosen study populations, settings, intervention formulations, outcome measures and study designs. It is generally not possible to formulate an answerable question for a review without knowing some of the studies relevant to the question, and it may become clear that the questions a review addresses need to be modified in light of evidence accumulated in the process of conducting the review.

Although a certain fluidity and refinement of questions is to be expected in reviews as a fuller understanding of the evidence is gained, it is important to guard against bias in modifying questions. Data-driven questions can generate false conclusions based on spurious results. Any changes to the protocol that result from revising the question for the review should be documented in the section ‘Differences between the protocol and the review’. Sensitivity analyses may be used to assess the impact of changes on the review findings (see Chapter 10, Section 10.14). When refining questions it is useful to ask the following questions.

What is the motivation for the refinement?
Could the refinement have been influenced by results from any of the included studies?
Does the refined question require a modification to the search strategy and/or reassessment of any decisions regarding study eligibility?
Are data collection methods appropriate to the refined question?
Does the refined question still meet the FINER criteria discussed in Section 2.1?

2.5.3 Building in contingencies to deal with sparse data

The ability to address the review questions will depend on the maturity and validity of the evidence base. When few studies are identified, there will be limited opportunity to address the question through an informative synthesis. In anticipation of this scenario, review authors may build contingencies into their protocol analysis plan that specify grouping (any or multiple) PICO elements at a broader level; thus potentially enabling synthesis of a larger number of studies. Broader groupings will generally address a less specific question, for example:

‘the effect of any antioxidant supplement on …’ instead of ‘the effect of vitamin C on …’;
‘the effect of sexual health promotion on biological outcomes’ instead of ‘the effect of sexual health promotion on sexually transmitted infections’; or
‘the effect of cognitive behavioural therapy in children and adolescents on …’ instead of ‘the effect of cognitive behavioural therapy in children on …’.

Although they are less specific, such broader questions may be useful for identifying important leads in areas that lack effective interventions and for guiding future research. Changes in the grouping may affect the assessment of the certainty of the evidence (see Chapter 14).
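
One way to make such a contingency explicit is to pre-specify an ordered list of groupings, from most to least specific, together with a rule for moving to the broader level. The sketch below is illustrative only: the threshold of three studies, the study counts and the function name `choose_grouping` are assumptions made for the example, not Cochrane recommendations.

```python
def choose_grouping(levels, study_counts, min_studies):
    """Return the most specific pre-specified grouping with enough studies.

    levels: grouping labels ordered from most to least specific.
    study_counts: number of eligible studies available at each level.
    min_studies: hypothetical threshold fixed in the protocol.
    Falls back to the broadest level if no level meets the threshold.
    """
    for level in levels:
        if study_counts.get(level, 0) >= min_studies:
            return level
    return levels[-1]


# Illustrative values only (not real data).
levels = ["vitamin C", "any antioxidant supplement"]
study_counts = {"vitamin C": 2, "any antioxidant supplement": 9}

selected = choose_grouping(levels, study_counts, min_studies=3)
print(selected)
```

Because the levels and the rule are fixed in the protocol, the choice of grouping depends on how many studies are found rather than on their results, which is consistent with the caution about data-driven changes in Section 2.5.2.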

2.5.4 Economic data

Decision makers need to consider the economic aspects of an intervention, such as whether its adoption will lead to a more efficient use of resources. Economic data such as resource use, costs or cost-effectiveness (or a combination of these) may therefore be included as outcomes in a review. It is useful to break down measures of resource use and costs to the level of specific items or categories. It is helpful to consider an international perspective in the discussion of costs. Economic issues are discussed in detail in Chapter 20.

2.6 Chapter information

Authors: James Thomas, Dylan Kneale, Joanne E McKenzie, Sue E Brennan, Soumyadeep Bhaumik

Acknowledgements: This chapter builds on earlier versions of the Handbook. Mona Nasser, Dan Fox and Sally Crowe contributed to Section 2.4; Hilary J Thomson contributed to Section 2.5.1.

Funding: JT and DK are supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care North Thames at Barts Health NHS Trust. JEM is supported by an Australian National Health and Medical Research Council (NHMRC) Career Development Fellowship (1143429). SEB’s position is supported by the NHMRC Cochrane Collaboration Funding Program. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health or the NHMRC.

2.7 References

Anderson L, Petticrew M, Rehfuess E, Armstrong R, Ueffing E, Baker P, Francis D, Tugwell P. Using logic models to capture complexity in systematic reviews. Research Synthesis Methods 2011; 2: 33–42.

Bailey JV, Murray E, Rait G, Mercer CH, Morris RW, Peacock R, Cassell J, Nazareth I. Interactive computer-based interventions for sexual health promotion. Cochrane Database of Systematic Reviews 2010; 9: CD006483.

Baxter SK, Blank L, Woods HB, Payne N, Rimmer M, Goyder E. Using logic model methods in systematic review synthesis: describing complex pathways in referral management interventions. BMC Medical Research Methodology 2014; 14: 62.

Bonell C, Jamal F, Melendez-Torres GJ, Cummins S. ‘Dark logic’: theorising the harmful consequences of public health interventions. Journal of Epidemiology and Community Health 2015; 69: 95–98.

Bryant J, Sanson-Fisher R, Walsh J, Stewart J. Health research priority setting in selected high income countries: a narrative review of methods used and recommendations for future practice. Cost Effectiveness and Resource Allocation 2014; 12: 23.

Caldwell DM, Welton NJ. Approaches for synthesising complex mental health interventions in meta-analysis. Evidence-Based Mental Health 2016; 19: 16–21.

Cargo M, Harris J, Pantoja T, Booth A, Harden A, Hannes K, Thomas J, Flemming K, Garside R, Noyes J. Cochrane Qualitative and Implementation Methods Group guidance series-paper 4: methods for assessing evidence on intervention implementation. Journal of Clinical Epidemiology 2018; 97: 59–69.

Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gülmezoglu AM, Howells DW, Ioannidis JPA, Oliver S. How to increase value and reduce waste when research priorities are set. Lancet 2014; 383: 156–165.

Chamberlain C, O’Mara-Eves A, Porter J, Coleman T, Perlen S, Thomas J, McKenzie J. Psychosocial interventions for supporting women to stop smoking in pregnancy. Cochrane Database of Systematic Reviews 2017; 2: CD001055.

Cooper H. The problem formulation stage. In: Cooper H, editor. Integrating Research: A Guide for Literature Reviews. Newbury Park (CA) USA: Sage Publications; 1984.

Counsell C. Formulating questions and locating primary studies for inclusion in systematic reviews. Annals of Internal Medicine 1997; 127: 380–387.

Cummings SR, Browner WS, Hulley SB. Conceiving the research question and developing the study plan. In: Hulley SB, Cummings SR, Browner WS, editors. Designing Clinical Research: An Epidemiological Approach. 4th ed. Philadelphia (PA): Lippincott Williams & Wilkins; 2007. p. 14–22.

Glass GV. Meta-analysis at middle age: a personal history. Research Synthesis Methods 2015; 6: 221–231.

Hedges LV. Statistical considerations. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis. New York (NY) USA: Russell Sage Foundation; 1994.

Hetrick SE, McKenzie JE, Cox GR, Simmons MB, Merry SN. Newer generation antidepressants for depressive disorders in children and adolescents. Cochrane Database of Systematic Reviews 2012; 11: CD004851.

Hoffmann T, Glasziou P, Boutron I. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014; 348: g1687.

Hollands GJ, Shemilt I, Marteau TM, Jebb SA, Lewis HB, Wei Y, Higgins JPT, Ogilvie D. Portion, package or tableware size for changing selection and consumption of food, alcohol and tobacco. Cochrane Database of Systematic Reviews 2015; 9: CD011045.

Keown K, Van Eerd D, Irvin E. Stakeholder engagement opportunities in systematic reviews: Knowledge transfer for policy and practice. Journal of Continuing Education in the Health Professions 2008; 28: 67–72.

Kneale D, Thomas J, Harris K. Developing and optimising the use of logic models in systematic reviews: exploring practice and good practice in the use of programme theory in reviews. PloS One 2015; 10: e0142187.

Lewin S, Hendry M, Chandler J, Oxman AD, Michie S, Shepperd S, Reeves BC, Tugwell P, Hannes K, Rehfuess EA, Welch V, McKenzie JE, Burford B, Petkovic J, Anderson LM, Harris J, Noyes J. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Medical Research Methodology 2017; 17: 76.

Lorenc T, Petticrew M, Welch V, Tugwell P. What types of interventions generate inequalities? Evidence from systematic reviews. Journal of Epidemiology and Community Health 2013; 67: 190–193.

Nasser M, Ueffing E, Welch V, Tugwell P. An equity lens can ensure an equity-oriented approach to agenda setting and priority setting of Cochrane Reviews. Journal of Clinical Epidemiology 2013; 66: 511–521.

Nasser M. Setting priorities for conducting and updating systematic reviews [PhD Thesis]: University of Plymouth; 2018.

O’Neill J, Tabish H, Welch V, Petticrew M, Pottie K, Clarke M, Evans T, Pardo Pardo J, Waters E, White H, Tugwell P. Applying an equity lens to interventions: using PROGRESS ensures consideration of socially stratifying factors to illuminate inequities in health. Journal of Clinical Epidemiology 2014; 67: 56–64.

Oliver S, Dickson K, Bangpan M, Newman M. Getting started with a review. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews. London (UK): Sage Publications Ltd.; 2017.

Petticrew M, Roberts H. Systematic Reviews in the Social Sciences: A Practical Guide. Oxford (UK): Blackwell; 2006.

Pfadenhauer L, Gerhardus A, Mozygemba K, Lysdahl KB, Booth A, Hofmann B, Wahlster P, Polus S, Burns J, Brereton L, Rehfuess E. Making sense of complexity in context and implementation: the Context and Implementation of Complex Interventions (CICI) framework. Implementation Science 2017; 12: 21.

Rehfuess EA, Booth A, Brereton L, Burns J, Gerhardus A, Mozygemba K, Oortwijn W, Pfadenhauer LM, Tummers M, van der Wilt GJ, Rohwer A. Towards a taxonomy of logic models in systematic reviews and health technology assessments: a priori, staged, and iterative approaches. Research Synthesis Methods 2018; 9: 13–24.

Richardson WS, Wilson MC, Nishikawa J, Hayward RS. The well-built clinical question: a key to evidence-based decisions. ACP Journal Club 1995; 123: A12–13.

Rohwer A, Pfadenhauer L, Burns J, Brereton L, Gerhardus A, Booth A, Oortwijn W, Rehfuess E. Series: Clinical epidemiology in South Africa. Paper 3: Logic models help make sense of complexity in systematic reviews and health technology assessments. Journal of Clinical Epidemiology 2017; 83: 37–47.

Sharma T, Choudhury M, Rejón-Parrilla JC, Jonsson P, Garner S. Using HTA and guideline development as a tool for research priority setting the NICE way: reducing research waste by identifying the right research to fund. BMJ Open 2018; 8: e019777.

Squires J, Valentine J, Grimshaw J. Systematic reviews of complex interventions: framing the review question. Journal of Clinical Epidemiology 2013; 66: 1215–1222.

Tong A, Chando S, Crowe S, Manns B, Winkelmayer WC, Hemmelgarn B, Craig JC. Research priority setting in kidney disease: a systematic review. American Journal of Kidney Diseases 2015; 65: 674–683.

Tong A, Sautenet B, Chapman JR, Harper C, MacDonald P, Shackel N, Crowe S, Hanson C, Hill S, Synnot A, Craig JC. Research priority setting in organ transplantation: a systematic review. Transplant International 2017; 30: 327–343.

Turley R, Saith R, Bhan N, Rehfuess E, Carter B. Slum upgrading strategies involving physical environment and infrastructure interventions and their effects on health and socio-economic outcomes. Cochrane Database of Systematic Reviews 2013; 1: CD010067.

van der Heijden I, Abrahams N, Sinclair D. Psychosocial group interventions to improve psychological well-being in adults living with HIV. Cochrane Database of Systematic Reviews 2017; 3: CD010806.

Viergever RF. Health Research Prioritization at WHO: An Overview of Methodology and High Level Analysis of WHO Led Health Research Priority Setting Exercises. Geneva (Switzerland): World Health Organization; 2010.

Viergever RF, Olifson S, Ghaffar A, Terry RF. A checklist for health research priority setting: nine common themes of good practice. Health Research Policy and Systems 2010; 8: 36.

Whitehead M. The concepts and principles of equity and health. International Journal of Health Services 1992; 22: 429–445.