Chapter 26: Individual participant data

Jayne F Tierney, Lesley A Stewart, Mike Clarke; on behalf of the Cochrane Individual Participant Data Meta-analysis Methods Group

Key Points:

Individual participant data (IPD) reviews are a specific type of systematic review that involve the collection, checking and re-analysis of the original data for each participant in each study. Data may be obtained either from study investigators or via data-sharing repositories or platforms.
IPD reviews should be considered when the available published or other aggregate data do not permit a good quality review, or are insufficient for a thorough analysis. In certain situations, aggregate data synthesis might be an appropriate first step.
The IPD approach can bring substantial improvements to the quality of data available and offset inadequate reporting of individual studies. Risk of bias can be assessed more thoroughly and IPD enables more detailed and flexible analysis than is possible in systematic reviews of aggregate data.
Access to IPD offers scope to analyse data and report results in many different ways, so analytical methods should be pre-specified in detail and reporting should follow the PRISMA-IPD guideline.
Most commonly, IPD reviews are carried out by a collaborative group, comprising a project management team, the researchers who contribute their study data, and an advisory group.
An IPD review usually takes longer and costs more than a conventional systematic review of the same question, and requires a range of skills to obtain, manage and analyse data. Thus, IPD reviews are difficult to do without dedicated time and funding.

This chapter should be cited as: Tierney JF, Stewart LA, Clarke M. Chapter 26: Individual participant data. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from cochrane.org/handbook.

This chapter should be cited as: Tierney JF, Stewart LA, Clarke M. Chapter 26: Individual participant data [last updated October 2019]. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.5. Cochrane, 2024. Available from cochrane.org/handbook.

26.1 Introduction

26.1.1 What is an IPD review?

Systematic reviews incorporating individual participant data (IPD) include the original data from each eligible study. The IPD will usually contain de-identified demographic information for each participant such as age, sex, nature of their health condition, as well as information about treatments or tests received and outcomes observed (Stewart et al 1995, Stewart and Tierney 2002). These data can then be checked and analysed centrally and, if appropriate, combined in meta-analyses (Stewart et al 1995, Stewart and Tierney 2002). Most commonly, IPD are sought directly from the study investigators, but access through data-sharing platforms and data repositories may increase in the coming years.

Advantages of an IPD approach are summarized in Table 26.1.a. Compared with aggregate data, the collection of IPD can bring about substantial improvements to the quantity and quality of data, for example, through the inclusion of more trials, participants and outcomes (Debray et al 2015a, Tierney et al 2015a). A Cochrane Methodology Review of empirical research shows some of these advantages (Tudur Smith et al 2016). IPD also affords greater scope and flexibility in the analyses, including the ability to investigate how participant-level covariates such as age or severity of disease might alter the impact of the treatment, exposure or test under investigation (Debray et al 2015a, Debray et al 2015b, Tierney et al 2015a). With such better-quality data and analysis, IPD reviews can help to provide in-depth explorations and robust meta-analysis results, which may differ from those based on aggregate data (Tudur Smith et al 2016). Not surprisingly then, IPD reviews have had a substantial impact on clinical practice and research, but could be better used to inform treatment guidelines (Vale et al 2015), and new studies (Tierney et al 2015b). However, IPD reviews can take longer than other reviews; those evaluating the effects of therapeutic interventions typically taking at least two years to complete. Also, they usually require a skilled team with dedicated time and specific funding.

This chapter provides an overview of the IPD approach to systematic reviews, to help authors decide whether collecting IPD might be useful and feasible for their review. As most IPD reviews have assessed the efficacy of interventions, and have been based on randomized trials, this is the focus of the chapter. However, the approach also offers particular advantages for the synthesis of diagnostic and prognostic studies (Debray et al 2015a) and many of the principles described will apply to these sorts of synthesis. The chapter does not provide detailed guidance on practical or statistical methods, which are summarized elsewhere (Stewart et al 1995, Stewart and Tierney 2002, Debray et al 2015b, Tierney et al 2015a). Therefore, anyone contemplating carrying out their first IPD meta-analysis as part of a Cochrane Review should seek appropriate advice and guidance from experienced researchers through the IPD Meta-analysis Methods Group.

Table 26.1.a Advantages of the IPD approach to systematic review and meta-analysis. Adapted from (Tierney et al 2015a) (licensed under CC BY 4.0).

Aspect of systematic review/meta-analysis	Advantages of the IPD approach
Study inclusion	Ask the IPD collaborative group (of study investigators and other experts in the clinical field) to supplement the list of identified studies.* Clarify study eligibility with trial investigators.*
Data quality	Include studies that are unpublished or not reported in full. Include unreported data (e.g. more outcomes per study, and more complete information on those outcomes, data on participants excluded from study analyses). Check the integrity of study IPD and resolve any queries with investigators. Derive standardized outcome definitions across trials or translate different definitions to a common scale. Derive standardized classifications of participant characteristics or their disease/condition or translate different definitions to a common scale. Update follow-up of time-to-event or other outcomes beyond that reported.
Risk of bias	Clarify study design, conduct and analysis methods with trial investigators.* Check risk of bias of study IPD and obtain extra data where necessary.
Analysis	Analyse all important outcomes. Determine validity of analysis assumptions with IPD (e.g. proportionality of hazards for a Cox model). Derive measures of effect directly from the IPD. Use a consistent unit of analysis for each study. Apply a consistent method of analysis for each study. Conduct more detailed analysis of time-to-event outcomes (e.g. generating Kaplan-Meier curves). Achieve greater power for assessing interactions between effects of interventions and participant or disease/condition characteristics. Conduct more complex analyses not (usually) possible with aggregate data (e.g. simultaneous assessment of the relationship between multiple study and/or participant characteristics and effects of interventions). Use non-standard models or measures of effect. Account for missing data at the patient level (e.g. using multiple imputation). Use IPD to address secondary clinical questions (e.g. to explore the natural history of disease, prognostic factors or surrogate outcomes).
Interpretation	Discuss implications for clinical practice and research with a multidisciplinary group of collaborators including study investigators who supplied data.

* These may also be done for non-IPD reviews.

26.1.2 How do IPD and standard Cochrane Review methods differ?

The general approach to an IPD review is the same as for an aggregate data systematic review, and the only substantial differences relate to data collection, checking and analysis (Stewart and Tierney 2002). Thus, a detailed protocol should be prepared and include: the objectives for the review; the specific questions to be addressed; the reasons why IPD are being sought; study and any participant eligibility criteria; which descriptive, baseline and outcome data will be collected and how this will be managed, and the planned analyses, as well as other standard review methods. Because IPD reviews offer the potential for a greater number of analyses, they pose a greater risk of data being interrogated repeatedly until the desired results are obtained. Therefore, it is particularly important that analysis methods are pre-specified in the protocol, or in a separate analysis plan.

Involving the investigators responsible for the primary studies can highlight additional eligible studies done by or known to them, and help to clarify the design and conduct of included studies, thereby improving the reliability of risk of bias assessments (Vale et al 2013). Moreover, the ability to directly check IPD and seek additional data may alleviate some of the biases associated with aggregate data reviews (Stewart et al 2005).

The project should culminate in the preparation and dissemination of a structured report, following PRISMA-IPD (Stewart et al 2015) where possible. This is a stand-alone extension to PRISMA that is geared to the IPD approach and, while it focuses on reviews of efficacy, many elements are applicable to other types of IPD review.

Systematic reviews based on IPD require expertise in data management and statistical analysis, as well as skills in managing research collaborations, and they often take longer and require more resource than a conventional aggregate data systematic review of the same question. Therefore, IPD reviews are difficult to conduct in review authors’ ‘spare time’, and are likely to require dedicated resources and staff.

26.1.3 How are IPD reviews organized?

IPD reviews are usually carried out as collaborative projects whereby all study investigators contributing data from their studies, together with the research team managing and carrying out the project, become part of an active collaboration (Stewart et al 1995, Stewart and Tierney 2002). Ideally, this collaboration should be structured so as to keep the research team at ‘arm’s length’ from the trialists’ group. Such a group might comprise a project team who lead and are responsible for all aspects of design and conduct; an advisory group who provide clinical and methodological guidance and aid strategic decisions; and the trialists, who provide trial information and IPD and comment on the draft manuscript. Projects led solely by study investigators, or by a single group or company with a vested interest, are at greater risk of (real or perceived) bias, and findings of such projects may be viewed as less credible.

Often, the research team convenes a meeting of all collaborators to present and discuss preliminary results, and can draw on these discussions when drafting manuscripts. Results are usually published in the name of the collaborative group, with all collaborators being listed as co-authors of the review publication, and all contributions and conflicts should be clearly described therein.

26.1.4 Which healthcare areas have used IPD reviews?

IPD meta-analyses have an established history in cardiovascular disease and cancer (Clarke et al 1998), where the methodology has been developing steadily since the late 1980s, and most are still conducted in these fields (Simmonds et al 2015). However, IPD have also been collected for systematic reviews in many other fields (Simmonds et al 2005, Simmonds et al 2015), including diabetes, infections, mental health, dementia, epilepsy, hernia and respiratory disease. The Cochrane IPD Meta-analysis Methods Group website includes publications of ongoing and completed IPD reviews conducted by members of the Group.

26.1.5 When is an IPD review appropriate?

Generally, IPD reviews should be considered in circumstances where the available published or other aggregate data do not permit a good quality review. Specifically, it is worth considering carefully what value the collection of IPD will bring over the traditional aggregate data approach, in terms of the aims, data quantity and quality, and analyses required (Tudur Smith et al 2015) (Table 26.1.a). This means it will often be necessary to conduct or consult an aggregate data systematic review as a first step (Tudur Smith et al 2015). Alternatively, if it is known that a key objective is to explore subpopulations and potential effect modification, then proceeding directly to an IPD review and meta-analysis may be warranted.

Another important consideration is whether sufficient IPD are likely to be available to permit credible analysis. For example, some study data may have been destroyed or lost; some outcomes, such as adverse effects or quality of life, may not have been collected systematically for all studies; or study investigators may not wish to collaborate (although this may not be known at the outset). Also, it may not be possible to complete an IPD review in a suitable time frame for the question of interest and, in some situations, the additional resource required may be prohibitive. Weighing up these various factors will help determine when the IPD approach is likely to bring most benefit.

Before embarking on an IPD review, review authors need to think carefully about which skills and resources will be required for the project to succeed, and seek advice and training. The Cochrane IPD Meta-analysis Methods Group is a good first point of contact.

26.2 Collecting IPD

26.2.1 Obtaining data from the original researchers

Typically, systematic reviews based on IPD are international collaborative projects anchored on addressing one or more pre-specified questions (Stewart et al 1995, Stewart and Tierney 2002). They might be initiated by systematic review authors in collaboration with clinicians, but increasingly they may arise from trialists’ consortia or via specific calls from funders.

Negotiating and maintaining collaboration with study investigators from different countries, settings and disciplines can take considerable time and effort. For example, it can be difficult to trace the people responsible for eligible studies, and they may be initially reluctant to participate in the meta-analysis. Often the first approach will be by email or letter to the principal investigator, inviting collaboration, explaining the project, describing what participation will entail and how the meta-analysis will be managed and published. A protocol is generally supplied at this stage to provide more detailed information, but data are not usually sought in the first correspondence. It may also be necessary to establish additional contact with the data centre or research organization responsible for management of the study data, and to whom data queries will be sent; the principal investigator can advise who would be most appropriate.

In encouraging study investigators to take part in the IPD review, it is important to be as supportive and flexible as possible, to take the time required to build relationships and to keep all collaborators involved and informed of progress. Regular newsletters, e-mail updates or a website can be useful, especially as the project may take place over a prolonged period. A randomized trial has examined different ways of establishing these connections and obtaining the IPD (Veroniki et al 2016, Veroniki et al 2019).

26.2.2 Obtaining data from sources other than the original researchers

A number of initiatives are helping to increase the availability of IPD from both academic and industry-led studies, through generic data sharing platforms such as Yale Open Data, Clinical Study Data Request, DataSphere or Vivli. These initiatives have arisen in response to calls from federal agencies (e.g. NIH), funders (e.g. MRC), journal editors, the AllTrials campaign and Cochrane to make results and IPD from clinical studies more readily available.

As the focus of these efforts is to make the data from individual studies available, formatting and coding are not necessarily standard or consistent across the different study datasets. Some platforms offer fully unrestricted access to IPD and others moderated access, with release subject to approval of a project proposal. Also, while some sources allow transfer of IPD directly to the research team conducting the review, others limit the use of IPD to within a secure area within a platform. Therefore, for any given review, the availability of study IPD from these platforms may be patchy, the modes of access variable, and the usual process of re-formatting and re-coding data in a consistent way will likely be required. Thus, although promising, as yet they do not provide a viable alternative to the traditional collaborative IPD approach. As the culture of data sharing gathers pace, the increased availability and accessibility of IPD should benefit the production of IPD reviews.

26.2.3 Establishing ‘topic-based’ repositories with the original researchers

An alternative to an IPD review with a narrow focus, or broad-based data sharing repositories, is to establish a retrospective or prospective repository of IPD from all studies of relevance to a particular healthcare area or topic. Previously, such repositories have been built from existing collaborative IPD reviews, and they provide a unique resource for investigating clinical questions in depth and potentially tackling additional questions.

For instance, since 1985, the Early Breast Cancer Trialists’ Collaborative Group has amassed the majority of trials in early breast cancer and collected extended follow-up, in order to evaluate the effects of all the key interventions in the long term (http://gas.ndph.ox.ac.uk/ebctcg). They have shown, for example, that women with oestrogen-receptor positive breast cancer still face a substantial risk of cancer recurrence more than 20 years after their endocrine treatment (Pan et al 2017). The ACCENT repository, built on existing colorectal cancer IPD reviews, has been used to identify disease-free survival as a surrogate for overall survival (Sargent et al 2007) and to show the prognostic impact of baseline body mass index on survival (Sinicrope et al 2013). In addition, a network meta-analysis of multiple IPD reviews of drug monotherapy for epilepsy shows the most suitable first-line treatments for partial onset and generalized tonic-clonic seizures (Nevitt et al 2017).

A considerable advantage of such repositories is that data items can be coded to a common format from the outset, facilitating subsequent re-use of data, and the IPD can be checked by those with topic expertise. The benefits of working with study investigators are also retained. Of course, the retention and re-use of IPD should comply with the same data security and confidentiality measures as for the original review, and new ethics approval and data use agreements should be sought if required. It is vitally important that any new analyses follow a new pre-specified protocol and/or analysis plan.

26.2.4 Data security and confidentiality

Study investigators naturally expect there to be safeguards that ensure their study data will be transferred, stored and used appropriately. For this reason, a data sharing or data use agreement between the original investigators and the IPD review team is usually required. The details of such agreements vary, but most will state that data will be held securely, accessed only by authorized members of the project team and will not be copied or distributed elsewhere. It is also important to request that individual participants are adequately de-identified in the supplied data, by removing or recoding identifiers, and data use agreements should prohibit researchers from attempting to re-identify individuals. The degree of de-identification required may be dictated by the data protection legislation of the country from which the study originates. For example, it may be necessary to also remove or redact free-text verbatim terms, and remove explicit information on the dates of events. Note that full anonymization, whereby all links between the de-identified datasets and the original datasets are destroyed, limits the utility of IPD for systematic reviews and therefore is not recommended. All participant data should be transferred via a secure data transfer site or by encrypted email.
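As a minimal sketch of the kind of de-identification described above, the following Python function drops direct identifiers, replaces the participant identifier with a study code, and converts calendar dates to days since randomization. All field names, the list of identifiers and the use of the randomization date as the anchor are assumptions made for illustration; the actual requirements would be set by the data use agreement and the applicable legislation. The key linking original identifiers to study codes is assumed to remain with the data supplier, so the link is retained rather than destroyed.

```python
from datetime import date

# Hypothetical field names; a real agreement would define the actual list.
DIRECT_IDENTIFIERS = {"name", "address", "hospital_number", "free_text_notes"}
# Calendar-date fields and the relative-day fields that replace them.
DATE_FIELDS = {"event_date": "event_day",
               "last_follow_up_date": "last_follow_up_day"}


def deidentify(record: dict, study_id_map: dict) -> dict:
    """Return a copy of one participant record with identifiers removed or recoded."""
    anchor = record["date_randomized"]
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS or field == "date_randomized":
            continue  # drop direct identifiers and the explicit anchor date
        if field == "participant_id":
            # Recode to a study code; the map itself stays with the supplier.
            out[field] = study_id_map[value]
        elif field in DATE_FIELDS:
            # Replace explicit dates with days elapsed since randomization.
            out[DATE_FIELDS[field]] = None if value is None else (value - anchor).days
        else:
            out[field] = value
    return out
```

Expressing dates as elapsed days keeps the information needed for time-to-event analyses while removing explicit calendar dates.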

Historically, ethical review was not sought for IPD reviews, on the premise that they were addressing the same research question as the original studies for which participants already gave their informed consent. However, evolving data protection regulations (e.g. the EU General Data Protection Regulation) and changing attitudes to data sharing mean that, in some circumstances, formal ethical approval will be required by the institutions holding IPD and be expected by those supplying data. This should be explored with the ethics committee/board under whose jurisdiction the research team operate, and even if formal review is not required, it may be useful to send written confirmation of this to those providing data. It is perhaps more likely that ethical review will be required if review authors are using IPD to address a different question from the original studies, or when seeking data from a research study that was not subject to prior ethical review and did not obtain formal patient consent, such as clinical audit data. This does not imply, however, that new consent will need to be obtained from the participants in the original study; de-identification of data usually means this is not necessary. Moreover, in many circumstances it would be difficult or impossible to obtain consent retrospectively, for example in older studies (because participants would be difficult to trace) or in studies of life-limiting conditions (because many participants will have died).

26.2.5 Deciding which data items to collect

When deciding on the data items (or variables) to collect for an IPD review, it is sensible to consider the planned analyses carefully. This minimizes the possibility that information essential to the analyses will not be sought or that data will be collected unnecessarily. Understandably, the original researchers may be aggrieved if they go to the trouble of providing data that are not subsequently analysed and reported.

In addition, the aim should be to maximize the quality of the data and so enhance the analyses. For example, data on all participants and outcomes included in studies should be sought irrespective of whether they were part of the reported analyses. Thus, before embarking on data collection, it is worthwhile checking the study protocols and/or with the original researchers to determine which data are actually available. In many cases it will only be necessary to collect outcomes and participant characteristics as defined in the individual studies. However, additional variables might be required to provide greater granularity (e.g. subscales in quality of life instruments), or to allow outcomes or other variables to be defined in a consistent way for each study. For example, to redefine pre-eclampsia according to a common definition, data on systolic and diastolic blood pressure and proteinuria are needed (Askie et al 2007).

IPD provides the most practical way to synthesize data for time-to-event outcomes, such as time to recovery, time free of seizures, or time to death. Therefore, it is important to collect data on whether an event (e.g. death) has happened, the date of the event (e.g. date of death) and the date of last follow-up for those not experiencing an event. As a bare minimum, whether an event happened and the time that each individual spent ‘event-free’ may suffice. IPD also allows follow-up to be updated, sometimes substantially, beyond the point of publication (Stewart et al 1995, Stewart and Tierney 2002), which has been particularly important in evaluating the long-term effects of therapies in the cancer field (Pan et al 2017).
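The data items listed above can be turned into the two quantities needed for a time-to-event analysis, as in the following minimal Python sketch. It assumes that a date of randomization is also supplied and is used as the time origin; the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_event(date_randomized: date,
                  date_of_event: Optional[date],
                  date_last_follow_up: Optional[date]) -> Tuple[int, int]:
    """Return (days_event_free, event), where event is 1 for an event and 0 if censored."""
    if date_of_event is not None:
        end, event = date_of_event, 1
    elif date_last_follow_up is not None:
        # No event recorded: the participant is censored at last follow-up.
        end, event = date_last_follow_up, 0
    else:
        raise ValueError("need an event date or a date of last follow-up")
    days = (end - date_randomized).days
    if days < 0:
        # An end date before randomization is a data query, not a valid time.
        raise ValueError("end date precedes randomization")
    return days, event
```

For example, `time_to_event(date(2010, 1, 1), None, date(2011, 1, 1))` gives 365 days event-free with an event indicator of 0.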

26.2.6 Obtaining sufficient data

It is not always possible to obtain all the desired data for an IPD review. For example, it might be difficult to obtain IPD for all relevant trials because trial investigators cannot be traced or no longer have access to the data. If investigators do not respond or refuse to participate in order to suppress unfavourable results, then not including such trials could bias the meta-analysis. On the other hand, if they do so to avoid supplying data from trials of poor quality, then not including these trials might make a meta-analysis more robust. Aiming to obtain a large proportion of the eligible trials and participants will both counter bias (Tierney et al 2015a) and enable exploration of any quality issues (Ahmed et al 2012), and so will help to provide a reliable and precise assessment of the effects of an intervention. Another factor is whether the IPD will likely provide sufficient power to detect an effect reliably, but to date this has received little attention (Ensor et al 2018).
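The proportion of eligible trials and participants for which IPD were obtained is simple arithmetic; a small Python sketch is given below. The function name and the input structure (a mapping from trial label to number randomized) are assumptions for illustration only.

```python
def proportion_obtained(eligible: dict, obtained: set) -> tuple:
    """Return (proportion of trials, proportion of participants) with IPD.

    eligible maps each eligible trial to its number of randomized participants;
    obtained is the set of trials for which IPD were supplied.
    """
    unknown = set(obtained) - set(eligible)
    if unknown:
        raise ValueError(f"trials not in the eligible list: {sorted(unknown)}")
    trials = len(obtained) / len(eligible)
    participants = sum(eligible[t] for t in obtained) / sum(eligible.values())
    return trials, participants
```

With hypothetical trials of 100, 300 and 600 participants, obtaining IPD for the two larger trials gives two-thirds of trials but 90% of participants, which is why both figures are worth reporting.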

26.3 Managing and checking IPD

26.3.1 Re-coding and re-defining data

Inevitably, the different studies included in an IPD review will have collected and defined data in different ways. However, it is relatively straightforward to re-code data items into a common format and it should be possible to harmonize, for example, definitions of staging, grading, ranking or other scoring systems in a consistent way, to facilitate pooling of data across studies. Thus, as well as giving investigators clear instructions on which data are needed and the process for secure data transfer, the preferred data format and coding for each variable should be supplied (Stewart et al 1995). Of course, if study investigators are unwilling or unable to prepare data according to this pre-specified format, the review team should accept data in whichever format is most convenient, and recode it as necessary. A copy of the data, as supplied, should be archived before carrying out conversions or modifications to the data, and it is vital that any alterations made are properly logged.
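A minimal Python sketch of re-coding one variable into a common format while leaving the supplied data untouched and logging every alteration is shown below. The coding scheme, variable name and log structure are hypothetical; in practice the supplied files would also be archived on disk before any conversion.

```python
# Hypothetical common coding for sex: 1 = male, 2 = female.
SEX_CODES = {"m": 1, "male": 1, "1": 1, "f": 2, "female": 2, "2": 2}


def recode(records, variable, mapping, log):
    """Return re-coded copies of records; append (row, variable, old, new) to log."""
    recoded = []
    for row, rec in enumerate(records):
        new = dict(rec)  # copy, so the data as supplied are not modified
        raw = str(rec.get(variable, "")).strip().lower()
        new[variable] = mapping.get(raw)  # None marks a value to query
        if new[variable] != rec.get(variable):
            log.append((row, variable, rec.get(variable), new[variable]))
        recoded.append(new)
    return recoded
```

Values that do not match the mapping are set to `None` rather than guessed, so that they can be raised as queries with the study investigators.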

26.3.2 Checking the completeness and integrity of incoming data

The aims of checking and ‘cleaning’ data are to ensure that included data are accurate, valid and internally consistent (Stewart et al 1995, Stewart and Tierney 2002, Tierney et al 2015a). Independent scrutiny of data by the review team may also increase project credibility. When data files are first received, it is important to confirm that they can be read and loaded into the central storage/analysis system. For example, if data arrive electronically, they should be checked to ensure that the files can be opened and that data are for the correct study. Furthermore, it is useful to confirm that all participants recruited or randomized are included, and that there are no obvious omissions or duplicates in the sequence of patient identifiers. More in-depth checks for missing, invalid, out of range or inconsistent items might highlight, for example, records of unusually old or young patients or those with abnormally high or low levels of important biomarkers.
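Some of the checks described above can be sketched in a few lines of Python. The example assumes that patient identifiers are consecutive integers and that each record is a dictionary with an `id` field; both are assumptions for illustration, and real datasets may use other identifier schemes.

```python
from collections import Counter


def check_ids(ids):
    """Return (duplicated identifiers, identifiers missing from the sequence)."""
    counts = Counter(ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    missing = sorted(set(range(min(ids), max(ids) + 1)) - set(ids))
    return duplicates, missing


def out_of_range(records, variable, low, high):
    """Return ids of records where variable is missing or outside [low, high]."""
    return [r["id"] for r in records
            if r.get(variable) is None or not (low <= r[variable] <= high)]
```

Flagged records are candidates for queries to the investigators rather than automatic corrections, since an unusual value may still be genuine.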

Also, the data supplied should be checked against any relevant study publications or results repositories to highlight any inconsistencies in, for example, the distribution of baseline characteristics, the number of participants and the outcome results. However, it should be borne in mind that differences might arise because of continued enrolment or further follow-up subsequent to publication.

26.3.3 Checking the risk of bias of included studies

Just as for other types of systematic review, assessing risk of bias of included studies (Higgins et al 2011, Sterne et al 2016) is recommended for IPD reviews. With the collaborative IPD approach, additional information obtained from protocols, codebooks and forms supplied by study investigators can increase the clarity of risk of bias assessments compared to those based on study reports (Mhaskar et al 2012, Vale et al 2013). Also, checking the IPD directly can provide further insight into potential biases, some of which might be reduced or not transpire when updated or additional data are obtained. These checks are best established for reviews of randomized trials (Stewart et al 1995, Stewart and Tierney 2002, Tierney et al 2015a) and are outlined next.

26.3.3.1 Checking randomization and allocation sequence concealment

For randomized trials it is important to check the IPD to ensure that the methods of randomization and allocation sequence concealment appear appropriate, so as to guard against the inclusion of non-randomized studies or participants. The pattern of treatment allocation can be checked directly, and in various ways, for any unusual patterns (Stewart et al 1995, Stewart and Tierney 2002, Tierney et al 2015a).
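Two illustrative pattern checks are sketched below in Python: a tabulation of allocations by day of the week, and the largest running imbalance between two arms in order of randomization. These are examples chosen for illustration, not a definitive set of checks; the arm labels "A" and "B" and the input structure of (date randomized, arm) pairs are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def allocation_by_weekday(participants):
    """Tabulate arm counts by weekday from (date_randomized, arm) pairs."""
    table = defaultdict(Counter)
    for randomized, arm in participants:
        table[WEEKDAYS[randomized.weekday()]][arm] += 1
    return table


def max_imbalance(participants, arm_a="A", arm_b="B"):
    """Largest absolute running difference between two arms, in date order."""
    diff, largest = 0, 0
    for _, arm in sorted(participants, key=lambda p: p[0]):
        if arm == arm_a:
            diff += 1
        elif arm == arm_b:
            diff -= 1
        largest = max(largest, abs(diff))
    return largest
```

Unexpected weekend allocations, or a running imbalance much larger than the stated randomization method would allow, would prompt a query to the trial investigators.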

26.3.3.2 Checking for attrition

IPD should be checked to ensure that data on all or as many randomized participants as possible are included for each outcome, and that they are assigned to their allocated intervention. This helps to minimize bias associated with the dropout of participants or their exclusion from study analyses (Tierney and Stewart 2005), and allows an intention-to-treat analysis of all randomized participants, avoiding the potential bias of a per-protocol analysis.
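As a minimal sketch of this check, the Python function below counts, for each allocated arm, how many randomized participants have data for a given outcome. The field name `allocated_arm` and the use of `None` for a missing outcome are assumptions for illustration.

```python
from collections import defaultdict


def outcome_completeness(records, outcome):
    """Return {arm: (number with outcome data, number randomized)}."""
    totals = defaultdict(lambda: [0, 0])
    for rec in records:
        counts = totals[rec["allocated_arm"]]  # arm as allocated, not as treated
        counts[1] += 1
        if rec.get(outcome) is not None:
            counts[0] += 1
    return {arm: (with_data, n) for arm, (with_data, n) in totals.items()}
```

Grouping by allocated rather than received intervention keeps the check aligned with an intention-to-treat analysis.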

26.3.3.3 Checking outcomes included

An IPD review should collect all the outcomes of relevance to the review question whether reported or not. This will help to overcome the biases that can be associated with differential reporting of outcomes (Kirkham et al 2010), and provide a more balanced view of benefits and harms. Precisely because some measured outcomes may not be reported, it is worth checking the study protocol, trial registry entry and with investigators to firmly establish which outcomes might be available (Dwan et al 2011).

For time-to-event outcomes, where events are observed over a prolonged period, for example survival in cancer trials, it is important to also check that follow-up is sufficient and balanced by randomized group. By requesting follow-up that is as up to date as possible, and which may be substantially beyond the results reported in trial publications, transitory effects can be avoided and any benefits or harms of interventions that take a long time to accrue, such as late side effects of treatment or late recurrence of disease, can be picked up. For example, in an IPD meta-analysis of chemotherapy for soft tissue sarcoma (Sarcoma Meta-analysis Collaboration 1997), the median follow-up for trials reporting it ranged from 16 to 64 months, but increased to between 74 and 204 months when updated IPD were obtained (Stewart et al 2005).
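One simple way to examine whether follow-up is balanced by randomized group is to compare the median follow-up of participants who have not had an event, as in the Python sketch below. This is a simplification chosen for illustration (a reverse Kaplan-Meier estimate is a common alternative), and the input structure of (arm, days of follow-up, event indicator) triples is an assumption.

```python
from collections import defaultdict
from statistics import median


def median_follow_up(participants):
    """Median follow-up (days) per arm among participants without an event.

    participants is an iterable of (arm, days, event) with event 1 or 0.
    """
    by_arm = defaultdict(list)
    for arm, days, event in participants:
        if event == 0:  # only event-free participants inform follow-up here
            by_arm[arm].append(days)
    return {arm: median(days) for arm, days in by_arm.items()}
```

A marked difference between arms would suggest that follow-up has been updated or ascertained differently by group, and is worth raising with the investigators.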

26.3.4 Assessing the overall quality of a study

For any individual study, the results of the data and risk of bias checks should be considered together in order to build up an overall picture of the quality of the data supplied and study design and conduct. Any concerns should be brought diplomatically to the attention of the responsible study team, and any subsequent changes or updates to study data should be properly recorded. Many data issues turn out to be simple errors or misunderstandings that have minimal impact on the study or meta-analysis results (Burdett and Stewart 2002), and major problems are rare. However, these checks serve to improve understanding of the peculiarities of each study, and safeguard against occurrences of major problems in study data (Burdett and Stewart 2002). If such problems exist, or it is anticipated that the design or conduct of a study might introduce significant bias into the meta-analysis, it may need to be excluded.

26.4 Analysis of IPD

26.4.1 Analysis advantages

Having access to IPD for each study enables checking of analytical assumptions, thorough exploration of the data and consistent analysis across trials (Table 26.1.a). Also, outcomes and measures of risk and effect are derived directly from analysis of the IPD, so there is no need to rely on interpreting information and analyses presented in published reports, or to combine summary statistics from studies that have been analysed in different ways. Re-analysis of IPD also avoids any problems or limitations with the original analyses. For example, it should be possible to carry out analyses according to intention-to-treat principles even if the original or published trial analyses did not, to use more appropriate effect measures, and to perform sophisticated analyses that account for missing data.

As IPD offers the potential to analyse data in many different ways, it is particularly important that all methods relating to analysis are pre-specified in detail in the review protocol or analysis plan (Tierney et al 2015a) and are clearly reported in publications (Stewart et al 2015). This should include: outcomes and their definitions; methods for checking IPD and assessing risk of bias of included studies; methods for evaluating treatment effects, risks or test accuracy (including those for exploring variations by trial or patient characteristics); and methods for quantifying and accounting for heterogeneity. Unplanned analyses can still play an important role in explaining or adding to the results, but such exploratory analyses should be justified and clearly reported as such.

Statistical methods for the analysis of IPD can be complex and are described in more detail elsewhere (Debray et al 2015b). These methods are less well developed for prognostic or diagnostic test accuracy reviews than for intervention reviews based on randomized trials, so we outline some key principles for the re-analysis of IPD from randomized trials.

26.4.2 Assessing overall effects of interventions

It is important to stratify or account for clustering of participants in an IPD meta-analysis (Abo-Zaid et al 2013), because participants will have been recruited according to different study protocols. Combining IPD across studies, as though part of a single ‘mega’ trial, could lead to biased comparisons of interventions and over-precise estimates of effect (Tierney et al 2015a). To date, most IPD meta-analyses have used a two-stage approach to analysis (Simmonds et al 2005, Bowden et al 2011, Simmonds et al 2015), whereby each individual study is analysed independently in the first stage, reducing the IPD to summary statistics (i.e. aggregate data). In the second stage, these are combined to provide a pooled estimate of effect, in much the same way as for a conventional systematic review (Simmonds et al 2005). Thus, standard statistics and forest plots can be produced.
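As an illustration of the two-stage approach, the following Python sketch reduces each study's IPD to a log odds ratio and its variance, and then combines these using fixed-effect inverse-variance weighting. The data, the function names, and the choice of an odds ratio with fixed-effect pooling are assumptions made for this example rather than methods prescribed by this chapter. Studies with zero cells would need separate handling.

```python
import math

def expand_counts(treated_events, treated_no_events, control_events, control_no_events):
    """Build hypothetical IPD rows (treated, event) from a 2x2 table."""
    return ([(1, 1)] * treated_events + [(1, 0)] * treated_no_events
            + [(0, 1)] * control_events + [(0, 0)] * control_no_events)

def study_log_odds_ratio(participants):
    """Stage 1: reduce one study's IPD to (log odds ratio, variance)."""
    a = sum(1 for t, e in participants if t == 1 and e == 1)  # treated, event
    b = sum(1 for t, e in participants if t == 1 and e == 0)  # treated, no event
    c = sum(1 for t, e in participants if t == 0 and e == 1)  # control, event
    d = sum(1 for t, e in participants if t == 0 and e == 0)  # control, no event
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def pool_inverse_variance(estimates):
    """Stage 2: fixed-effect inverse-variance pooling of (estimate, variance) pairs."""
    weights = [1 / variance for _, variance in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    standard_error = math.sqrt(1 / sum(weights))
    return pooled, standard_error

# Hypothetical IPD for three studies (illustrative only, not real trial data).
ipd_by_study = {
    "study_1": expand_counts(10, 90, 20, 80),
    "study_2": expand_counts(15, 135, 25, 125),
    "study_3": expand_counts(8, 42, 12, 38),
}

# Each study is analysed separately, so participants are only ever
# compared with others recruited under the same protocol.
stage_one = {name: study_log_odds_ratio(rows) for name, rows in ipd_by_study.items()}
pooled_log_or, pooled_se = pool_inverse_variance(list(stage_one.values()))

ci_low = math.exp(pooled_log_or - 1.96 * pooled_se)
ci_high = math.exp(pooled_log_or + 1.96 * pooled_se)
print(f"Pooled OR {math.exp(pooled_log_or):.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The stage-one results are ordinary aggregate data, so they could equally be passed to standard meta-analysis software to produce forest plots and heterogeneity statistics.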

A one-stage model is typically a regression that estimates intervention effects while stratifying by study (e.g. by including an indicator variable for each study). It requires a higher degree of statistical expertise to implement, and interpretation is not as straightforward as for the more familiar two-stage approach. One- and two-stage meta-analyses often produce similar results. Where differences do occur, they may arise from different modelling assumptions rather than from the choice of one- versus two-stage (Burke et al 2017, Morris et al 2018). Yet, for some, a one-stage model seems preferable, and the use of such models has increased dramatically in recent years (Simmonds et al 2015, Fisher et al 2017). As it is difficult to derive standard meta-analysis statistics directly from a one-stage model, a compromise is to do a one-stage analysis to obtain estimates of effect, and a two-stage analysis to obtain further statistics and forest plots. Whichever approach is taken, it is important that the choice is specified in advance or that results for both approaches are reported (Stewart et al 2012).
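To make the idea of stratifying by study concrete, the sketch below fits the simplest possible one-stage model for a continuous outcome: a linear regression with a separate intercept for each study and a single common treatment effect. It is an assumption-laden illustration, not a recommended analysis: the data are invented, the function names are hypothetical, and a real one-stage analysis would more usually use mixed-effects modelling in statistical software. The treatment coefficient is computed by centring outcome and treatment within each study, which gives the same estimate as ordinary least squares with one indicator variable per study.

```python
from collections import defaultdict

def naive_pooled_difference(rows):
    """'Mega-trial' analysis: ignores which study each participant came from."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def stratified_one_stage_difference(rows):
    """Common treatment effect from a linear model with one intercept per study.

    Centring treatment and outcome within each study is equivalent to
    including a study indicator variable in an ordinary least squares fit.
    """
    by_study = defaultdict(list)
    for study, treated, outcome in rows:
        by_study[study].append((treated, outcome))

    cross_products = 0.0
    treatment_sum_of_squares = 0.0
    for members in by_study.values():
        mean_t = sum(t for t, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        cross_products += sum((t - mean_t) * (y - mean_y) for t, y in members)
        treatment_sum_of_squares += sum((t - mean_t) ** 2 for t, _ in members)
    return cross_products / treatment_sum_of_squares

# Hypothetical IPD rows: (study, treated, outcome). The true effect is +2 in
# both studies, but baselines and allocation ratios differ between studies.
rows = [
    ("A", 1, 12.0), ("A", 1, 12.0), ("A", 1, 12.0), ("A", 0, 10.0),
    ("B", 1, 22.0), ("B", 0, 20.0), ("B", 0, 20.0), ("B", 0, 20.0),
]

naive_estimate = naive_pooled_difference(rows)
stratified_estimate = stratified_one_stage_difference(rows)
print("Naive pooled difference:", naive_estimate)
print("Stratified one-stage difference:", stratified_estimate)
```

With these invented data the naive pooled difference is -3, whereas the stratified estimate recovers the within-study effect of 2. This shows how treating IPD as a single ‘mega’ trial can bias the comparison when studies differ in baseline risk and allocation ratio.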

26.4.3 Assessing if effects vary by trial characteristics

Exploring whether intervention effects vary by study characteristics is an important aspect of any meta-analysis, and can be readily investigated with IPD, using the same analytical approaches that are used for aggregate data (Deeks et al 2019). Thus, subgroup analysis might be used, whereby studies are grouped according to a particular characteristic such as drug type, and the effects compared indirectly between these groups. Alternatively, meta-regression might be used to explore whether the overall effect of an intervention varies in relation to a study treatment characteristic such as drug dose.
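As a rough sketch of the meta-regression idea, the Python example below regresses study-level effect estimates on a study-level characteristic (here, drug dose) using inverse-variance weights. It assumes a fixed-effect meta-regression with a single covariate. The doses, effects, variances and function name are all hypothetical, and a random-effects meta-regression would additionally include a between-study variance term.

```python
def weighted_meta_regression(covariate, effects, variances):
    """Fixed-effect meta-regression of study effects on one study-level covariate.

    Returns (slope, intercept, slope_standard_error), treating the
    within-study variances as known.
    """
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / total_weight
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total_weight
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_standard_error = (1 / sxx) ** 0.5
    return slope, intercept, slope_standard_error

# Hypothetical study-level data: dose (mg), log hazard ratio and its variance.
doses = [10, 20, 40, 40]
log_hazard_ratios = [-0.10, -0.22, -0.38, -0.45]
variances = [0.010, 0.020, 0.015, 0.030]

slope, intercept, slope_se = weighted_meta_regression(doses, log_hazard_ratios, variances)
print(f"Change in log hazard ratio per mg: {slope:.4f} (SE {slope_se:.4f})")
```

Because the comparison is made between studies, any apparent relationship with dose is observational and may be confounded by other differences between the studies.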

26.4.4 Assessing if effects vary by participant characteristics

Collecting IPD is the most reliable and often the only way to investigate whether intervention effects vary by participant characteristics, for example, whether an intervention is more or less effective in women compared to men (Stewart et al 1995, Stewart and Tierney 2002). Again, this can be done in two stages. In the first stage, interactions between gender and the intervention effect at the individual participant level are estimated within each study, and in the second stage these so-called ‘within-trial’ interactions are pooled across studies using standard meta-analysis techniques (Fisher et al 2011, Fisher et al 2017). In the widely used ‘subgroup analysis’ approach, each study is first split into subgroups, say men and women, and a meta-analysis of effects in men is compared with a meta-analysis of effects in women. Unfortunately, this approach conflates within-trial and across-trial interactions, so it is susceptible to bias and might best be avoided (Fisher et al 2011, Fisher et al 2017). Alternatively, a one-stage approach can be used, but to avoid bias, care must again be taken to distinguish within-study interactions from any between-study interactions (Riley et al 2008, Fisher et al 2011).
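The two-stage pooling of within-trial interactions can be sketched as follows. The example assumes that stage-one analysis of each study's IPD has already produced a treatment effect (for instance a log hazard ratio) and its variance separately for women and for men. The interaction within a study is then the difference between the two subgroup effects, and its variance is the sum of the two variances because the subgroups contain different participants. All numbers and function names are hypothetical, and fixed-effect pooling is an assumption of the example.

```python
import math

def within_trial_interaction(effect_women, var_women, effect_men, var_men):
    """Stage 1: interaction within one study as a difference of subgroup effects."""
    return effect_women - effect_men, var_women + var_men

def pool_inverse_variance(estimates):
    """Stage 2: fixed-effect inverse-variance pooling of (estimate, variance) pairs."""
    weights = [1 / variance for _, variance in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    standard_error = math.sqrt(1 / sum(weights))
    return pooled, standard_error

# Hypothetical stage-one output per study:
# (effect in women, variance, effect in men, variance).
subgroup_effects = {
    "study_1": (-0.5, 0.04, -0.1, 0.04),
    "study_2": (-0.3, 0.02, -0.1, 0.02),
}

# Only comparisons made inside each study contribute, so differences
# between studies cannot masquerade as a participant-level interaction.
interactions = [within_trial_interaction(*values) for values in subgroup_effects.values()]
pooled_interaction, pooled_se = pool_inverse_variance(interactions)

z = pooled_interaction / pooled_se
print(f"Pooled within-trial interaction: {pooled_interaction:.3f} (SE {pooled_se:.3f}, z = {z:.2f})")
```

By contrast, comparing a meta-analysis of effects in women with a meta-analysis of effects in men would also draw on differences between studies, which is the conflation the text warns against.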

Importantly, and irrespective of the analytical method, where multiple subgroups have been investigated and/or subgroup effects lack biological plausibility, results should be viewed with caution (Clarke and Halsey 2001). Where there is no particular evidence that trial or participant characteristics impact on the results, emphasis should be placed on the overall effects.

26.4.5 Software for IPD meta-analysis

Owing to the complexity and range of analyses possible with IPD, it is difficult for any software to accommodate fully all the analyses and plots required. One-stage meta-analysis typically requires mixed-effects or multilevel regression modelling, which can be achieved in a range of statistical software (Debray et al 2015b). These packages can also be used for the first stage of a two-stage approach, with the summary statistics then either combined in the second stage using a standard meta-analysis command (e.g. the metan command in Stata) or input into a separate meta-analysis package such as RevMan. The user-written Stata package ipdmetan (Fisher 2015) has been developed to facilitate two-stage IPD meta-analysis, by allowing the user to specify both the regression model to apply to each study in the first stage, and the meta-analytical method to apply in the second stage.

26.5 Reporting IPD reviews

Where possible, IPD reviews should be reported in accordance with the PRISMA-IPD guideline (Stewart et al 2015). This was developed as a standalone extension to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) (Moher et al 2009), to ensure that specific features of the IPD approach are addressed, such as the reporting of the methods used to obtain, check and synthesize IPD, and to deal with studies for which IPD were not available. PRISMA-IPD is geared to IPD reviews of efficacy, but much of it is also relevant to IPD reviews of, for example, diagnostic, prognostic and observational studies (Stewart et al 2015).

26.6 Appraising the quality of IPD reviews

Although IPD reviews clearly offer considerable advantages, and their use has increased across a range of healthcare areas (Simmonds et al 2015), not all are done or reported to the same standard (Riley et al 2010, Ahmed et al 2012). Moreover, the process of collecting, checking and analysing IPD is more complex than for aggregate data, and there are usually many more analyses to be reported, so it can be difficult to judge the quality of IPD reviews. This may, in turn, hinder their conduct and dissemination, and limit their influence on guidelines (Vale et al 2015) and new trials (Tierney et al 2015b). For example, an ad hoc IPD meta-analysis of randomized trials (e.g. from a single institution or company) may not include all studies of relevance, and therefore might give a biased or otherwise unrepresentative view of the effects of a particular intervention. By contrast, the quality of the included studies might be a more important determinant of reliability in an IPD meta-analysis of prognosis or diagnosis (Debray et al 2015a). Therefore, guidance has been prepared to help researchers, clinicians, patients, policy makers, funders and publishers understand, appraise and make best use of IPD reviews of randomized trials (Tierney et al 2015a), and of diagnostic and prognostic modelling studies (Debray et al 2015a).

26.7 Chapter information

Authors: Jayne F Tierney, Lesley A Stewart, Mike Clarke; on behalf of the Cochrane Individual Participant Data Meta-analysis Methods Group

Funding: Jayne F Tierney and coordination of the IPD Meta-analysis Methods Group are funded by the UK Medical Research Council (MC_UU_12023/24); Lesley A Stewart is funded by the University of York; and Mike Clarke is funded by Queen’s University Belfast.

26.8 References

Abo-Zaid G, Guo B, Deeks JJ, Debray TP, Steyerberg EW, Moons KG, Riley RD. Individual participant data meta-analyses should not ignore clustering. Journal of Clinical Epidemiology 2013; 66: 865-873.e4.

Ahmed I, Sutton AJ, Riley RD. Assessment of publication bias, selection bias, and unavailable data in meta-analyses using individual participant data: a database survey. BMJ 2012; 344: d7762.

Askie LM, Duley L, Henderson-Smart D, Stewart LA, on behalf of the PARIS Collaborative Group. Antiplatelet agents for prevention of pre-eclampsia: a meta-analysis of individual patient data. Lancet 2007; 369: 1791-1798.

Bowden J, Tierney JF, Simmonds M, Copas AJ. Individual patient data meta-analysis of time-to-event outcomes: one-stage versus two-stage approaches for estimating the hazard ratio under a random effects model. Research Synthesis Methods 2011; 2: 150-162.

Burdett S, Stewart LA. A comparison of the results of checked versus unchecked individual patient data meta-analyses. International Journal of Technology Assessment in Health Care 2002; 18: 619-624.

Burke DL, Ensor J, Riley RD. Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ. Statistics in Medicine 2017; 36: 855-875.

Clarke M, Stewart L, Pignon JP, Bijnens L. Individual patient data meta-analyses in cancer. British Journal of Cancer 1998; 77: 2036-2044.

Clarke M, Halsey J. DICE 2: a further investigation of the effects of chance in life, death and subgroup analyses. International Journal of Clinical Practice 2001; 55: 240-242.

Debray TP, Riley RD, Rovers MM, Reitsma JB, Moons KG, Cochrane IPD Meta-analysis Methods Group. Individual participant data (IPD) meta-analyses of diagnostic and prognostic modeling studies: guidance on their use. PLoS Medicine 2015a; 12: e1001886.

Debray TP, Moons KG, van Valkenhoef G, Efthimiou O, Hummel N, Groenwold RH, Reitsma JB, GetReal Methods Review Group. Get real in individual participant data (IPD) meta-analysis: a review of the methodology. Research Synthesis Methods 2015b; 6: 293-309.

Deeks JJ, Higgins JPT, Altman DG (editors). Chapter 10: Analysing data and undertaking meta-analyses. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated July 2019). Cochrane, 2019. Available from cochrane.org/handbook.

Dwan K, Altman DG, Cresswell L, Blundell M, Gamble CL, Williamson PR. Comparison of protocols and registry entries to published reports for randomised controlled trials. Cochrane Database of Systematic Reviews 2011; 1: MR000031.

Ensor J, Burke DL, Snell KIE, Hemming K, Riley RD. Simulation-based power calculations for planning a two-stage individual participant data meta-analysis. BMC Medical Research Methodology 2018; 18.

Fisher DJ, Copas AJ, Tierney JF, Parmar MKB. A critical review of methods for the assessment of patient-level interactions in individual patient data (IPD) meta-analysis of randomised trials, and guidance for practitioners. Journal of Clinical Epidemiology 2011; 64: 949-967.

Fisher DJ. Two-stage individual participant data meta-analysis and generalized forest plots. Stata Journal 2015; 15: 369-396.

Fisher DJ, Carpenter JR, Morris TP, Freeman SC, Tierney JF. Meta-analytical methods to identify who benefits most from treatments: daft, deluded, or deft approach? BMJ 2017; 356: j573.

Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savović J, Schulz KF, Weeks L, Sterne JAC, Cochrane Bias Methods Group, Cochrane Statistical Methods Group. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011; 343: d5928.

Kirkham JJ, Dwan KM, Altman DG, Gamble C, Dodd S, Smyth R, Williamson PR. The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ 2010; 340: c365.

Mhaskar R, Djulbegovic B, Magazin A, Soares HP, Kumar A. Published methodological quality of randomized controlled trials does not reflect the actual quality assessed in protocols. Journal of Clinical Epidemiology 2012; 65: 602-609.

Moher D, Liberati A, Tetzlaff J, Altman D, The PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Medicine 2009; 6: e1000097. doi:10.1371/journal.pmed.1000097.

Morris TP, Fisher DJ, Kenward MG, Carpenter JR. Meta-analysis of Gaussian individual patient data: Two-stage or not two-stage? Statistics in Medicine 2018; 37: 1419-1438.

Nevitt SJ, Sudell M, Weston J, Tudur Smith C, Marson AG. Antiepileptic drug monotherapy for epilepsy: a network meta-analysis of individual participant data. Cochrane Database of Systematic Reviews 2017; 12: CD011412.

Pan H, Gray R, Braybrooke J, Davies C, Taylor C, McGale P, Peto R, Pritchard KI, Bergh J, Dowsett M, Hayes DF, EBCTCG. 20-Year Risks of Breast-Cancer Recurrence after Stopping Endocrine Therapy at 5 Years. New England Journal of Medicine 2017; 377: 1836-1846.

Riley RD, Lambert PC, Staessen JA, Wang J, Gueyffier F, Thijs L, Boutitie F. Meta-analysis of continuous outcomes combining individual patient data and aggregate data. Statistics in Medicine 2008; 27: 1870-1893.

Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ 2010; 340: c221.

Sarcoma Meta-analysis Collaboration. Adjuvant chemotherapy for localised resectable soft tissue sarcoma in adults: meta-analysis of individual patient data. Lancet 1997; 350: 1647-1654.

Sargent DJ, Patiyil S, Yothers G, Haller DG, Gray R, Benedetti J, Buyse M, Labianca R, Seitz JF, O'Callaghan CJ, Francini G, Grothey A, O'Connell M, Catalano PJ, Kerr D, Green E, Wieand HS, Goldberg RM, de Gramont A, ACCENT Group. End points for colon cancer adjuvant trials: observations and recommendations based on individual patient data from 20,898 patients enrolled onto 18 randomized trials from the ACCENT Group. Journal of Clinical Oncology 2007; 25: 4569-4574.

Simmonds M, Stewart G, Stewart L. A decade of individual participant data meta-analyses: A review of current practice. Contemporary Clinical Trials 2015.

Simmonds MC, Higgins JPT, Stewart LA, Tierney JF, Clarke MJ, Thompson SG. Meta-analysis of individual patient data from randomised trials - a review of methods used in practice. Clinical Trials 2005; 2: 209-217.

Sinicrope FA, Foster NR, Yothers G, Benson A, Seitz JF, Labianca R, Goldberg RM, de Gramont A, O'Connell MJ, Sargent DJ, Adjuvant Colon Cancer Endpoints (ACCENT) Group. Body mass index at diagnosis and survival among colon cancer patients enrolled in clinical trials of adjuvant chemotherapy. Cancer 2013; 119: 1528-1536.

Sterne JAC, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, Henry D, Altman DG, Ansari MT, Boutron I, Carpenter JR, Chan AW, Churchill R, Deeks JJ, Hróbjartsson A, Kirkham J, Jüni P, Loke YK, Pigott TD, Ramsay CR, Regidor D, Rothstein HR, Sandhu L, Santaguida PL, Schünemann HJ, Shea B, Shrier I, Tugwell P, Turner L, Valentine JC, Waddington H, Waters E, Wells GA, Whiting PF, Higgins JPT. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016; 355: i4919.

Stewart GB, Altman DG, Askie LM, Duley L, Simmonds MC, Stewart LA. Statistical analysis of individual participant data meta-analyses: a comparison of methods and recommendations for practice. PLoS One 2012; 7: e46042.

Stewart L, Tierney J, Burdett S. Do systematic reviews based on individual patient data offer a means of circumventing biases associated with trial publications? In: Rothstein H, Sutton A, Borenstein M, editors. Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments. Chichester: John Wiley & Sons; 2005. p. 261-286.

Stewart LA, Clarke MJ, on behalf of the Cochrane Working Party Group on Meta-analysis using Individual Patient Data. Practical methodology of meta-analyses (overviews) using updated individual patient data. Statistics in Medicine 1995; 14: 2057-2079.

Stewart LA, Tierney JF. To IPD or Not to IPD? Advantages and disadvantages of systematic reviews using individual patient data. Evaluation and the Health Professions 2002; 25: 76-97.

Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, Tierney JF, PRISMA-IPD Development Group. Preferred reporting items for a systematic review and meta-analysis of individual participant data: the PRISMA-IPD statement. JAMA 2015; 313: 1657-1665.

Tierney JF, Stewart LA. Investigating patient exclusion bias in meta-analysis. International Journal of Epidemiology 2005; 34: 79-87.

Tierney JF, Vale CL, Riley R, Tudur Smith C, Stewart LA, Clarke M, Rovers M. Individual participant data (IPD) meta-analyses of randomised controlled trials: Guidance on their use. PLoS Medicine 2015a; 12: e1001855. https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001855

Tierney JF, Pignon J-P, Gueyffier F, Clarke M, Askie L, Vale CL, Burdett S. How individual participant data meta-analyses can influence trial design and conduct. Journal of Clinical Epidemiology 2015b; 68: 1325-1335.

Tudur Smith C, Clarke M, Marson T, Riley R, Stewart L, Tierney J, Vail A, Williamson P. A framework for deciding if individual participant data are likely to be worthwhile (oral session). 23rd Cochrane Colloquium; 2015; Vienna, Austria. https://abstracts.cochrane.org/2015-vienna/framework-deciding-if-individual-participant-data-are-likely-be-worthwhile.

Tudur Smith C, Marcucci M, Nolan SJ, Iorio A, Sudell M, Riley R, Rovers MM, Williamson PR. Individual participant data meta-analyses compared with meta-analyses based on aggregate data. Cochrane Database of Systematic Reviews 2016; 9: MR000007.

Vale CL, Tierney JF, Burdett S. Can trial quality be reliably assessed from published reports of cancer trials: evaluation of risk of bias assessments in systematic reviews. BMJ 2013; 346: f1798.

Vale CL, Rydzewska LHM, Rovers MM, Emberson JR, Gueyffier F, Stewart LA. Uptake of systematic reviews and meta-analyses based on individual participant data in clinical practice guidelines: descriptive study. BMJ 2015; 350: h1088.

Veroniki AA, Straus SE, Ashoor H, Stewart LA, Clarke M, Tricco AC. Contacting authors to retrieve individual patient data: study protocol for a randomized controlled trial. Trials 2016; 17: 138.

Veroniki AA, Rios P, Le S, Mavridis D, Stewart L, Clarke M, Ashoor H, Straus S, Tricco A. Obtaining individual patient data depends on study characteristics and can take longer than a year after a positive response. Journal of Clinical Epidemiology 2019; in press.