How to use ICEMAN

Users and applications

Users of ICEMAN include:

  • Trial investigators who are planning or considering the results of a subgroup analysis;
  • Meta-analysts who are planning or considering the results of a subgroup analysis;
  • Authors of systematic reviews and clinical practice guidelines who assess subgroup claims made in published reports of trials or meta-analyses;
  • Journal editors, referees, methods consultants, and others concerned with the quality of subgroup analyses in trials or meta-analyses.

Assessment in duplicate

Confidence in the assessment increases if two investigators independently apply ICEMAN, discuss discrepancies, and present a consensus version.
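The comparison step of a duplicate assessment can be sketched in a few lines of Python. This is only an illustration: the reviewer responses are hypothetical, the function name `find_discrepancies` is our own, and the codes are written as plain ASCII strings. The consensus itself still has to come from discussion between the two investigators.

```python
# Hypothetical item-level responses from two independent reviewers.
# Keys are ICEMAN item numbers; None marks "not applicable" or no code.
reviewer_a = {1: "+", 2: "+", 3: "+", 4: "-", 5: None, 6: None}
reviewer_b = {1: "+", 2: "-", 3: "+", 4: "-", 5: None, 6: None}


def find_discrepancies(a, b):
    """Return the sorted item numbers on which two assessments disagree.

    An item missing from one assessment is treated like None, so it only
    counts as a discrepancy if the other reviewer assigned a code.
    """
    items = sorted(set(a) | set(b))
    return [item for item in items if a.get(item) != b.get(item)]


print("Items to discuss:", find_discrepancies(reviewer_a, reviewer_b))
```

The flagged items are the ones to resolve by discussion before the consensus version is reported.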

Reporting

We recommend specifying use of ICEMAN in the study protocol, and in the methods, results, and interpretation sections of the final publication:

  • Protocol: “We will assess the credibility of potentially relevant effect modification using ICEMAN.”
  • Methods: “We used ICEMAN to assess the credibility of potentially relevant effect modification.”
  • Results: “We judged the credibility of the potential effect modification as low, with uncertainty arising from lack of prior evidence and an inconclusive test of interaction (see supplement).”
  • Interpretation: “A formal credibility assessment rated the apparent effect modification as likely spurious. We recommend considering the overall effect estimate for all patients.”

Warning: We do not recommend reporting overall credibility as a percentage (e.g. “30% credible”).

Users can download the fillable ICEMAN form for RCTs or meta-analyses, complete it on their device, and attach it to the appendix of their publication. Alternatively, the online tool generates a downloadable reporting table directly.

We also provide four table templates for summarising one or more ICEMAN assessments in a publication. All templates can be downloaded (.docx).

Legend for all tables: (--) definitely reduces credibility; (-) probably reduces credibility or unclear; (+) probably increases credibility; (++) definitely increases credibility. Not applicable items receive no code.
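For anyone generating these tables programmatically, the legend can be held in a small lookup. The sketch below is one possible way to do this, not part of ICEMAN itself: the names `LEGEND` and `format_response` are our own, and the "definitely reduces" code is assumed to be two ASCII minus signs.

```python
# Codes and descriptions copied from the legend above.
LEGEND = {
    "--": "definitely reduces credibility",
    "-": "probably reduces credibility or unclear",
    "+": "probably increases credibility",
    "++": "definitely increases credibility",
}


def format_response(code, label):
    """Render a table cell such as '(+) Probably yes'.

    A code of None marks a not applicable item, which receives no code.
    """
    if code is None:
        return label
    if code not in LEGEND:
        raise ValueError(f"Unknown credibility code: {code!r}")
    return f"({code}) {label}"


print(format_response("+", "Probably yes"))
print(format_response(None, "Not applicable"))
```

Rejecting unknown codes early helps catch typographic variants, such as a dash character substituted for a minus sign, before they reach a published table.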

Template 1: Single assessment (RCT). Full item-level table for one outcome, effect measure, and effect modifier in an RCT.

| Item | Response | Rationale |
|---|---|---|
| 1. Direction of effect modification hypothesized a priori? | (+) Probably yes | Direction stated before analysis. |
| 2. Effect modification supported by prior evidence? | (+) Some support | Consistent indirect evidence. |
| 3. Chance unlikely explanation of effect modification? | (+) Chance may not explain | Interaction p = 0.008. |
| 4. Few effect modifiers tested or multiplicity considered? | (-) Probably no or unclear | Six modifiers tested; no adjustment. |
| 5. Arbitrary cut points avoided for continuous modifier? | Not applicable | |
| 6. Additional credibility considerations? | None | No additional concerns. |
| Overall credibility | Moderate: likely effect modification; use separate subgroup effects, but note uncertainty | Multiplicity lowered credibility. |

Template 2: Single assessment (meta-analysis). Full item-level table for one outcome, effect measure, and effect modifier in a meta-analysis.

| Item | Response | Rationale |
|---|---|---|
| 1. Based on within- rather than between-trial comparison? | (+) Mostly within | Most information came from within-trial subgroup comparisons. |
| 2. Effect modification similar across trials? | (+) Mostly similar | Trial-specific estimates had similar direction, with some variation in magnitude. |
| 3. Number of trials large for between-trial comparison? | Not applicable | Assessment did not rely on between-trial comparison. |
| 4. Direction of effect modification hypothesized a priori? | (+) Probably yes | Direction stated before analysis. |
| 5. Chance unlikely explanation of effect modification? | (+) Chance may not explain | Meta-regression p = 0.008. |
| 6. Few effect modifiers tested or multiplicity considered? | (-) Probably no or unclear | Several modifiers tested; no adjustment. |
| 7. Random-effects model used? | (++) Definitely yes | Authors explicitly used a random-effects model. |
| 8. Arbitrary cut points avoided for continuous modifier? | Not applicable | |
| 9. Additional credibility considerations? | None | No additional concerns. |
| Overall credibility | Moderate: likely effect modification; use separate subgroup effects, but note uncertainty | Multiplicity lowered credibility. |

Template 3: Multiple effect modifiers. Compact table comparing ICEMAN items across several candidate effect modifiers.

| Item | Age | Sex | Diabetes status | Baseline severity |
|---|---|---|---|---|
| Interaction p-value | 0.008 | 0.04 | 0.003 | 0.07 |
| 1. Direction a priori? | (+) Probably yes | (-) Probably no or unclear | (++) Definitely yes | (-) Probably no or unclear |
| 2. Prior evidence? | (+) Some support | (-) Little or no support | (++) Strong support | (+) Some support |
| 3. Chance unlikely? | (+) Chance may not explain | (-) Chance likely | (++) Chance unlikely | (-) Chance likely |
| 4. Multiplicity? | (-) Probably no or unclear | (-) Probably no or unclear | (-) Probably no or unclear | (-) Probably no or unclear |
| 5. Arbitrary cut points? | Not applicable | Not applicable | Not applicable | (-) Probably no or unclear |
| 6. Additional? | None | None | None | None |
| Overall credibility | Moderate: likely effect modification, but uncertainty remains | Low: some but insufficient support | High: very likely effect modification | Very low: minimal to no support |

Template 4: Summary table. One row per assessment; suitable for summarising several assessments across outcomes and effect modifiers.

| Outcome | Effect measure | Effect modifier | Interaction p | Main credibility concerns | Overall credibility |
|---|---|---|---|---|---|
| 30-day mortality | Risk ratio | Age | 0.008 | Multiplicity not addressed | Moderate |
| 30-day mortality | Risk ratio | Sex | 0.04 | Direction not prespecified; weak prior evidence; multiplicity not addressed | Low |
| Stroke at 1 year | Odds ratio | Diabetes status | 0.003 | None major | High |
| Pain at 6 months | Mean difference | Baseline severity | 0.07 | Weak statistical support; arbitrary cut point | Very low |
| Serious adverse events | Risk difference | Age | 0.14 | ICEMAN not applied (p > 0.1) | Not assessed |

Footnote: Each row represents one candidate effect modification (one outcome, effect measure, and effect modifier). ICEMAN was applied only when the interaction p-value was 0.1 or smaller. Full item-level assessments appear in a supplement.
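The screening rule in the Template 4 footnote (apply ICEMAN only when the interaction p-value is 0.1 or smaller) can be expressed as a simple filter. The sketch below reuses the example rows from the template. The names `P_THRESHOLD` and `needs_iceman` are our own, and the threshold is the one used in this example table rather than a fixed ICEMAN requirement.

```python
P_THRESHOLD = 0.1  # threshold stated in the Template 4 footnote

# Example rows from Template 4: (outcome, effect modifier, interaction p)
candidates = [
    ("30-day mortality", "Age", 0.008),
    ("30-day mortality", "Sex", 0.04),
    ("Stroke at 1 year", "Diabetes status", 0.003),
    ("Pain at 6 months", "Baseline severity", 0.07),
    ("Serious adverse events", "Age", 0.14),
]


def needs_iceman(p_value, threshold=P_THRESHOLD):
    """True when the interaction p-value is at or below the threshold."""
    return p_value <= threshold


to_assess = [row for row in candidates if needs_iceman(row[2])]
not_assessed = [row for row in candidates if not needs_iceman(row[2])]

for outcome, modifier, p in not_assessed:
    print(f"{outcome} / {modifier}: ICEMAN not applied (p = {p})")
```

Rows that fail the screen stay in the summary table, marked as not assessed, so readers can see every effect modifier that was tested.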

Using ICEMAN with other instruments

ICEMAN can be combined with the Cochrane Risk of Bias tool for RCTs [1] or the ROBIS tool for systematic reviews [2], and with the GRADE framework [3]:

  • Moderate or high credibility: Apply GRADE to subgroup-specific estimates. Note remaining uncertainty if moderate. Using subgroup-specific estimates may sometimes resolve concerns about heterogeneity and thereby increase the certainty of evidence and the strength of the recommendation.
  • Low or very low credibility: Apply GRADE to the overall effect estimate. Note remaining uncertainty if low, especially if the potential effect modification appears to explain heterogeneity.
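The two rules above amount to a small decision table, sketched here in Python. The function name `grade_target` and its return format are our own illustration. It only records which estimate GRADE is applied to and whether remaining uncertainty should be noted, and it does not perform any GRADE rating.

```python
def grade_target(credibility):
    """Map an overall ICEMAN credibility rating to the estimate GRADE is applied to.

    Returns (target, note_uncertainty), where note_uncertainty is True for
    the two borderline ratings (moderate and low).
    """
    level = credibility.strip().lower()
    if level in ("moderate", "high"):
        return "subgroup-specific estimates", level == "moderate"
    if level in ("low", "very low"):
        return "overall effect estimate", level == "low"
    raise ValueError(f"Unknown credibility rating: {credibility!r}")


for rating in ("High", "Moderate", "Low", "Very low"):
    target, note = grade_target(rating)
    suffix = " (note remaining uncertainty)" if note else ""
    print(f"{rating}: apply GRADE to {target}{suffix}")
```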

References

1. Higgins J, Sterne J, Savović J, Page M, Hróbjartsson A, Boutron I, Reeves B, Eldridge S. A revised tool for assessing risk of bias in randomized trials. Cochrane Database of Systematic Reviews. 2016;10:29–31.
2. Whiting P, Savović J, Higgins JPT, Caldwell DM, Reeves BC, Shea B, Davies P, Kleijnen J, Churchill R, ROBIS group. ROBIS: A new tool to assess risk of bias in systematic reviews was developed [Internet]. J. Clin. Epidemiol. 2016 Jan;69:225–234. Available from: http://dx.doi.org/10.1016/j.jclinepi.2015.06.005
3. Guyatt GH, Oxman AD, Kunz R, Woodcock J, Brozek J, Helfand M, Alonso-Coello P, Glasziou P, Jaeschke R, Akl EA, Norris S, Vist G, Dahm P, Shukla VK, Higgins J, Falck-Ytter Y, Schünemann HJ, GRADE Working Group. GRADE guidelines: 7. Rating the quality of evidence–inconsistency [Internet]. J. Clin. Epidemiol. 2011 Dec;64(12):1294–1302. Available from: http://dx.doi.org/10.1016/j.jclinepi.2011.03.017