This vignette uses the bundled
dataset_real_cancer_drivers_4 dataset to illustrate a real
biological analysis: how do four canonical cancer driver catalogs
overlap?
The four sources are:
The lists differ greatly in size: Vogelstein is the smallest curated set, while OncoKB is the most permissive at this annotation tier.
The dataset was built from a 20,000-gene background
(universe_size):
This is the population N used in the hypergeometric
over-representation tests (see
vignette("v05_statistics_deep_dive")).
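To make the role of N concrete, here is a minimal base-R sketch of a one-sided hypergeometric over-representation test for a single pair of sets. Only the 20,000-gene universe comes from this vignette; the set sizes and the overlap are hypothetical numbers chosen for illustration, and the package's own implementation may differ in detail.

```r
# Hypothetical sizes, for illustration only (not taken from the bundled dataset).
N      <- 20000  # background universe size (universe_size in this vignette)
size_a <- 125    # genes in set A (assumed)
size_b <- 700    # genes in set B (assumed)
k      <- 90     # genes shared by A and B (assumed)

# P(X >= k): draw size_b genes from a universe holding size_a "hits"
# and N - size_a "misses", then ask how likely an overlap of k or more is.
p_value <- phyper(k - 1, m = size_a, n = N - size_a, k = size_b,
                  lower.tail = FALSE)

# Overlap expected if the two sets were independent draws from the universe.
expected <- size_a * size_b / N
```

A larger universe lowers the expected overlap, so the same observed intersection looks more surprising; this is why the choice of background matters for the p-values reported later.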
The default model for 4 sets is venn-4-set
(Edwards-style).
broom::glance() returns a one-row tibble with the
headline numbers:
The default render uses the dataset’s set names as labels. To shorten them for the diagram, pass a per-letter override:
svg <- render_venn_svg(
  result,
  set_names = c(A = "Vogelstein", B = "COSMIC", C = "OncoKB", D = "IntOGen"),
  title = "Cancer driver overlap (4 sources)"
)
nchar(svg)
(See vignette("v08_custom_styling_and_export") for color
overrides and post-render SVG manipulation.)
For 4+ sets, an UpSet plot is often easier to read than the Venn diagram — each intersection size is a bar, sorted by cardinality.
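The numbers behind such a plot can be sketched in base R: assign each gene to the exact combination of sets that contains it, count the genes per combination, and sort the counts. The gene identifiers and memberships below are made up for illustration, and this is not how render_upset computes its input.

```r
# Toy sets with made-up gene identifiers (not the real catalogs).
sets <- list(
  A = c("g1", "g2", "g3", "g4", "g5"),
  B = c("g1", "g2", "g3", "g6"),
  C = c("g1", "g4", "g6", "g7"),
  D = c("g1", "g7")
)

genes <- sort(unique(unlist(sets)))

# Logical matrix: one row per gene, one column per set.
membership <- sapply(sets, function(s) genes %in% s)
rownames(membership) <- genes

# Label each gene with the exact combination of sets it belongs to.
region <- apply(membership, 1, function(row) {
  paste(names(sets)[row], collapse = "&")
})

# Exclusive intersection sizes, largest first (the bar heights of an UpSet plot).
sizes <- sort(table(region), decreasing = TRUE)
sizes
```

With four sets there are at most 2^4 - 1 = 15 non-empty combinations, which is why a sorted bar chart tends to stay readable where a four-set Venn diagram gets crowded.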
(The chunk above is gated on R >= 4.6 because the
CRAN release of ComplexUpset (1.3.3) is incompatible with
ggplot2 >= 4.0 on older R — see
?vennDiagramLab::render_upset for context.)
broom::tidy() returns one row per set pair, with all
five pairwise metrics plus the BH-FDR-adjusted hypergeometric
p-value:
top_pairs <- broom::tidy(result)
top_pairs[order(top_pairs$p_adjusted), c("set_a", "set_b", "intersection",
                                         "jaccard", "p_adjusted",
                                         "significant")]
Every pair is significant at FDR < 0.05 (as expected: these catalogs are designed to overlap on biology).
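For readers who want to see where such columns could come from, the following base-R sketch computes the intersection size, the Jaccard index, and a BH-adjusted hypergeometric p-value for every pair of toy sets. The column names mirror the ones selected above, except p_value, which is an assumed name; the sets are made up, only one of the five pairwise metrics is shown, and the package may compute these values differently.

```r
# Toy sets with made-up gene identifiers; N matches the vignette's universe size.
sets <- list(
  A = paste0("g", 1:40),
  B = paste0("g", 21:80),
  C = paste0("g", 31:50)
)
N <- 20000

# One row per unordered pair of sets.
pairs <- t(combn(names(sets), 2))
res <- data.frame(set_a = pairs[, 1], set_b = pairs[, 2],
                  stringsAsFactors = FALSE)

res$intersection <- unname(mapply(function(a, b) {
  length(intersect(sets[[a]], sets[[b]]))
}, res$set_a, res$set_b))

# Jaccard index: |A and B| / |A or B|.
res$jaccard <- unname(mapply(function(a, b) {
  length(intersect(sets[[a]], sets[[b]])) / length(union(sets[[a]], sets[[b]]))
}, res$set_a, res$set_b))

# One-sided hypergeometric p-value, P(overlap >= observed).
res$p_value <- unname(mapply(function(a, b, k) {
  m <- length(sets[[a]])
  phyper(k - 1, m = m, n = N - m, k = length(sets[[b]]), lower.tail = FALSE)
}, res$set_a, res$set_b, res$intersection))

# Benjamini-Hochberg adjustment across all pairs, then the FDR < 0.05 call.
res$p_adjusted  <- p.adjust(res$p_value, method = "BH")
res$significant <- res$p_adjusted < 0.05
res[order(res$p_adjusted), ]
```

The adjustment is applied across all pairs at once, so adding more sets increases the number of tests and can raise the adjusted p-values of borderline pairs.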
broom::augment() returns one row per gene with
set-membership flags and the region label.
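The shape of such a per-gene table can be illustrated with a small base-R sketch. The column names (gene, A to D, region) and the gene identifiers are hypothetical and need not match what broom::augment() returns for this package.

```r
# Toy sets with made-up gene identifiers.
sets <- list(
  A = c("g1", "g2", "g3"),
  B = c("g1", "g2"),
  C = c("g1", "g4"),
  D = c("g1")
)

genes <- sort(unique(unlist(sets)))

# One logical membership column per set.
flags <- as.data.frame(lapply(sets, function(s) genes %in% s))

per_gene <- data.frame(gene = genes, flags, stringsAsFactors = FALSE)

# Region label: concatenated letters of the sets that contain the gene.
per_gene$region <- apply(flags, 1, function(row) {
  paste(names(flags)[row], collapse = "")
})

# Example query: genes present in all four sets.
core <- per_gene$gene[per_gene$region == "ABCD"]
```

A table in this form makes it straightforward to filter for a consensus set or to join the membership flags onto other per-gene annotations.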
vignette("v05_statistics_deep_dive"): interpret the Jaccard / Dice / hypergeometric numbers in detail.
vignette("v07_pdf_reports"): turn this analysis into a multi-page PDF.
vignette("v08_custom_styling_and_export"): customize colors, embed in a ggplot, export to PDF/PNG.