Title: | Create Complex UpSet Plots Using 'ggplot2' Components |
---|---|
Description: | UpSet plots are an improvement over Venn Diagram for set overlap visualizations. Striving to bring the best of the 'UpSetR' and 'ggplot2', this package offers a way to create complex overlap visualisations, using simple and familiar tools, i.e. geoms of 'ggplot2'. For introduction to UpSet concept, see Lex et al. (2014) <doi:10.1109/TVCG.2014.2346248>. |
Authors: | Michał Krassowski [aut, cre] |
Maintainer: | Michał Krassowski <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.3.6 |
Built: | 2025-01-03 03:43:36 UTC |
Source: | https://github.com/krassowski/complex-upset |
Generate mapping for labeling percentages
aes_percentage(relative_to, digits = 0, sep = "")
relative_to |
defines proportion that should be calculated, relative to |
digits |
number of digits to show (default=0) |
sep |
separator between the digit and percent sign (no separator by default) |
Arrange points for Venn diagram
arrange_venn( data, sets = NULL, radius = 1.5, max_iterations = 10, verbose = FALSE, outwards_adjust = 1.3, extract_sets = FALSE, extract_regions = FALSE, repeat_in_intersections = FALSE, starting_grid_size = "auto" )
data |
a dataframe including binary columns representing membership in sets |
sets |
vector with names of columns representing membership in sets |
radius |
the radius of the circle |
max_iterations |
the maximal number of iterations |
verbose |
should debugging notes be printed? |
outwards_adjust |
the multiplier defining the distance from the centre |
extract_sets |
should only sets be extracted? |
extract_regions |
should all unique regions be extracted? |
repeat_in_intersections |
repeat intersection k times where k is the number of sets it belongs to? |
starting_grid_size |
the starting size of the grid for placement of elements |
Compare covariates between intersections
compare_between_intersections( data, intersect, test = kruskal.test, tests = list(), ignore = list(), ignore_mode_columns = TRUE, mode = "exclusive_intersection", ... )
data |
a dataframe including binary columns representing membership in classes |
intersect |
which columns should be used to compose the intersection |
test |
the default test function; it is expected to accept |
tests |
a named list with tests for specific variables, overwriting the default test |
ignore |
a list with names of variables to exclude from testing |
ignore_mode_columns |
whether the membership columns and size columns for all modes should be ignored |
mode |
region selection mode; note that modes other than |
... |
passed to |
Create an example dataset with three sets: A, B and C
create_upset_abc_example()
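A minimal sketch of loading the example data, assuming the ComplexUpset package is installed. The exact columns and number of rows are not stated in this reference, so the code only inspects whatever is returned.

```r
library(ComplexUpset)

# Build the bundled example dataset with three sets: A, B and C
abc_data <- create_upset_abc_example()

# Inspect the structure before plotting; no particular layout is assumed here
str(abc_data)
head(abc_data)
```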
Circle for Venn diagram
geom_venn_circle( data, mapping = aes_(), sets = NULL, radius = 1.5, resolution = 100, size = 0.8, color = "black", ... )
data |
a dataframe including binary columns representing membership in sets |
mapping |
the aesthetics mapping |
sets |
vector with names of columns representing membership in sets |
radius |
the radius of the circle |
resolution |
the resolution of the circle rasterizer |
size |
width of the outline |
color |
the color of the outline |
... |
Arguments passed on to
|
Label for a region of Venn diagram
geom_venn_label_region( data, mapping = aes_(), sets = NULL, outwards_adjust = 1.3, fill = alpha("white", 0.85), size = 5, label.size = 0, ... )
data |
a dataframe including binary columns representing membership in sets |
mapping |
the aesthetics mapping |
sets |
vector with names of columns representing membership in sets |
outwards_adjust |
the multiplier defining the distance from the centre |
fill |
the fill of the label |
size |
the text size |
label.size |
the size of the label outline |
... |
Arguments passed on to
|
Label for a set of Venn diagram
geom_venn_label_set( data, mapping = aes_(), sets = NULL, outwards_adjust = 2.5, fill = alpha("white", 0.85), size = 5, label.size = 0, ... )
data |
a dataframe including binary columns representing membership in sets |
mapping |
the aesthetics mapping |
sets |
vector with names of columns representing membership in sets |
outwards_adjust |
the multiplier defining the distance from the centre |
fill |
the fill of the label |
size |
the text size |
label.size |
the size of the label outline |
... |
Arguments passed on to
|
Region of Venn diagram
geom_venn_region(data, mapping = aes_(), sets = NULL, resolution = 250, ...)
data |
a dataframe including binary columns representing membership in sets |
mapping |
the aesthetics mapping |
sets |
vector with names of columns representing membership in sets |
resolution |
the resolution of the circle rasterizer |
... |
Arguments passed on to
|
Retrieve symbol for given mode that can be used in aesthetics mapping with double bang (!!)
get_size_mode(mode, suffix = "_size")
mode |
the mode to use. Accepted values: |
suffix |
the column suffix in use as passed to |
Prepare layers for the intersection matrix plot
intersection_matrix( geom = geom_point(size = 3), segment = geom_segment(), outline_color = list(active = "black", inactive = "grey70") )
geom |
a geom_point call, allowing to specify parameters (e.g. |
segment |
a geom_segment call, allowing to specify parameters (e.g. |
outline_color |
a named list with two colors for outlines of active and inactive dots |
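The three arguments above could be combined as in the following sketch. It assumes the example data exposes sets named A, B and C; the shape, line type and colour values are illustrative choices, not package defaults.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()
sets <- c("A", "B", "C")  # assumed column names of the example data

# Customise the dots, the connecting segments and the dot outlines
matrix_plot <- upset(
  abc_data,
  sets,
  matrix = intersection_matrix(
    geom = geom_point(shape = "square", size = 3.5),
    segment = geom_segment(linetype = "dotted"),
    outline_color = list(active = "darkorange3", inactive = "grey70")
  )
)
matrix_plot
```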
A large intersection size can be driven by a large number of members in a group; to account for that, one can divide the intersection size by the size of a union of the same groups. This cannot be calculated for the null intersection (observations which do not belong to any of the groups).
intersection_ratio( mapping = aes(), counts = TRUE, bar_number_threshold = 0.75, text_colors = c(on_background = "black", on_bar = "white"), text = list(), text_mapping = aes(), mode = "distinct", denominator_mode = "union", width = 0.9, ... )
mapping |
additional aesthetics for |
counts |
whether to display count number labels above the bars |
bar_number_threshold |
if less than one, labels for bars height greater than this threshold will be placed on (not above) the bars |
text_colors |
a named vector of characters specifying the color when |
text |
additional parameters passed to |
text_mapping |
additional aesthetics for |
mode |
region selection mode, defines which intersection regions will be accounted for when computing the size. See |
denominator_mode |
region selection mode for computing the denominator in ratio. See |
width |
bar width, by default set to 90% |
... |
Arguments passed on to
|
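A hedged sketch of showing the ratio next to the plain sizes. Because the ratio cannot be calculated for the null intersection, the example passes `min_degree = 1` through `upset()` to drop it; the set names A, B and C are assumed from the example dataset.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Two base annotations: the size bars (without count labels) and the ratio bars
ratio_plot <- upset(
  abc_data,
  c("A", "B", "C"),
  min_degree = 1,  # exclude observations that belong to no set
  base_annotations = list(
    "Intersection size" = intersection_size(counts = FALSE),
    "Intersection ratio" = intersection_ratio()
  )
)
ratio_plot
```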
Barplot annotation of intersections sizes
intersection_size( mapping = aes(), counts = TRUE, bar_number_threshold = 0.85, text_colors = c(on_background = "black", on_bar = "white"), text = list(), text_mapping = aes(), mode = "distinct", position = position_stack(), width = 0.9, ... )
mapping |
additional aesthetics for |
counts |
whether to display count number labels above the bars |
bar_number_threshold |
if less than one, labels for bars height greater than this threshold will be placed on (not above) the bars |
text_colors |
a named vector of characters specifying the color when |
text |
additional parameters passed to |
text_mapping |
additional aesthetics for |
mode |
region selection mode, defines which intersection regions will be accounted for when computing the size. See |
position |
position passed to |
width |
bar width, by default set to 90% |
... |
Arguments passed on to
|
upset_set_size()
Inspired by Brian Diggs' answer which is CC-BY-SA 4.0.
reverse_log_trans(base = 10)
base |
logarithm base (default 10) |
Color scale for Venn diagram
scale_color_venn_mix( data, sets = NULL, colors = c("red", "blue", "green"), na.value = "grey40", highlight = NULL, active_color = "orange", inactive_color = "NA", scale = scale_color_manual, ... )
data |
a dataframe including binary columns representing membership in sets |
sets |
vector with names of columns representing membership in sets |
colors |
named list of colors for sets (one set=one color) |
na.value |
value for elements not belonging to any of the sets |
highlight |
which regions of the diagram to highlight |
active_color |
color for highlight |
inactive_color |
color for lack of highlight |
scale |
the base scale (default= |
... |
Arguments passed on to
|
Fill scale for Venn diagram
scale_fill_venn_mix(..., na.value = "NA")
... |
Arguments passed on to
|
na.value |
value for elements not belonging to any of the known sets |
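The Venn helpers documented above are meant to be layered on a regular ggplot. The sketch below is one possible composition, under these assumptions: the example data has sets A, B and C, and `arrange_venn()` returns columns named `x`, `y`, `region` and `size` (the names used for the aesthetics here are not confirmed by this reference).

```r
library(ComplexUpset)
library(ggplot2)

set.seed(0)  # fix the seed in case the point arrangement involves randomness
abc_data <- create_upset_abc_example()
sets <- c("A", "B", "C")  # assumed set columns

venn_plot <- (
  ggplot(arrange_venn(abc_data, sets = sets))
  + coord_fixed()
  + theme_void()
  + scale_color_venn_mix(abc_data, sets = sets)
  + geom_venn_region(data = abc_data, sets = sets, alpha = 0.05)
  # x, y and region are assumed output columns of arrange_venn()
  + geom_point(aes(x = x, y = y, color = region), size = 1)
  + geom_venn_circle(abc_data, sets = sets)
  + geom_venn_label_set(abc_data, aes(label = region), sets = sets)
  + geom_venn_label_region(
    abc_data, aes(label = size), sets = sets,
    outwards_adjust = 1.75,
    position = position_nudge(y = 0.2)
  )
  + scale_fill_venn_mix(abc_data, sets = sets, guide = "none")
)
venn_plot
```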
Compose an UpSet plot
upset( data, intersect, base_annotations = "auto", name = "group", annotations = list(), themes = upset_themes, stripes = upset_stripes(), labeller = identity, height_ratio = 0.5, width_ratio = 0.3, wrap = FALSE, set_sizes = upset_set_size(), mode = "distinct", queries = list(), guides = NULL, encode_sets = TRUE, matrix = intersection_matrix(), ... )
data |
a dataframe including binary columns representing membership in classes |
intersect |
which columns should be used to compose the intersection |
base_annotations |
a named list with default annotations (i.e. the intersection size barplot) |
name |
the label shown below the intersection matrix |
annotations |
a named list of annotations, each being a list with:
|
themes |
a named list of themes for components and annotations, see |
stripes |
specification of the stripes appearance created with |
labeller |
function modifying the names of the sets (rows in the matrix) |
height_ratio |
ratio of the intersection matrix to intersection size height |
width_ratio |
ratio of the overall set size width to intersection matrix width |
wrap |
whether the plot should be wrapped into a group (makes adding a title/combining with other plots easier) |
set_sizes |
the overall set sizes plot, e.g. from |
mode |
region selection mode for computing the number of elements in intersection fragment. See |
queries |
a list of queries generated with |
guides |
action for legends aggregation and placement ('keep', 'collect', 'over' the set sizes) |
encode_sets |
whether set names (column in input data) should be encoded as numbers (set to TRUE to overcome R limitations of max 10 kB for variable names for datasets with huge numbers of sets); default TRUE for upset() and FALSE for upset_data(). |
matrix |
the intersection matrix plot |
... |
Arguments passed on to
|
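A minimal sketch of a basic call, assuming the example dataset has membership columns A, B and C. Only arguments listed in the signature above are used; the values are illustrative.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Compose the plot from the three membership columns
basic_plot <- upset(
  abc_data,
  c("A", "B", "C"),
  name = "set",        # label shown below the intersection matrix
  width_ratio = 0.2,   # set size panel width relative to the matrix
  mode = "distinct"
)
basic_plot
```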
Simplifies creation of annotation panels, automatically building aesthetics mappings, at a cost of lower flexibility than when providing a custom mapping; aes(x=intersection) is prespecified.
upset_annotate(y, geom)
y |
A string with the name of the y aesthetic |
geom |
A geom to be used as an annotation |
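As an illustration, the sketch below adds a boxplot panel above the intersections. The `toy` data frame and its `score` column are hypothetical and exist only for this example.

```r
library(ComplexUpset)
library(ggplot2)

# Hypothetical data: three logical membership columns plus one numeric covariate
set.seed(1)
toy <- data.frame(
  A = sample(c(TRUE, FALSE), 60, replace = TRUE),
  B = sample(c(TRUE, FALSE), 60, replace = TRUE),
  C = sample(c(TRUE, FALSE), 60, replace = TRUE),
  score = rnorm(60)
)

# The y aesthetic is given as a string; aes(x=intersection) is prespecified
annotated_plot <- upset(
  toy,
  c("A", "B", "C"),
  annotations = list(
    "Score" = upset_annotate("score", geom_boxplot(na.rm = TRUE))
  )
)
annotated_plot
```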
Prepare data for UpSet plots
upset_data( data, intersect, min_size = 0, max_size = Inf, min_degree = 0, max_degree = Inf, n_intersections = NULL, keep_empty_groups = FALSE, warn_when_dropping_groups = FALSE, warn_when_converting = "auto", sort_sets = "descending", sort_intersections = "descending", sort_intersections_by = "cardinality", sort_ratio_numerator = "exclusive_intersection", sort_ratio_denominator = "inclusive_union", group_by = "degree", mode = "exclusive_intersection", size_columns_suffix = "_size", encode_sets = FALSE, max_combinations_datapoints_n = 10^10, intersections = "observed" )
data |
a dataframe including binary columns representing membership in classes |
intersect |
which columns should be used to compose the intersection |
min_size |
minimal number of observations in an intersection for it to be included |
max_size |
maximal number of observations in an intersection for it to be included |
min_degree |
minimal degree of an intersection for it to be included |
max_degree |
maximal degree of an intersection for it to be included |
n_intersections |
the exact number of the intersections to be displayed; n largest intersections that meet the size and degree criteria will be shown |
keep_empty_groups |
whether empty sets should be kept (including sets which are only empty after filtering by size) |
warn_when_dropping_groups |
whether a warning should be issued when empty sets are being removed |
warn_when_converting |
whether a warning should be issued when input is not boolean |
sort_sets |
whether to sort the rows in the intersection matrix (descending sort by default); one of: |
sort_intersections |
whether to sort the columns in the intersection matrix (descending sort by default); one of: |
sort_intersections_by |
the mode of sorting, the size of the intersection (cardinality) by default; one of: |
sort_ratio_numerator |
the mode for numerator when sorting by ratio |
sort_ratio_denominator |
the mode for denominator when sorting by ratio |
group_by |
the mode of grouping intersections; one of: |
mode |
region selection mode for sorting and trimming by size. See |
size_columns_suffix |
suffix for the columns to store the sizes (adjust if conflicts with your data) |
encode_sets |
whether set names (column in input data) should be encoded as numbers (set to TRUE to overcome R limitations of max 10 kB for variable names for datasets with huge numbers of sets); default TRUE for upset() and FALSE for upset_data() |
max_combinations_datapoints_n |
a fail-safe limit preventing accidental use of |
intersections |
whether only the intersections present in data ( |
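A short sketch of calling the data preparation step directly, assuming sets A, B and C in the example data. The structure of the returned object is not described in this reference, so the code only lists its components.

```r
library(ComplexUpset)

abc_data <- create_upset_abc_example()

# Keep intersections of degree >= 1 and sort the columns by degree
prepared <- upset_data(
  abc_data,
  c("A", "B", "C"),
  min_degree = 1,
  sort_intersections_by = "degree"
)

# List the available components rather than assuming their names
names(prepared)
```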
Return the default UpSet themes with all themes modified with provided arguments
upset_default_themes(...)
... |
arguments passed to |
By default the annotations are given data corresponding to the same mode as the mode passed in the upset() call.
upset_mode(mode)
mode |
region selection mode, defines which mode data will be made available for the annotation. See |
Return the default UpSet themes with specific themes modified with provided themes
upset_modify_themes(to_update)
to_update |
a named list of themes to be used to modify themes of specific components; see |
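The two theme helpers might be used as sketched below. The component names `intersections_matrix` and `overall_sizes` are taken from the list of components given for `upset_query()`; the theme settings themselves are arbitrary.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Modify every default theme at once
all_themes <- upset_default_themes(text = element_text(color = "grey20"))

# Modify only selected components and pass the result to upset()
themed_plot <- upset(
  abc_data,
  c("A", "B", "C"),  # assumed set columns of the example data
  themes = upset_modify_themes(
    list(
      "intersections_matrix" = theme(text = element_text(size = 14)),
      "overall_sizes" = theme(axis.text.x = element_text(angle = 90))
    )
  )
)
themed_plot
```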
Highlight sets or intersections matching specified query.
upset_query( set = NULL, intersect = NULL, group = NULL, only_components = NULL, ... )
set |
name of the set to highlight |
intersect |
a vector of names for the intersection to highlight; pass |
group |
name of the set to highlight when using |
only_components |
which components to modify; by default all eligible components will be modified; the available components are 'overall_sizes', 'intersections_matrix', 'Intersection size', and any annotations specified |
... |
|
upset_query(intersect=c('Drama', 'Comedy'), color='red', fill='red')
upset_query(set='Drama', fill='blue')
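Queries take effect once passed to `upset()` through its `queries` argument. The following sketch applies the same pattern to the bundled example data, assuming its sets are named A, B and C.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Highlight one intersection and one whole set
highlighted_plot <- upset(
  abc_data,
  c("A", "B", "C"),
  queries = list(
    upset_query(intersect = c("A", "B"), color = "red", fill = "red"),
    upset_query(set = "C", fill = "blue")
  )
)
highlighted_plot
```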
Prepare layers for sets sizes plot
upset_set_size( mapping = aes(), geom = geom_bar(width = 0.6), position = "left", filter_intersections = FALSE )
mapping |
additional aesthetics |
geom |
a geom to use |
position |
on which side of the plot should the set sizes be displayed ('left' or 'right') |
filter_intersections |
whether the intersections filters (e.g. |
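A minimal sketch of adjusting the set sizes panel, assuming sets A, B and C in the example data; the narrower bars and the right-hand placement are illustrative.

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Narrower bars, displayed on the right side of the intersection matrix
sizes_plot <- upset(
  abc_data,
  c("A", "B", "C"),
  set_sizes = upset_set_size(
    geom = geom_bar(width = 0.4),
    position = "right"
  )
)
sizes_plot
```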
Define appearance of the stripes
upset_stripes( mapping = aes(), geom = geom_segment(size = 7), colors = c("white", "grey95"), data = NULL )
mapping |
additional aesthetics |
geom |
a geom to use, should accept |
colors |
a vector of colors to repeat as many times as needed for the fill of stripes, or a named vector specifying colors for values of the variable mapped to the color aesthetics in the mapping argument |
data |
the dataset describing the sets with a column named |
This is a wrapper around compare_between_intersections(), adding sorting by FDR, warnings, etc.
upset_test(data, intersect, ...)
data |
a dataframe including binary columns representing membership in classes |
intersect |
which columns should be used to compose the intersection |
... |
Arguments passed on to
|
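A hedged sketch of running the test on a small synthetic dataset. The `toy` data frame and its `score` and `weight` columns are hypothetical; the default test (`kruskal.test`, per the signature of `compare_between_intersections()`) is applied to the columns that are not set memberships.

```r
library(ComplexUpset)

# Hypothetical data: three logical membership columns and two numeric covariates
set.seed(1)
toy <- data.frame(
  A = sample(c(TRUE, FALSE), 60, replace = TRUE),
  B = sample(c(TRUE, FALSE), 60, replace = TRUE),
  C = sample(c(TRUE, FALSE), 60, replace = TRUE),
  score = rnorm(60),
  weight = runif(60)
)

# Compare the covariates between intersections of A, B and C
test_results <- upset_test(toy, c("A", "B", "C"))
test_results
```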
For use together with intersection_size or intersection_ratio
upset_text_percentage(digits = 0, sep = "", mode = "distinct")
digits |
How many digits to show when rounding the percentage? |
sep |
set to space ( |
mode |
region selection mode for computing the numerator in ratio. See |
ggplot2::aes(label=!!upset_text_percentage())
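In context, the mapping above would typically be supplied as the `text_mapping` of `intersection_size()`. A minimal sketch, assuming sets A, B and C in the example data:

```r
library(ComplexUpset)
library(ggplot2)

abc_data <- create_upset_abc_example()

# Label the intersection size bars with percentages instead of counts
percentage_plot <- upset(
  abc_data,
  c("A", "B", "C"),
  base_annotations = list(
    "Intersection size" = intersection_size(
      text_mapping = aes(label = !!upset_text_percentage())
    )
  )
)
percentage_plot
```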
List of default themes for upset components
upset_themes
An object of class list of length 4.