The tidy data frame df is used as input for a histogram that shows the size distribution of each sample:
In [3]:
ggplot(df, aes(x = Size, fill = Sample)) +
  geom_histogram(bins = 100, alpha = .8, color = 'grey40') +
  # scale_x_log10() +
  labs(y = "Count", x = "Size [µm]") +
  # coord_cartesian(xlim = c(0.5,120)) +
  theme_light(base_size = 16) +
  theme(axis.text.y = element_blank()) +
  facet_wrap(~Sample) +
  theme(legend.position = "none") +
  # Force the y-axis to start at zero
  scale_y_continuous(expand = c(0, NA), limits = c(0, 150)) +
  # Apply a logarithmic scale to the x-axis and set the numbers for the scale
  scale_x_log10(breaks = c(1, 10, 100), limits = c(.5, 200)) +
  # Remove minor gridlines
  theme(panel.grid.minor = element_blank()) +
  # Add ticks to the bottom, outside
  annotation_logticks(sides = "b", outside = TRUE) +
  # Give a little more space to the log-ticks by adding margin to the top of the x-axis text
  theme(axis.text.x = element_text(margin = margin(t = 8))) +
  # Needed to see the ticks outside the plot panel
  coord_cartesian(clip = "off")
Distribution of the measured sizes of human cheek cells and their nuclei on a log scale. Data from three years.
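The histogram combines scale_x_log10() with bins = 100. In ggplot2, scale transformations are applied before statistics such as binning are computed, so the bins have equal width in log10(Size) rather than in Size. The base R sketch below illustrates that idea on made-up data. The vector size_demo and the break positions are assumptions for illustration only; ggplot2 chooses its own bin edges, so the counts will not match the plot exactly.

```r
# Hypothetical sizes in µm; NOT the measured data from df
set.seed(1)
size_demo <- rlnorm(500, meanlog = log(60), sdlog = 0.3)

# Mimic limits = c(.5, 200): values outside the limits are dropped
size_demo <- size_demo[size_demo >= 0.5 & size_demo <= 200]

# 100 bins of equal width on the log10 scale need 101 break points
log_breaks <- seq(log10(0.5), log10(200), length.out = 101)

# Bin the log10-transformed values without drawing a plot
h <- hist(log10(size_demo), breaks = log_breaks, plot = FALSE)

# Bin edges back on the original scale: equal ratios, not equal differences
edges_um <- 10^log_breaks
print(head(edges_um))
print(sum(h$counts))
```

On the original scale, each bin edge is a constant factor larger than the previous one, which is why the bars look equally wide on the logarithmic axis.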
The size data for the sample "Cell" are selected and plotted as a violin plot per group. The median of each group is indicated by a black dot:
In [4]:
df_cell <- df %>% filter(Sample == "Cell")
p <- ggplot(df_cell, aes(x = Group, y = Size, fill = Group))
p <- p + geom_violin() + stat_summary(fun = median, geom = "point")
p <- p + scale_y_log10()
p <- p + labs(x = "Group", y = "Size [µm]")
p <- p + coord_cartesian(ylim = c(0.5, 200))
p <- p + theme_light(base_size = 16)
p <- p + theme(legend.position = "none")
p
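To check the black dots numerically, the per-group medians can be computed directly. The sketch below uses base R and a small made-up data frame: only the column names Sample, Group and Size come from the code above, while df_demo, the group labels "A" and "B" and all values are hypothetical. Because scale_y_log10() is applied, stat_summary() works on log10-transformed sizes, so the sketch also computes the median on the log scale and transforms it back.

```r
# Hypothetical stand-in for df; values and group labels are invented
df_demo <- data.frame(
  Sample = c("Cell", "Cell", "Cell", "Nucleus", "Cell", "Cell", "Cell", "Nucleus"),
  Group  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Size   = c(50, 60, 75, 8, 55, 70, 90, 9)
)

# Same selection as filter(Sample == "Cell"), written in base R
cell_demo <- df_demo[df_demo$Sample == "Cell", ]

# Median per group on the original scale
med_raw <- tapply(cell_demo$Size, cell_demo$Group, median)

# Median per group computed on log10 values, then transformed back
med_log <- 10^tapply(log10(cell_demo$Size), cell_demo$Group, median)

print(med_raw)
print(med_log)
```

With an odd number of observations per group, as in this example, both approaches agree up to floating-point rounding. With an even number, the log-scale median is the geometric mean of the two middle values and can differ slightly from the median of the untransformed sizes.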