R - Discrepancies in median values between dplyr table and ggplot2 boxplot
I'm summarising data and getting different median values in a table created with the dplyr package and in a boxplot (ggplot2). Sample data can be found here:
dplyr table
library(dplyr)
library(ggplot2)

sample2 = read.csv("sample2.csv")

sample2 %>%
  group_by(category) %>%
  summarise(median_avg = median(avg_value),
            median_total = median(total_value))
The result is a median_total of about 307 for the 3+ category:
# tibble: 3 × 3
  category median_avg median_total
  <chr>         <dbl>        <dbl>
1 1            17.500        37.07
2 2            16.830       117.48
3 3+           17.375       306.95
However, when I try to visualise this in a boxplot, I get a different median for the 3+ category, below 200:
boxplot
sample2 %>%
  ggplot(aes(category, total_value)) +
  geom_boxplot() +
  scale_y_continuous(limits = c(0, 500))
I tried using dummy data and there's no discrepancy between the table and the boxplot. Any ideas what causes the problem in this particular dataset? Any help is appreciated!
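One plausible explanation, not confirmed against the actual file: if the 3+ category contains total_value observations above 500, scale_y_continuous(limits = c(0, 500)) turns them into NA before the boxplot statistics are computed, so geom_boxplot() takes its median over only the remaining rows. The sketch below uses a small made-up data frame (demo and its values are hypothetical stand-ins for sample2.csv) to show how the two medians can diverge, and uses coord_cartesian(), which zooms the view without discarding data.

```r
library(ggplot2)

# Hypothetical stand-in for sample2.csv: one category with values above 500
demo <- data.frame(
  category = "3+",
  total_value = c(50, 120, 180, 310, 650, 900, 1200)
)

# Median over all rows, which is what summarise(median(total_value)) reports
median_all <- median(demo$total_value)          # 310

# Median over only the rows inside the axis limits, which is what the
# boxplot is left with after scale_y_continuous(limits = c(0, 500))
in_range <- demo$total_value >= 0 & demo$total_value <= 500
median_clipped <- median(demo$total_value[in_range])  # 150

# coord_cartesian() restricts the visible range but keeps every row
# for the boxplot statistics, so the drawn median matches median_all
p <- ggplot(demo, aes(category, total_value)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(0, 500))
```

If this is the cause, ggplot2 should also have printed a warning about rows containing non-finite values being removed when the original plot was drawn. Counting the rows with sum(sample2$total_value > 500) would confirm or rule it out.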