Nathan's Hot Dog Eating Contest

By Ajinkya Shinde in R

June 18, 2018

Foreword

Nathan’s Hot Dog Eating Contest is an annual competition to find the one contestant who can gobble down the most Nathan’s hot dogs in ten minutes.

This blog reconstructs the analysis of Nathan’s Hot Dog Eating Contest and discusses why line charts are more appropriate than time series charts.

1. Relevant R Code

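# Packages the code below relies on: ggplot2 for plotting,
# dplyr for %>%, group_by() and mutate(); plyr must also be installed (plyr::arrange)
library(ggplot2)
library(dplyr)

# Null-default operator: return a unless it is NULL, otherwise b
"%||%" <- function(a, b) {
  if (!is.null(a)) a else b
}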

geom_flat_violin <- function(mapping = NULL, data = NULL, stat = "ydensity",
                             position = "dodge", trim = TRUE, scale = "area",
                             show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    data = data,
    mapping = mapping,
    stat = stat,
    geom = GeomFlatViolin,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      trim = trim,
      scale = scale,
      ...
    )
  )
}

#' @rdname ggplot2-ggproto
#' @format NULL
#' @usage NULL
#' @export
GeomFlatViolin <-
  ggproto("GeomFlatViolin", Geom,
          setup_data = function(data, params) {
            data$width <- data$width %||%
              params$width %||% (resolution(data$x, FALSE) * 0.9)
            # ymin, ymax, xmin, and xmax define the bounding rectangle for each group
            data %>%
              group_by(group) %>%
              mutate(ymin = min(y),
                     ymax = max(y),
                     xmin = x,
                     xmax = x + width / 2)
          },
          
          draw_group = function(data, panel_scales, coord) {
            # Find the points for the line to go all the way around
            data <- transform(data, xminv = x,
                              xmaxv = x + violinwidth * (xmax - x))
            
            # Make sure it's sorted properly to draw the outline
            newdata <- rbind(plyr::arrange(transform(data, x = xminv), y),
                             plyr::arrange(transform(data, x = xmaxv), -y))
            
            # Close the polygon: set first and last point the same
            # Needed for coord_polar and such
            newdata <- rbind(newdata, newdata[1,])
            
            ggplot2:::ggname("geom_flat_violin", GeomPolygon$draw_panel(newdata, panel_scales, coord))
          },
          
          draw_key = draw_key_polygon,
          
          default_aes = aes(weight = 1, colour = "grey20", fill = "white", size = 0.5,
                            alpha = NA, linetype = "solid"),
          
          required_aes = c("x", "y")
)

# Plot: flat violin with a dot plot overlaid; hot_dogs is the contest data frame (see section 3)
ggplot(hot_dogs, aes(y = num_eaten, x = gender)) +
  geom_flat_violin(alpha = 0.5, fill = "#7570b3", colour = NA, na.rm = TRUE) +
  geom_dotplot(aes(fill = gender), color = "#695250", binaxis = "y",
               position = "dodge", dotsize = 0.75, stackratio = 1.1) +
  labs(y = "Number Of Hot Dogs Eaten", x = "Gender",
       title = "Distribution of Nathan's Hot Dog across gender") +
  scale_fill_brewer(type = "qual", palette = 2, direction = -1) +
  theme_minimal() +
  theme(text = element_text(family = "Palatino Linotype"), legend.position = "none")

2. Description of the TYPE of graph.

The graph is a dot plot, overlaid on a flat (half) violin, that shows the spread of the number of hot dogs eaten in Nathan’s Hot Dog Eating Contest for each gender.

3. Description of the DATA you used

The data comes from the Nathan’s Hot Dog Eating Contest dataset. In this graph, the final data frame is hot_dogs. The quantitative variable is num_eaten, i.e. the number of hot dogs eaten. The qualitative variable is gender, the gender of the contestant who ate them.

4. Description of the AUDIENCE you are aiming for

The audience is anyone who wants to see how each gender performed in the contest.

5. Representation Description:

This graph shows the spread of the number of hot dogs eaten for each gender in the dataset.

6. How to read it & What to look for

Gender is on the x-axis and the number of hot dogs eaten is on the y-axis. This differs from a traditional histogram, where the distribution is laid out along the x-axis; here the binning is done along the y-axis by setting binaxis = "y" in geom_dotplot().
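
To make the effect of binaxis concrete, here is a minimal sketch. The data frame toy and its values are made up purely for illustration (they are not the contest data), and the binwidth of 2 is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only; not the contest data
toy <- data.frame(
  gender    = c("F", "F", "F", "M", "M", "M"),
  num_eaten = c(30, 32, 35, 20, 45, 60)
)

# binaxis = "y" bins the values along the y-axis, so the dots for each
# gender stack sideways from that gender's position on the x-axis
p_dots <- ggplot(toy, aes(x = gender, y = num_eaten)) +
  geom_dotplot(binaxis = "y", binwidth = 2)

p_dots
```

Leaving binaxis at its default of "x" would instead bin along the x-axis, which is the histogram-like layout described above.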

Major highlights: male contestants appear to sit at the extremes (either very low or very high counts), whereas female contestants perform more consistently, with their counts clustered around the female mean.
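
One way to check a visual impression like this is to compute the spread per gender. The sketch below assumes a data frame with the same two columns as hot_dogs; the values in toy are invented for illustration and say nothing about the real contest results.

```r
library(dplyr)

# Hypothetical values, invented for illustration only;
# swap in the real hot_dogs data frame to check the actual spread
toy <- data.frame(
  gender    = c("F", "F", "F", "M", "M", "M"),
  num_eaten = c(30, 32, 34, 20, 40, 60)
)

# Mean, standard deviation and interquartile range of num_eaten per gender
spread_by_gender <- toy %>%
  group_by(gender) %>%
  summarise(
    mean_eaten = mean(num_eaten),
    sd_eaten   = sd(num_eaten),
    iqr_eaten  = IQR(num_eaten)
  )

spread_by_gender
```

A larger standard deviation or interquartile range for one gender would support the reading that its counts are more spread out.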

7. Presentation tips

Since gender is a nominal variable, I used a discrete (qualitative) color palette from ColorBrewer by setting type = "qual" in scale_fill_brewer().
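
To preview which colors such a scale would hand out, a small sketch using the scales package (which ggplot2 builds on) may help. It assumes scales is installed and reuses the same type, palette and direction arguments as the plot code; the variable name cols is just for illustration.

```r
# Build the same qualitative ColorBrewer palette used by scale_fill_brewer()
# and ask it for two colors, one per gender level
pal  <- scales::brewer_pal(type = "qual", palette = 2, direction = -1)
cols <- pal(2)

cols
```

Because the palette is qualitative, the two colors differ in hue rather than implying any order between the genders.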

8. Variations and alternatives

A standard violin plot could have been used as an alternative.
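
A minimal sketch of that alternative, using ggplot2's built-in geom_violin(), might look like this. The toy data frame is hypothetical and stands in for hot_dogs; the styling simply mirrors the choices made in the plot above.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only; not the contest data
toy <- data.frame(
  gender    = c("F", "F", "F", "M", "M", "M"),
  num_eaten = c(30, 32, 35, 20, 45, 60)
)

# Full (two-sided) violin per gender, with the same qualitative palette
p_violin <- ggplot(toy, aes(x = gender, y = num_eaten, fill = gender)) +
  geom_violin(trim = TRUE, alpha = 0.5) +
  scale_fill_brewer(type = "qual", palette = 2, direction = -1) +
  theme_minimal() +
  theme(legend.position = "none")

p_violin
```

The trade-off is that a violin shows only the smoothed density, while the dot plot keeps every individual contestant visible.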

9. How I created it

I started by formulating the question: how did each gender perform in the hot dog contest? To answer it I only needed the variables gender and num_eaten to look at the distribution. I then represented the distribution in two ways: a dot plot and a flat (split) violin.
