dhist.Rd
When constructing a histogram, it is common to make all bars the same width. One could also choose to make them all have the same area. These two options have complementary strengths and weaknesses; the equal-width histogram oversmooths in regions of high density, and is poor at identifying sharp peaks; the equal-area histogram oversmooths in regions of low density, and so does not identify outliers. We describe a compromise approach which avoids both of these defects. We regard the histogram as an exploratory device, rather than as an estimate of a density.
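The difference between the two extremes described above can be sketched in base R. This is only an illustration of equal-width versus equal-area binning, not the compromise method itself; the data and the variable names (`width_breaks`, `area_breaks`) are made up for the example.

```r
set.seed(1)
x <- c(rnorm(200), 8)  # hypothetical sample with a single outlier at 8

nbins <- grDevices::nclass.Sturges(x)

# Equal-width: break points evenly spaced over the range of the data.
width_breaks <- seq(min(x), max(x), length.out = nbins + 1)

# Equal-area: break points at evenly spaced quantiles, so every bin
# holds roughly the same number of observations.
area_breaks <- quantile(x, probs = seq(0, 1, length.out = nbins + 1),
                        names = FALSE)

width_counts <- hist(x, breaks = width_breaks, plot = FALSE)$counts
area_counts  <- hist(x, breaks = area_breaks,  plot = FALSE)$counts

width_counts  # the outlier sits alone in a far-right bin
area_counts   # counts are nearly equal, so the outlier is absorbed
```

With equal-width bins the outlier is visible but the central peak is spread over only a few bins; with equal-area bins the centre is finely resolved but the last bin stretches out to swallow the outlier.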
dhist(
  x,
  a = 5 * iqr(x),
  nbins = grDevices::nclass.Sturges(x),
  rx = range(x, na.rm = TRUE),
  eps = 0.15,
  xlab = "x",
  plot = TRUE,
  lab.spikes = TRUE
)
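To see what the default arguments evaluate to for a given sample, they can be computed by hand. The sketch below uses `stats::IQR` as a stand-in for the `iqr()` helper named in the signature, which is an assumption; the sample itself is hypothetical.

```r
set.seed(42)
x <- rexp(100)  # hypothetical right-skewed sample

# Base-R stand-ins for the defaults in the usage above.
# stats::IQR is assumed to behave like the iqr() helper in the signature.
a     <- 5 * stats::IQR(x)               # scaling factor
nbins <- grDevices::nclass.Sturges(x)    # ceiling(log2(n) + 1)
rx    <- range(x, na.rm = TRUE)          # outer edges of the bins

a
nbins
rx
```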
x is a numeric vector (the data)
a is the scaling factor; the default is 5 * IQR
nbins is the number of bins; the default is assigned by the Sturges method
rx is the range used, from the left edge of the left-most bin to the right edge of the right-most bin
eps is used to set an artificial bound on the minimum width / maximum height of bins, as described in Denby and Mallows (2009), page 24
xlab is the label for the x axis
plot = TRUE produces the plot; FALSE returns the heights, breaks and counts
lab.spikes = TRUE labels the % of data in the spikes
A list with two elements, heights (of length n) and breaks (of length n+1), giving the heights and break points of the histogram bars.
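A minimal usage sketch follows. It assumes that a package providing `dhist()` is already attached (this page does not name the package), so the call is guarded and skipped otherwise. The data are hypothetical, and the element names of the result are not relied on; `str()` simply shows whatever is returned.

```r
set.seed(7)
# Hypothetical data: a smooth bulk plus a spike of repeated values.
x <- c(rnorm(500), rep(3, 50))

# Only run if some attached package supplies dhist(); this is an assumption.
if (exists("dhist", mode = "function")) {
  res <- dhist(x, plot = FALSE)  # return the bar data instead of plotting
  str(res)                       # inspect the returned list
}
```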
Lorraine Denby, Colin Mallows. Variations on the Histogram. Journal of Computational and Graphical Statistics. March 1, 2009, 18(1): 21-31. doi:10.1198/jcgs.2009.0002.