![What is a box and whisker plot used for](https://miro.medium.com/max/1400/1*2c21SkzJMf3frPXPAR_gZA.png)
A box plot consists of five values: the median, which is the midpoint of the sorted data; the lower and upper quartiles, which mark the values below which 25% and 75% of the data fall; and the minimum and maximum data values. Organizing data around these five numbers is an efficient way to handle datasets too large for other graphs, such as line plots or stem-and-leaf plots. The trade-off is that a box plot does not keep the exact values or the detailed shape of the distribution. It shows only a simple summary that you can view quickly and compare with other data.

A box plot is a visually effective way to get a clear summary of one or more sets of data, and it is particularly useful for quickly summarizing and comparing results from different experiments. For a more thorough analysis, use it in combination with another statistical graph, such as a histogram. It is also one of the few statistical graphs that show outliers explicitly. At a glance, a box plot displays the distribution of results and indicates how symmetric the data is.

Values that fall outside the plotted minimum and maximum are known as outliers and are easy to spot on a box plot. The whiskers extend from the quartiles to the most extreme data values within 1.5 times the interquartile range; anything beyond that is drawn as an outlier. A dataset may contain one outlier or several, and they can occur both below the minimum and above the maximum.

We have data on house prices in 5 different areas of Bangalore. We will try to understand the distribution of this data and draw some insights from it.
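As a rough illustration of the five-number summary and the 1.5 times IQR rule described above, the sketch below summarizes two groups side by side. The area names and prices are made up for the example and are not the Bangalore data, and `box_summary` is a hypothetical helper name. Quartiles come from Python's standard `statistics.quantiles`, using its `"inclusive"` method as one of several possible conventions.

```python
import statistics


def box_summary(values):
    """Five-number summary plus outliers, using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Values inside the fences decide where the whiskers end
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "min": inside[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical prices for two areas (illustrative only)
prices_by_area = {
    "Area A": [40, 45, 50, 55, 60, 65, 70],
    "Area B": [70, 80, 85, 90, 95, 100, 300],
}

for area, prices in prices_by_area.items():
    print(area, box_summary(prices))
```

With these made-up numbers, the value 300 in the second group falls above the upper fence and is reported as an outlier, so it does not become that group's plotted maximum.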
The minimum of a box plot is the lowest value in the distribution that is not an outlier, that is, the smallest value lying no more than 1.5 times the interquartile range below the lower quartile. This point does not necessarily correspond to the smallest value in your dataset. Here the smallest value is 0.005, but it is most likely an outlier, so the box plot will not mark it as the minimum. The most plausible minimum value for the box plot is 65. Because it relies on the five-number summary, a box plot can handle and summarize a large amount of data.
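To make the difference between the plotted minimum and the smallest data value concrete, here is a small sketch. The list of prices is hypothetical: it only reuses the values 0.005 and 65 mentioned above and is not the article's actual dataset. `lower_whisker` is an illustrative helper name, and the quartiles use the `"inclusive"` method of `statistics.quantiles`, which is one common convention among several.

```python
import statistics


def lower_whisker(values):
    """Smallest value that is not below Q1 - 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q1 - 1.5 * (q3 - q1)
    return min(v for v in values if v >= fence)


# Hypothetical prices; 0.005 plays the role of the suspicious smallest value
prices = [0.005, 65, 70, 75, 80, 85, 90, 95, 100, 105]

print(min(prices))            # 0.005, the smallest value in the data
print(lower_whisker(prices))  # 65, the minimum the box plot would mark
```

Other quartile conventions shift the fence slightly, so borderline points can be classified differently depending on the tool.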
![What is a box and whisker plot used for](https://www.intellspot.com/wp-content/uploads/2018/01/box-and-whisker-plot-example.png)
Let us understand these 5 components of the box plot.

The median is the value that falls halfway through a set of values arranged in ascending or descending order. When the set contains an odd number of values, the median is exactly the middle value. If the number of values is even, the median is computed by averaging the two values closest to the middle.

Quartiles are three points that divide the data set into four equal groups, each containing one-fourth of the data. The lower quartile, also known as the first quartile (Q1), splits off the lowest 25% of the data. It is the middle value of the lower half. The upper quartile, also known as the third quartile (Q3), splits the lowest 75% of the data from the highest 25%. It can also be seen as the middle value of the upper half.

The interquartile range (IQR) runs from Q1 to Q3. It is the difference between the upper quartile and the lower quartile, Q3 - Q1. The IQR is often seen as a better measure of spread than the range (highest value minus lowest value) because it is not affected by outliers.

The maximum of a box plot is the highest value in the distribution that is not an outlier. This point does not necessarily correspond to the highest value in your dataset. Here the largest value is 100000, but it is most likely an outlier, so the box plot will not mark it as the maximum. The most plausible maximum value for the box plot is 105.
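The definitions of the median, the quartiles, and the IQR given above can be sketched in a few lines of plain Python. This version takes the quartiles as the medians of the lower and upper halves, which is only one of several conventions in use. The function names and the price list are hypothetical (the list reuses 105 and 100000 for illustration), and the input is assumed to contain at least two values.

```python
def median(sorted_values):
    """Middle value of a sorted list; averages the two middle values if the count is even."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, median, Q3), with Q1 and Q3 as medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values before the middle position
    upper = data[half + n % 2:]   # values after it (skips the middle value when n is odd)
    return median(lower), median(data), median(upper)


def upper_whisker(values):
    """Largest value that is not above Q3 + 1.5 * IQR."""
    q1, _, q3 = quartiles(values)
    fence = q3 + 1.5 * (q3 - q1)
    return max(v for v in values if v <= fence)


# Hypothetical prices; 100000 plays the role of the suspicious largest value
prices = [65, 70, 75, 80, 85, 90, 95, 100, 105, 100000]

q1, med, q3 = quartiles(prices)
print(q1, med, q3, q3 - q1)   # 75 87.5 100 25
print(upper_whisker(prices))  # 105
```

Because the fence is built from Q3 and the IQR rather than from the raw range, the single extreme value does not move the plotted maximum.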