Tuesday, June 4, 2019
Measures of Dispersion
The measures of central tendency, as discussed in the previous chapter, tell us only about the characteristics of a particular series. They do not describe the observations or the data completely. In other words, measures of central tendency do not tell us anything about the variations that exist in the data of a particular series. To clarify the concept, let us discuss an example. Suppose it is found, by using the formula of the mean, that the average depth of a river is 6 feet. One cannot confidently enter the river, because in some places the depth may be 12 feet and in others only 3 feet. Thus an interpretation based only on the measures of central tendency sometimes proves to be useless.

Hence a measure of central tendency alone is not sufficient to draw a valid conclusion about the characteristics of a series of observations. Along with the central value, one must know how the data are distributed. Different sets of data may have the same measures of central tendency but differ greatly in terms of variation. Knowledge of the central value is therefore not enough to appreciate the nature of the distribution of values. There is thus a need for some additional measures, alongside the measures of central tendency, which describe the spread of the entire set of values around the central value. One such measure is popularly called dispersion or variation. The study of dispersion enables us to know whether a series is homogeneous (where all the observations remain around the central value) or heterogeneous (where the observations vary around the central value, like 1, 50, 20, 28 etc., where the central value is 33).
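The river example can be made concrete with a small sketch. The depth readings below are hypothetical and chosen only so that both rivers average 6 feet; the helper name `summarize` is likewise an assumption for illustration.

```python
from statistics import mean

# Hypothetical depth readings (in feet) for two rivers; both average 6 feet.
steady_river = [6, 6, 6, 6]
uneven_river = [3, 12, 3, 6]

def summarize(depths):
    """Return the mean depth and the gap between the deepest and shallowest points."""
    return mean(depths), max(depths) - min(depths)

steady_mean, steady_spread = summarize(steady_river)
uneven_mean, uneven_spread = summarize(uneven_river)

print("steady river:", steady_mean, steady_spread)
print("uneven river:", uneven_mean, uneven_spread)
```

Both series share the same central value, yet only the second hides a 12-foot stretch, which is exactly the information a measure of dispersion is meant to supply.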
Hence it can be said that a measure of dispersion describes the spread or scattering of the individual values of a series around its central value.

Experts hold different opinions on why the variations in a distribution are so important to consider. Following are some views on the validity of the measures of dispersion:

- Measures of variation provide researchers with some additional information about the behaviour of the series along with the measures of central tendency. With this information one can judge the reliability of the value that is derived by using the measure of central tendency. If the data of the series are widely dispersed, the central location is less representative of the data as a whole. On the other hand, when the data of a series are less dispersed, the central location is more representative of the entire series. In other words, a high degree of variation means little uniformity, whereas a low degree of variation means great uniformity.
- When the data of a series are widely dispersed, it creates practical problems in handling the data. Measures of dispersion help in understanding and tackling widely dispersed data.
- They help to determine the nature and cause of variation in order to control the variation itself.
- Measures of variation enable comparison to be made of two or more series with regard to their variability.

DEFINITION

Following are some definitions of measures of dispersion given by different experts. L.R. Connor defines dispersion as the measure of the extent to which individual items vary. Similarly, Brookes and Dick describe dispersion or spread as the degree of the scatter or the variation of the variables about a central value. Robert H. Wessel states that measures which indicate the spread of the values are called measures of dispersion.
From all these definitions it is clear that measures of dispersion describe the spread or scattering of the individual values of a series around its central value.

METHODS OF MEASURING DISPERSION

Dispersion of a series of data can be measured by using the following widely used methods:

- Dispersion measured on the basis of the difference between two extreme values selected from a series of data. The two well known measures are:
  - The Range
  - The Inter-quartile Range or Quartile Deviation
- Dispersion measured on the basis of average deviation from some measure of central tendency. The well known measures are:
  - The Mean/average deviation
  - The Standard Deviation
  - The Coefficient of variation
  - The Gini coefficient and the Lorenz curve

All the tools are discussed in detail below, one after the other.

THE RANGE

The range is the simplest measure of dispersion. The range is defined as the difference between the highest value and the lowest value of the series. As a measure of variation, the range has limited applicability. It is widely used for weather forecasting by meteorological departments. It is also used in statistical quality control. Range is a good indicator of fluctuations in prices, for example when studying the variations in the prices of shares and debentures and other related matters.
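As a minimal sketch of the definition above, the range is simply the highest observation minus the lowest. The function name `value_range` and the share prices are hypothetical, used only for illustration.

```python
def value_range(observations):
    """Range = highest observation minus lowest observation."""
    highest = max(observations)
    lowest = min(observations)
    return highest - lowest

# Hypothetical closing prices of a share over one week.
share_prices = [212, 198, 205, 220, 201]
print(value_range(share_prices))
```

Only two of the five prices influence the result, which is why the range is quick to obtain but says little about the rest of the series.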
Following is the procedure for calculating the range:

Range = value of the highest observation (H) - value of the lowest observation (L), or Range = H - L

Advantages of Range

- Range is the simplest way of obtaining dispersion.
- It is easily understandable and can be interpreted easily.
- It requires less time to obtain the variation in the series.

Disadvantages of Range

- As it considers only the two extreme values, it does not include all the observations of the series.
- It fails to tell anything about the characteristics of a distribution.
- It has a very limited scope of applicability.
- It is not capable of further mathematical treatment.

THE INTER-QUARTILE RANGE OR QUARTILE DEVIATION

A second measure of dispersion is the inter-quartile range, which takes into account the middle half, i.e., 50% of the data, thus avoiding the problem of extreme values in the data. Hence it measures approximately how far from the median one must go on either side before one-half of the values of the data set are included. The inter-quartile range can be calculated by dividing the series of observations into four parts, each of which contains 25 per cent of the observations. The quartiles are then the highest values in each of these four parts, and the inter-quartile range is the difference between the values of the first and the third quartile. Following are the steps for calculating the inter-quartile range:

- Arrange the data of the series in ascending order.
- Compute the first quartile, denoted as (Q1). In case of ungrouped data use the formula $Q_1 = \text{value of the } \left(\frac{N+1}{4}\right)\text{th item}$. In case of grouped data the first quartile (Q1) can be computed by using the formula $Q_1 = L + \frac{N/4 - \text{p.c.f.}}{f} \times i$, where N = number of observations in the series, i.e., the sum of frequencies, L = lower limit of the quartile class, p.c.f. = cumulative frequency prior to the quartile class, f = frequency of the quartile class and i = class interval. The quartile class can be determined by using the formula $N/4$.
- Calculate the third quartile, denoted as (Q3). In case of ungrouped data use the formula $Q_3 = \text{value of the } \left(\frac{3(N+1)}{4}\right)\text{th item}$. In case of grouped data the third quartile (Q3) can be calculated by using the formula $Q_3 = L + \frac{3N/4 - \text{p.c.f.}}{f} \times i$, where N = number of observations in the series, i.e., the sum of frequencies, L = lower limit of the quartile class, p.c.f. = cumulative frequency prior to the quartile class, f = frequency of the quartile class and i = class interval. The quartile class can be determined by using the formula $3N/4$.
- The inter-quartile range is then $Q_3 - Q_1$, and the quartile deviation is half of it, $(Q_3 - Q_1)/2$.

THE MEAN/AVERAGE DEVIATION

Mean/average deviation is the arithmetic mean of the deviations of a series computed from any measure of central tendency, i.e., deviations from the mean, the median or the mode. The absolute values of the deviations are used. Clark and Schkade describe mean deviation or average deviation as the average amount of scatter of the items in a distribution from either the mean or the median, ignoring the signs of the deviations. Thus the average that is taken of the scatter is an arithmetic mean, which accounts for the fact that this measure is often called the mean deviation or average deviation.

Calculation of Mean Deviation in case of individual series

In case of an individual series, mean deviation can be calculated through the following steps:

- The first step is to calculate the mean or median or mode of the given series.
- Compute the deviations of the observations of the series from the calculated mean or median or mode. This deviation is denoted by the capital letter D and is always taken as a mod (absolute) value, i.e., ignoring the plus or minus sign.
- Take the summation of the deviations (sum of |D|) and divide it by the number of observations (N), i.e., $\text{M.D.} = \frac{\sum |D|}{N}$.

In the same way one can calculate mean deviation from the median or mode in case of an individual series.

Calculation of Mean Deviation in case of discrete series

Mean deviation is calculated in a slightly different way in case of a discrete series.
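Before turning to frequency data, the individual-series steps described above can be sketched as follows. This is a minimal illustration rather than a prescribed method: the function name `mean_deviation`, its `center` parameter and the marks are assumptions introduced here.

```python
from statistics import mean, median

def mean_deviation(observations, center=mean):
    """Average of the absolute deviations |D| from a chosen central value."""
    central_value = center(observations)
    absolute_deviations = [abs(x - central_value) for x in observations]
    return sum(absolute_deviations) / len(observations)

# Hypothetical individual series.
marks = [10, 20, 30, 40, 50]
print(mean_deviation(marks))          # deviations taken from the mean
print(mean_deviation(marks, median))  # deviations taken from the median
```

Passing the central-value function as a parameter mirrors the flexibility of the measure: the same steps apply whether deviations are taken from the mean or the median.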
Following are the steps to calculate the mean deviation when the series is discrete:

- The first step is to calculate the mean or median or mode of the given series by using the formula discussed in the previous chapter.
- Compute the deviations of the observations of the series from the calculated mean or median or mode value. This deviation is denoted by the capital letter D and is always taken as a mod (absolute) value, i.e., ignoring the plus or minus sign.
- Multiply the corresponding frequency with each deviation value, i.e., calculate f * |D|.
- Take the summation of these products and divide it by the number of observations (N, the sum of frequencies), i.e., $\text{M.D.} = \frac{\sum f|D|}{N}$.

Similarly, one can calculate the mean deviation or average deviation by taking deviations from the median or mode.

Calculation of Mean Deviation in case of continuous series

- The first step is to calculate the mean or median or mode of the given series by using the formula discussed in the previous chapter.
- In the second step, get the mid values of the classes (m).
- Compute the deviations of the mid values from the calculated mean or median or mode value. This deviation is denoted by the capital letter D = m - mean (or median or mode) and is always taken as a mod (absolute) value, i.e., ignoring the plus or minus sign.
- Multiply the corresponding frequency with each deviation value, i.e., calculate f * |D|.
- Take the summation, i.e., the sum of f * |D|, and divide it by the number of observations (N). The formula is $\text{M.D.} = \frac{\sum f|D|}{N}$.

Advantages of mean deviation

- The computation of mean deviation is based on all the observations of the series.
- The value of mean deviation is less affected by extreme items.
- There are three alternatives available to the researcher while calculating the mean deviation: deviations can be taken from the mean, the median or the mode. Hence it is more flexible in calculation.

Disadvantages of mean deviation

- The practical usefulness of mean deviation is very limited.
- Mean deviation does not have enough scope for further mathematical calculations.
- Mod (absolute) values are considered while calculating the mean deviation.
Ignoring the algebraic signs in this way is criticized by some experts as illogical and unsound.

THE STANDARD DEVIATION

Standard deviation, otherwise called the root mean square deviation, is the most important and widely used measure of variation. It measures the absolute variation of a distribution. It is the measure that highlights the spread of the observations around the mean value. The greater the variation of the observations in a series, the greater will be the value of standard deviation. A small value of standard deviation implies a high degree of homogeneity among the observations in the series. When the standard deviations of two or more series are compared, it is always advisable to choose as the ideal one the series which has the smallest standard deviation. Standard deviation is always measured from the mean or average value of the series. The credit for introducing this concept in the literature goes to Karl Pearson, a notable statistician. It is denoted by the Greek letter σ (pronounced as sigma).

Standard deviation is calculated for the following three types of series:

- Standard deviation in case of individual series
- Standard deviation in case of discrete series
- Standard deviation in case of continuous series

All the above cases are discussed in detail below.

a. Standard deviation in case of individual series

In case of an individual series, the value of standard deviation can be calculated by using two methods:

- Direct method: when deviations are taken from the actual mean
- Short-cut method: when deviations are taken from an assumed mean

1. Direct method: when deviations are taken from the actual mean

Following are the steps to be followed for calculating the value of standard deviation:

- The first step is to calculate the actual mean value of the observations.
- In the next column calculate the deviation of each observation from the mean, i.e., find $(X - \bar{X})$, where $\bar{X}$ is the mean of the series.
- In the next column calculate the square of the deviations, and at the end of the column calculate the sum of the squares of the deviations, i.e., $\sum (X - \bar{X})^2$.
- Divide the total by the number of observations (N) and then take the square root of the value. The formula will be $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$.

Since the series consists of individual observations, it sometimes happens that there is no need to take the deviations. In such a case the researcher can calculate the value of the standard deviation directly. The formula for calculating directly is $\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$.

2. Short-cut method: when deviations are taken from an assumed mean

In practice it often happens that the arithmetic mean is a fractional value, e.g., one ending in .25. This creates a real problem in calculating the value of standard deviation. For this reason, instead of calculating standard deviation by using the arithmetic mean methods discussed above, researchers generally prefer the short-cut method, which is nothing but the calculation of standard deviation by assuming a mean value. Following are the steps to be followed for calculating standard deviation by the assumed mean method:

- The first step is to assume a value from the X values as the mean. This assumed mean is denoted as A.
- In the next step deviations are calculated from this assumed mean as (X - A), and this value is denoted as D.
- At the end of the same column, the sum of D ($\sum D$) is calculated.
- Calculate the square of each D, i.e., $D^2$, and its sum $\sum D^2$.
- The following formula is used to calculate the standard deviation of the series: $\sigma = \sqrt{\frac{\sum D^2}{N} - \left(\frac{\sum D}{N}\right)^2}$, where N is the number of observations in the series.

b. Standard deviation in case of discrete series

Discrete series are series in which the observations carry frequencies or repetitions. In case of a discrete series standard deviation is calculated by using the following two methods:

- when deviations are taken from the actual mean
- when deviations are taken from an assumed mean

Following is the detailed analysis of the two methods.

1. When deviations are taken from the actual mean

The steps to calculate standard deviation when deviations are calculated from the actual mean are:

- The first step is to calculate the actual mean value of the observations.
- In the next column calculate the deviation of each observation, i.e., find $(X - \bar{X})$, where $\bar{X}$ is the mean of the series. This can be denoted as D.
- In the next column calculate the square of the deviations, i.e., $D^2$.
- Multiply the corresponding frequency of each observation with the value of $D^2$ in the next column, and at the end of the column calculate the sum, i.e., $\sum f D^2$.
- Divide the total by the number of observations (N) and then take the square root of the value. The formula will be $\sigma = \sqrt{\frac{\sum f D^2}{N}}$.

2. When deviations are taken from an assumed mean

The steps to calculate standard deviation when deviations are calculated from an assumed mean are:

- The first step is to assume a mean value from the observations.
- In the next column calculate the deviation of each observation, i.e., find (X - A), where A is the assumed mean of the series. This deviation can be denoted as D.
- In the next column calculate the square of the deviations, i.e., $D^2$.
- Multiply the corresponding frequencies (f) of each observation with the values of D and of $D^2$ in the next columns, and calculate the sums $\sum f D$ and $\sum f D^2$.
- Use the following formula to calculate standard deviation: $\sigma = \sqrt{\frac{\sum f D^2}{N} - \left(\frac{\sum f D}{N}\right)^2}$, where N is the sum of frequencies.
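The individual-series and discrete-series methods described above can be checked against one another with a short sketch. The formulas used are the standard textbook forms of the direct, short-cut and frequency-weighted calculations, which is an assumption about what the original formula images showed; the function names and the data are hypothetical.

```python
import math

def sd_direct(values):
    """Direct method: deviations are taken from the actual mean."""
    n = len(values)
    actual_mean = sum(values) / n
    squared_deviations = [(x - actual_mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)

def sd_short_cut(values, assumed_mean):
    """Short-cut method: deviations D = X - A are taken from an assumed mean A."""
    n = len(values)
    deviations = [x - assumed_mean for x in values]
    sum_d = sum(deviations)
    sum_d_squared = sum(d * d for d in deviations)
    return math.sqrt(sum_d_squared / n - (sum_d / n) ** 2)

def sd_discrete(values, frequencies, assumed_mean):
    """Discrete series: each deviation from A is weighted by its frequency f."""
    n = sum(frequencies)  # N is the sum of frequencies
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, frequencies))
    sum_fd_squared = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, frequencies))
    return math.sqrt(sum_fd_squared / n - (sum_fd / n) ** 2)

# Hypothetical data: the same eight observations, written in full and as a frequency table.
individual = [2, 4, 4, 4, 5, 5, 7, 9]
distinct_values = [2, 4, 5, 7, 9]
frequencies = [1, 3, 2, 1, 1]

print(sd_direct(individual))
print(sd_short_cut(individual, assumed_mean=4))
print(sd_discrete(distinct_values, frequencies, assumed_mean=4))
```

All three calls should agree, because the correction term in the short-cut formula removes the effect of whichever assumed mean is chosen.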
c. Standard deviation in case of continuous series

Standard deviation in case of a continuous series can be calculated by using the following steps:

- Calculate the mid value of each class of the series and denote it as m.
- Assume any value from the mid values and denote it as A.
- Calculate the deviation of each class, i.e., calculate m - A, and then divide it by the class interval value (i), i.e., $d = \frac{m - A}{i}$.
- Multiply the corresponding frequency of each class with the deviation value and take the sum at the end of the column, i.e., calculate $\sum f d$.
- In the next column square the deviation value of each class, i.e., calculate $d^2$.
- Multiply the value of $d^2$ with its frequency, i.e., calculate $\sum f d^2$.
- Use the following formula to get the standard deviation: $\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$.

Properties of standard deviation

As a tool of variation, standard deviation is a good measure for interpreting the scatter of the observations of a series. In a normal distribution approximately 68 per cent of the observations of a series lie within one standard deviation of the mean, approximately 95.5 per cent of the items lie within 2 standard deviations of the mean, and in the same way approximately 99.7 per cent of the items lie within 3 standard deviations of the mean.
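The coverage figures quoted above can be reproduced from the normal distribution itself. The sketch below relies on the standard identity that the share of a normal distribution within k standard deviations of the mean equals erf(k/√2); the function name `normal_coverage` is an assumption for illustration.

```python
import math

def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Print the coverage as a percentage rounded to two decimals.
    print(k, round(100 * normal_coverage(k), 2))
```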
Hence, in a series with normal distribution:

- $\bar{X} \pm 1\sigma$ covers 68.27 per cent of the items,
- $\bar{X} \pm 2\sigma$ covers 95.45 per cent of the items, and
- $\bar{X} \pm 3\sigma$ covers 99.73 per cent of the items.

Advantages of standard deviation

Following are some advantages of standard deviation as a measure of dispersion:

- It is the most widely used technique of dispersion.
- It is regarded as a very satisfactory measure of the dispersion of a series.
- It is capable of further mathematical calculations.
- Algebraic signs are not ignored while measuring the value of standard deviation of a series.
- It is less affected by the extreme observations of a series.
- Its coefficients make the standard deviation a very popular measure of the scatter of a series.

Disadvantages of standard deviation

The disadvantages are:

- The concept is not easy to understand quickly.
- It requires a good deal of work to calculate the value of standard deviation.
- It gives more weight to observations which are far from the arithmetic mean.

THE COEFFICIENT OF VARIATION

Another useful statistical tool for measuring the dispersion of a series is the coefficient of variation. The coefficient of variation is the relative measure corresponding to standard deviation, which is an absolute measure of dispersion. This tool is mostly used for comparing the variability of two or more series of observations. While comparing, the series for which the value of the coefficient of variation is greater is said to be more variable (i.e., the observations of the series are less consistent, less uniform, less stable or less homogeneous). Hence it is always advisable to choose the series which has the smaller coefficient of variation. A smaller value of the coefficient implies a series that is more consistent, more uniform, more stable and of course more homogeneous. The value of the coefficient of variation is always measured by using the value of standard deviation and its corresponding arithmetic mean.
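The comparison of two series described above can be sketched as follows, assuming the usual definition of the coefficient of variation as the standard deviation divided by the arithmetic mean, multiplied by 100. The function name and the two score series are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(observations):
    """C.V. = (standard deviation / arithmetic mean) * 100, using the population SD."""
    return pstdev(observations) / mean(observations) * 100

# Hypothetical scores of two players with the same average.
player_a = [10, 20, 30]
player_b = [19, 20, 21]

cv_a = coefficient_of_variation(player_a)
cv_b = coefficient_of_variation(player_b)
more_consistent = "player_b" if cv_b < cv_a else "player_a"
print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```

Because the result is a percentage of the mean, the same comparison would still work if the two series were measured in different units or had very different averages.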
The coefficient of variation is denoted as C.V. and is measured by using the simple formula $\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$.

In practice, researchers generally prefer standard deviation to the coefficient of variation as a tool to measure dispersion, for a number of reasons (researchers are advised to refer to any standard statistics book to know more about the coefficient of variation and its usefulness).

GINI COEFFICIENT AND THE LORENZ CURVE

An illuminating manner of viewing the Gini coefficient is in terms of the Lorenz curve, due to Lorenz (1905). The Gini coefficient is generally defined on the basis of the Lorenz curve and is popularly known as the Lorenz ratio. The most common definition of the Gini coefficient in terms of the Lorenz diagram is the ratio of the area between the Lorenz curve and the line of equality to the area of the triangle OBD below this line (figure-1). The Gini coefficient varies between the limits of 0 (perfect equality) and 1 (perfect inequality), and the greater the departure of the Lorenz curve from the diagonal, the larger is the value of the Gini coefficient. Various geometrical definitions of the Gini coefficient discussed in the literature and useful for different purposes are examined here.

CONCLUSIONS

The study of dispersion enables us to know whether a series is homogeneous (where all the observations remain around the central value) or heterogeneous (where there are variations in the observations around the central value). Hence it can be said that a measure of dispersion describes the spread or scattering of the individual values of a series around its central value. There are a number of methods to determine the variations, as discussed in this chapter. But researchers are often confused about which of these techniques is the best. The answer to this question is simple: no single measure can be considered the best for all types of data series.
The most important factors are the type of data available and the purpose of the investigation. Critics suggest that if a series has many extreme values, then standard deviation should be avoided as a technique. On the other hand, in case of more skewed observations standard deviation may be used but mean deviation needs to be avoided, whereas if the series has large gaps between observations, then quartile deviation is not an appropriate measure to use. Apart from these cases, standard deviation is generally regarded as the best technique for most purposes.

SUMMARY

- The study of dispersion enables us to know whether a series is homogeneous (where all the observations remain around the central value) or heterogeneous (where there are variations in the observations around the central value).
- Dispersion can be measured on the basis of the difference between two extreme values selected from a series of data. The two well known measures are (i) the Range and (ii) the Inter-quartile Range.
- Dispersion can be measured on the basis of average deviation from some measure of central tendency. The well known measures are (i) the Mean/average deviation, (ii) the Standard Deviation, (iii) the Coefficient of variation and (iv) the Gini coefficient and the Lorenz curve.
- The range is defined as the difference between the highest value and the lowest value of the series. As a measure of variation, the range has limited applicability.
- The inter-quartile range measures approximately how far from the median one must go on either side before one-half of the values of the data set are included.
- Mean/average deviation is the arithmetic mean of the deviations of a series computed from any measure of central tendency, i.e., deviations from the mean, the median or the mode. The absolute values of the deviations are used.
- A small value of standard deviation implies a high degree of homogeneity among the observations in the series.
- When the standard deviations of two or more series are compared, it is always advisable to choose as the ideal one the series which has the smallest standard deviation.
- Standard deviation is always measured from the mean or average value of the series.
- The coefficient of variation is the relative measure corresponding to standard deviation, which is an absolute measure of dispersion. This tool is mostly used for comparing the variability of two or more series of observations.
- The most common definition of the Gini coefficient in terms of the Lorenz diagram is the ratio of the area between the Lorenz curve and the line of equality to the area of the triangle below the equality line.

IMPORTANT QUESTIONS

1. The ages of ten students in a class are considered. Find the mean and standard deviation. 19, 21, 20, 20, 23, 25, 24, 25, 22, 26
2. The following table gives the marks obtained in the Statistics paper by 100 students in a class. Calculate the standard deviation and mean deviation.
3. The monthly profits of 150 shopkeepers selling different commodities on a city footpath are given below. Calculate the mean, mean deviation and standard deviation of the distribution.
4. The daily wages of 160 labourers working in a cotton mill in Surat city are given below. Calculate the range, mean deviation and standard deviation of the distribution.
5. Calculate the mean deviation and standard deviation of the following distribution.
6. What do you mean by a measure of dispersion? How far is it helpful to a decision-maker in the process of decision making?
7. Define measure of dispersion. Among the various tools of dispersion, which tool according to you is the best one? Give suitable reasons for your answer.
8. What do you mean by a measure of dispersion? Compare and contrast the various tools of dispersion by pointing out their advantages and disadvantages.
9. Discuss with examples the relative merits of range, mean deviation and standard deviation as measures of dispersion.
10. Define standard deviation. Why is standard deviation more useful than other measures of dispersion?
11. The data given below shows the ages of 100 students pursuing their master's degree in economics. Calculate the mean deviation and standard deviation.
12. Following are the results of a study carried out to determine the mileage that marketing executives drove their cars over a 1-year period. For this, 50 marketing executives were sampled. Based on the findings, calculate the range and inter-quartile range.
13. An enquiry was made into the number of days 230 randomly chosen patients stayed in a Government hospital after an operation. On the basis of the observations calculate the standard deviation.
14. Cars sold in the small car segment in November 2009 at 10 Maruti Suzuki dealers in Delhi city are given below. Compute the range, mean deviation and standard deviation of the data series.
15. Following is the daily data on the number of persons who entered an institute through the main gate in a month. Calculate the range and standard deviation of the series.
16. Calculate the range and coefficient of range for a group of students from the marks obtained in two papers as given below.
17. Following are the marks obtained by some students in a class test. Calculate the range and coefficient of range.
18. By using the direct and indirect methods, calculate the mean deviation by using both the arithmetic mean and the mode from the following data set, which relates to the age and number of residents of Vasundara apartment, Gaziabad.
19. A local geyser maker at Greater Noida has developed a new and cheap variety of geysers which are meant for lower and middle income households. He carried out a survey in some apartments asking customers how much they are ready to spend on the purchase of a geyser. Calculate the standard deviation of the series.
20. Calculate the median of the following distribution. From the median value calculate the mean deviation and coefficient of mean deviation.
21. Calculate the median of the following distribution. From the median value calculate the mean deviation and coefficient of mean deviation.
22. Calculate the arithmetic average and standard deviation from the following daily data of rickshaw pullers of Hyderabad City.
23. For a group of 250 candidates, the mean and standard deviation of their total marks were calculated as 60 and 17. Later, in the process of verification, it was found that a score of 46 was misread as 64. Recalculate the correct mean and standard deviation.
24. The wages paid on a daily basis in two cotton factories are given below. In order to show the inequality, draw the Lorenz curve.
25. Total marks obtained by the students in two sections are given below. By using the data draw a Lorenz curve.
26. Draw the Lorenz curve of the following data.
27. Find the range and coefficient of range for the following data set.
28. The heights of 10 firemen working in a fire station are 165, 168, 172, 174, 175, 178, 156, 158, 160, 179 cms. Calculate the range of the series. Now suppose that the tallest and the shortest firemen are transferred from the fire station. Calculate the range for the remaining firemen. What percentage change is found between the earlier range and the latter range?
29. Calculate the quartile deviation from the following data.
30. Calculate the interquartile range, quartile deviation and its coefficient for the following data series.
31. Calculate the mean deviation from the following data.
32. Calculate the mean deviation from the median and the mean for the following series.
33. The distribution given below reveals the difference in age between husband and wife in a community. Based on the data, calculate the mean deviation and standard deviation.
34. Calculate th