MATH 221 Week 1
Student Name
Statistics for Decision-Making Lecture Notes
The Preliminaries – Statistics for a Modern World

What is statistics?
Statistics is the branch of mathematics that deals with gathering, tabulating and presenting data or information and rendering a decision based upon the presented data or information. In essence, whenever data or information is collected, a statistical process is created.

Who uses statistics?
Statistical processes are a daily part of life for businesses, the government, and everyday people at home, at school and at work.

What are the two major categories of statistics?
The field of statistics is typically divided into two major categories: descriptive statistics and inferential statistics.
Descriptive statistics: the part of statistics that deals with the gathering, tabulating and presentation of data or information.
Inferential statistics: the part of statistics that deals with rendering a decision based upon the presented, summarized data.

What are the types of data?
Statistics typically means working with data. The major types of data include ordinal, nominal, interval and ratio data.

Ordinal data is data that can be arranged in numerical order. An example would be arranging, in decreasing order, the test scores of students from a math class. Another example would be arranging, in numerical order, the hourly temperature readings for a certain city. Still another example is arranging shoe sizes in decreasing order.

Nominal data is data where numbers have meaning “in name only”. That is, a number is used to classify and is therefore not used as an amount or value. An example would be classifying income tax return filing status as follows: a “1” stands for single filing status, a “2” stands for a married filing status, a “3” stands for married filing separately status, a “4” stands for head of household status and a “5” stands for qualifying widow(er).
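The filing-status coding just described can be sketched in Python. The code numbers follow the example; the dictionary name FILING_STATUS and the list of sample returns are assumptions made up for illustration only. The sketch suggests why such numbers are labels: counting how often each code occurs is meaningful, while adding or averaging the codes would not be.

```python
from collections import Counter

# Codes from the filing-status example; the numbers are labels, not amounts.
FILING_STATUS = {
    1: "single",
    2: "married",
    3: "married filing separately",
    4: "head of household",
    5: "qualifying widow(er)",
}

# Hypothetical sample of coded tax returns (made up for illustration).
sample_returns = [1, 2, 2, 4, 1, 5, 3, 2, 1, 1]

# Counting occurrences of each code is a valid summary of nominal data.
counts = Counter(sample_returns)
for code in sorted(counts):
    print(f"{code} ({FILING_STATUS[code]}): {counts[code]}")

# The most frequent category is also a meaningful summary.
most_common_code, most_common_count = counts.most_common(1)[0]
print("Most common status:", FILING_STATUS[most_common_code])
```

Replacing the numbers 1 through 5 with any other five distinct labels would leave the counts unchanged, which is the sense in which the numbers have meaning in name only.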
Another example is scaling information such that a 1 means “likes very much”, a 2 means “likes just a little” and a 3 means “totally dislikes.”

Ratio data is data that can be described as a ratio or fraction. That is, one quantity can be compared with another. An example of ratio data is a baseball player's “slugging percentage”, which is a comparison of the number of “slugging” type hits divided by the total number of times at bat. Another example is a student's grade point average or GPA, which compares the student's total grade points to the total number of credit hours earned.

Interval data is data that can be described by a particular numerical interval, which can be obtained by taking the difference of successive numbers. An example of interval data is recording the postal charge for 10 random customers each day for five successive days and then noting the differences in their average purchases from day to day to check whether the average lies within a certain predefined interval.

Copyright 2002 by P.E.P.

Summarizing data
One way to summarize data is to present the data in the form of a frequency distribution table. A frequency distribution table is a statistical construction that lists data according to individual classes. Such a table is useful in analyzing which class or group of data had the least or greatest activity.

Example
The test scores of a physics class are listed below. In the frequency distribution table that follows, arrange these scores according to their individual classes by taking a data value, determining its class and then entering a check mark into the appropriate row of the Tally column. Total the number of check marks for each row and enter this total in the Frequency column.
97 48 50 68 57 59
98 60 45 59 63 46
92 73 51 80 80 76
61 91 73 91 98 67
91 73 21 12 43 83
35 81 80 80 19 37

Class       Tally    Frequency
 0 – 10
11 – 20
21 – 30
31 – 40
41 – 50
51 – 60
61 – 70
71 – 80
81 – 90
91 – 100

Was there, in this example, one class of scores that had greater activity than the other classes?

Stem and leaf displays
Another popular way to summarize a group of data values is to use a stem and leaf display. To use a stem and leaf display, each given number is separated into two parts, a stem and a leaf. A number such as 35, for example, has a stem of 3 and a leaf of 5. The number 91 has a stem of 9 and a leaf of 1.

Example
The following stem and leaf display is equivalent to the data array below.

10 12 20 21 25 26 31 35 38 44 45 47 49

Stem   Leaf
1      0 2
2      0 1 5 6
3      1 5 8
4      4 5 7 9

Stem and leaf displays are useful in locating particular classes of numbers, such as those that have the least or most activity, as well as in locating the center-most number of the entire data set, often referred to as the median value of the data set.

Exercise 1
A shopping mall clerk is collecting data from mall patrons by asking them whether or not they use a particular laundry detergent. This type of statistical process of gathering data is best classified as _____________ .
(a) Descriptive Statistics (b) Inferential Statistics

Exercise 2
A student calculates her GPA (grade point average) by taking the quotient of her total grade points divided by her total credit hours. This type of data is best classified as _______________ data.
(a) ratio (b) interval (c) nominal (d) ordinal

Exercise 3
A fast food restaurant lists meal special # 1 as a cheeseburger, medium French fries and a medium soft drink, meal special # 2 as a gyros sandwich, medium French fries and a medium soft drink and meal special # 3 as a pizza slice with a large soft drink. This type of data is best classified as _______________ data.
(a) ratio (b) interval (c) nominal (d) ordinal

Exercise 4
A payroll manager arranges employee hourly wages in descending order. This type of data is best classified as _______________ data.
(a) ratio (b) interval (c) nominal (d) ordinal

Exercise 5
For the frequency distribution table given below, total the tally marks for each class to arrive at the class frequency and then answer each of the questions which follow.

Class: 0 – 20, 21 – 40, 41 – 60, 61 – 80, 81 – 100
Tally: / / / / /   /////   ///   /   ////   ///////
Frequency:

(a) True or False? Exactly 13 data values are greater than 60.
(b) True or False? Exactly 18 data values are equal to 81 or less.
(c) True or False? Exactly 7 data values must be in the range of 10 or more but not more than 40.

Exercise 6
For the frequency distribution table given below, total the tally marks for each class to arrive at the class frequency and then, for each class, calculate the relative frequency by dividing the class frequency by the overall total number of tally marks.

Class: 0 – 20, 21 – 40, 41 – 60, 61 – 80, 81 – 100
Tally: / / / / / / / / / / / / / / /   /   ////   /   ////
Frequency:
Relative Frequency:

Exercise 7
The stem-and-leaf display given below is equivalent to the ordered set of numbers 2, 6, 11, 12, 13, 13, 20, 25, etc. Determine the midpoint or center-most value of this ordered data array.
Stem: 0 1 2 3 4 5 6 7 8 9
Leaf: 2 1 0 0 8 0 1 3 5 1 6 2 5 3 8 0 6 5 1 3 7 4 4 9 6 3 5 6 7 8 9 7 7 8 9

One method to determine the midpoint or center-most value of this ordered data array is outlined as follows:
(1) Count the number of leaves.
(2) If the total number of leaves is an odd number, subtract one from the count; if the total number of leaves is an even number, subtract two from the count.
(3) Divide your result from step (2) by two.
(4) Starting from either the top or the bottom of the stem-and-leaf display, and in numerical order, count the leaves until you arrive at your result from step (3). If the original count of the leaves was odd, the value given by the next leaf is the center-most value; if the original count of the leaves was even, the average of the values given by the next two leaves is the center-most value.

Graphical Presentation of Data
The various ways to present data in the form of a graph include: pie charts, bar charts, line graphs, dot plots and histograms.

Pie Charts
A pie chart is useful to show the relation of a “part out of the total.” That is, it graphically illustrates the size relation of the various classes of data. For example, the chart shown in Figure 1 represents the following data grid, which shows the number of employees receiving a particular salary. Since the total number of employees is 20, the percent, or part of the total, can be determined by dividing the number of employees earning a given salary by the total number of employees. For example, 4 of the 20 employees earn $ 5,000, and 4 / 20 = 20%.

[Figure 1: pie chart with four slices labeled 2, 10%; 4, 20%; 8, 40%; 6, 30%.]

Number   Salary
4        $ 5,000
6        $ 2,200
8        $ 3,800
2        $ 9,000

Exercise 8
Create a pie chart from the given information. A group of students were surveyed as to their favorite type of music and their responses are summarized below. Use the space provided to display your chart.

Student's Music Survey
Type of Music   Number of Students
Classical        6
Gospel          12
New Age         15
Rock            33
Other           18

Bar Charts
A bar chart is generally useful to show a “running total”. That is, it graphically illustrates which class or item of data has the most activity. A Pareto diagram is a special type of bar graph with the bars arranged from the most numerous category to the least numerous category. This type of bar graph includes a line graph which displays the cumulative percentages and counts for the bars.

Example
Create a bar chart from the given information. A group of students were surveyed as to how often they study their college course work per week. Their responses are summarized below.
Student's Study Habits Survey
Hours of Study Time   Number of Students
0 – 2                 12
2 – 4                 28
4 – 6                 20
6 – 8                 10
8 – 10                 5

The bar chart (vertical-type bars), which follows, shows the time category on the horizontal axis versus the frequency (number of students) on the vertical axis.

[Figure: vertical bar chart. Horizontal axis: Study Time (0 – 2, 2 – 4, 4 – 6, 6 – 8, 8 – 10). Vertical axis: Students (0 to 30).]

Exercise 9
Create a vertical bar chart from the given information. A survey of forty college students revealed the following information about their favorite ice cream flavor.

Butter Pecan, Chocolate, Butter Pecan, Strawberry, Vanilla, Strawberry, Chocolate Fudge, Neapolitan, Strawberry, Chocolate, Vanilla, Strawberry, Vanilla, Strawberry, Vanilla, Chocolate, Chocolate Fudge, Neapolitan, Neapolitan, Vanilla, Neapolitan, Butter Pecan, Chocolate Fudge, Strawberry, Neapolitan, Strawberry, Vanilla, Neapolitan, Vanilla, Chocolate, Vanilla, Butter Pecan, Strawberry, Neapolitan, Chocolate Fudge, Strawberry, Strawberry, Vanilla, Neapolitan, Chocolate Fudge

First create a frequency distribution table to classify, according to ice cream flavor, the choices of each of the students. Let the vertical axis represent the class frequencies and let the horizontal axis represent the individual flavors, arranged in alphabetical order from left to right.

Line Charts
A line chart is generally useful to show “trends over a period of time”. That is, it graphically illustrates the change in value of some variable.

[Figure: line chart. Vertical axis: Temperature (20 to 60). Horizontal axis: Monday through Friday.]

A weatherman's Five-Day Forecast usually uses a line chart to show how the temperature of a city will vary over a given five-day period.
Dot Plots
A dot plot is a type of graphical representation of data and is used to show how many data values belong to a certain class by the number of “dots” in the column. An example of a dot plot is shown below. This example shows the number of packages, which weighed from 1 to 10 ounces, that were sent from a mailroom.

[Figure: dot plot. Horizontal axis: package weight in ounces, 1 to 10.]

Exercise 10
The numbers of customers who entered a certain electronics retail superstore during a twenty-minute period of time are listed below. The data was collected over a thirty-day period between 5:00 pm and 6:00 pm.

50 35 92 45 48 72 08 99 39 91
80 09 18 94 73 32 83 81 85 07
19 78 34 71 51 65 73 60 75 14

Using the space below, construct a dot plot based upon the above data.

0 – 9   10 – 19   20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   70 – 79   80 – 89   90 – 99

Histograms
One of the most common types of graphical presentation of a frequency distribution is the histogram or “history graph”. A histogram is a type of column chart in which the columns are “fused” together. An example of a histogram is shown below. The vertical axis represents the frequency of the class and the horizontal axis represents the class. Note that the class marks, defined below, 4.5, 14.5, etc., are used to identify the individual histogram bars.

[Figure: histogram. Vertical axis: frequency, 0 to 110. Horizontal axis: class marks 4.5, 14.5, 24.5, 34.5, 44.5.]

The above histogram corresponds to the following data:

Class      Frequency
 0 –  9     90
10 – 19     70
20 – 29    110
30 – 39     90
40 – 49     60

The lowest class, the one with the range of values from 0 through 9, has the following class limits: a lower class limit of 0 and an upper class limit of 9. Since the sum of the lower and the upper class limits is 0 + 9 or 9, the class mark is this sum divided by 2, or 4.5. The class mark for the next class similarly is ( 10 + 19 ) / 2 or 14.5.
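The class mark calculation can be sketched in Python as follows. The class limits and frequencies are the ones from the histogram example in this section; the function name class_mark and the variable names are assumptions made for illustration.

```python
def class_mark(lower_limit, upper_limit):
    """Return the class mark: the sum of the two class limits divided by 2."""
    return (lower_limit + upper_limit) / 2


# Class limits from the histogram example: 0-9, 10-19, ..., 40-49.
class_limits = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49)]
# Class frequencies from the same example, in the same order.
frequencies = [90, 70, 110, 90, 60]

class_marks = [class_mark(lo, hi) for lo, hi in class_limits]

# Print one row per class: limits, class mark, and frequency.
for (lo, hi), mark, freq in zip(class_limits, class_marks, frequencies):
    print(f"{lo:2d} - {hi:2d}   class mark {mark:4.1f}   frequency {freq}")
```

The computed class marks are 4.5, 14.5, 24.5, 34.5 and 44.5, the same values used to label the bars of the example histogram.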
By convention, instead of using the class limits, i.e. the upper and lower limits of the class, the class mark is used to identify the individual histogram bars.

A histogram is constructed using the following steps:
STEP 1  Represent the measurements or observations, which are grouped, on the horizontal scale of a graph.
STEP 2  Represent the class frequencies on the vertical scale of the graph.
STEP 3  Draw rectangles whose bases are equal to the class intervals and whose heights are determined by the respective class frequencies.

Exercise 11
The following are the scores that 30 students obtained on a computer science test.

71 72 79 85 39 58 91 48 63 93
39 98 48 77 74 81 84 70 18 89
63 92 90 31 38 82 47 51 75 92

Construct a histogram with classes 0 – 9, 10 – 19, 20 – 29, . . . , 90 – 99. Use the space below to construct the histogram.

Creating a Histogram with MS Excel
To construct this same histogram using MS Excel, use the steps outlined below.

STEP 1
In a blank Excel worksheet, type the word Class in cell A1 and type the word Frequency in cell B1. Enter the 10 classes of the data, 0 to 9, 10 to 19, 20 to 29, . . . , 90 to 99, in cells A2 through A11, respectively. Enter the class frequencies in worksheet cells B2 through B11, respectively. Highlight the cell range A2 : B11.

STEP 2
Open the Excel Chart Wizard by either clicking the Chart Wizard icon or by clicking Insert on the main title bar and then selecting Chart. When the Chart Wizard opens, proceed as follows:
Chart Wizard Step 1: Select the Column chart type, click the chart subtype shown at the right, and click the Next > button.
Chart Wizard Step 2: At this point, click the Next > button.
Chart Wizard Step 3: At this point, click the Next > button.
Chart Wizard Step 4: At this point, click the Finish button.
Once your chart has appeared, use your mouse to move the chart directly to the right of your source data.

STEP 3
Right-click one of the vertical bars of your chart object. Select the menu item Format Data Series… . When the Format Data Series dialog box opens, click the Options tab and change the Gap Width to 0. Click OK to close the dialog box. Your chart object should now appear as a conventional histogram.

Shapes of Histograms
The various types of histograms are described below:

Symmetrical: Both sides of the distribution are identical (the halves are mirror images). When the data also gathers around the center and becomes sparse at the extremes, this type of distribution is often referred to as a normal distribution.

Uniform (Rectangular): Every value appears with equal frequency.

Skewed (to the right): One tail is stretched longer than the other. The direction of the skewness is on the side of the longer tail, so here the longer tail extends to the right.

Skewed (to the left): One tail is stretched longer than the other. The direction of the skewness is on the side of the longer tail, so here the longer tail extends to the left.

J-Shaped: There is no tail on the side of the class with the highest frequency.

Bimodal: One or more classes separate the two most populous classes. This type of distribution often implies that two populations are being sampled.

Exercise 12
A wholesaler's daily shipment of trousers varied from 1,152 to 9,888 pairs per day. Indicate the limits of nine classes into which these shipments might be grouped.

Exercise 13
The selling prices of 30 homes listed for sale in a community vary from $ 68,950 to $ 194,900. Write the class limits and also the class boundaries for seven equal classes into which these prices might be grouped.

Exercise 14
The class marks of a distribution of the number of electric light bulbs replaced daily in a large corporate office center are 5, 10, 15 and 20.
(a) Find the class boundaries.
(b) Find the class limits.

Exercise 15
The daily number of employee absentees in a retail superstore is grouped into a table with the classes 0 – 4, 5 – 9, 10 – 14, 15 – 19 and 20 or more.
Is it possible to determine, from this table, the number of days in which there were
(a) at least 5 absentees?
(b) more than 5 absentees?
(c) at least 14 absentees?
(d) at least 15 absentees?