Survey Tabulation Basics: Statistics

This article provides a summary of the basic descriptive statistics typically shown in crosstabs and other survey data analyses. Included are the basic measures with which all researchers should be comfortable.

Descriptive Statistics are those measures which help "describe" a distribution of survey responses. Where applicable, the mean (average), median, standard deviation and standard error are often included on tables for analysis purposes. For example, it might be helpful to show the mean of a rating scale question or of other numeric fields (e.g., age or income values). These measures summarize the key results in a few succinct numbers.

A simple mean of a distribution is the arithmetic average - the sum of all responses divided by the number of responses. The mean of response values 1, 2, 2, 2, 3, 4, 5 is 2.7 (19 divided by 7). A weighted mean takes this one step further by assigning a weight to each response value. This is typically used when response values represent ranges such as age ranges - in this case, the mean is typically calculated based on the midpoints of the ranges (e.g., if the response value "1" is used to represent the age range 18-24, the range midpoint, "21", would be used to calculate a weighted mean which reflects age values). Weighted means may also be used to alter the coded values in a rating scale, for example, reversing a rating scale so the highest (or lowest) value indicates the value of greater importance.
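
As a minimal sketch of the two calculations above, the following Python snippet computes the simple mean of the example responses and a weighted mean based on range midpoints. Only the 18-24 range (midpoint 21) comes from the text; the other age ranges and their midpoints, and the variable names, are assumptions made for illustration.

```python
from statistics import mean

# Coded responses from the example above
responses = [1, 2, 2, 2, 3, 4, 5]

# Simple mean: sum of responses divided by the number of responses (19 / 7)
simple_mean = mean(responses)

# Midpoint assigned to each response code. Code 1 (18-24 -> 21) is from the
# text; codes 2-5 are hypothetical ranges (25-34, 35-44, 45-54, 55-64).
midpoints = {1: 21, 2: 29.5, 3: 39.5, 4: 49.5, 5: 59.5}

# Weighted mean: replace each code with its midpoint, then average
weighted_mean = mean(midpoints[code] for code in responses)

print(f"Simple mean of codes: {simple_mean:.1f}")       # 2.7
print(f"Weighted mean (age):  {weighted_mean:.1f}")     # 36.9
```

The same lookup approach could be used to reverse a rating scale: map each code to its reversed value before averaging.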

The median of a distribution is the "middle" value when all values are listed in order from lowest to highest. In the example above, the median value is "2" (the 4th value in the 7-value list). Medians are often used where the presence of outliers (extreme responses) would skew the mean. For example, a distribution of incomes of $18k, $24k, $35k, $42k, $46k, $65k, $72k, $125k, $4.5M would have a mean of $547k but a median value of $46k (the 5th of the 9 values), a statistic that better describes the income level of the sample group.
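
The income example can be checked with a short Python sketch. The values are those listed above; the variable names are illustrative only.

```python
from statistics import mean, median

# Income values from the example above, in dollars
incomes = [18_000, 24_000, 35_000, 42_000, 46_000,
           65_000, 72_000, 125_000, 4_500_000]

income_mean = mean(incomes)      # pulled upward by the $4.5M outlier
income_median = median(incomes)  # middle (5th) value of the 9 sorted values

print(f"Mean:   ${income_mean:,.0f}")    # $547,444
print(f"Median: ${income_median:,.0f}")  # $46,000
```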

Standard Deviation (SD) and Standard Error (SE) are perhaps the two least understood statistics shown in data tables. Both provide additional insight regarding the mean of a distribution. The standard deviation describes how far, on average, the individual values fall from the mean. A small SD would indicate that most values are clustered close to the mean value, while a large SD would describe a distribution where the values vary widely from the mean. For example, the distribution 11, 11, 12, 12, 12, 13, 13 has the same mean (12) as the distribution 1, 2, 4, 6, 15, 21, 35, but the two have very different SDs. If distribution values were represented on a frequency curve, a small SD would be indicated by a narrow, tall shape, while a large SD would be depicted by a short, wide shape. The SE, on the other hand, is an indication of the reliability of the mean. It is calculated as the SD divided by the square root of the sample size. A small SE is an indication that the sample mean is likely to be an accurate reflection of the actual population mean, usually based on a combination of a large sample size and low SD.
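
A minimal Python sketch of the comparison above follows. It assumes the sample standard deviation (dividing by n - 1), which the text does not specify, and computes the SE as the SD divided by the square root of the sample size. The function and variable names are illustrative.

```python
import math
from statistics import mean, stdev

# The two example distributions, both with a mean of 12
tight = [11, 11, 12, 12, 12, 13, 13]
wide = [1, 2, 4, 6, 15, 21, 35]

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))

# stdev() uses the n - 1 (sample) formula; this choice is an assumption here
sd_tight, sd_wide = stdev(tight), stdev(wide)
se_tight, se_wide = standard_error(tight), standard_error(wide)

print(f"Means: {mean(tight)} and {mean(wide)}")        # 12 and 12
print(f"SD:    {sd_tight:.2f} vs {sd_wide:.2f}")       # 0.82 vs 12.52
print(f"SE:    {se_tight:.2f} vs {se_wide:.2f}")       # 0.31 vs 4.73
```

With the same sample size, the distribution with the larger SD also has the larger SE, so its mean is a less reliable estimate.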

Significance Testing was covered in detail in a previous edition of StarTips (What Every Researcher Should Know About Statistical Significance). Data tables are often provided with results of statistical tests which show the reader at a glance which results are "statistically significant" when making comparisons across groups. DataStar also provides a complimentary tool, StarStat (and an iPhone app), which calculates z-tests, t-tests, and sample precision estimates.

 

This content was provided by DataStar, Inc. Visit their website at www.surveystar.com.

