# Margin of error

## Information about Margin of error

The top portion of this graphic depicts probability densities (for a binomial distribution) that show the relative likelihood that the "true" percentage is in a particular area given a reported percentage of 50%. The bottom portion of this graphic shows the margin of error, the corresponding zone of 95% confidence. In other words, one is 95% sure that the "true" percentage is in this region given a poll with the sample size shown to the right. The larger the sample is, the smaller the margin of error is.

The margin of error is a statistic expressing the amount of random sampling error in a survey's results. The larger the margin of error, the less confidence one should have that the poll's reported results are close to the "true" figures; that is, the figures for the whole population.

## Explanation

The margin of error is usually defined as the radius of a confidence interval for a particular statistic from a survey. One example is the percentage of people who prefer product A versus product B. When a single, global margin of error is reported for a survey, it refers to the maximum margin of error for all reported percentages using the full sample from the survey. If the statistic is a percentage, this maximum margin of error can be calculated as the radius of the confidence interval for a reported percentage of 50%.

The margin of error has been described as an "absolute" quantity, equal to a confidence interval radius for the statistic. For example, if the true value is 50 percentage points, and the statistic has a confidence interval radius of 5 percentage points, then we say the margin of error is 5 percentage points. As another example, if the true value is 50 people, and the statistic has a confidence interval radius of 5 people, then we might say the margin of error is 5 people.

In some cases, the margin of error is not expressed as an "absolute" quantity; rather it is expressed as a "relative" quantity. For example, suppose the true value is 50 people, and the statistic has a confidence interval radius of 5 people. If we use the "absolute" definition, the margin of error would be 5 people. If we use the "relative" definition, then we express this absolute margin of error as a percent of the true value. So in this case, the absolute margin of error is 5 people, but the "percent relative" margin of error is 10% (because 5 people are ten percent of 50 people). Often, however, the distinction is not explicitly made, yet usually is apparent from context.
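The distinction between the two definitions can be illustrated with a minimal sketch. The function name `relative_margin_of_error` is hypothetical and chosen only for this example; the numbers are the 5-out-of-50 figures from the paragraph above.

```python
def relative_margin_of_error(absolute_moe: float, value: float) -> float:
    """Express an absolute margin of error as a fraction of the value it surrounds."""
    if value == 0:
        raise ValueError("relative margin of error is undefined for a value of 0")
    return absolute_moe / abs(value)


# Absolute margin of error: 5 people around a value of 50 people.
print(f"{relative_margin_of_error(5, 50):.0%}")  # 10%
```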

Like confidence intervals, the margin of error can be defined for any desired confidence level, but usually a level of 90%, 95% or 99% is chosen (typically 95%). This level is the probability that a margin of error around the reported percentage would include the "true" percentage. Along with the confidence level, the sample design for a survey, and in particular its sample size, determines the magnitude of the margin of error. A larger sample size produces a smaller margin of error, all else remaining equal.

Whether an exact confidence interval or an approximate one is used (for example, by assuming the distribution is normal and then modeling the confidence interval accordingly), the margin of error takes only random sampling error into account. It does not represent other potential sources of error or bias such as a non-representative sample design, poorly phrased questions, people lying or refusing to respond, the exclusion of people who could not be contacted, or miscounts and miscalculations.

## Concept

### Running example

A running example from the 2004 U.S. presidential campaign will be used to illustrate concepts throughout this article. According to an October 2, 2004 survey by Newsweek, 47% of registered voters would vote for John Kerry/John Edwards if the election were held on that day, 45% would vote for George W. Bush/Dick Cheney, and 2% would vote for Ralph Nader/Peter Camejo. The size of the sample was 1,013.[1] Unless otherwise stated, the remainder of this article uses a 95% level of confidence.

### Basic concept

Polls typically involve taking a sample from a certain population. In the case of the Newsweek poll, the population of interest is the population of people who will vote. Because it is impractical to poll everyone who will vote, pollsters take smaller samples that are intended to be representative; that is, a random sample of the population.[2] It is possible that pollsters sample 1,013 voters who happen to vote for Bush when in fact the population is evenly split between Bush and Kerry, but this is extremely unlikely ($p = 2^{-1013} \approx 1.13923782 \times 10^{-305}$) given that the sample is random.
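As a quick check of how small that probability is, the following sketch assumes each of the 1,013 respondents independently names Bush with probability 0.5. The function name is hypothetical.

```python
def prob_unanimous_sample(n: int, share: float = 0.5) -> float:
    """Probability that all n independent respondents pick the same given candidate."""
    return share ** n


p = prob_unanimous_sample(1013)
print(f"{p:.3e}")  # 1.139e-305
```

The result is still representable as an ordinary double-precision float, although it is close to the lower limit of that range.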

Sampling theory provides methods for calculating the probability that the poll results differ from reality by more than a certain amount, simply due to chance; for instance, that the poll reports 47% for Kerry but his support is actually as high as 50%, or is really less than 44%. This theory and some Bayesian assumptions suggest that the "true" percentage will probably be fairly close to 47%. The more people that are sampled, the more confident pollsters can be that the "true" percentage is close to the observed percentage. The margin of error is a measure of how close the results are likely to be.

However, the margin of error only accounts for random sampling error, so it is blind to systematic errors that may be introduced by non-response or by interactions between the survey and subjects' memory, motivation, communication and knowledge.[3]

### Calculations assuming random sampling

This section will briefly discuss the standard error of a percentage, the corresponding confidence interval, and connect these two concepts to the margin of error. For simplicity, the calculations here assume the poll was based on a simple random sample from a large population.

The standard error of a reported proportion or percentage p measures its accuracy, and is the estimated standard deviation of that percentage. It can be estimated from just p and the sample size, n, if n is small relative to the population size, using the following formula:[4]

$$\text{Standard error} = \sqrt{\frac{p(1-p)}{n}}$$

When the sample is not a simple random sample from a large population, the standard error and the confidence interval must be estimated through more advanced calculations. In most cases, the true confidence interval is approximated by assuming the distribution is normal and computing the interval accordingly. For normal distributions, the confidence interval radii are proportional to the standard error. Usually, the true standard error is unknown, so an estimate's standard error is calculated from the sample data.

Note that there is not necessarily a strict connection between the true confidence interval and the true standard error. The true c-percent confidence interval is the interval [a,b] that contains c percent of the distribution, where (100-c)/2 percent of the distribution lies below a and (100-c)/2 percent lies above b. The true standard error of the statistic is the square root of the true sampling variance of the statistic. These two may not be directly related, although in general, for sampling distributions that look like normal curves, there is a direct relationship.

In the Newsweek poll, Kerry's level of support is p = 0.47 and n = 1,013. The standard error (0.016, or 1.6%) helps to give a sense of the accuracy of Kerry's estimated percentage (47%). A Bayesian interpretation of the standard error is that although we do not know the "true" percentage, it is highly likely to be located within two standard errors of the estimated percentage (47%). The standard error can be used to create a confidence interval within which the "true" percentage should be to a certain level of confidence.
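The figure of 1.6% can be reproduced with a short sketch. It assumes the usual simple-random-sample formula for the standard error of a proportion, sqrt(p(1 − p)/n); the function name is illustrative only.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Estimated standard error of a sample proportion p from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)


se = standard_error(0.47, 1013)  # Kerry's share and the Newsweek sample size
print(round(se, 3))  # 0.016
```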

The estimated percentage plus or minus its margin of error is a confidence interval for the percentage. In other words, the margin of error is half the width of the confidence interval. It can be calculated as a multiple of the standard error, with the factor depending on the level of confidence desired; a margin of one standard error gives a 68% confidence interval, while the estimate plus or minus 1.96 standard errors is a 95% confidence interval, and a 99% confidence interval runs 2.58 standard errors on either side of the estimate.
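The multipliers quoted above come from the standard normal distribution. As a sketch, assuming the normal approximation holds, they can be recovered with `statistics.NormalDist` from the Python standard library; `z_multiplier` is a name made up for this example.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Number of standard errors on each side of the estimate for a two-sided normal interval."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return NormalDist().inv_cdf(0.5 + confidence / 2)


for level in (0.68, 0.95, 0.99):
    print(f"{level:.0%}: {z_multiplier(level):.2f} standard errors")
# 68%: 0.99 standard errors
# 95%: 1.96 standard errors
# 99%: 2.58 standard errors
```

The 68% level gives 0.99 rather than exactly 1 because one standard error corresponds to about 68.3% confidence.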

### Definition

The margin of error for a particular statistic of interest is usually defined as the radius (or half the width) of the confidence interval for that statistic.[5][6] The term can also be used to mean sampling error in general. In media reports of poll results, the term usually refers to the maximum margin of error for any percentage from that poll.

### Maximum margin of error

The maximum margin of error for any percentage is the radius of the confidence interval when p = 50%. As such, it can be calculated directly from the number of poll respondents. For 95% confidence, assuming a simple random sample from a large population:

$$\text{(Maximum) margin of error (95\%)} = 1.96 \times \sqrt{\frac{0.5\,(1-0.5)}{n}} = \frac{0.98}{\sqrt{n}}$$

This calculation gives a margin of error of 3% for the Newsweek poll, which reported a margin of error of 4%. The difference was probably due to weighting or complex features of the sampling design that required alternative calculations for the standard error. It is also possible that Newsweek rounded conservatively to avoid overstating the confidence of its results.
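The 3% figure can be checked with a minimal sketch, assuming a simple random sample from a large population and the 95% multiplier of 1.96. The function name is hypothetical.

```python
import math


def max_margin_of_error(n: int, z: float = 1.96) -> float:
    """Radius of the confidence interval at p = 0.5, where p * (1 - p) is largest."""
    return z * math.sqrt(0.5 * (1 - 0.5) / n)


print(f"{max_margin_of_error(1013):.1%}")  # 3.1%
```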

### Different confidence levels

For a simple random sample from a large population, the maximum margin of error is a simple re-expression of the sample size n. The numerators of these equations are rounded to two decimal places.

Margin of error at 99% confidence $\approx \dfrac{1.29}{\sqrt{n}}$

Margin of error at 95% confidence $\approx \dfrac{0.98}{\sqrt{n}}$

Margin of error at 90% confidence $\approx \dfrac{0.82}{\sqrt{n}}$

If an article about a poll does not report the margin of error, but does state that a simple random sample of a certain size was used, the margin of error can be calculated for a desired degree of confidence using one of the above formulae. Also, if the 95% margin of error is given, one can find the 99% margin of error by increasing the reported margin of error by about 30%.
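The "about 30%" rule of thumb follows from the ratio of the normal multipliers at the two confidence levels. The sketch below assumes the normal approximation; `rescale_margin` and `numerator` are illustrative names, not standard functions.

```python
from statistics import NormalDist


def _z(confidence: float) -> float:
    """Two-sided standard normal multiplier for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def numerator(confidence: float) -> float:
    """Constant c in the maximum margin of error c / sqrt(n), i.e. z * 0.5."""
    return 0.5 * _z(confidence)


def rescale_margin(moe: float, from_conf: float, to_conf: float) -> float:
    """Convert a margin of error reported at one confidence level to another level."""
    return moe * _z(to_conf) / _z(from_conf)


print([round(numerator(c), 2) for c in (0.99, 0.95, 0.90)])  # [1.29, 0.98, 0.82]
print(f"{rescale_margin(0.03, 0.95, 0.99):.1%}")  # 3.9%
```

A reported 95% margin of 3% therefore corresponds to roughly 3.9% at 99% confidence, an increase of about 31%.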

### Maximum and specific margins of error

While the margin of error typically reported in the media is a poll-wide figure that reflects the maximum sampling variation of any percentage based on all respondents from that poll, the term margin of error also refers to the radius of the confidence interval for a particular statistic.

The margin of error for a particular individual percentage will usually be smaller than the maximum margin of error quoted for the survey. This maximum only applies when the observed percentage is 50%, and the margin of error shrinks as the percentage approaches the extremes of 0% or 100%.

In other words, the maximum margin of error is the radius of a 95% confidence interval for a reported percentage of 50%. If p moves away from 50%, the confidence interval for p will be shorter. Thus, the maximum margin of error represents an upper bound to the uncertainty; one is at least 95% certain that the "true" percentage is within the maximum margin of error of a reported percentage for any reported percentage.
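To see how the margin shrinks away from 50%, the following sketch applies the simple-random-sample formula to the three percentages reported in the Newsweek poll (n = 1,013). The function name and the 1.96 multiplier for 95% confidence are assumptions of the example.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Margin of error for a specific reported proportion p, simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


n = 1013
for name, share in (("Kerry", 0.47), ("Bush", 0.45), ("Nader", 0.02)):
    print(f"{name}: {share:.0%} +/- {margin_of_error(share, n):.1%}")
# Kerry: 47% +/- 3.1%
# Bush: 45% +/- 3.1%
# Nader: 2% +/- 0.9%
```

The margins for Kerry and Bush are nearly equal to the maximum, while the margin for Nader's 2% is less than a third of it.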

### Effect of population size

The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of the population of interest. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling fraction is less than 10%.

In cases where the sampling fraction exceeds 10%, analysts can adjust the margin of error using a "finite population correction" (FPC) to account for the added precision gained by sampling a larger percentage of the population. The FPC can be calculated using the formula:

$$\text{FPC} = \sqrt{\frac{N-n}{N-1}}$$

To adjust for a large sampling fraction, the FPC is factored into the calculation of the margin of error, which has the effect of narrowing the margin of error. The FPC approaches zero as the sample size (n) approaches the population size (N), which has the effect of eliminating the margin of error entirely. This makes intuitive sense because when N = n, the sample becomes a census and sampling error becomes moot.

Analysts should take care that the sample remains truly random as the sampling fraction grows, lest sampling bias be introduced.
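A minimal sketch of the adjustment, assuming the common form of the correction, sqrt((N − n)/(N − 1)), and a hypothetical population of 1,000 from which 500 people are sampled (neither figure comes from the Newsweek poll):

```python
import math


def finite_population_correction(n: int, N: int) -> float:
    """FPC factor for a sample of size n drawn without replacement from a population of size N."""
    if N < 2 or not 0 < n <= N:
        raise ValueError("require N >= 2 and 0 < n <= N")
    return math.sqrt((N - n) / (N - 1))


def adjusted_max_margin(n: int, N: int, z: float = 1.96) -> float:
    """Maximum margin of error (p = 0.5) multiplied by the finite population correction."""
    return z * math.sqrt(0.25 / n) * finite_population_correction(n, N)


# Hypothetical numbers: sampling fraction of 50%.
print(f"{1.96 * math.sqrt(0.25 / 500):.1%}")      # 4.4% without correction
print(f"{adjusted_max_margin(500, 1000):.1%}")    # 3.1% with correction
print(adjusted_max_margin(1000, 1000))            # 0.0 for a census
```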

### Other statistics

Confidence intervals can be calculated, and so can margins of error, for a range of statistics including individual percentages, differences between percentages, averages, medians[7] and totals.

The margin of error for the difference between two percentages is larger than the margins of error for each of these percentages, and may even be larger than the maximum margin of error for any individual percentage from the survey.

## Comparing percentages

In a plurality voting system, it is important to know who is ahead. The terms "statistical tie" and "statistical dead heat" are sometimes used to describe reported percentages that differ by less than a margin of error, but these terms can be misleading.[8][9] For one thing, the margin of error as generally calculated is applicable to an individual percentage and not the difference between percentages, so the difference between two percentage estimates may not be statistically significant even when they differ by more than the reported margin of error. The survey results also often provide strong information even when there is not a statistically significant difference.

When comparing percentages, it can accordingly be useful to consider the probability that one percentage is higher than another.[10] In simple situations, this probability can be derived with 1) the standard error calculation introduced earlier, 2) the formula for the variance of the difference of two random variables, and 3) an assumption that if anyone does not choose Kerry they will choose Bush, and vice versa; they are perfectly negatively correlated. This may not be a tenable assumption when there are more than two possible poll responses. For more complex survey designs, different formulas for calculating the standard error of difference must be used.

The standard error of the difference of percentages p for Kerry and q for Bush, assuming that they are perfectly negatively correlated, follows:

$$\text{Standard error of difference} = \sqrt{\frac{p(1-p) + q(1-q) + 2pq}{n}}$$

Given the observed percentage difference p − q (2% or 0.02) and the standard error of the difference calculated above (0.03), any statistical calculator may be used to calculate the probability that a sample from a normal distribution with mean 0.02 and standard deviation 0.03 is greater than 0.

Applying these calculations to the Newsweek example results in a 75% probability that Kerry was "truly" leading.
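The 75% figure can be reproduced with the sketch below. It assumes the variance of the difference is (p(1 − p) + q(1 − q) + 2pq)/n, which is the form that yields the 0.03 standard error quoted above, and it treats the difference as normally distributed. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_of_difference(p: float, q: float, n: int) -> float:
    """Standard error of p - q for two shares measured on the same sample of size n."""
    return math.sqrt((p * (1 - p) + q * (1 - q) + 2 * p * q) / n)


def prob_leading(p: float, q: float, n: int) -> float:
    """Probability that a normal variable with mean p - q and the above SE exceeds 0."""
    difference = NormalDist(mu=p - q, sigma=se_of_difference(p, q, n))
    return 1 - difference.cdf(0)


print(round(se_of_difference(0.47, 0.45, 1013), 2))  # 0.03
print(f"{prob_leading(0.47, 0.45, 1013):.0%}")       # 75%
```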

## Notes

1. ^ Newsweek (2004-10-02). NEWSWEEK POLL: First Presidential Debate. Press release. Retrieved on 2006-05-31.
2. ^ Wonnacott and Wonnacott (1990), pp. 4–8.
3. ^ Sudman, S.L. and Bradburn N.M. (1982) Asking Questions. Jossey-Bass: pp. 17-19
4. ^ Sample Sizes, Margin of Error, Quantitative Analysis
5. ^ Lohr, Sharon L. (1999). Sampling: Design and Analysis. Pacific Grove, California: Duxbury Press, 49. ISBN 0534353614. “The margin of error of an estimate is the half-width of the confidence interval ...”
6. ^ Stokes, Lynne; Tom Belin (2004). What is a Margin of Error? (PDF). What is a Survey? 64. Survey Research Methods Section, American Statistical Association. Retrieved on 2006-05-31.
7. ^ Income - Median Family Income in the Past 12 Months by Family Size, U.S. Census Bureau. Retrieved February 15, 2007.
8. ^ Braiker, Brian. "The Race is On: With voters widely viewing Kerry as the debate’s winner, Bush’s lead in the NEWSWEEK poll has evaporated". MSNBC, October 2, 2004. Retrieved on 2007-02-02.
9. ^ Rogosa, D.R. (2005). A school accountability case study: California API awards and the Orange County Register margin of error folly. In R.P. Phelps (Ed.), Defending standardized testing (pp. 205–226). Mahwah, NJ: Lawrence Erlbaum Associates.
10. ^ Drum, Kevin. Political Animal, Washington Monthly, August 19, 2004. Retrieved on 2007-02-15.

## References

• Sudman, Seymour and Bradburn, Norman (1982). Asking Questions: A Practical Guide to Questionnaire Design. San Francisco: Jossey Bass. ISBN 0875895468
• Wonnacott, T.H. and R.J. Wonnacott (1990). Introductory Statistics, 5th ed., Wiley. ISBN 0471615188.
