By David Bookstaber

When testing guns for accuracy it is common practice to look at the Extreme Spread of a group of 3 or 5 test shots. I will explain why this is a statistically bad measure on a statistically weak sample. Then I will explain why serious shooters and statisticians look instead at some variation of circular error probable (CEP) when assessing precision . . .

It is easy to fool yourself with Extreme Spread…and even easier to fool others. First, note that even the most precise rail gun will occasionally print a “flier.” If 99 out of 100 shots nestle into a dime group, but one breaks away by an inch, would you characterize the accuracy of that gun as shooting one-inch groups? Now pick up a rifle, pop in a beta mag, close your eyes, and empty it from the hip. There’s a decent chance that 3 of those shots will be touching at 100 yards. But does that tell you anything about the precision of the rifle or your shooting?

The answer to both questions is “of course not.” And it gets worse: Why not crop one of those cloverleaf shot-from-the-hip groups and share it online? Then tell everyone your rifle shoots like that “all day long.”

Well not *you*, of course. You probably want to test your mettle and your gun’s metal and really figure out what you can do together. Maybe your manufacturer gave you a “1-MOA guarantee,” by which they mean that your gun will shoot a 3-shot group into 1” at 100 yards with quality ammo. So you hit the range, tighten your scope, and fire three shots at the same point. Bingo: You don’t need calipers to see that they’re within an inch of each other. You’ve really got a sub-MOA rifle!

Well you’ve got a full box of ammo, so why not knock the center out of that pretty little group? You take another shot and, damnit, it goes wide. (You’ll have to crop that group pretty tightly to show it off now!) But everyone knows you can call fliers, so you take a breath and try some more. Before you’re through with the box you will probably notice something unfortunate: *The more shots you take, the wider your group tends to get.*

Now something doesn’t add up here. The manufacturer guaranteed your rifle would shoot three shots within 1 MOA. But neither they, nor you, nor your gun could predict which order the shots in that group would appear. If your gun is really sub-MOA you should be able to pick any three at random as your “group” and it should measure under 1 MOA. You’ve just discovered one of the industry’s inside jokes: Accuracy guarantees expressed in terms of finite shots are either impossible or meaningless.

(Side note: I’ve actually made two accuracy returns in my life. One was to a manufacturer with no explicit accuracy guarantee, who nonetheless test-fired the gun, got decent results with one type of ammo I hadn’t tried, and returned it with that explanation and sample targets. In another case the manufacturer eventually agreed the barrel was bad and sent me a new one.)

You’ve also discovered one of the problems with the Extreme Spread: It all depends on the number of shots you take. Worse yet, it doesn’t differentiate between a target where most of the shots are in a tight group and there’s a lone “flier,” and one with the same extreme spread but with every shot scattered about the same distance from the center.
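To see how strongly Extreme Spread depends on shot count, here is a minimal simulation sketch. It assumes shots follow a circular normal distribution with the same spread `sigma` on both axes, which is a modeling assumption and not something measured here. The function names are illustrative only.

```python
import math
import random

def extreme_spread(shots):
    """Largest center-to-center distance between any two shots in a group."""
    return max(
        math.dist(a, b)
        for i, a in enumerate(shots)
        for b in shots[i + 1:]
    )

def mean_extreme_spread(n_shots, n_groups=2000, sigma=1.0, seed=1):
    """Average extreme spread over many simulated n-shot groups."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_groups):
        # Assumption: each shot is an independent draw from a circular normal.
        group = [(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n_shots)]
        total += extreme_spread(group)
    return total / n_groups

# Same simulated gun every time; only the number of shots per group changes.
for n in (3, 5, 10, 20):
    print(n, round(mean_extreme_spread(n), 2))
```

The simulated gun never changes, yet the average Extreme Spread keeps climbing as shots are added, which is why a spread figure means little unless the shot count is stated with it.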

If you kept your sight zeroed and logged every shot taken with a given rifle and lot of ammunition, after 1000 shots your aggregated target would look something like this:

We’ll call this picture a sample “distribution” of shots. It doesn’t matter how accurate or inaccurate your gun: its shot distribution is the same as the one that produced this sample. The only thing that varies with accuracy is how large or small this cluster is. (The dots represent the center of each hit, not the size of the holes cut by the bullet.) The red circle is drawn around the center of the distribution and contains exactly half of the shots in the picture. The radius of that red circle is called the Circular Error Probable (CEP), and that single value is sufficient to characterize the accuracy of a gun. Some real-world CEP values:

- The U.S. Precision Sniper Rifle contract called for CEP better than 0.3MOA
- The M24 and M110 acceptance standards require CEP better than 0.6MOA
- XM193 ammunition from a test barrel has to shoot tighter than 0.7MOA CEP

For example, from a fixed barrel XM193 ammunition should put at least half its shots inside a 1.5” diameter circle at 100 yards. (Note that all of these specifications are given with the angular measure MOA, or Minute Of Arc, which is equal to about 1.05” at 100 yards. By using an angular measure we don’t need to specify the distance of a particular target or test.)
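The MOA arithmetic above can be sketched in a few lines. The helper name `moa_to_inches` is made up for illustration; the only facts used are that one minute of arc is 1/60 of a degree and that a yard is 36 inches.

```python
import math

def moa_to_inches(moa, distance_yards):
    """Linear size subtended by an angle in minutes of arc at a given distance."""
    angle_rad = math.radians(moa / 60.0)  # 1 MOA = 1/60 of a degree
    return distance_yards * 36.0 * math.tan(angle_rad)

one_moa = moa_to_inches(1.0, 100)             # about 1.047 inches
xm193_diameter = 2 * moa_to_inches(0.7, 100)  # diameter of a 0.7 MOA CEP circle
print(round(one_moa, 3), round(xm193_diameter, 2))
```

A CEP is a radius, so the 0.7 MOA figure has to be doubled to get the diameter of the circle, which is where the roughly 1.5” at 100 yards comes from.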

A useful “accuracy guarantee” would read something like these military performance standards, which one way or another boil down to the following: “CEP will be no greater than a certain value.” Traditionally CEP is given for the 50% (median) level, but it can also be given for other probability levels. For example, a manufacturer might instead say that their gun should put 90% of its shots inside a 1-MOA circle.
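Converting between probability levels is straightforward if you are willing to assume a circular normal shot distribution, in which case the fraction of shots inside radius r is 1 - 0.5^((r/CEP)^2). The sketch below relies on that assumption, and the function names are hypothetical.

```python
import math

def radius_for_level(cep, p):
    """Radius expected to contain a fraction p of shots, given the 50% CEP."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return cep * math.sqrt(math.log(1.0 - p) / math.log(0.5))

def cep_from_level(radius, p):
    """Inverse conversion: the 50% CEP implied by a radius quoted at level p."""
    return radius / radius_for_level(1.0, p)

# Example: the 90% radius for a gun with a 0.3 MOA CEP.
print(round(radius_for_level(0.3, 0.90), 2))
```

Under this model the 90% radius is about 1.8 times the 50% CEP, so a guarantee quoted at 90% is a noticeably stricter promise than the same number quoted at 50%.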

We are focusing on circles, but if you’ve shot a lot of groups you might be more used to finding ellipses. There are a few things that can cause your shots to string out vertically, like barrel heating or long distances with less precise ammunition. Also, wind can cause your shots to string out horizontally. But removing those factors your gun will shoot into a circular distribution like the one shown.

What does this mean for accuracy? You don’t get to pick the order in which those shots are fired. Any individual shot is essentially selected at random from that distribution of shots. This has substantial implications for people who try to deduce their accuracy from small samples.

If you pick 3 shots at random from that distribution you could end up with three holes practically touching. They might be near the center of impact, or they might be far away. Conversely, you could end up with three shots quite far from each other. Three shots don’t tell you very much: Half the time their extreme spread is wider than the CEP.

More importantly, especially for sighting in, on average the 3-shot center is 60% of the CEP away from the true center! It’s theoretically impossible to be certain you’ve found the *exact* center of impact, but you can see that three shots aren’t a very good indication. In fact, to double the precision of your zero you have to take 10 shots! (At that point we expect the sample center to be within 30% CEP of the true center.)
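The zeroing figures can be checked with a short sketch, again assuming a circular normal distribution. Under that assumption the average distance between an n-shot group center and the true center is σ√(π/2)/√n, and the CEP is σ√(2 ln 2). The function names below are illustrative.

```python
import math
import random

CEP_PER_SIGMA = math.sqrt(2 * math.log(2))  # median radius of a circular normal

def expected_center_error(n_shots):
    """Mean distance of an n-shot group center from the true center, in CEP units."""
    mean_radius_per_sigma = math.sqrt(math.pi / 2)
    return mean_radius_per_sigma / math.sqrt(n_shots) / CEP_PER_SIGMA

def simulated_center_error(n_shots, trials=20000, seed=1):
    """Monte Carlo check of the closed form above (sigma = 1 on each axis)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        cx = sum(rng.gauss(0, 1) for _ in range(n_shots)) / n_shots
        cy = sum(rng.gauss(0, 1) for _ in range(n_shots)) / n_shots
        total += math.hypot(cx, cy)
    return total / trials / CEP_PER_SIGMA

for n in (3, 10, 12):
    print(n, round(expected_center_error(n), 2), round(simulated_center_error(n), 2))
```

The closed form gives roughly 0.61 CEP for 3 shots and 0.34 CEP for 10. Because the error shrinks with the square root of the shot count, halving it exactly takes four times as many shots.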

So what’s a shooter to do? For one thing: ignore 3-shot groups. If you want to get a sense of a gun’s precision then shoot larger groups. I tend to shoot 10-round groups and use computerized target markers to precisely calculate my CEP. For the most accurate guns at closer distances that tend to create jagged holes I instead shoot several 5-round groups.

If you’re not going to bother calculating CEP and just want to stick with Extreme Spread because it’s so easy to measure, at least move up to 5-round groups. They are statistically more efficient and less prone to abuse than 3-round groups. Use multiple groups, and don’t throw away the bad ones. For example, the *American Rifleman*’s protocol of taking the average extreme spread of five 5-round groups is only about a third less efficient than the most statistically efficient precision estimator.

For more on the theory and practice behind measuring shooting precision visit ballistipedia.com.

Statistics, meet ballistics. Ballistics, meet statistics. See, they CAN get along! Thanks for an interesting and informative article.

I’m sure all this is true, but it’s probably unnecessary in the real world for most people. Three shots off the bench with a sandbag rest grouping under an inch at 100 yards, has resulted in a life-long record of one shot one kill on white tail deer for me. Precision competition target shooting may be different, but this all seems to complicate the crap out of something that should be fun.

To some people, taking measurements and statistically analyzing them is, in fact, fun.

Go figure!

Punny!

I know right?! Shoot 5 instead of 3. Shoot 10 instead of 5. Okay!

^This. I can’t imagine why someone who is not a competition shooter would need to do all this.

So even if I have the most accurate gun in the world, I can still use it as an excuse for missing that deer. Cool!

Now I would like to see an article about using standard deviation of chronographed velocities and its usefulness in evaluating powders, bullets, etc.

That’s how I analyze my hand loads, in addition to group size.

As a person who measures things for a living I agree with this approach. It’s common to report error as 1 standard deviation from the mean, which means that 68% of the time you will hit that accuracy. You can report error as 2 standard deviations to get 95%.

True. About 68% of points will be within one standard deviation in either direction of the population mean (arithmetic mean, commonly called an average). About 95% will be within two std. dev. and about 99.7% within +/- three standard deviations of the population mean. However, that’s only true of the entire population and only if that population is normally distributed.

If the population is not normally distributed (the famous bell curve shape), or if it is bell shaped but is especially flat or peaked, then all bets are off. Even if it actually is perfectly bell shaped, the numbers can still be off if your sample size is too small.

If you’re dealing with a small sample, say 30 or fewer observations, you might not be able to assume it’s representative of the population as a whole. So what to do if your population is not normally distributed, or your sample is too small? How can you draw meaningful inferences about how widely dispersed the data is, based on the standard deviation? The Russians to the rescue!

You can use Chebyshev’s Inequality, which holds that no more than 1/k-squared of the distribution’s data points will be more than k standard deviations from the mean. Put the opposite way, at least 1 – (1/k-squared) data points will be within k std. devs. of the mean.

Run the numbers and we find that this inequality states that at least 50% of data points will be within 1.41 standard deviations of the mean, at least 75% will be within 2 std. dev., and at least 89% will be within 3 std. dev. of the mean. This is true regardless the shape of the distribution. The more normal the curve, the higher the actual percentages, of course, but the inequality is still correct in stating “at least” this percentage.
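Here is a small sketch comparing the Chebyshev lower bound with the coverage of a true one-dimensional normal distribution. The function names are illustrative; `math.erf` is the standard-library error function.

```python
import math

def chebyshev_min_within(k):
    """Chebyshev lower bound on the fraction of data within k standard deviations."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1.0 - 1.0 / k**2)

def normal_within(k):
    """Fraction of a one-dimensional normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (math.sqrt(2), 2, 3):
    print(round(k, 2), round(chebyshev_min_within(k), 3), round(normal_within(k), 3))
```

The bound is loose when the data really are normal, but it holds for any distribution with a finite standard deviation.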

It’s useful as a rough estimate of the distribution when you have an odd shaped distribution or a small sample size, and was an important result in the development of probability theory. (Note: the bound is only informative for k greater than 1; for k of 1 or less it tells you nothing.)

Be sure to have a look at http://ballistipedia.com/index.php?title=Closed_Form_Precision for the closed-form math. The statistical model used here (for 2-dimensional targets) is a bivariate normal. The standard deviation carries over (we even keep the same notation: sigma) but in two dimensions the “68-95-99%” rule for expected sample encompassed by x-stdevs becomes 39-86-99%.
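The 39-86-99% figures follow from the radial distribution of a circular bivariate normal, where the fraction of shots within k sigma of center is 1 - exp(-k²/2). A minimal sketch, assuming equal sigma on both axes and no correlation (the function name is made up for illustration):

```python
import math

def fraction_within_sigmas_2d(k):
    """Fraction of shots within k*sigma of center for a circular bivariate normal."""
    return 1.0 - math.exp(-k * k / 2.0)

for k in (1, 2, 3):
    print(k, round(100 * fraction_within_sigmas_2d(k), 1))

# The 50% radius (CEP) expressed in sigmas.
cep_in_sigmas = math.sqrt(2 * math.log(2))
print(round(cep_in_sigmas, 4), round(fraction_within_sigmas_2d(cep_in_sigmas), 3))
```

Setting the fraction to one half and solving for k gives the CEP as about 1.18 sigma.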

Measuring barrel accuracy can be costly, especially when dealing with a barrel that will degrade with 100 shots. Your fliers could also come from variability in the ammo you are shooting: neck tension, bullet runout, ogive length, powder charge, and primers. Not to mention environmental factors.

Anyone trying to shoot a 6 PPC or a 6-284 or 6.5-284 would use up good barrel life trying to see if that barrel performs under your standard. I don’t disagree that the 3-shot group is a poor indicator, which is why I do load testing with 5-shot groups before I settle on a load for my next meet. A 10-shot group is better than a 3-shot or even a 5-shot group, as long as it is not shot so fast that the barrel heats up, which adds another factor degrading accuracy. The best thing to do is compare barrels using points scored over several matches, so you can see how each one shoots with weather, barrel temperature, and reloading variables averaged out.

This is why people who are serious about rifle accuracy need to keep logs. Only with logs and score sheets can one really start to judge how good one barrel is over another barrel.

The lifetime of a 6.5-284 is about, what, these days? 1700 rounds? Yes, that’s short, but there’s a significant overbore condition there, and there’s no way around the fact that trying to stuff that much powder down that small a bore is going to overheat the throat.

Several friends and myself have switched to 5 or 10 round groups to determine accuracy depending on the price of the ammo or the purpose of the weapon. For precision rifles we usually use 10 rounds while we stick to 5 for hunting or plinking guns. Pistols are always 10 rounds.

I’m pretty sure TTAG is responsible for our new habits but I won’t be switching to 1000 round groups anytime soon.

You’re not a *true* member of the Armed Intelligentsia unless you shoot 1000-round groups. 😉

Fair. As soon as you send me my first free ammo shipment I’ll join the club for realsies…lol.

I’m always looking for ways to improve our gun reviews. Recently, I’ve started shooting five shot groups and analyzing them using OnTarget. The outputs there are max spread and average to center.

As a reader of TTAG, do you think we should go to 10 shot groups using OnTarget?

I evaluate my handgun targets on a very basic level, but I never shoot less than 10 rounds on a target before taking it down.

Tyler,

More data is always better than less data. But the article here glossed over the fact that there are multiple factors other than the gun that affect accuracy (ammo, barrel temp, weather, wind, coffee, etc.). So better data and analysis of a shot group may or may not give us a better measurement of the quality of the firearm used.

I personally like the 5 round groups because it’s easier to compare them to reviews done before. I’d rather compare my apples to pears than to watermelons. Is there a reason you can’t include both?

You could try a few different ammo types using 5 then shoot 10 or more once you’ve settled on the best for the final test.

10 shot groups are better than five shot groups. Multiple groups are better than one group. The data from all groups should be included in the review. The worst groups and the “fliers” should not be excluded from the evaluations.

Most important of all, though, is TTAG should develop a standard protocol for evaluating accuracy and have all reviewers use it. At a minimum, it should involve multiple five shot groups.

I just realized I didn’t answer your question directly. Yes, I think TTAG should use 10 shot groups.

With five shot groups, I see a tendency among shooters (including myself) to think of their group size as the better 60% or so of their five shot groups. The five shot groups with “fliers” tend to get forgotten or ignored. A rifle’s accuracy then gets overstated. Variations of this can be seen in some TTAG reviews.

Ten shot groups force you to look at a statistically significant sample size instead of cherry-picking the better than average groups from a statistically insignificant sample size.

Three shot groups are OK for adjusting zero, and that’s about it.

I’ve been using 10-shot groups and OnTarget too. The “Average-to-Center” the software automatically calculates, also known as “Mean Radius,” is practically the same as Circular Error Probable (for a circular normal distribution it is only about 6% higher, so within what I would consider measurement error).

For backwards compatibility and/or ease of marking one can shoot multiple 5-shot groups and then overlay them into a single larger group to get the aggregated CEP.
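Overlaying groups can be sketched as follows, assuming every shot was recorded as (x, y) coordinates relative to the same point of aim and that the zero did not move between groups. The function names and the coordinates at the bottom are hypothetical, not measurements. The median radius here is the simplest empirical CEP estimate, not necessarily the most efficient one.

```python
import math
import statistics

def aggregate(groups):
    """Overlay groups that share one point of aim into a single list of shots."""
    return [shot for group in groups for shot in group]

def radii(shots):
    """Distance of every shot from the center (mean point) of the aggregate."""
    xs, ys = zip(*shots)
    cx, cy = statistics.fmean(xs), statistics.fmean(ys)
    return [math.hypot(x - cx, y - cy) for x, y in shots]

def mean_radius(shots):
    """Average distance to center (the 'Average-to-Center' style of figure)."""
    return statistics.fmean(radii(shots))

def median_radius(shots):
    """Empirical CEP: radius about the center that contains half the shots."""
    return statistics.median(radii(shots))

# Hypothetical coordinates in inches, two 5-shot groups on the same aim point.
group_a = [(0.2, 0.1), (-0.3, 0.4), (0.5, -0.2), (-0.1, -0.6), (0.0, 0.3)]
group_b = [(0.4, 0.2), (-0.5, -0.1), (0.1, 0.7), (-0.2, -0.3), (0.3, -0.4)]
combined = aggregate([group_a, group_b])
print(round(mean_radius(combined), 3), round(median_radius(combined), 3))
```

Measuring radii from the aggregate center, not from each group's own center, is deliberate: re-centering every small group would hide the group-to-group wander that is part of the gun's real dispersion.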

However the reviewers slice it I think 10 recorded shots, with no cherry-picking, should be the minimum when reporting accuracy. This is where accuracy measures just begin to be repeatable and consistent.

10-shot groups give you more information. The downside is that they take more time and cost twice as much as 5-shot groups.

3 shot groups are almost useless!

Yes.

And rifles should be shot off the bench, on a rest with a bag for the rear.

Pistols/revolvers should be shot from a Ransom rest.

Shotguns being tested for pattern efficacy should be shot on a pattern board.

Yes. I vote for 10 shot groups with the On Target analysis, with the mean radius included.

This is very interesting. For some time now I have used five and ten round strings of shots because I noticed three round groups were not giving me consistent information. Ten round strings yield the best information and recording and numbering each shot really helps.

The referenced webpage looks very promising for better analysis. Thanks!

The title of this piece should be “Understanding Rifle Accuracy,” not “precision.”

Precision and accuracy are two different mathematical issues. Precision is about resolution, accuracy is about repeatability.

I’ll explain:

If I have a measurement instrument that allows me to measure something (eg, size of some part) to a very fine level of detail (let’s say it is a set of digital calipers, and it can measure down to 0.0005″ or one-half-thousandth of an inch), I would call that a “precise” instrument.

Now, with this precise instrument, I try to measure something. Let’s say I’m slapping this set of calipers on a 1″ grade 0 gage block that has a certificate of NIST traceability and so on. I know that the 1″ block is actually 1″ down to less than 10 millionths of an inch, or whatever the level of certification I have is.

I read 1.0015 on my digital calipers. I take another measurement. I get 1.0010. And another: 0.9995.

Who is right? The certification on the block, or the fancy digital calipers?

The block, unless I can show that it is damaged or simply wrong (which will take several other instruments and references to show conclusively). And if the block is really a grade 0 block with NIST certification, I paid some hefty money for that set of gage blocks – likely more than most people are spending on a rifle, BTW.

How can such a precise instrument give me so many different readings?

Ahhhhhh. Now you’ve arrived at the nut of the issue:

Accuracy is about repeatability, not resolution.

The digital calipers have precision, but they lack accuracy to justify their precision.

If I had a set of calipers that read down to only 0.005 resolution, all my readings above would have been the same, because they’re all within 0.005″ of each other. A not-so-precise set of calipers would have read 1.000 and that was it. Suddenly, these less precise calipers look more accurate – and they are. They’re not throwing up a level of precision to which they cannot repeat.

The truth about digital calipers is that they’re a useful instrument – with limitations. A change in your thumb pressure while closing the jaws can make the readings change by at least 0.003″ – easily. Resolution down to a half-thou is silliness writ large on calipers. On a micrometer? OK, that’s an instrument where I can read down to tenths of a thousandth repeatably. Calipers just don’t have the inherent mechanical support to allow such resolution to be believed.

Another way I explain precision vs. accuracy is this: Let’s say you put a Nightforce 24X scope on your hunting rifle. Your hunting rifle can group no better than about 2″ at 100 yards, but so what? The deer still hit the ground.

But you’ve decided that you want to spend money on a piece of glass, so there it is: a Nightforce scope with a max magnification of 24X on your 2-minute+ hunting rifle.

You take it afield. You see a deer. You lay the scope on the deer. HOLY CRAP! All you see is a wall of hair. We have to move this around a bit… OK, there’s the nose, we’ll start moving back on the head….

With this level of magnification at 100 yards, you can see individual hairs on the deer. You can actually lay your crosshairs on a particular hair… you get to choose your aiming point so precisely that you’re now terrifically confident that you’ll be able to put it right at the base of the skull, damaging almost no meat.

Let us assume for the sake of argument that you can hold the rifle that still, because as everyone who has actually looked through high-magnification optics knows, you can seem to see the point of aim move with individual blood cells flowing through your shoulder and hands. Above about 12X, it is terrifically difficult to get your sight picture to be steady without a hard rest. At 20X+, I’ve seen guys get seasick from trying to get themselves still enough on off-hand positions.

Still, you’ve chosen your aiming point most fine, you pull the trigger and…. you barely nick the deer above the neck. Two bounds and he’s gone. What happened?

While a high power scope gave you precision, the lack of your rifle’s ability to repeatably shoot to your exact point of aim means you have an inaccurate rifle for making very high-precision shots, and you need to aim at parts of the deer that present big, fat, wide targets that contain the vast majority of your shots – like the lung area. It also means that you could pull the Nightforce scope off your rifle and use a set of buckhorn iron sights and you’d now have sights with a level of precision that your rifle’s accuracy can justify.

I generally agree with your assertions regarding the definitions of accuracy and precision with regard to measuring devices. I’m not sure measuring devices are analogous to firearms in this context. I also don’t agree that accuracy and repeatability are synonymous.

I think accuracy in the context of measuring instruments refers to whether or not the instrument is properly calibrated. If you measure your 1″ gage block five times, and every time you get a reading of 1.3025, that measurement is both precise and repeatable, but not accurate.

In my previous career in the world of quantitative genetics, the term “repeatability” was used as an estimate of the probability that subsequent measurements would be similar to previous measurements. But “repeatability” didn’t tell us how accurate the initial measurement was.

You’re right that accuracy is not a perfect equivalent of repeatability, but repeatability is wholly necessary for accuracy.

If we have repeatability, we can then calibrate an instrument to achieve the net result we want.

If we don’t have repeatability, then we will never have accuracy.

To go back into the firearms domain of knowledge: If I have a rifle that is laying down 1″ groups, but not at the point of aim, I have a reasonably accurate rifle, and the sights are merely in need of adjustment. We merely adjust the sight(s) to put the 1″ group at our point of aim and voilà! We’ve got something that works.

If, on the other hand, I have a rifle that can’t group tighter than 6″ at 100 yards, then the exact adjustment of the sights becomes something of an exercise of adjusting it to the statistical center of a bunch of shots, rather than one of adjusting it to lay a tight group where we want it to hit.

DG

Very well said. I’ll try and remember all that.

I disagree. Generally precision is synonymous with repeatability, not what you’re calling resolution. If the dial calipers aren’t giving a repeatable measurement, they’re not a precise instrument, no matter how many numbers are behind the decimal. Precision is the size of the group irrespective of its location in relation to the target center.

Accuracy is what most shooters confuse with precision. Accuracy is actually something more along the lines of “how close to the target center was the group center (or the shot)?”

A 0.2 MOA group in the 7-ring is a precise but inaccurate group. A 6 MOA group, with the group center at the X-ring is an accurate but imprecise group.
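That distinction can be put in numbers: accuracy as the distance from the group center to the point of aim, and precision as the spread of shots about their own center. The sketch below is a minimal illustration with made-up coordinates and a hypothetical function name.

```python
import math
import statistics

def accuracy_and_precision(shots, aim=(0.0, 0.0)):
    """Return (offset of group center from aim, mean radius about group center)."""
    xs, ys = zip(*shots)
    cx, cy = statistics.fmean(xs), statistics.fmean(ys)
    offset = math.hypot(cx - aim[0], cy - aim[1])  # accuracy: smaller is better
    spread = statistics.fmean([math.hypot(x - cx, y - cy) for x, y in shots])  # precision
    return offset, spread

# Hypothetical targets: a tight group far from the aim point, and a wide centered one.
tight_but_off = [(5.1, 5.0), (4.9, 5.0), (5.0, 5.1), (5.0, 4.9)]
wide_but_centered = [(3.0, 0.0), (-3.0, 0.0), (0.0, 3.0), (0.0, -3.0)]
print(accuracy_and_precision(tight_but_off))
print(accuracy_and_precision(wide_but_centered))
```

The first group is precise but inaccurate, and a sight adjustment fixes it. The second is accurate but imprecise, and no sight adjustment will shrink it.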

I use 3 shots to do large obvious elevation/windage adjustments.

Then 5 shots to make finer adjustments to the center.

And finally 10 shots to really home in with precise clicks.

Anyone can shoot fine off a bench in your own time.

Learning to shoot under stress is something else. I shoot service rifle in service matches. You learn to shoot when you’re not ready. You have targets that turn on for 4, 8, 40, or 50 seconds, depending on if the course is a single-snap, double-tap-snap, 8 round rapid, or 10 round rapid.

At 100 metres we stand, 200 we sit, and at 300+ we go prone. Every week is a different distance and course.

Iron sights and unsupported. Learn to shoot the hard way and you appreciate the easy way.

Yes, yes, yes. Not only “yes,” but “hell yes!”

I don’t know if I could agree any more vehemently, other than to say doing this whilst running a bolt gun helps make a person appreciate the budgeting of time and to forget the idea of “easy follow up shots,” which means “you’d better make this one count.”

What kind of groups can you get with irons at 200 and 300 yards? I’ve never shot irons beyond 100.

Depends on “what kind of iron sights?”

If you’re shooting aperture sights (go find any picture of Kristin shooting her .22 rifle – her sights on that rifle are what I mean when I say “aperture sights”), you can throw down some very tight groups. You need the correct aperture insert in the front sight for the size of the black target ball you’re shooting, and once you have that issue nailed down, you can throw down MOA and sub-MOA groups at 100, 200, whatever. Aperture sights rule in the accuracy world as the “best” competition iron sight.

With open iron sights (e.g., the normal hunting “buckhorn” types of sights) the groups open up – maybe to two MOA. I know of people who have shot groups under two MOA at 200 yards with 1903 Springfields and match ammo using issue sights, issue barrel, no gunsmithing on the rifle, no bedding in the stock, with only a little attention paid to the trigger.

I’ve shot 2.5 MOA groups with a rack-grade Springfield M1A from a standing offhand position at 200 yards when I was younger. That’s just a peep and a front post with guards.

Again, a little attention to the trigger, but nothing else. I’m nothing special in way of a competitor.

Guys like David Tubb can throw down some absurdly good groups with iron sights at 200+ yards.

Using iron sights and only supported by a sling, the best I can do is about 2 MOA in competition. The best 300 metre group was about 1.5″ shooting off a rest in a load test during a practice session.

The sights are half-minute Central or Rawson back sights, front sight is standard No4 Lee-Enfield. The sling I prefer is the 1907 Springfield sling, but other competitors use the standard webbing sling. I mostly use the .223 conversions for cost and ammunition availability. I have a .308 M10B but it needs a barrel change as the supplied barrel will only do at best 6 MOA.

You do learn to pace your shots with a bolt-action. And a Lee-Enfield is the slickest and smoothest bolt-action for service shooting. Mausers and Mosin-Nagants are just too clunky in comparison.

So I shoot 3 shot groups but I do it 20 times over the course of a year. None have been more than .75″ at 100 yards and that’s out of 60 total shots. Let’s then say I shoot a 60 shot group all at once and it’s 5 inches. What’s that say about the gun? Is it a 5″ gun or a 0.75″ gun? I ask because as shot count increases so does the chance of human error, barrel temp fluctuations, weather factors, etc. I do 3 or 5 shot groups for this reason. Every gun if shot enough will look less accurate as the count increases.

If all you record is extreme spread of 3-shot groups you may as well throw away the data. As you reiterated: extreme spread increases with the number of shots and highlights fliers (one-off extreme errors, human or otherwise). Furthermore, 3-shot groups are statistically insignificant, including in ways that are not immediately obvious but which show up in the math if you care to review it.

However, if you record the shots relative to point of aim, and you believe you’ve held a constant zero over a shooting season, then you can aggregate the data.

This is why we advocate sample-size invariant measures like CEP: If shooting conditions are roughly the same (i.e., holding ammunition and significant environmental conditions constant) then the rifle’s CEP is expected to stay constant over time (at least until the barrel is shot out) and by recording more shots you only tighten your estimate of the gun’s true, inherent CEP.

Once you have recorded 60 shots compute the CEP and you’ll find the confidence interval about that estimate to be very tight, and very predictive of your gun’s future performance. In fact, you can extrapolate from there to get your hit probabilities on targets of various sizes at various distances. A good example is the project Rifleslinger spent some time compiling: http://artoftherifleblog.com/position-analysis-summary/2014/10/position-analysis-summary.html
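As a rough illustration of that extrapolation, here is a sketch of the hit probability on a circular target centered on the point of impact. It assumes a circular normal shot distribution and ignores wind, range estimation, and shooter error. The function names and the example numbers are hypothetical.

```python
import math

INCHES_PER_MOA_AT_100_YARDS = 3600 * math.tan(math.radians(1 / 60))  # about 1.047

def hit_probability(target_radius, cep):
    """Chance a shot lands within target_radius of center (same units as cep)."""
    return 1.0 - 0.5 ** ((target_radius / cep) ** 2)

def hit_probability_at_range(target_diameter_in, distance_yards, cep_moa):
    """Same calculation with the target given in inches and the CEP in MOA."""
    inches_per_moa = INCHES_PER_MOA_AT_100_YARDS * distance_yards / 100.0
    target_radius_moa = (target_diameter_in / 2.0) / inches_per_moa
    return hit_probability(target_radius_moa, cep_moa)

# Hypothetical example: 8-inch circular target, 0.5 MOA CEP, several distances.
for yards in (100, 300, 600):
    print(yards, round(hit_probability_at_range(8.0, yards, 0.5), 3))
```

By definition, a target whose radius equals the CEP is hit half the time. Doubling the target radius raises that to about 94% under this model.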

Thank you for the info!

In the killing-ISIS-shitheads world of dropping JDAM, CEP isn’t used. If we use a number that defines half of those bombs striking outside of that radius, and gives us no meaningful description of how far away from center that half falls, it’s bad news for collateral damage/fratricide concerns. We use CE90, something the author alludes to but doesn’t name. CE90 is a radius from center of the aimpoint that historically describes where 90% of shots will strike.

And now you know…

I would say this is another in a series of articles written by someone who is clueless about guns, and likely only googled some key words. I have seen random people randomly pick 3 shots and say nice group, but that is the extent. For example, a Weatherby will shoot 3 bullets out of 3 under 1 inch with a cold barrel, as promised, with any ammo; if you experiment or reload you will do better. The Soldier of Fortune Facebook heroes are clueless!

You stated that a fixed barrel test of XM193 ammunition should put at least half its shots inside a 1.5” diameter circle at 100 yards. This was based on a .7 MOA CEP. Does this mean that half the shots will be inside 1.5″ and the other half will be inside .7″ (or .7 x 1.05)?

In the applied world of hunting, three-shot groups are used with regular success to sight in a rifle to reliably hit targets as small as a coyote having 5″ broadside width at 275 yds. The shooter needs to achieve MOA or slightly better three-shot groups and center these 2.0″ high at 100 yds. Fire two or three groups on different targets, letting the barrel cool, to confirm this zero. If scope adjustments are needed, correct either windage or elevation between groups. When elevation is set, then work on windage. I am not sure about CEP, but you can get precisely zeroed to successfully, repeatedly take small game using a succession of 3-shot groups during sight-in. There are decades’ worth of empirical data to prove this.