
A BRIEF INTRODUCTION BY LUDWIK KOWALSKI


The previous unit (#117) was revised to incorporate valuable suggestions made by Dr. Kirk Shanahan. The most important omission of the first draft was that procedural errors were not mentioned. After posting the revised essay I asked Kirk to write closing comments on what we had been discussing for nearly two weeks. Kirk thinks that procedural errors (both systematic and random) are likely to be responsible for the apparent excess heat in all cold fusion experiments. This is a very serious criticism. Unfortunately, I am in no position to comment on electrochemical errors in different kinds of calorimeters. The purpose of my essay, triggered by Kirk’s published paper (see item #116), was to review the most basic concepts of traditional error analysis. As far as I know, the purpose of error analysis has always been to establish the level of confidence in experimental data.


118) Error analysis: new versus old?

Kirk Shanahan (12/19/2003)



Some Comments on the 'New School' Approach to Error


In Unit 117, Dr. Kowalski appended a couple of my comments to him regarding a 'new' way of thinking about error. His units focus on the classical approach using the concepts of systematic and random error. While the scientist does need to know about the extremes of a purely systematic error and a purely random error, reality is never so cooperative as to give us the extremes. A systematic error is generally considered to be an error that produces an accuracy change. Specifically, that means it is noise-free and causes your result to be shifted away from the true value. On the other hand, a random error is usually considered not to shift the result, but just to induce noise on top of the result, which is nominally at the true value. But when the scientist walks up to her real experiment and looks at her real results, she almost never sees anything this pure. The 'new school' of error thought focuses on the real situations, and is most often observed being applied in real-world settings like process control and quality-control troubleshooting.
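As a minimal sketch of the two extremes and of the realistic mixture, the following Python snippet simulates repeated measurements of a quantity; the true value of 10, the offset of 0.5, and the noise level of 0.2 are arbitrary numbers chosen only for illustration.

import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0          # the quantity being measured
n = 1000                   # number of repeated measurements

# Purely random error: noise centered on the true value
random_only = true_value + rng.normal(0.0, 0.2, n)

# Purely systematic error: a noise-free offset from the true value
systematic_only = np.full(n, true_value + 0.5)

# The realistic case: a bias and noise at the same time
mixed = true_value + 0.5 + rng.normal(0.0, 0.2, n)

for label, data in [("random only", random_only),
                    ("systematic only", systematic_only),
                    ("mixed", mixed)]:
    print(f"{label:16s} mean = {data.mean():6.3f}   std = {data.std():.3f}")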

Why should we be talking about error anyway? Because it's there, as the mountain climber says of the mountain? The real reason a 'new schooler' (NSer) talks about error is that practical experience indicates it is very costly. It results in bad product being manufactured that has to be discarded or reworked, which costs money. Or, in some cases, it results in accidents, where people can and do get hurt. So to the NSer, anyone who doesn't define their error is economically and/or scientifically unethical. (By the way, if you're a scientist, what is your product? In the most general sense, it is information.)

So what is the fundamental difference between a NSer and an OSer (old schooler)? It seems to boil down to the attitude about what to do with error. The OSer seems to want to quantify it and classify it (as random or systematic), but then he wants to forget it. The subtle problem with the OS approach is that it allows you to think you are done once you have quantified and classified. Isn't that what scientists do, quantify and classify their results? What else is there?

The NSer likewise quantifies error (which includes propagating random error through equations, by the way), but then takes a different turn. She then seeks to understand the error. To do that, she employs more advanced techniques than are 'usual', i.e. OS. She uses factor analysis and contour plots and statistically designed experiments and correlation plots and... In other words, she studies the error. You can appreciate that this tends to require a lot more work. And because of that there is a prior step I skipped, namely an economic evaluation. She asks, “OK, now that I have this level of error, do I need to get better (lower the total error), and can I afford to do it?” So it is the economic stakes, the presumed cost/benefit ratio, that really drives the NSer's additional work. In the cold fusion case, the potential benefits are so high that considerable effort to understand the situation is warranted. (By the way, in this situation one 'benefit' is cost avoidance, such as avoiding injury to people. The 'cost' is the actual economic cost of doing the work proposed.)
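To make the 'quantify' step concrete, here is a minimal sketch of propagating random error through an equation, done by Monte Carlo for a hypothetical input-power calculation P = V*I; the voltage, current, and uncertainty values are invented for illustration, and the result is checked against the standard first-order propagation formula.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical cell inputs with 1-sigma measurement uncertainties
V = rng.normal(4.00, 0.02, n)    # volts
I = rng.normal(0.500, 0.005, n)  # amperes

P = V * I                        # input power, watts

# Monte Carlo estimate of the propagated uncertainty ...
print(f"P = {P.mean():.4f} +/- {P.std():.4f} W")

# ... compared with the usual first-order propagation formula
sigma_analytic = np.sqrt((0.500 * 0.02)**2 + (4.00 * 0.005)**2)
print(f"first-order formula: +/- {sigma_analytic:.4f} W")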

Let's talk briefly about some rules of thumb aimed at helping one make that kind of decision. In analytical chemistry (which usually underlies most scientific research and quality manufacturing) there are some reasonably clear breakpoints in method development. It is relatively easy for an analytical chemist to 'whip out' a method with a 10% error (and by that I am being deliberately imprecise; the 10% could be 1, 2, or 3 standard deviations, but if it is 1, the following discussion will also deal with 1, and likewise if it's 3). With an approximately equal work investment, the chemist can drop the error to 2-5%. Now, maybe with another equal time investment he can drop it to 1%, but it is almost as likely that it will take considerably more. It is nearly impossible to get much better than a 1% technique in analytical chemistry. (Remember, I am talking rules of thumb here; there are always exceptions and modifications to rules of thumb.) With purely physical measurements you can often do a little better; it is specifically the measurement of composition that is so difficult.

So how does this apply to the cold fusion (CF) field, specifically to calorimetry? Well, when Dr. Edmund Storms presented his paper on CF from platinum electrodes, as part of that work he showed results from two kinds of cell calibrations that differed by approximately 3%. Dr. Storms is a good scientist, reasonably careful and precise, so it is entirely expected that his presented method falls in the '2-5%' bracket. That is the 'normal' bracket one ends up in when doing good-quality work on one's own. It usually takes cooperative efforts to get better than that. However, his results (and others') show an anomalous result. This is where the OS/NS difference becomes radically apparent.

In the OS approach, once the error is quantified and classified, you are done. You have basically concluded there is nothing else you can do, and that is just what the method does. So when the CF calorimeters produced an apparent excess heat signal, the simple explanation of 'instrument error' had already been eliminated. Therefore (and this is correct if there is no calorimeter error) the signal must be real, and it is large enough that it can't be typical chemistry-based heat production; it has to be nuclear.

But the NSer says: “And where is your detailed error analysis that justifies the error-free assumption?” In other words, why should anyone accept the radical new idea of a novel nuclear process? The more common outcome is that there IS an error present, heretofore unknown and unnoticed. This is the 'establishment' point of view, and it was developed through long-term experience. The CFer, however, wants us to agree with him that his analysis of the error situation is correct.

Since the OSer hasn't done the study of the error, he is unable to answer the NSer's question. So the NSer starts to work. The technical basis of the NS approach at this point is the concept that every theory (in this case I mean the set of mathematical equations that are used to predict what the experiment does) is oversimplified, and every experiment has thousands of hidden variables that may or may not be active at any given instant. Sounds impossible to deal with, doesn't it? Thousands of variables available, and all theories inadequate to explain them. Wow, what can a researcher do? The OSer throws up his hands and says “Nothing! That's what the error IS!” And he is right on the latter point. But in fact, approaches have been developed to start to address this problem, and thereby to reduce error. One key tactic from the NSer's toolbox is to take a long, hard look at the implicit assumptions of the method. Back it up right to the start and ask, “Was it right to apply this that way?” or “Was it correct to assume that?” And, in order to be efficient and economical, the NSer focuses initially on any obvious difficulties (these are often called 'special causes'). In the CF case, the 3% disagreement in calibration equations was a tip-off to me.

So I did the normal NSer thing: I asked what effect the 3% error would have. Now, the implicit assumption used by all CF calorimetrists to date is that their calorimeter is stable over the span of their experiments. I decided to quickly test that, so I did some algebra and derived an expression for the excess power signal in the case where the 'wrong' calibration coefficients were used. What I found was that the 'theory' I developed looked A LOT like the real results. Following up further led me to a whole picture of how the apparent excess heat signal could arise from mundane chemistry and physics, no nuclear stuff required, and then I publicized my findings. The key idea, then, is that by working out the impact of an observed error, I was led to a mundane explanation of how a calorimeter could indicate that an enormous new heat source had suddenly appeared.
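The following is only a minimal sketch of how that kind of expression behaves, not the published derivation; it assumes a simple linear calibration P = k*dT + b and invented numbers, with the cell's true constant drifted about 3% below the value obtained during the calibration run.

# Minimal sketch (not the published derivation) of how interpreting data with
# a stale calibration can mimic excess heat. A linear calibration
# P = k * dT + b is assumed; all numbers are invented for illustration.

def apparent_excess(P_true, k_used, b_used, k_actual, b_actual):
    """Power inferred with the old (k, b) minus the power really dissipated."""
    dT = (P_true - b_actual) / k_actual   # temperature rise the cell really shows
    P_inferred = k_used * dT + b_used     # what the old calibration reports
    return P_inferred - P_true

k_cal, b_cal = 10.0, 0.0                  # coefficients from the calibration run
k_shifted = 0.97 * k_cal                  # true constant has since drifted by ~3%

for P_in in (5.0, 10.0, 20.0):            # electrical input power, watts
    xs = apparent_excess(P_in, k_cal, b_cal, k_shifted, b_cal)
    print(f"P_in = {P_in:5.1f} W  ->  apparent 'excess' = {xs:+.3f} W")

With these made-up numbers the apparent 'excess' scales with the input power, roughly 3% of it, even though no extra heat source is present.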

Now, the next step should be to consider how to run experiments that can differentiate whether or not my proposal is correct. It is an unfortunate fact that whenever a hidden variable is discovered, the prior experimentation was not designed to address, i.e. record or control, that variable. Thus the prior work tends to be of no value in differentiating. It can, however, be used to show that the prosaic explanation works in most cases. Note that word 'most'. Remember that in a given experiment there are 'thousands' of hidden variables theoretically available. It shouldn't be surprising, then, that any new factor will not fit every case.

So to summarize, the NS approach to error is to study it and understand it, versus the OS approach of quantify, classify, and forget it. The NSer cares little whether the error is classified, because that tends to be an after-the-fact activity. The NSer's goal is to understand the errors and their impact, and she goes to much greater lengths to define what possible hidden variables are present.

In the CF arena, the calibration constant shift 'error' seems to be both systematic and random in nature. It is systematic in the sense that, for a given datum, interpreting the datum with a shifted constant will produce a fixed offset from the true value, clearly a systematic error signature. But the actual shifts observed vary, so in that respect there is an aspect of randomness. (Actually, if you examine Dr. Storms' paper carefully, you will see some clear time-dependent trends, so the shift is not strictly random.) So which is it, random or systematic? But more importantly, does that really matter? Isn't it more important to decide whether the apparent excess heat signal is real or not?
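A short sketch of that mixed character, again with invented numbers: within each run the shifted constant produces a fixed offset (the systematic signature), but the shift itself is drawn differently from run to run (the random-looking aspect). Real shifts may show time trends, as noted above.

import numpy as np

rng = np.random.default_rng(2)
k_cal = 10.0                        # calibration-run constant (W per K)
P_in = 10.0                         # input power, watts

# Each 'run' gets its own calibration shift, drawn at random here purely
# for illustration.
shifts = rng.normal(-0.03, 0.01, 5)           # fractional shift per run

for run, s in enumerate(shifts, 1):
    k_actual = k_cal * (1.0 + s)
    # Within a run the offset is fixed (systematic signature)...
    excess = P_in * (k_cal / k_actual - 1.0)
    print(f"run {run}: shift = {s:+.3%}   apparent excess = {excess:+.3f} W")
# ...but the offset differs from run to run (the random-looking aspect).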

Disclaimer: The opinions expressed above are expressly those of Dr. K. Shanahan. A discerning reader should expect to find considerable objections to these opinions. Otherwise what is the point of comparing and contrasting the 'old' vs. the 'new' schools?

===============================================


Ludwik:
In one of my messages I asked Kirk for a reference on “the new school” ideas about error analysis. He mentioned Deming. I asked “who is Deming?” The reply (referring to the following URL: http://www.ricoh-usa.com/about/awards/deming_the_man.asp ) was:

Kirk:
Deming represents the other half of the textbooks on 'errors' that you seemingly haven't read, which isn't surprising. Most university training doesn't get into these issues at all, let alone deeply enough. It's the industrial world that lives and dies by error reduction and tighter control. A 'special cause' or 'special causal factor' is a variance-producing causal factor that typically produces 'flyer' data, and it tends to be easy to identify and fix. Also, unlike typical 'random' variables, the factor tends not to be always active. Thus, special causes tend to convert an underlying random error distribution into a non-Gaussian form. Once 'special causes' have been eliminated from the error pool, the residual tends to be more nearly randomly distributed. In other words, in the process of tweaking up a chemical process or analytical method, the special causes are the first targets. The Deming Prize is Japan's highest award for quality.
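A minimal sketch of that effect, with invented numbers: a special cause that is active only occasionally adds flyers that skew an otherwise Gaussian error distribution, and removing it leaves a residual that is much closer to purely random.

import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Underlying random ("common cause") error: roughly Gaussian noise
common = rng.normal(0.0, 1.0, n)

# A hypothetical 'special cause' that is only occasionally active and
# produces flyer data points well away from the rest
active = rng.random(n) < 0.05                  # active ~5% of the time
flyers = np.where(active, rng.normal(6.0, 1.0, n), 0.0)

def skew(x):
    """Sample skewness; near zero for a Gaussian, pulled positive by flyers."""
    return np.mean(((x - x.mean()) / x.std()) ** 3)

before = common + flyers                       # special cause still present
after = common                                 # residual once it is eliminated

print(f"before fix: std = {before.std():.2f}, skewness = {skew(before):+.2f}")
print(f"after fix:  std = {after.std():.2f},  skewness = {skew(after):+.2f}")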

Deming was born on October 14, 1900 in Sioux City, Iowa. After studying engineering at the University of Wyoming, he earned a master's degree in mathematics and physics from the University of Colorado, and was awarded a doctorate in physics from Yale in 1928. Deming developed his philosophy of using statistics to control manufacturing while working at AT&T's Hawthorne manufacturing plant in Chicago. After World War II, as the Census Bureau's chief mathematician, Deming developed sampling techniques that drastically narrowed the margin of error. He first went to Japan in 1947 to help with the country's census. The Japan Union of Scientists and Engineers (JUSE) became keenly interested in Deming's statistical methods. At the time, the Japanese product reputation was terrible. Within four years of employing Deming's methods, Japanese industry had made a complete turnaround, and was well on the road to producing the outstanding products available today. In this country, several major organizations, including Ford, Pontiac, and the U.S. Navy, have thrived by implementing Deming's methods. Deming lived and worked out of a modest Washington, D.C. home. So obsessed was Deming with eliminating waste that one of his daughters recalled he dated his eggs with a felt-tip pen so that the oldest would be eaten first. He conducted his intensive four-day business seminars right up until less than two weeks before his death in December 1993, at the age of 93.

