Some have told us this page is "sour grapes." Consumer Reports rates Chrysler cars as more reliable than Mercedes, so perhaps our grapes aren't so sour now. Also, see the discussion of the CR rating systems at TrueDelta.
A high response rate is the key to validity. Employee survey findings can be questioned when fewer than half of the employees respond. So how many people respond to a Consumer Reports survey? "Of over 4 million questionnaires sent this year, the magazine received responses regarding about 480,000 vehicles," according to the Detroit News. If most people reported on two cars (because most families have two or more cars), that would put the response rate at a mere 6%. Even assuming one car per family - a highly dubious assumption - we have a tawdry 12% response rate.
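For readers who want to check the arithmetic, here is a minimal sketch. The two totals are the figures quoted from the Detroit News; the cars-per-respondent values are our assumptions, and because "over 4 million" questionnaires went out, the results are if anything slightly generous to CR.

```python
# Figures quoted from the Detroit News; "over 4 million" is treated as exactly 4 million.
QUESTIONNAIRES_SENT = 4_000_000
VEHICLES_REPORTED = 480_000


def response_rate(vehicles_reported, questionnaires_sent, cars_per_respondent):
    """Estimated share of questionnaires that came back.

    cars_per_respondent is an assumption: CR's quoted figure counts
    vehicles, not people, so we have to guess how many cars each
    respondent described.
    """
    respondents = vehicles_reported / cars_per_respondent
    return respondents / questionnaires_sent


for cars in (1, 2):
    rate = response_rate(VEHICLES_REPORTED, QUESTIONNAIRES_SENT, cars)
    print(f"{cars} car(s) per respondent: {rate:.0%}")
```

Under either assumption the estimate stays far below the 50% level at which survey findings start to be questioned.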
What could CR do about this? They could send e-mail reminders and/or follow-up post cards, or make spot calls. This costs money, but when you have the world's most influential auto reliability study, a little investment in validity makes sense, and could just be a dent in the advertising budget. If nothing else, they could try it and see if they find any of the differences which survey practitioners talk about.
People who buy different car models may also maintain them differently.
What causes a person to buy a particular car might also cause them to change the transmission fluid frequently, or not at all. This may result in different reliabilities.
Matt Kennell also pointed out, "the sort of people who buy [brand] [may be] the same sort of people who are rigorous about preventive maintenance. This isn't too unreasonable to imagine, as it would be characteristic of the same personality type: someone who goes out and meticulously researches all the cars, and thus seems like he or she would be conscientious about maintenance."
A study on American Honda owners found that most cleaned their garage floors on a regular basis...they appeared to be meticulous about maintenance. Could that affect reliability? (Bob Meyer found this article, from 8/27/97, at the Detroit News).
A related issue: Those who select from a manufacturer may have different driving characteristics than those who select from another manufacturer! Some people drive their cars more aggressively than others, which may wear them out faster.
John Greenstreet: "the CR survey may over/understate the reliability of certain cars because the people that own them are not homogeneous. ... many people will have a subconscious need to justify their purchase of a Japanese auto over a domestic one, and they could do this by believing superior reliability is the reason they bought it. Because of cognitive dissonance, they would tend to overlook or downplay anything that would attack this mind-set. We do see many people who vehemently defend Japan's cars' reliability and smear that of others."
All car owners are not alike, and they can have personality traits that directly influence their choice of vehicle, their vehicle expectations, and how they subsequently treat and maintain their cars. Consumer Reports does not control for this kind of systematic error in their surveys. (As far as we know, nobody does, so it's more of a "heads up" than a criticism). Perhaps a statistical analysis one year could accompany the reviews with a footnote given when reviews are quoted.
Note that CR has in recent years lumped together siblings sold under different labels to give the appearance of validity.
In July 1996, Consumer Reports tested motor oils for their readers, but instead of using normal cars, they used New York City taxis, which are normally run 24 hours a day and never allowed to cool down - which means that the most strenuous test of motor oil, the cold start (which causes most engine damage), occurred rarely, if ever, during their testing. They found no difference between any of the motor oils, from the cheapest to the best synthetic, and concluded that all “natural” oils are interchangeable, but that synthetics still hold an advantage for some drivers. The idea that the research was meaningless because their research methods were horribly flawed was not brought up; nor did they go to the natural conclusion that if they couldn't tell the difference between Mobil One and the cheapest oil on the shelf, they probably couldn't use that research model to tell whether individual natural oils were different in quality.
Different drivetrains have different reliabilities -- CR often lumps them all together. (Now they are also combining "corporate twins" to hide the anomalies of years past). Standard and Grand Caravans are listed in the same category, despite the very different repair histories of the two transmissions and the different engines. They separate some engines but lump the 3.0 V6 oil-leaker in with the more reliable 3.3 engine.
David Ta wrote: "I'd expect CR to point me to those unique problem(s) from different make-model-year combination. Those CR reliability reports, regardless how they were done, did not reveal those problems. For example, 6 months ago, I noticed a bunch of postings on [Japanese SUV] problem of blown head gasket, within the first 70k miles. I checked CR report on [vehicle] and compared to CAA report on the same make-model-year. Not surprisingly, CR reported a [best rating] under the "engine" category. And surprisingly CAA reported the same make-model-year a "much worse than average" two red-dots for that category."
Lloyd says that unreliable options or components are sometimes pointed out in the ratings. This is indeed true, with an emphasis on "sometimes." We have to trust Consumer Reports on that, which we'd rather not do, considering what they do not tell us. WhatCar? used to have a very consistent approach to this; and WhatCar? also used to point out what the manufacturer had done, or had not done, to solve the problem. (WhatCar? used actual repair records from vehicles leased in the UK.)
Will Mast said that Consumer Reports' harping on some cars may sensitize owners to existing issues. For example, people who never noticed "bad" shifting may suddenly "see" problems where none existed before; while those who experience similar issues on a "good" car will not see anything.
This is related to a common problem in psychology: people who volunteer normally have a high need for approval. Therefore, they may try to bring their experience into line with what Consumer Reports seems to want. Consumers Union provides clear "demand characteristics" - we know what cars they believe are the best. The research on this topic indicates that people will change their perceptions to match what they think the experimenter (in this case Consumer Reports) wants.
We should note that these issues are certainly present in car reviewers, whether they work with CR or not. We've suggested blindfolding testers for the ride and noise evaluations, and covering up internal badging and identity cues (and blindfolding testers on their way to the cars) as ways to help avoid this bias. Yes, some cars will of course be clearly recognizable anyway; but the ride and noise would be thoroughly unbiased, if opinions were given before the blindfold came off, and eventually perhaps the other evaluations would become more fair. Consumer Guide says they're going to work on the second idea; we hope Consumer Reports will better them.
People who are inclined to buy different brands may define "serious" differently (see above). If you've never received a survey, ask a friend who subscribes to see theirs before they return it (if they return it). You will notice that Consumer Reports really doesn't say what a "serious" problem is. I believe they should define it or say "any" problem.
This was evident in reactions to the problem of sludge in the engines of many Toyotas - a problem which Toyota, to its credit, eventually admitted and acted on. The Corolland forums were full of people claiming the problem was not real but simply in the minds of those who claimed they had it; and if it was real, it was the fault of owners and not Toyota. We doubt they'd feel the same way if, say, Neons were victims of sludge.
Jim Eldridge essentially wrote this for us by example:
I have a 1985 Dodge Daytona that has 135,000 miles on it. It runs great. At about 85,000 miles the timing belt broke, stranding my wife. The maintenance schedule says nothing about replacing the belt. Dodge thinks it's OK to wait till it breaks and then replace it; the design is such that it does nothing bad to the engine. However, to my wife, the car broke down and had a "serious engine problem." [Note: the manual actually does suggest replacing the belt at 105,000 miles.]
My friend with a Nissan Maxima just had his 60,000 mile maintenance at the dealer. He had the timing belt replaced, the fuel injectors cleaned, oil change, etc. and a fuel injector replaced. Cost, $850! If he filled out the CR form, he would show no major problems, just routine maintenance.
He then told me he was considering replacing all of his shocks because "it was about time." No Dodge owner would ever consider replacing shocks before the car bounced down the road. All Dodge had to do was recommend the belt change at 60,000 miles to avoid a "serious engine problem."
Will Mast said, “A friend with a Toyota used to brag about how trouble free it was until I showed him all the repairs, including a cracked exhaust valve, that were hidden in his 30,000 mile "maintenance" visits to the dealer.”
The solution is to get far more specific - and perhaps, to be really careful, to find out something about owners' routine maintenance.
We do not think a sample of two people is significant. These are illustrations of a general principle.
Those who send in their surveys are different from those who do not. Most studies try to raise their response rates through follow-up calls, letters, even post cards. Many studies check on the characteristics of the nonrespondents to see what the error might be. Consumer Reports does neither of these, as far as I know. Brent Peterson wrote a wonderful simile:
[A controlled experiment could use 30 carefully bred rats in cages]... A survey would be like having 100 lab rats starting the experiment and then letting them roam freely around the building with access to doors leading outside. Then measuring those who came back for dinner in their cages at the end of the experiment. Say 8 rats returned, you do not know what happened with those other 92 rats that escaped. ... [we presume they have different characteristics than the 8 that returned, just like people who do not return surveys are different from those who do. Note that we've adjusted the example slightly to reflect Consumer Reports' apparent response rate].
Raymond DeGennaro II pointed out that
CR does not draw their data from the general public, only from subscribers....They have to prove that their data represents the general public, and they haven't.
The solution here is to get a larger non-subscriber sample and compare the results with the subscriber data. If there's no difference, the check only needs to be repeated every ten years or so.
In some cases, it seems that the difference between an "average" vehicle and a "better than average" or "worse than average" vehicle is quite small, especially considering the actual number of people reporting problems. This opens the possibility that it's just random chance. We can't tell because they don't report standard deviations. We believe that there should probably only be three categories, considering their research methods: low, average, and high. Borderline cases can be footnoted or described in text.
The new bar charts that show exact standings (rather than lumping into categories) are a wonderful alternative, but really should have error bars to show whether, say, the difference between the Civic and the Neon, or the Corolla and PT Cruiser, is greater than the sampling error. But that might make their ratings look absurd... which in itself would be valuable information.
Consumer Reports does not report the number of people responding to each item, or the standard deviation. Suppose we found out that the standard deviation was fairly large: could we then reliably differentiate an above average rating from a below average one? Perhaps they should use fewer categories - "above average," "average," "below average," where "above average" took the place of today's "much better than average."
What is the difference between an average and below average rating, in terms of actual owner ratings? Would five owners reporting "serious" problems cause a car to get a black dot instead of a red dot? How many does it take? When will they tell us?
Incidentally, what is the sampling error for each model? We really need this to know whether there is really a difference between two models we are considering.
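To make the sampling-error question concrete, here is a rough sketch using the standard normal-approximation formula for a proportion. CR does not publish per-model sample sizes or problem rates, so every number below is hypothetical, and the formula assumes a simple random sample, which is the best case; the response problems described above would only widen the real uncertainty.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level


def margin_of_error(problem_rate, owners, z=Z_95):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(problem_rate * (1 - problem_rate) / owners)


def difference_is_distinguishable(rate_a, n_a, rate_b, n_b, z=Z_95):
    """True if the gap between two problem rates exceeds its own sampling error."""
    se_diff = math.sqrt(
        rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b
    )
    return abs(rate_a - rate_b) > z * se_diff


# Hypothetical models: 12% vs. 16% of owners reporting a problem.
for owners in (200, 2000):
    moe = margin_of_error(0.12, owners)
    distinct = difference_is_distinguishable(0.12, owners, 0.16, owners)
    print(f"n={owners}: 12% +/- {moe:.1%}, distinguishable from 16%: {distinct}")
```

With these made-up figures, a four-point gap cannot be told apart from chance at 200 owners per model but can at 2,000. Without the real counts, readers have no way to know which situation applies to the two models they are comparing.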
Consumer Reports' ads imply that they have no bias. Their articles prove otherwise. When they say they are unbiased because they do not accept advertising, think about their logic for a moment. Is the Toyota Corolla enthusiast page unbiased because it does not have advertising? Our ideas on reducing bias are shown up front - essentially, try to make sure testers do not know which car they are testing (to avoid self-fulfilling test results) and also to keep an eye out for bias. Our friends at Consumer Guide could do that better, too - they seem to have a need to write "but not up to the best of the European/Japanese imports" at the end of every American review. Well, some of the imports aren't up to the best of the Americans - but we never read that. (If you still believe Consumer Reports, take a look at Mercedes quality ratings, and tell me about "the best of the imports!")
For that matter, we can look at their February 2005 issue, where they call the interior of the Caravan "plasticky" (no more than the Sienna in our experience); and, as "Grim" said,
[Regarding a comparison of the Acura TSX and Volvo S40], seeing them trash a car from a company I don't like confirmed their bias more than seeing them trash a car from a company I do like (where I might be biased myself). For example, in a one page review, they said five times that the Volvo had unacceptably tight rear legroom. This despite the fact that in the objective measurements published on the next page, the Volvo had as much legroom as any other car in the comparo (there were four) and more than most...They also call the Acura's gas mileage "good," while they call the Volvo's "acceptable." That's interesting, since they get the exact same mileage and the Volvo gets it on regular gas rather than premium like the Acura. They also ding the Volvo a couple of times for sluggish acceleration, despite the fact that it's only two-tenths slower to 60 than the Acura (which was "good" and "peppy"). Two-tenths falls well within the range of measurement error.
John Phillips wrote: "A few years ago, they had the [2 domestic nameplates and one foreign nameplate all of the same car] owner's satisfaction. The [domestic nameplate] had the least owner satisfaction of these three. Next was the [other domestic nameplate]. The best owner support was for the [foreign nameplate]. There was a fair spread between them. Funny thing: all of these are built at the same American plant, only varying, primarily, in "hood ornaments." How can the same car be perceived differently when the only real difference was the label?"
Chris Jardine wrote:
I've noticed a number of occasions where data they have presented simply CANNOT be correct. Example 1 - a few years ago I looked at their reliability chart for the [car and car with another engine]. They claim that exterior fit and finish was [good rating] on the [one engine] and [terrible rating] for the [other engine] . This translates to a 4 and a 1 on a 1 to 5 scale. Since these vehicles were produced by the same workers, tools, raw materials, etc it is not possible for this to happen! I could buy a difference of one but not three between the two. A short statistical analysis lesson would be appropriate here. You can expect a variation of one when working with something like this. If you see the deviation that you do here you simply have not sampled the data properly! This is basic statistics. If this difference came in something that was not common to the two, like the engine, cooling system, transmission, etc. I would be able to accept the variation as correct. However, there is no way that this deviation from one to the next can occur with common items to the two.
Example 2 - [same cars, different nameplates]. There were major differences with the engine, electrical, fit and finish, etc. between these two. The only difference between them was the name plate applied near the end of the assembly line and a code in the VIN. There were differences in standard levels of equipment, but that should not statistically affect what CR would have us believe it did. This is another case of improper statistical procedures.
For these reasons, I for one simply cannot believe much of anything CR prints as statistical data.
(Webmaster note: the reliability differences could have been based on different types of people buying each car, and treating them differently. If we generalize from this, are any Consumer Reports ratings worth looking at? Can we really compare a "sporty" car with a regular sedan, or cars in different price classes? Or even cars in the same "general" price class but with a couple of thousand dollars' difference in price?)
- Since the time this section was written, CR has "solved" (we would say "hidden") the problem by merging statistics for under-the-skin-twins. That makes it harder to criticize them, but does nothing to solve their underlying validity issues.
Steven Lee posted: (Edited for length)
...In addition to a random sample pool that are representative of the general population, a good survey should not allow their responses to be optional to the surveyed. For example, phone surveys are good because it is harder for the surveyed to decline to respond. Surveys that solicit responses through magazine inserts, web pages, open forums (e.g. TV or newspaper ads requesting responses to a post office box) generally range from horrible to completely useless, regardless of the sample size. Usually only a small proportion of those who read the surveys eventually respond and so they are susceptible to the "fail safe syndrome": only those with the "expected" responses end up responding. Most of the "failure" cases would never get reported. A magazine once conducted a survey of "unhappy marriages in the United States" using postage-paid inserts. A ridiculously large portion of the responses reported unhappy marriages. Therefore most marriages in the U.S. were concluded to be bad. To unsuspecting readers such surveys would be believable because the magazine collected thousands of responses. However, if examined closely, one can quickly realize that those with unhappy marriages were much more likely to respond. People with happy marriages would probably dismiss such surveys as pointless or media hype. The sample size, regardless how big it was, became irrelevant.
By the same token, if CR conducts their survey with voluntary responses, the conclusions are probably worthless. People with problematic [car make] cars would be more likely to complain and whine about the expected [car make]'s lack of reliability and swear that "they will never buy another [car make] again." People with lemon [other car makes] would be more likely to keep their problems to themselves because they don't want others to know that they were unlucky [or at fault themselves - Webmaster] to own lemon [foreign country] cars because "[foreign country]'s companies don't make lemons."
Surveys aren't just all about math! Techniques count more!
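The marriage-survey effect described above is easy to put into numbers. The sketch below is ours, not Mr. Lee's, and the true share and the two response probabilities are invented purely for illustration; the point is only that unequal willingness to respond distorts the result no matter how many inserts come back.

```python
def observed_share(true_share, respond_if_affected, respond_if_not):
    """Share of returned surveys that report the problem.

    true_share          -- fraction of the whole population with the problem
    respond_if_affected -- chance that someone with the problem responds
    respond_if_not      -- chance that someone without it responds
    All three inputs are hypothetical; nobody knows them for a real survey.
    """
    affected = true_share * respond_if_affected
    unaffected = (1 - true_share) * respond_if_not
    return affected / (affected + unaffected)


# Hypothetical: 20% are truly unhappy, and unhappy people are ten times
# as likely to mail the insert back (30% versus 3%).
share = observed_share(0.20, 0.30, 0.03)
print(f"{share:.0%} of responses report the problem")
```

When both groups respond at the same rate, the function returns the true share. The distortion comes entirely from the gap between the two response probabilities, and the function never needs to know how many surveys were mailed.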
Eric Bechtol wrote:
The one thing that CU never figures in is the "periodic maintenance" that is required by the dealer. Some Hondas require the dealer to repack front wheel bearings every 15K miles. Also, with the solid lifters, they need to be adjusted at certain intervals [Honda has switched to hydraulic lifters on some V6 engines since this was written - 2005 four-cylinders still have solid lifters, though].
My brother is a dyed in the wool Honda guy. He has had to have his gas tank, exhaust system, and AC condenser all replaced on his 88 Honda, all due to rust. Well, my 1993 Spirit has been driven in the same weather. I have had to replace none of these items because of the superior undercoating/rust prevention items and the stainless steel exhaust. Truth be told, all of my repair parts on the car have added up to less than 150 bucks, including installation!!! Does CU give credit for this? No, because the owners will say "well, mufflers should go out at 140K miles." Funny, mine never did. Also note that just the muffler was something like 100 bucks from the dealer. He stated that he could run his engine much longer than mine and that may be true, however, I can buy a rebuilt head or even an engine for the price of his repairs and maintenance. [This story is meant as an illustration, for those who prefer concrete explanations to abstract concepts, not as proof.]
Everybody needs to understand the limitations and bias in their information.
CR needs to tell all their information and be skeptical about their own methods. They need to report more about their methods and formulas, and any problems they can see. They also need to address any other problems noted above.
I have been very disturbed by the quality of some recent reviews, and no longer trust their performance figures. Be careful!
I sometimes get e-mail saying things like "I had a Dodge and it stunk, so CR is right and you're just full of sour grapes." Hey, I don't make up the statistical and logical rules, I just report on them. It's my job to know this stuff. If you had a bad Dodge...well...you're not the only ones, there are plenty of lemons made by all the automakers. That's why we don't use sample sizes of one. No statistic can be used with n=1. [also, lately, Chrysler has been scoring high - and we're keeping this page up.]
The more rare a breakdown is, the more units are needed to get a reliable figure. I wouldn't want to publish any auto reliability data using fewer than 100 units.
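As a rough illustration of why rare breakdowns need more units, the sketch below uses the textbook sample-size formula for a proportion and asks how many units it takes to pin a breakdown rate down to within a quarter of its own value. The breakdown rates and the 25% relative precision target are hypothetical choices of ours, and the formula assumes a simple random sample.

```python
import math


def units_needed(breakdown_rate, relative_error, z=1.96):
    """Units required so the 95% margin of error equals
    relative_error * breakdown_rate (normal approximation)."""
    n = z**2 * (1 - breakdown_rate) / (relative_error**2 * breakdown_rate)
    return math.ceil(n)


# Hypothetical breakdown rates, each estimated to within +/- 25% of itself.
for rate in (0.20, 0.05, 0.01):
    print(f"breakdown rate {rate:.0%}: about {units_needed(rate, 0.25)} units")
```

The required count climbs steeply as the breakdown gets rarer, so 100 units is best read as a bare minimum rather than a comfortable sample.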
I have been a reader and later a member of CU since my college days, about 1960. The shortcomings of the "Reports", which you mostly capture, have been apparent to me for nearly 40 years. Never-the-less I remain an enthusiastic supporter of CU, simply because in many areas, they are the "only game in town"; while in other areas they complement the leading reviewers. (Cars, photography, audio and equipment, and computers, being some areas that interest me keenly, and for which other good sources of review are available.)
I am pleased that you have taken a formal approach to pointing out, to CU and the world, that there is definitely room for improvement. Although, you sometimes only hint at the direction in which improvement is needed, rather than providing a useful road-map. Your criticisms are mostly well taken, even when not well stated, and ought to be treated seriously by CU.
Bias is a universally human characteristic, even among those of us who consider ourselves to be scientists and have received training on how to avoid bias in our research. Your web site illustrates a lot of bias, while Consumer Reports illustrates noticeably less in my opinion. However biased or not, both of you have an important role to play, and ought to play it to the best of your respective abilities. CU can definitely do better, not just on the basis of eliminating bias, but in the quality of their reports, which carry both ambiguity and outright errors; more so recently (15 years) than in the past. I can often identify the existence of an error, while reading the reports, even before doing the research that tells me the extent of the error. But I don't subscribe to the notion that 'the critic or her critique must be perfect', rather, all I require is a rigorous and honest approach. By that standard both your Web site and CU's reports meet my criterion.
CR gave Chrysler strong recommendations in the mid-Sixties very nearly across the board, and A-bodies continued getting favorable reports through the end of production. Their ratings of B- and C-bodies started plummeting in the late 1960s (say 1969 or so) but I can't remember which they started trashing first. The Volare and Aspen initially got high recommendations based on goodwill generated by their predecessors, but that didn't last.
Your info about Consumer Reports' inconsistent ratings is on the mark. My father treats the CU ratings as gospel, so when we bought a new 1996 Dodge Grand Caravan, he almost fell over. He showed me all the poor CU ratings for the Mopar minivans then sat back and waited for all the big repairs we were supposedly in for. Never happened. We owned that van for 8 years and with two boys, used it hard. It got an oil change every 3-4k miles and my wife kept up with the in and out appearance. We only sold it because we needed a small SUV for the 4wd and so my wife could pull our son's dirt bikes if needed. Ironically we ended up getting an Isuzu Trooper, a vehicle CU declared unacceptable. It's been a year and this vehicle has been equally outstanding. If I understand correctly, all SUVs are at risk of tipping but CU went after Isuzu with a vengeance - which is probably why Isuzu is all but out of the passenger vehicle business. I have a hard time believing these ratings are impartial, when for over 20 years this magazine insisted we should all be driving a Toyota or Honda.
Mopar, Dodge, Jeep, Chrysler, HEMI, and certain other names are trademarks of Chrysler, LLC. We are not Chrysler. We are not responsible for the consequences of actions taken based on this site and make no guarantees regarding validity or applicability of information or advice. The Webmaster is not an expert. Copyright © 1998-2000, David Zatz; copyright © 2001-2008, Allpar LLC. All rights reserved.