How we test: TVs
Updated February 10, 2012
Today's TV choices cover a broad spectrum of screen sizes, features, technologies, and prices, but the item TV shoppers consistently consider most important is picture quality. While an assessment of image quality would appear hopelessly subjective, CNET Labs has come up with a set of tools and procedures designed to arrive at unbiased results by utilizing industry-accepted video-quality evaluation tools, objective testing criteria, and trained experts.
You can read the full description of our testing procedures below, but if you want to get the short version, check out this video.
Test environment
The most important piece of test equipment is a trained, expert eye. Test patterns and the latest gear are no substitute for a knowledgeable, keen-eyed evaluator with a background in reviewing similar types of TVs. CNET's TV reviewers, David Katzmaier and Ty Pendlebury, have extensive experience reviewing and calibrating displays, and perform all measurements and tests themselves.
Test and reference equipment
Our primary mechanical test device is a Konica Minolta CS-2000 spectroradiometer, which replaced an older CS-200 in June 2008. The CS-2000 improves upon the CS-200 in its capability to measure low-luminance sources, and is generally regarded as one of the most-accurate devices of its kind. It measures chromaticity and luminance from any type of display, including plasma, LCD, LED-based LCD and DLP, in direct-view as well as front- and rear-projection configurations. Chromaticity is hue and saturation of a color regardless of luminance, and is commonly expressed as x/y coordinates on the CIE chromaticity diagram. Luminance is a measure of brightness. The CS-2000 is supported and controlled directly by the Calman software (below).
Additional reference and test gear in CNET's TV lab:
- Current reference display: As of September 2008, CNET uses the Pioneer Elite Kuro PRO-111FD as the primary reference display for black level performance. As of early 2012 it still produces the best overall 2D picture quality we've tested. In 2012 we will also use the Samsung PN59D8000 and Sharp Elite PRO-60X5FD for color and black-level reference, respectively, as well as the Samsung UN55D8000 for 3D reference.
- AV Foundry VideoForge: A signal generator that outputs a variety of test patterns at various resolutions and formats, including all HDTV resolutions, 1080p and 3D, via HDMI. It's supported and controlled directly by the Calman software (below), and as of 2012 replaces the Sencore MP-500 we used previously.
- Key Digital 1x8 HDMI distribution amplifier, Key Digital 4x1 HDMI switch: The eight-output HDMI distribution amplifier/switch combo can send any of four HDMI sources (including 3D) to as many as eight displays simultaneously without any signal degradation. We use this setup for side-by-side comparison testing.
- Extron DA6 YUV A: A six-output component-video/RGBHV distribution amplifier that can send one SD or HD source to as many as six different displays simultaneously without any signal degradation. We use it primarily for side-by-side comparison testing of component-video.
- Sony PlayStation 3 Slim: Blu-ray player (reference, 3D compatible)
- Oppo DV-980H: DVD player
- DirecTV HR24: high-def DVR (3D compatible)
- Monoprice and Key Digital HDMI cables
The reference and test software in CNET's lab includes:
- Calman 4.x by Spectracal: This flexible software program controls both our spectroradiometer and signal generator via a laptop PC to aid in the calibration process. It provides a step-by-step procedure for adjusting TV picture controls, including advanced grayscale and color management, according to guidelines used by the Imaging Science Foundation (ISF). Every TV CNET reviews is calibrated prior to evaluation using this procedure, and the reports and many of the numeric evaluation results at the end of the review are generated by Calman.
- Digital Video Essentials: HD Basics (Blu-ray): This test disc is a secondary source for the patterns used for calibration and evaluation.
- HQV Benchmark (Blu-ray): Patterns from this disc are used to help evaluate video processing.
- FPD Benchmark Software for Professional (Blu-ray): Patterns from this disc are used to evaluate motion resolution.
TV review sample information
Unless noted otherwise, CNET HDTV reviews are based on one reviewer's hands-on experience with a single particular sample of one model. While our experiences are usually representative of other samples with the same name by the same manufacturer, we can't always be sure of that since performance can vary somewhat from sample to sample--particularly if newer samples receive updated firmware, or if manufacturers make changes without updating the model name. We typically review models as quickly as possible, so we often receive early versions of firmware that are sometimes corrected later. However, we never review preproduction samples. All of the samples used in CNET HDTV reviews represent, as far as we can tell, shipping models.
Sometimes a firmware version will have a direct effect on the performance of a television, and thus on its final review score. When this is the case and we're made aware of it--usually after a CNET reviewer or a reader finds a performance-related problem--we'll note the firmware version in the review and post related follow-up information in a note referenced in the review body.
TV makers generally group their models into series, which share identical features, styling, and specifications across multiple screen sizes. In 2009, CNET's TV reviews were expanded to cover other sizes in the series, not just the one size we typically review hands-on. While we don't test these other sizes directly, we feel that the performance-related remarks, as well as other portions of the review, apply closely enough to all sizes to warrant a "series review" approach. Even so, we are careful to check with the manufacturer to make sure there aren't any "odd" members of the series to which the review wouldn't apply. Check out our in-depth explanation for more.
It's worth noting that CNET obtains most of its review samples directly from manufacturers, typically by an editor asking a public relations representative for the desired model. This, unfortunately, can lead to manufacturers sending nonrepresentative samples, or even tampering with the units before they are sent, to help ensure more-positive reviews. If we spot a blatant case of tampering, we'll note it in the review, but we can't always prove it (and in case you're wondering, no, we've never spotted a case of tampering that we could prove enough to mention in a review). If a manufacturer cannot ship us a sample or doesn't want us to review a particular set, we sometimes buy the model in question ourselves.
Test procedure
We strive to consistently test all TVs we review using the procedure below. In cases where not all of the tests are followed, we'll note the missing items in the review.
Aside from the bright-room portion of the test (see below), all CNET HDTV reviews take place in a completely darkened environment. We realize that most people don't always watch TV in the dark, but we use a dark environment ourselves for a number of reasons. Most importantly, darkness eliminates the variable of light striking the TV's screen, which can skew the appearance of the image. It makes differences in image quality easier to spot, especially perceived black-level performance, which is severely affected by ambient light. Darkness also allows viewers at home to more easily match the experiences written about by the CNET reviewer. Finally, darkness is the environment we find most satisfying for watching high-quality material on a high-performance TV.
Calibration
Before we perform formal evaluations of HDTVs, we first calibrate their picture settings, with the help of the Calman software, to achieve peak performance in our dark room. Though it may seem more realistic to test TVs in the default picture settings, those settings are nearly always designed for maximum brightness, saturation, and impact on the showroom floor. That might sound desirable, but we believe a more natural, realistic picture looks better--in other words, one that most accurately reproduces the incoming signal. Unfortunately, most of the picture presets on HDTVs are not designed for that kind of accurate reproduction, which is one reason why we perform calibrations. Another is to provide a level playing field for comparisons.
Unlike some of the third-party TV calibrations offered today, the ones performed for CNET TV reviews do not utilize settings in the hidden "service menus" of televisions. Nearly all TVs have these menus, and previously we would access them to better calibrate our review samples. In the last few years, however, we have posted our ideal dark-room picture settings as part of our reviews, and since users cannot typically access those service menus (at least, not without voiding the warranty), we decided to no longer use them in our calibrations. We recommend that TV viewers avoid accessing the service menus themselves, because without proper training they can do more harm than good. Happily, many new HDTVs offer ample controls to achieve optimum picture quality without having to resort to service menus. Check out this Q&A for more.
CNET TV calibrations follow a few steps, utilizing the ISF Advanced workflow from Calman 4.x and patterns from the VideoForge signal generator connected via HDMI to the TV:
- Choose the picture mode (typically Movie or Cinema) and color temperature preset (typically Warm or Low) that produce the most accurate initial dim-room picture, allow full access to detailed controls, and come closest to D65, or 6500K.
- Disable or minimize any automatic picture adjustment controls, dynamic contrast, ambient light sensors, auto black, auto color/flesh tone, or other circuits that change the picture on the fly. Engage settings, such as local dimming on LED displays, that generally improve picture quality.
- Adjust brightness and contrast for maximum dynamic range without clipping, using the Black and White Pluge patterns.
- Adjust maximum light output to 40 fL (footlambert) from a 100 percent window pattern. This light level is bright enough to provide excellent contrast but not be overwhelming in dim and dark rooms; it is achievable by most TVs we test.
- Choose the gamma preset (if available) that comes closest to an average of 2.2, the standard for professional monitors and appropriate for a dim room.
- Calibrate color management system, if available. We attempt to achieve proper absolute luminance for primary colors and proper hue for secondary colors, as dictated by Calman and the Rec709 HD color standard. CMS adjustments are made using 75 percent luminance window patterns. If CMS can't improve on default settings or introduces artifacts, we disable it.
- Calibrate grayscale using 2-point and/or 10-point system, if available. We attempt to adjust all levels of gray, in 5 percent increments using window patterns, to come as close as possible to D65 (x=0.3127, y=0.329) while maintaining 2.2 gamma.
- Adjust brightness, contrast, light output (luminance), color, tint, and sharpness a final time.
All of our picture settings used to achieve the calibrated image are published on a page specific to each TV in CNET's picture settings forum. Each review contains a link and image (see above) to that page.
Side-by-side comparison
Nearly every HDTV CNET reviews is compared with others in the room during the evaluation. This is a direct, side-by-side comparison; the TVs are literally lined up next to one another and compared in real-time, with the reviewer recording observations on a laptop computer. We use numerous sources fed through a switch and a distribution amplifier--a device that can feed multiple TVs the exact same signal with no degradation. TVs being compared often share similar price points, screen sizes, and other characteristics, but can just as often be more or less expensive or have different characteristics to better illustrate major differences (such as between LCD and plasma, or an extremely expensive set versus a less-expensive model).
These comparisons allow CNET's reviewers to make definitive, in-context statements about virtually every area of a TV's performance, and their accuracy depends on each of the TVs sharing a level playing field. For that reason, we compare only calibrated televisions. We know of no other professional publication that regularly performs side-by-side comparisons as a part of nearly every review.
Image-quality tests
We perform a broad range of tests on all televisions we review, organized into a few key categories. Most comments in a TV review's performance section are based on observations of a Blu-ray movie, since Blu-ray is the highest-quality source typically available to HDTV viewers today. We use a variety of films, as opposed to one or two "reference" films, to better illustrate that performance characteristics are universal and apply regardless of which movie's being watched (they also make the reviews more fun to read and write). An argument can be made for using the same movie every time, and we do have a few scenes in certain films that we return to over and over, but in general we prefer to spread it around.
Here are the main picture quality areas addressed in CNET reviews:
- Black-level: We comment on the depth of black a TV is capable of producing. Since deeper, "blacker" blacks lead to more-realistic pictures, higher contrast, and more "pop" and color saturation, we consider black level the most important single performance characteristic of a TV. We also talk about shadow detail and gamma in this section. Subjective observations are supported by the "Black luminance (0%)" and "Avg. gamma" measurements in the Geek Box (see below).
- Color accuracy: We evaluate the combination of color temperature and primary and secondary color accuracy according to the Rec709 HD color standard. Subjective observations are supported by the majority of measurements in the Geek Box, everything from "Near-black x/y" to "Yellow hue."
- Video processing: This broad range of tests includes objective measurements, such as resolution capabilities and 1080i de-interlacing, and subjective tests with both patterns and real-world material. One of the most important is the ability to properly handle 1080p/24 cadence (see HDTV resolution explained for more). As of September 2008, we also began testing for motion resolution, which has both subjective and objective elements and so is usually reported as a range, e.g. "between 300 and 400 lines." If a TV has motion processing, such as 120Hz or 240Hz smoothing (dejudder), we also address its real-world effects in this section. We'll also talk about excessive video noise here, if we can trace it to the TV, as well as other miscellaneous issues such as false contouring (aka solarization) not dealt with elsewhere. The remainder of the Geek Box below hue is devoted to video processing.
- Uniformity: With LCDs and rear-projection sets, we use this section to address backlight uniformity across the screen, making subjective observations with full-raster test patterns and flat-color scenes, such as shots of skies, from program material. We also talk about off-angle viewing in this section, using similar material and subjective comparisons. Plasma TVs usually have effectively perfect uniformity and off-angle viewing, so we typically don't include this section in plasma reviews, but we will if the plasma's uniformity is atypical to our eye.
- Bright lighting: In another subjective test, we turn on the lights in our testing area and open the windows during daytime to see how the TV handles ambient light. We note the screen's reflectivity compared with its peers, as well as its ability to maintain black levels.
- 3D: Our final tests involve 3D picture quality, and at the moment they're entirely subjective. Moreover, we don't perform calibrations in 3D, although if the default "Movie" or "Cinema" settings for 3D seem particularly incorrect, we'll do some tweaking of the basic controls. In this section we usually address crosstalk, the depth effect, overall luminance, and video processing in 3D (see the 3D TV FAQ for more on these issues). We don't normally evaluate a TV's 2D-to-3D conversion, however. Note that a TV's 3D picture quality is the sole item from this list that doesn't factor into the TV's numeric Performance score.
CNET does not usually evaluate HDTV audio quality. We think that anyone who cares would be better served investing in a separate audio system; the least expensive models will nearly always outperform a TV's built-in audio.
We have also dropped standard-definition testing from our regimen as of spring 2011. The main reason is that most TVs we test will be mated to high-definition cable or satellite boxes, which usually perform the conversion of standard-def channels to high-def internally, making a TV's standard-def processing moot.
In 2012 we also stopped testing TVs with PC sources since we saw little variation in how TVs handled digital (HDMI) video from computers, and analog (VGA) computer connections are less common. Check out How to use your TV as a computer monitor if you're interested in doing so.
Geek Box and Calman report
The Geek Box is where we put the most-important objective results we attain from measurements. It was expanded significantly in April 2011 and remains unchanged in 2012. The numbers therein adhere to the following guidelines to arrive at Good, Average, or Poor scores.
We determined the cutoffs for those scores based on measurements taken from TVs we reviewed in 2010, guidelines in the Calman software where applicable, and editorial discretion. Note that while these numbers and scores are useful, they don't necessarily represent the full picture quality of a display, and we consider many other factors when arriving at the numeric performance score in a CNET review.
Unless otherwise noted, all test patterns measured are windows--a rectangle of white, gray, or color in the center of the screen surrounded by black--generated by the VideoForge; all numbers reported in the charts are taken directly from the Calman software's post-calibration captures; luminance units are in fL (footlamberts), where 1 fL = 3.43 candelas per meter squared (cd/m2); and all percentages refer to the test pattern's luminance, where 0 percent is black and 100 percent is white.
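Because the Geek Box uses footlamberts while many other sources quote cd/m2, a quick conversion can be handy. The snippet below is only an illustrative sketch: the function names are made up for this example and are not part of Calman, and it uses the slightly more precise factor 3.4262591, which rounds to the 3.43 quoted above.

```python
# Illustrative unit conversion; names are hypothetical, not part of Calman.
CDM2_PER_FL = 3.4262591  # 1 footlambert in cd/m2 (rounds to 3.43)


def fl_to_cdm2(fl):
    """Convert footlamberts to candelas per square meter."""
    return fl * CDM2_PER_FL


def cdm2_to_fl(cdm2):
    """Convert candelas per square meter to footlamberts."""
    return cdm2 / CDM2_PER_FL


# The 40 fL light-output target used in calibration is roughly 137 cd/m2.
print(round(fl_to_cdm2(40), 1))  # prints 137.1
```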
Black luminance (0%) Example result: 0.0140
This is the measure of the luminance of "black," and a lower number is better. It's often referred to as MLL, for minimum luminance level, but since this measurement is taken post-calibration it may be higher than the TV's minimum. We consider the post-calibration black level most important because the calibration process aims to prevent crushing of shadow detail and "tricks" like dynamic contrast that can affect this measurement.
Good: less than 0.009
Average: 0.009 to 0.019
Poor: 0.02 or higher
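The cutoffs above amount to a simple threshold lookup, sketched here for illustration. The function name is hypothetical, and the listed ranges leave a small gap between 0.019 and 0.02; this sketch assumes readings in that gap count as Average.

```python
def score_black_luminance(fl):
    """Map a post-calibration 0% luminance reading (in fL) to a Geek Box score."""
    if fl < 0.009:
        return "Good"
    if fl < 0.02:  # assumption: readings between 0.019 and 0.02 count as Average
        return "Average"
    return "Poor"


# The example result above, 0.0140 fL, lands in the Average band.
print(score_black_luminance(0.0140))
```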
Avg. gamma Example result: 2.24
Gamma is a measure of how much light a display produces when fed a certain level of signal. The score is based on the result's +/- deviation from 2.2, the standard for professional video monitors and appropriate for a dim room.
Good: less than 0.1 deviation
Average: 0.2 or less deviation
Poor: more than 0.2 deviation
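For readers curious how an average gamma figure can be derived from measurements, here is a minimal sketch. It assumes the common simple power-law model, in which luminance relative to white equals the stimulus level raised to the power of gamma. Calman's own calculation may differ (for example, in how it accounts for black level), and all names and readings below are hypothetical.

```python
import math


def point_gamma(stimulus, luminance, white_luminance):
    """Gamma implied by one gray window: luminance / white = stimulus ** gamma."""
    return math.log(luminance / white_luminance) / math.log(stimulus)


def average_gamma(readings, white_luminance):
    """Mean point gamma over (stimulus, luminance) pairs with 0 < stimulus < 1."""
    gammas = [point_gamma(s, lum, white_luminance) for s, lum in readings]
    return sum(gammas) / len(gammas)


def score_gamma(avg_gamma, target=2.2):
    """Apply the Good/Average/Poor cutoffs to the deviation from the target."""
    deviation = abs(avg_gamma - target)
    if deviation < 0.1:
        return "Good"
    if deviation <= 0.2:
        return "Average"
    return "Poor"


# Hypothetical readings (in fL) for a display that tracks 2.2 exactly, 40 fL white.
readings = [(s / 10, 40 * (s / 10) ** 2.2) for s in range(1, 10)]
print(round(average_gamma(readings, 40), 2))
print(score_gamma(2.24))  # the example result above deviates by 0.04
```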
Near-black x/y (5%) Example result: 0.2947/0.2865
The color of gray at 5 percent luminance, slightly brighter than black. The result is presented as coordinates x and y on the 1931 CIE chromaticity color space diagram, which is a more accurate presentation than the degrees kelvin (K) often used to represent color temperature. The target is 0.3127/0.329, corresponding to the D65 specification used by the Rec709 HD color standard. The score is based on an average +/- deviation from the target, which results from adding the difference between x and the spec to the difference between y and the spec. Since near-black is often difficult to get correct, and because errors here are less noticeable to the eye than in brighter shades of gray, our cutoffs are much less stringent here than in the next two grayscale tests.
Good: less than 0.01 deviation
Average: 0.03 or less deviation
Poor: more than 0.03 deviation
Dark gray x/y (20%) and Light gray x/y (70%) Example result: 0.3131/0.3310
The color of gray at 20 percent and 70 percent luminance, presented and scored similarly to 5 percent above and with the same target. Cutoffs are relatively strict since we expect most TVs to have detailed 2-point grayscale controls that allow 20 percent and 70 percent gray to come extremely close to the target.
Good: less than 0.002 deviation
Average: 0.004 or less deviation
Poor: more than 0.004 deviation
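The grayscale deviation described above can be worked through in a few lines. This is an illustrative sketch only: the names are invented for the example, and it assumes "difference" means the absolute difference on each axis.

```python
D65 = (0.3127, 0.329)  # target x/y from the text

# (good_below, average_up_to) cutoffs per gray level, as listed above
CUTOFFS = {
    "near_black": (0.01, 0.03),  # 5 percent
    "gray": (0.002, 0.004),      # 20 percent and 70 percent
}


def xy_deviation(x, y, target=D65):
    """Sum of the absolute x and y differences from the target (an assumption)."""
    return abs(x - target[0]) + abs(y - target[1])


def score_grayscale(x, y, level):
    """Score a measured gray point against the cutoffs for its level."""
    good_below, average_up_to = CUTOFFS[level]
    deviation = xy_deviation(x, y)
    if deviation < good_below:
        return "Good"
    if deviation <= average_up_to:
        return "Average"
    return "Poor"


# The two example results above: deviations of about 0.0605 and 0.0024.
print(score_grayscale(0.2947, 0.2865, "near_black"))
print(score_grayscale(0.3131, 0.3310, "gray"))
```

With these cutoffs, the near-black example would rate Poor and the gray example Average.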
Before and After avg. color temp. Example result: 6,711
The average color temperature in degrees Kelvin, measured both before (using the most-favorable picture mode) and after calibration. The score is based on an average +/- deviation from the target of 6504K.
Good: less than 100 deviation
Average: 100 to 200 deviation
Poor: more than 200 deviation
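As a rough illustration of the color temperature scoring, the sketch below applies the cutoffs to a before/after pair. The function name and the "after" reading are hypothetical; only the 6,711 figure comes from the example result above.

```python
TARGET_KELVIN = 6504


def score_color_temp(avg_kelvin, target=TARGET_KELVIN):
    """Score an average color temperature by its distance from the target."""
    deviation = abs(avg_kelvin - target)
    if deviation < 100:
        return "Good"
    if deviation <= 200:
        return "Average"
    return "Poor"


# 6,711 is the example result; the "after" value is made up for illustration.
for label, kelvin in [("before", 6711), ("after", 6520)]:
    print(label, abs(kelvin - TARGET_KELVIN), score_color_temp(kelvin))
```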
Red, Green, and Blue lum. error (de94_L), Example result: 2.4247
Luminance error of the three primary colors, compared with a 100 percent white reference and expressed as a value of delta L using the 1994 formula. Lower numbers are better, and per Calman any value below 3 is considered acceptable. For primary colors, relative luminance is more important than hue or other factors. The numbers reported here are precalibration unless the TV has a CMS (color management system) that provides an improvement--in which case we report post-calibration results instead.
Good: less than 1.5
Average: 1.5 to 3.0
Poor: higher than 3.0
Cyan, Yellow, and Magenta hue x/y, Example result: 0.323/0.1508
The three secondary color points, expressed as coordinates on the 1931 CIE chromaticity color space diagram. The target is the Rec709 color space specification: Cyan: 0.225/0.329, Magenta: 0.321/0.154, Yellow: 0.419/0.505. The score is based on an average +/- deviation from the target, which results from adding the difference between x and the spec to the difference between y and the spec. For secondaries, x and y coordinates are more important than luminance or other factors. The numbers reported here are precalibration unless the TV has a CMS (color management system) that provides some improvement--in which case we report post-calibration results instead.
Good: less than 0.012 deviation
Average: 0.012 to 0.02 deviation
Poor: more than 0.02 deviation
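The secondary-color scoring follows the same add-the-differences approach, just with a different target per color. The sketch below is illustrative: the names are hypothetical, the magenta y target is taken as 0.154 (the Rec709 value), and it assumes the example result above is a magenta reading.

```python
# Rec709 secondary targets as x/y; magenta y is taken as 0.154.
SECONDARY_TARGETS = {
    "cyan": (0.225, 0.329),
    "magenta": (0.321, 0.154),
    "yellow": (0.419, 0.505),
}


def hue_deviation(color, x, y):
    """Sum of the absolute x and y differences from the color's target."""
    tx, ty = SECONDARY_TARGETS[color]
    return abs(x - tx) + abs(y - ty)


def score_hue(color, x, y):
    """Apply the Good/Average/Poor cutoffs to a secondary color reading."""
    deviation = hue_deviation(color, x, y)
    if deviation < 0.012:
        return "Good"
    if deviation <= 0.02:
        return "Average"
    return "Poor"


# Assumes the example result (0.323/0.1508) is magenta: 0.002 + 0.0032 = 0.0052.
print(round(hue_deviation("magenta", 0.323, 0.1508), 4))
print(score_hue("magenta", 0.323, 0.1508))
```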
1080p/24 Cadence (IAL) Example result: Pass
In this subjective test we use our favorite scene for checking proper film cadence, a helicopter flyover from the Blu-ray of "I Am Legend" (Chapter 7, 24:58 in), played back at 1080p/24. If the TV, in its most favorable setting, delivers the same look to the scene as our reference display, it passes. If it introduces smoothing or the hitching motion of 2:3 pull-down, it fails.
Good: Proper film cadence (denoted by "Pass").
Poor: Improper film cadence (denoted by "Fail").
No average score possible
1080i Deinterlacing (film) Example result: Fail
We use the HQV Benchmark on Blu-ray's Film Resolution Loss Test to determine whether the display can recognize film-based content recorded at 24fps and convert it to the display's native resolution without losing detail.
Good: Fine horizontal lines visible in corner boxes (denoted by "Pass")
Poor: Boxes exhibit strobing and/or vertical bands (denoted by "Fail")
No average score possible
Motion resolution (max) and (dejudder off) Example result: 1,000
We use the FPD Benchmark Software for Professional Blu-ray's moving Monoscope pattern to measure the maximum number of horizontal lines of resolution the display preserves during motion. Higher results are better. This test is often difficult to evaluate so it's subjective to a certain extent; we report the higher number in the range if in doubt. Check out our in-depth explanation for more. In the (max) row the TV is set to the most-favorable picture setting, while in the (dejudder off) row video processing that introduces smoothing is disabled to the largest extent possible. If such processing is impossible to turn off, we list a result of "N/A."
Good: 900 lines or more
Average: 500 to 899 lines
Poor: fewer than 500 lines
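The motion resolution rules above can be summarized in a short sketch. The function names are hypothetical, and using None to stand in for an "N/A" result is an assumption made for this example.

```python
def reported_lines(low, high):
    """When a reading falls within a range, report the higher number."""
    return max(low, high)


def score_motion_resolution(lines):
    """Score a motion-resolution reading; None stands in for an "N/A" result."""
    if lines is None:
        return "N/A"
    if lines >= 900:
        return "Good"
    if lines >= 500:
        return "Average"
    return "Poor"


print(score_motion_resolution(1000))                      # the example result
print(score_motion_resolution(reported_lines(300, 400)))  # a 300-to-400 range
print(score_motion_resolution(None))                      # dejudder can't be disabled
```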
Good: Accepts and displays every line of native panel resolution (denoted by "Pass")
Poor: Doesn't accept native resolution; introduces softness, ringing or overscan that adjustments cannot fix (denoted by "Fail")
No average score possible
Calman report
Beginning in April 2011, CNET reviews include the complete calibration report from Calman, available as a PDF document at the end of the review. Many of the numbers in the Geek Box are drawn directly from that report, and it provides a visual representation of the TV's color and gamma characteristics (note that luminance on the chart is reported as cd/m2, not fL). For a full explanation of the charts in the report, check out How to read a TV review, part 2: Calibration results.

