CEA-708-D defines DTV Closed Captioning (DTVCC) and provides specifications and guidelines for
caption service providers, distributors of television signals, decoder and encoder manufacturers, DTV
receiver manufacturers, and DTV signal processing equipment manufacturers. CEA-708-D may also be
useful in other systems. CEA-708-D includes the following:
a) a description of the transport method of DTVCC data in the DTV signal
b) a specification for processing DTVCC information
c) a list of minimum implementation recommendations for DTVCC receiver manufacturers
d) a set of recommended practices for DTV encoder and decoder manufacturers
The use of the term DTV throughout is intended to include, and apply to, High Definition Television
(HDTV) and Standard Definition Television (SDTV).
1.1 Overview
DTVCC is a migration of the closed-captioning concepts and capabilities developed in the 1970s for
National Television System Committee (NTSC) television video signals to the digital television
environment defined by the ATV (Advanced Television) Grand Alliance and standardized by ATSC. This
new television environment provides for larger screens and higher screen resolutions, as well as higher
data rates for transmission of closed-captioning data.
NTSC Closed Captioning (CC) consists of an analog waveform inserted on line 21, field 1 and possibly
field 2, of the NTSC Vertical Blanking Interval (VBI). That waveform provides a transport channel which
can deliver 2 bytes of data on every field of video. At a nominal 30 frames per second, this translates to
60 bytes per second (Bps) when only field 1 is used or 120 Bps when both fields are used, i.e., a nominal
480 or 960 bits per second (bps).
In contrast, DTV Closed Captioning is transported as a logical data channel in the DTV digital bitstream.
DTV-specific closed captioning is allocated 9600 bps for each program. This increased capacity opens
the possibility for simultaneous transmission of captions in multiple languages and with multiple reading
levels, as well as the transport of an entire CEA-608 datastream.
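The data rates quoted in the two preceding paragraphs can be checked with a short sketch. It assumes a
nominal 30 frames per second (2 fields per frame) rather than the exact NTSC rate; the function and
constant names are illustrative and are not defined by CEA-708-D.

```python
# Illustrative arithmetic only; names are hypothetical, not part of CEA-708-D.
BYTES_PER_FIELD = 2              # line 21 carries 2 bytes on each field it occupies
NOMINAL_FRAMES_PER_SECOND = 30   # assumption: nominal rate, not the exact NTSC rate
DTVCC_BPS = 9600                 # per-program DTV closed-captioning allocation


def ntsc_cc_bps(fields_used: int) -> int:
    """Nominal NTSC caption bit rate when 1 or 2 fields carry line-21 data."""
    if fields_used not in (1, 2):
        raise ValueError("line-21 captions use field 1, or fields 1 and 2")
    bytes_per_second = BYTES_PER_FIELD * fields_used * NOMINAL_FRAMES_PER_SECOND
    return bytes_per_second * 8


def dtvcc_capacity_ratio(fields_used: int) -> float:
    """How many times larger the DTVCC allocation is than the NTSC rate."""
    return DTVCC_BPS / ntsc_cc_bps(fields_used)


if __name__ == "__main__":
    for fields in (1, 2):
        print(fields, ntsc_cc_bps(fields), dtvcc_capacity_ratio(fields))
```

Under these assumptions the DTVCC allocation is 10 times the two-field NTSC rate and 20 times the
field-1-only rate.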
The DTV standard also accommodates a variety of increased horizontal and vertical resolutions (e.g.,
704x480, 1280x720 and 1920x1080), versus the single 525-scan-line format of NTSC. These
added resolutions allow character fonts and other on-screen objects to be rendered in finer
detail.
The heart of any DTVCC caption display is the caption “window,” which is similar to the window concept
found in many computer Graphical User Interfaces (GUIs). Windows are placed within the DTV screen,
and caption text is placed within windows. Windows and text have a variety of color, size and other
attributes.
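The containment relationship described above (screen contains windows, windows contain text, and both
windows and text carry attributes) can be sketched as a small data model. This is only an illustration of
the concept: the class names, field names, and attribute keys below are hypothetical and do not reflect
the actual window or pen definitions given later in CEA-708-D.

```python
from dataclasses import dataclass, field


@dataclass
class StyledText:
    """A run of caption text with its own attributes (hypothetical keys)."""
    text: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class CaptionWindow:
    """A window holding caption text; attribute keys are illustrative only."""
    attributes: dict[str, str] = field(default_factory=dict)
    lines: list[StyledText] = field(default_factory=list)


@dataclass
class Screen:
    """The DTV screen, which holds zero or more caption windows."""
    windows: list[CaptionWindow] = field(default_factory=list)

    def all_text(self) -> list[str]:
        """Collect the text of every line in every window, in order."""
        return [line.text for window in self.windows for line in window.lines]


if __name__ == "__main__":
    window = CaptionWindow(
        attributes={"fill_color": "black"},
        lines=[StyledText("Hello, world.", {"color": "white", "size": "standard"})],
    )
    print(Screen(windows=[window]).all_text())
```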
CEA-708-D describes these topics in a reverse-hierarchical (i.e., low-to-high level) fashion. It follows
a layered protocol stack patterned on the Open Systems Interconnection (OSI) Reference Model.
DTVCC consists of 5 protocol layers: the Transport Layer, the Packet Layer, the Service Layer, the
Coding Layer, and the Interpretation Layer. The discussion of the first 2 layers is a detailed presentation
of data organization issues. The discussion of the last 2 layers provides a more informative presentation
of the unique aspects of closed captioning. Some readers may wish to start with these last 2 layers,
beginning in Section 7.
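As a reading aid, the five layers can be written down as an ordered enumeration, lowest layer first. The
numeric values are an assumption made here purely to encode the low-to-high ordering; CEA-708-D names
the layers but the identifiers and numbers below are illustrative.

```python
from enum import IntEnum


class DtvccLayer(IntEnum):
    """DTVCC protocol layers, ordered low to high (values are illustrative)."""
    TRANSPORT = 1
    PACKET = 2
    SERVICE = 3
    CODING = 4
    INTERPRETATION = 5


def layers_low_to_high() -> list[DtvccLayer]:
    """Return the layers in the order the document presents them."""
    return sorted(DtvccLayer)


if __name__ == "__main__":
    for layer in layers_low_to_high():
        print(layer.value, layer.name.title())
```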