Equipping the Modern Audio-Video
Audio-Video Forensics Expert
Forensic Audio, Video, and Image Analysis
Federal Bureau of Investigation
Video Forensics Expert
Establishing or increasing the capability of a modern forensic audio-video laboratory draws
on diverse disciplines including physics, electrical and electronic engineering,
computer science, analog and digital theory, acoustics, digital signal analysis,
digital imaging, and other related fields. These disciplines require a wide variety
of electronic equipment to enable a forensic examiner to conduct reliable examinations.
This paper is directed toward established laboratories that have limited audio-video
capability but may be required to increase audio-video forensic support.
The following factors should be considered when establishing or increasing the capability
of a modern audio/video forensic laboratory:
- A lengthy apprenticeship or equivalent experience of personnel in certain audio
and video analysis fields
- Specialized training in such areas as digital signal analysis, recording theory,
sound measurement, and video imaging
- Support and guidance from other established laboratories
- Equipping the laboratory
to play back and record in numerous analog and digital audio and video formats,
and then providing the capability to improve voice intelligibility, compare voices,
identify non-voice signals, authenticate recordings, enhance video images, and
conduct other related analyses
- Professional audio, video, enhancement, signal analysis, imaging, and other related
equipment
- Physical space for the laboratory
This paper will concentrate on the last three items in the list by discussing eight
audio-video examinations, the identification and costs of equipment needed to
examine audio-video evidence, and laboratory-space considerations. Examination
procedures and personnel matters will not be addressed.
Audio-video forensic laboratories are capable of analyzing analog or digital audio-video
recordings to support criminal investigations, governmental intelligence, civil
litigation, personnel and administrative matters, and other related issues. Some
laboratories conduct a wide range of examinations on audio and video recordings,
whereas others provide only duplication and a few other commonly requested analyses.
The eight most commonly conducted audio-video examinations are listed below, with
the level of complexity from lowest to highest:
- Playback and duplication: The capability to play back recordings and provide high-quality,
readily usable duplicates on standard formats. A fully equipped laboratory can
work with most audio-video formats as well as specialized recordings (e.g., miniature
law enforcement formats, 911 logging recordings, time-lapse video).
- Repair: The capability
to repair torn or stretched tapes in standard audio-video analog and digital formats
- Audio enhancement:
The capability to improve the voice intelligibility of recordings and to prepare
enhanced copies of audio recordings and the audio information on video formats
- Voice identification:
The capability to compare unknown recorded voices to known voice exemplars by
identifying similar and dissimilar characteristics
- Video image duplication
and enhancement: The capability to duplicate or improve video recordings
- Signal analysis:
The capability to quantify, identify, and compare non-voice signals to determine
origin and characteristics (e.g., telephone signaling, gunshot sounds)
- Authenticity: The
capability to resolve the authenticity of audio-video recordings by determining
originality, continuity, and integrity
- Digital data analysis and retrieval: The capability to extract and analyze digital
audio-video recordings on commercial and proprietary recording devices
The equipment requirements for the eight audio-video forensic examinations are listed
below. The phase numbers represent the most basic through the more sophisticated
equipment requirements. The equipment costs reflect the prices at the time of
publication and do not include personnel, training, and laboratory space.
Phase I: Basic Playback, Duplication, and Repair
To allow high-quality playback and standard-format duplication of audio and
video recordings, a laboratory should have the following equipment (Koenig 1987;
Luther and Inglis 1999; Pohlmann 2000; Watkinson 2001; Watkinson 2000):
- Standard professional audiocassette
decks with heavy-duty transports, a ±10 percent or greater speed adjustment around
each standard speed, a three-head design, Dolby-noise-reduction systems (B, C,
and S), and real-time counters. Useful additions include set speeds of 15/16,
1⅞, and 3¾ inches per second (ips), a continuous playback speed adjustment
from at least 15/32 to 3¾ ips, and a headphone jack with its own volume
control.
- Audio microcassette playback decks optimized for forensic applications, including
continuous playback speeds of 0.7 to 2.4 centimeters per second, accurate metering,
headphone volume controls, and professional-quality transport and electronic systems.
Laboratory personnel normally do not make audio microcassette copies due to their
inherently lower quality.
- Audio playback systems including open-reel, minicassette, miniature on-the-body,
NT digital, and other analog and digital devices with playback speed, track configuration,
and other features that match the type of recordings expected to be received in
the laboratory. Playback of selected recordings may require specialized devices
and/or software available only to law enforcement agencies.
- Video recorders/playback
units with formats ranging from analog U-Matic to the latest digital configurations,
including VHS, VHS-C, SVHS, 8mm, Hi8, time-lapse, 6mm digital, and 8mm digital.
High-quality or professional decks ensure the best possible playback and/or recording.
In video playback, emphasis must be placed on the need for each head (field) to
precisely follow its respective track. This may require procuring a number of
video playback units of similar type in order to optimize the output results.
A professional analog or digital format or a semiprofessional format (e.g., SVHS)
unit should be available when interim copies are needed to minimize inherent losses
in video duplication.
- A professional digital time-base corrector to remedy timing errors for the variety
of video formats expected in the laboratory and to capture and hold separate fields
and frames in its memory
- A video monitor to accurately display the full bandwidth and resolution of the highest
quality format that will be played back in the laboratory. A hand-held screen
degausser should also be available to ensure long-term color purity.
- Equipment racks of the
correct width for the equipment, usually 19 inches. They should include professional-quality
accessories including patch bays, spacer panels, and wire runs.
- Tables for workspace and
equipment that are sturdy and at least 30 by 60 inches in size to allow ample
work and equipment space. Avoid folding tables, except for temporary use.
- High-quality wiring, connectors,
and adapters to manage all required equipment connections
- Tape repair supplies including
appropriate splicing blocks, edit-tabs, razor blades, small knives to cut open
housings, nonmagnetic-leader tape, tape-cleaning solvent, and miniature, non-magnetized
solution with particle sizes in the 0.2 to 1.5 micron range. Smaller particle
sizes cannot consistently be removed from analog tapes.
- A ten-power loupe to check
for a tape's track configuration and azimuth alignment, after application of the
- Heavy-duty, professional-quality headphones that completely seal around the ears.
Cheaper consumer headphones can be sonically equal to the professional models,
but they usually do not hold up to the heavy-duty use in the laboratory.
- Blank high-quality audio
and video recordable media for preparing copies
- Test recordings for calibration of playback and recording equipment
A basic audio system with professional or high-end consumer equipment would cost
between $8,000 and $11,000 and would allow playback, duplication, and repair of
audiocassettes and microcassettes.
A basic video system would add between $9,000 and $12,000 to the basic audio system
of Phase I and would allow playback, duplication, and repair of VHS, SVHS, VHS-C,
8 mm, Hi8, and time-lapse formats. A complete Phase I audio-video setup would
cost between $17,000 and $23,000.
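
The speed-adjustment requirement in the Phase I cassette deck item can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the adjustment is symmetric (plus or minus 10 percent) about each set speed. The function names are illustrative and are not taken from any deck's specification.

```python
from fractions import Fraction

# Set speeds named in the text, in inches per second (ips): 15/16, 1 7/8, 3 3/4.
STANDARD_SPEEDS_IPS = [Fraction(15, 16), Fraction(15, 8), Fraction(15, 4)]


def speed_window(nominal_ips, percent=10):
    """Return (low, high) playback speeds for a symmetric +/- percent adjustment.

    Assumption: the adjustment is centered on the nominal speed.
    """
    delta = nominal_ips * Fraction(percent, 100)
    return nominal_ips - delta, nominal_ips + delta


def within_window(actual_ips, nominal_ips, percent=10):
    """True if a recorder's actual speed can be matched by the deck's adjustment."""
    low, high = speed_window(nominal_ips, percent)
    return low <= actual_ips <= high


for nominal in STANDARD_SPEEDS_IPS:
    low, high = speed_window(nominal)
    print(f"{float(nominal)} ips: adjustable from {float(low)} to {float(high)} ips")
```

Under these assumptions, a recording made on a machine running a few percent off nominal falls inside the window, whereas one made at a non-standard speed would need the continuous speed adjustment mentioned in the same item.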
Phase II: Audio Enhancement
To allow meaningful improvements in the voice intelligibility of recorded audio
information, an audio-video laboratory should have the following equipment (Bellanger
2001; Davis 2002; Koenig 1988):
- A digital-adaptive enhancement processor that allows the operator to access and
implement a number of different filter algorithms, including one- and two-channel
adaptive, bandpass, comb, notch, spectral inverse, parametric, and graphic equalizer.
The digital-adaptive enhancement processor should also permit the use of compression
and limiting functions. The best configurations are hardware implementations with
software control, because purely software programs may not implement complex filters
in real-time, and the audio signals are susceptible to corruption when the computer
is doing other operations simultaneously. All of these filtering systems should
contain high-quality 16-bit, or greater, analog-to-digital and digital-to-analog
converters, at least two input channels, and a range of sampling rates to at least
- A separate compressor/limiter with at least two channels, adjustable and automatic
compression ratios, attack-release times, and gain reduction of at least 40 dB.
- A spectrum analyzer that
provides a detailed visual representation of the audio signal. The unit should
be a fast-Fourier transform design (not a real-time analyzer) with single-channel
capability. If the laboratory also conducts signal analysis and/or authenticity
examinations, a dual-channel or larger unit would be needed. The fast-Fourier
transform device should have 16-bit or greater resolution, frequency display ranges
adjustable from at least 0-100 Hz up to 0-20 kHz, a variable number of averages,
and at least 800 lines of resolution on any frequency range. A zoom capability
can also be a useful feature for some applications (Bracewell 2000).
- A modern computer system
that can be used for a number of applications including controlling the digital
enhancement devices and running other appropriate software.
- A computer printer that
provides hard copies of fast-Fourier transform plots, filter settings of the digital
enhancement devices, and other applicable information.
- Non-real-time software
programs for precise, non-linear time and amplitude processing of audio recordings.
Some of the functions applicable to the forensic field include pitch shifting
for general or localized correction of playback speed, amplitude adjustments,
and normalization. These software programs also allow redactions of specific portions
of a recording to be performed with more ease and accuracy.
An audio enhancement capability added to Phase I (audio and video) would cost
between $28,000 and $53,000 and would allow intelligibility enhancement of audio
recording formats and the audio track on video recordings.
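
The 800-line requirement for the spectrum analyzer in the Phase II list translates into a frequency spacing per display line. The following is a minimal sketch, assuming the lines are spread evenly across the displayed frequency range; that assumption and the function names are illustrative and do not describe any particular analyzer.

```python
def line_spacing_hz(span_hz, lines=800):
    """Frequency spacing between adjacent display lines for a given span.

    Assumption: the lines are evenly spaced from 0 Hz to span_hz.
    """
    return span_hz / lines


def lines_between(f1_hz, f2_hz, span_hz, lines=800):
    """Number of display lines separating two frequencies on a given span."""
    return abs(f2_hz - f1_hz) / line_spacing_hz(span_hz, lines)


# The narrowest and widest display ranges named in the text.
for span in (100.0, 20_000.0):
    print(f"0-{span:g} Hz span: {line_spacing_hz(span):g} Hz per line")

# Two tones 1 Hz apart, viewed on each range.
print(lines_between(60.0, 61.0, 100.0))
print(lines_between(60.0, 61.0, 20_000.0))
```

On the narrow range the two tones sit several lines apart, while on the wide range they fall well within a single line, which illustrates why adjustable ranges and a zoom capability are useful. How well closely spaced tones are actually separated also depends on the analyzer's windowing, which this sketch ignores.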
Phase III: Voice Identification
Present laboratory spectrographic and/or computer voice comparison systems do
not produce conclusive results, but meaningful findings are possible with careful
analysis of speech samples collected under forensic conditions. The minimum requirements
include the following equipment (IAI Voice Identification and Acoustic Analysis
Subcommittee 1991; Committee on Evaluation of Sound Spectrograms, National Research
Council, National Academy of Sciences 1979; Tosi 1979; Koenig 1993; Koenig 1986):
- An analog sound spectrograph
that produces excellent voice spectrograms, especially under noisy recording conditions.
It is being quickly replaced with specialized spectrographic software.
- Specialized spectrogram
software that produces digitally calculated spectrograms that have been optimized
for the speech and forensic communities. This software should be user-friendly
and allow the operator to control all the important time and frequency characteristics
of the graphic representation.
- Forensic voice identification algorithms that are presently being developed (Nakasone
and Beck 2001; Reynolds et al. 2000). When fully developed, this specialized,
computer-based software will allow automated and/or operator-assisted voice comparisons
between different voice samples.
- Software that allows two or more recorded voice samples to be selectively isolated
and combined into a new recording.
- A headphone-switching box that allows the rapid toggling between two input signals
containing separate voice samples for aural comparison.
Adding spectrographic and computer-based voice identification capability to Phases
I and II would cost between $12,000 and $25,000 and would allow comparisons between
unknown recorded voices and known voice exemplars.
Phase IV: Video Image Duplication and Enhancement
To allow the most accurate duplication and enhancement of video recordings, an
audio-video forensic laboratory should have the following equipment (Blitzer and
Jacobia 2002; Bovik 2000; Russ 2002; Russ 2001):
- A video-capture device that accurately captures the entire resolution of a single
field/frame and continuous video for any analog format expected in the laboratory.
Inputs should include at least composite and S-Video.
- A high-speed computer system
with a fast processor, a large monitor, sufficient random access memory, an input
for digital video (e.g., FireWire), and extensive hard-drive storage to handle
the video applications of the laboratory. Some manufacturers provide computer
equipment that has already been optimized for this task.
- Color and black-and-white
computer and video printers that provide hard-copy images from video evidence.
Like other components in the system, the printers should have sufficient resolution
to reproduce the entire captured and/or enhanced fields/frames.
- Software programs for continuous
video and still video enhancement, with appropriate algorithms to sharpen, enlarge,
enhance, edit, and correct visual details
A video imaging capability added to Phases I and II would cost between $4,000
and $50,000. The wide range in cost is due to the options selected, the computer
platform, the resolution, and the specifications.
Phase V: Signal Analysis and Authenticity
This phase includes signal analysis of audio signals and authenticity analyses
of audio and video formats. The minimum requirements include the following equipment
(Bellamy 2000; Bolt et al. 1974; Hodges 2001; Koenig 1990; Koenig et al. 1998;
Rappaport 1996; Reeve 1995; Reeve 1992):
- The Phase IV computer system
- Video monitors that include pulse-cross and under-scan modes, component and composite
inputs, and at least 600 lines of horizontal resolution. They should be able to
fully reproduce all anticipated digital formats up to the highest level resolution.
Multiple monitors may be needed to manage all of the analog and digital recordings.
- A dual-channel spectrum
analyzer of the fast-Fourier transform design, not a real-time analyzer. The required
features include zoom, two-channel comparison algorithms, and long-term averaging.
- A macro-photographic or
digital camera system that allows the production of accurate pictures of the tape
surface, ranging from 0.0125 up to 2 inches in width. If a digital camera is used,
it should have a resolution of at least 2000 by 2000 pixels.
- A high-quality laboratory
computer sound card that has a 16-bit or higher resolution, sample rates to at
least 44.1 kHz, two or more input and output channels, and low-noise and distortion
- Analysis software with the capability to perform waveform, narrow-band spectrum,
spectrographic, and other analyses on .wav, as well as on other computer-formatted
audio files. It should allow aural review of selected portions of the displayed
data, permit high-quality printouts of the various analyses, and have the capability
to display multiple windows on the screen with time correlation between the various
- Audio and video digital recorders that permit the duplication of original evidence
without any loss of visual or aural information.
A signal analysis and authenticity capability added to Phases I through IV would
cost between $10,000 and $60,000. The wide range in cost is due primarily to the
selected imaging system and software.
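
The sound card figures quoted in the Phase V list (16-bit resolution, sample rates to at least 44.1 kHz) can be related to two textbook quantities: the theoretical signal-to-noise ratio of an ideal quantizer and the Nyquist frequency. The following is a minimal sketch using the standard formulas; real converters perform below the ideal figure, and the function names are illustrative.

```python
import math


def ideal_snr_db(bits):
    """Theoretical SNR of an ideal uniform quantizer driven by a full-scale sine.

    Uses the textbook relation of about 6.02 dB per bit plus 1.76 dB.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing at a given sample rate."""
    return sample_rate_hz / 2


print(f"16-bit ideal SNR: {ideal_snr_db(16):.2f} dB")
print(f"Nyquist frequency at 44.1 kHz: {nyquist_hz(44_100):.0f} Hz")
```

The Nyquist figure shows that a 44.1 kHz sample rate covers the 0-20 kHz display range specified for the spectrum analyzer.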
Phase VI: Digital Data Analysis and Retrieval
This phase involves the transfer, analysis, and retrieval (when necessary) of
digital data contained on the various audio and video recorders used in the investigative
field. Digital forensic recordings may exist on a variety of media types including
tape (DAT, NT-2, DDS), optical (CD-R/RW, DVD-R/+R/-RW/+RW/RAM), magneto-optical
(MiniDisc), and random access (FLASH memory chips, hard drives). These recordings
may have been recorded in a standard format or on proprietary recorders that require
specialized hardware and/or software to play back and analyze the recorded data.
Most of the equipment needed for this phase was discussed in Phases I through
V; however, a few additional requirements include the following items:
- Software that allows the
user to view the digital data down to the bit level. The proprietary recorders
may require additional hardware and/or advanced software that often are available
only to law enforcement agencies.
- A removable, writable digital medium that has sufficient storage capability to archive
large data files that exist on higher-capacity formats and on solid-state body recorders
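
Viewing digital data "down to the bit level," as the first item in this list puts it, can be illustrated with a few lines of code. The following is a minimal sketch, not a substitute for forensic software: the function name and row layout are assumptions, and the four sample bytes are the ASCII characters "RIFF" that begin a .wav file, used here only as stand-in data.

```python
def bit_dump(data, bytes_per_row=4):
    """Format bytes as rows of: offset, hexadecimal values, binary values."""
    rows = []
    for offset in range(0, len(data), bytes_per_row):
        chunk = data[offset:offset + bytes_per_row]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        bit_part = " ".join(f"{byte:08b}" for byte in chunk)
        rows.append(f"{offset:08x}  {hex_part}  {bit_part}")
    return rows


# Stand-in data: the first four bytes of a .wav (RIFF) header.
sample = b"RIFF"
for row in bit_dump(sample):
    print(row)
```

In practice the bytes would be read from a working copy of the evidence file opened in binary mode rather than typed in, and proprietary formats may not follow any published layout.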
Digital data analysis and retrieval capability added to Phases I through V would
cost between $1,000 and $5,000 depending on the archival storage media chosen.
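
The phase costs quoted above can be accumulated to estimate the outlay for a laboratory equipped through a given phase. The following is a minimal sketch, assuming each quoted range is an increment on top of the earlier phases (the text's "added to" wording suggests this, but it remains an assumption) and using the complete audio-video figure for Phase I.

```python
# (low, high) equipment cost in dollars for each phase, as quoted in the text.
PHASE_COSTS = {
    "I": (17_000, 23_000),
    "II": (28_000, 53_000),
    "III": (12_000, 25_000),
    "IV": (4_000, 50_000),
    "V": (10_000, 60_000),
    "VI": (1_000, 5_000),
}


def cumulative_costs(costs):
    """Running (low, high) totals, assuming each phase adds to the previous ones."""
    low_total = 0
    high_total = 0
    totals = {}
    for phase, (low, high) in costs.items():
        low_total += low
        high_total += high
        totals[phase] = (low_total, high_total)
    return totals


for phase, (low, high) in cumulative_costs(PHASE_COSTS).items():
    print(f"Through Phase {phase}: ${low:,} to ${high:,}")
```

These totals cover equipment only; as noted earlier, personnel, training, and laboratory space are not included. A laboratory that skips a phase (for example, adding video imaging without voice identification) would omit that phase's range from the sum.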
The laboratory should be configured or constructed to decrease outside noise and vibration,
dampen laboratory equipment sounds, and minimize radio-frequency interference
and magnetic fields. Equipment should be arranged to provide sufficient workspace,
and the laboratory should have separate electrical circuits, proper lighting, and secure
and adequate storage (Berger et al. 2000; Harris 1991; Salter 1998).
- The following strategies
should be implemented to decrease outside noise and vibration:
- Eliminate as much
of the outside noise as possible by locating the laboratory away from noisy outdoor
environments like train tracks, airports, playgrounds, and busy streets.
- Increase the distance between
building noise and the laboratory by avoiding shipping docks, cafeterias, or machine
shops. Distancing the laboratory may mean moving the laboratory space to a quieter
location in the same building or to a different building. If laboratory relocation
is not possible, noise relocation may be an option. For example, request that
most of the dock shipments be loaded or unloaded as far as possible from the laboratory
or arrange with the cafeteria so that the lunch tables and the noisiest equipment
are farthest from the common wall with the laboratory. Machine shop noise can
be reduced by placing sound-absorbing materials around the equipment and installing
soundproofing material on the walls and ceiling.
- The laboratory walls can
be isolated and soundproofed. However, building new walls inside the present laboratory,
isolating the new walls from the old outside walls, and treating the space between
the two walls reduces laboratory space, increases construction costs, and creates
temporary noise and dust.
- Reduce the noise generated in the laboratory by purchasing equipment and computers with
low-noise fans. Place noisy equipment in soundproof isolation compartments. Whenever
possible, avoid fluorescent lighting to prevent the audible buzz the lighting
produces and the low-level noise it can induce in certain electronic equipment.
Limit the speed of the heating and air conditioning fans, and correct noisy air
- Minimize radio-frequency interference by locating laboratories away from commercial radio
and TV transmitting stations or police and other radio-transmitter towers. Some
equipment is more susceptible to outside fields than others, so place that equipment
in different locations in the laboratory or at different heights in the racks.
Placing the laboratory space below ground usually minimizes radio-frequency problems.
Interference from the building's electrical wiring can often be removed with specialized
AC power filters. Large transformers or loudspeakers with large magnets should
not be placed near any audio or video evidence. A portable gauss meter can determine
if there are problematic magnetic fields in the area. To avoid electrical spikes
and other irregularities, the laboratory should have dedicated electrical circuits.
- Equipment may be placed
on tables, but sturdy equipment racks are preferred. Racks allow higher density
with better cooling, provide better protection, facilitate wiring and rapid changes
in connections, are moveable, and look professional.
- The workspace should be
flexible enough to handle reorganization. Floor space should be planned to accommodate
current and future equipment. The aisles should allow easy access for bulky evidence,
large equipment, and equipment repair. There should be sufficient table space
to visually examine, mark, and lay out evidence. The workspace should have sufficient
lighting, preferably incandescent track lighting.
- A separate, secured area
containing a safe or safe file for evidence storage should be available with controlled
entry and an alarm. The storage area should be free of magnetic fields and be
temperature- and humidity-controlled.
References
Bellamy, J. C. Digital Telephony. 3rd ed., Wiley, New York, 2000.
Bellanger, M. G. Adaptive Digital Filters. 2nd ed., Dekker, New York, 2001.
Berger, E. H., Royster, L. H., Royster, J. D., Driscoll, D. P., and Layne, M. eds. Noise
Manual. 5th ed., AIHA, Fairfax, Virginia, 2000.
Blitzer, H. L. and Jacobia, J. Forensic Digital Imaging and Photography. Academic,
San Diego, California, 2002.
Bolt, R. H., Cooper, F. S., Flanagan, J. L., McKnight, J. G., Stockham, T. G., and Weiss,
M. R. Report on a Technical Investigation Conducted for the U.S. District Court
for the District of Columbia by the Advisory Panel on White House Tapes. U.S.
Government Printing Office, Washington, DC, 1974.
Bovik, A., ed. Handbook of Image and Video Processing. Academic, San Diego, California,
2000.
Bracewell, R. N. Fourier Transform and Its Applications. 3rd ed., McGraw-Hill, Boston,
2000.
Committee on Evaluation of Sound Spectrograms, National Research Council, National Academy
of Sciences. On the Theory and Practice of Voice Identification. National
Academy of Sciences, Washington, DC, 1979.
Davis, G. M., ed. Noise Reduction in Speech Applications. CRC, Boca Raton, Florida,
2002.
Harris, C. M. Handbook of Acoustical Measurements and Noise Control. 3rd ed., McGraw-Hill,
New York, 1991.
Hodges, P. Introduction to Video Measurement. 2nd ed., Focal Press, Oxford, England,
2001.
Koenig, B. E. Selected topics in forensic voice identification, Crime Laboratory Digest
(1993)
Koenig, B. E. Authentication of forensic audio recordings, Journal of the Audio Engineering
Society (1990) 38(1-2):3-33.
Koenig, B. E. Enhancement of forensic audio recordings, Journal of the Audio Engineering
Society (1988) 36(11):884-894.
Koenig, B. E. Making effective forensic audio tape recordings, FBI Law Enforcement
Bulletin (1987) 56(5):10-18.
Koenig, B. E. Spectrographic voice identification, Crime Laboratory Digest (1986)
Koenig, B. E., Hoffman, S. M., Nakasone, H., and Beck, S. D. Signal convolution of recorded
free-field gunshot sounds, Journal of the Audio Engineering Society (1998)
IAI Voice Identification and Acoustic Analysis Subcommittee. Voice comparison standards,
Journal of Forensic Identification (1991) 41(5):373-392.
Luther, A. and Inglis, A. Video Engineering. 3rd ed., McGraw-Hill, New York, 1999.
Nakasone, H. and Beck, S. D. Forensic automatic speaker recognition, Speaker Recognition
Workshop: 2001 A Speaker Odyssey (2001).
Pohlmann, K. C. Principles of Digital Audio. 4th ed., McGraw-Hill, New York, 2000.
Rappaport, T. S. Wireless Communications: Principles and Practices. IEEE, New York,
1996.
Reeve, W. D. Subscriber Loop Signaling and Transmission Handbook: Digital. IEEE,
New York, 1995.
Reeve, W. D. Subscriber Loop Signaling and Transmission Handbook: Analog. IEEE,
New York, 1992.
Reynolds, D. A., Quatieri, T. F., and Dunn, R. B. Speaker verification using adapted Gaussian
mixture models, Digital Signal Processing (2000) 10:19-41.
Russ, J. C. Image Processing Handbook. 4th ed., CRC, Boca Raton, Florida, 2002.
Russ, J. C. Forensic Uses of Digital Imaging. CRC, Boca Raton, Florida, 2001.
Salter, C. M. Associates. Acoustics: Architecture, Engineering, the Environment.
William Stout, San Francisco, California, 1998.
Tosi, O. Voice Identification: Theory and Legal Applications. University Park,
Baltimore, Maryland, 1979.
Watkinson, J. Art of Digital Audio. 3rd ed., Focal, Oxford, England, 2001.
Watkinson, J. Art of Digital Video. 3rd ed., Focal, Oxford, England, 2000.