Recording auditory stimuli

Basic principles

In most cases you want each sound file to be acoustically similar: the same loudness, same level of reverberation, same pronunciation, and so on. Doing so reduces the opportunity for acoustic attributes to confound your stimuli.
Stimuli should exploit the dynamic range of the recording system (and the playback system). That is, they should not be too soft (requiring the volume to be turned up very high) or too loud (which in extreme situations results in "clipping"—the sound intensity can't be accurately represented in the file).
Stimuli should contain a minimum of silence. The specifics depend on the stimulus set, but in general it is best to avoid silent periods (for example, at the beginning and end of a sound file). Avoiding these silent periods makes it easier to calculate the start time and duration of each stimulus, and keeps silence from skewing loudness (i.e., RMS) calculations.
Listen, listen, listen. Be picky about the sound quality and pronunciation, and ask others for their input. Typically once you have a set of recorded stimuli they stick around for a while, so it pays to make sure they are of high quality.
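
The points above about loudness, clipping, and silence can also be checked numerically. What follows is a minimal Python sketch rather than part of our own workflow. It assumes the samples have already been loaded as floats scaled to the range -1.0 to 1.0. The function names and the thresholds (0.999 for clipping, 0.01 for silence) are illustrative assumptions that would need tuning for a real stimulus set.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of samples scaled to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dB re full scale."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def is_clipped(samples, ceiling=0.999):
    """True if any sample reaches the (assumed) clipping ceiling."""
    return any(abs(s) >= ceiling for s in samples)

def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples at or below the (assumed) threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]

def scale_to_rms(samples, target_rms):
    """Apply one gain so the samples have the target RMS; refuse to clip."""
    current = rms(samples)
    if current == 0:
        raise ValueError("cannot scale a silent signal")
    gain = target_rms / current
    scaled = [s * gain for s in samples]
    if max(abs(s) for s in scaled) > 1.0:
        raise ValueError("target RMS would clip this stimulus")
    return scaled

# Toy signal: the same four samples with and without leading silence.
tone = [0.5, -0.5, 0.5, -0.5]
padded = [0.0] * 4 + tone

print(rms(tone))                   # 0.5
print(rms(padded))                 # about 0.354: the silence lowers the RMS
print(rms(trim_silence(padded)))   # 0.5 again once the silence is removed
```

The toy example shows why silent periods matter for loudness matching: the padded and unpadded signals contain the same sound, but their RMS values differ until the silence is trimmed.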

Equipment

A typical setup includes:

A microphone. It need not be an expensive microphone, but take care, as different microphones make use of different types of processing. Inexpensive plug-and-play microphones commonly sold to enable computer calling may compress the dynamic range more than you want. We currently use an Audio Technica AT2035 (but have had decent results with other, less expensive models as well).
For many microphones, a windscreen or pop filter. Our AT2035 has a pop filter attached to its mic stand, which minimizes peaking and sound artifacts caused by sounds such as /s/ and /p/.
A computer with a sound recording program (we use Audacity).

Alternatively, you may use a DAT or mini-audio recorder. Make sure that any digital recorder is able to save the file to a lossless format; the popular .wav format is perfectly acceptable. (MP3 files use lossy compression, which discards audio information to reduce file size, and are not ideal for experimental materials.)
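
One way to confirm what a recorder or program actually saved is to inspect the file header. The sketch below uses Python's standard-library wave module, which reads uncompressed PCM .wav files (it raises wave.Error on other encodings). The describe_wav function name and the demo file are assumptions for illustration only.

```python
import os
import tempfile
import wave

def describe_wav(path):
    """Return the basic format of a PCM .wav file as a dictionary."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }

# Demo: write one second of 16-bit mono silence to a temporary file,
# then read its format back.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)       # 2 bytes per sample = 16-bit
    out.setframerate(44100)
    out.writeframes(b"\x00\x00" * 44100)

info = describe_wav(demo_path)
print(info)
```

Running the same check over every file in a stimulus set is a quick way to catch a file recorded at a different sample rate or bit depth from the rest.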

Specific advice

Provide the speaker with an easy-to-read written copy of the stimuli. Frequently this is best done by printing them out (in an easy-to-read font!), but you might also have luck presenting them on a screen. Having an easy-to-read copy to read from will cut down on errors during recording. (Just be careful to avoid shuffling papers while recording!)
Record in a quiet room. This can be surprisingly difficult advice to follow—even acoustically-isolated sound booths can have noise from HVAC or computers (e.g., the computer fan). Moving the recording computer outside the sound booth is often helpful.
Make sure your speaker has good posture, and keeps a consistent posture throughout recording. Posture affects speech production and vocal quality (not to mention distance from the microphone).
As a related point, have the person speaking (the “speaker”) stay still. A speaker's position can make a huge difference. Depending on how directional your microphone is, the angle of a speaker's mouth and the distance from the microphone will impact how much "mouth noise" is picked up, how much room reverberation, and overall loudness level. Keeping the speaker's mouth in a constant position relative to the microphone is one of the easiest ways to improve the consistency of recordings.
Record all stimuli on the same day. Although not always possible, this can help with consistency. From one day to another a speaker's voice will change (as a function of energy, mood, hydration, etc.). The closer together in time you can record a speaker's stimuli, the better.
Have at least one non-speaker present to act as producer and/or recording engineer. It is very helpful to be able to notice oddities and suggest a re-take of an example in the moment, rather than coming back to it later. A producer can also monitor things like the speech rate and intonation of a speaker.
Record at least two examples of each stimulus. Recording multiple examples in one session, and then selecting an acceptable version afterwards, can save a lot of time and avoid having to schedule another recording session.
Divide up a recording session across multiple sound files. Dividing your session into ~10 minute chunks will give your speaker regular breaks, and ensure that if you encounter any technical difficulties you won’t lose an entire session of recording.
When judging sound file quality, always listen to stimuli over a good pair of headphones. The sound quality is very different over headphones and speakers (especially a computer's built-in speakers).

Thanks to Chad Rogers for help with this advice.

Jonathan Peelle
