
Thursday, September 12, 2013

Just Because it's Sound Doesn't Mean it has to be Mixed

Mixing is like driving—everybody does it, it gets you from here to there, and it seems like it’s been part of the culture forever.

For recording or broadcast requirements with a limited channel count, a stereo or mono mix will usually fit the bill, but for live events, perhaps we can do better.

As a case in point, consider a talker at a lectern in a large meeting room. Conventional practice would dictate routing the talker’s microphone to two loudspeakers at the front of the room via the left and right masters, and then feeding the signal with appropriate delays to additional loudspeakers throughout the audience area. A mono mix with the lectern midway between the loudspeakers will allow people sitting on or near the center line of the room to localize the talker more or less correctly by creating a phantom center image, but for everyone else, the talker will be localized incorrectly toward the front-of-house loudspeaker nearest them.

In contrast to a left-right loudspeaker system, natural sound in space does not take two paths to each of our ears. Discounting early reflections, which are not perceived as discrete sound sources, direct sound naturally takes only a single path to each ear. A bird singing in a tree, a speaking voice, a car driving past—all these sounds emanate from single sources. It is the localization of these single sources amid innumerable other individually localized sounds, each taking a single path to each of our two ears, that makes up the three-dimensional sound field in which we live. All the sounds we hear naturally, a complex series of pressure waves, are essentially “mixed” in the air acoustically with their individual localization cues intact.

Our binaural hearing mechanism employs inter-aural differences in the time-of-arrival and intensity of different sounds to localize them in three-dimensional space—left-right, front-back, up-down. This is something we’ve been doing automatically since birth, and it leaves no confusion about who is speaking or singing; the eyes easily follow the ears. By presenting us with direct sound from two points in space via two paths to each ear, however, conventional L-R sound reinforcement techniques subvert these differential inter-aural localization cues.
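
To put a number on the time side of those cues, here is a minimal sketch (my own illustration, using the classic Woodworth spherical-head approximation and an assumed average head radius) showing that the interaural time difference tops out at well under a millisecond:

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average head radius
SPEED_OF_SOUND = 343.0    # m/s at room temperature

def itd_seconds(azimuth_deg):
    """Woodworth's spherical-head estimate of the interaural time difference
    for a distant source at the given azimuth (0 = straight ahead)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

print(round(itd_seconds(90) * 1e6))   # about 656 microseconds for a source directly to one side
```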

On this basis, we could take an alternative approach in our meeting room and feed the talker’s mic signal to a single nearby loudspeaker, perhaps one built into the front of the lectern, thus permitting pinpoint localization of the source. A number of loudspeakers with fairly narrow horizontal dispersion, hung over the audience area and in line with the direct sound so that each covers a fairly small portion of the audience, will subtly reinforce the direct sound as long as each loudspeaker is individually delayed so that its output is indistinguishable from early reflections in the target seats.

Such a system can achieve up to 8 dB of gain throughout the audience without the delay loudspeakers being perceived as discrete sources of sound, thanks to the well-known Haas, or precedence, effect. A talker or singer with strong vocal projection may not even need a single “anchor” loudspeaker at the front at all.
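
To make the delay arithmetic concrete, here is a minimal sketch of how each distributed loudspeaker's delay might be derived. The speed of sound, the 10 ms offset and the example distances are my own assumptions, not a prescription from any particular system:

```python
SPEED_OF_SOUND = 343.0   # m/s at room temperature (assumption)

def speaker_delay_ms(source_to_seat_m, speaker_to_seat_m, haas_offset_ms=10.0):
    """Delay for a distributed loudspeaker so that its output reaches the target
    seats shortly AFTER the direct sound, keeping localization on the talker."""
    direct_ms = source_to_seat_m / SPEED_OF_SOUND * 1000.0
    speaker_ms = speaker_to_seat_m / SPEED_OF_SOUND * 1000.0
    # Electronic delay makes up the path difference, plus a few milliseconds so the
    # reinforcement falls inside the precedence (Haas) window at those seats.
    return max(0.0, direct_ms - speaker_ms + haas_offset_ms)

# Example: talker 20 m from the seat, overhead delay loudspeaker 4 m from the seat
print(round(speaker_delay_ms(20.0, 4.0), 1))   # about 56.6 ms
```

The small offset keeps each loudspeaker's contribution inside the precedence window at its target seats, so localization stays with the talker even though much of the acoustic energy is arriving from overhead.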

As an added benefit of achieving intelligibility at a more natural level, the audience will tend to be unaware that a sound system is in operation at all, an important step toward the elusive system design goal of transparency: people simply hear the talker clearly and intelligibly at a more or less normal level. This approach, which has been dubbed “source-oriented reinforcement,” keeps the sound system from acting as a barrier between performer and audience, because it merely replicates what happens naturally and does not disembody the voice by stripping away its localization cues.

Traditional amplitude-based panning, which, as noted above, works only for those seated in the sweet spot along the center axis of the venue, is replaced in this approach by time-based localization, which has been shown to work for better than 90 per cent of the audience, no matter where they are seated. Free from constraints related to phasing and comb-filtering that are imposed by a requirement for mono-compatibility or potential down-mixing—and that are largely irrelevant to live sound reinforcement—operators are empowered to manipulate delays to achieve pinpoint localization of each performer for virtually every seat in the house.

Source-oriented reinforcement has been used successfully by a growing number of theatre sound designers, event producers and even DJs over the past 15 years or so, and this is where a large matrix comes into its own. Happily, many of today’s live sound boards are suitably equipped, with delay and EQ on the matrix outputs.

The situation becomes more complex when there is more than one talker, a wandering preacher, or a stage full of actors, but fortunately, such cases can be readily addressed as long as correct delays are established from each source zone to each and every loudspeaker on a one-to-one basis.

This requires more than a console matrix offering level control and output delays, or even variable input delays assigned to individual mics; it calls for a true delay matrix that provides an independent time alignment between each individual source zone and every loudspeaker in the distributed system.
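
As an illustration of the idea only (a toy model in Python, not any manufacturer's implementation), a true delay matrix amounts to an independent gain and delay at every source-to-loudspeaker crosspoint:

```python
import numpy as np

class DelayMatrix:
    """Toy level/delay matrix: every (source zone, loudspeaker) crosspoint carries
    its own gain and its own delay, so each zone can be time-aligned independently
    to every loudspeaker. Illustration only."""

    def __init__(self, n_inputs, n_outputs, sample_rate=48000):
        self.sr = sample_rate
        self.gain = np.zeros((n_inputs, n_outputs))               # linear gain per crosspoint
        self.delay = np.zeros((n_inputs, n_outputs), dtype=int)   # delay in samples

    def set_crosspoint(self, src, spk, gain, delay_ms):
        self.gain[src, spk] = gain
        self.delay[src, spk] = int(round(delay_ms * self.sr / 1000.0))

    def render(self, inputs):
        """inputs: (n_inputs, n_samples) array. Returns (n_outputs, n_samples)."""
        n_in, n_samp = inputs.shape
        out = np.zeros((self.gain.shape[1], n_samp))
        for src in range(n_in):
            for spk in range(self.gain.shape[1]):
                g, d = self.gain[src, spk], self.delay[src, spk]
                if g != 0.0 and d < n_samp:
                    out[spk, d:] += g * inputs[src, :n_samp - d]   # delayed, scaled copy
        return out

# Example: lectern zone (input 0) feeds an overhead delay loudspeaker (output 3)
# at half gain and 56.6 ms, while other zones get their own, different alignments.
m = DelayMatrix(n_inputs=16, n_outputs=16)
m.set_crosspoint(0, 3, gain=0.5, delay_ms=56.6)
```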

One such delay matrix that I have used successfully is the TiMax2 Soundhub, which offers control of both level and delay at each crosspoint, in matrices ranging from 16 x 16 up to 64 x 64, allowing unique image definitions to be created anywhere on the stage or field of play.

The Soundhub is easily added to a house system via analog, AES digital, and any of the various audio networks currently available, with the matrix typically being fed by input-channel direct outputs, or by a combination of console sends and/or output groups, as is the practice of the Royal Shakespeare Company, among others.

A familiar-looking software interface allows for easy programming as well as real-time level control and 8-band parametric EQ on the outputs. A PanSpace graphical object-based pan programming screen allows the operator to drag input icons around a set of image definitions superimposed onto a jpg of the stage, a novel and intuitive way of localizing performers or manually panning sound effects.


 
The TiMax PanSpace graphical object-based pan programming screen


For complex productions involving up to 24 performers, designers can add the TiMax Tracker, a radar-based performer-tracking system that interpolates softly between image definitions as performers move around the stage, thus affording a degree of automation that is otherwise unattainable.

Where very high SPLs are not required, reinforcement of live events may best be achieved not by mixing voices and other sounds together, but by distributing them throughout the house with the location cues that maintain their separateness, which is, after all, a fundamental contributor to intelligibility, as anyone familiar with the “cocktail party” effect will attest.

As veteran West End sound designer Gareth Fry says, “I’m quite sure that in the coming years, source-oriented reinforcement will be the most common way to do vocal reinforcement in drama.”

While mixing a large number of individual audio signals together into a few channels may be a very real requirement for radio, television, cinema, and other channel-restricted media such as consumer audio playback systems, this is certainly not the case for corporate events, houses of worship, theatre and similar staged entertainment.

It may sound like heresy, but just because it’s sound doesn’t mean it has to be mixed. With the proliferation of matrix consoles, adequate DSP, and sound design devices such as the TiMax2 Soundhub and TiMax Tracker available to the sound system designer, mixing is no longer the only way to work with live sound—let alone the best way for every occasion.

Saturday, April 13, 2013

The Passing of Online

The essential distinction between offline and online is that an offline process is one of construction; an online process, one of execution. In media production, online usually follows offline, as in the case of video editing, where a product that has been laboriously constructed in an offline edit suite—perhaps over the course of days or weeks—is executed by machinery following an edit decision list (EDL) in minutes or hours in an online suite.

Since the hourly rate of a well-appointed online suite is typically an order of magnitude or more higher than that of a small offline studio—often equipped with not much more than a desktop computer running editing software—the distinction between online and offline has long been etched into the steely heart of many a production manager.

Applying this distinction to the field of music, you might say that playing an instrument is generally an online process, and requires the talent to perform. Constructing a musical performance using MIDI step input, for example, is an offline process, and requires a different skill set.

Before Bing Crosby teamed up with Jack Mullin back in 1947 and seized on the potential for splicing tape offline to construct complete recorded performances, recording musicians had to execute a complete work flawlessly to the end while it was being recorded direct to phonograph disc—an online process. If they made a mistake, they had to go back to the beginning, scrap the disc, and start all over again.

Likewise, dialing a phone on a traditional land line is an online process. If you realize you’ve made a mistake, you have to abort—hang up—and begin again. Dialing a cell phone, on the other hand, is an offline process. You compose the number and, if you make a mistake, you go back a step and delete the wrong input—edit it out—and input the right number. When the entire telephone number has been constructed to your liking, you go online—literally, hit the green online button—and the call is executed by the service provider.

The ability to edit is what distinguishes offline from online processes.

Sound mixing for film used to be mostly an online activity. It was common practice in the early decades of film sound for an entire 10-minute reel to be mixed in a single pass, following one or more rehearsals. When pick-up and record electronics for film dubbers made punching in possible, the two- or three-person re-recording team could at last go back and fix a flawed portion of a mix—usually refining their console settings while listening to the sound backwards as the dubbers rewound in real time—without causing undue delay and excessive cost to the production.

Mix automation changed all that, from the introduction of console automation systems in the 1970s to today’s digital audio workstations featuring the ability to graph not just volume and mute, but just about every conceivable control parameter. Automation has allowed the offline construction of mixes to become standard operating procedure, with the mix being subsequently executed online in a single record pass or internal bounce-to-disk.

Now this has all changed again with the introduction of offline bounce in Pro Tools 11, which enables freezing a mix—that is, rendering the final mix up to 150 times faster than real time, according to Avid.

A mix need never be onlined at all: it can be rendered into a single final file without ever being played through, leaving real-time playback only for quality control checking and approval after the fact.
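
In essence, an offline bounce is nothing more than the same arithmetic the DAW would perform in real time, run as fast as the processor allows. A minimal sketch of the idea, assuming the track audio and the expanded automation data already exist as arrays (the function and its arguments are hypothetical, not Avid's API):

```python
import numpy as np

def render_offline(tracks, gain_envelopes):
    """Sum a set of tracks into a mono mix, applying per-track gain automation.
    tracks and gain_envelopes are lists of equal-length 1-D float arrays; each
    envelope is the automation graph, already expanded to one value per sample."""
    mix = np.zeros_like(tracks[0])
    for audio, gain in zip(tracks, gain_envelopes):
        mix += audio * gain        # automation applied sample by sample
    return mix                     # written to disk afterwards; nothing is ever played
```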

The notion of online vs. offline, once so central to the production process and necessitating the development of the all-important EDL, is in the process of being relegated to the status of a quaint curiosity, a byway in the development of modern studio practices and procedures. It will soon be forgotten, along with such other bygone realities as the daily tape recorder alignment ritual, analog noise reduction devices, and uniformed gas station attendants.

It brings to mind the day that I finally sold my once invincible Synclavier and 16-track Direct-to-Disk recorder—to a couple of vintage synth collectors, no less. The only things I hung onto were two blank rack panels and an AC power bar. Some things, at least, are irreplaceable. 

Saturday, March 23, 2013

To Mix or Not to Mix? That is the Question

In the Monty Python film, The Meaning of Life, there is an unforgettable scene in an upscale French restaurant featuring this exchange between John Cleese’s fawning waiter and Terry Jones’ more-than-morbidly obese patron, Mr. Creosote:

“Today we have for appetizers moules marinières, pâté de foie gras, beluga caviar, eggs benedict, tarte de poireaux—that’s leek tart—frogs’ legs amandine, or oeufs de caille—little quails’ eggs on a bed of puréed mushrooms. It’s very delicate, very subtle.”

“I’ll have the lot,” replies Mr. Creosote.

“A wise choice, monsieur. And now, how would you like it served—all mixed up together in a bucket?”

“Yeah . . . with the eggs on top.”

While the humour in the scene is partly visual, the Pythons’ unique stamp of taking things to the brink of the ridiculous, and then vaulting over it, contrasts the list of individual, highly refined dishes on the menu—representing the pinnacle of classic French cuisine—with the way these “very delicate, very subtle” elements are offered to the patron in a gross, vulgar, and repulsive manner, “all mixed up together in a bucket.”

Of course, no one would willingly order a meal this way, much less be served in this fashion by a trained professional. Yet that is much the way sound is often presented to theatre patrons: all mixed up together, with the eggs—or rather, the voices—on top. Occasionally delicate, not often subtle.

What is at issue here is the very notion of mixing, of combining disparate elements into a single channel (center cluster), two channels (L, R), a combination of these (L, C, R) or perhaps even on very rare occasions, a surround mix of four or five channels.

While mixing a large number of individual audio signals together into a few channels may be a very real requirement for the limited channel count of broadcast radio and television, as well as channel-restricted media such as consumer audio playback systems, this is certainly not the case for theatre and other staged entertainment. Until recently, however, theatrical and similar live events have largely been mixed in much the same way as broadcasts and recorded music.

This may be attributed in part to the large overlap in the designs of traditional recording, broadcast and live consoles. Schools teaching audio (i.e., “recording schools”) continue to focus on the art and techniques of the mixdown, and even one of the audio industry’s leading magazines proudly heralds the practice in its name, Mix. Originating in broadcast and recording sessions involving multiple microphones, and refined in multitrack recording studios producing mono or stereo masters, mixing has become entrenched in the industry and in the minds of many who dream of working in it, to the point where it’s almost as if no other way of working with sound is even remotely conceivable.

A great many shows are presented as if the audience were listening to a gargantuan stereo system, with massive line arrays hung to the left and right of the stage. Now this might not be inappropriate for a touring band well known from its recordings or for a big, dynamic rock musical where the design calls for a larger-than-life aspect.

Even so, many of the blockbuster musicals from the past quarter century benefited greatly from the creativity of such esteemed sound designers as Olivier Award winner Mick Potter, who, in the quest for more natural sound, have opted for separate vocal and orchestra mixes, striving simultaneously for clarity in the voices and power in the orchestra. Moreover, two voice mixes are sometimes derived, with one going to a duplicate set of loudspeakers in an A-B configuration pioneered in 1988 by Martin Levan for Aspects of Love, to eliminate electrical summing of mic signals and the ensuing phase problems that arise when performers are in close proximity to each other’s microphones.

For other production styles, however, an approach based on mixing may not be the most appropriate technique for conveying the nuances of theatre—including musical theatre, where sound systems have become ubiquitous—if the purpose of sound reinforcement is to allow every performer’s voice to be heard as it would sound, unamplified, from an optimum seat.

Wednesday, February 29, 2012

Leap year and drop-frame time code are conceptually the same

For those in the media production industries, February 29th is a good day to revisit drop-frame SMPTE time code, because both leap year and drop-frame time code came into being for the sole purpose of reconciling two different time bases on which we do things with mundane regularity.

Take the calendar first: our calendar simply charts the sequence of the individual days that comprise a single year. The day is based, of course, on a single rotation of the earth on its axis, whereas the year is based on a single revolution of the earth around the sun. Rotation and revolution are the two different time bases on which our calendar is constructed.

Since it takes about 365.25 days for the earth to revolve around the sun, we collect four of those quarter days and add them together into a single day—February 29—that appears on the calendar once every four years.

We do this because there's no such thing as a quarter-day: you couldn't start a New Year at 6:00 a.m. After all, a day is a day and cannot be partitioned like that. It's an integer.

It's important to see that the concept of the yearly calendar comprises 366 days—February 29 is not imaginary. But rather than adding it every four years, what we are really doing is dropping it from the calendar in every year that is not a multiple of four: if the year is not divisible by 4, we drop February 29 from our count of days in that year.

It's exactly the same with drop-frame time code, where frames are analogous to days, and hours to years. A video frame is a whole thing, an integer, and we count 30 of them in one second. But the rate at which they proceed is a bit less than 30 per second, more like 29.97 frames per second.

This is the same sort of fractional discrepancy that exists in the annual rate of 365.25 days per year.

We deal with it the same way, by dropping 2 frames from the count at the very beginning of every minute that is not a multiple of 10. In that first second, there are only 28 frames.

So frames 00 and 01 simply do not exist at the beginning of every minute of time code that doesn't have a zero at the end of it (10, 20, 30, 40, 50, and 00 minutes being the exceptions), just as February 29 does not exist in any year that can't be divided by 4. It's as simple as that.
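
For anyone who wants to see the counting rule in code, here is a minimal Python sketch of the label arithmetic. Frame labels are skipped, never actual frames, and real SMPTE implementations differ in detail but not in principle:

```python
FPS = 30            # nominal NTSC frame count per second
DROP = 2            # frame labels skipped at the start of most minutes

def frames_to_dropframe(frame_count):
    """Convert a running frame count into a drop-frame label, HH:MM:SS;FF."""
    per_min = FPS * 60 - DROP               # 1798 labels in a "dropped" minute
    per_10min = 10 * FPS * 60 - 9 * DROP    # 17982 labels per ten minutes

    tens, rem = divmod(frame_count, per_10min)
    # The first minute of each ten-minute block keeps all 1800 labels;
    # the other nine minutes each skip labels ;00 and ;01.
    extra = 0 if rem < FPS * 60 else 1 + (rem - FPS * 60) // per_min
    frame_count += DROP * (tens * 9 + extra)    # re-insert the skipped labels

    ff = frame_count % FPS
    ss = (frame_count // FPS) % 60
    mm = (frame_count // (FPS * 60)) % 60
    hh = frame_count // (FPS * 3600)
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

print(frames_to_dropframe(1800))     # 00:01:00;02 -- labels ;00 and ;01 are skipped
print(frames_to_dropframe(107892))   # 01:00:00;00 -- one hour comes out exactly right
```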

Why go to the bother of doing this? For the calendar, it's long been considered important that the seasons start at roughly the same time every year: if we didn't have February 29 as a corrective, then the beginning of Spring, for example, would drift steadily later in the calendar, from March into April, May, June, and so on, as the years rolled by.

For producers, it's important that the time displayed by your time code reader agrees with the real-time clock on the control room wall. Without drop-frame time code, a one-hour program as measured by your time code would actually run 3 seconds and 18 frames too long, and that would wreak havoc with broadcast schedules.
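
The arithmetic behind that figure is easy to check:

```python
NTSC_FPS = 30000 / 1001                 # exact NTSC frame rate, about 29.97 fps

labelled_frames = 30 * 60 * 60          # one hour of non-drop time code = 108,000 labels
elapsed = labelled_frames / NTSC_FPS    # 3603.6 seconds of real time
print(elapsed - 3600)                   # 3.6 s overrun, i.e. 3 seconds and ~18 frames
```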

Note that what we are NOT doing is cutting out frames from our program and leaving them on the cutting room floor, as some of my former students at the Toronto Film School used to believe. Those "dropped" frames are simply never there in the first place, just as February 29 will not "be there" in 2013, 2014, and 2015. The calendar works as "drop-day" code.

The takeaway from this blog entry is that if you can intuitively grasp the concept of leap year, then you've already got the essence of drop-frame time code. Conceptually, they are one and the same.

Tuesday, February 28, 2012

Seller Beware—When You're Being Shopped for a Price

I got an email last week asking how much I would charge to mix 5 songs for a band's EP. I wrote back asking whether I'd be recording the original tracks, or just mixing tracks that someone else has already recorded for the band. Both, came the reply, and how much would it cost?

I wrote back to ask about the band (number of players, what instruments, and so on) so that I could price out the job using the appropriate recording facility, and to ask for a couple of possible dates when the band wanted to start recording. This is an important question, because in sales, there's a strong relationship between price, availability and delivery. You can sometimes get a better deal on studio time that would otherwise remain unbooked.

At this point, the answers started to get vague. What was clear, however, was that I was being shopped for a price. In other words, the prospect (not yet a client) had no intention of coming to me for the job, but was only trying to get a handle on the price of a job that most likely he was bidding on himself.

This happens to everyone from time to time, and is one of the reasons why it's not a great idea just to shoot out a price in response to an inquiry. Every job is different in some way, and a big part of the sales process is asking questions to qualify the buyer.

Asking questions not only keeps valuable business intelligence—your pricing policies—out of the hands of your competition, it also saves you from tire kickers who would otherwise take up a lot of your time but never end up buying anything.

The 80-20 rule seems to apply here: 80% of your business comes from 20% of your prospects. Apply the rule again and, in turn, 80% of that business comes from 20% of those prospects. In other words, about 4% (20% of 20%) of your potential clients are responsible for roughly two-thirds (80% of 80%, or 64%) of your business.

Asking questions is your best line of defense here, and a genuine prospect will appreciate that you're drilling down in order to provide the best possible service.

Tuesday, October 11, 2011

Good Sound at the new Helzberg Hall in Kansas City

The $413 million Kauffman Center for the Performing Arts in Kansas City, MO, opened to delighted audiences on September 16 with performances by Placido Domingo, the Canadian Brass, and the Kansas City Symphony, among others. Designed by architect Moishe Safdie, the Kauffman Center houses the 1,600-seat Helzberg Hall, a terraced concert hall-in-the-round that is home to the Kansas City Symphony, and the 1,800-seat Muriel Kauffman Theatre that will serve as the performance home of the Kansas City Ballet and the Lyric Opera of Kansas City.

The two venues have been described as the yin and yang of the Kauffman experience—the exuberant Muriel Kauffman Theatre, with its proscenium and illuminated acrylic balcony fronts ringing the hall, stands in marked contrast to the sleek and ethereal oval-shaped Helzberg Hall, which some visitors have likened to the interior of a wooden ship, its warm, muted wood tones recalling the Walt Disney Concert Hall in Los Angeles.

Noting this resemblance in both form and material, critic Steve Paul wrote in The Kansas City Star, “One important connection between these two concert halls was the work of Yasuhisa Toyota of Nagata Acoustics, whose choice of shapes, wood and physical components was paramount in creating the aural experience.”

Toronto consultants Engineering Harmonics worked with Nagata Acoustics in both Kansas City and Los Angeles, designing performance sound systems to integrate seamlessly with the natural acoustics. Judging from the critical acclaim that followed last month’s inaugural performances in Helzberg Hall, the result is a resounding success. The amplified sound is “ambient and natural-sounding,” wrote David Mermelstein in Musical America.

Paul added, “Insiders will argue whether Helzberg exceeds even Disney, a slightly larger hall, though time—plus word of mouth in the music community—will tell.”

“This was our second foray into the design of a sound system in a terraced hall with Nagata Acoustics,” noted Engineering Harmonics president Philip Giddings. “In Kansas City, we further developed and refined our approach to this type of venue, and we are more than encouraged by the response of performers, audiences and critics alike,” he said.

Monday, July 11, 2011

If your glass is more beautiful than the wine, change the wine

So says noted wine writer Tony Aspler.

Owners of smaller home and project studios who are tempted to hire top-notch professional recording engineers to help ramp up their business run the risk of seeing clients follow those engineers to better studios.

I've seen this happen time and time again. Home and project studio owners need to understand that this is almost inevitable once their reach starts to exceed their grasp and they want to compete with the big boys.

It may be prudent for smaller studio owners to consider a significant upgrade of their rooms and equipment before enlisting the services of established outside recording engineers.

It won't be the engineers' fault if clients seek to follow them to greener pastures.

Tuesday, October 26, 2010

How much headroom do you need when recording? Part 1

Some years back, I recorded a Stravinsky symphonic work to analog tape running at 15 ips half-track, dbx type 1. When I transferred the recording to digital for archiving, the big orchestral bass drum gave me problems. Had I been recording to digital on location, it would have been a mess: I would have needed to set 0 VU at -25 dBFS to keep the digital meter out of the red. That's how much energy the bass drum was putting out. 

Standards organizations specify how much headroom should be available above operating level (0 VU) in the digital domain. In my experience, the European EBU standard of 0 VU = -18 dBFS does not afford nearly enough headroom, and neither does the North American SMPTE standard of -20 dBFS. Granted, these were "reasonable" compromises in the days of 16-bit technology, when 93 dB dynamic range was about all you could expect to get in the real world (as opposed to the theoretical 96 dB, calculated at 6 dB per bit).

Analog headroom of 24 dB—which many manufacturers of professional grade equipment achieve with maximum output levels of +28 dBu (ref 0 VU = +4 dBu)—should be considered the minimum standard during production. Even then, the Stravinsky would have been into overload by about 1 dB, so you might occasionally require even more headroom. 


In our current 24-bit world, I would say that it's not unreasonable to demand 28 dB headroom when recording wide dynamic range program material, such as symphonic works. It still gives you a working signal-to-noise ratio of better than 100 dB, and 28 dB of headroom includes a small comfort margin so you can enjoy the program without stressing over the levels during recording, knowing that you will most likely never go into the red.
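
As a back-of-the-envelope check, using the same 6-dB-per-bit rule of thumb mentioned above (theoretical figures only; real converters fall a few dB short):

```python
def working_snr_db(bit_depth, headroom_db):
    """Theoretical dynamic range at roughly 6 dB per bit, minus the headroom
    reserved above operating level (0 VU). Rough figures only."""
    return 6.02 * bit_depth - headroom_db

print(round(working_snr_db(16, 20)))   # ~76 dB with SMPTE-style 0 VU = -20 dBFS
print(round(working_snr_db(24, 28)))   # ~116 dB: 28 dB of headroom still leaves well over 100 dB
```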

Incidentally, I see a lot of "analog channels" being marketed as quality front-end processing for recording into Pro Tools and other DAWs. Many of these boxes include a compressor after the mic preamp, probably because few recordists stop to consider how much analog headroom is really needed in a given situation. Instead of backing the level off to allow for enough headroom without compression, they tend to run the Pro Tools meters high up into the yellow, recording with compression on individual tracks at 24-bit resolution. It's as if a little bit of green flickering at the low end of the meter must be avoided at all costs. This is foolishness.


24-bit technology allows you to record at a moderate level with 28 dB of headroom and still accumulate no perceptible noise in the recording. Save the compression for mixing and mastering, when it becomes a creative tool rather than a protective device. And then, whatever else you do, don't normalize! Oversampling digital-to-analog converters, which are pretty much the norm these days, routinely create signal peaks greater than 0 dBFS between samples that measure 0 dBFS on disc. But that's another subject for another time.

Wednesday, September 22, 2010

Microphone cable wiring 101: connecting the ground lug—or not

Trying to repair a broken microphone cable the other day, I noticed that pin 1 of the XLR connector was connected to the ground lug with a small jumper wire, thereby bonding the cable shield to the connector shell. Here's what it looked like:


The cable was a cheapie from Active Surplus, the store on Queen St. West in Toronto with the stuffed gorilla by the door. From the quality of the components and the crummy soldering job, it's a good illustration of you-get-what-you-pay-for, so please consider this blog entry my penance for buying it. I was caught in a weak moment.

By way of contrast, here are quality connectors from Switchcraft (top) and Neutrik, with the ground lugs identified by arrows:


Should the ground lug be connected to pin 1, as in the top illustration, or not? I've read opinions pro and con over the years, so I decided to ask an acknowledged expert in the field, Neil Muncy.

Before I get to his answer, you should know that Neil, a Fellow and Life Member of the Audio Engineering Society, is the author of the ground-breaking 1994 AES paper, "Noise Susceptibility in Analog and Digital Signal Processing Systems," in which he explored the relationship between the physical construction of shielded twisted-pair cable and induced noise in a signal circuit due to cable shield current. This paper was published, along with others by authors including Philip Giddings of Toronto's Engineering Harmonics, in the June 1995 issue of the Journal of the Audio Engineering Society, which has become the most widely accessed issue of the Journal in history.

When he wrote his paper, most commercially available audio gear had pin-1 problems. It was, indeed, difficult to find equipment without them—even the most highly revered consoles had serious pin-1 problems. Since then, a number of leading manufacturers have redesigned their products to correct the mistake, but unfortunately, many have not yet done so.

Neil Muncy is also a member of the task group that developed the standard AES48-2005, "AES Standard on Interconnections—Grounding and EMC practices—Shields of Connectors in Audio Equipment Containing Active Circuitry," the published standard that deals with the pin-1 problem.

Based on all this, I figured Neil should know how to wire up a microphone cable. In close to 30 years, he hasn't failed me yet. Here's his answer to the question, Under what circumstances do you solder the ground lug (aka pin 4) to pin 1?

"The short, long, and infinitely long answers are NEVER, NEVER, & NEVER. To do so would introduce ground loops which could totally compromise an otherwise working Isolated Ground (I.G.) installation, and raise Hell with any front-end equipment that is plagued with Pin-1 Problems. Terminal #4 was introduced by Switchcraft back in the late '50's/'60's to address an application in very high impedance medical interfaces. It has no use whatsoever as far as portable A/V cables are concerned. They are simply extension cords."

There you have it. But if the lug shouldn't be used in general audio applications, why is it still there? Wouldn't it be advantageous for manufacturers such as Switchcraft and Neutrik to produce a line of connectors without the ground lug for normal stage and studio applications that have nothing to do with medical interfaces? I'd like to hear your comments.