Real-Time Audio
Hi-Fi, it Ain't
by David Lawrence
The hills are alive with the sound of packets, but it's certainly not music to an audiophile's ears.
In taking a look at the preeminent audio streaming technologies, the two clear leaders are Progressive Networks' RealAudio and Xing Technology's Streamworks. To write this article, I thought I had to build a wall between my other career and the two candidates. The computer-oriented radio talk show that I host is content on the RealAudio home page. I do the encoding of the show weekly, and I understand both the business model and the technical aspects of the server, client, and encoder. I know RealAudio inside and out. So even though the audio quality is poor, that put RealAudio at an advantage. I feared that upon hearing Streamworks, that would change.
It turns out I had nothing to fear. They both deliver truly awful audio.
Don't get me wrong. They both work, sort of, and both do what they claim to do, which is deliver real-time audio, and audio on demand, via the Net. They both strike fear into the hearts of ISPs who have limited bandwidth, and hence see their ports being blown out the first time some really cool audio thing happens on one of their servers.
Some similarities: Both RealAudio and Streamworks take existing audio files and compress them down to a mere shadow of their former selves, both in size and in fidelity. They both give away their player, or client software, for free, and they both charge similar prices for their server solutions. They're both cross-platform, they're easy to integrate into HTML documents, and the encoders are pretty boring to watch. Both are running after radio stations and record companies, preaching the gospel of Internet as a replacement for transmitters, and both are being accepted as state-of-the-art technology. Both are available now. The fact that neither sounds very good is apparently irrelevant.
The differences are just as plentiful. Progressive Networks is working from the ground up to improve RealAudio's 8-bit sound; Xing merely "dumbed down" the decent-sounding 24-bit audio and video stream they created for their ISDN-based clients, such as NBC and CNN. RealAudio's just coming to the table with real-time, as-it's-happening-you're-hearing-it capabilities; Xing has been doing that from the start. RealAudio requires that you record your sound and save the file as a common file format (.au, .aiff, etc.), then convert it using their studio. Xing's $2,500 black box does it on the fly; plug your audio source in one end and your server in the other, and you're popping out sound. RealAudio is in bed with ABC and NPR; Xing has alliances with CBS and NBC. RealAudio stops at a proprietary audio encoding and compression scheme. Xing uses the MPEG standard to encode low-, mid-, and high-fidelity sound files, and is scalable from plain old mono audio all the way up through full-motion video with CD-quality stereo audio.
So what does it sound like? Take a deep breath, and toss your high expectations off to the side for a moment. Both RealAudio and Streamworks, in their least obtrusive and lowest-fidelity incarnation, sound a bit like this: an old AM transistor radio, slightly off-tuned, but tuned to a local station.
How does it get served up? Let's take RealAudio as the example (it's very similar for Streamworks; we'll note any differences below). Recording your source into your PC using a standard audio tool should net you a 22 kHz, 8- or 16-bit .aiff or .au format sound file. Using the RealAudio Studio Encoder, you add text information (which is displayed by the player as the sound is streamed), and you let it digest the sound file. Speed depends on your chip, but on my PowerMac 7100/66, encoding takes about 0.8 times the length of the sound itself. That is, if I have an hour's worth of sound in a 16/22 .aiff (60 MB or so) file called foo.aif, it will take a little under 50 minutes to complete if it's the only application running. You'll be left with a file, foo.ram, that's quite small--only 3 MB per hour. Once you've created this RealAudio .ram file, you're not quite finished. You then need to create a "metafile" or pointer file to use in your anchor tag reference to the .ram file's location on the server.
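The arithmetic in that walkthrough is easy to check. Here is a rough Python sketch that uses only the figures quoted above: a one-hour source of about 60 MB, an encoder running at 0.8 times the length of the audio, and about 3 MB of output per hour. Treating 1 MB as 1,000,000 bytes is my assumption, and the function names are purely illustrative.

```python
SECONDS_PER_HOUR = 3600
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def encode_minutes(source_minutes, speed_factor=0.8):
    """Encoding time when the encoder needs speed_factor x the audio length."""
    return source_minutes * speed_factor


def stream_kbps(mb_per_hour):
    """Average bitrate (kilobits per second) of a file that grows mb_per_hour."""
    bits_per_hour = mb_per_hour * BYTES_PER_MB * 8
    return bits_per_hour / SECONDS_PER_HOUR / 1000


def compression_ratio(source_mb, encoded_mb):
    """How many times smaller the encoded file is than the source."""
    return source_mb / encoded_mb


if __name__ == "__main__":
    print(encode_minutes(60))         # minutes to encode one hour of audio
    print(round(stream_kbps(3), 1))   # average kbit/s of the encoded file
    print(compression_ratio(60, 3))   # source size divided by encoded size
```

By these numbers the encoded hour is about a twentieth the size of the source and averages under 7 kbit/s, which goes some way toward explaining the fidelity.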
This metafile invokes a new net service prefix on its URL, pnm: (Progressive Networks Metafile), which points to the actual .ram file. This is necessary because your browser would treat the foo.ram file as a whole entity when using the ftp: or http: protocol and simply transfer it to your hard drive. The metafile acts as the bridge between the RealAudio server and the listener's browser. When properly configured, your browser will allow packets of the sound to be played while the rest are still being sent from the server in a stream of data.
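To make the pointer-file idea concrete, here is a small Python sketch that builds the one-line body of a metafile and the anchor tag that links to it. The host names, paths, the foo.meta file name, and the pnm://host/path layout are all assumptions made for the example; the description above gives only the pnm: prefix, and a real server setup may differ.

```python
from html import escape
from pathlib import Path


def metafile_text(server_host, audio_path):
    """Return the one-line body of a pointer file: a pnm: URL to the audio."""
    # Assumed layout: pnm://<host>/<path>. Only the pnm: prefix is from the text.
    return f"pnm://{server_host}/{audio_path.lstrip('/')}\n"


def anchor_tag(metafile_url, label):
    """HTML anchor that sends the browser to the metafile, not the audio file."""
    return f'<a href="{escape(metafile_url, quote=True)}">{escape(label)}</a>'


def write_metafile(directory, name, server_host, audio_path):
    """Write the pointer file to disk and return its path."""
    target = Path(directory) / name
    target.write_text(metafile_text(server_host, audio_path), encoding="ascii")
    return target


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    print(metafile_text("audio.example.com", "/shows/foo.ram"), end="")
    print(anchor_tag("http://www.example.com/shows/foo.meta", "Listen to the show"))
```

The point of the indirection is that the Web page links to the tiny text file over http:, and only the URL inside it uses the streaming prefix.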
Of course, those who can't afford the RealAudio Server system are still going about the business of encoding audio and making the .ram files; one can still download the .ram file and then use the RealAudio player to play it locally.
When you play the sound through the browser, the stream of data starts tumbling in. The packets get buffered, much as they would be in a MiniDisc player, for 10 seconds, and then the glorious 8-bit, 8 kHz sound starts limping out of your speakers. There are major glitches: transposed audio content, weird telemetry-like undercurrents that evoke images of Marconi and DeForest, and fast-forward and rewind controls that, for the most part, work when they feel like it.
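Why buffer at all? A toy model helps. The Python sketch below is not how the RealAudio player is implemented; it is a minimal simulation, with made-up packet timings, in which playback waits until a set amount of audio has arrived and then plays one packet per tick. A packet that shows up after its turn counts as a glitch. Only the 10-second figure comes from the description above.

```python
def playback_start(arrivals, packet_seconds, prebuffer_seconds):
    """Arrival time of the packet that fills the prebuffer."""
    buffered = 0.0
    for t in arrivals:
        buffered += packet_seconds
        if buffered >= prebuffer_seconds:
            return t
    raise ValueError("prebuffer never filled")


def late_packets(arrivals, packet_seconds, prebuffer_seconds):
    """Indices of packets that arrive after the moment they are due to play."""
    start = playback_start(arrivals, packet_seconds, prebuffer_seconds)
    late = []
    for i, t in enumerate(arrivals):
        due = start + i * packet_seconds  # simple model: no re-buffering
        if t > due:
            late.append(i)
    return late


if __name__ == "__main__":
    # Hypothetical timings: one-second packets, with a 7-second stall
    # on the network before packet 15 arrives.
    arrivals = list(range(15)) + [22 + k for k in range(5)]
    print(late_packets(arrivals, 1.0, 1.0))   # almost no buffer
    print(late_packets(arrivals, 1.0, 10.0))  # 10-second buffer
```

With almost no buffer, every packet after the stall arrives too late to play; with a 10-second buffer, the same stall goes unnoticed. The price is the wait before the first sound comes out.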
Xing Technology's Streamworks follows a similar motif. You can actually go hunting for servers with the client's browsing features; once connected, the Net connection is stable at 8-bit, flaky at 24-bit. I had the opportunity to hear 24-bit Streamworks across a LAN, using the Beatles' "My Michelle" as a source. Not bad, but certainly not CD-quality, and certainly not maintainable across a dial-up connection.
All the negatives aside, once you click on a real-time audio link, and the sound starts coming, it's actually pretty heady to hear the sound of far-off content coming from your PC. With time, increased bandwidth, and a more stable information infrastructure, this would be neat. But not now.
The Impact on the 'Net
Long-term-oriented ISPs, even those with direct pipes to the backbone, are not exactly pleased with this development.
"Suppose one of our server customers puts up a stream of the local popular radio station, and everyone in a steel building that can't get the signal on their radio starts tuning in on their computers?" says Doug Humphries, President of Digital Express Group in Maryland. "We'd have breakage almost immediately." He cites the math involved: An 8-bit stream, even on a pipe that's 45 Mbits wide, means a few hundred listeners could cripple throughput.
Xing Technology's president, Howard Gordon, minimizes this issue. "We configure our server based on a maximum number of users, and turn people away after a certain number of simultaneous streams are running." RealAudio's servers behave in the same way, limiting the number of simultaneous users who can enjoy the fun.
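The "turn people away" behavior is simple to picture in code. The Python sketch below is not either vendor's implementation; the class and method names are hypothetical, and it shows only the basic idea of capping simultaneous streams.

```python
class StreamGate:
    """Admit listeners until max_streams are active, then refuse the rest."""

    def __init__(self, max_streams):
        if max_streams < 0:
            raise ValueError("max_streams must be non-negative")
        self.max_streams = max_streams
        self.active = set()

    def connect(self, listener_id):
        """Return True if the listener gets a stream, False if turned away."""
        if listener_id in self.active:
            return True  # already streaming; don't count them twice
        if len(self.active) >= self.max_streams:
            return False
        self.active.add(listener_id)
        return True

    def disconnect(self, listener_id):
        """Free the listener's slot so someone else can connect."""
        self.active.discard(listener_id)


if __name__ == "__main__":
    gate = StreamGate(max_streams=2)
    for listener in ("ann", "bob", "cy"):
        print(listener, gate.connect(listener))
    gate.disconnect("ann")
    print("cy", gate.connect("cy"))
```

A cap like this protects the pipe, but it does nothing for the listener who gets refused, which is the ISP's worry in the first place.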
The true winners in this technology will be, in the near term, the commercial online services. America Online, CompuServe, and Prodigy all have tens of thousands of ports that can be used simultaneously by dial-up users in a stable, reinforced, and mutually exclusive environment to receive streams of data. They are the best candidates to create and manage this type of technology effectively. America Online's Center Stage, the largest of the virtual Madison Square Gardens on the commercial service, can accommodate 50,000 simultaneous connections for what could easily be a real-time rock concert or Billy Graham revival meeting.
Conversely, even an ISP with hundreds of modems and dedicated servers can only hope to attract, at most, a few hundred listeners in each market (assuming they want to shut off all services from the rest of their dial-up customers). This is exciting to no one, and is in fact dangerous to their customer loyalty. No one wants to be told by tech support that they can't get to their e-mail tonight, because The Artist Formerly Known as Prince has decided to play a few notes to his Net friends.
What of the advertisers and the networks? Poor quality and the lack of critical mass are two big reasons why advertisers haven't been rushing headlong into this sector of a technology that usually ensures sponsorship. The networks, on the other hand, are excited--not from a profit-making basis, but from a cost-saving basis. ABC has publicly stated that it will eventually abandon the SEDAT satellite data format that has worked so well in favor of MPEG, the technology on which Streamworks is based. The engineering staff has the ears of the bean counters when they claim that they can deliver high-quality audio not across expensive satellites, but across the public information lines at a tiny fraction of the cost.
Who's the Winner?
I wish I could tie this up in a knot and hand it off to you with a clear direction. RealAudio has a head start in the marketing of its servers, but Streamworks is hitting the radio market hard with companies like Virginia-based Intervox, which is responsible for integrating all of EZ Communications' 21 radio stations into the Web.
The sound is horrible on both platforms, but you can expect that to change. Throw into the mix other technologies like TrueSpeech, Microsoft's Blackbird, and Netscape's threat to add .au streaming to a future incarnation of Navigator, and the waters are muddied even more. If your client is hell-bent on getting audio up on the Net, take your pick.
David Lawrence can be reached at http://www.davids.com/david or via e-mail.
Reprinted from Web Developer® magazine, Vol. 1 No.1 Winter 1996 (c) 1996 internet.com Corporation. All rights reserved.
Copyright © 2000 internet.com Corporation. All rights reserved.