The Need for a DAC
One of the things we have to remember is that the RPi does not have the hardware needed for good quality music rendering. So if you connect an amplifier or powered speakers directly to the RPi, the music you hear is going to be of low quality. This is what audiophiles call ‘flat’ music.
You are welcome to try it. But, believe me, the sound is quite poor in quality. The quality of sound in music is a very large and controversial subject. Frequency response, compression, sound stage, the separation between vocals and instruments, and a whole lot of other factors come into effect. For ages, there has been a war between audiophiles and the rest of the world as to what constitutes good and bad music. Leaving aside all that let us understand what we are talking about here.
All electronic circuits that work on sound have a process by which sound is converted from analogue to digital (ADC) and back to analogue (DAC) for playback. Even your basic telephone has one. If you take the human voice, the fundamental frequency is about 85 to 180 Hz for an adult male and 165 to 255 Hz for an adult female, with the overtones that make speech intelligible sitting higher up. A traditional telephone channel therefore needs to carry only a narrow band, roughly 300 to 3,400 Hz. Anything outside this band can be treated as noise and simply chopped off.
If you take music, the minimum range for good music is 20 to 20,000 Hz. Strings, woodwinds, brass and percussion can generate frequencies that go to 40 kHz or above. A muted trumpet can go to 80 kHz. Violin and oboe go above 40 kHz, while a cymbal can reach around 100 kHz. On the lower side, a symphonic double bass can go as low as 32 Hz. Obviously, a voice-band signal path of this kind cannot handle this frequency range. What you will hear when you play music through it is just the slice between roughly 300 and 3,400 Hz, with the rest cut off.
A good music player must be able to handle 20 to 20,000 Hz and treat frequencies outside this range with caution. It cannot simply cut off everything outside the range. It must be able to understand the signal, identify actual noise, and filter just the noise out.
A Digital to Analog Converter (DAC) is a circuit that converts the digital audio signal into an analogue wave. Let me explain DAC in as simple a way as possible.
When you sing, your song is recorded on a tape as a wave. The wave has two components – frequency and amplitude. If represented physically, this will result in a series of pressure points. When the music is denser or faster, the pressure points will be closer to each other. When the music is sparse, the pressure points will be spread out (see image on the right).
Unfortunately, this way of representation does not make much sense. So what is done is to convert this into a wave pattern that can have a mathematical representation.
In this representation, the pressure at each instant is plotted as amplitude on the Y-axis against time on the X-axis. The number of complete cycles the wave makes in a fixed time, usually one second, is its frequency.
Analogue Storage and Playback
In the olden days, music was recorded on and played back from tapes and records, the latter also called vinyl. During recording, the analogue wave was directly transferred to the tape as a series of varying magnetic intensities. High-pressure points had a higher magnetic intensity on the tape. Similarly, in vinyl, the waves were directly transferred as grooves on the record.
During playback, the magnetic intensity and grooves are read and converted to sound waves that you can hear.
Simple, was it not? Now we come to the difficult part.
How do you store the sound wave as digital data? What you do is, again, theoretically simple. You take a particular point on the curve and store the amplitude of the wave at that instant as a number. Now comes the catch. If you take a standard wave across the axes, it has an infinite number of points. Obviously, you cannot store all the points as the data size will become humongous. So what is done during the conversion to digital format is to sample the data. You take a stretch of the curve covering a fixed time interval – let us say one second. Within this time frame, you record a finite number of evenly spaced points on the curve that you deem represent the curve well.
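To make the idea of sampling concrete, here is a minimal Python sketch. It assumes a pure 1,000 Hz sine tone and a sampling rate of 8,000 samples per second, both picked purely for illustration. The function name sample_sine is made up for this example and does not come from any library.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return amplitude readings of a sine wave taken at evenly spaced instants."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        # Sample number n is taken at time n / sample_rate_hz seconds.
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# One millisecond of a 1,000 Hz tone at 8,000 samples per second (illustrative values).
samples = sample_sine(frequency_hz=1000, sample_rate_hz=8000, duration_s=0.001)
print(len(samples))
print([round(s, 3) for s in samples])
```

Each number in the list is one stored point on the curve. Everything between two neighbouring points is simply not recorded.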
But how many points do you take for a time frame? In 1928, Harry Nyquist's work established that, to digitize a signal without losing information, you need a sampling rate that is at least twice the maximum frequency in the signal. For example, if the maximum frequency is 10,000 Hz, about 20,000 samples per second are needed and sufficient. Given that the human ear can hear from about 20 Hz to 20,000 Hz, it follows that roughly 40,000 samples per second is enough to represent an audio signal properly.
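The rule can be written down in a few lines of Python. The sketch below is only an illustration. The helper names min_sample_rate and alias_frequency are invented here, and the second function assumes an ideal sampler fed with a single pure tone. It shows what goes wrong when the rule is ignored: a tone above half the sampling rate does not vanish but shows up as a different, lower tone.

```python
def min_sample_rate(max_frequency_hz):
    """Lowest sampling rate that satisfies the 'twice the maximum frequency' rule."""
    return 2 * max_frequency_hz

def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling at sample_rate_hz."""
    folded = signal_hz % sample_rate_hz
    # Anything above half the sampling rate folds back into the lower half.
    return min(folded, sample_rate_hz - folded)

print(min_sample_rate(20_000))          # rate needed for the audible range
print(alias_frequency(10_000, 44_100))  # below half the rate: unchanged
print(alias_frequency(30_000, 44_100))  # above half the rate: folds down
```

This folding is why an ADC filters out everything above half its sampling rate before it samples.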
The widely accepted standard for audio today is a sampling rate of 44,100 Hz at a bit depth of 16 bits, usually written as 16/44.1. Don’t worry. I will explain these terms further down.
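Why 44,100 Hz and not 40,000? Philips and Sony, who jointly developed the CD, settled on 44,100 Hz for the medium. A rate slightly above 40,000 Hz leaves room for the filter that removes everything above 20,000 Hz before sampling, and 44,100 Hz happened to fit the video-tape-based recorders used to master digital audio at the time. By the Nyquist theorem, this rate can represent frequencies up to half of it, or 22,050 Hz. And that became the industry standard.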
The conversion from analogue to digital is done by a device called the Analog to Digital Converter (ADC). At a minimum, ADCs sample at the rate the Nyquist theorem requires. Some ADCs may use higher sampling rates as well as higher bit depths. Today, you have ADC/DAC combinations that can handle 32 bits at 384 kHz!
When played, an inverse circuit of the ADC, the Digital to Analog Converter (DAC), reads the digital data and creates an analogue curve. Based on the sampling frequency, the DAC approximates and fills in missing data points to create the smoothest possible analogue curve. In most cases, the music created by such a process is nearly identical to the original sound.
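How the gaps between stored points can be filled in is easiest to see with a small numerical experiment. The Python sketch below uses the textbook sinc interpolation formula. It is an idealized illustration under stated assumptions, not a description of how any particular DAC chip works, since real DACs do this job with hardware filters. The tone, the sampling rate, and the function names sinc and reconstruct are all chosen just for this example.

```python
import math

def sinc(x):
    """Normalized sinc function: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t):
    """Estimate the signal value at time t (seconds) from evenly spaced samples."""
    return sum(
        value * sinc(t * sample_rate_hz - n)
        for n, value in enumerate(samples)
    )

rate = 8000   # samples per second (illustrative)
tone = 1000   # Hz, well below rate / 2
samples = [math.sin(2 * math.pi * tone * n / rate) for n in range(800)]

# Pick an instant halfway between two stored samples, in the middle of the recording.
t = 400.5 / rate
estimate = reconstruct(samples, rate, t)
actual = math.sin(2 * math.pi * tone * t)
print(round(estimate, 3), round(actual, 3))
```

The estimate comes out close to the true value even though no sample was ever stored at that instant. The small remaining difference comes from using a finite stretch of samples.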
Bit Depth and Word Length
We discussed the sampling rate above. The number of bits you use to store each data point is called bit depth or word length. The more bits per data point, the more information you have about that point. In other words, word length or bit depth defines the precision of the sampling. Higher bit depths allow you to record and reproduce subtler movements in the waveform. Each sample recorded at 16-bit resolution will have one of 65,536 possible values. At 24-bit resolution, you get 16,777,216 possible values. The higher the bit depth, the more data you are storing for each data point. The higher the sampling rate, the more often you are sampling the sound wave. Thus a 24/192 recording captures more detail than a 16/44.1 one and, on equipment that can make use of it, can deliver better sound.
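The arithmetic behind these numbers is short enough to check yourself. The Python sketch below is illustrative only. The function names are made up, and the data-rate calculation assumes uncompressed stereo (two channels), which the text above does not specify.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values one sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Ratio between the largest and smallest step, in decibels (about 6 dB per bit)."""
    return 20 * math.log10(2 ** bit_depth)

def pcm_bit_rate(sample_rate_hz, bit_depth, channels=2):
    """Bits per second of uncompressed audio; two channels is an assumption."""
    return sample_rate_hz * bit_depth * channels

for bits, rate in [(16, 44_100), (24, 192_000)]:
    print(
        f"{bits}/{rate / 1000:g}:",
        quantization_levels(bits), "levels,",
        round(dynamic_range_db(bits), 1), "dB,",
        pcm_bit_rate(rate, bits), "bits per second",
    )
```

The price of the extra precision is size: under these assumptions a 24/192 stream carries several times as much data per second as a 16/44.1 one.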
DACs are available as HATs for the RPi. These are small boards that you mount on the GPIO header of the RPi. DACs with various features are available from companies such as HiFiBerry, IQaudio, Allo, etc. Prices vary from around US$25 to US$400. Decoding resolutions range from 16 bits/44.1 kHz to as much as 32 bits/384 kHz!
Till now we have been talking about using the RPi as a server for music. What about other digital media including video?
There are a number of options available to convert the RPi into a media server. Some of the popular servers available are Plex, Emby, Open Media Vault, etc. What do these do?
Essentially, a media server acts as central storage for all your media files – audio, video, and photographs. The concept of media storage started with what is called Network Attached Storage, or, NAS.
A NAS consists of multiple hard drives stacked inside a cabinet, along with a small electronics board that manages them. In a network, files on these drives can be read, processed, and written by any computer system that is part of the network. In addition, access to NAS has gradually been offered on non-computer systems such as media players, amplifiers, and AV receivers. For example, you can have all your movie files stored in a NAS installed in your home theatre room. At the same time, with a functional network, you can rest in your bed at night and watch the movies on your iPad or tablet. You can watch stored TV shows or music shows in your kitchen.
NASes have limited function. They just store files and make them available to you. The decoding, processing, or execution of the files including the playing of video or audio and display of photographs has to be done by the client hardware.
Using a computer such as an RPi expands the features of centralized storage. You can connect multiple external hard drives to the RPi and use available software to make the RPi act not only as a NAS but also as a (media) server. Let us see what additional features a media server on the RPi can provide.
- The SBC’s CPU brings in a lot of intelligence to the activity.
- Using appropriate software, media files can be ‘tagged’ and displayed beautifully on the client application. For example, instead of just displaying file names, videos, audio and photographs can be shown as small thumbnails. For video and audio, the SBC can talk to Internet-based repositories and download the ‘tags’ that identify the file. Tags are a universally accepted system that includes a small image of the file as well as a small writeup. In the case of a movie video file, for example, the tag will contain the poster of the movie and writeup that includes production/release date, names of artists, and a short summary of the theme. Audio files also have similar information that includes CD cover image, name of artists, publisher information, etc. In the case of photographs, the SBC will use local routines to create a thumbnail.
- SBCs have enough intelligence and capability to render the media or convert it to a format that the client can understand. For example, if you have an audio file that is not compatible with the client software, the SBC can convert the file to the format the client can understand.
- With a connection to the Internet and additional hardware and software, the SBC can stream video and audio from online streaming services such as Netflix, Amazon, Tidal, Hungama, etc.
- The SBC can also be programmed to log into online services and download videos and audio.
- The SBC can log into Internet radio stations and play the music for you on any client.
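As an illustration of the format-conversion point in the list above, the Python sketch below hands the work to the ffmpeg command-line tool. It assumes ffmpeg is installed on the RPi, and the file names, the 320k bitrate, and the helper names are all hypothetical. Media server packages do this kind of conversion internally in their own way, so this is not how Plex or Emby is implemented.

```python
import subprocess
from pathlib import PurePosixPath

def build_transcode_command(source, target_suffix=".mp3", bitrate="320k"):
    """Build an ffmpeg command that converts an audio file to another format.

    ffmpeg chooses the output format from the target file's extension.
    """
    source = PurePosixPath(source)  # the RPi runs Linux, so POSIX-style paths
    target = source.with_suffix(target_suffix)
    return ["ffmpeg", "-i", str(source), "-b:a", bitrate, str(target)]

def transcode(source, **options):
    """Run the conversion; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(build_transcode_command(source, **options), check=True)

# Hypothetical file name, shown only to illustrate the resulting command.
print(" ".join(build_transcode_command("music/album/track01.flac")))
```

Calling transcode("music/album/track01.flac") would then write track01.mp3 next to the original, provided the file exists and ffmpeg is available.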
All media servers, including those built on SBCs, need to support two communication standards. One is DLNA, and the other is UPnP.
The Digital Living Network Alliance (DLNA) is an organization that defines and sets the standards for communicating and sharing media files across a network. DLNA-certified devices today include PCs, iOS devices, TVs, DVD/Blu-ray players, home theatre receivers, media streamers, media players, etc. Devices that have the DLNA certificate can access, download, upload, and render video, music, and image files.
The Universal Plug and Play (UPnP) is a generic solution for devices to share files between a server and a playback device.
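To give a feel for how a playback device finds a server, here is a minimal Python sketch of the discovery step that UPnP uses, known as SSDP. It sends a single search message to the standard multicast address and collects whatever answers arrive. The function names are invented for this example, the timeout values are arbitrary, and the sketch does nothing with the replies beyond returning them as text.

```python
import socket

# Standard SSDP multicast address and port used for UPnP discovery.
SSDP_ADDR = ("239.255.255.250", 1900)

def build_msearch(search_target="ssdp:all", max_wait_s=2):
    """Build the M-SEARCH request that asks devices to announce themselves."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {max_wait_s}",      # devices may delay their reply up to this many seconds
        f"ST: {search_target}",   # which kind of device we are looking for
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def discover(search_target="urn:schemas-upnp-org:device:MediaServer:1", timeout_s=3):
    """Send one search and return (ip_address, raw_response_text) pairs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.settimeout(timeout_s)
    responses = []
    try:
        sock.sendto(build_msearch(search_target), SSDP_ADDR)
        while True:
            data, addr = sock.recvfrom(65507)
            responses.append((addr[0], data.decode("utf-8", errors="replace")))
    except socket.timeout:
        pass  # no more replies within the timeout
    finally:
        sock.close()
    return responses

# Example use on a home network (not run here):
#   for ip, text in discover():
#       print(ip, text.splitlines()[0])
```

Each reply normally carries a LOCATION header pointing to a description of the device, which a client would fetch next to learn what the server offers.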
In 2000, Sean Adams set up a company in the US, Slim Devices, to design and manufacture networked music players. At first, they wrote open-source software called SlimServer that acted as a music server. They then offered Squeezebox, a small device with a screen on which you could see all the music available on the server. Squeezebox was very simple in terms of connectivity. It had just two RCA jacks at the back that you used to connect it to an amplifier. Squeezebox gained popularity as an ‘audiophile’ player for a very low price. Its simple GUI and touch screen allowed you to choose the album/song of your choice and play it.
In 2006, Slim Devices was bought by Logitech, which later, for some strange reason, discontinued the product line. The server software is still available as open source, but Squeezebox is not available any more.
A number of DIY-ers have developed the methodology to use an RPi as an alternative for Squeezebox. The RPi is mounted on a cabinet with a DAC and a 7” touchscreen. With the appropriate software installed, this tiny device becomes an advanced audiophile-grade player for streaming music.
SBCs are here to stay. It is up to us to see how they can be used. They have enough grunt today to do a number of tasks including real-time and time-critical functions. Multiple SBCs working together have proven to be the backbone of serious parallel processing. There is enough push in the market for companies to design and release specific hardware and software that push these SBCs in a particular area of functionality.
At Sosaley, we have been working for the last two years on pushing SBCs into mainstream industries. Developing software that enhances the reliability of the SBC, we have added code for video management, communication, diagnostics, and hands-free operation. Hundreds of SBCs with our code are now generating revenue for our clients in the US, Mexico, Sweden, the Philippines, Malaysia, and other parts of the world. We are constantly pushing ourselves to see how our clients can extract more work from the SBCs.