Tags: linux, pipewire
by Sanchayan Maity
Editor’s note: this work was completed in late 2022 but this post was unfortunately delayed.
Modern audio hardware often comes with Digital Signal Processors (DSPs) integrated in SoCs and audio codecs. Processing compressed or encoded data on such DSPs saves power compared to carrying out the same processing on the CPU.
+---------+      +---------+      +---------+
|   CPU   | ---> |   DSP   | ---> |  Codec  |
|         | <--- |         | <--- |         |
+---------+      +---------+      +---------+
This post takes a look at how all this works.
Audio processing
A traditional audio pipeline might look like the one below. An application reads encoded audio and then uses a media framework like GStreamer or a library like ffmpeg to decode it to PCM. The decoded audio stream is then handed off to an audio server like PulseAudio or PipeWire, which eventually hands it off to ALSA.
+----------------+
|  Application   |
+----------------+
        | mp3
+----------------+
|   GStreamer    |
+----------------+
        | pcm
+----------------+
|    PipeWire    |
+----------------+
        | pcm
+----------------+
|      ALSA      |
+----------------+
With ALSA Compressed offload, the same audio pipeline would look like this. The encoded audio stream is passed through to ALSA unchanged. ALSA then, via its compressed offload API, sends the encoded data to the DSP. The DSP does the decode and render.
+----------------+
|  Application   |
+----------------+
        | mp3
+----------------+
|   GStreamer    |
+----------------+
        | mp3
+----------------+
|    PipeWire    |
+----------------+
        | mp3
+----------------+
|      ALSA      |
+----------------+
Since the processing of the compressed data is handed to specialised hardware, namely the DSP, power consumption drops dramatically compared to CPU-based processing.
Challenges
The ALSA Compressed Offload API, which is a separate API from the ALSA PCM interface, provides the control and data streaming interface for audio DSPs. This API is provided by the tinycompress library.
With PCM there is the notion of bytes ~ time. For example, 1920 bytes of S16LE, 2 channel, 48 kHz audio corresponds to 10 ms. This breaks down for compressed streams. It’s impossible to reliably estimate the duration of audio buffers when handling most compressed data.
While sampling rate, number of channels and bits per sample are enough to completely specify PCM, various other parameters may have to be specified to enable the DSP to deal with multiple compressed formats.
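The PCM relationship between bytes and time mentioned above can be sketched as a small helper. This is only an illustration of the arithmetic; `pcm_bytes_to_usec` and its parameter names are made up for this example and are not part of ALSA or PipeWire.

```c
#include <assert.h>
#include <stdint.h>

/* Duration, in microseconds, of `bytes` of interleaved PCM audio.
 * bytes_per_sample is 2 for S16LE. Hypothetical helper, not a real API. */
uint64_t pcm_bytes_to_usec(uint64_t bytes, uint32_t bytes_per_sample,
                           uint32_t channels, uint32_t rate_hz)
{
    uint64_t frame_size;
    uint64_t frames;

    assert(bytes_per_sample > 0 && channels > 0 && rate_hz > 0);

    /* One frame holds one sample for every channel. */
    frame_size = (uint64_t)bytes_per_sample * channels;
    frames = bytes / frame_size;

    /* frames / rate gives seconds; scale to microseconds first to keep
     * the calculation in integers. */
    return frames * 1000000ULL / rate_hz;
}
```

For the example in the text, 1920 bytes at 4 bytes per frame is 480 frames, and 480 frames at 48 kHz is 10 ms. No comparable formula exists for most compressed formats, because the number of bytes per unit of time is not fixed.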
For some codecs, additional firmware has to be loaded by the DSP. This has to be handled outside the context of the audio server.
Requirements
Expose all possible compressed formats.
Allow a client to negotiate the format.
Stream encoded audio frames and not PCM.
PipeWire
PipeWire has become the default sound server on Linux, handling multimedia routing and audio pipeline processing. It offers capture and playback for both audio and video with minimal latency and support for PulseAudio, JACK, ALSA, and GStreamer-based applications.
SPA
PipeWire is built on top of SPA (Simple Plugin API), a header-only API for building plugins. SPA provides a set of low-level primitives.
SPA plugins are shared libraries (.so files) that can be loaded at runtime.
Each library provides one or more factories, each of which may implement several interfaces.
The most interesting interface is the node.
A node consumes or produces buffers through ports.
In addition to ports and other well defined interface methods, a node can have events and callbacks.
Ports are also first class objects within the node.
There is a set of port-related interface methods on the node.
There may be statically allocated ports created at instance initialization.
There can be dynamic ports managed with the add_port and remove_port methods.
Ports have params which can be queried using the port_enum_params method to determine the list of formats (EnumFormat), the currently configured format (Format), buffer configuration, latency information, I/O areas for data structures shared by the port, and other such information.
Some params such as the selected format can be set using the port_set_param method.
Implementing compressed sink SPA node
This section covers the main implementation details of a PipeWire SPA node which can accept an encoded audio stream and then write it out using the ALSA compressed offload API.
static const struct spa_node_methods impl_node = {
    SPA_VERSION_NODE_METHODS,
    .add_listener = impl_node_add_listener,
    .set_callbacks = impl_node_set_callbacks,
    .enum_params = impl_node_enum_params,
    .set_io = impl_node_set_io,
    .send_command = impl_node_send_command,
    .add_port = impl_node_add_port,
    .remove_port = impl_node_remove_port,
    .port_enum_params = impl_node_port_enum_params,
    .port_set_param = impl_node_port_set_param,
    .port_use_buffers = impl_node_port_use_buffers,
    .port_set_io = impl_node_port_set_io,
    .process = impl_node_process,
};
Some key node methods defining the actual implementation are as follows.
port_enum_params
params for ports are queried using this method. This is akin to finding out the capabilities of a port on the node.
For the compressed sink SPA node, the following are present.
EnumFormat
This builds up a list of the encoded formats that are handled by the node and returns it as the result.
Format
Returns the currently set format on the port.
Buffers
Provides information on size, minimum, and maximum number of buffers to be used when streaming data to this node.
IO
The node exchanges information via IO areas. There are various types of IO areas like buffers, clock, and position. The compressed sink SPA node only advertises buffer areas at the moment.
The results are returned in an SPA POD.
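As a rough illustration of the enumeration idea behind EnumFormat, the toy model below pages through a static table of encoded formats using a start index and a maximum count. It deliberately does not use the SPA POD builder or any PipeWire header: `toy_format`, `supported_formats`, and `toy_enum_formats` are hypothetical names, and listing only MP3 and FLAC is an assumption based on the formats this post reports as validated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an encoded format description. The real node
 * describes formats as SPA POD objects instead. */
struct toy_format {
    const char *codec;
};

/* Assumption: only the formats reported as validated are listed here. */
static const struct toy_format supported_formats[] = {
    { "mp3" },
    { "flac" },
};

#define N_FORMATS (sizeof(supported_formats) / sizeof(supported_formats[0]))

/* Copy up to `max` format pointers into `out`, starting at index `start`.
 * Returns the number of entries copied; 0 means there is nothing left
 * to enumerate. */
size_t toy_enum_formats(uint32_t start, uint32_t max,
                        const struct toy_format **out)
{
    size_t n = 0;
    size_t i;

    assert(out != NULL);

    for (i = start; i < N_FORMATS && n < max; i++)
        out[n++] = &supported_formats[i];

    return n;
}
```

A caller can keep asking with an increasing start index until the function returns 0, which is loosely how a client walks the list of formats a port advertises.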
port_use_buffers
Tells the port to use the given buffers via the IO area.
port_set_param
The various params on the port are set via this method.
The Format param request sets the actual encoded format that’s going to be streamed to this SPA node by a PipeWire client like pw-cat or another application, for sending on to the DSP.
process
Buffers containing the encoded media are handled here. The media stream is written to the IO buffer areas that were provided in port_use_buffers. The encoded media stream is then written to the DSP by calling compress_write.
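The write step in process can be pictured with the sketch below. It assumes the write call may accept fewer bytes than offered (for example when the device is opened in non-blocking mode) and so loops until the buffer is consumed. The callback type mirrors the shape of tinycompress's compress_write (buffer and size in, bytes written or a negative error out), but `toy_write_fn`, `toy_write_all`, and `toy_fake_device_write` are hypothetical names, and this is not the node's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the device write call: returns the number of bytes
 * accepted, or a negative value on error. */
typedef int (*toy_write_fn)(void *handle, const void *buf, unsigned int size);

/* Offer an encoded buffer to the device, continuing after short writes.
 * Returns the total number of bytes accepted, or the negative error
 * reported by the write call. Stops early if the device accepts nothing,
 * so the caller can retry later. */
int toy_write_all(toy_write_fn write_fn, void *handle,
                  const void *buf, unsigned int size)
{
    const unsigned char *p = buf;
    unsigned int done = 0;

    assert(write_fn != NULL && buf != NULL);

    while (done < size) {
        int written = write_fn(handle, p + done, size - done);

        if (written < 0)
            return written;
        if (written == 0)
            break;
        done += (unsigned int)written;
    }

    return (int)done;
}

/* Fake device for demonstration: accepts at most 4 bytes per call and
 * adds the accepted byte count to the counter behind `handle`. */
int toy_fake_device_write(void *handle, const void *buf, unsigned int size)
{
    unsigned int *total = handle;
    unsigned int take = size < 4 ? size : 4;

    (void)buf;
    *total += take;
    return (int)take;
}
```

With a real device, a thin wrapper that forwards to compress_write would take the place of the fake writer. Keeping the write behind a callback is only a convenience for trying the loop without DSP hardware.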
add_port and remove_port
Since dynamic ports aren’t supported, these methods return ENOTSUP.
pw-cat
pw-cat was modified to support negotiation of encoded formats and to pass the encoded stream through as is when linked to the compressed sink node.
Deploying on hardware
Based on discussions with upstream compress offload maintainers, we chose a Dragonboard 845c with the Qualcomm SDM845 SoC as our test platform.
For deploying Linux on embedded devices, our tool of choice is Yocto. Yocto is a build automation framework and cross-compile environment used to create custom Linux distributions and board support packages for embedded devices.
Primary dependencies are
tinycompress
ffmpeg
PipeWire
WirePlumber
The tinycompress library is what provides the compressed offload API. It makes ioctl() calls to the underlying kernel driver/sound subsystem.
ffmpeg is a dependency for the example fcplay utility provided by tinycompress. It’s also used in pw-cat to read basic metadata of the encoded media. This is then used to determine and negotiate the format with the compressed sink node.
PipeWire is where the compressed sink node resides, with WirePlumber acting as the session manager for PipeWire.
Going into how Yocto works is beyond the scope of what can be covered in a blog post. Basic Yocto project concepts can be found here.
In Yocto speak, a custom meta layer was written.
Yocto makes it quite easy to build autoconf-based projects. A new tinycompress bitbake recipe was written to build the latest sources from upstream and also include the fcplay and cplay utilities for initial testing.
The existing PipeWire and WirePlumber recipes were modified to point to custom git sources with minor changes to default settings included as part of the build.
Updates since the original work
Since we completed the original patches, a number of changes have happened thanks to the community (primarily Carlos Giani). These include:
A device plugin for autodetecting compress nodes on the system
Replacing tinycompress with an internal library to make all the requisite ioctl()s
Compressed format detection (which was previously waiting on an upstream API addition we implemented in tinycompress)
Future work
Make the compressed sink node provide clocking information. While the API provides a method to retrieve the timestamp information, the relevant timestamp fields do not seem to be populated by the q6asm-dai driver.
Validate other encoded formats. So far only MP3 and FLAC have been validated. Maybe the wider community can help test this on other hardware.
Add the capability to the GStreamer plugin to work with the compressed sink node. This would also help in validating pause and resume.