New music!

I flew every weekend of April.

The camera on my iPhone 4s suddenly came to life after almost a year of malfunctioning, so I could film this short bit. I decided to pause and add some improvised music.

# 8Day Records (Montreal) Agustin Delbosco & PMDX – Adopt A Ninja (Original Mix)

One of my latest tracks on 8Day Records (Montreal).

# DAFx-16 and 140th AES Convention

New links to Engineering have been added, with references to the ever-delightful DAFx conference and the 140th AES Convention in Paris.

Also, new music! And some more.

# Implementing Barberpole Phasing in gen~ (Max4Live device)

I decided to make some space during the weekend to get acquainted with **Max/MSP’s** **gen~** codebox environment. This is something I’ve been meaning to do for some months now, as I was promised an idyllic world of sample-accurate processing and a “procedural”, or at least object-oriented, programming environment (as opposed to visual) straight inside my DAW of choice (that’s Ableton Live).

I really like the visual programming aspect of Max/MSP but, to be honest, having learned most of my DSP in traditional ways, seeing all these boxes and cables inside patches made me cringe every once in a while, especially knowing that the same problem could often be solved in a couple of lines of good old code. Of course, it was a small price to pay for the joy of experimenting with audio in real time and focusing on the important aspects instead of endless debugging.

Learning by doing is usually the best approach to getting used to a new environment, so I set out to find a relatively simple, but interesting, problem to implement. Ideally, it would also end up as part of the set of audio effects in my current production/performance Live set. This paper [1] turned out to be interesting, current, and not overly complicated.

### Overall structure

The barberpole filter is based on a more or less well-studied effect known as the Shepard-Risset glissando. The goal of the authors was to come up with a flanging/phasing effect that generates the same “infinite sweeping” illusion on an input signal by placing adequately gain-scaled cascaded notch filters on the frequencies where the Shepard tone harmonics are placed.

These conveniently placed notch filters have their cut-off frequencies one octave apart from one another, so they have a constant separation on a logarithmic frequency scale. Moreover, their cut-off gains follow an inverted raised-cosine envelope across the frequency spectrum to preserve the equivalence to the Shepard-tone spectral structure. In order to generate the glissando effect, each notch filter is cyclically swept from its original cut-off frequency to twice its value. If the number of cascaded filters is large enough to cover the majority of the audible spectrum and the transitions in the cyclical sweep are smooth enough, the “infinite sweeping” sensation can be perceived.

I had no reference of implementation other than the formulas on the paper and almost no knowledge of the gen~ syntax, so I decided to take a divide-and-conquer approach with the following steps:

- Implement a notch filter with the required parametrization (namely cut-off, bandwidth and gain)
- Cascade instances of this filter with its different parameters in the most painless way possible
- Implement the sweeping of the filter parameters
- Solve possible issues stemming from the implementation, such as the continuity of the audio signals when the filters roll back to their original frequency after a sweep cycle (as mentioned in the paper).

These steps should also be accomplished under one additional rule: **as much as possible must be implemented in a single instance of the gen~ codebox.** The cute Max boxes and patchers should only take care of exposing parameters to the user. So no cheating by feeding an external line~ object into the gen~ object for the sweeping, or concatenating series of filter objects, for example.

The reason for this restriction is that I want my code to be as close as possible to a C/C++ version, in case I really like the result and decide to port it to a more efficient, multi-platform implementation.

### Implementing the notch filter

There are many variants of a notch filter. Nevertheless, if we want to achieve the raised cosine spectral envelope required for the auditory illusion, filter gain and bandwidths have to be parametrizable across the spectrum. This is the reason why the parametric EQ filter described here is used.

It’s always safe practice to do some plotting first in order to check everything out.

Matlab/Octave code

```matlab
% octave script
% https://pmdelgado.wordpress.com
clear all; close all;

Fs = 48000;
Q = 10;
fo = 20;
Lmin = -3;
Lmax = -40;
M = 10;
k = 1;
K = 1;

A  = zeros(M,3);
B  = zeros(M,3);
fc = zeros(M,1);
Lc = zeros(M,1);
H  = 1;

for m = 1:M
    fc(m,1) = fo*power(2,m-1)*power(2,(k-1));
    theta   = 2*pi*((m-1)*K+(k-1))/(M*K);
    Lc(m,1) = Lmin + ((Lmax-Lmin)*(1-cos(theta)))/2;
    G         = power(10,Lc(m,1)/20);
    Gb_square = (1 + power(G,2))/2;
    dw   = (2*pi*fc(m,1))/(Q*Fs);
    beta = sqrt((Gb_square - 1)/(power(G,2) - Gb_square))*tan(dw/2);
    a0 = (1+G*beta)/(1+beta);
    a1 = -2*(cos(dw*Q)/(1+beta));
    a2 = (1-G*beta)/(1+beta);
    b1 = a1;
    b2 = (1-beta)/(1+beta);
    A(m,:) = [a0 a1 a2]; % x(n-i) coefficients
    B(m,:) = [1 b1 b2];
end

% plot overall transfer function
for m = 1:M
    N = 1024;
    upp = pi;
    w = linspace(0, upp, N+1);
    w(end) = [];
    ze = exp(-1j*w);
    H_prev = H;
    H = polyval(A(m,:), ze)./polyval(B(m,:), ze); % evaluate transfer function
    H = H_prev .* H;                              % cascade filters
    Ha = abs(H);
    Hdb = 20*log10(Ha/1);
    wn = w/upp*(Fs/2);
    xlim = ([0 1]);
end

figure
plot(wn, Hdb);
grid on;
```

Note how the cutoff gain parameters follow an inverse raised cosine spectral envelope (in dB).

Here is a snapshot of the notch filter cascade frequency response for a white noise input in Max/MSP:

#### Implementing IIR filters in gen~ codebox

gen~ code works at sample level, which means that everything written applies to the current audio sample (or defined buffer) being processed. Feedback loops making use of past samples of an algorithm (input or output) need to be accessed in a special way. According to the documentation, the **history** or **delay** operator is the one meant for constructing such feedback loops.
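As an illustration in plain C (names are mine, not the gen~ API), the history operator boils down to one sample of state carried across calls, which is exactly what a feedback loop needs:

```c
/* What gen~'s 'history' operator amounts to: one sample of state that
 * survives between calls, enabling feedback such as y(n) = x(n) + g*y(n-1).
 * This is an illustrative sketch, not the gen~ API itself. */
typedef struct { double y_prev; } history_t;

double one_pole_feedback(history_t *h, double x, double g)
{
    double y = x + g * h->y_prev; /* feed back the previous output sample */
    h->y_prev = y;                /* store state for the next sample */
    return y;
}
```

Feeding an impulse through with g = 0.5 yields 1, 0.5, 0.25, …, the familiar one-pole impulse response.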

In my case, implementing a cascade of filters that makes use of past input and output samples (here, up to two previous samples) would mean declaring several of these history operators. I found it easier to write every state variable used in the filtering process into the so-called Data structures, which can be declared as arrays and can therefore accommodate many instances of one filter for cascading. In other words, for a causal IIR filter:

y(n) = Σ a_i·x(n−i) − Σ b_j·y(n−j)

I am using an array of M elements (for M cascaded filters) for each state variable x(n−1), x(n−2), y(n−1) and y(n−2):

```
Data x_n1(10);
Data x_n2(10);
Data y_n1(10);
Data y_n2(10);
```

And, of course, one for each filter input and output. Samples can be recalled, calculated and stored as follows (index = 1…M is the filter number):

```
// recurrence equation (actual filtering)
out_filter_prev = a0 * in_filter.peek(index) - a1 * x_n1.peek(index)
                + a2 * x_n2.peek(index) + b1 * y_n1.peek(index)
                - b2 * y_n2.peek(index);

// write output
out_filter.poke(out_filter_prev, index);

// feedback part: respect time order
memory2 = x_n1.peek(index);       // x(n-2)
memory4 = y_n1.peek(index);       // y(n-2)
memory1 = in_filter.peek(index);  // x(n-1)
memory3 = out_filter_prev;        // y(n-1)

// store
poke(y_n2, memory4, index);  // y(n-2)
poke(y_n1, memory3, index);  // y(n-1)
poke(x_n2, memory2, index);  // x(n-2)
poke(x_n1, memory1, index);  // x(n-1)
```

Cascading can be done via a for loop that is executed at sample rate (m and index are used interchangeably because I am lazy).

```
// cascade filters
for (m = 1; m < M; m += 1) {
    out_filter_temp = out_filter.peek(m-1);
    in_filter.poke(out_filter_temp, m);
}
```
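For reference, the same state bookkeeping and cascading read naturally in plain C. Here is a sketch with my own names, using the standard direct-form-I sign convention from the Octave script (numerator A, denominator B = [1 b1 b2]) rather than the signs in the gen~ snippet:

```c
#define M 10  /* number of cascaded filters */

/* Per-filter coefficient and state arrays, mirroring the gen~ Data objects. */
static double a0[M], a1[M], a2[M], b1[M], b2[M];
static double x_n1[M], x_n2[M], y_n1[M], y_n2[M];

/* Process one input sample through the whole cascade (direct form I):
 * y(n) = a0*x(n) + a1*x(n-1) + a2*x(n-2) - b1*y(n-1) - b2*y(n-2) */
double cascade_tick(double in)
{
    double x = in;
    for (int m = 0; m < M; m++) {
        double y = a0[m]*x + a1[m]*x_n1[m] + a2[m]*x_n2[m]
                 - b1[m]*y_n1[m] - b2[m]*y_n2[m];
        /* update state, respecting time order */
        x_n2[m] = x_n1[m];  x_n1[m] = x;
        y_n2[m] = y_n1[m];  y_n1[m] = y;
        x = y;  /* feed this filter's output to the next stage */
    }
    return x;
}
```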

### Sweeping filter frequencies

Filter frequencies are swept cyclically at a (possibly user-defined) rate. This can be implemented as a low-frequency ramp generator (in the 0.1 Hz ballpark).

If each filter frequency is defined as

```
fc = fo*pow(2, m-1)*pow(2, k-1)
```

where k is the time index and m is the filter-number index as before, then a ramp generator at a rate LFOrate can be defined as:

```
turn_around = 1;
LFOinc = LFOrate / Fs;           // per-sample LFO increment
LFOpointer += LFOinc;
if (LFOpointer >= turn_around) { // reset when a sweep cycle completes
    LFOpointer = 0;
}
LFOpointer = mod(LFOpointer, turn_around); // wrap
k = LFOpointer;
```

The LFO ramp resets when the pointer reaches the turn_around value. The filter gains also depend on k and m in a similar way, in order to form the raised cosine spectral envelope.

### Time varying filters and complexity

Time-varying filters mean that the set of filter coefficients needs to be updated over time. Depending on the filter architecture and order, updating the filters at the sample rate Fs can increase the complexity greatly. In our case the filter structure is already fairly involved, given that multiple divisions and trigonometric functions have to be calculated for the coefficients. I measured the first implementation (M=10) at around 50% CPU usage on an Intel i5. One alternative would be to specifically optimize the divisions and trigonometric functions involved (there are, after all, fast versions of the trigonometric functions in Max). The other would be to use pre-calculated tables, given that we only need to know the filter coefficients for one cycle of each filter. **The M4L device and gen~ code do not implement these alternatives.**
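The table alternative could be sketched like this (hypothetical code, not part of the released device): precompute each filter's coefficients over one sweep cycle at some control-rate resolution and index them with the quantized LFO phase, trading memory for the per-sample tan/cos/sqrt evaluations:

```c
#define M_FILTERS 10
#define TABLE_LEN 512   /* control-rate resolution of one sweep cycle */

typedef struct { double a0, a1, a2, b1, b2; } coeffs_t;

/* One row per filter, one column per quantized LFO phase.
 * Filling the table (evaluating the coefficient formulas once per slot)
 * would happen at load time or when the user changes Q/Lmin/Lmax. */
static coeffs_t table[M_FILTERS][TABLE_LEN];

/* Per-sample lookup: quantize the LFO phase (0..1) and read the
 * precomputed coefficients instead of recomputing them. */
coeffs_t lookup_coeffs(int m, double phase)
{
    int idx = (int)(phase * TABLE_LEN);
    if (idx >= TABLE_LEN) idx = TABLE_LEN - 1; /* guard phase == 1.0 */
    return table[m][idx];
}
```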

### Finer Details, avoiding clicks in audio

Experience taught me that the difference between a good audio plugin/processor/algorithm and a bad one lies in the details. Really. Anyone can more or less seamlessly implement a couple of equations and stick them into some audio processing framework, whether it’s VST, RTAS, CSound, Max/Pure Data externals or similar. But auditory perception is very unforgiving of small glitches in an imperfectly implemented processing routine, and that is what makes audio processing a trade only truly mastered by few. Any inconsistency arising from not fully understanding the interaction between asynchronous user or computer control signals and the audio stream leads to quite unpleasant artifacts in the output.

In our case, and as explained in the reference paper, audio glitches can occur at the point in time when the frequency sweeping cycle resets. Since each notch filter sweeps from a given frequency to twice its value (one octave higher), the leap from 2*f to f causes clicks in audio if it’s in the range of the audible frequencies. In this case, the clicking can give a perceptual cue to the ear that the cycle is resetting and therefore break the “infinite sweeping” illusion, rendering the effect useless.

There have been many methods developed for avoiding clicking in time varying filters with different trade-offs and complexity levels. The method mentioned in the paper seems to be a really efficient one, given that the state-space representation of the filter is known.

I tested my version with white noise (noisy signals like snare sounds work best with this effect), and if the filter notches extend sufficiently close to the Nyquist frequency, most of the clicking is forgiven. Nevertheless, in order to make a quality effect, some kind of transient suppression, as described in the linked references, should be implemented. Again, due to lack of time (this was only meant to take one weekend), I was not able to do it properly.

Since I ran out of time for this post, the efficient, click-less implementation “is left as an exercise to the reader”.

Max Device:

http://www.maxforlive.com/library/device/3606

Sound example:

#### References

[1] F. Esqueda, V. Välimäki and J. Parker, “Barberpole phasing and flanging illusions”, in *Proc. 18th Int. Conf. Digital Audio Effects (DAFx-15)*, Trondheim, Norway, Nov. 30 – Dec. 3, 2015, pp. 87-94.

# First steps on Pure Data’s approach to real time audio processing. Compiling externals. Analysing lop~.


### Motivation

There is a somehow well-defined workflow when it comes to developing a signal processing algorithm. It mostly consists of taking the ideas developed in academia, analysing their algorithmic behaviour as much as possible in a high-level environment such as Matlab/Octave, and then porting them to lower-level, implementation-efficient languages such as C/C++. This implementation stage is a whole world by itself, and it can range from writing **relatively straightforward floating-point code** for powerful computer architectures to **painstakingly detailed fixed-point code** for embedded devices.

Depending on the branch or field, the refinement stage towards the final solution requires going through decision-making mechanisms and trade-off considerations of varying difficulty. Sometimes the processing device just has to adhere to a certain minimum output Signal-to-Noise Ratio in order to be considered fully functional. Other times, as is the case in audio, regular **“predictable” characteristics such as SNR** or Total Harmonic Distortion (THD) are **not as important** as the perceptual figures of merit **modelled after the human auditory system**.

This somewhat more complicated scenario of course requires a better feedback structure among all the design stages. The so-called **“parameter tuning”** in audio (or multimedia) algorithms relies a lot on the way the result is perceived (heard). Moreover, if these multimedia algorithms are to be used in a signal chain with other signal processors, **the interaction of these parameter variations with the other modules has to be taken into account.** One conclusion that can be drawn from all this, particularly useful for those of us who have formal training, is:

*You can go wherever you want with the math, but in the end it has to sound good.*

This has been said and reformulated a million times already, and it’s both the blessing and the curse of our trade. This is why wonderful mathematical/engineering ideas that do not consider the perceptual aspect have unfortunately failed to persist and other more “intuitive” concepts that present a dubious formalism have prevailed. The latter even to the point of influencing the whole industry and almost bending traditional DSP procedures otherwise thought as “untouchable”.

### Why Pure Data?

Although it describes itself as a visual programming language and has become a de facto tool for sound artists and computer music, Pure Data fulfils, in my opinion, most of the requirements for a good audio/multimedia algorithm development chain.

- The fact that it is **open source** gives a great starting point for understanding how an audio processing algorithm is more or less **efficiently implemented in a real-time environment**. Many books dig deep into the math and the concepts, but the “real life” implementation is somehow relegated to a few remarks. It also opens the door to further optimization of core functions: **by accessing the code you can write a more efficient implementation** of FFT modules or correlation functions, for example.
- The easy module development in the form of an external gives the opportunity to test C code which is “almost there” in terms of **performance**, while **maintaining a modular visual environment** for quickly understanding the interaction with other processes, or even with the user interaction.
- In combination with a debugger, **the comparison to or porting from** a **Matlab**/Octave implementation fresh out of the “professor’s class notes” or an AES paper should be **straightforward**.

So let’s say you want to try this 5-degrees-of-freedom parametric EQ and you have figured it all out in Matlab. **Of course, you can also hear what you do by processing sound sources off-line.** Sometimes, though, you are missing out on:

- How does it sound when this EQ is cascaded with a multi-band compressor?
- What happens when the user sweeps the cut-off frequency sliders or other parameters of the transfer equation? Does the audio still flow smoothly?
- Does it make sense to expose this or that parameter to the user?
- How does the filter perform under a variety of signals (impulsive, tonal, noise-like)? How does it handle clipping?

This is what I mean when I say that PD can be this robust “**missing link**” between theory and implementation in the audio domain.

### Writing an external

Everything necessary to write a Pure Data external (that is, everything peripheral to the signal processing part) can be found here: HOWTO write an External for Pure Data. The code below is the signal external example, with meaningful comments taken from the tutorial text. I found this to be the quickest way to assimilate the contents.

```c
/***************************************************/
/* Absolute basics for a signal external           */
/* Source: http://pdstatic.iem.at/externals-HOWTO/ */
/***************************************************/
#include "m_pd.h"

static t_class *pan_tilde_class;      /* pointer to class */

typedef struct _pan_tilde {           /* data space */
    t_object x_obj;
    t_sample f_pan;                   /* input variable, mixing factor */
    t_sample f;                       /* dummy variable to convert float to signal */
    t_inlet *x_in2;                   /* in-/outlets, must be freed by the destructor */
    t_inlet *x_in3;                   /* note: they are not used in the actual DSP/perform */
    t_outlet *x_out;
} t_pan_tilde;

t_int *pan_tilde_perform(t_int *w)    /* perform method, called by dsp */
{
    /* the perform method only takes and returns a pointer to addresses, (t_int *)w */
    /* cast all passed pointers to data */
    t_pan_tilde *x = (t_pan_tilde *)(w[1]);
    t_sample *in1 = (t_sample *)(w[2]);
    t_sample *in2 = (t_sample *)(w[3]);
    t_sample *out = (t_sample *)(w[4]);
    int n = (int)(w[5]);              /* 5 = second argument of dsp_add */

    /***************** actual algorithm ******************/
    t_sample f_pan = (x->f_pan < 0) ? 0.0 : (x->f_pan > 1) ? 1.0 : x->f_pan; /* clamp */
    while (n--)
        *out++ = (*in1++) * (1 - f_pan) + (*in2++) * f_pan; /* recurrence equation */
    /***************** end actual algorithm **************/

    return (w + 6);                   /* points to the address after all pointers in w */
}

void pan_tilde_dsp(t_pan_tilde *x, t_signal **sp) /* dsp method (data space, pointer to signals) */
{
    /* signal array **sp ordered clockwise from the graphic object */
    dsp_add(pan_tilde_perform,        /* add perform routine to DSP tree */
            5,                        /* number of pointers used, w[n] */
            x,                        /* pointer to data space,  w[1] */
            sp[0]->s_vec,             /* in1,                    w[2] */
            sp[1]->s_vec,             /* in2,                    w[3] */
            sp[2]->s_vec,             /* out,                    w[4] */
            sp[0]->s_n);              /* s_n is the length in samples of s_vec, w[5] */
}

void pan_tilde_free(t_pan_tilde *x)   /* destructor */
{
    inlet_free(x->x_in2);             /* all in-/outlets except the first must be freed */
    inlet_free(x->x_in3);
    outlet_free(x->x_out);
}

void *pan_tilde_new(t_floatarg f)     /* constructor */
{
    t_pan_tilde *x = (t_pan_tilde *)pd_new(pan_tilde_class);
    x->f_pan = f;
    x->x_in2 = inlet_new(&x->x_obj, &x->x_obj.ob_pd, &s_signal, &s_signal); /* further inlets */
    x->x_in3 = floatinlet_new(&x->x_obj, &x->f_pan);
    x->x_out = outlet_new(&x->x_obj, &s_signal); /* outlet, s_signal is a selector from the lookup table */
    return (void *)x;
}

void pan_tilde_setup(void)            /* entry point, passes class info to Pd */
{
    pan_tilde_class = class_new(gensym("pan~"),
        (t_newmethod)pan_tilde_new,   /* constructor */
        (t_method)pan_tilde_free,     /* destructor */
        sizeof(t_pan_tilde),          /* size of data space to be allocated */
        CLASS_DEFAULT,                /* graphic */
        A_DEFFLOAT,                   /* list of arguments */
        0);                           /* end of list of arguments */
    class_addmethod(pan_tilde_class,  /* Pd recognizes the "dsp" method for signal classes */
        (t_method)pan_tilde_dsp, gensym("dsp"), 0);
    CLASS_MAINSIGNALIN(pan_tilde_class, t_pan_tilde, f); /* declare main signal inlet; all others via constructor */
}
```

#### Compiling a PD external.

There are many sources of help for compiling Pure Data externals on Linux, a handful for Mac OS X, and some for Windows systems. The Linux compilation process is, in my experience, the most direct one: just reading the README file included in the externals folder of the PD vanilla documentation (and maybe playing a little with the makefile) will do.

For certain reasons I currently only have a Linux **virtualization** inside a Windows machine (**a no-no for everything real-time audio related**), so I decided to give the dreaded PD Windows environment a try.

**Compiling PD externals in Windows with Visual Studio.**

Much to my delight, this tutorial has everything needed for accomplishing the task out of the box.

Here is my **CMakeLists.txt** in case it is needed:

```cmake
cmake_minimum_required(VERSION 3.3)
project(testfilter_tilde)
add_definitions(-DMSW)
set(SRC testfilter~.c)
include_directories("C:/Users/pablo/pd-win/pd/src")
link_directories("C:/Users/pablo/pd-win/pd/bin")
add_library(testfilter_tilde SHARED ${SRC} testfilter_tilde.def)
target_link_libraries(testfilter_tilde pd)
```

And my **testfilter_tilde.def** file:

```
LIBRARY testfilter_tilde
EXPORTS
    testfilter_tilde_setup
```

### Analysing the code structure of a one-pole low-pass filter object.

A one-pole low-pass filter is a very well understood subject both in the analogue and digital world. As such, the wonderful Wikipedia article on the subject has just the right amount of information. **Starting with** an implementation of **something very familiar** is the key to better understanding the mechanics of a **new development environment**. In this case, the code is a copy of the basic lop~ object:

```c
#include "m_pd.h"
/* Filter canvas for testing different types and architectures */

/* ------------- data space ----------------------- */
typedef struct testfilterctl {
    t_sample c_x;
    t_sample c_coef;
} t_testfilterctl;

typedef struct testfilter_tilde {
    t_object x_obj;
    t_float x_sr;
    t_float x_hz;
    t_testfilterctl x_cspace;
    t_testfilterctl *x_ctl;
    t_float x_f;
} t_testfilter_tilde;

t_class *testfilter_tilde_class;

static void testfilter_tilde_ft1(t_testfilter_tilde *x, t_floatarg f);

static void *testfilter_tilde_new(t_floatarg f)
{
    t_testfilter_tilde *x = (t_testfilter_tilde *)pd_new(testfilter_tilde_class);
    inlet_new(&x->x_obj, &x->x_obj.ob_pd, gensym("float"), gensym("ft1"));
    outlet_new(&x->x_obj, &s_signal);
    x->x_sr = 44100;
    x->x_ctl = &x->x_cspace;
    x->x_cspace.c_x = 0;
    testfilter_tilde_ft1(x, f);
    x->x_f = 0;
    return (x);
}

static void testfilter_tilde_ft1(t_testfilter_tilde *x, t_floatarg f)
{
    if (f < 0) f = 0;
    x->x_hz = f;
    x->x_ctl->c_coef = f * (2 * 3.14159) / x->x_sr;
    if (x->x_ctl->c_coef > 1)
        x->x_ctl->c_coef = 1;
    else if (x->x_ctl->c_coef < 0)
        x->x_ctl->c_coef = 0;
}

static void testfilter_tilde_clear(t_testfilter_tilde *x, t_floatarg q)
{
    x->x_cspace.c_x = 0;
}

static t_int *testfilter_tilde_perform(t_int *w)
{
    t_sample *in = (t_sample *)(w[1]);
    t_sample *out = (t_sample *)(w[2]);
    t_testfilterctl *c = (t_testfilterctl *)(w[3]);
    int n = (t_int)(w[4]);
    int i;
    t_sample last = c->c_x;
    t_sample coef = c->c_coef;
    t_sample feedback = 1 - coef;
    for (i = 0; i < n; i++)
        last = *out++ = coef * *in++ + feedback * last;
    if (PD_BIGORSMALL(last))
        last = 0;
    c->c_x = last;
    return (w + 5);
}

static void testfilter_tilde_dsp(t_testfilter_tilde *x, t_signal **sp)
{
    x->x_sr = sp[0]->s_sr;
    testfilter_tilde_ft1(x, x->x_hz);
    dsp_add(testfilter_tilde_perform, 4,
            sp[0]->s_vec, sp[1]->s_vec, x->x_ctl, sp[0]->s_n);
}

void testfilter_tilde_setup(void)
{
    testfilter_tilde_class = class_new(gensym("testfilter~"),
        (t_newmethod)testfilter_tilde_new, 0,
        sizeof(t_testfilter_tilde), 0, A_DEFFLOAT, 0);
    CLASS_MAINSIGNALIN(testfilter_tilde_class, t_testfilter_tilde, x_f);
    class_addmethod(testfilter_tilde_class, (t_method)testfilter_tilde_dsp,
        gensym("dsp"), A_CANT, 0);
    class_addmethod(testfilter_tilde_class, (t_method)testfilter_tilde_ft1,
        gensym("ft1"), A_FLOAT, 0);
    class_addmethod(testfilter_tilde_class, (t_method)testfilter_tilde_clear,
        gensym("clear"), 0);
}
```

Most of the things present in this simple implementation of the lop~ object (here renamed testfilter~) are evident if one takes into account the previously explained aspects of creating an external.

The two things that stand out are, for one, the actual recurrence equation:

```c
t_sample last = c->c_x;
t_sample coef = c->c_coef;
t_sample feedback = 1 - coef;
for (i = 0; i < n; i++)
    last = *out++ = coef * *in++ + feedback * last;
```

Which, taking a look at the last line, is basically of the form:

y(n) = coef · x(n) + (1 − coef) · y(n − 1)

Here is a little more insight on the subject. The second thing that stands out is the way the parameter for the cut-off frequency is passed on to the filter.

```c
static void testfilter_tilde_ft1(t_testfilter_tilde *x, t_floatarg f)
{
    if (f < 0) f = 0;
    x->x_hz = f;
    x->x_ctl->c_coef = f * (2 * 3.14159) / x->x_sr;
    if (x->x_ctl->c_coef > 1)
        x->x_ctl->c_coef = 1;
    else if (x->x_ctl->c_coef < 0)
        x->x_ctl->c_coef = 0;
}
```

This method only converts the float argument into the actual frequency parameter used by the filter. See how the parameter is written to a memory position which is later read by the perform routine for each audio processing block. I am guessing that this is all managed by an internal synchronization mechanism of the PD core. Further reference on passing parameters in real-time audio (and implementing different parameter curves) can be found in the Steinberg VST SDK documentation.
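A common refinement in this area, sketched below, is to ramp the active value toward the control target instead of jumping. This is not something lop~ does; it is just a generic one-pole smoothing technique, with names of my own choosing:

```c
/* One-pole parameter smoothing: each block (or sample), the active value
 * moves a fraction 'rate' (in (0,1]) of the way toward the control target,
 * so an abrupt slider move becomes an exponential glide instead of a step. */
double smooth_param(double current, double target, double rate)
{
    return current + rate * (target - current);
}
```

With a small rate evaluated per sample, a slider jump turns into a glide of a few milliseconds at typical audio rates.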

### PD Filter Patch

The testbench patch for externals is pretty simple. In order to get the frequency response curve of the filter, a “generator” abstraction that outputs a wideband signal is fed to the input of the object. This wideband signal is in fact N/2 parallel oscillators (N being the size in samples of the operating audio buffer) with frequencies that are multiples of Fs/N. I might program a quick external to generalize it at a later stage.

Update:

The abstractions filter-graph1 and filter-graph2 from the Pure Data examples (/doc/3.audio.examples/H13.butterworth.pd) take a very nice approach to drawing filter curves via sinusoidal sweeps, much like an old superheterodyne spectrum analyzer would. Use this far more elegant solution instead of the parallel oscillators used in the patch.

Further Reading

# Research: Acoustic source localization using wireless sensor networks

**Place of Development:** University of Buenos Aires / École Nationale Supérieure d’Ingénieurs de Caen

**Author:** Pablo Manuel Delgado

**Supervisors: C. Galarza and M. Frikel**

This project was carried out as a part of my Masters Thesis in conjunction with the University of Buenos Aires and the École Nationale Supérieure d’Ingénieurs de Caen. Below is a very concise and simplified explanation of the main topics, and by no means complete.

**Abstract**: This work investigates most of the aspects related to locating and tracking moving targets via acoustic sensing using wireless sensor networks (WSNs). The majority of these localization methods have historically been based on position estimation via measuring time differences of arrival (TDOA) of the sound wavefront at the sensing points. Considering the efficiency of the different localization algorithms on systems with limited storage, transmission and processing capabilities, a newer class of algorithms based on sound energy measurements (Energy-Based, EB) and a corresponding propagation-attenuation model has gained prominence in recent years. A comparison of the most relevant algorithms, with an emphasis on energy-based methods, is carried out. In addition, an adaptation of an algorithm originally used in TDOA methods is proposed in a collaborative framework (the network is divided into independent processing clusters with local hierarchy and no centralized calculation) for the energy measurement model. Due to new considerations in approaching the disturbances affecting the energy model, this adaptive algorithm presents better performance in terms of communication cost and localization error compared to previously used energy-based (EB) methods as the number of sensors in the network increases. Lastly, some implementation considerations are discussed for a specific WSN hardware solution available on the consumer market.

### Context

The task of locating a sound source in space (a speaker, a vehicle, an event…) using acoustic measurements (usually microphones of all kinds) of the emitted signal propagated over a certain medium has been of interest for a long time. This localization can be useful for a variety of applications, ranging from enhancing speech processing algorithms in noisy, enclosed environments (the Cocktail Party Problem) to passive vehicle or target localization/tracking in the open field. For the latter type of task, a specially suited technology comprising relatively cheap, lightweight, battery-powered microcontroller nodes with sensing and transmitting capabilities is usually employed.

These nodes make up what is known as a Wireless Sensor Network (WSN, see figure below). WSNs usually show better performance than classical localization systems in terms of localization accuracy versus equipment cost, since the nodes are usually cheaper than the costly precision acoustic sensors traditionally dedicated to these efforts, and can form an arbitrary topology (sensor distribution) for enhanced resolution in the areas of interest.

### Constraints

The signal processing theory associated with wireless sensor networks often differs from the classical approach, in the sense that the signal acquisition is multiple (bearing some similarity to array processing) and that special care must be taken to optimize the algorithms for efficient data fusion and calculation in resource-constrained, battery-powered computers. In general, the following subjects are taken into account:

- Transmission cost: the amount of data transmitted must be kept to a minimum within the performance specifications in order to extend the sensors’ (nodes’) battery life.
- Computational complexity: the algorithm blocks, methods or functions implemented must usually run on resource-constrained computers (microcontrollers) which do not have much processing power.
- Underlying protocols: some of the localization methods, especially the ones based on signal phase or time of arrival, require tight synchronization among the sensors’ time bases; these synchronization efforts over a wireless network can be major resource-consuming tasks.

## Development

**Model**

Given N intervening sensors and K emitting sound sources, the received signal (a random variable, usually wide-sense stationary) at each sensor is given by the following expression:

x_i(t) = g_i · Σ_{k=1…K} s_k(t − t_ik) / ‖r_i − ρ_k‖^α + w_i(t)

where:

- i = 1…N is the sensor index
- α is the attenuation factor of the medium (estimated in a calibration phase)
- g_i is the i-th sensor gain, assumed unitary without loss of generality
- s_k(t) is the signal emitted by source k
- r_i is the Cartesian position vector of sensor i
- ρ_k is the position vector of the k-th source
- t_ik is the time delay of the wavefront between sound source k and sensor i
- w_i(t) is the noise associated with each measurement at the sensors, assumed AWGN with zero mean and variance σ², independently and identically distributed (i.i.d.)

In simple terms, according to this propagation model, the signal measured at each sensor is the sum of the delayed, attenuated versions of the signals emitted by each of the K sound sources. We also assume that the noise power perturbing the measurement at each sensor is of equal magnitude across the network. Other considerations regarding the model are:

- The signal is emitted omnidirectionally, and the localization process takes place in a 2D scenario, without loss of generality for the 3D, spatial case.
- The energy of the signal is approximately constant during the measurement interval.
- As evidenced by the preceding equation, linear superposition of the signals takes place.
- The influence of reflections, multipath and obstacles is negligible for the measurements.

### Time Difference of Arrival (TDOA) based localization

TDOA-based localization relies on estimating the delay of arrival of the wavefront at the different participating sensing points. Given two different sensors i and j, this quantity can be estimated as the lag that maximizes their cross-correlation:

$$ \hat{\tau}_{ij} = \arg\max_{\tau} \; R_{ij}(\tau) $$

where $R_{ij}(\tau) = E\left[ x_i(t)\, x_j(t+\tau) \right]$ is the cross-correlation function between the signals received at sensors i and j. Having estimated this delay, many geometric considerations can be used to estimate the position of one acoustic source with respect to the known positions of the sensors. In the 2D case, with at least 3 sensors and knowing the propagation velocity of sound in the given medium, one possibility is to estimate the sound source position as stated by the scheme shown below:

This resulting system of equations can be linearized [3], as nonlinear systems can be difficult to solve in computationally efficient ways. Nevertheless, determining the TDOA can be a very resource-consuming task due to the tight synchronization required on each sensor's time base. This synchronization is a determining factor in the accuracy of the location estimate. In addition, **transmitting the full set of samples acquired during the measurement time window can lead to considerable data overhead, resulting in a costly transmission. The energy method described below only needs to transmit a single energy value per sensor, greatly reducing the transmission burden over the network.**
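As a minimal sketch of the cross-correlation delay estimate, the snippet below (all names hypothetical) recovers the lag between two sensor signals from their sample cross-correlation; the 5 kHz rate matches the simulations described later.

```python
import numpy as np

def estimate_tdoa(x_i, x_j, fs):
    """Estimate the time delay of x_i relative to x_j as the lag
    that maximizes the sample cross-correlation."""
    corr = np.correlate(x_i, x_j, mode="full")
    lag = np.argmax(corr) - (len(x_j) - 1)  # lag in samples
    return lag / fs                          # delay in seconds

# Toy example: the same noise burst arrives 20 samples later at sensor j.
rng = np.random.default_rng(0)
fs = 5000.0                                  # sampling frequency (Hz)
s = rng.standard_normal(1000)
delay = 20                                   # true delay in samples
x_i = s
x_j = np.concatenate([np.zeros(delay), s[:-delay]])
tdoa = estimate_tdoa(x_j, x_i, fs)           # ≈ delay / fs
```

In a real deployment the two sample streams must share a common time base, which is exactly the synchronization cost discussed above.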

### Energy Measurement based localization

Considering the previously introduced signal model, the acoustic energy of the signal at sensor i over a time window of L samples can be calculated as:

$$ \hat{y}_i = \frac{1}{L} \sum_{l=1}^{L} x_i^2(l) $$

where the TDOA was considered much shorter than the measurement time window and therefore neglected. The calculation is carried out at a given sampling frequency over L samples. The multipath effects, reflections and other perturbations described in the model assumptions usually have shorter time constants and are consequently smoothed out by the sample sum.
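The window energy reduces to a mean of squared samples, which is the single value each node transmits. An illustrative sketch (hypothetical names):

```python
import numpy as np

def acoustic_energy(x):
    """Average acoustic energy of one measurement window of L samples."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** 2)

# A unit-power source contribution plus weak measurement noise.
rng = np.random.default_rng(1)
L = 5000
signal = rng.standard_normal(L)
noise = 0.1 * rng.standard_normal(L)
energy = acoustic_energy(signal + noise)   # ≈ 1.01 for these powers
```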

### Optimization problem

With similar geometric considerations using the proposed model and the measurements, many different systems of equations can be derived for estimating the source position. The deviations from the model, modeled as noise in the previous equation, turn the associated localization problem into an optimization problem, for which a Maximum Likelihood (ML) estimator can be found [4], [5].

The sum of squared Gaussian random variables in the energy calculation corresponds to a new random variable which follows a chi-squared distribution. Nevertheless, the ML estimator derived from optimizing a feasibility problem under these circumstances can lead to complicated calculations. One way to tackle this problem is to assume that, given a sufficient number of samples L, the energy measurement can be approximated by a normal distribution (as granted by the Central Limit Theorem). This normal distribution is completely described by its first- and second-order moments, derived from the energy equation [4]:

The complexity of these expressions comes from the fact that the cross products between the random variable representing the signal and the one representing the noise were not neglected, as is common practice under the independence hypothesis [5]. This simplification is valid when the number of samples in the energy calculation approaches infinity, i.e. only in the limit. Nevertheless, for a finite number of samples, ignoring these cross-product terms can lead to large errors in estimating the true values of the measurements. As will be seen below, keeping these terms yields a better model for the estimation process.

This set of equations puts in evidence that the uncertainty in the energy calculations depends not only on the measurement noise power, but also on the distance from the source to the measuring nodes. This means that the error affecting the optimization problem is not identically distributed across the equations given by each measurement, even if the measurement noise is equal for all sensors. There is also a (albeit small) cross-correlation between measurements from different nodes.

With the previous considerations, the optimization problem can be associated with a quadratic cost function of the form:

$$ \ell(\boldsymbol{\theta}) = \left( \mathbf{y} - \boldsymbol{\mu}(\boldsymbol{\theta}) \right)^{T} \boldsymbol{\Sigma}^{-1} \left( \mathbf{y} - \boldsymbol{\mu}(\boldsymbol{\theta}) \right) $$

where the parameter vector $\boldsymbol{\theta}$ collects the estimated source position, its energy and a parameter depending on its autocorrelation function; $\boldsymbol{\mu}$ is the vector of model-predicted energies and $\boldsymbol{\Sigma}$ is the error covariance matrix of the problem.

As can be seen from the above figure, the cost function is nonconvex. This nonconvexity calls for complex computational optimization methods to find the global optimum, and therefore the source location. In addition, even if very precise when properly initialized, these complex iterative methods can converge to local extrema which do not reflect the true source location. These are some of the reasons why maximum likelihood methods can be prohibitive for implementation as distributed algorithms in WSNs, given that, more often than not, sensor nodes are built with very basic hardware and limited computational power.

### Linearization: Weighted Least Squares

The optimization problem can be linearized in various ways, so that the nonconvex ML problem can be solved through a (linear) least squares approach, by rewriting the energy expression as:

where $\eta_i$ represents the LS equation error, normally distributed with the moments presented in the previous section. Rearranging and expanding, we then have:

The LS error in this equation system is not identically distributed across the set of equations. As can be seen, its magnitude depends not only on the noise power, but also on the relative distance from the source to the i-th measuring node. This naturally leads to a weighted LS problem, whose solution is, as usual, of the form:

$$ \hat{\boldsymbol{\theta}} = \left( \mathbf{A}^{T} \mathbf{W} \mathbf{A} \right)^{-1} \mathbf{A}^{T} \mathbf{W} \mathbf{b} $$

where $\mathbf{W}$ is the normalized weight matrix of the problem, which can be estimated locally using only the energy measurements and the estimated noise power [4]. This matrix can also be considered diagonal if we neglect the cross-correlation terms, which is usually an acceptable simplification. Under these conditions, the weighted least squares problem can be solved iteratively [6], [1].
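A minimal batch sketch of such a weighted least squares solve (hypothetical names; a diagonal weight matrix is assumed). Down-weighting an equation with a large error variance keeps it from corrupting the solution:

```python
import numpy as np

def solve_wls(A, b, w):
    """Weighted least-squares solution (A^T W A)^{-1} A^T W b,
    with W = diag(w) (cross-correlation terms neglected)."""
    W = np.diag(w)
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

# Toy system with exact solution theta = [1, -2]; the last equation is
# corrupted and assigned a tiny weight.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
theta_true = np.array([1.0, -2.0])
b = A @ theta_true
b[3] += 5.0                           # corrupted measurement
w = np.array([1.0, 1.0, 1.0, 1e-6])   # near-zero weight for the bad row
theta = solve_wls(A, b, w)            # ≈ [1, -2]
```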

### Adaptive algorithms in Wireless Sensor Networks

There are many methods for solving linear systems recursively. In particular, the update equations of various types of adaptive filters can be interpreted as iterative algorithms for accomplishing such a task, where each new measurement contributes a new equation or row to the system.

In the context of Wireless Sensor Networks, a running estimate of the final solution is propagated through the network and updated with the local measurements of each node. This propagation scheme can also follow a certain hierarchy among the nodes to guarantee (or accelerate) convergence and optimize transmission power.
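One common realization of such a per-node update is the recursive (weighted) least-squares recursion, sketched below with hypothetical names. Each call folds one node's equation into the running estimate without ever inverting the full system matrix:

```python
import numpy as np

def rls_update(theta, P, a, b, w=1.0):
    """One recursive weighted LS step: fold the equation a @ theta ≈ b,
    with weight w, into the estimate theta and its covariance P."""
    Pa = P @ a
    gain = w * Pa / (1.0 + w * (a @ Pa))   # adaptation gain vector
    theta = theta + gain * (b - a @ theta)
    P = P - np.outer(gain, Pa)
    return theta, P

# Each "node" contributes one row of a consistent system with
# solution [3, -1].
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([3.0, -1.0, 2.0])
theta = np.zeros(2)
P = 1e6 * np.eye(2)                        # large initial covariance (weak prior)
for a_i, b_i in zip(A, b):
    theta, P = rls_update(theta, P, a_i, b_i)
```

Only `theta` and `P` need to travel from node to node, which is what keeps the transmission cost low.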

## Results

The performances of different energy-based localization methods and one linearized TDOA method [3] are compared in terms of the RMS error over Monte Carlo simulations.

### Experiment Description

The sensor positions are randomly placed on a 20×20 m square surface, forming an arbitrary topology. The source performs a random trajectory over this area as well. The network size is varied from 6 to 10 nodes. The emitted acoustic signal is white noise generated from a pre-established power spectral density with 1 kHz bandwidth. A sampling frequency of 5 kHz is used at the measuring nodes over a period of one second (5000 samples), with a signal-to-noise ratio of 20 dB.

#### Root Mean Square Error

The RMS error in meters is calculated for the Monte Carlo simulations according to the following formula:

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{M P} \sum_{m=1}^{M} \sum_{p=1}^{P} \left\lVert \hat{\boldsymbol{\rho}}_{mp} - \boldsymbol{\rho} \right\rVert^{2} } $$

where $M$ is the number of different random network topologies (sensor placements) used and $P$ the number of simulations carried out for each topology; $\hat{\boldsymbol{\rho}}_{mp}$ is the location estimate given by each of the tested methods in each simulation and $\boldsymbol{\rho}$ the true source position.
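The metric amounts to averaging squared position errors over all runs before taking the square root; a small sketch (hypothetical names):

```python
import numpy as np

def rmse(estimates, true_positions):
    """RMS localization error in meters over all topologies and runs.
    Both arguments have shape (runs, 2)."""
    err = np.asarray(estimates) - np.asarray(true_positions)
    return np.sqrt(np.mean(np.sum(err ** 2, axis=1)))

est = np.array([[1.0, 1.0], [2.0, 2.0]])
true = np.array([[1.0, 0.0], [2.0, 2.0]])
e = rmse(est, true)   # sqrt((1 + 0) / 2) ≈ 0.707
```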

### Centralized Methods

The centralized methods consist of transmitting all the raw measurements from each sensor node to a fusion center and solving the system of equations in a batch (non-iterative) procedure. This might not be possible in situations where computational power is restricted and transmission time must be minimized to save energy at the sensors. However, they provide the performance limit for distributed (iterative) algorithms that propagate the current location estimate across the network.

The centralized acoustic source localization methods compared here are:

- A linear nonweighted least squares optimization problem (LS_OS), similar to that implemented in [4].
- A linear weighted direct method (LS_WD) [4] following the extended propagation model described above.
- A correction method based on the direct method for further improving the estimation in some cases (LS_WDC).
- A linearization of the TDOA localization problem as described in [3].

This comparison will be the basis for determining the theoretical performance limit of some distributed versions [1] of these methods, more suitable for implementation on low-cost WSNs with no fusion center.

#### Root mean squared localization error (RMSE) with varying SNR

The parameters with the most influence on the energy-based localization methods are the signal-to-noise ratio and the number of samples used for the energy measurement. It can clearly be seen that below the 10 dB mark the LS-WDC (corrected, weighted) method deteriorates even more than the nonweighted schemes, showing that bad energy readings due to noise directly affect the estimation of the weights for the WLS problem. Nonetheless, the weighted schemes remain vastly superior at higher SNR in all cases. The TDOA estimation (green) is expectedly better than all the EB methods for SNR > 15 dB, given that it is an intrinsically more accurate, time-based method, but at the same time it requires more resources in terms of synchronization and data transmission, as discussed previously. Below the 10 dB mark the TDOA method fails considerably because the maximum of the cross-correlation function no longer corresponds to the true time delay between the two signals received at different sensors, resulting in a bogus estimate of the true position.

#### Root mean squared localization error (RMSE) with varying network size

The improvement of the direct, weighted methods over the nonweighted schemes becomes noticeable as the number of sensors in the network grows. Given the geometric considerations behind the nonweighted localization method, based on energy-reading ratios over pairs of sensors [4], [5], the variance and covariance of the measurements can be considerably high when two energy ratios from different sensors are similar in magnitude, degrading performance. On the other hand, the weighted schemes -as derived in the previous section- do not rely on these energy ratios between pairs of sensors, so the direct weighted method is not as heavily affected by this phenomenon. The chance of having two different sensors with similar energy readings increases with the number of sensors, so the performance of nonweighted, ratio-based schemes reaches a limit very quickly. This is not the case with the weighted direct methods.

#### Root mean squared localization error (RMSE) with varying sample amount

The number of samples greatly influences the performance of the weighted schemes, as seen above. The weighted schemes outperform the nonweighted methods given a sufficient number of samples for the weight estimation; an insufficient number of samples, however, can considerably reduce performance. This is evidenced by the theoretical weighted method curve (red) and the estimated-weight method (cyan), which differ significantly for fewer samples. The case where fewer than 10 samples are used for the energy calculation is nonetheless unlikely in any application. On the other hand, the TDOA method has an accuracy threshold corresponding to the case where the number of samples used (at a given sampling rate) for the cross-correlation function is not enough to reflect the propagation time of the sound wave from the source to each of the sensors.

### Distributed (adaptive) Methods

As mentioned above, the performance analysis of the centralized methods provides a limit to which the distributed methods converge. The importance of implementing localization algorithms in an adaptive form comes from the fact that the hardware used might not be powerful enough to carry out all the computations at once. Among the prohibitive mathematical operations are the large matrix inversions required for solving the equation systems arising from the measurements. These iterative methods are often suitable for implementation in a collaborative, distributed manner on a WSN, where each node only performs a relatively simple mathematical operation based on data previously calculated elsewhere in the network and on its own measurements. The resulting updated estimator is then transmitted to other nodes and the process repeats itself until a certain threshold (usually set by convergence criteria) is reached.

There are several adaptive, sequential methods for solving localization problems in a distributed manner. These methods usually focus on optimizing design parameters related to the data communication overhead across the network, the mathematical operations required of local node hardware, and the precision and speed (number of iterations) of convergence. One of the most popular adaptive methods successfully implemented on WSNs is the method of Projection Onto Convex Sets, or POCS [7]. This method proved to be one of the most efficient in terms of achieving good precision of the location estimates under the design constraints just mentioned, in particular requiring low communication overhead among nodes and relatively low computational complexity for the estimator updates at each node. There are, nevertheless, a few problems associated with POCS-based localization.

- Speed of convergence can easily deteriorate if the relaxation parameters of the algorithm are not properly initialized, resulting in unnecessary transmissions across the network for reaching a solution within the restrictions set by the convergence criterion.
- The localization method on which the POCS is implemented is nonlinear.
- The statistic or attribute that provides the basis for the localization method over which POCS is applied is an energy ratio between two different energy measurements at two different sensors, if the source emission power is unknown [5], [7]. Therefore, for a network or cluster (subgroup of nodes participating in the localization process) comprised of N nodes, the total number of energy pairs (ratios) is N(N-1)/2. This means that, for the algorithm to be properly initialized and the measurements to be available at every node (for subsequent estimator updates), the same (minimum) number of initial transmissions needs to take place, greatly enlarging the communication overhead. The number of transmissions can become prohibitively large as the number of sensors N increases. Accordingly, a localization method that uses direct energy measurements (as opposed to energy ratios) is preferred.
- As explained before, when an energy-ratio approach is used for source localization, a large variance in the estimation can appear if two energy readings at different sensors are similar (i.e. an energy ratio close to 1), greatly deteriorating the estimation and/or convergence. The chance of having similar energy readings at two different sensors increases with the number of sensors in the network, and the convergence of the POCS algorithm worsens.
- As usually implemented, the POCS algorithm for passive acoustic source localization based on energy readings does not incorporate the extended model (i.e. it does not consider weights on the associated measurements in the optimization problem), therefore degrading the precision of the localization.
- In order for the POCS method to converge, the acoustic source must be inside the convex hull of the sensor node positions.
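The core POCS iteration can be sketched as cyclic projections onto disks whose radii are derived from the energy readings. This is a simplified, noiseless illustration with hypothetical names, assuming known source power and unit gains, not the exact algorithm of [7]:

```python
import numpy as np

def project_onto_disk(x, center, radius):
    """Project point x onto the disk {p : ||p - center|| <= radius}."""
    v = x - center
    d = np.linalg.norm(v)
    return x if d <= radius else center + radius * v / d

# Idealized setup: noiseless energy readings y_i = S / d_i^2, so each
# reading defines a disk of radius d_i = sqrt(S / y_i) around sensor i.
source = np.array([3.0, 4.0])
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
S = 100.0                                  # known source power (assumed)
d_true = np.linalg.norm(sensors - source, axis=1)
y = S / d_true ** 2                        # energy readings
radii = np.sqrt(S / y)                     # distance estimates from energies

x = np.array([5.0, 5.0])                   # initial location estimate
for _ in range(200):                       # cyclic projections onto the disks
    for c, r in zip(sensors, radii):
        x = project_onto_disk(x, c, r)
```

With noiseless readings the disks intersect only at the source, so the cyclic projections converge there; note that this also requires the source to lie inside the convex hull of the sensors, as stated above.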

On the other hand, an iterative method for solving the linear WLS system of equations described in the previous section can be constructed [1]. This method is direct, in the sense that it only requires local measurements for the basic statistic used in the estimation (there is no need to transmit values across the network to form energy ratios), so its convergence and precision are not affected by similar energy measurements at two different sensors, as is the case with the POCS method. The method is a well-known iterative algorithm for solving weighted least squares problems. At each iteration, an estimate of the location is issued along with the corresponding variance of the estimator; these two values are successively updated across the network with new measurements until a convergence criterion is met. The weights are, as described, estimated from the measurements, and can heavily affect the speed of convergence if they are not accurate enough in a low-SNR environment. The total number of possible iterations is, in this case, equal to the number of sensors participating in the localization process. This follows from the fact that, in the direct method, each measurement contributes one equation per node, totaling N equations, as opposed to the energy-ratio approach, in which a total of N(N-1)/2 equations -and accordingly the same number of iterations- can be used.

#### RMS error vs signal-to-noise ratio for adaptive algorithms

For the RMSE vs SNR scenario, a network of 10 sensors randomly distributed over a 20×20 m surface, along with a random deployment of the sound source, is simulated over Monte Carlo runs. The number of samples used for calculating the energy is 5000.

The inclusion of weights in the LS problem due to the extended signal model is an advantage at high SNR, as expected, since the background noise terms are no longer dominant in the energy measurement variance, giving way to the terms that depend on the source-sensor distance, which are compensated by the weights. Note also that a bad SNR leads to erroneous estimation of the weights, which can result in poorer performance than the unweighted method.

#### RMS error vs number of sensors for adaptive algorithms

Regarding the number of sensors in the network, the POCS method, the proposed adaptive WLS method and a centralized WLS batch method (for control) are analyzed in terms of RMS error, in meters, on a 20×20 m surface with a signal-to-noise ratio of 30 dB and 5000 samples used for each energy calculation.

It can clearly be seen that the POCS method reaches an RMSE limit for the given experiment parameters once the network has more than 11 participating sensors. This is due to the previously explained phenomenon: a higher chance of large variances caused by similar energy readings at different sensors, which induce energy ratios near 1. This is not the case for the direct adaptive WLS method (cyan), where equal energy readings do not degrade accuracy, since each measurement updates the estimator independently instead of through energy ratios. The centralized method provides the performance limit for the WLS method, and lies well below the adaptive estimates.

### Convergence vs estimator path

One of the determining factors to economize when dealing with collaborative algorithms in Wireless Sensor Networks is the total transmission time and data overhead. The amount of data and the distance over which it is transmitted are key factors in power consumption. Given that these networks are usually composed of autonomous, battery-powered nodes, it is of great interest to design algorithms that extend their autonomy as far as possible. To maximize battery life, the data (i.e. the current state of the estimate) needs to be transmitted along an optimum path. Since the transmission cost is usually proportional to the Euclidean distance over which the data travels, this optimization can be carried out with techniques for solving the so-called Travelling Salesman Problem, which consists in finding the shortest possible route that visits each member of a given set of points in space exactly once. This way, the estimator is propagated (and updated) across the network in the most efficient way in terms of distance, and accordingly of transmission cost.
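As an illustration, the sketch below orders the nodes with a simple nearest-neighbor heuristic (hypothetical names; not the convex hull insertion algorithm used in the simulations), which yields a short, though generally sub-optimal, propagation route:

```python
import numpy as np

def greedy_route(points, start=0):
    """Nearest-neighbor heuristic for a short propagation route that
    visits every node exactly once (not the optimal TSP tour)."""
    n = len(points)
    unvisited = set(range(n)) - {start}
    route = [start]
    while unvisited:
        last = points[route[-1]]
        nxt = min(unvisited, key=lambda j: np.linalg.norm(points[j] - last))
        route.append(nxt)
        unvisited.remove(nxt)
    return route

# Four nodes on a line: the greedy route hops to the nearest each time.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [2.0, 0.0]])
route = greedy_route(nodes)   # visits 0, 1, 3, 2 along the line
```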

On the other hand, the optimal propagation path for the estimator in terms of transmission cost might not be optimal for convergence speed or smoothness. Given the structure of the signal model on which the adaptive WLS algorithm is based [1],[4], the process will converge smoothly only if the estimator is propagated in such a way that the LS error on each measurement is decreasing in magnitude [1]. In other words, given that the LS error depends on the distance from the source, the estimator needs to be propagated from the closest measurement (i.e. highest energy reading) to the weaker ones, in which the variance is larger. This way, the adaptive gain of the algorithm can properly predict the noise present in the next measurement, thus making the algorithm converge smoothly.

RMSE vs number of iterations is shown below for the estimator propagation path above. For N=30 active sensor nodes in the network, the POCS method has a maximum of N(N-1)/2=435 possible iterations, while the direct adaptive WLS algorithm has only 30. The convergence of the WLS method is smoother and faster. However, the sub-optimal propagation route chosen here makes it unfeasible and/or inefficient for implementation on a battery-powered wireless sensor network.

The cyan and purple lines correspond to the theoretical limits -calculated in a one-shot, centralized way- for each adaptive algorithm (i.e. the global solution). The theoretical, nonlinear solution for POCS is found using a local maximum search routine based on the Nelder-Mead method and is usually more accurate than the linear methods, albeit requiring more computational power.

If an optimization process is carried out beforehand to choose the propagation route, the transmission energy for the localization process is minimized, but convergence (especially of the WLS method) can fail or be heavily perturbed. In this case, the simulation solves the Travelling Salesman Problem using a convex hull insertion algorithm.

This is because, as explained before, the adaptation gain cannot correctly predict the noise level of the next sample: the variance magnitudes no longer vary smoothly between measurements, as they do along a decreasing-energy propagation path. The result is minimum transmission cost but harsh convergence, with a tendency toward large instantaneous deviations of the estimator from the true location (figure below).

### Improving convergence by estimating weights

In summary so far, there are two possible cases concerning the balance between efficient data transmission and smooth convergence of the distributed WLS algorithm. In the first case, the estimator is propagated so as to visit the nodes in order of ascending measurement variance, and it converges smoothly. In the second, the estimate is propagated so as to optimize energy (transmission) consumption. Along this route, the variances of the measurements can be dissimilar, so the adaptation gain may not act as a proper buffer for sudden changes in variance (or sample noise) and the algorithm cannot properly adapt or predict as it is updated across the sensors. The problem with the second case can nevertheless be alleviated by properly estimating the associated weights.

As can be seen from the associated LS error equations, the normalized theoretical weight matrix that counteracts the uneven distribution of the measurement variance contains elements which depend mostly on the distance between source and sensors. Given that the actual source-sensor distance is not available during the localization process, these weights need to be estimated from measurements within each node [4]. **The accuracy of these estimated weights determines not only the quality of the final location estimate, but also the adaptive ability of the algorithm in the face of abrupt changes in variance, as is the case when using an optimized route.**

As shown in the centralized case for the influence of the number of samples used in weight estimation on location accuracy, the same procedure can be used to establish the performance limit of the WLS algorithm with theoretical weights, based on the true distance between source and sensor nodes.

## References

[1] Delgado, P., “Localización acústica de fuentes móviles mediante redes inalámbricas de sensores,” Undergraduate Thesis, Universidad de Buenos Aires, Dept. of Electronics, Mar. 2014.

[2] Delgado, P., “Méthodes de Géolocalisation: Théorie et implémentation avec réseaux de capteurs sans fil,” Engineering Project Report, École Nationale Supérieure d’Ingénieurs de Caen, Feb. 2013.

[3] Yiteng Huang; Benesty, J.; Elko, G. W.; Mersereau, R. M., “Real-time passive source localization: a practical linear-correction least-squares approach,” *IEEE Transactions on Speech and Audio Processing*, vol. 9, no. 8, pp. 943–956, Nov. 2001.

[4] Meesookho, C.; Mitra, U.; Narayanan, S., “On Energy-Based Acoustic Source Localization for Sensor Networks,” *IEEE Transactions on Signal Processing*, vol. 56, no. 1, pp. 365–377, Jan. 2008. doi: 10.1109/TSP.2007.900757.

[5] Xiaohong Sheng; Yu-Hen Hu, “Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks,” *IEEE Transactions on Signal Processing*, vol. 53, no. 1, pp. 44–53, Jan. 2005.

[6] Kay, S., *Fundamentals of Statistical Signal Processing*, Prentice-Hall PTR, 1998.

[7] Blatt, D.; Hero, A. O., “Energy-based sensor network source localization via projection onto convex sets,” *IEEE Transactions on Signal Processing*, vol. 54, no. 9, pp. 3614–3619, Sept. 2006.

# Conspiracy Against Truth: European tour

Conspiracy Against Truth embarked on its first European tour in September and October 2013, playing more than 15 dates across Germany, France, Belgium and the Netherlands. The tour was sponsored by the Argentine Ministry of Foreign Affairs and Worship as a cultural outreach initiative.