|
This page presents Nic_2, a Lego Mindstorms robot able to localize a sound source in space, developed in cooperation with Claude Baumann. It uses quasi-simultaneous sound recordings from two microphones in combination with head movements to gain information about the direction of a single sound source. The robot uses only one RCX, plus an additional multiplexer and amplifier board for the rotation sensors and the microphone signals respectively. It samples sound at 36 kHz on both channels of the amplifier during 8.33 ms (2x300 measurements) and then applies a correlation method to deduce the time difference of arrival (TDOA) of the two signals, the so-called interaural time difference. Using novel theorems (Binaural model for artificial sound localization based on interaural time delays and movements of the interaural axis, Kneip/Baumann), the robot needs only two measurements, one before and one after a rotation of the interaural axis, to deduce the exact direction of sound. Watch the video to get more details of this fascinating robot.
Considering that Lego materials (mechanics and processing unit) are not the most professional or efficient equipment, the ability to determine the direction of incoming sound with such high precision must be credited to the deterministic theorems developed in the scope of this project and presented in the previously mentioned JASA publication.
The following video shows the robot in action. Note that the sound source is situated on the left of the camera:
Lego robots able to localize the direction of sound already have a longer history at the Convict Episcopal of Luxembourg. A detailed listing of the previous work may be found here. However, none of these robots was able to deterministically localize the spatial direction of a continuous sound source. Nic_2 is the first completely autonomous Lego robot able to solve this problem.
The page also presents Nic_3, a counterpart to Nic_2 whose mechanics have been built with the new Lego NXT toolkit.
Nic_2 is equipped with all three rotational degrees of freedom. The robot is able to rotate the interaural axis around the indicated x, y and z axes. Moreover, Nic_2's head rotations preserve the origin, that is, the midpoint between the two microphones. The x- and the y-axis are driven by pairs of shafts that are twisted against each other, so that the gears are kept under torsion, which eliminates dead zones when the direction of rotation is reversed. These careful measures reduce the error to less than 3° for any degree of freedom. However, they also increase friction, so the mechanics had to be geared down sufficiently. The LEGO rotation sensors are directly coupled to the motors. Nic_2 is equipped with two electret microphones that point in the y-direction and are separated by a distance of 16 cm. Because the microphones are omnidirectional, Nic_2 does not automatically resolve the front-back ambiguity: provided the signal's frequency is not too high, sound waves arriving from the rear are not perceived as weaker. A general threshold for the signal strength has been fixed by software.
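To get a feeling for the numbers involved, the following Python sketch combines the 16 cm microphone spacing with the 36 kHz sampling rate and the 300 samples per channel given on this page. The speed of sound (343 m/s) is an assumed value, and the angle formula is the ordinary far-field approximation sin(theta) = c*tau/d, not one of the theorems of the JASA paper. All function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 36_000     # per channel, from the text
SAMPLES_PER_CHANNEL = 300   # from the text (2x300 measurements)
MIC_SPACING_M = 0.16        # from the text
SPEED_OF_SOUND = 343.0      # m/s, ASSUMED (air at roughly 20 degrees C)


def record_duration_ms():
    """Length of one recording in milliseconds."""
    return 1000.0 * SAMPLES_PER_CHANNEL / SAMPLE_RATE_HZ


def max_lag_samples():
    """Largest possible time difference of arrival, in sample periods.

    It occurs when the sound travels along the interaural axis.
    """
    return MIC_SPACING_M / SPEED_OF_SOUND * SAMPLE_RATE_HZ


def angle_from_lag_deg(lag_samples):
    """Far-field angle between the source direction and the plane
    perpendicular to the interaural axis, for a given lag in samples."""
    ratio = SPEED_OF_SOUND * (lag_samples / SAMPLE_RATE_HZ) / MIC_SPACING_M
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))
```

Under these assumptions one recording lasts about 8.33 ms, the time difference never exceeds roughly 17 sample periods, and a lag of a single sample near the frontal direction corresponds to about 3.4 degrees.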
The RCX offers only 3 input ports and 3 output ports to the user. Because the declared goal was to realize a low-cost
device with a single microcontroller system, some analog electronics had to be added to amplify both audio
signals (wired to input ports 2 and 3) and to multiplex the 3 rotation sensors, which are thus connected together
to the remaining input port 1. (A microcontroller system that offers more input ports to the user would make multiplexing unnecessary.) The three
motors are directly controlled by the RCX. The multiplexer with amplifier may be regarded as a combination of the following two projects:
Nic_2’s task can be summed up as: turn towards the sound direction! As mentioned above, Nic_2 uses microphones with omnidirectional characteristics, which do not automatically resolve the front-back ambiguity. Sound source locations are therefore restricted to the front region, which allows the problem to be solved using only Theorems A and C. Essentially, the robot determines the interaural time delay before and after a rotation about the y-axis and then uses Theorem C of the JASA article to compute the exact direction of the sound source, that is, its azimuth and elevation. The correlation method used to determine the phase shift between the amplified signals is a variant of the SAD algorithm (sum of absolute differences).
The step-by-step program flow looks like the following:
-Robot initialized with x-axis in horizontal position
If the robot is finally asked to turn its head into the direction of sound, it must rotate the head around the z-axis and the x-axis, with only 2 degrees of freedom being activated. The third degree of freedom has been used to perform the y-rotation and determine the elevation without ambiguity. The resulting head position fulfils the condition that the y-axis points towards the sound source.
The software has been realized with Ultimate Robolab, a great programming environment for the graphical creation of powerful RCX firmware. This environment also provided the mathematical library needed to evaluate the trigonometry of the theorems.
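The sum-of-absolute-differences idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the general SAD technique, not the variant running on the RCX, whose details are not given here. The function names, the normalisation by the overlap length and the synthetic test signal are all assumptions of this sketch.

```python
import random


def sad_lag(left, right, max_lag):
    """Return the integer lag (in samples) for which left[i] best matches
    right[i + lag], judged by the mean absolute difference.

    A positive result means the right channel is delayed with respect to
    the left one.
    """
    n = len(left)
    if len(right) != n or not 0 <= max_lag < n:
        raise ValueError("channels must have equal length and max_lag < length")
    best_lag, best_score = 0, float("inf")
    for lag in range(-max_lag, max_lag + 1):
        start = max(0, -lag)        # first index with right[i + lag] valid
        stop = min(n, n - lag)      # one past the last valid index
        total = sum(abs(left[i] - right[i + lag]) for i in range(start, stop))
        # Assumption of this sketch: divide by the overlap length so that
        # large lags (fewer terms) are not favoured.
        score = total / (stop - start)
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag


def make_delayed_pair(delay, n=300, pad=32, seed=0):
    """Hypothetical helper: two channels cut from one noise signal, with the
    right channel delayed by `delay` samples (|delay| must not exceed pad)."""
    rng = random.Random(seed)
    source = [rng.randint(0, 1023) for _ in range(n + 2 * pad)]
    left = source[pad:pad + n]
    right = source[pad - delay:pad - delay + n]
    return left, right
```

For example, `sad_lag(*make_delayed_pair(5), max_lag=17)` recovers the delay of 5 samples. The search range of 17 samples matches the largest lag that a 16 cm microphone spacing can produce at 36 kHz if the speed of sound is taken as 343 m/s.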
Tests have been made in an ordinary reverberant room with a volume of 50 m³. The approximate distance to the sound source was 1 m. The sound is emitted by a small mono radio playing music, representing an immobile audio source. Depending on the position of the sound source relative to the reference frame of the robot head, the sound waves arrive at the microphones with different delays. The final results of the complete measurements for several sound source locations are shown below. As we can see, errors stay below 10 degrees, which is a very good performance considering the hardware limitations of the system.
-The robot presented here does not solve the sound localization problem without ambiguity when the microphones have no directional sensitivity. Applying
only Theorem C of the JASA article (rotation about the y-axis) leaves the front-back ambiguity unresolved. A combination with a rotation about
the z-axis could easily be used to resolve the remaining ambiguity. Other scenarios may be imagined in which the up-down ambiguity is already resolved (no sound source
possible below ground level). A single rotation about the z-axis would then be sufficient. Some of these ideas, as well as the use of translational movements to
determine the distance to the sound source, are presented in the
JASA paper as well, so if you are interested in sound localization, it is certainly worth having a look at it.
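The ambiguity discussed above can be illustrated with plain far-field geometry. The sketch below is not a reproduction of the theorems of the JASA paper. It assumes that the interaural axis initially lies along x, that y points forward and z upward, that rotations are right-handed, and that the normalised delay c*tau/d equals the projection of the unit source direction onto the interaural axis. All names are hypothetical.

```python
import math


def itd_ratio(source, axis):
    """Normalised delay c*tau/d: projection of the unit source direction
    onto the unit vector along the interaural axis (far-field assumption)."""
    return sum(s * a for s, a in zip(source, axis))


def axis_after_y_rotation(beta):
    """Interaural axis (initially along x) after rotating the head by beta
    radians about the y-axis (assumed right-handed)."""
    return (math.cos(beta), 0.0, -math.sin(beta))


def axis_after_z_rotation(gamma):
    """Interaural axis (initially along x) after rotating the head by gamma
    radians about the z-axis (assumed right-handed)."""
    return (math.cos(gamma), math.sin(gamma), 0.0)


def candidates_from_two_delays(q1, q2, beta):
    """Recover the source direction from the normalised delays measured
    before (q1) and after (q2) a y-rotation by beta (sin(beta) != 0).

    Returns the front and the back candidate: x and z are determined,
    but only the magnitude of y is.
    """
    sx = q1
    sz = (q1 * math.cos(beta) - q2) / math.sin(beta)
    sy = math.sqrt(max(0.0, 1.0 - sx * sx - sz * sz))
    return (sx, sy, sz), (sx, -sy, sz)


def azimuth_elevation_deg(direction):
    """Azimuth measured from the y-axis towards x, elevation above the
    x-y plane, both in degrees."""
    sx, sy, sz = direction
    return math.degrees(math.atan2(sx, sy)), math.degrees(math.asin(sz))
```

In this model both candidates produce exactly the same pair of delays for a rotation about the y-axis, which is why the source has to be restricted to the front region. After a rotation about the z-axis the two candidates give different delays, so one additional measurement of that kind would separate them.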
|