The use of timbre-sensitive visualization of sound corpora in creative practice

gutmandres
Dec 7, 2024
Updated: Jul 3

For some time, I've been working with the Fluid Corpus Manipulation (FluCoMa) package in my compositional practice. This library performs advanced timbre analysis on a defined sound corpus and creates a two-dimensional visual representation of the collection, in which each point on the plot represents an analyzed sound segment. The spatial arrangement of the points results from this analysis: the distance between two points represents their computed similarity or difference in terms of the descriptors used.

In this plot, the analysis was performed with the first thirteen MFCCs. Mel-frequency cepstral coefficients are widely used in music information retrieval (MIR) to represent timbre, and they correlate strongly with human perception. For each sound segment, the analysis averages the MFCC values, computed with a preset window size, and stores one value per coefficient. The multidimensional data is then reduced using a multidimensional scaling algorithm (FluidMDS), so that the distance between points is the normalized computed distance from the MFCC analysis. This is a very effective way of sorting sounds by timbral similarity and difference, regardless of pitch. The method follows a procedure similar to that of timbre-perception experiments based on dissimilarity-rating tasks. In those cases, the distances are the average perceptual distances between the sounds.
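As a rough illustration of the steps described above (not FluCoMa's own code), the following sketch averages per-frame coefficients for each segment, computes normalized pairwise distances, and projects them to two dimensions with classical (Torgerson) MDS. The segment names and the three-coefficient frames are invented toy data standing in for thirteen MFCCs, and the choice of Euclidean distance with classical MDS is an assumption; FluidMDS may differ in its details.

```python
import math
import random

# Hypothetical toy corpus: segment name -> per-frame coefficient vectors.
# Three coefficients per frame stand in for the thirteen MFCCs.
segments = {
    "bari_01": [[1.0, 0.2, 0.0], [1.2, 0.0, 0.0]],
    "bari_02": [[1.0, 0.4, 0.1], [1.0, 0.2, 0.1]],
    "sop_01": [[-1.0, 0.9, 0.0], [-0.8, 1.1, 0.0]],
    "sop_02": [[-1.1, 1.2, 0.1], [-0.9, 1.0, 0.1]],
}


def mean_vector(frames):
    """Average a list of per-frame coefficient vectors into one vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def distance_matrix(points):
    """Pairwise Euclidean distances, scaled so the largest distance is 1."""
    d = [[math.dist(p, q) for q in points] for p in points]
    largest = max(max(row) for row in d) or 1.0
    return [[v / largest for v in row] for row in d]


def top_eigenpair(m, iterations=500, seed=0):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix,
    found by power iteration."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in m]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in m]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(a * b for a, b in zip(row, v)) for row in m]
    return sum(a * b for a, b in zip(v, mv)), v


def classical_mds(d, dims=2):
    """Embed a distance matrix in `dims` dimensions with classical MDS."""
    n = len(d)
    sq = [[v * v for v in row] for row in d]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centring turns squared distances into inner products.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[] for _ in range(n)]
    for _ in range(dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i].append(scale * vec[i])
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


names = list(segments)
means = [mean_vector(segments[name]) for name in names]
distances = distance_matrix(means)
coords = classical_mds(distances, dims=2)

for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

In this toy example the two baritone segments land close together and far from the two soprano segments, which is the behavior the plot relies on.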

In the context of sound generation or live-electronic performance, this tool affords timbre-sensitive concatenative synthesis. However, it can also be a very powerful assistant in the compositional process.
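To give an idea of what timbre-sensitive concatenative synthesis involves, here is a minimal sketch of its core step: for each target position in the map, pick the nearest corpus segment. The segment names, coordinates, and target trajectory are hypothetical, and a real system would also handle audio playback, which is left out here.

```python
import math

# Hypothetical corpus: segment name -> 2D position in the timbre map.
corpus = {
    "bari_03": (0.10, 0.90),
    "tenor_07": (0.15, 0.45),
    "alto_02": (0.55, 0.50),
    "sop_11": (0.60, 0.05),
}


def nearest_segment(target, corpus):
    """Return the name of the segment closest to the target position."""
    return min(corpus, key=lambda name: math.dist(corpus[name], target))


def match_trajectory(targets, corpus):
    """Map each target position to its nearest segment, in order."""
    return [nearest_segment(target, corpus) for target in targets]


# A hypothetical path through the map, e.g. drawn by hand or
# derived from the analysis of a live input signal.
trajectory = [(0.1, 0.8), (0.2, 0.5), (0.6, 0.1)]
sequence = match_trajectory(trajectory, corpus)
print(sequence)  # ['bari_03', 'tenor_07', 'sop_11']
```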

The two panels above show a three-dimensional representation of the saxophone multiphonic collection created by the Quasar saxophone quartet, folded into two 2D plots (with X and Y on the left panel, and Z and Y on the right panel). The colors of the points represent the instruments (red: soprano saxophone, green: alto, blue: tenor, black: baritone), and the shade of each color represents the relative loudness. We can see from the image that the instruments are spread throughout the normalized timbral-distance space and partly differentiated, meaning that the timbral characteristics of each instrument are to some degree identifiable by the computational analysis. The left panel presents a clear separation by instrument, mostly for the baritone saxophone (at the top), the soprano saxophone (at the bottom), and some of the multiphonics of the tenor saxophone (on the left side). However, there are also overlapping areas, which means that in those regions the timbral characteristics are similar across instruments.

As for the acoustic correlates of each dimension, it is sometimes difficult to pinpoint exactly what timbral traits are being represented. By listening to the sounds at the extremes of each axis, one can intuitively identify which traits are salient in each region.
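One simple way to find candidates for this kind of listening is to sort the segments along each axis and take the few lowest and highest. The sketch below assumes hypothetical segment names and 3D coordinates; it is not the data from the plot.

```python
# Hypothetical 3D map positions (x, y, z) per segment.
positions = {
    "bari_03": (0.12, 0.91, 0.40),
    "bari_08": (0.30, 0.84, 0.10),
    "tenor_07": (0.05, 0.47, 0.66),
    "alto_02": (0.58, 0.52, 0.95),
    "sop_04": (0.71, 0.11, 0.33),
    "sop_11": (0.62, 0.04, 0.52),
}


def axis_extremes(positions, axis, count=2):
    """Return the `count` lowest and `count` highest segments on one axis."""
    ranked = sorted(positions, key=lambda name: positions[name][axis])
    return ranked[:count], ranked[-count:]


for axis, label in enumerate("XYZ"):
    low, high = axis_extremes(positions, axis)
    print(f"{label}: low end {low}, high end {high}")
```

Auditioning the two lists for an axis side by side makes it easier to hear which trait changes along that dimension.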

My composition for saxophone quartet, Ebb and Flow (2024), was composed using this timbral representation to select multiphonics according to similar or complementary timbral characteristics. Read the full article about the composition process of Ebb and Flow here.
