Description
This application is a spectrum analyzer that displays the frequency spectrum of an audio stream in real-time, with optional effects on the audio stream.
Overview
This application is a spectrum analyzer that displays the frequency spectrum of an audio input in real-time, with optional hard clip distortion and frequency shift effects applied to the audio. It provides recording functionality and will also show the peak frequency with its equivalent note name.
Idea
Initially, I had the idea of creating a distortion effect, and I wanted to add a visualization of the effect it had on the audio. I decided to create a spectrum analyzer that would allow the user to see the frequency reflect the changes that distortion adds to the mix. I then decided to add the frequency shift to take advantage of the spectrum view and show those changes as well.
Features
- Real-time frequency spectrum analyzer
- Solid and line spectrum view
- Microphone on/off toggle
- Output mute/unmute toggle
- Spectrum freezing
- Frequency shifting
- Hard clip distortion effect
- Gain control
- Keybind menu
- Dynamic UI resizing/scaling
- Recording to .wav file (will create `out/` folder if it doesn't exist)
Keybinds & Controls
Click and drag a knob, or scroll the mouse wheel over it, to adjust its value. Click a button to toggle its state.
- `ESC` to quit
- `V` to toggle between line and solid spectrum view
- `N` to toggle microphone on/off
- `M` to toggle mute/unmute
- `F` to freeze the spectrum
- `R` to record audio to .wav file
- `SHIFT + LEFT CLICK` to reset a knob to its default value
- `CTRL + LEFT CLICK` to allow finer control of a knob
- `RIGHT CLICK` on FREEZE to toggle freeze mode (other parameters can be adjusted while frozen)
Requirements
Note: The executable version does not require the Python dependencies, only mic/DI.
Usage
Clone the repository and run `py spectrumtool.py` **OR** download the `SpectrumTool.exe` executable from the Releases section.
The executable is a standalone application and does not require the `assets/` folder or the `out/` folder to be present in the same directory.
Technical Foundations & Implementation
This section explains the technologies and concepts used in the project and how they are applied.
Audio I/O
The PyAudio library handles the audio input and output streams. The application opens an input stream on the default audio device with parameters such as `rate` (sample rate) and `frames_per_buffer` (buffer size), then reads the contents of each buffer into an `np.array`. The format is 16-bit signed integer, giving each sample a range of -32768 to 32767. This is the audio used by the signal processing effects further down the chain, explained in the following sections.
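As a minimal sketch of the read step described above: the sample rate, buffer size, mono channel count, and both function names are assumptions for illustration, not values taken from the application. The PyAudio import is kept inside `read_one_buffer` so the byte conversion can be tried without an audio device.

```python
import numpy as np

RATE = 44100              # sample rate in Hz (assumed value)
FRAMES_PER_BUFFER = 1024  # buffer size in frames (assumed value)


def buffer_to_array(raw: bytes) -> np.ndarray:
    """Interpret raw 16-bit signed PCM bytes as a NumPy int16 array."""
    return np.frombuffer(raw, dtype=np.int16)


def read_one_buffer() -> np.ndarray:
    """Open the default input device and read one buffer (needs a mic/DI)."""
    import pyaudio  # imported here so buffer_to_array works without PyAudio

    pa = pyaudio.PyAudio()
    stream = pa.open(
        format=pyaudio.paInt16,   # 16-bit signed samples
        channels=1,               # mono input (assumption)
        rate=RATE,
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
    )
    try:
        raw = stream.read(FRAMES_PER_BUFFER)
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
    return buffer_to_array(raw)
```

A real-time application would keep the stream open and call `stream.read` once per frame rather than reopening the device each time.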
Frequency Spectrum and FFT
The frequency spectrum is created by performing a fast Fourier transform (FFT) on the audio after the effects chain using the `np.fft.rfft` function. As we learned in this course, this converts a time/amplitude representation of audio into a frequency/amplitude representation.
The points that this function creates are spaced linearly in frequency, which would limit the visibility of most of the spectrum range to only the high end. To account for this, I created a logarithmic scale that is mapped to the x-axis of the display at the given size, and the result is saved to the `points` list of tuples.
The output of this function is then adjusted to fit the width and height of the window, and is drawn onto the screen using the `pygame.draw.aalines` (antialiased line) or `pygame.gfxdraw.polygon` (solid) functions, depending on user selection.
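The FFT and logarithmic mapping described above could be sketched roughly as follows. The function name `spectrum_points`, the 20 Hz lower bound, and normalizing to the tallest bin are assumptions made for this example; the application may scale its values differently.

```python
import numpy as np


def spectrum_points(samples, rate, width, height, f_min=20.0):
    """Return (x, y) screen points for the magnitude spectrum of `samples`."""
    magnitudes = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)

    # Skip bins below f_min (including DC), since log(0) is undefined
    keep = freqs >= f_min
    freqs, magnitudes = freqs[keep], magnitudes[keep]

    # Logarithmic x-axis: f_min maps to 0, the Nyquist frequency to width - 1
    f_max = rate / 2.0
    x = np.log10(freqs / f_min) / np.log10(f_max / f_min) * (width - 1)

    # Normalize to the tallest bin so nothing lands off-screen (assumption)
    peak = magnitudes.max()
    norm = magnitudes / peak if peak > 0 else magnitudes
    y = (height - 1) * (1.0 - norm)  # screen y grows downward

    return list(zip(x.tolist(), y.tolist()))
```

A list in this shape can be passed as the point list to `pygame.draw.aalines`. With a linear axis, half of the window would be spent on the top octave alone, which is why the log mapping matters for audio.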
The spectrum is drawn several times at decreasing amplitudes so that it lingers on the screen for a while before dropping to zero. The visual representation of the amplitude is scaled according to the height of the window, which prevents any values from going off-screen.
DSP
The digital signal processing effects used in my application are frequency shifting, hard clip distortion, and gain control.
Frequency Shifting
The first effect in the chain is frequency shifting. This is done using the `np.roll` function on the frequency spectrum of the input data, which shifts the values in the array by an amount controlled by the `SHIFT` dial. The default value is zero, and the signal can be shifted by ±24 Hz. Note that this shifts by Hz and not by semitones.
I chose frequency shifting over pitch shifting as it can make some very interesting sounds with vocals or different instruments.
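A rough sketch of spectrum rolling is shown below. The function name `frequency_shift` is hypothetical, and zeroing the bins that wrap around is an assumption added here, since `np.roll` is circular and would otherwise move the top of the spectrum to the bottom (or the reverse). Each bin is `rate / len(samples)` Hz wide, so rolling by `k` bins shifts every component by `k * rate / len(samples)` Hz.

```python
import numpy as np


def frequency_shift(samples: np.ndarray, shift_bins: int) -> np.ndarray:
    """Shift every frequency component by `shift_bins` FFT bins."""
    spectrum = np.fft.rfft(samples)
    shifted = np.roll(spectrum, shift_bins)

    # np.roll wraps around; clear the wrapped bins (assumption, see lead-in)
    if shift_bins > 0:
        shifted[:shift_bins] = 0
    elif shift_bins < 0:
        shifted[shift_bins:] = 0

    # Back to the time domain, keeping the original buffer length
    return np.fft.irfft(shifted, n=len(samples))
```

Because every component moves by the same number of Hz, the ratios between harmonics change, which is what gives frequency shifting its inharmonic character compared with pitch shifting.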
Distortion
The distortion effect used in my application is hard clipping. This type of distortion occurs when the amplitude reaches a specified threshold and is cut off. It usually happens unintentionally, when the amplitude exceeds the maximum range of the audio format. In this case, however, it is done on purpose by clipping the amplitude with the `np.clip` function to a threshold equal to the maximum Int16 amplitude divided by the ‘DIST’ knob amount. The range of this knob is 1 to 512. This has the side effect of sounding very loud at values near 1 but very quiet at high values. To counteract this, when the ‘DIST’ knob is at its minimum value the audio passes through unaltered, and as the distortion amount increases the amplitude is adjusted relative to it. The audio is then clipped if any values are out of range.
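The clipping step might look something like the sketch below. The threshold follows the description above, but the makeup-gain curve (`sqrt(dist)`) and the name `hard_clip` are purely hypothetical; the text does not specify how the amplitude is compensated.

```python
import numpy as np

INT16_MAX = 32767


def hard_clip(samples: np.ndarray, dist: float) -> np.ndarray:
    """Hard-clip int16 audio at INT16_MAX / dist, with makeup gain."""
    if dist <= 1:
        # Minimum knob value: pass the audio through unaltered
        return samples.astype(np.int16)

    threshold = INT16_MAX / dist
    clipped = np.clip(samples.astype(np.float64), -threshold, threshold)

    # Hypothetical makeup gain so high DIST values are not too quiet
    makeup = np.sqrt(dist)

    # Final safety clip keeps the result inside the int16 range
    out = np.clip(clipped * makeup, -32768, INT16_MAX)
    return out.astype(np.int16)
```

For example, with `dist=4` the threshold is about a quarter of full scale, so any sample louder than that is flattened before the makeup gain is applied.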
Gain
The `GAIN` dial controls the factor by which the amplitude of the output stream is multiplied. The knob has a range of 0 to 1.2 and a default value of 0.8.
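A gain stage of this kind can be sketched in a few lines. The name `apply_gain` and the final clip to the int16 range are assumptions; the clip is included because a gain above 1.0 can push samples past the limits of the format.

```python
import numpy as np


def apply_gain(samples: np.ndarray, gain: float = 0.8) -> np.ndarray:
    """Multiply int16 audio by `gain`, saturating at the int16 limits."""
    scaled = samples.astype(np.float64) * gain
    return np.clip(scaled, -32768, 32767).astype(np.int16)
```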
Note Detection
The note detection works by finding the peak frequency in the spectrum array after an FFT has been performed on the data. This returns an index, which can then be converted to the equivalent frequency and from there to the nearest MIDI note.
The MIDI note that is detected is then mapped to its note name/symbol using the `note.map` file I created to easily map the two. I opted to use the MIDI value instead of the actual frequency, as it made more sense not to hardcode the frequencies in case I wanted to change the tuning. The note name is then shown on the screen at the peak frequency, or next to the mouse pointer when hovering over the spectrum.
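The chain from peak bin to note name could be sketched as follows. The `NOTE_NAMES` list stands in for the `note.map` file, whose format is not shown here, and the function names and the A4 = 440 Hz reference are assumptions. The frequency-to-MIDI step uses the standard equal-temperament relation `midi = 69 + 12 * log2(f / 440)`.

```python
import numpy as np

# Stand-in for note.map (assumption): pitch classes starting at C
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def peak_frequency(samples: np.ndarray, rate: int) -> float:
    """Return the frequency (Hz) of the strongest non-DC FFT bin."""
    magnitudes = np.abs(np.fft.rfft(samples))
    magnitudes[0] = 0.0  # ignore the DC component
    peak_bin = int(np.argmax(magnitudes))
    return peak_bin * rate / len(samples)


def frequency_to_midi(freq: float, a4: float = 440.0) -> int:
    """Nearest MIDI note number for `freq` (must be > 0)."""
    return int(round(69 + 12 * np.log2(freq / a4)))


def midi_to_name(midi: int) -> str:
    """Note name with octave, using the convention that MIDI 60 is C4."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"
```

Changing the `a4` reference is all it takes to retune the mapping, which matches the motivation for working with MIDI values instead of hardcoded frequencies.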
Design
I chose to design the application with a ‘Google Material’-ish theme because it provides a modern and consistent look. I created the knobs with the native Ableton Live knobs as inspiration, using a thin border along the circumference of a circle that is filled based on the knob's value. The font I chose is Product Sans, which is a modern font that lends itself well to designing UI elements. The ‘LINE’ display mode was actually created accidentally, as I had initially drawn the spectrum with a solid polygon and a missing value removed the fill. I then used the anti-aliased version of the `draw.lines` function to replicate that with a smoother look.
References
Dependencies
- Audio I/O Lib | PyAudio
- Math Lib | NumPy
- Graphics Lib | Pygame
- PyInstaller | PyInstaller
Assets
- *Product Sans* Font by Google | https://befonts.com/product-sans-font.html
- Application icon by Icons8 | https://icons8.com
Citations and Attributions
- Hard Clipping | https://www.hackaudio.com/digital-signal-processing/distortion-effects/hard-clipping/