Incubating a culture of innovation & creativity
Uncover the transformative potential of digital and mobile solutions for your industry
Augment your team with exceptional talent
Empowering brands and startups to drive innovation and success with unmatched expertise
How does music recognition work with the Shazam app?
It’s a typical scenario: You listen to a song at a restaurant or an event, and then that song stays with you, haunting you, forcing you to find its source and singer.
Earlier, the only option was to replicate that song or a few verses and ask friends and family to hunt for the source.
But since 1999, this changed because a magical app called Shazam was born that year.
What Exactly Is Shazam? Let’s explore with the leading app development company.
Shazam is an app that can recognize music, movies, advertising, and television shows and showcase the source and other details about that content.
It seems magical, right?
In this blog, we will decode the internal workings of Shazam and find out how it works.
But first, let’s overview Shazam and find some startling facts about this app.
Shazam was launched in 1999 by Chris Barton, Philip Inghelbrecht, Avery Wang, and Dhiraj Mukherjee, who teamed up to create a system that can recognize music and other related content and reveal everything about them.
In 2018, Apple acquired Shazam for $400 million, making it one of the most significant mobile app acquisitions at that time.
As of now, Shazam is not only available in the Apple App Store but also on Android, macOS, iOS, Wear OS, watchOS, and as a Google Chrome extension.
As of September 2022, Shazam has more than 225 million global monthly users, and it’s expanding at a rapid pace.
To recognize songs and TV/video content, Shazam has acquired 200 patents; as of now, the app has more than 12 billion tags, which can categorize music and video content based on user inputs.
In 2015, it was found that 5% of all music downloads across the world originated from Shazam, making it one of the most extensive databases for music content anywhere in the world.
One significant statistic for businesses looking for a solid platform to advertise their products: Advertisers on the Shazam app receive an average of 1 million clicks daily!
And one exciting trivia: The most searched (Shazamed) song ever is Dance Monkey, which has been searched 41 million times, and Drake is the most Shazamed artist ever with 350 million hits.
An incredible achievement for a tech-powered mobile app!
The operational model of the Shazam app is simple: The app listens to max 20 seconds of a song or video content from TV, a movie, ads, etc., and it can be a chorus, verse, or a mere intro, and then instantly recognize that content, and show the results.
An important thing to note: No matter how long that song or content is, the Shazam app will only read the first 20 seconds.
Now, once that data is fed into the Shazam app, then it will:
Now, this process of recognizing music is not that simple!
There are tons of processes and algorithms that work in tandem to reveal the exact source of the music and content.
In 2013, one of the inventors of the Shazam app: Avery Li-Chung Wang, shared the magic behind the Shazam app via research paper and, for the first time, revealed how this app works.
First, let’s understand what sound is.
As per science, sound is a vibration that propagates via mechanical waves comprising pressure and displacement, and the medium is mostly air or water in some cases.
The three main components of sound are frequency, time, and amplitude.
Amplitude is the loudness of the sound, which is the size of the vibration.
Source:
https://www.toptal.com/algorithms/shazam-it-music-processing-fingerprinting-and-recognition
Frequency, measured in Hertz (Hz), is the rate at which the vibration occurs. A human being can only listen to sound whose frequency lies between 20Hz and 20,000Hz.
Time is crucial because it shows at which interval a sound has occurred in relation to other sounds.
This is important to know because when a song is produced, it has sounds from different instruments that vary in frequency and amplitude as they move through time in relation to one another.
This is the reason that the same song having two different versions will still generate a unique fingerprint due to the complexity of frequency, amplitude, and time.
Once the Shazam app records the first few seconds of a song or any audio content (max is 20 seconds of recording), it will create a unique audio fingerprint of that song.
For that, the recorded analog sound is converted into a spectrogram, wherein the X-axis represents time, the Y-axis represents frequency, and the density of the shading represents amplitude.
For each section of an audio file, the algorithm chooses more substantial peaks, and gradually, the spectrogram is reduced to a scatter plot. A point comes when the amplitude is no longer needed.
And this is the crux of Shazam’s operations.
Two unique fingerprints are created, and then they are matched to find the exact song that is being fed into the system.
There is an advanced process called combinatorial hashing, which is deployed to create exact and unique fingerprints of the audio files and ensure that the matching is perfect.
This is how it works:
At this point, we have the unique fingerprints of both audio files. Now, the actual process of matching the songs starts.
Suppose we plot this process of the matching process onto a scatter plot, wherein the Y-axis represents the time at which hash occurs in the user’s recording. The X-axis is the time at which the hash occurs in the database’s recording. In that case, the matching hashes will form a diagonal line.
Source: https://medium.com/@treycoopermusic/how-shazam-works-d97135fb4582
At the same time, if this plotting is done on a histogram, then there will be a spike at the correct offset time.
Here’s a summary of this entire process, in brief:
And it hardly takes seconds to complete this entire process of finding the exact match.
Explore the power of Shazam’s algorithm with TechAhead. Our team of expert Mobile App Developers can help you create a personalized app using this cutting-edge technology. Let us guide you towards success.
Schedule a no-obligation, free consulting session with our team right here!
No FAQ available!
With our expertise and experience, we can help your brand be the next success story.
First Name
Last Name
Email Address
Phone Number
Message
Δ