Rails 7 adds AudioAnalyzer to ActiveStorage


In our Rails applications, we often need to deal with images, videos, and audio. ActiveStorage, which was released in Rails 5.2, made it easy for developers to upload media to a Rails application.

But after uploading the media, in a few cases we also have to analyze it for properties such as width, height, and duration.

ActiveStorage also ships with Image and Video analyzers, which extract the metadata of an attached image or video. But this support was missing for audio files.

Before

Let’s say we have a Movie model with a title_track song attached to it.

class Movie < ApplicationRecord
  has_one_attached :title_track
end

If we fetch the metadata of the movie's title_track, it just returns the following:

movie = Movie.first
movie.title_track.metadata
#=>
{
    "identified" => true,
      "analyzed" => true
}

To extract audio metadata like duration and bit_rate, we can use the tubbo/activestorage-audio gem. This gem requires FFmpeg to be installed, as it uses FFprobe to extract the metadata from the audio.

This gem analyzes the audio and adds the following details to the metadata:

movie = Movie.first
movie.title_track.metadata
#=>
{
        "identified" => true,
          "duration" => 52.819592,
          "bit_rate" => 320000.0,
       "sample_rate" => 44100,
          "channels" => 2,
    "channel_layout" => "stereo",
          "analyzed" => true
}

activestorage-audio uses the following command to extract the above details:

ffprobe -show_streams -v error -print_format json file_path

After

Rails 7 adds AudioAnalyzer to ActiveStorage. The audio is now analyzed, and its duration and bit_rate are available as metadata without the need for any external gems.

This also needs FFmpeg to be installed, as it uses an approach similar to that of the gem discussed above.
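Since the analyzer depends on FFprobe, it can be worth checking that the binary is reachable before relying on the metadata. The following is a minimal sketch in plain Ruby; `executable_on_path?` is a hypothetical helper and not part of Rails.

```ruby
# Hypothetical helper (not a Rails API): returns true when an executable
# with the given name exists in one of the directories listed in PATH.
def executable_on_path?(name)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

if executable_on_path?("ffprobe")
  puts "ffprobe found on PATH"
else
  warn "ffprobe not found on PATH; install FFmpeg before analyzing audio"
end
```

If the binary is installed outside PATH, the `config.active_storage.paths[:ffprobe]` setting can be used to point ActiveStorage at it.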

class Movie < ApplicationRecord
  has_one_attached :title_track
end

movie = Movie.first
movie.title_track.metadata
#=>
{
    "identified" => true,
      "duration" => 52.819592,
      "bit_rate" => 320000,
      "analyzed" => true
}

This implementation uses the following command:

ffprobe -print_format json -show_streams -show_format -v error file_path
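To experiment with this command from Ruby, one option is to shell out and parse the JSON it prints. This is a rough sketch under that assumption, not the Rails source; `ffprobe_command` and `probe` are hypothetical helper names.

```ruby
require "json"
require "open3"

# Builds the argument list for the ffprobe call shown above.
# Passing separate arguments (instead of one string) avoids shell
# quoting problems with unusual file names.
def ffprobe_command(file_path, ffprobe: "ffprobe")
  [ffprobe, "-print_format", "json", "-show_streams", "-show_format", "-v", "error", file_path]
end

# Hypothetical helper: runs ffprobe and returns the parsed JSON as a Hash.
# Returns an empty Hash when ffprobe is missing, exits with an error,
# or prints something that is not valid JSON.
def probe(file_path)
  output, _error, status = Open3.capture3(*ffprobe_command(file_path))
  status.success? ? JSON.parse(output) : {}
rescue Errno::ENOENT, JSON::ParserError
  {}
end
```

Calling `probe("sample_audio.mp3")` on a machine with FFmpeg installed should return a Hash with "streams" and "format" keys.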

Note:

This implementation is similar to that of the above-mentioned gem, but it returns only duration and bit_rate.

How does this work?

The AudioAnalyzer uses FFprobe to analyze the audio.

It extracts the duration and bit_rate for the provided audio.

The values should match what you get by running the following command on the audio file:

$ ffprobe -print_format json -show_streams -show_format -v error sample_audio.mp3

{
    "streams": [
        {
            "index": 0,
            "codec_name": "mp3",
            "codec_long_name": "MP3 (MPEG audio layer 3)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "44100",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/14112000",
            "start_pts": 353600,
            "start_time": "0.025057",
            "duration_ts": 745390080,
            "duration": "52.819592",
            "bit_rate": "320000",
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0
            },
            "tags": {
                "encoder": "LAME3.99r"
            }
        }
    ],
    "format": {
        "filename": "sample_audio.mp3",
        "nb_streams": 1,
        "nb_programs": 0,
        "format_name": "mp3",
        "format_long_name": "MP2/3 (MPEG audio layer 2/3)",
        "start_time": "0.025057",
        "duration": "52.819592",
        "size": "2113939",
        "bit_rate": "320174",
        "probe_score": 51,
        "tags": {
            "genre": "Cinematic",
            "album": "YouTube Audio Library",
            "title": "Impact Moderato",
            "artist": "Kevin MacLeod"
        }
    }
}
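The extraction step can be sketched in a few lines of Ruby. This is an illustration only, not the Rails implementation: `audio_metadata` is a hypothetical helper, and it assumes the values are read from the first audio stream. That assumption is consistent with the bit_rate of 320000 in the metadata shown earlier, which matches the stream entry rather than the format-level 320174.

```ruby
require "json"

# Hypothetical helper: picks the first audio stream from parsed ffprobe
# output and returns its duration (Float) and bit_rate (Integer).
# Keys whose values are missing from the stream are left out.
def audio_metadata(probe)
  stream = Array(probe["streams"]).find { |s| s["codec_type"] == "audio" } || {}
  {
    "duration" => stream["duration"] && Float(stream["duration"]),
    "bit_rate" => stream["bit_rate"] && Integer(stream["bit_rate"])
  }.compact
end

# Trimmed-down version of the ffprobe output shown above.
sample = JSON.parse(<<~JSON)
  {
    "streams": [
      { "codec_type": "audio", "duration": "52.819592", "bit_rate": "320000" }
    ],
    "format": { "duration": "52.819592", "bit_rate": "320174" }
  }
JSON

p audio_metadata(sample)
```

ffprobe reports both values as strings, so the sketch converts them to numbers before returning them.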