
I want to decode a stream of AAC frames continuously, one frame at a time.

I went through the ffmpeg examples (an answer doesn't have to use ffmpeg), but I only found examples that decode complete AAC files in batch. I want to decode a continuous AAC stream instead. How can I do this?

UPDATE: Following the comments and "Decode AAC to PCM with ffmpeg on android", I was able to decode to PCM using ffmpeg, but the output sounds very metallic and noisy. What am I doing wrong when I call this method for each AAC frame:

...
/*loop that receives frame in buffer*/
 while(1){
   /*receive frame*/
   input = receive_one_buffer();

   /*decode frame*/
   decodeBuffer(input,strlen(input),Outfile);
 }

...

/*decode frame*/
void decodeBuffer(char * input, int numBytes, ofstream& Outfile) {
    /*"input" contains one AAC-LC frame*/
    //copy bytes from buffer
    uint8_t inputBytes[numBytes + FF_INPUT_BUFFER_PADDING_SIZE];
    memset(inputBytes, 0, numBytes + FF_INPUT_BUFFER_PADDING_SIZE);
    memcpy(inputBytes, input, numBytes);

    av_register_all();

    AVCodec *codec = avcodec_find_decoder(CODEC_ID_AAC);

    AVCodecContext *avCtx = avcodec_alloc_context();
    avCtx->channels = 1;
    avCtx->sample_rate = 44100;

    //the input buffer
    AVPacket avPacket;
    av_init_packet(&avPacket);

    avPacket.size = numBytes; //input buffer size
    avPacket.data = inputBytes; // the input buffer

    int outSize;
    int len;
    uint8_t *outbuf = static_cast<uint8_t *>(malloc(AVCODEC_MAX_AUDIO_FRAME_SIZE));

    while (avPacket.size > 0) {
        outSize = AVCODEC_MAX_AUDIO_FRAME_SIZE;
        len = avcodec_decode_audio3(avCtx, (short *) outbuf, &outSize,
                &avPacket);

        Outfile.write((char*)outbuf, outSize);

        avPacket.size -= len;
        avPacket.data += len;
    }

    av_free_packet(&avPacket);
    avcodec_close(avCtx);
    //av_free(avCtx);

    return;
}
user2212461
  • same question basically: http://stackoverflow.com/questions/13499480/decode-aac-to-pcm-with-ffmpeg-on-android doesn't matter that it's for android - nor does your requirement about `real-time`: it's still just decoding aac->pcm – stijn Jun 14 '14 at 09:39
  • Can't be done real-time, since you will need a frame of AAC data to convert to a block of PCM data. There is no way this can be done on a single sample at a time. But you can do it in "near real-time" by taking a block of AAC data and converting it to PCM. This is what nearly all audio-players and similar applications do. – Mats Petersson Jun 14 '14 at 09:59
  • @MatsPetersson I don't understand, correct me if I am wrong but isn't that just the case when encoding and decoding is possible from a single frame only? – user2212461 Jun 14 '14 at 23:08
  • Yes, but truly real-time would be a single sample at a time at the time it's needed, and that's not possible, you need a complete frame - which means your audio will be "behind" the actual receipt of the frame by some small period. – Mats Petersson Jun 15 '14 at 08:08
  • Aren't AAC frames supposed to overlap, using some window function, to ensure continuity at the boundaries? – moonshadow Jun 15 '14 at 12:31
  • IIRC you need to keep a few dozen samples from the previous frame, and crossfade with samples at the start of the current frame. The library should provide a function to do that. It's been years since I did this stuff though, I'll let someone with recent experience actually answer :) – moonshadow Jun 15 '14 at 12:34
  • @stijn For the purposes of flagging a question as a duplicate it does matter, but your link is definitely helpful. – Brad Jun 15 '14 at 15:54
  • Can you post a sample of the decoded sound, and the original? Why not let FFmpeg decode the stream? – Brad Jun 15 '14 at 15:55
  • I added an example above. There are some processing steps before and after the decoding which I cannot change so system calls writing to files is not an option – user2212461 Jun 15 '14 at 16:49
  • @MatsPetersson: "Realtime" generally means within fixed time bounds, regardless of input. If decoding each frame introduces a 30 frame latency (and not a clocktick more), and 30 frames can be processed in parallel, then the decoding is still realtime. – MSalters Jun 16 '14 at 13:39
  • @MSalters By realtime, I mean the opposite of batch – user2212461 Jun 17 '14 at 10:52
  • @user2212461: The usual opposite of "batch" is "straight through processing". – MSalters Jun 17 '14 at 12:06
  • @AndrewMedico Thanks for your comment. Since opening a file also can be named stream, I selected different naming :-P – user2212461 Jun 17 '14 at 18:09
  • Make sure that when you play back the decoded sound you are using: 1) the correct bit depth (e.g. 16-bit) 2) the right number of channels 3) the correct endianness (little vs big) 4) signed or unsigned samples (e.g. you need to add 32768 in the case of 16-bit samples). Getting one of those things wrong might be the source of the "metallic and noisy" result – JarkkoL Jun 18 '14 at 16:19
  • @JarkkoL The player seems to be using correct data, rather the data is just different (smaller values than from batch algorithm, see my updated question) – user2212461 Jun 18 '14 at 22:57
  • I have simple sample of using ffmpeg: http://unick-soft.ru/Files/ffmpegDecoder-vs2008.zip . It decodes audio too, but you need to write buffer to file (Look function DecodeAudio). Maybe in your case "char * input" is not decoded fully and you need to add residue to the input buffer of next call of decodeBuffer. – Unick Jun 21 '14 at 10:42
  • I'm basically having the same problem. I can decode aac to pcm but the audio is noisy and robotic. What did you do finally? Thanks! – Pablo Martinez Apr 07 '16 at 19:37
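
Several of the playback checks raised in the comments (bit depth, channel layout, signedness) come down to knowing exactly which sample format is being written. As a rough sketch, assuming the decoder hands back planar 32-bit float samples (an assumption: check the context's sample_fmt, since the format depends on the FFmpeg version and decoder), the samples would need converting before a player expecting interleaved signed 16-bit PCM can use them. The function name planarFloatToS16 is hypothetical and not part of FFmpeg.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts planar float samples (one vector per channel,
// all of equal length, nominal range [-1, 1]) to interleaved signed 16-bit.
// Writing the result as raw bytes on a little-endian host gives s16le PCM.
std::vector<std::int16_t> planarFloatToS16(
    const std::vector<std::vector<float>>& planes)
{
    if (planes.empty())
        return {};

    const std::size_t channels = planes.size();
    const std::size_t samples  = planes[0].size();
    std::vector<std::int16_t> out(channels * samples);

    for (std::size_t s = 0; s < samples; ++s) {
        for (std::size_t c = 0; c < channels; ++c) {
            // Clamp first so out-of-range input cannot overflow int16_t.
            const float v = std::max(-1.0f, std::min(1.0f, planes[c][s]));
            out[s * channels + c] =
                static_cast<std::int16_t>(std::lrint(v * 32767.0f));
        }
    }
    return out;
}
```

If the decoder already produces packed 16-bit samples, no conversion is needed and the player only has to be told the right channel count and sample rate.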

1 Answer


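You have to keep the decoder alive between subsequent decode calls. Consecutive AAC frames overlap, so the decoder needs the state left behind by the previous frame to be correctly "primed". A decoder that is created from scratch for every frame never has that state.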

For details, see Apple's technical note TN2258:

https://developer.apple.com/library/mac/technotes/tn2258/_index.html
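
To illustrate why a fresh decoder per frame sounds wrong, here is a toy model rather than real AAC. It assumes a transform decoder that builds each output block by overlap-adding the first half of the current frame with a tail saved from the previous frame. The ToyDecoder class, makeFrame helper and frame layout are all hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an overlapping transform decoder (NOT real AAC).
// Every frame holds 2*N windowed samples; each output block of N samples is
// the first half of the current frame plus the saved second half of the
// previous frame.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t n) : tail_(n, 0.0f) {}

    std::vector<float> decode(const std::vector<float>& frame) {
        const std::size_t n = tail_.size();
        assert(frame.size() == 2 * n);
        std::vector<float> out(n);
        for (std::size_t i = 0; i < n; ++i) {
            out[i]   = frame[i] + tail_[i];  // overlap-add with previous frame
            tail_[i] = frame[n + i];         // state carried to the next call
        }
        return out;
    }

private:
    std::vector<float> tail_;  // lost if the decoder is recreated per frame
};

// One frame of a constant signal (value 1.0) under a linear crossfade window:
// rising ramp in the first half, falling ramp in the second half.
std::vector<float> makeFrame(std::size_t n) {
    std::vector<float> frame(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        const float rise = static_cast<float>(i) / static_cast<float>(n);
        frame[i]     = rise;
        frame[n + i] = 1.0f - rise;
    }
    return frame;
}
```

With one long-lived ToyDecoder, every block after the first reconstructs the constant signal. With a new ToyDecoder per frame, each block is just the rising ramp, so the output is amplitude-modulated at the frame rate, which is roughly the kind of artifact described in the question.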

The following code assumes that the "ReceiveBuffer" function returns exactly one complete AAC access unit.

(BTW: you can't use strlen on a binary buffer; it returns the distance to the first zero byte, not the buffer length.)

#include <iostream>
#include <fstream>

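// FFmpeg is a C library: wrap its headers in extern "C" when compiling as C++,
// and use forward slashes so the include paths work on every platform.
extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavdevice/avdevice.h"
#include "libavfilter/avfilter.h"
}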

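// Creates and opens the AAC decoder once; reuse the returned context for every frame.
AVCodecContext * CreateContext()
{
    av_register_all();

    AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_AAC);
    AVCodecContext *avCtx = codec ? avcodec_alloc_context3(codec) : NULL;

    // The context must be opened before it can decode anything.
    if (!avCtx || avcodec_open2(avCtx, codec, NULL) < 0)
    {
        av_free(avCtx);
        return NULL;
    }

    return avCtx;
}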

int32_t DecodeBuffer
(
    std::ostream   & output,
    uint8_t        * pInput,
    uint32_t         cbInputSize,
    AVCodecContext * pAVContext
)
{
    int32_t cbDecoded = 0;

    //the input buffer
    AVPacket avPacket;
    av_init_packet(&avPacket);

    avPacket.size = cbInputSize; //input buffer size
    avPacket.data = pInput; // the input buffer

    AVFrame * pDecodedFrame = av_frame_alloc();

    int nGotFrame = 0;

    cbDecoded = avcodec_decode_audio4(    pAVContext,
                                          pDecodedFrame,
                                        & nGotFrame,
                                        & avPacket);

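    // Write only when decoding succeeded and a complete frame came out.
    if (cbDecoded >= 0 && nGotFrame)
    {
        // Note: data[0] holds every channel only for packed sample formats;
        // for planar formats (e.g. AV_SAMPLE_FMT_FLTP) it holds channel 0 only.
        int data_size = av_samples_get_buffer_size( NULL,
                                                    pAVContext->channels,
                                                    pDecodedFrame->nb_samples,
                                                    pAVContext->sample_fmt,
                                                    1);

        if (data_size > 0)
            output.write((const char*)pDecodedFrame->data[0],data_size);
    }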


    av_frame_free(&pDecodedFrame);

    return cbDecoded;
}


uint8_t * ReceiveBuffer( uint32_t * cbBufferSize)
{
    // TODO implement

    return NULL;
}

int main
(
    int argc,
    char *argv[]
)
{
    int nResult = 0;

    AVCodecContext * pAVContext = CreateContext();
    if (!pAVContext)
        return 1; // no usable decoder

    std::ofstream myOutputFile("audio.pcm",std::ios::binary);

    while(1)
    {
        uint32_t cbBufferSize = 0;
        uint8_t *pCompressedAudio = ReceiveBuffer( &cbBufferSize);

        if(cbBufferSize && pCompressedAudio)
        {
            DecodeBuffer(   myOutputFile,
                            pCompressedAudio,
                            cbBufferSize,
                            pAVContext);
        }
        else
        {
            break;
        }
    }

    avcodec_close(pAVContext);
    av_free(pAVContext);

    return nResult;
}
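
The strlen remark above is easy to check in isolation. The sketch below uses a made-up 6-byte buffer (not a valid AAC frame) that contains a zero byte in the middle, and compares what strlen reports with the length that actually has to be passed to the decoder. The names kExampleFrame, lengthViaStrlen and lengthViaSize are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Made-up bytes for illustration only. The zero at index 2 is ordinary
// payload data, not a string terminator.
const std::vector<std::uint8_t> kExampleFrame =
    {0xFF, 0xF1, 0x00, 0x50, 0x80, 0x00};

// What the question's code effectively computes: strlen stops at the first
// zero byte. (On a buffer with no zero byte at all it would read past the
// end, which is undefined behaviour.)
std::size_t lengthViaStrlen(const std::vector<std::uint8_t>& frame) {
    return std::strlen(reinterpret_cast<const char*>(frame.data()));
}

// What the decoder needs: the byte count carried alongside the data.
std::size_t lengthViaSize(const std::vector<std::uint8_t>& frame) {
    return frame.size();
}
```

Here strlen sees 2 bytes while the frame is 6 bytes long, so the decoder would be handed a truncated frame. Whatever receives the stream should return the byte count together with the data, as the ReceiveBuffer stub above does through cbBufferSize.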
Markus Schumann