IjkPlayer Series Data Reading Thread read_thread

PS: Control the impact of new technologies on you rather than being controlled by them.

This article analyzes the data reading thread read_thread of IjkPlayer, aiming to clarify its basic flow and key function calls. The main content is as follows:

  1. Basic usage of IjkPlayer
  2. Creation of read_thread
  3. avformat_alloc_context
  4. avformat_open_input
  5. avformat_find_stream_info
  6. avformat_seek_file
  7. av_dump_format
  8. av_find_best_stream
  9. stream_component_open
  10. Main loop of read_thread

Basic usage of IjkPlayer#

A brief review of the basic usage of IjkPlayer is as follows:

// Create IjkMediaPlayer
IjkMediaPlayer mMediaPlayer = new IjkMediaPlayer();
// Set log level
mMediaPlayer.native_setLogLevel(IjkMediaPlayer.IJK_LOG_DEBUG);
// Set options
mMediaPlayer.setOption(IjkMediaPlayer.OPT_CATEGORY_PLAYER, "mediacodec", 1);
// ...
// Set event listeners
mMediaPlayer.setOnPreparedListener(mPreparedListener);
mMediaPlayer.setOnVideoSizeChangedListener(mSizeChangedListener);
mMediaPlayer.setOnCompletionListener(mCompletionListener);
mMediaPlayer.setOnErrorListener(mErrorListener);
mMediaPlayer.setOnInfoListener(mInfoListener);
// Set surface
mMediaPlayer.setSurface(surface);
// ...
// Set URL
mMediaPlayer.setDataSource(dataSource);
// Prepare for playback
mMediaPlayer.prepareAsync();

After calling prepareAsync, the onPrepared callback is received, and start is called to begin playback:

@Override
public void onPrepared(IMediaPlayer mp) {
    // Start playback
    mMediaPlayer.start();
}

At this point, under normal circumstances, the video can play properly; here we focus only on the calling process.

Creation of read_thread#

Starting from the prepareAsync method of IjkMediaPlayer, following the call chain downward shows that prepareAsync ultimately calls the function stream_open, which is defined as follows:

static VideoState *stream_open(FFPlayer *ffp, const char *filename, AVInputFormat *iformat){
    av_log(NULL, AV_LOG_INFO, "stream_open\n");
    assert(!ffp->is);
    // Initialize VideoState and some parameters.
    VideoState *is;
    is = av_mallocz(sizeof(VideoState));
    if (!is)
        return NULL;
    is->filename = av_strdup(filename);
    if (!is->filename)
        goto fail;
    // Here, iformat has not been assigned yet; it will be determined later through probing to find the best AVInputFormat
    is->iformat = iformat;
    is->ytop    = 0;
    is->xleft   = 0;
#if defined(__ANDROID__)
    if (ffp->soundtouch_enable) {
        is->handle = ijk_soundtouch_create();
    }
#endif

    /* start video display */
    // Initialize the decoded frame queue
    if (frame_queue_init(&is->pictq, &is->videoq, ffp->pictq_size, 1) < 0)
        goto fail;
    if (frame_queue_init(&is->subpq, &is->subtitleq, SUBPICTURE_QUEUE_SIZE, 0) < 0)
        goto fail;
    if (frame_queue_init(&is->sampq, &is->audioq, SAMPLE_QUEUE_SIZE, 1) < 0)
        goto fail;

    // Initialize the queue for undecoded data
    if (packet_queue_init(&is->videoq) < 0 ||
        packet_queue_init(&is->audioq) < 0 ||
        packet_queue_init(&is->subtitleq) < 0)
        goto fail;

    // Initialize condition variables for the read thread and for video/audio accurate seek
    if (!(is->continue_read_thread = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        goto fail;
    }

    if (!(is->video_accurate_seek_cond = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        ffp->enable_accurate_seek = 0;
    }

    if (!(is->audio_accurate_seek_cond = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        ffp->enable_accurate_seek = 0;
    }
    // Initialize clocks
    init_clock(&is->vidclk, &is->videoq.serial);
    init_clock(&is->audclk, &is->audioq.serial);
    init_clock(&is->extclk, &is->extclk.serial);
    is->audio_clock_serial = -1;
    // Initialize volume range
    if (ffp->startup_volume < 0)
        av_log(NULL, AV_LOG_WARNING, "-volume=%d < 0, setting to 0\n", ffp->startup_volume);
    if (ffp->startup_volume > 100)
        av_log(NULL, AV_LOG_WARNING, "-volume=%d > 100, setting to 100\n", ffp->startup_volume);
    ffp->startup_volume = av_clip(ffp->startup_volume, 0, 100);
    ffp->startup_volume = av_clip(SDL_MIX_MAXVOLUME * ffp->startup_volume / 100, 0, SDL_MIX_MAXVOLUME);
    is->audio_volume = ffp->startup_volume;
    is->muted = 0;

    // Set audio-video synchronization method, default AV_SYNC_AUDIO_MASTER
    is->av_sync_type = ffp->av_sync_type;

    // Playback mutex
    is->play_mutex = SDL_CreateMutex();
    // Accurate seek mutex
    is->accurate_seek_mutex = SDL_CreateMutex();

    ffp->is = is;
    is->pause_req = !ffp->start_on_prepared;

    // Video rendering thread
    is->video_refresh_tid = SDL_CreateThreadEx(&is->_video_refresh_tid, video_refresh_thread, ffp, "ff_vout");
    if (!is->video_refresh_tid) {
        av_freep(&ffp->is);
        return NULL;
    }

    is->initialized_decoder = 0;
    // Read thread
    is->read_tid = SDL_CreateThreadEx(&is->_read_tid, read_thread, ffp, "ff_read");
    if (!is->read_tid) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateThread(): %s\n", SDL_GetError());
        goto fail;
    }
    
    // Asynchronously initialize the decoder, related to hardware decoding, default not enabled
    if (ffp->async_init_decoder && !ffp->video_disable && ffp->video_mime_type && strlen(ffp->video_mime_type) > 0
                    && ffp->mediacodec_default_name && strlen(ffp->mediacodec_default_name) > 0) {
        // mediacodec
        if (ffp->mediacodec_all_videos || ffp->mediacodec_avc || ffp->mediacodec_hevc || ffp->mediacodec_mpeg2) {
            decoder_init(&is->viddec, NULL, &is->videoq, is->continue_read_thread);
            ffp->node_vdec = ffpipeline_init_video_decoder(ffp->pipeline, ffp);
        }
    }
    // Allow decoder initialization
    is->initialized_decoder = 1;

    return is;
fail:
    is->initialized_decoder = 1;
    is->abort_request = true;
    if (is->video_refresh_tid)
        SDL_WaitThread(is->video_refresh_tid, NULL);
    stream_close(ffp);
    return NULL;
}

It can be seen that the function stream_open mainly does the following:

  1. Initializes VideoState and some parameters.
  2. Initializes the frame queues: the decoded video frame queue pictq, audio frame queue sampq, and subtitle frame queue subpq, as well as the undecoded packet queues videoq (video), audioq (audio), and subtitleq (subtitle).
  3. Initializes audio-video synchronization method and clock, defaulting to AV_SYNC_AUDIO_MASTER, which means the audio clock serves as the master clock.
  4. Initializes volume range.
  5. Creates a video rendering thread named ff_vout for video_refresh_thread.
  6. Creates a read thread named ff_read for read_thread.

Now we begin the analysis of the data reading thread read_thread, with the key parts of the read_thread function simplified as follows:

static int read_thread(void *arg){
    // ...
    
    // 1. Create AVFormatContext, specifying default functions for opening and closing streams
    ic = avformat_alloc_context();
    if (!ic) {
        av_log(NULL, AV_LOG_FATAL, "Could not allocate context.\n");
        ret = AVERROR(ENOMEM);
        goto fail;
    }
    // ...
    
    // 2. Open the stream to get header information
    err = avformat_open_input(&ic, is->filename, is->iformat, &ffp->format_opts);
    if (err < 0) {
        print_error(is->filename, err);
        ret = -1;
        goto fail;
    }
    ffp_notify_msg1(ffp, FFP_MSG_OPEN_INPUT);
    // ...
    
    // 3. Get stream information
    if (ffp->find_stream_info) {
        err = avformat_find_stream_info(ic, opts);
    } 
    ffp_notify_msg1(ffp, FFP_MSG_FIND_STREAM_INFO);
    // ...
    
    // 4. If a start time is specified, seek to that position
    if (ffp->start_time != AV_NOPTS_VALUE) {
        int64_t timestamp;
        timestamp = ffp->start_time;
        if (ic->start_time != AV_NOPTS_VALUE)
            timestamp += ic->start_time;
        ret = avformat_seek_file(ic, -1, INT64_MIN, timestamp, INT64_MAX, 0);
    }
    // ...
    
    // 5. Print format information
    av_dump_format(ic, 0, is->filename, 0);
}

The following content is limited to the main flow of the read_thread function.

avformat_alloc_context#

The avformat_alloc_context function mainly allocates memory for AVFormatContext and initializes some parameters of ic->internal, as follows:

AVFormatContext *avformat_alloc_context(void){
    // Allocate memory for AVFormatContext
    AVFormatContext *ic;
    ic = av_malloc(sizeof(AVFormatContext));
    if (!ic) return ic;
    // Initialize default functions for opening and closing streams
    avformat_get_context_defaults(ic);
    // Allocate memory for internal
    ic->internal = av_mallocz(sizeof(*ic->internal));
    if (!ic->internal) {
        avformat_free_context(ic);
        return NULL;
    }
    ic->internal->offset = AV_NOPTS_VALUE;
    ic->internal->raw_packet_buffer_remaining_size = RAW_PACKET_BUFFER_SIZE;
    ic->internal->shortest_end = AV_NOPTS_VALUE;
    return ic;
}

Next, let's look at the avformat_get_context_defaults function as follows:

static void avformat_get_context_defaults(AVFormatContext *s){
    memset(s, 0, sizeof(AVFormatContext));
    s->av_class = &av_format_context_class;
    s->io_open  = io_open_default;
    s->io_close = io_close_default;
    av_opt_set_defaults(s);
}

Here, the default functions for opening and closing streams are set to io_open_default and io_close_default, respectively; we will not dwell on them in the flow that follows.

avformat_open_input#

The avformat_open_input function is mainly used to open the stream to get header information, its definition simplified as follows:

int avformat_open_input(AVFormatContext **ps, const char *filename,
                        AVInputFormat *fmt, AVDictionary **options){
    // ...

    // Open the stream to probe the input format, returning the best demuxer score
    av_log(NULL, AV_LOG_FATAL, "avformat_open_input > init_input before > nb_streams:%d\n",s->nb_streams);
    if ((ret = init_input(s, filename, &tmp)) < 0)
        goto fail;
    s->probe_score = ret;

    // Protocol whitelist and blacklist checks, stream format whitelist checks, etc.
    // ...
    
    // Read the media header
    // read_header mainly does some initialization work for certain formats, such as filling its own private structure
    // Allocates stream structures based on the number of streams and initializes, pointing the file pointer to the start of the data area, etc.
    // Creates AVStream and waits to extract or write audio and video stream information in subsequent processes
    if (!(s->flags&AVFMT_FLAG_PRIV_OPT) && s->iformat->read_header)
        if ((ret = s->iformat->read_header(s)) < 0)
            goto fail;
    // ...
    
    // Process additional images attached to audio and video, such as album art
    if ((ret = avformat_queue_attached_pictures(s)) < 0)
        goto fail;
    // ...
    
    // Update AVStream decoder-related information to AVCodecContext
    update_stream_avctx(s);
    // ...
}

It can be seen that avformat_open_input mainly opens the stream to probe the input format, performs protocol whitelist/blacklist and stream-format whitelist checks, reads the file header, and so on. Finally, update_stream_avctx updates the decoder-related information of each AVStream into the corresponding AVCodecContext, an operation that appears frequently in later flows.

The most important tasks are to open the stream to probe the input format and read the file header information, which call the init_input and read_header functions, respectively. The read_header will complete the initialization of AVStream during the process of reading header information.

The init_input function probes the stream format, returning a score for each candidate, and ultimately finds the AVInputFormat that best matches the stream. This structure is the demuxer registered during initialization: each demuxer corresponds to an AVInputFormat object, and likewise each muxer corresponds to an AVOutputFormat; a passing familiarity with these is enough for now.

If the init_input function executes successfully, the corresponding stream format has been determined, and at this point, the read_header function can be called, which corresponds to the xxx_read_header function in the demuxer for the current stream format AVInputFormat. If it is an HLS format stream, it corresponds to hls_read_header, defined as follows:

AVInputFormat ff_hls_demuxer = {
    .name           = "hls,applehttp",
    .read_header   = hls_read_header,
    // ...
};

// hls_read_header
static int hls_read_header(AVFormatContext *s, AVDictionary **options){
    // ...
}

avformat_find_stream_info#

The avformat_find_stream_info function is mainly used to obtain stream information and is useful for probing file formats without headers. It can retrieve video width, total duration, bitrate, frame rate, pixel format, etc. Its definition is simplified as follows:

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options){
    // ...
    
    // 1. Traverse streams
    for (i = 0; i < ic->nb_streams; i++) {
        // Initialize the parser in the stream, specifically initializing AVCodecParserContext and AVCodecParser
        st->parser = av_parser_init(st->codecpar->codec_id);
        // ...
        
        // Probe the corresponding decoder based on the decoder parameters in AVStream and return it
        codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
        // ...
        
        // If the decoder parameters are incomplete, initialize AVCodecContext based on the specified AVCodec and call the decoder's init function to initialize the decoder
        if (!has_codec_parameters(st, NULL) && st->request_probe <= 0) {
            if (codec && !avctx->codec)
                if (avcodec_open2(avctx, codec, options ? &options[i] :&thread_opt) < 0)
                    av_log(ic, AV_LOG_WARNING,
                           "Failed to open codec in %s\n",__FUNCTION__);
        }
    }
    
    // 2. Infinite loop to get stream information
    for (;;) {
        // ...
        // Check for interrupt requests; if any, call the interrupt function
        if (ff_check_interrupt(&ic->interrupt_callback)) {
            break;
        }
        
        // Traverse streams, checking if decoder-related parameters still need to be processed
        for (i = 0; i < ic->nb_streams; i++) {
            int fps_analyze_framecount = 20;
            st = ic->streams[i];
            // Check the decoder parameters in the stream; if complete, break; otherwise, continue to execute for further analysis
            if (!has_codec_parameters(st, NULL))
                break;
            // ...
        }
        
        if (i == ic->nb_streams) {
            // Mark that all streams have been analyzed
            analyzed_all_streams = 1;
            // If the current AVFormatContext sets ctx_flags to AVFMTCTX_NOHEADER, it indicates that the current stream has no header information
            // At this point, it is necessary to read some packets to obtain stream information; otherwise, break out directly, and the infinite loop ends normally
            if (!(ic->ctx_flags & AVFMTCTX_NOHEADER)) {
                /* If we found the info for all the codecs, we can stop. */
                ret = count;
                av_log(ic, AV_LOG_DEBUG, "All info found\n");
                flush_codecs = 0;
                break;
            }
        }
         
        // The read data has exceeded the allowed probing data size, but all codec information has not yet been obtained
        if (read_size >= probesize) {
            break;
        }
         
        // The following handles the case where the current stream has no header information
         
        // Read a frame of compressed encoded data
        ret = read_frame_internal(ic, &pkt1);
        if (ret == AVERROR(EAGAIN)) continue;
        if (ret < 0) {
            /* EOF or error*/
            eof_reached = 1;
            break;
        }
        
        // The read data is added to the buffer, which will be read from the buffer later
        ret = add_to_pktbuf(&ic->internal->packet_buffer, pkt,
                                &ic->internal->packet_buffer_end, 0);
        // Attempt to decode some compressed encoded data                       
        try_decode_frame(ic, st, pkt,(options && i < orig_nb_streams) ? &options[i] : NULL);
        
        // ...
    }
    
    // 3. Handle reading data to the end of the stream
    if (eof_reached) {
        for (stream_index = 0; stream_index < ic->nb_streams; stream_index++) {
            st = ic->streams[stream_index];
            if (!has_codec_parameters(st, NULL)) {
                const AVCodec *codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
                if (codec && !avctx->codec)
                    if (avcodec_open2(avctx, codec, (options && stream_index < orig_nb_streams) ? &options[stream_index] : &opts) < 0)
                        av_log(ic, AV_LOG_WARNING, "Failed to open codec in %s\n", __FUNCTION__);
            }
        }
    }
    // 4. The decoder performs flushing operations to avoid cached data not being retrieved
    if (flush_codecs) {
        AVPacket empty_pkt = { 0 };
        int err = 0;
        av_init_packet(&empty_pkt);
        for (i = 0; i < ic->nb_streams; i++) {
            st = ic->streams[i];
            /* flush the decoders */
            if (st->info->found_decoder == 1) {
                do {
                    err = try_decode_frame(ic, st, &empty_pkt,
                                            (options && i < orig_nb_streams)
                                            ? &options[i] : NULL);
                } while (err > 0 && !has_codec_parameters(st, NULL));
        
                if (err < 0) {
                    av_log(ic, AV_LOG_INFO,
                        "decoding for stream %d failed\n", st->index);
                }
            }
        }
    }
    
    // 5. The subsequent steps involve calculating some stream information, such as pix_fmt, aspect ratio SAR, actual frame rate, average frame rate, etc.
    // ...
    
    // 6. Update the stream's decoder parameters from the internal AVCodecContext (avctx) to the stream's codec parameters AVCodecParameters
    for (i = 0; i < ic->nb_streams; i++) {
        ret = avcodec_parameters_from_context(st->codecpar, st->internal->avctx);
        // ...
    }
    
    // ...
}

Since avformat_find_stream_info is a large function, the code above omits most details and keeps only the critical parts; here we only look at the main flow. Throughout the function, has_codec_parameters is used repeatedly to check whether a stream's decoder context parameters are complete; when they are not, further measures are taken to fill them in. A non-negative return value indicates that avformat_find_stream_info executed successfully. Its main flow is as follows:

  1. Traverse streams, initializing AVCodecParser and AVCodecParserContext based on some parameters in the stream, probing the decoder using the find_probe_decoder function, and initializing AVCodecContext by calling avcodec_open2 and the decoder's init function to initialize the decoder's static data.
  2. The infinite loop first calls ff_check_interrupt for interrupt detection, then traverses the streams, using has_codec_parameters to check whether each stream's decoder context parameters are complete. If all streams are complete and the container has a header, it sets analyzed_all_streams = 1 and flush_codecs = 0 and breaks out of the loop. If the container has no header, i.e. ic->ctx_flags has AVFMTCTX_NOHEADER set, it must call read_frame_internal to read a packet of encoded data, add it to the buffer, and call try_decode_frame to decode it in order to further fill the stream's AVCodecContext.
  3. eof_reached = 1 means the loop above reached the end of the stream via read_frame_internal. The streams are traversed again and has_codec_parameters is used to check whether each stream's decoder context parameters are complete; if not, the decoder is probed and opened again (find_probe_decoder plus avcodec_open2) to initialize the decoder context parameters.
  4. Decoding is a continuous process of feeding data in and taking data out, corresponding to the avcodec_send_packet and avcodec_receive_frame functions. To avoid leaving residual data inside the decoders, they are flushed with an empty AVPacket. This runs when flush_codecs is still 1, i.e. when try_decode_frame was actually invoked during step 2.
  5. The subsequent steps involve calculating some stream information, such as pix_fmt, aspect ratio SAR, actual frame rate, average frame rate, etc.
  6. Traverse the stream and call the avcodec_parameters_from_context function to fill the decoder parameters from the internal AVCodecContext into the stream's decoder parameters st->codecpar, corresponding to the structure AVCodecParameters. Thus, the main flow analysis of the avformat_find_stream_info function is complete.

avformat_seek_file#

avformat_seek_file is mainly used to perform seek operations, its definition simplified as follows:

int avformat_seek_file(AVFormatContext *s, int stream_index, int64_t min_ts,
                       int64_t ts, int64_t max_ts, int flags){
    // ...                   
    
    // Prefer using read_seek2
    if (s->iformat->read_seek2) {
        int ret;
        ff_read_frame_flush(s);
        ret = s->iformat->read_seek2(s, stream_index, min_ts, ts, max_ts, flags);
        if (ret >= 0)
            ret = avformat_queue_attached_pictures(s);
        return ret;
    }
    // ...
    
    // If read_seek2 is not supported, try using the old API seek
    if (s->iformat->read_seek || 1) {
        // ...
        int ret = av_seek_frame(s, stream_index, ts, flags | dir);
        return ret;
    }
    return -1; //unreachable                           
}

It can be seen that when executing the avformat_seek_file function, if the current demuxer (AVInputFormat) supports read_seek2, it uses the corresponding read_seek2 function; otherwise, it calls the old API's av_seek_frame function to perform the seek. The av_seek_frame function is as follows:

int av_seek_frame(AVFormatContext *s, int stream_index,int64_t timestamp, int flags){
    int ret;
    if (s->iformat->read_seek2 && !s->iformat->read_seek) {
        // ...
        return avformat_seek_file(s, stream_index, min_ts, timestamp, max_ts,
                                  flags & ~AVSEEK_FLAG_BACKWARD);
    }

    ret = seek_frame_internal(s, stream_index, timestamp, flags);

    // ...
    return ret;
}

It can be seen that if the current AVInputFormat supports read_seek2 but not read_seek, av_seek_frame falls back to avformat_seek_file, so the seek is ultimately performed by read_seek2. Otherwise it calls the internal function seek_frame_internal to perform the seek. The seek_frame_internal function chooses among several ways to seek a frame:

  1. seek_frame_byte: Seek by byte.
  2. read_seek: Seek according to the currently specified format, specifically supported by the corresponding demuxer.
  3. ff_seek_frame_binary: Seek using binary search.
  4. seek_frame_generic: Seek using a generic method.

This is also the logic for the seek operation. For example, the HLS format demuxer does not support read_seek2, only supports read_seek, and the ff_hls_demuxer is defined as follows:

AVInputFormat ff_hls_demuxer = {
    // ...
    .read_seek      = hls_read_seek,
};

av_dump_format#

The av_dump_format function is used to print detailed information about the stream input format based on the current AVFormatContext. The information printed when IjkPlayer plays a video normally is as follows:

IJKMEDIA: Input #0, hls,applehttp, from 'http://devimages.apple.com.edgekey.net/streaming/examples/bipbop_4x3/gear1/prog_index.m3u8':
IJKMEDIA:   Duration: 00:30:00.00, start: 19.888800, bitrate: 0 kb/s
IJKMEDIA:   Program 0
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0
IJKMEDIA:     Stream #0:0, 23, 1/90000: Video: h264, 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, smpte170m/smpte170m/bt709, topleft), 400x300 (400x304), 0/1, 29.92 tbr, 90k tbn, 180k tbc
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0
IJKMEDIA:     Stream #0:1, 9, 1/90000: Audio: aac ([15][0][0][0] / 0x000F), 22050 Hz, stereo, fltp
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0

av_find_best_stream#

The av_find_best_stream function is mainly used to select the most suitable audio and video streams, its definition simplified as follows:

int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type,
                        int wanted_stream_nb, int related_stream,
                        AVCodec **decoder_ret, int flags){
    // ...
    
    // Traverse to select the appropriate audio and video stream
    for (i = 0; i < nb_streams; i++) {
        int real_stream_index = program ? program[i] : i;
        AVStream *st          = ic->streams[real_stream_index];
        AVCodecParameters *par = st->codecpar;
        if (par->codec_type != type)
            continue;
        if (wanted_stream_nb >= 0 && real_stream_index != wanted_stream_nb)
            continue;
        if (type == AVMEDIA_TYPE_AUDIO && !(par->channels && par->sample_rate))
            continue;
        if (decoder_ret) {
            decoder = find_decoder(ic, st, par->codec_id);
            if (!decoder) {
                if (ret < 0)
                    ret = AVERROR_DECODER_NOT_FOUND;
                continue;
            }
        }
        disposition = !(st->disposition & (AV_DISPOSITION_HEARING_IMPAIRED | AV_DISPOSITION_VISUAL_IMPAIRED));
        count = st->codec_info_nb_frames;
        bitrate = par->bit_rate;
        multiframe = FFMIN(5, count);
        if ((best_disposition >  disposition) ||
            (best_disposition == disposition && best_multiframe >  multiframe) ||
            (best_disposition == disposition && best_multiframe == multiframe && best_bitrate >  bitrate) ||
            (best_disposition == disposition && best_multiframe == multiframe && best_bitrate == bitrate && best_count >= count))
            continue;
        best_disposition = disposition;
        best_count   = count;
        best_bitrate = bitrate;
        best_multiframe = multiframe;
        ret          = real_stream_index;
        best_decoder = decoder;
        // ...
    }
    // ...
    return ret;
} 

It can be seen that av_find_best_stream ranks candidate streams by comparing, in order: disposition, multiframe, and bitrate, with the probed frame count as a final tiebreaker. When dispositions are equal, the stream with more probed frames wins (multiframe, capped at 5 via FFMIN(5, count)); when those are also equal, the higher bitrate wins.

The disposition value comes from the disposition member of AVStream, whose values are the AV_DISPOSITION_* flags. For example, AV_DISPOSITION_HEARING_IMPAIRED above marks a stream intended for hearing-impaired audiences; such streams are ranked below ordinary ones. A passing familiarity with these flags is enough for now.

In the read_thread function, av_find_best_stream finds the best audio, video, and subtitle streams, and the next step is decoding and playback.

stream_component_open#

The stream_component_open function mainly creates the audio rendering thread, audio, video, and subtitle decoding threads, and initializes VideoState. Its definition is simplified as follows:

static int stream_component_open(FFPlayer *ffp, int stream_index){
    // ...
    // 1. Initialize AVCodecContext
    avctx = avcodec_alloc_context3(NULL);
    if (!avctx)
        return AVERROR(ENOMEM);

    // 2. Update the current AVCodecContext parameters using the stream's decoder parameters
    ret = avcodec_parameters_to_context(avctx, ic->streams[stream_index]->codecpar);
    if (ret < 0)
        goto fail;
    av_codec_set_pkt_timebase(avctx, ic->streams[stream_index]->time_base);

    // 3. Find the decoder based on the decoder ID
    codec = avcodec_find_decoder(avctx->codec_id);

    // ...

    // 4. If a decoder name has been specified, look up the decoder again using the decoder's name
    if (forced_codec_name)
        codec = avcodec_find_decoder_by_name(forced_codec_name);
    if (!codec) {
        if (forced_codec_name) av_log(NULL, AV_LOG_WARNING,
                                      "No codec could be found with name '%s'\n", forced_codec_name);
        else                   av_log(NULL, AV_LOG_WARNING,
                                      "No codec could be found with id %d\n", avctx->codec_id);
        ret = AVERROR(EINVAL);
        goto fail;
    }

    // ...
    
    // 5. Create the audio rendering thread, initialize audio, video, and subtitle decoders, and start decoding
    switch (avctx->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        // ...
        // Open audio output, create audio output thread ff_aout_android, corresponding audio thread function aout_thread,
        // ultimately calling the write method of AudioTrack to write audio data
        // ...
        // Initialize audio decoder
        decoder_init(&is->auddec, avctx, &is->audioq, is->continue_read_thread);
        if ((is->ic->iformat->flags & (AVFMT_NOBINSEARCH | AVFMT_NOGENSEARCH | AVFMT_NO_BYTE_SEEK)) && !is->ic->iformat->read_seek) {
            is->auddec.start_pts = is->audio_st->start_time;
            is->auddec.start_pts_tb = is->audio_st->time_base;
        }
        // Start audio decoding; this creates the audio decoding thread ff_audio_dec, whose thread function is audio_thread
        if ((ret = decoder_start(&is->auddec, audio_thread, ffp, "ff_audio_dec")) < 0)
            goto out;
        SDL_AoutPauseAudio(ffp->aout, 0);
        break;
    case AVMEDIA_TYPE_VIDEO:
        is->video_stream = stream_index;
        is->video_st = ic->streams[stream_index];
        // Asynchronously initialize the decoder, related to MediaCodec
        if (ffp->async_init_decoder) {
            // ...
        } else {
            // Initialize video decoder
            decoder_init(&is->viddec, avctx, &is->videoq, is->continue_read_thread);
            ffp->node_vdec = ffpipeline_open_video_decoder(ffp->pipeline, ffp);
            if (!ffp->node_vdec)
                goto fail;
        }
        // Start video decoding; this creates the video decoding thread ff_video_dec, whose thread function is video_thread
        if ((ret = decoder_start(&is->viddec, video_thread, ffp, "ff_video_dec")) < 0)
            goto out;

        // ...

        break;
    case AVMEDIA_TYPE_SUBTITLE:
        // ...
        // Initialize subtitle decoder
        decoder_init(&is->subdec, avctx, &is->subtitleq, is->continue_read_thread);
        // Start subtitle decoding; this creates the subtitle decoding thread ff_subtitle_dec, whose thread function is subtitle_thread
        if ((ret = decoder_start(&is->subdec, subtitle_thread, ffp, "ff_subtitle_dec")) < 0)
            goto out;
        break;
    default:
        break;
    }
    goto out;

fail:
    avcodec_free_context(&avctx);
out:
    av_dict_free(&opts);

    return ret;
}

It can be seen that stream_component_open has already created the corresponding decoding threads; the comments in the code above cover the details, so we will not elaborate further. Back in read_thread, after this function returns, some fields of IjkMediaMeta are filled in, ffp->prepared is set to true, and the prepare-complete event FFP_MSG_PREPARED is sent to the application layer, which ultimately triggers the OnPreparedListener callback.

Main loop of read_thread#

The main loop here refers to the main loop of data reading in read_thread, with the key flow as follows:

for (;;) {
    // 1. If the stream is closed or the application layer releases, is->abort_request is set to 1
    if (is->abort_request)
        break;
    // ...
    // 2. Handle seek operations
    if (is->seek_req) {
        // ...
        is->seek_req = 0;
        ffp_notify_msg3(ffp, FFP_MSG_SEEK_COMPLETE, (int)fftime_to_milliseconds(seek_target), ret);
        ffp_toggle_buffering(ffp, 1);
    }
    // 3. Handle attached_pic
    // If a stream carries the AV_DISPOSITION_ATTACHED_PIC flag, it is an attached picture
    // (e.g. album art in an *.mp3 file) exposed as a video stream, containing a single AVPacket: attached_pic
    if (is->queue_attachments_req) {
        if (is->video_st && (is->video_st->disposition & AV_DISPOSITION_ATTACHED_PIC)) {
            AVPacket copy = { 0 };
            if ((ret = av_packet_ref(&copy, &is->video_st->attached_pic)) < 0)
                goto fail;
            packet_queue_put(&is->videoq, &copy);
            packet_queue_put_nullpacket(&is->videoq, is->video_stream);
        }
        is->queue_attachments_req = 0;
    }
    // 4. If the queue is full, temporarily cannot read more data
    // If it is a network stream, ffp->infinite_buffer is set to 1
    /* if the queue are full, no need to read more */
    if (ffp->infinite_buffer<1 && !is->seek_req &&
        // ...
        SDL_LockMutex(wait_mutex);
        // Wait 10ms to give the decoding thread time to consume
        SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
        SDL_UnlockMutex(wait_mutex);
        continue;
    }
    // 5. Check if the stream has finished playing
    if ((!is->paused || completed) &&
        (!is->audio_st || (is->auddec.finished == is->audioq.serial && frame_queue_nb_remaining(&is->sampq) == 0)) &&
        (!is->video_st || (is->viddec.finished == is->videoq.serial && frame_queue_nb_remaining(&is->pictq) == 0))) {
        // Check if loop playback is set
        if (ffp->loop != 1 && (!ffp->loop || --ffp->loop)) {
            stream_seek(is, ffp->start_time != AV_NOPTS_VALUE ? ffp->start_time : 0, 0, 0);
        } else if (ffp->autoexit) {// Check if auto exit is set
            ret = AVERROR_EOF;
            goto fail;
        } else {
            // ...
            
            // Playback error...
            ffp_notify_msg1(ffp, FFP_MSG_ERROR);
            
            // Playback completed...
            ffp_notify_msg1(ffp, FFP_MSG_COMPLETED);
        }
    }
    pkt->flags = 0;
    // 6. Read data packets
    ret = av_read_frame(ic, pkt);
    // 7. Check data reading status
    if (ret < 0) {
        // ...
        
        // Handle reading to the end...
        if (pb_eof) {
            if (is->video_stream >= 0)
                packet_queue_put_nullpacket(&is->videoq, is->video_stream);
            if (is->audio_stream >= 0)
                packet_queue_put_nullpacket(&is->audioq, is->audio_stream);
            if (is->subtitle_stream >= 0)
                packet_queue_put_nullpacket(&is->subtitleq, is->subtitle_stream);
            is->eof = 1;
        }
        
        // Handle read errors...
        if (pb_error) {
            if (is->video_stream >= 0)
                packet_queue_put_nullpacket(&is->videoq, is->video_stream);
            if (is->audio_stream >= 0)
                packet_queue_put_nullpacket(&is->audioq, is->audio_stream);
            if (is->subtitle_stream >= 0)
                packet_queue_put_nullpacket(&is->subtitleq, is->subtitle_stream);
            is->eof = 1;
            ffp->error = pb_error;
            av_log(ffp, AV_LOG_ERROR, "av_read_frame error: %s\n", ffp_get_error_string(ffp->error));
            // break;
        } else {
            ffp->error = 0;
        }
        if (is->eof) {
            ffp_toggle_buffering(ffp, 0);
            SDL_Delay(100);
        }
        SDL_LockMutex(wait_mutex);
        SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
        SDL_UnlockMutex(wait_mutex);
        ffp_statistic_l(ffp);
        continue;
    } else {
        is->eof = 0;
    }
    // ...
    // 8. Fill the undecoded frame queue
    if (pkt->stream_index == is->audio_stream && pkt_in_play_range) {
        packet_queue_put(&is->audioq, pkt);
    } else if (pkt->stream_index == is->video_stream && pkt_in_play_range
               && !(is->video_st && (is->video_st->disposition & AV_DISPOSITION_ATTACHED_PIC))) {
        packet_queue_put(&is->videoq, pkt);
    } else if (pkt->stream_index == is->subtitle_stream && pkt_in_play_range) {
        packet_queue_put(&is->subtitleq, pkt);
    } else {
        av_packet_unref(pkt);
    }
    // ...
}

It can be seen that the infinite loop in read_thread is mainly used for data reading, and each time a frame of compressed encoded data is added to the undecoded frame queue, specifically as follows:

  1. When the stream is closed via stream_close (for example after a failed open) or the application layer calls release to destroy the player, is->abort_request is set to 1 and the loop exits.
  2. Handles seek operations during playback.
  3. Handles attached_pic in the stream. If a stream carries the AV_DISPOSITION_ATTACHED_PIC flag, it is an attached picture (e.g. album art in an *.mp3 file) exposed as a video stream, and it contains only a single AVPacket, attached_pic.
  4. Handles the full-queue case: when the total size of the undecoded audio, video, and subtitle queues exceeds 15 MB, or stream_has_enough_packets reports that every stream already has enough packets buffered, the demuxer's cache is considered full and the loop waits 10 ms so the decoding threads can consume data.
  5. Checks whether the stream has finished playing, whether loop playback or auto exit is set, and handles playback errors and completion.
  6. av_read_frame is the key function of the read_thread thread: it performs the demuxing, reading one packet of compressed encoded data at a time for the corresponding decoding thread to consume.
  7. Checks the read status, mainly handling end-of-stream and read errors.
  8. On a successful read, the packet is pushed onto the corresponding undecoded packet queue.

With that, the data reading thread read_thread of IjkPlayer has been basically sorted out; most of its work is built on FFmpeg.
