Libavcodec: How to tell the end of an access unit when decoding an H.264 stream
I'm receiving H.264 video over RTP and decoding it with libavcodec. I unpack the NAL units from the RTP packets (including reassembling fragmentation units) before feeding them to avcodec.
I'm trying to show the effective decoding frame rate. I used to log the time after each successful decode video call where *got_picture_ptr is non-zero. That worked as long as I only ever received video with one slice per frame. But now I receive video where both I and P frames consist of 2 NAL units each, of types 5 and 1 respectively. Now, when I feed either slice of a frame, decode_video reports that it got a picture, and pAVFrame->coded_picture_number increases with every slice.
How do I go about reliably finding the beginning or end of a video frame/picture/access unit?
I've dumped out a few NAL units from the stream and run them through h264_analyze from h264bitstream.
Output from h264_analyze on 4 NAL Units
!! Found NAL at offset 695262 (0xA9BDE), size 25 (0x0019)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 7 ( Sequence parameter set )
======= SPS =======
profile_idc : 66
constraint_set0_flag : 1
constraint_set1_flag : 1
constraint_set2_flag : 1
constraint_set3_flag : 0
reserved_zero_4bits : 0
level_idc : 32
seq_parameter_set_id : 0
chroma_format_idc : 0
residual_colour_transform_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 0
log2_max_frame_num_minus4 : 12
pic_order_cnt_type : 2
log2_max_pic_order_cnt_lsb_minus4 : 0
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 1
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 1
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 1
=== VUI ===
aspect_ratio_info_present_flag : 1
aspect_ratio_idc : 1
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 1
video_format : 5
video_full_range_flag : 1
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 1
num_units_in_tick : 1
time_scale : 25
fixed_frame_rate_flag : 0
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 1
motion_vectors_over_pic_boundaries_flag : 1
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 6
log2_max_mv_length_vertical : 6
num_reorder_frames : 0
max_dec_frame_buffering : 1
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0

!! Found NAL at offset 695290 (0xA9BFA), size 4 (0x0004)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 8 ( Picture parameter set )
======= PPS =======
pic_parameter_set_id : 0
seq_parameter_set_id : 0
entropy_coding_mode_flag : 0
pic_order_present_flag : 0
num_slice_groups_minus1 : 0
slice_group_map_type : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
weighted_pred_flag : 0
weighted_bipred_idc : 0
pic_init_qp_minus26 : 3
pic_init_qs_minus26 : 0
chroma_qp_index_offset : 0
deblocking_filter_control_present_flag : 1
constrained_intra_pred_flag : 0
redundant_pic_cnt_present_flag : 0
transform_8x8_mode_flag : 1
pic_scaling_matrix_present_flag : 0
second_chroma_qp_index_offset : 1

!! Found NAL at offset 695297 (0xA9C01), size 50725 (0xC625)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 5 ( Coded slice of an IDR picture )
======= Slice Header =======
first_mb_in_slice : 0
slice_type : 2 ( I slice )
pic_parameter_set_id : 0
frame_num : 0
field_pic_flag : 0
bottom_field_flag : 0
idr_pic_id : 0
pic_order_cnt_lsb : 0
delta_pic_order_cnt_bottom : 0
redundant_pic_cnt : 0
direct_spatial_mv_pred_flag : 0
num_ref_idx_active_override_flag : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
cabac_init_idc : 0
slice_qp_delta : 5
sp_for_switch_flag : 0
slice_qs_delta : 0
disable_deblocking_filter_idc : 0
slice_alpha_c0_offset_div2 : 0
slice_beta_offset_div2 : 0
slice_group_change_cycle : 0
=== Prediction Weight Table ===
luma_log2_weight_denom : 0
chroma_log2_weight_denom : 0
luma_weight_l0_flag : 0
chroma_weight_l0_flag : 0
luma_weight_l1_flag : 0
chroma_weight_l1_flag : 0
=== Ref Pic List Reordering ===
ref_pic_list_reordering_flag_l0 : 0
ref_pic_list_reordering_flag_l1 : 0
=== Decoded Ref Pic Marking ===
no_output_of_prior_pics_flag : 0
long_term_reference_flag : 0
adaptive_ref_pic_marking_mode_flag : 0

!! Found NAL at offset 746025 (0xB6229), size 38612 (0x96D4)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 5 ( Coded slice of an IDR picture )
======= Slice Header =======
first_mb_in_slice : 1840
slice_type : 2 ( I slice )
pic_parameter_set_id : 0
frame_num : 0
field_pic_flag : 0
bottom_field_flag : 0
idr_pic_id : 0
pic_order_cnt_lsb : 0
delta_pic_order_cnt_bottom : 0
redundant_pic_cnt : 0
direct_spatial_mv_pred_flag : 0
num_ref_idx_active_override_flag : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
cabac_init_idc : 0
slice_qp_delta : 5
sp_for_switch_flag : 0
slice_qs_delta : 0
disable_deblocking_filter_idc : 0
slice_alpha_c0_offset_div2 : 0
slice_beta_offset_div2 : 0
slice_group_change_cycle : 0
=== Prediction Weight Table ===
luma_log2_weight_denom : 0
chroma_log2_weight_denom : 0
luma_weight_l0_flag : 0
chroma_weight_l0_flag : 0
luma_weight_l1_flag : 0
chroma_weight_l1_flag : 0
=== Ref Pic List Reordering ===
ref_pic_list_reordering_flag_l0 : 0
ref_pic_list_reordering_flag_l1 : 0
=== Decoded Ref Pic Marking ===
no_output_of_prior_pics_flag : 0
long_term_reference_flag : 0
adaptive_ref_pic_marking_mode_flag : 0
Both I slices show frame_num = 0. The next two slices (not shown) have frame_num = 1.
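One way to tell the two slices of a picture apart, going by the dump above, is the first_mb_in_slice field: it is 0 in the first slice and 1840 in the second. Below is a minimal sketch in C that reads that field straight from a slice NAL unit. The function names (h264_read_bit, h264_read_ue, h264_starts_new_picture) are made up for this illustration and are not part of libavcodec. It assumes the NAL unit starts at its one-byte header (no start code), that slices arrive in order (this stream sets constraint_set1_flag, so arbitrary slice ordering should not occur), and it skips emulation-prevention handling, which should not matter for this first field at ordinary picture sizes. The full rule in the H.264 specification compares more slice header fields (frame_num, pic_parameter_set_id, idr_pic_id and others), so treat this as a heuristic.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one bit at bit position *pos; returns 0 once past the end of buf. */
unsigned h264_read_bit(const uint8_t *buf, size_t len, size_t *pos)
{
    size_t byte = *pos >> 3;
    unsigned bit = 0;
    if (byte < len)
        bit = (buf[byte] >> (7 - (*pos & 7))) & 1u;
    (*pos)++;
    return bit;
}

/* Decode one unsigned Exp-Golomb value, ue(v): count leading zero bits,
 * skip the terminating 1 bit, then read that many suffix bits. */
uint32_t h264_read_ue(const uint8_t *buf, size_t len, size_t *pos)
{
    unsigned zeros = 0;
    uint32_t suffix = 0;
    while (*pos < len * 8 && h264_read_bit(buf, len, pos) == 0) {
        if (++zeros > 31)
            return UINT32_MAX;          /* malformed input */
    }
    for (unsigned i = 0; i < zeros; i++)
        suffix = (suffix << 1) | h264_read_bit(buf, len, pos);
    return ((uint32_t)1 << zeros) - 1 + suffix;
}

/* Returns 1 if nal is a coded slice (type 1 or 5) whose
 * first_mb_in_slice is 0, i.e. it likely begins a new picture. */
int h264_starts_new_picture(const uint8_t *nal, size_t len)
{
    if (len < 2)
        return 0;
    unsigned type = nal[0] & 0x1F;      /* nal_unit_type */
    if (type != 1 && type != 5)
        return 0;
    size_t pos = 8;                     /* skip the NAL header byte */
    return h264_read_ue(nal, len, &pos) == 0;
}
```

With something like this, the frame counter could be bumped each time h264_starts_new_picture() returns 1, instead of on every call where *got_picture_ptr is non-zero.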
Comments (1)
What kind of packetization does this H.264 stream use? For example, with FU-A/FU-B fragmentation (https://www.rfc-editor.org/rfc/rfc3984#page-11) you can always tell where a NAL unit ends, since it coincides with the end of the fragment marked as the last fragment of the current NALU.
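To make the fragment marking concrete, here is a rough sketch in C of how the FU indicator and FU header bytes from RFC 3984 can be inspected. The names fu_classify, fu_nal_header and rtp_marker are invented for this example. It assumes payload points at the RTP payload (just past the RTP header) and that rtp points at a complete RTP header. Note that the E bit only marks the end of one NAL unit, which here is one slice, not the whole access unit. RFC 3984 additionally sets the RTP marker bit on the last packet of an access unit, but says receivers must not depend on it alone, so it is best treated as a hint.

```c
#include <stddef.h>
#include <stdint.h>

enum fu_position { FU_NOT_FRAGMENT, FU_START, FU_MIDDLE, FU_END };

/* Classify an RTP payload as the start, middle or end of a fragmented NAL unit. */
enum fu_position fu_classify(const uint8_t *payload, size_t len)
{
    if (len < 2)
        return FU_NOT_FRAGMENT;
    unsigned type = payload[0] & 0x1F;  /* FU indicator: F | NRI | Type */
    if (type != 28 && type != 29)       /* 28 = FU-A, 29 = FU-B */
        return FU_NOT_FRAGMENT;
    uint8_t fu_header = payload[1];     /* FU header: S | E | R | Type */
    if (fu_header & 0x80)
        return FU_START;                /* S bit: first fragment */
    if (fu_header & 0x40)
        return FU_END;                  /* E bit: last fragment */
    return FU_MIDDLE;
}

/* Rebuild the original one-byte NAL header: F and NRI come from the
 * FU indicator, nal_unit_type from the FU header. */
uint8_t fu_nal_header(const uint8_t *payload)
{
    return (uint8_t)((payload[0] & 0xE0) | (payload[1] & 0x1F));
}

/* RTP marker bit: the top bit of the second byte of the RTP header. */
int rtp_marker(const uint8_t *rtp)
{
    return (rtp[1] >> 7) & 1;
}
```

For the two-slice frames in the question, an FU_END result would show up twice per picture, once per slice, so it cannot be used on its own to count frames.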