H264 Profiles: Baseline, Main, High : In Sony Vegas and Sorenson Squeeze

Questions:

  • For H264-based encoders, the configuration dialog typically offers a choice of Profile: Baseline, Main or High.  The default varies from one encoder to another.  What do these mean exactly, and what guidance is there for choosing between them?
  • How do they influence things (encoding speed, quality, file-size) in practice?
  • What are their specific effects in Sony Vegas (my traditional workhorse) and Sorenson Squeeze (which I am currently experimenting with)?
    • Both of these applications offer (among their choices) CUDA-acceleration for H264 encoding.

The answer(s):

  • Profile controls the degree of sophistication in encoding and decoding.
  • It’s best to choose “High”
    • Baseline is the “cheap & nasty” variety, e.g. making no use of B-Frames.
    • Main is intermediate between Baseline and High.
    • High offers the best compression, and is the typical profile for broadcast (Blu-ray and TV).
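
As a rough illustration of what "sophistication" means here, the sketch below tabulates three headline coding tools per profile. It is a simplification (the standard defines many more tools), and the names `PROFILE_TOOLS` and `tools_lost` are hypothetical helpers, not part of any encoder's API.

```python
# Simplified view of which coding tools each H.264 profile permits.
# Only three headline tools are listed; the standard defines many more.
PROFILE_TOOLS = {
    "Baseline": {"b_frames": False, "cabac": False, "transform_8x8": False},
    "Main":     {"b_frames": True,  "cabac": True,  "transform_8x8": False},
    "High":     {"b_frames": True,  "cabac": True,  "transform_8x8": True},
}


def tools_lost(from_profile: str, to_profile: str) -> list:
    """Return the tools available in from_profile but not in to_profile."""
    have = PROFILE_TOOLS[from_profile]
    keep = PROFILE_TOOLS[to_profile]
    return sorted(tool for tool, ok in have.items() if ok and not keep[tool])


if __name__ == "__main__":
    print(tools_lost("High", "Baseline"))
    print(tools_lost("High", "Main"))
```

Dropping from High to Baseline gives up all three tools in this table, while dropping from High to Main gives up only the 8x8 transform.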

Experimentally, I found:

  • Within each encoding tool, viewed on its own:
    • Insignificant differences in encoding time and (perhaps to be expected) only marginal differences in file size.
      • Note: In my experiment I used MainConcept to compress HD 1920×1080 25p footage of a mid-shot of a lecturer in a static scene (moving undramatically under static lighting, against a static and fairly neutral background).  Settings were for bitrates of 12 Mbps average, 24 Mbps maximum.
  • Comparing the different tools:
    • Squeeze 8.5 took about twice as long as Sony Vegas 11 to encode to the same-specified (as far as I can determine) target.
  • I was unable to discern any difference in quality.  A quality measuring method would be useful here!
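
One common objective measure is PSNR (peak signal-to-noise ratio) between a source frame and the corresponding decoded frame. A minimal sketch, assuming both frames have already been decoded to equal-length sequences of 8-bit luma samples (the decoding step and the function name `psnr` are assumptions, not something either application provides):

```python
import math


def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(encoded) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all samples.
    mse = sum((r - e) ** 2 for r, e in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    source = [16, 128, 200, 235]
    decoded = [18, 126, 203, 235]
    print(f"{psnr(source, decoded):.2f} dB")
```

Higher is better. Averaging this over the frames of the Baseline, Main and High encodes at the same bitrate would put a number on differences that are too small to judge by eye.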

I have remaining uncertainties about specifying the number of reference-frames, both in general and in terms of how to do this in the various encoding applications.

Information sources: Web-Research and Experiments

Web-Research:

  • http://www.pelco.com/sites/global/en/sales-and-support/faq/faq_main.page?page=content&country=PELCO&lang=EN&id=FA31954&redirect=true
    • What is difference between baseline, main, and high profile for the Sarix IXE series network IP cameras?
      • The profile defines the subset of bit stream features in an H.264 stream, including color reproduction and additional video compression. It is important that the selected profile is compatible with the recording device so that a stream can be decoded and viewed.

        • Baseline:
          • A simple profile with a low compression ratio.
            • The Baseline profile supports I-frames and P-frames.
          • A baseline profile is compatible with more recorders but uses more bits to compress quality video than the other profiles.
          • Use the baseline profile in applications with limited scene changes; for example, an indoor scene with a single, unchanging primary light source and minimal motion.
        • Main:
          • An intermediate profile with a medium compression ratio.
            • The main profile supports I-frames, P-frames, and B-frames.
          • Main is the default profile setting.
          • This profile is compatible with most recorders and uses fewer bits to compress video than the baseline profile; however, it uses more bits than the high profile.
        • High:
          • A complex profile with a high compression ratio.
            • The high profile supports I-frames, P-frames, and B-frames.
          • This is the primary profile for high-definition television applications; for example this is the profile adopted for Blu-ray and HD-DVD.
  • http://ipvm.com/updates/142 (from 2009)
    • In a May Pelco seminar, Pelco advocated the use of High Profile H.264. H.264 provides a variety of profiles and levels (H.264 is more like a family of specifications than a single specific one). Manufacturers may select from these profiles.
    • Most IP camera companies are choosing baseline profile – the lowest of the options.
    • Pelco says they have selected High.
    • It is reported that high provides greater bandwidth and storage efficiency at the expense of increased processing power.
  • http://ipvm.com/report/h264_codec_shootout (from 2012)
    • H.264 Codec Shootout
    • For the past few years, most IP camera manufacturers only supported the most basic type – baseline profile. Now, increasingly, manufacturers are adding support for more ‘advanced’ types, including main and high profile.
    • Of the numerous H.264 profiles, the two most common considered for surveillance are baseline and main. Baseline is typically considered the least efficient of the H.264 profiles but also the least demanding of computing resources. By contrast, main profile is considered to be more bandwidth efficient but also more demanding.
    • Increasingly, new IP cameras are using main profile by default while the previous generation from 2-3 years ago were more likely to use baseline.
  • http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC#Profiles
    • Profiles
      • The standard defines 21 sets of capabilities, which are referred to as profiles, targeting specific classes of applications.
      • Profiles for non-scalable 2D video applications include the following:
        • Constrained Baseline Profile (CBP) Primarily for low-cost applications, this profile is most typically used in videoconferencing and mobile applications.  It corresponds to the subset of features that are in common between the Baseline, Main, and High Profiles.
        • Baseline Profile (BP) Primarily for low-cost applications that require additional data loss robustness, this profile is used in some videoconferencing and mobile applications. This profile includes all features that are supported in the Constrained Baseline Profile, plus three additional features that can be used for loss robustness (or for other purposes such as low-delay multi-point video stream compositing). The importance of this profile has faded somewhat since the definition of the Constrained Baseline Profile in 2009. All Constrained Baseline Profile bitstreams are also considered to be Baseline Profile bitstreams, as these two profiles share the same profile identifier code value.
        • Main Profile (MP) This profile is used for standard-definition digital TV broadcasts that use the MPEG-4 format as defined in the DVB standard.[22] It is not, however, used for high-definition television broadcasts, as the importance of this profile faded when the High Profile was developed in 2004 for that application.
        • Extended Profile (XP) Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
        • High Profile (HiP) The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (for example, this is the profile adopted by the Blu-ray Disc storage format and the DVB HDTV broadcast service)
        • {and there are several other profiles listed and described in this article}
  • http://forum.doom9.org/showthread.php?t=147426
    • You’ll end up with a FAR larger file with baseline (about 4 times)
    • Baseline profile means that you cannot use CABAC.  And CABAC gives you ~10-30% (or even more) of extra compression at the same quantizer, compared to CAVLC.
    • And you lose B-frames, which are a pretty big hit as well.
    • The number of reference frames is limited by the Level! More specifically by the maximum decoded picture buffer size (MaxDPB) defined by the individual Level.
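
The Level limit on reference frames can be worked out from the MaxDpbMbs values in Table A-1 of the H.264 specification: the frame budget is MaxDpbMbs divided by the frame size in macroblocks, capped at 16. The sketch below hard-codes a few of those values. The function name `max_ref_frames` is hypothetical and the table is only a partial copy, so check it against the specification before relying on it.

```python
# MaxDpbMbs per Level (partial copy of Table A-1 in the H.264 spec).
MAX_DPB_MBS = {
    "3.1": 18000,
    "4.0": 32768,
    "4.1": 32768,
    "4.2": 34816,
    "5.0": 110400,
    "5.1": 184320,
}


def max_ref_frames(level: str, width: int, height: int) -> int:
    """Upper bound on reference frames for a given Level and frame size."""
    # Frame dimensions in 16x16 macroblocks, rounded up (1080 -> 68 rows).
    mbs_wide = (width + 15) // 16
    mbs_high = (height + 15) // 16
    frames = MAX_DPB_MBS[level] // (mbs_wide * mbs_high)
    return min(frames, 16)  # the standard never allows more than 16


if __name__ == "__main__":
    for level in ("4.0", "4.1", "5.1"):
        print(level, max_ref_frames(level, 1920, 1080))
```

For 1920×1080 this gives 4 reference frames at Level 4.0 or 4.1 and the full 16 at Level 5.1.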

Experiments:

  • Context:
    • Source Video:
      • HD 1920×1080 25p footage of a mid-shot of a lecturer in a static scene (himself moving undramatically in the context of static lighting and seen against a static and fairly neutral background).
    • Mac Pro (8-core, GPU = GeForce 8800 GT) running Windows 7 (64-bit) via Boot Camp.
      • GPU usefully makes a “confirmatory” fan-whine noise when the CUDA is in typical operation.
    • Target Video:
      •  HD 1920×1080 25p (same as source)
      • Bitrate = 12Mbps average, 24 Mbps maximum.
        • Expected file size: a 38-second video implies 38*12/8 = 57 MB.
  • Test-Runs:
    • Sony Vegas 11 (CUDA-enabled)
      • For 1-pass:
        • Baseline: 57.8 MB, Main: 48.5 MB, High: 45.8 MB
        • Duration: 46 seconds
      • For 2-pass:
        • Baseline: 66.0 MB, Main: 48.4 MB, High: 48.4 MB
          • The increase in file size presumably indicates that the encoder raised the bitrate at some point, to somewhere between the specified average bitrate and the specified maximum bitrate.
          • Duration: 90 seconds
      • Yes, that’s right: 2-pass actually gave larger file sizes than 1-pass for Baseline and High (Main was essentially unchanged).
    • Squeeze 8.5
      • In all cases, 1 or 2 pass, B-pictures = 0 or 1:
        •  File size around 58 MB.
      • For 1-pass:
        • Duration: 80 sec
      • For 2-pass:
        • Duration: 2 min 20 sec
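
The expected-size arithmetic and the relative savings can be checked with a few lines of Python. This is only a sketch: it assumes a clip length of 38 seconds (the duration at which 12 Mbps gives the 57 MB expected above), and the helper names are hypothetical. The sizes are the 1-pass Sony Vegas results listed above.

```python
def expected_size_mb(duration_s: float, avg_mbps: float) -> float:
    """File size in megabytes for a given duration and average bitrate."""
    return duration_s * avg_mbps / 8  # 8 bits per byte


def saving_vs_baseline(sizes_mb: dict) -> dict:
    """Percentage size reduction of each profile relative to Baseline."""
    base = sizes_mb["Baseline"]
    return {name: round(100 * (base - size) / base, 1)
            for name, size in sizes_mb.items()}


if __name__ == "__main__":
    print(expected_size_mb(38, 12))  # 57.0
    vegas_1pass = {"Baseline": 57.8, "Main": 48.5, "High": 45.8}
    print(saving_vs_baseline(vegas_1pass))
```

On these figures Main comes out roughly 16% smaller than Baseline, and High roughly 21% smaller.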
