So I examined another open-source project... (Project Stage 2)

For the second stage of the final SPO600 project, the objective is to select one open-source project of our choosing, locate the SIMD code and examine its purpose.

Professor Chris provided many open-source project options for us; I decided to go with ffmpeg.

ffmpeg is a multimedia library that is used for transcoding between formats and for video editing (trimming, concatenating, scaling, effects, etc.). It has been in development since its first release on December 20, 2000, and the most recent commit was made only a few hours ago, as shown below.

 

I decided to take a dive into their repository, searching for any sign of SIMD code.

SIMD stands for Single Instruction, Multiple Data. It is a parallel processing technique in which one instruction operates on several data elements at once, and it is often used in 3D graphics and multimedia applications. This makes sense because images have two dimensions, and videos even include time as an extra 'dimension' as well, so there are always many pixels or samples that need the same operation applied to them. Parallel processing is crucial to operate efficiently on multimedia like that. This is what ffmpeg is all about!
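To make the idea concrete, here is a minimal sketch in portable C. It is not taken from ffmpeg: the function names, the 16-lane width, and the "brighten a pixel" operation are all assumptions picked for illustration. The inner loop over LANES stands in for what a real vector instruction would do in one step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumed vector width: 16 lanes of 8 bits = 128 bits */

/* Saturating add of one 8-bit value: clamps at 255 instead of wrapping. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Scalar version: one pixel is handled per loop iteration. */
void brighten_scalar(uint8_t *px, size_t n, uint8_t amount)
{
    for (size_t i = 0; i < n; i++)
        px[i] = sat_add_u8(px[i], amount);
}

/* SIMD-shaped version: pixels are handled in groups of LANES.
 * The inner loop models a single vector instruction; this plain C
 * code only imitates that shape and is not itself vectorized by hand. */
void brighten_lanes(uint8_t *px, size_t n, uint8_t amount)
{
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (size_t lane = 0; lane < LANES; lane++)
            px[i + lane] = sat_add_u8(px[i + lane], amount);

    /* Leftover pixels that do not fill a whole group. */
    for (; i < n; i++)
        px[i] = sat_add_u8(px[i], amount);
}
```

Both functions give the same result; the second one just makes the "same operation on 16 values at a time" structure visible.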

To determine what code counts as SIMD, I did some research into vector-related instructions for the AArch64 and x86_64 architectures.

AArch64 

In AArch64, Advanced SIMD can view the extension register bank as:

  • Thirty-two 128-bit registers V0-V31.

  • Thirty-two 64-bit registers D0-D31.

  • Thirty-two 32-bit registers S0-S31.

  • Thirty-two 16-bit registers H0-H31.

  • Thirty-two 8-bit registers B0-B31.

    source
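The list above can be pictured as five ways of looking at the same register: the D, S, H and B names refer to the low 64, 32, 16 and 8 bits of the matching V register. Below is a small C model of that idea. The vreg128 type and the view_* helpers are hypothetical names made up for this sketch, and byte 0 is assumed to hold the least significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit vector register.
 * bytes[0] is assumed to be the least significant byte. */
typedef struct {
    uint8_t bytes[16];
} vreg128;

/* Assemble the lowest nbytes bytes (at most 8) into one integer. */
uint64_t low_bits(const vreg128 *v, unsigned nbytes)
{
    uint64_t x = 0;
    for (unsigned i = 0; i < nbytes && i < 8; i++)
        x |= (uint64_t)v->bytes[i] << (8 * i);
    return x;
}

/* The narrower register names, modelled as views of the low bits. */
uint64_t view_d(const vreg128 *v) { return low_bits(v, 8); }           /* Dn: low 64 bits */
uint32_t view_s(const vreg128 *v) { return (uint32_t)low_bits(v, 4); } /* Sn: low 32 bits */
uint16_t view_h(const vreg128 *v) { return (uint16_t)low_bits(v, 2); } /* Hn: low 16 bits */
uint8_t  view_b(const vreg128 *v) { return (uint8_t)low_bits(v, 1); }  /* Bn: low 8 bits  */
```

If the register holds the bytes 0x01, 0x02, ..., 0x10, the B view reads 0x01, the H view reads 0x0201, and so on up to the D view.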

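To operate on these vector registers, it looks like there used to be an instruction spelled 'addv' for adding vectors. This was changed following this update to abide by NEON formatting: element-wise addition is now the 'add' instruction with 'v' registers as its operands (for example, add v0.16b, v1.16b, v2.16b). The AArch64 instruction set does still define ADDV, but that mnemonic sums the lanes of a single vector rather than adding two vectors.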
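As a rough illustration, here is how an element-wise add on 'v' registers differs from an across-lanes sum, which is what the ADDV mnemonic denotes in the AArch64 instruction set. This is a plain C model on sixteen 8-bit lanes; the function names are made up for this sketch, and the 8-bit wrap-around in both functions reflects my reading of the instruction descriptions.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16 /* models the .16b arrangement: 16 lanes of 8 bits */

/* Models "add vd.16b, vn.16b, vm.16b": each lane of a is added to the
 * matching lane of b, and each result wraps around modulo 256. */
void add_16b(uint8_t dst[LANES], const uint8_t a[LANES], const uint8_t b[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}

/* Models "addv bd, vn.16b": all lanes of one vector are summed into a
 * single 8-bit result, which also wraps around modulo 256. */
uint8_t addv_16b(const uint8_t a[LANES])
{
    unsigned sum = 0;
    for (int i = 0; i < LANES; i++)
        sum += a[i];
    return (uint8_t)sum;
}
```

The first function takes two vectors and produces a vector; the second takes one vector and produces a single value.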

Nevertheless, I searched for this pattern and found countless instances of these vectorized instructions in the ffmpeg codebase. Interestingly, the ones I looked at all live under /libavcodec.

Here is one instance:

umax compares corresponding elements in the vectors in the two source SIMD and FP registers, places the larger of each pair of unsigned integer values into a vector, and writes the vector to the destination SIMD and FP register.

uabd subtracts the elements of the vector of the second source SIMD and FP register from the corresponding elements of the first source SIMD and FP register, places the absolute values of the results into a vector, and writes the vector to the destination SIMD and FP register.

ushr (unsigned shift right by an immediate) reads each vector element in the source SIMD and FP register, right shifts each element by an immediate value, places the results into a vector, and writes the vector to the destination SIMD and FP register. All the values in this instruction are unsigned integer values. The results are truncated rather than rounded.

cmhs compares each vector element in the first source SIMD&FP register with the corresponding vector element in the second source SIMD&FP register. If the first unsigned integer value is greater than or equal to the second, it sets every bit of the corresponding vector element in the destination SIMD&FP register to one; otherwise, it sets every bit of that element to zero.
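To check my understanding of the four descriptions above, here is a lane-by-lane model in portable C. It is only a sketch: the function names and the fixed sixteen 8-bit lanes are assumptions, and none of this is ffmpeg code.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16 /* assumed arrangement: 16 unsigned 8-bit lanes */

/* umax: keep the larger unsigned value of each pair of lanes. */
void umax_16b(uint8_t dst[LANES], const uint8_t a[LANES], const uint8_t b[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = a[i] > b[i] ? a[i] : b[i];
}

/* uabd: unsigned absolute difference of each pair of lanes. */
void uabd_16b(uint8_t dst[LANES], const uint8_t a[LANES], const uint8_t b[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = (uint8_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
}

/* ushr: shift every lane right by the same immediate, discarding the
 * low bits (no rounding). shift is assumed to be in the range 1..8. */
void ushr_16b(uint8_t dst[LANES], const uint8_t a[LANES], unsigned shift)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = (uint8_t)(a[i] >> shift);
}

/* cmhs: unsigned "higher or same" compare. A lane becomes all ones
 * (0xFF) when a >= b and all zeros otherwise, giving a per-lane mask. */
void cmhs_16b(uint8_t dst[LANES], const uint8_t a[LANES], const uint8_t b[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = a[i] >= b[i] ? 0xFF : 0x00;
}
```

For example, with a lane holding 200 in the first source and 100 in the second, umax gives 200, uabd gives 100, and cmhs gives 0xFF. Shifting 200 right by 2 with ushr gives 50.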

I'm not exactly sure what this function does as a whole, but I think it's interesting to theorize.

x86_64

In x86, VBROADCASTSS, VBROADCASTSD, and VBROADCASTF128 are instructions that copy a 32-bit, 64-bit, or 128-bit memory operand, respectively, to all elements of an XMM or YMM vector register.
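The effect of a broadcast is easy to model in plain C. The sketch below imitates the VBROADCASTSS case, where one 32-bit float is loaded from memory and copied into every lane. The function name is made up for illustration, and the lane count is a parameter so that 4 models an XMM register and 8 models a YMM register.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the single float at *src into every one of nlanes lanes.
 * nlanes = 4 models a 128-bit XMM register, nlanes = 8 a 256-bit YMM. */
void broadcast_f32(float *lanes, size_t nlanes, const float *src)
{
    float value = *src; /* one load from memory */
    for (size_t i = 0; i < nlanes; i++)
        lanes[i] = value; /* the same value lands in each lane */
}
```

A typical reason to do this is to prepare a constant, such as a scale factor, so that it can be combined with a whole vector of different values in one later instruction.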

After searching the codebase for VBROADCASTSS, I scrolled through several occurrences of this instruction, but this one seemed ideal because its comments explain what's going on.

This code was located inside of /libavcodec/x86/celt_pvq_search.asm.

 

Overall

To conclude my findings, it seems like SIMD is an essential part of multimedia processing on both architectures I looked at. As we can see, the ffmpeg codebase is flooded with vectorized instructions, which I assume are necessary to process many pixels or samples of the multimedia at the same time.

This lab felt really difficult but interesting; I learnt a lot from doing research about SIMD and the difference in architectures, but at the same time I didn't really understand much from reading the SIMD code.
