# Using Neon Instructions

Arm Neon is an advanced Single Instruction Multiple Data (SIMD) architecture extension for Arm processors. It processes multiple pieces of data in parallel with a single instruction, and is widely used in fields such as multimedia encoding/decoding and 2D/3D graphics to improve execution performance.

The Neon extension has been available since ARMv7. It is included by default in Cortex-A7, Cortex-A12, and Cortex-A15 processors, and is optional in other ARMv7 Cortex-A series processors. For details, see [Introducing NEON Development Article](https://developer.arm.com/documentation/dht0002/a/Introducing-NEON/What-is-SIMD-/ARM-SIMD-instructions?lang=en).

ARMv8-A processors integrate the Neon extension by default, and it is supported in both AArch64 and AArch32. For details, see [Learn the architecture - Introducing Neon](https://developer.arm.com/documentation/102474/0100/Fundamentals-of-Armv8-Neon-technology).

## Architecture Support in OpenHarmony

In OpenHarmony, the Neon extension is enabled by default in the arm64-v8a ABI. It is disabled by default in the armeabi-v7a ABI, in order to support as many ARMv7-A devices as possible.

In the LLVM toolchain of the OpenHarmony SDK, the armeabi-v7a ABI provides precompiled runtime libraries for several configurations. The directory structure is as follows. **native-root** is the root directory where the native package of the NDK is decompressed.

```
{native-root}/llvm/lib/clang/current/lib/arm-linux-ohos/
     |-- a7_hard_neon-vfpv4
     |     |-- clang_rt.crtbegin.o
     |     |-- clang_rt.crtend.o
     |     |-- ...
     |
     |-- a7_soft
     |     |-- clang_rt.crtbegin.o
     |     |-- clang_rt.crtend.o
     |     |-- ...
     |
     |-- a7_softfp_neon-vfpv4
           |-- clang_rt.crtbegin.o
           |-- clang_rt.crtend.o
           |-- ...
```

**hard**, **soft**, and **softfp** are the float-abi values. If none is specified, **softfp** is used by default. **neon-vfpv4** is the value specified by **-mfpu**. Based on the compilation parameters, the LLVM toolchain selects the binary libraries that match the architecture configuration.

## How to Use

The Neon extension can be used in the following ways:

- Use the Auto-Vectorization feature of LLVM, which lets the compiler generate Neon instructions. This feature is enabled by default and can be disabled by specifying the **-fno-vectorize** option. For details, see [Auto-Vectorization in LLVM](https://llvm.org/docs/Vectorizers.html).

- Use the Neon intrinsics library, which gives you direct, low-level access to Neon instructions.

- Write Neon assembly instructions.

For details, see [Arm Neon](https://developer.arm.com/Architectures/Neon).

## Example

The following example describes how to use Neon intrinsics in an armeabi-v7a OpenHarmony C++ project.

1. Include the **arm_neon.h** header file in the source code. Neon intrinsics are closely tied to the CPU architecture, so you are advised to guard this include with architecture macros such as those defined in **cpu_features_macros.h**.

   ```c++
   #include "cpu_features_macros.h"
   #if defined(CPU_FEATURES_ARCH_ARM)
   #include <arm_neon.h> // Neon intrinsics; only available on Arm targets
   #endif

   void call_neon_intrinsics(short *output, const short* input, const short* kernel, int width, int kernelSize)
   {
       int nn, offset = -kernelSize/2;
       for (nn = 0; nn < width; nn++)
       {
           int mm, sum = 0;
           int32x4_t sum_vec = vdupq_n_s32(0); // Neon intrinsics
           for(mm = 0; mm < kernelSize/4; mm++)
           {
               int16x4_t kernel_vec = vld1_s16(kernel + mm*4);
               int16x4_t input_vec = vld1_s16(input + (nn+offset+mm*4));
               sum_vec = vmlal_s16(sum_vec, kernel_vec, input_vec);
           }
           ...
       }
       ...
   }
   ```

2. Call the corresponding implementation functions based on the CPU feature.

   ```c++
   void Compute(void) {
   #if defined (CPU_FEATURES_ARCH_ARM)
       static const ArmFeatures features = GetArmInfo().features;
       // Determine whether the CPU features are supported based on the features field.
       if (features.neon) {
           // Run optimized code.
       } else {
           // Call normal functions written in C.
       }
   #endif
   }
   ```

3. Add the corresponding options to the **CMakeLists.txt** file.

   ```cmake
   if (${OHOS_ARCH} STREQUAL "armeabi-v7a")
       set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=neon -mfloat-abi=softfp")
   endif ()
   ```

Now you can use Neon intrinsics in your project.
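
The kernel in step 1 leaves out how the four 32-bit lanes of `sum_vec` are combined into one value and how leftover taps are handled when `kernelSize` is not a multiple of 4. The following is a minimal sketch of one way to do this, not the original implementation. The function names `window_dot` and `window_dot_scalar` are hypothetical. The sketch assumes the ACLE macro `__ARM_NEON` as the compile-time guard (rather than the feature macros from step 2) so that the same file also builds on targets without Neon.

```c++
#include <cassert>
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Scalar reference: dot product of `kernel` with the input window at `input`.
// (Hypothetical helper, used as the fallback and for checking the Neon path.)
int32_t window_dot_scalar(const int16_t* input, const int16_t* kernel, int kernelSize)
{
    assert(kernelSize >= 0);
    int32_t sum = 0;
    for (int i = 0; i < kernelSize; ++i) {
        sum += static_cast<int32_t>(kernel[i]) * static_cast<int32_t>(input[i]);
    }
    return sum;
}

// Same result as window_dot_scalar(); uses Neon intrinsics when the compiler
// targets Neon, and falls back to the scalar loop otherwise.
int32_t window_dot(const int16_t* input, const int16_t* kernel, int kernelSize)
{
#if defined(__ARM_NEON)
    assert(kernelSize >= 0);
    int32x4_t sum_vec = vdupq_n_s32(0);
    int mm = 0;
    // Multiply-accumulate four 16-bit taps per iteration into 32-bit lanes.
    for (; mm + 4 <= kernelSize; mm += 4) {
        int16x4_t kernel_vec = vld1_s16(kernel + mm);
        int16x4_t input_vec = vld1_s16(input + mm);
        sum_vec = vmlal_s16(sum_vec, kernel_vec, input_vec);
    }
    // Horizontal reduction: add the high and low halves, then the two lanes.
    int32x2_t pair = vadd_s32(vget_low_s32(sum_vec), vget_high_s32(sum_vec));
    pair = vpadd_s32(pair, pair);
    int32_t sum = vget_lane_s32(pair, 0);
    // Remaining taps when kernelSize is not a multiple of 4.
    for (; mm < kernelSize; ++mm) {
        sum += static_cast<int32_t>(kernel[mm]) * static_cast<int32_t>(input[mm]);
    }
    return sum;
#else
    return window_dot_scalar(input, kernel, kernelSize);
#endif
}
```

A caller would invoke `window_dot(input + nn + offset, kernel, kernelSize)` once per output sample. Comparing its result against `window_dot_scalar` on a few inputs is a simple way to check the vector path on a device.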