# Using SmartPerf-Host to Analyze Application Performance

## Overview

SmartPerf-Host is an intuitive performance and power optimization tool that offers in-depth data mining and fine-grained data visualization. In this tool, you can gain visibility into a multitude of metrics in terms of CPU scheduling, frequency, process and thread time slices, heap memory, frame rate, and more, in swimlanes. Better yet, you can analyze the collected data intuitively on the GUI. This tool provides five analysis templates: frame rate analysis, CPU/thread scheduling analysis, application startup analysis, task pool analysis, and animation analysis. For details about how to use the tool, see [Smartperf-Host User Guide](../../device-dev/device-test/smartperf-host.md).

This document provides some performance analysis examples to describe how to use the frame rate analysis and application startup analysis templates to collect and analyze performance data and identify areas of improvement.

## Deployment

Before using SmartPerf-Host, deploy it on your local device. Then you can access SmartPerf-Host at **https://[*Device IP address*]:9000/application/**, as shown in the following figure.

**Figure 1** Local deployment access page

## Performance Analysis

### FrameTimeline: Frame Rate Analysis

The FrameTimeline feature allows you to record the rendering data of each frame, automatically identify frame freezing, and gain system trace information in the same period.

#### Example

In this example, the **Grid** component is used to implement a grid layout. Frame freezing or frame loss occurs during swiping on the application page. Let's see how the FrameTimeline feature works in this case.

```
@Entry
@Component
struct Index {
  @State children: number[] = Array.from<undefined, number>(Array(2000).fill(undefined), (_v: undefined, k) => k);
  build() {
    Scroll() {
      Grid() {
        ForEach(this.children, (item: number) => {
          GridItem() {
            Stack() {
              Stack() {
                Stack() {
                  Text(item.toString())
                    .fontSize(32)
                }
              }
            }
          }
        }, (item: number) => item.toString())
      }
      .columnsTemplate('1fr 1fr 1fr 1fr')
      .columnsGap(0)
      .rowsGap(0)
      .size({ width: "100%", height: "100%" })
    }
  }
}
```

#### Recording Data

To record data with FrameTimeline, perform the following steps:

1. Choose **Record template** > **Trace template** and enable **FrameTimeline**.

   **Figure 2** Enabling FrameTimeline

2. Customize the recording settings.

   **Figure 3** Recording settings

3. Click **Record** in the upper right corner to start recording. At the same time, interact with the test device to reproduce the frame loss or frame freezing. When the recording is complete, the page automatically loads the trace data.

**NOTE**

- During data recording and analysis, do not exit the application or power off the device. Otherwise, the analysis may fail.

- After you click **Record**, if "please kill other hdc-server!" is displayed at the top of the web page, the HDC port of the device is in use. In this case, run **hdc kill** in the CLI, reconnect to the device, and try again.

#### Analyzing Data

A complete rendering process is as follows: The application responds to the user input, completes UI drawing, and submits the UI drawing to Render Service, which then coordinates resources such as the GPU to complete rendering, synthesis, and display. During this process, frame freezing and subsequent frame loss may occur on both the application and Render Service sides.
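
The demarcation idea can be sketched in a few lines of TypeScript. The rule used here is the one this section relies on: a frame counts as janky on a given side when its actual end time on that side is later than its expected deadline. The `FrameSide` and `FrameRecord` types, their field names, and the `locateJank` helper are hypothetical and exist only for illustration; they are not the SmartPerf-Host trace schema, and the tool's own classification may take more into account.

```typescript
// Hypothetical frame record; field names are illustrative, not a real trace format.
interface FrameSide {
  expectedEndMs: number; // expected deadline of the frame on this side
  actualEndMs: number;   // actual end time of the frame on this side
}

interface FrameRecord {
  app: FrameSide;           // application (UI) side
  renderService: FrameSide; // Render Service side
}

type JankSide = 'none' | 'application' | 'renderService' | 'both';

// A side is janky when the frame ends later than its expected deadline.
function isJanky(side: FrameSide): boolean {
  return side.actualEndMs > side.expectedEndMs;
}

// Preliminary demarcation: report which side(s) missed the deadline.
function locateJank(frame: FrameRecord): JankSide {
  const appLate = isJanky(frame.app);
  const rsLate = isJanky(frame.renderService);
  if (appLate && rsLate) {
    return 'both';
  }
  if (appLate) {
    return 'application';
  }
  if (rsLate) {
    return 'renderService';
  }
  return 'none';
}

// Made-up sample values: the application side overruns, Render Service does not.
const sample: FrameRecord = {
  app: { expectedEndMs: 16.6, actualEndMs: 30.0 },
  renderService: { expectedEndMs: 33.2, actualEndMs: 33.0 },
};
const verdict: JankSide = locateJank(sample);
console.log(verdict);
```

With the made-up sample values, only the application side ends after its deadline, so the investigation would start with the UI thread.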

Based on the three groups of data shown in Figure 4, Figure 5, and Figure 6, you can quickly locate where frame loss occurs and complete preliminary demarcation.

**Figure 4** Total time consumed by the UI and RenderService

**Figure 5** Time consumed by the UI

**Figure 6** Time consumed by RenderService

- **Expected Timeline** represents the expected, ideal timeline, and **Actual Timeline** represents the actual timeline.

- There are three types of frames in the timeline: Green frames are normal frames, orange frames are janky frames, and yellow frames are frames where the interaction between the application and Render Service is abnormal.

- In the preceding figures, the length of each frame indicates the amount of time spent on the frame.

- If the actual end time of a frame on the application or Render Service side is later than the expected deadline, the frame is considered janky.

- If there are orange frames on the application side, check whether the processing logic of the UI thread is too complex or inefficient and whether resources are preempted by other tasks.

- If there are orange frames on the Render Service side, check whether the GUI layout is too complex. You can use ArkUI Inspector and [HiDumper](../performance/performance-optimization-using-hidumper.md) to analyze and locate the fault.

In this example, as shown in Figure 5 and Figure 6, the frame freezing issue lies on the application side. Click a janky frame for detailed analysis. The associated frames are connected by lines, and the details of the frame are displayed under **Current Selection**, as shown in Figure 7.

**Figure 7** Frame freezing in the application

- **Duration** indicates the amount of time spent on the frame.

- **Jank Type** indicates the janky frame type. **APP Deadline Missed** indicates that the janky frame occurs on the application side.

- **FrameTimeLine flows Slice** indicates the associated frame in **FrameTimeLine**.

- **Preceding flows Slice** indicates the associated frame in Render Service.

In the following figure that shows the expanded application lanes, there are two lanes with the same name and PID. The first lane indicates the thread usage, and the second lane indicates the call stack in the thread. The trace data for the period of the janky frame shows that FlushLayoutTask, which re-measures and lays out items, is time consuming. A closer look reveals that Layout[Grid] takes the longest time. Therefore, it is safe to conclude that the frame freezing is caused by grid layout processing logic that is too complex or inefficient.

**Figure 8** Application layout drawing trace data

After locating and analyzing the grid layout code segment, we can optimize the code as follows: Remove the redundant three-layer stack container, pre-convert the source data to the string type required by the layout, and add the **cachedCount** parameter to the **Grid** component to work with the **LazyForEach** syntax for pre-loading. Set **cachedCount** to the number of grid items that can be rendered on one screen. After the optimization, let's record data in the same way. As shown in Figure 9, no frame freezing or frame loss occurs during swiping.

**Figure 9** FrameTimeline diagram after optimization

The code after optimization is as follows:

```
class MyDataSource implements IDataSource { // LazyForEach data source
  private list: string[] = [];

  constructor(list: string[]) {
    this.list = list;
  }

  totalCount(): number {
    return this.list.length;
  }

  getData(index: number): string {
    return this.list[index];
  }

  registerDataChangeListener(_: DataChangeListener): void {
  }

  unregisterDataChangeListener(): void {
  }
}
@Entry
@Component
struct Index {
  @State children: string[] = Array.from<undefined, string>(Array(2000).fill(undefined), (_v: undefined, k) => k.toString());
  @State data: MyDataSource = new MyDataSource(this.children)
  build() {
    Scroll() {
      Grid() {
        LazyForEach(this.data, (item: string) => {
          GridItem() {
            Text(item)
              .fontSize(32)
          }
        }, (item: string) => item)
      }
      .cachedCount(80)
      .columnsTemplate('1fr 1fr 1fr 1fr')
      .columnsGap(0)
      .rowsGap(0)
      .size({ width: "100%", height: "100%" })
    }
  }
}
```

### AppStartup: Application Startup Analysis

The AppStartup feature provides the time consumption of each phase during application startup. With the provided data, you can discover which phase is slowing down your application startup and the time-consuming call stacks on the system side.

#### Example

This example shows how the AppStartup feature works.

```
@Entry
@Component
struct Index {
  @State private text: string = "hello world";
  private count: number = 0;

  aboutToAppear() {
    this.computeTask();
  }

  build() {
    Column({space: 10}) {
      Text(this.text).fontSize(50)
    }
    .width('100%')
    .height('100%')
    .padding(10)
  }

  computeTask() {
    this.count = 0;
    while (this.count < 10000000) {
      this.count++;
    }
  }
}
```

#### Recording Data

To record data with AppStartup, perform the following steps:

1. Switch to the **Flags** page and set **AppStartup** to **Enabled**.

   **Figure 10** Enabling AppStartup

2. Switch to the **Record template** page, click **Trace template**, and enable **AppStartup**.

   **Figure 11** Enabling the AppStartup template

3. On the **Record setting** tab, customize the recording settings.

   **Figure 12** Recording settings

4. Click **Record** in the upper right corner to start recording. At the same time, open the target application on the device. To end the recording, click **StopRecord**. Alternatively, wait until the recording is complete automatically. When the recording is complete, the page automatically loads the trace data.

   **Figure 13** Ending recording

#### Analyzing Data

Wait until the analysis result is automatically generated. Click the filter button in the upper right corner and select **AppStartup** to view and analyze data.

**Figure 14** Filtering template data

Expand the lane of the corresponding application and locate the time frame in which the application is started. Select all phases of the AppStartup lane. You can view the time consumption of each phase in the lower pane.

**Figure 15** Time required for each AppStartup phase (before optimization)

- **ProcessTouchEvent**: input and processing of click events

- **StartUIAbilityBySCB**: process information and window creation

- **LoadAbility**: process startup

- **Application Launching**: application loading

- **UI Ability Launching**: UI ability loading

- **UI Ability OnForeground**: application being switched to the foreground

- **First Frame - App Phase**: submission of the first frame for rendering – application

- **First Frame - Render Phase**: submission of the first frame for rendering – Render Service

As shown in the preceding figure, the **UI Ability OnForeground** phase takes the longest time, which is 323 ms.

**Figure 16** Time required for the **UI Ability OnForeground** phase (before optimization)

A closer look at the phase data reveals that the **aboutToAppear** lifecycle callback takes a long time, which is 268 ms, accounting for 82% of the time consumed by the entire **UI Ability OnForeground** phase.

**Figure 17** Time required for **aboutToAppear** (before optimization)

The code shows that a time-consuming calculation task is executed in the **aboutToAppear** lifecycle callback. This task slows down the cold start of the application.

To speed up application startup, we can run the calculation task in **aboutToAppear** asynchronously.
The code after optimization is as follows:

```
@Entry
@Component
struct Index {
  @State private text: string = "hello world";
  private count: number = 0;

  aboutToAppear() {
    setTimeout(() => {
      this.computeTask();
    }, 0)
  }

  build() {
    Column({space: 10}) {
      Text(this.text).fontSize(10)
    }
    .width('100%')
    .height('100%')
    .padding(10)
  }

  computeTask() {
    this.count = 0;
    while (this.count < 10000000) {
      this.count++;
    }
  }
}
```

Now, let's record the data in the same way.

**Figure 18** Time required for each AppStartup phase (after optimization)

The focus of optimization, the **UI Ability OnForeground** phase, where the **aboutToAppear** lifecycle is located, takes 81 ms.

**Figure 19** Time required for the UI Ability OnForeground phase (after optimization)

A closer look at the phase data reveals that the **aboutToAppear** lifecycle callback now takes 2 ms, accounting for only 2.5% of the time consumed by the entire **UI Ability OnForeground** phase.

**Figure 20** Time consumed by aboutToAppear (after optimization)
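
Why the callback becomes so short can be illustrated outside ArkUI with a plain TypeScript sketch. It assumes a standard JavaScript event loop, where a `setTimeout(..., 0)` callback runs only after the currently executing synchronous code has returned. The function `aboutToAppearDeferred` and the `events` log are hypothetical stand-ins for the component in the example, not ArkUI APIs.

```typescript
// Records the order in which things happen.
const events: string[] = [];

// Same busy loop as computeTask() in the example above.
function computeTask(): number {
  let count = 0;
  while (count < 10000000) {
    count++;
  }
  events.push('computeTask finished');
  return count;
}

// Hypothetical stand-in for the optimized aboutToAppear(): it only schedules
// the task and returns immediately.
function aboutToAppearDeferred(): void {
  setTimeout(() => {
    computeTask();
  }, 0);
  events.push('aboutToAppear returned');
}

aboutToAppearDeferred();
// Code that follows the lifecycle callback runs before the deferred task.
events.push('startup continues');

// Print the final order once the deferred task has had a chance to run.
setTimeout(() => {
  console.log(events.join(' -> '));
}, 50);
```

The deferred task is not made cheaper: it still runs later on the same thread. The sketch only shows that the work is moved out of the lifecycle callback, which is consistent with the shorter **aboutToAppear** time measured after optimization.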