1# Ark Bytecode File Format
The Ark Bytecode file is the binary output produced by compiling ArkTS, TS, or JS source code. This topic describes the Ark Bytecode file format in detail, introducing each part of the file to help you analyze and modify the bytecode.
3
4
5## Constraints
This topic applies only to Ark Bytecode of version 11.0.2.0. (The version number is an internal reserved field of the Ark compiler.)
7
8
9## Data Types of Bytecode File
10
11### Integer
12
13| **Name**       | **NOTE**                          |
14| -------------- | ---------------------------------- |
15| `uint8_t`      | 8-bit unsigned integer.                 |
16| `uint16_t`     | 16-bit unsigned integer in little-endian mode.  |
17| `uint32_t`     | 32-bit unsigned integer in little-endian mode.  |
| `uleb128`      | LEB128-encoded unsigned integer.            |
| `sleb128`      | LEB128-encoded signed integer.            |
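
The variable-length `uleb128` and `sleb128` types are the least obvious entries in the table above, so a minimal decoding sketch may help. It follows the standard LEB128 scheme (seven payload bits per byte, high bit set on every byte except the last). The function names `read_uleb128` and `read_sleb128` are illustrative and do not come from the Ark toolchain.

```python
def read_uleb128(buf, pos=0):
    """Decode an unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:              # high bit clear marks the last byte
            return result, pos
        shift += 7


def read_sleb128(buf, pos=0):
    """Decode a signed LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:               # bit 6 of the last byte is the sign bit
                result -= 1 << shift
            return result, pos


def read_uint16(buf, pos=0):
    """Read a little-endian uint16_t."""
    return int.from_bytes(buf[pos:pos + 2], "little")


def read_uint32(buf, pos=0):
    """Read a little-endian uint32_t."""
    return int.from_bytes(buf[pos:pos + 4], "little")
```

Each reader returns the position of the next unread byte, which makes it easy to chain calls when walking a structure field by field.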
20
21
22### String
23- Align in single byte.
24- Format
25
26| **Name**| **Format**| **NOTE**                                              |
27| -------------- | -------------- | ------------------------------------------------------------ |
| `utf16_length`   | `uleb128`  | The value is **len << 1 \| is_ascii**, where **len** indicates the size of the string when encoded in UTF-16, and **is_ascii** indicates whether the string contains only ASCII characters. The value of **is_ascii** can be **0** or **1**.|
29| `data`           | `uint8_t[]` | MUTF-8 encoded character sequence ending with **\0**. |
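
As a rough illustration of the String layout, the sketch below splits `utf16_length` into its two parts and reads `data` up to the terminating **\0**. The helper names are hypothetical. The sketch also assumes that plain UTF-8 decoding is good enough, which holds for ASCII-only strings but not for every MUTF-8 sequence (for example, surrogate pairs).

```python
def read_uleb128(buf, pos=0):
    """Decode an unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_string(buf, pos=0):
    """Read a String; return (text, utf16_len, is_ascii, next_position)."""
    packed, pos = read_uleb128(buf, pos)
    utf16_len = packed >> 1        # upper bits: len
    is_ascii = bool(packed & 1)    # lowest bit: is_ascii
    end = buf.index(0, pos)        # data ends with a \0 byte
    # Assumption: UTF-8 decoding stands in for a full MUTF-8 decoder.
    text = buf[pos:end].decode("utf-8", errors="replace")
    return text, utf16_len, is_ascii, end + 1


encoded = bytes([(5 << 1) | 1]) + b"hello\x00"
print(read_string(encoded))
```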
30
31
32### TaggedValue
33- Align in single byte.
34- Format
35
36| **Name**| **Format**| **NOTE**                               |
37| -------------- | -------------- | -------------------------------------------- |
38| `tag`          | `uint8_t`      | Indicates the tag of a data type.                          |
39| `data`         | `uint8_t[]`    | According to different tags, **data** is of different types or is empty.|
40
41
42## TypeDescriptor
**TypeDescriptor** is the format of [Class](#class) names. A name has the form **L_ClassName;**, which consists of **'L'**, **'_'**, **ClassName**, and **';'**. **ClassName** is the full name of the class, with each **'.'** in the name replaced by **'/'**.
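
The naming rule can be expressed in a few lines. This is only a sketch: the function names and the sample class name `com.example.Demo` are made up for illustration.

```python
def to_type_descriptor(class_name):
    """Build a TypeDescriptor from a full class name."""
    return "L_" + class_name.replace(".", "/") + ";"


def strip_type_descriptor(descriptor):
    """Return the ClassName part (with '/' separators) of a TypeDescriptor."""
    if not (descriptor.startswith("L_") and descriptor.endswith(";")):
        raise ValueError("not a TypeDescriptor: %r" % descriptor)
    return descriptor[2:-1]


print(to_type_descriptor("com.example.Demo"))
print(strip_type_descriptor("L_com/example/Demo;"))
```

`strip_type_descriptor` deliberately leaves the **'/'** separators in place, because mapping them back to **'.'** would be ambiguous if the original name already contained **'/'**.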
44
45
46## Bytecode File Layout
The bytecode file starts with the [Header](#header) structure. All structures in the file can be accessed directly or indirectly from the **Header**. A structure in the bytecode file is referenced by either an offset or an index. An offset is a 32-bit value, counted from 0, that indicates the distance from the beginning of the file to the start position of the structure. An index is a 16-bit value that indicates the position of the structure in an index region. This mechanism is described in [IndexSection](#indexsection).

All multi-byte values in the bytecode file are little-endian.
50
51
52### Header
53- Align in single byte.
54- Format
55
56| **Name**   | **Format**| **NOTE**                                              |
57| ----------------- | -------------- | ------------------------------------------------------------ |
58| `magic`             | `uint8_t[8]`     | Value of the magic number must be **'P' 'A' 'N' 'D' 'A' '\0' '\0' '\0'**.   |
| `checksum`          | `uint32_t`       | **Adler32** checksum of the content of the bytecode file, excluding the magic number and this checksum field.|
60| `version`           | `uint8_t[4]`     | [Version](#version) number of the bytecode file.|
61| `file_size`         | `uint32_t`       | Size of a bytecode file, in bytes.                            |
62| `foreign_off`       | `uint32_t`       | An offset that points to an external region. The external region contains two types of elements: [ForeignClass](#foreignclass) or [ForeignMethod](#foreignmethod). **foreign_off** points to the first element in the region.|
63| `foreign_size`      | `uint32_t`       | Size of the external region, in bytes.                              |
64| `num_classes`       | `uint32_t`       | Number of elements in the [ClassIndex](#classindex) structure, that is, the number of [Class](#class) defined in the file.|
65| `class_idx_off`     | `uint32_t`       | An offset that points to [ClassIndex](#classindex).|
66| `num_lnps`          | `uint32_t`       | Number of elements in the [LineNumberProgramIndex](#linenumberprogramindex) structure, that is, the number of [Line number program](#line-number-program) defined in the file.|
67| `lnp_idx_off`       | `uint32_t`       | An offset that points to [LineNumberProgramIndex](#linenumberprogramindex).|
68| `reserved`          | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
69| `reserved`          | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
70| `num_index_regions` | `uint32_t`       | Number of elements in the [IndexSection](#indexsection) structure, that is, the number of [IndexHeader](#indexheader) in the file.|
71| `index_section_off` | `uint32_t`       | An offset that points to [IndexSection](#indexsection).|
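
Because every **Header** field has a fixed width, the whole structure (60 bytes) can be unpacked in one call. The sketch below uses Python's `struct` and `zlib` modules. It rests on a few assumptions: the checksum covers everything after the first 12 bytes, the four version bytes are stored in major, minor, feature, build order, and the dictionary keys `reserved_0` and `reserved_1` are hypothetical names for the two reserved fields.

```python
import struct
import zlib

MAGIC = b"PANDA\x00\x00\x00"
# magic (8 bytes), checksum (uint32), version (4 bytes), then eleven uint32 fields.
HEADER_LAYOUT = struct.Struct("<8sI4s11I")
UINT32_FIELDS = (
    "file_size", "foreign_off", "foreign_size", "num_classes",
    "class_idx_off", "num_lnps", "lnp_idx_off",
    "reserved_0", "reserved_1",  # hypothetical keys for the two reserved fields
    "num_index_regions", "index_section_off",
)


def parse_header(data):
    """Unpack the Header at offset 0 into a dictionary."""
    magic, checksum, version, *rest = HEADER_LAYOUT.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    header = dict(zip(UINT32_FIELDS, rest))
    header["checksum"] = checksum
    # Assumption: bytes are stored in major, minor, feature, build order.
    header["version"] = ".".join(str(part) for part in version)
    return header


def checksum_matches(data, header):
    # Assumption: the excluded bytes are the 8-byte magic plus the 4-byte checksum.
    return zlib.adler32(data[12:]) == header["checksum"]


# Synthetic 60-byte file that contains only a header, for demonstration.
body = bytes([11, 0, 2, 0]) + struct.pack("<11I", 60, 0, 0, 0, 60, 0, 60, 0, 0, 0, 60)
demo = MAGIC + struct.pack("<I", zlib.adler32(body)) + body
header = parse_header(demo)
print(header["version"], header["file_size"], checksum_matches(demo, header))
```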
72
73
74### Version
The bytecode version number consists of four parts in the format of **major version number.minor version number.feature version number.build version number**.
76
77| **Name**| **Format**| **NOTE**                                            |
78| -------------- | -------------- | ---------------------------------------------------------- |
79| Major version number      | `uint8_t`        | Indicates the changes to the bytecode file format caused by the overall structure adjustment.                |
80| Minor version number      | `uint8_t`        | Indicates the changes to the bytecode file format caused by partial structure adjustment or major feature adjustment.|
81| Feature version number    | `uint8_t`        | Indicates the changes to the bytecode file format caused by small- and medium-sized features.                    |
82| Build version number    | `uint8_t`        | Indicates the changes to the bytecode file format caused by defect rectification.                    |
83
84
85### ForeignClass
Describes external classes in the bytecode file. They are declared in other files and referenced in the current bytecode file.
87- Align in single byte.
88- Format
89
90| **Name**| **Format**| **NOTE**                                              |
91| -------------- | -------------- | ------------------------------------------------------------ |
| `name`           | `String`         | External class name, which follows the [TypeDescriptor](#typedescriptor) syntax.|
93
94
95### ForeignMethod
Describes external methods in the bytecode file. They are declared in other files and referenced in the current bytecode file.
97- Align in single byte.
98- Format
99
100| **Name**| **Format**| **NOTE**                                              |
101| -------------- | -------------- | ------------------------------------------------------------ |
102| `class_idx`      | `uint16_t`       | An index pointing to the class to which the method belongs. It points to a position in [ClassRegionIndex](#classregionindex), whose value is an offset pointing to [Class](#class) or [ForeignClass](#foreignclass).|
103| `reserved`       | `uint16_t`       | Reserved field used internally in the Ark Bytecode file.              |
104| `name_off`       | `uint32_t`       | An offset that points to [string](#string), indicating the method name.|
105| `index_data`     | `uleb128`        | [MethodIndexData](#methodindexdata) data of the method.|
106
107**Note:**<br>
108With the offset of **ForeignMethod**, an appropriate **IndexHeader** can be found to parse the **class_idx**.
109
110
111### ClassIndex
112The **ClassIndex** structure is used to quickly locate the definition of the **Class** by name.
113- Align in four bytes.
114- Format
115
116| **Name**| **Format**| **NOTE**                                              |
117| -------------- | -------------- | ------------------------------------------------------------ |
118| `offsets`        | `uint32_t[]`     | An array. The value of each element in this array is an offset pointing to [Class](#class). Elements in an array are sorted by the class name which follows the [TypeDescriptor](#typedescriptor) syntax. The array length is specified by **num_classes** in [Header](#header).|
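
Reading **ClassIndex** amounts to unpacking **num_classes** little-endian 32-bit values starting at **class_idx_off**. A minimal sketch, with a hypothetical function name and synthetic input data:

```python
import struct


def read_class_index(data, class_idx_off, num_classes):
    """Return the list of Class offsets stored in ClassIndex."""
    return list(struct.unpack_from("<%dI" % num_classes, data, class_idx_off))


# Synthetic buffer: 4 padding bytes followed by three offsets.
demo = bytes(4) + struct.pack("<3I", 0x40, 0x80, 0xC0)
print(read_class_index(demo, 4, 3))
```

Since the offsets are sorted by class name, a binary search over the names they point to is one possible way to locate a class.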
119
120
121### Class
In a bytecode file, a class can represent a source code file of Ark Bytecode or a built-in [Annotation](#annotation). When it represents a source code file, the methods of the class correspond to the functions in the source code file, and the fields of the class correspond to the internal information in the source file. When it represents a built-in **Annotation**, the class does not contain fields or methods. A class in the source code file is represented in the bytecode file as a method corresponding to its constructor.
123
124- Align in single byte.
125- Format
126
127| **Name**| **Format**| **NOTE**                                              |
128| -------------- | -------------- | ------------------------------------------------------------ |
129| `name`           | `String`         | Class name, which follows the [TypeDescriptor](#typedescriptor) syntax.|
130| `reserved`       | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
| `access_flags`   | `uleb128`        | Access flags of **Class**, which are a combination of [ClassAccessFlag](#classaccessflag) values.|
132| `num_fields`     | `uleb128`        | Number of fields of **Class**.                                         |
133| `num_methods`    | `uleb128`        | Number of methods of **Class**.                                         |
134| `class_data`     | `TaggedValue[]`  | Array with variable length. Each element in the array is of the [TaggedValue](#taggedvalue) type, and the element tag is of the [ClassTag](#classtag) type. Elements in the array are sorted in ascending order based on the tag (except the **0x00** tag).|
135| `fields`         | `Field[]`        | Array of **Class** fields. Each element in this array is of the [Field](#field) type. The array length is specified by **num_fields**.|
136| `methods`        | `Method[]`       | Array of **Class** methods. Each element in this array is of the [Method](#method) type. The array length is specified by **num_methods**.|
137
138
139### ClassAccessFlag
140
141| **Name**| **Value**| **NOTE**                                              |
142| -------------- | ------------ | ------------------------------------------------------------ |
| `ACC_PUBLIC`     | `0x0001`       | Default attribute. Each [Class](#class) in the Ark Bytecode file has this flag.|
144| `ACC_ANNOTATION` | `0x2000`       | Declares this class as the [Annotation](#annotation) type.|
145
146
147### ClassTag
148- Align in single byte.
149- Format
150
151| **Name**| **Value**| **Quantity**| **Format**| **NOTE**                                              |
152| -------------- | ------------ | -------------- | -------------- | ------------------------------------------------------------ |
153| `NOTHING`        | `0x00`  | `1`  | `none`    | The [TaggedValue](#taggedvalue) with this tag is the final item of the **class_data**.|
| `SOURCE_LANG`    | `0x02`  | `0-1`  | `uint8_t` | The **data** of [TaggedValue](#taggedvalue) with this tag is **0**, indicating that the source code language is ArkTS, TS, or JS.|
155| `SOURCE_FILE`    | `0x07`  | `0-1`  | `uint32_t`| The **data** of [TaggedValue](#taggedvalue) with this tag is an offset that points to [string](#string), indicating the name of the source file.|
156
157**Note:**<br>
**ClassTag** is the tag of the element ([TaggedValue](#taggedvalue)) in the **class_data**. The **Quantity** column in the table indicates how many times an element with this tag can occur in the **class_data** of a [Class](#class).
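
A **class_data** array can be walked tag by tag until the **NOTHING** tag is reached. The sketch below shows one way to do that. The function name and the returned dictionary are illustrative, and the sketch assumes that an unknown tag should be treated as an error, because the size of its **data** would not be known.

```python
import struct

# tag value -> (name, struct format of the data that follows the tag)
CLASS_TAGS = {
    0x02: ("SOURCE_LANG", "<B"),  # uint8_t
    0x07: ("SOURCE_FILE", "<I"),  # uint32_t offset to a string
}


def read_class_data(buf, pos=0):
    """Read TaggedValue items until NOTHING (0x00); return (values, next_position)."""
    values = {}
    while True:
        tag = buf[pos]
        pos += 1
        if tag == 0x00:  # NOTHING: final item, carries no data
            return values, pos
        if tag not in CLASS_TAGS:
            raise ValueError("unknown ClassTag 0x%02x" % tag)
        name, fmt = CLASS_TAGS[tag]
        (value,) = struct.unpack_from(fmt, buf, pos)
        pos += struct.calcsize(fmt)
        values[name] = value


demo = bytes([0x02, 0x00, 0x07]) + struct.pack("<I", 0x1234) + bytes([0x00])
print(read_class_data(demo))
```

The same loop shape applies to **field_data** and **method_data**, with the tag table swapped for the corresponding one.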
159
160
161### Field
162Describes the fields in the bytecode file.
163
164- Align in single byte.
165- Format
166
167| **Name**| **Format**| **NOTE**                                              |
168| -------------- | -------------- | ------------------------------------------------------------ |
169| `class_idx`      | `uint16_t`       | An index pointing to the class to which the field belongs. It points to a position in [ClassRegionIndex](#classregionindex). The value of the position is of the [Type](#type) type and is an offset pointing to [Class](#class).|
170| `type_idx`       | `uint16_t`       | An index that points to the type of the field and points to a position in [ClassRegionIndex](#classregionindex). The value of the position is of the [Type](#type) type.|
171| `name_off`       | `uint32_t`       | An offset that points to [string](#string), indicating the name of the field.|
172| `reserved`       | `uleb128`        | Reserved field used internally in the Ark Bytecode file.                          |
173| `field_data`     | `TaggedValue[]`  | Array with variable length. Each element in the array is of the [TaggedValue](#taggedvalue) type, and the element tag is of the [FieldTag](#fieldtag) type. Elements in the array are sorted in ascending order based on the tag (except the **0x00** tag).|
174
175**Note:**<br>
176Based on the offset of the **Field**, the appropriate **IndexHeader** can be found to parse the **class_idx** and **type_idx**.
177
178
179### FieldTag
180
181- Align in single byte.
182- Format
183
184| **Name**| **Value**| **Quantity**| **Format**| **NOTE** |
185| -------------- | ------------ | -------------- | -------------- | ------------------------------------------------------------ |
186| `NOTHING`        | `0x00`   | `1`   | `none`     | The [TaggedValue](#taggedvalue) with this tag is the final item of the **field_data**.|
| `INT_VALUE`      | `0x01`   | `0-1` | `sleb128`  | The **data** of the [TaggedValue](#taggedvalue) with this tag is of the **boolean**, **byte**, **char**, **short**, or **int** type.|
| `VALUE`          | `0x02`   | `0-1` | `uint32_t` | The **data** of the [TaggedValue](#taggedvalue) with this tag is of the **FLOAT** or **ID** type in [Value formats](#value-formats).|
189
190**Note:**<br>
**FieldTag** is the tag of the element ([TaggedValue](#taggedvalue)) in the **field_data**. The **Quantity** column in the table indicates how many times an element with this tag can occur in the **field_data** of a [Field](#field).
192
193
194### Method
195Describes methods in bytecode files.
196
197- Align in single byte.
198- Format
199
200| **Name**| **Format**| **NOTE**                                              |
201| -------------- | -------------- | ------------------------------------------------------------ |
202| `class_idx`      | `uint16_t`       | An index pointing to the class to which the method belongs. It points to a position in [ClassRegionIndex](#classregionindex). The value of the position is of the [Type](#type) type and is an offset pointing to [Class](#class).|
203| `reserved`       | `uint16_t`       | Reserved field used internally in the Ark Bytecode file.                          |
204| `name_off`       | `uint32_t`       | An offset that points to [string](#string), indicating the method name.|
205| `index_data`     | `uleb128`        | [MethodIndexData](#methodindexdata) data of the method.|
206| `method_data`    | `TaggedValue[]`  | Array with variable length. Each element in the array is of the [TaggedValue](#taggedvalue) type, and the element tag is of the [MethodTag](#methodtag) type. Elements in the array are sorted in ascending order based on the tag (except the **0x00** tag).|
207
208**Note:**<br>
209With the offset of **Method**, an appropriate **IndexHeader** can be found to parse the **class_idx**.
210
211
212### MethodIndexData
213**MethodIndexData** is an unsigned 32-bit integer divided into three parts.
214
215| **Bit**| **Name**| **Format**| **NOTE**                                              |
216| ------------ | -------------- | -------------- | ------------------------------------------------------------ |
217| 0 - 15       | `header_index`   | `uint16_t`       | Points to a position in [IndexSection](#indexsection). The value of this position is [IndexHeader](#indexheader). You can use **IndexHeader** to find the offsets of the method ([Method](#method)), [string](#string), or literal array ([LiteralArray](#literalarray)) referenced by this method.|
| 16 - 23      | `function_kind`  | `uint8_t`        | Indicates the function type ([FunctionKind](#functionkind)) of a method. |
219| 24 - 31      | `reserved`       | `uint8_t`        | Reserved field used internally in the Ark Bytecode file.                          |
220
221
222#### FunctionKind
223
224| **Name**          | **Value**| **NOTE**  |
225| ------------------------ | ------------ | ---------------- |
226| `FUNCTION`                 | `0x1`          | Common function.      |
227| `NC_FUNCTION`              | `0x2`          | Common arrow function.  |
228| `GENERATOR_FUNCTION`       | `0x3`          | Generator function.    |
229| `ASYNC_FUNCTION`           | `0x4`          | Asynchronous function.      |
230| `ASYNC_GENERATOR_FUNCTION` | `0x5`          | Asynchronous generator function.|
231| `ASYNC_NC_FUNCTION`        | `0x6`          | Asynchronous arrow function.  |
232| `CONCURRENT_FUNCTION`      | `0x7`          | Concurrent function.      |
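
Splitting a decoded **MethodIndexData** value into its three parts takes only shifts and masks. In the sketch below, the function name and the returned tuple layout are illustrative. The kind names are taken from the FunctionKind table.

```python
FUNCTION_KINDS = {
    0x1: "FUNCTION",
    0x2: "NC_FUNCTION",
    0x3: "GENERATOR_FUNCTION",
    0x4: "ASYNC_FUNCTION",
    0x5: "ASYNC_GENERATOR_FUNCTION",
    0x6: "ASYNC_NC_FUNCTION",
    0x7: "CONCURRENT_FUNCTION",
}


def split_method_index_data(value):
    """Return (header_index, function_kind_name, reserved) for a 32-bit value."""
    header_index = value & 0xFFFF          # bits 0 - 15
    function_kind = (value >> 16) & 0xFF   # bits 16 - 23
    reserved = (value >> 24) & 0xFF        # bits 24 - 31
    kind_name = FUNCTION_KINDS.get(function_kind, "UNKNOWN")
    return header_index, kind_name, reserved


print(split_method_index_data(0x00030002))
```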
233
234
235### MethodTag
236
237| **Name**| **Value**| **Quantity**| **Format**| **NOTE**                                              |
238| -------------- | ------------ | -------------- | -------------- | ------------------------------------------------------------ |
239| `NOTHING`        | `0x00`         | `1`             | `none`           | The [TaggedValue](#taggedvalue) with this tag is the final item of the **method_data**.|
| `CODE`           | `0x01`         | `0-1`            | `uint32_t`       | The **data** of [TaggedValue](#taggedvalue) with this tag is an offset pointing to [Code](#code), indicating the code segment of the method.|
| `SOURCE_LANG`    | `0x02`         | `0-1`            | `uint8_t`        | The **data** of [TaggedValue](#taggedvalue) with this tag is **0**, indicating that the source code language is ArkTS, TS, or JS.|
242| `DEBUG_INFO`     | `0x05`         | `0-1`            | `uint32_t`       | The **data** of [TaggedValue](#taggedvalue) with this tag is an offset pointing to [DebugInfo](#debuginfo), indicating the debugging information of the method.|
243| `ANNOTATION`     | `0x06`         | `>=0`            | `uint32_t`       | The **data** of [TaggedValue](#taggedvalue) with this tag is an offset pointing to [Annotation](#annotation), indicating the annotation of the method.|
244
245**Note:**<br>
**MethodTag** is the tag of the element ([TaggedValue](#taggedvalue)) in the **method_data**. The **Quantity** column in the table indicates how many times an element with this tag can occur in the **method_data** of a [Method](#method).
247
248
249### Code
250
251- Align in single byte.
252- Format
253
254| **Name**| **Format**| **NOTE**                                              |
255| -------------- | -------------- | ------------------------------------------------------------ |
256| `num_vregs`      | `uleb128`        | Number of registers. Registers that store input and default parameters are not counted.        |
257| `num_args`       | `uleb128`        | Total number of input and default parameters.                                    |
258| `code_size`      | `uleb128`        | Total size of all instructions, in bytes.                            |
259| `tries_size`     | `uleb128`        | Length of the **try_blocks** array, that is, the number of [TryBlock](#tryblock).   |
260| `instructions`   | `uint8_t[]`      | Array of all instructions.                                          |
261| `try_blocks`     | `TryBlock[]`     | An array. Each element in the array is of the **TryBlock** type.|
262
263
264### TryBlock
265
266- Align in single byte.
267- Format
268
269| **Name**| **Format**| **NOTE**                                              |
270| -------------- | -------------- | ------------------------------------------------------------ |
271| `start_pc`       | `uleb128`        | Offset between the first instruction of the **TryBlock** and the start position of the **instructions** in [Code](#code).|
| `length`         | `uleb128`        | Size of the **TryBlock**, in bytes.                              |
273| `num_catches`    | `uleb128`        | Number of [CatchBlock](#catchblock) associated with **TryBlock**. The value is 1.|
274| `catch_blocks`   | `CatchBlock[]`   | Array of **CatchBlocks** associated with **TryBlock**. The array contains one **CatchBlock** that can capture all types of exceptions.|
275
276
277### CatchBlock
278
279- Align in single byte.
280- Format
281
282| **Name**| **Format**| **NOTE**                                 |
283| -------------- | -------------- | ----------------------------------------------- |
284| `type_idx`       | `uleb128`        | If the value is **0**, the **CatchBlock** captures all types of exceptions.|
285| `handler_pc`     | `uleb128`        | Program counter of the first instruction for handling the exception.         |
286| `code_size`      | `uleb128`        | Size of the **CatchBlock**, in bytes.             |
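
The **Code**, **TryBlock**, and **CatchBlock** structures consist almost entirely of `uleb128` fields, so they can be read sequentially. The following is a sketch under that reading of the tables. All function names and dictionary keys are illustrative, and the input bytes are synthetic.

```python
def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_fields(buf, pos, names):
    """Read one uleb128 per name; return (dict, next_position)."""
    fields = {}
    for name in names:
        fields[name], pos = read_uleb128(buf, pos)
    return fields, pos


def read_try_block(buf, pos):
    block, pos = read_fields(buf, pos, ("start_pc", "length", "num_catches"))
    block["catch_blocks"] = []
    for _ in range(block["num_catches"]):
        catch, pos = read_fields(buf, pos, ("type_idx", "handler_pc", "code_size"))
        block["catch_blocks"].append(catch)
    return block, pos


def read_code(buf, pos=0):
    code, pos = read_fields(
        buf, pos, ("num_vregs", "num_args", "code_size", "tries_size"))
    code["instructions"] = buf[pos:pos + code["code_size"]]
    pos += code["code_size"]
    code["try_blocks"] = []
    for _ in range(code["tries_size"]):
        block, pos = read_try_block(buf, pos)
        code["try_blocks"].append(block)
    return code, pos


# Synthetic Code: 2 vregs, 3 args, 4 instruction bytes, 1 try block with 1 catch block.
demo = bytes([2, 3, 4, 1, 0xA0, 0xA1, 0xA2, 0xA3, 0, 2, 1, 0, 2, 2])
code, end = read_code(demo)
print(code["try_blocks"][0]["catch_blocks"][0], end)
```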
287
288
289### Annotation
290Describes an annotation structure.
291
292- Align in single byte.
293- Format
294
295| **Name**| **Format**     | **NOTE**                                              |
296| -------------- | ------------------- | ------------------------------------------------------------ |
297| `class_idx`      | `uint16_t`   | An index pointing to the class to which the **Annotation** belongs. It points to a position in [ClassRegionIndex](#classregionindex). The value of the position is of the [Type](#type) type and is an offset pointing to [Class](#class).|
298| `count`          | `uint16_t`   | Length of the **elements** array.                                        |
| `elements`       | `AnnotationElement[]` | An array. Each element of the array is of the [AnnotationElement](#annotationelement) type.|
| `element_types`  | `uint8_t[]`  | An array. Each element in the array is of the [AnnotationElementTag](#annotationelementtag) type and is used to describe an **AnnotationElement**. The position of each element in the **element_types** array is the same as that of the corresponding **AnnotationElement** in the **elements** array.|
301
302**Note:**<br>
303With the offset of **Annotation**, an appropriate **IndexHeader** can be found to parse the **class_idx**.
304
305
306### AnnotationElementTag
307
308| **Name**| **Tag**|
309| -------------- | --------- |
310| `u1`             | `'1'`   |
311| `i8`             | `'2'`   |
312| `u8`             | `'3'`   |
313| `i16`            | `'4'`   |
314| `u16`            | `'5'`   |
315| `i32`            | `'6'`   |
316| `u32`            | `'7'`   |
317| `i64`            | `'8'`   |
318| `u64`            | `'9'`   |
319| `f32`            | `'A'`   |
320| `f64`            | `'B'`   |
321| `string`         | `'C'`   |
322| `method`         | `'E'`   |
323| `annotation`     | `'G'`   |
324| `literalarray`   | `'#'`   |
325| `unknown`        | `'0'`   |
326
327
328### AnnotationElement
329
330- Align in single byte.
331- Format
332
333| **Name**| **Format**| **NOTE**                                              |
334| -------------- | -------------- | ------------------------------------------------------------ |
335| `name_off`       | `uint32_t`       | An offset that points to [string](#string), indicating the name of the annotation element.|
336| `value`          | `uint32_t`       | Value of the annotation element. If the width of the value is less than 32 bits, the value itself is stored here. Otherwise, the value stored here is an offset pointing to the [Value formats](#value-formats).|
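
Putting **Annotation**, **AnnotationElementTag**, and **AnnotationElement** together, an annotation can be read as a fixed 4-byte prefix, then **count** 8-byte elements, then **count** one-byte tags. The sketch below assumes that each tag is stored as the ASCII code of the character shown in the AnnotationElementTag table and that **element_types** has the same length as **elements**. The names are illustrative.

```python
import struct

ELEMENT_TAGS = {
    "1": "u1", "2": "i8", "3": "u8", "4": "i16", "5": "u16",
    "6": "i32", "7": "u32", "8": "i64", "9": "u64",
    "A": "f32", "B": "f64", "C": "string", "E": "method",
    "G": "annotation", "#": "literalarray", "0": "unknown",
}


def read_annotation(data, pos=0):
    """Read an Annotation; return (annotation_dict, next_position)."""
    class_idx, count = struct.unpack_from("<HH", data, pos)
    pos += 4
    elements = []
    for _ in range(count):
        name_off, value = struct.unpack_from("<II", data, pos)
        pos += 8
        elements.append({"name_off": name_off, "value": value})
    # Assumption: one ASCII tag byte per element, in the same order as elements.
    for element, tag in zip(elements, data[pos:pos + count]):
        element["type"] = ELEMENT_TAGS.get(chr(tag), "unknown")
    pos += count
    return {"class_idx": class_idx, "elements": elements}, pos


demo = (struct.pack("<HH", 5, 2)
        + struct.pack("<II", 0x10, 1)
        + struct.pack("<II", 0x20, 0x30)
        + b"1C")
annotation, end = read_annotation(demo)
print(annotation, end)
```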
337
338
339### Value formats
340Different value types have different value encoding formats, including INTEGER, LONG, FLOAT, DOUBLE, and ID.
341
342| **Name**| **Format**| **NOTE**                                              |
343| -------------- | -------------- | ------------------------------------------------------------ |
344| `INTEGER`        | `uint32_t`       | Signed 4-byte integer value.                                      |
345| `LONG`           | `uint64_t`       | Signed 8-byte integer value.                                      |
| `FLOAT`          | `uint32_t`       | 4-byte bit pattern, zero-extended to the right and interpreted as an IEEE754 32-bit floating-point value.|
| `DOUBLE`         | `uint64_t`       | 8-byte bit pattern, zero-extended to the right and interpreted as an IEEE754 64-bit floating-point value.|
| `ID`             | `uint32_t`       | 4-byte bit pattern, indicating the offset of a structure in the file.                  |
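
The table stores signed and floating-point values in unsigned containers, so a reader has to reinterpret the bit pattern rather than convert the number. A small sketch of that reinterpretation using Python's `struct` module follows. The function names are illustrative.

```python
import struct


def bits_to_int32(bits):
    """Reinterpret a uint32_t bit pattern as a signed 4-byte integer (INTEGER)."""
    return struct.unpack("<i", struct.pack("<I", bits))[0]


def bits_to_float(bits):
    """Reinterpret a uint32_t bit pattern as an IEEE754 32-bit value (FLOAT)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bits_to_double(bits):
    """Reinterpret a uint64_t bit pattern as an IEEE754 64-bit value (DOUBLE)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


print(bits_to_int32(0xFFFFFFFF))
print(bits_to_float(0x3FC00000))
print(bits_to_double(0x4004000000000000))
```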
349
350
351### LineNumberProgramIndex
352The **LineNumberProgramIndex** structure is an array that facilitates the use of a more compact index to access the [Line number program](#line-number-program).
353
354- Align in four bytes.
355- Format
356
357| **Name**| **Format**| **NOTE**                                              |
358| -------------- | -------------- | ------------------------------------------------------------ |
359| `offsets`        | `uint32_t[]`     | An array in which the value of each element is an offset pointing to a line number program. The array length is specified by **num_lnps** in [Header](#header).|
360
361
362### DebugInfo
The **DebugInfo** contains the mapping between the program counter of the method and the line and column numbers in the source code, as well as information about local variables. The format of the debugging information is derived from the [DWARF 3.0 Standard](https://dwarfstd.org/dwarf3std.html) (see section 6.2). The [State machine](#state-machine) interprets the [Line number program](#line-number-program) to obtain the mapping and the local variable information. To deduplicate identical line number programs in different methods, all constants referenced in the programs are moved to the [Constant pool](#constant-pool).
364
365- Align in single byte.
366- Format
367
368| **Name**         | **Format**| **NOTE**                                              |
369| ----------------------- | -------------- | ------------------------------------------------------------ |
370| `line_start`              | `uleb128`        | Initial value of the line number register of the state machine.                                |
371| `num_parameters`          | `uleb128`        | Total number of input and default parameters.                                    |
372| `parameters`              | `uleb128[]`      | Array that stores the names of input parameters. The array length is **num_parameters**. The value of each element is the offset of the string or **0**. If the value is **0**, the corresponding parameter does not have a name.|
373| `constant_pool_size`      | `uleb128`        | Size of the constant pool, in bytes.                                |
374| `constant_pool`           | `uleb128[]`      | Array for storing constant pool data. The array length is **constant_pool_size**.        |
375| `line_number_program_idx` | `uleb128`        | An index that points to a position in [LineNumberProgramIndex](#linenumberprogramindex). The value of this position is an offset pointing to [Line number program](#line-number-program). The length of **Line number program** is variable and ends with the **END_SEQUENCE** operation code.|
376
377
378#### Constant pool
A constant pool is a structure for storing constants in **DebugInfo**. Many methods have similar line number programs, which differ only in variable names, variable types, and file names. To deduplicate such line number programs, all constants referenced in the programs are stored in the constant pool. When interpreting a program, the state machine maintains a pointer to the constant pool. When interpreting an instruction that requires a constant parameter, the state machine reads the value at the position indicated by the constant pool pointer and then increments the pointer.
380
381
382#### State machine
383The state machine is used to generate [DebugInfo](#debuginfo) information. It contains the following registers.
384
385| **Name**   | **Initial Value**                                            | **NOTE**                                              |
386| ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| `address`           | 0                                                            | Program counter that points to an instruction of the method. Its value can only increase monotonically.            |
388| `line`              | Value of the **line_start** attribute of [DebugInfo](#debuginfo)| Unsigned integer, corresponding to the line number in the source code. All lines are numbered from 1. Therefore, the register value cannot be less than **1**.|
389| `column`            | 0                                                            | Unsigned integer, corresponding to the column number in the source code.                              |
390| `file`              | Value of **SOURCE_FILE** in **class_data** (see [Class](#class)), or 0.| An offset that points to [string](#string), indicating the name of the source file. If there is no file name, that is, there is no **SOURCE_FILE** tag in [Class](#class), the register value is **0**.|
391| `source_code`       | 0                                                            | An offset that points to [string](#string), indicating the source code of the source file. If there is no source code information, the register value is **0**.|
392| `constant_pool_ptr` | Address of the first byte from the constant pool in [DebugInfo](#debuginfo).| Pointer that points to the current constant value.                                      |
393
394
395#### Line number program
A line number program consists of instructions. Each instruction contains a single-byte operation code and optional parameters. Depending on the operation code, a parameter value is either encoded in the instruction (an instruction parameter) or obtained from the constant pool (a constant pool parameter).
397
| **Operation Code** | **Value**| **Instruction Parameters**  | **Constant Pool Parameters**    | **Parameter Description**| **NOTE** |
399| ----- | ----- | ------- | ---- | ------- | ------ |
400| `END_SEQUENCE`         | `0x00`  |       |          |        | Marks the end of the line number program.   |
| `ADVANCE_PC`           | `0x01`  |    | `uleb128 addr_diff`   | **addr_diff**: value to be added to the **address** register.   | Adds **addr_diff** to the **address** register so that it points to the next address, without generating a location entry.|
| `ADVANCE_LINE`         | `0x02` |     | `sleb128 line_diff`  | **line_diff**: value to be added to the **line** register.   | Adds **line_diff** to the **line** register so that it points to the next line, without generating a location entry.|
| `START_LOCAL`          | `0x03` | `sleb128 register_num` | `uleb128 name_idx`<br>`uleb128 type_idx`   | **register_num**: register that contains the local variable.<br>**name_idx**: an offset pointing to [string](#string), indicating the variable name.<br>**type_idx**: an offset pointing to [string](#string), indicating the variable type.| Introduces a local variable with a name and a type at the current address. The number of the register that will contain this variable is encoded in the instruction. If the register number is -1, the register is the accumulator. The values of **name_idx** and **type_idx** may be **0**. If so, the corresponding information does not exist.|
| `START_LOCAL_EXTENDED` | `0x04` | `sleb128 register_num` | `uleb128 name_idx`<br>`uleb128 type_idx`<br>`uleb128 sig_idx` | **register_num**: register that contains the local variable.<br>**name_idx**: an offset pointing to [string](#string), indicating the variable name.<br>**type_idx**: an offset pointing to [string](#string), indicating the variable type.<br>**sig_idx**: an offset pointing to [string](#string), indicating the variable signature.| Introduces a local variable with a name, a type, and a signature at the current address. The number of the register that will contain this variable is encoded in the instruction. If the register number is -1, the register is the accumulator. The values of **name_idx**, **type_idx**, and **sig_idx** may be **0**. If so, the corresponding information does not exist.|
| `END_LOCAL`            | `0x05` | `sleb128 register_num` |    | **register_num**: register that contains the local variable. | Marks the local variable in the specified register as out of scope at the current address. If the register number is -1, the register is the accumulator.|
| `SET_FILE`             | `0x09`  |    | `uleb128 name_idx`  | **name_idx**: an offset pointing to [string](#string), indicating the file name.| Sets the value of the **file** register. The value of **name_idx** may be **0**. If so, the corresponding information does not exist.|
407| `SET_SOURCE_CODE`      | `0x0a`  |    | `uleb128 source_idx` | **source_idx**: an offset pointing to [string](#string), indicating the source code of the file.| Sets the value of the **source_code** register. The value of **source_idx** may be **0**. If so, it indicates that the corresponding information does not exist.|
408| `SET_COLUMN`           | `0x0b` |    | `uleb128 column_num`   | **column_num**: column number to be set.  | Sets the value of the **column** register and generates a location entry. |
409| Special operation code          | `0x0c..0xff`   |   |  |   | Makes the **line** and **address** registers point to the next address and generate a location entry. For details, see the following description.|
410
411
412For special operation codes, whose values range from **0x0c** to **0xff** (inclusive), the state machine advances the **line** and **address** registers by a small amount and then generates a new location entry, as listed in the following steps. For details, see section 6.2.5.1 "Special Opcodes" in [DWARF 3.0 Standard](https://dwarfstd.org/dwarf3std.html).
413
414| **No.**| **Operation**                                    | **NOTE**                                              |
415| ----- | -------------------------------------------------- | ------------------------------------------------------------ |
416| 1     | `adjusted_opcode = opcode - OPCODE_BASE`            | Calculates the adjusted operation code. The value of **OPCODE_BASE** is **0x0c**, which is the first special operation code.|
417| 2     | `address += adjusted_opcode / LINE_RANGE`            | Increases the value of the **address** register. The division is an integer division. The value of **LINE_RANGE** is **15**, which is the number of distinct line number increments that a special operation code can encode.|
418| 3     | `line += LINE_BASE + (adjusted_opcode % LINE_RANGE)` | Increases the value of the **line** register. The value of **LINE_BASE** is **-4**, which is the minimum line number increment. The maximum line number increment is **LINE_BASE + LINE_RANGE - 1**, that is, **10**.|
419| 4     |                                                    | Generates a new location entry.                                      |
420
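The instruction table and the special operation code steps above can be sketched as a small interpreter. This is a minimal illustration, not the Ark implementation: the function names (`read_uleb128`, `read_sleb128`, `run_line_program`) are hypothetical, the constant pool is assumed to be passed as a separate byte sequence that starts at its first byte, and the initial register values (`address = 0`, `column = 0`, `line = line_start`) are assumptions supplied by the caller. Local variable, file, and source code information is read past but not recorded.

```python
OPCODE_BASE, LINE_BASE, LINE_RANGE = 0x0C, -4, 15


def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(buf, pos):
    """Decode a signed LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final 7-bit group
                result -= 1 << shift
            return result, pos


def run_line_program(program, pool, line_start=0):
    """Return the generated location entries as (address, line, column)."""
    address, line, column = 0, line_start, 0  # assumed initial values
    pc = cp = 0  # cp plays the role of constant_pool_ptr
    entries = []
    while True:
        opcode = program[pc]
        pc += 1
        if opcode == 0x00:            # END_SEQUENCE
            return entries
        elif opcode == 0x01:          # ADVANCE_PC
            diff, cp = read_uleb128(pool, cp)
            address += diff
        elif opcode == 0x02:          # ADVANCE_LINE
            diff, cp = read_sleb128(pool, cp)
            line += diff
        elif opcode in (0x03, 0x04):  # START_LOCAL / START_LOCAL_EXTENDED
            _, pc = read_sleb128(program, pc)   # register_num (in instruction)
            for _ in range(2 if opcode == 0x03 else 3):
                _, cp = read_uleb128(pool, cp)  # name_idx, type_idx[, sig_idx]
        elif opcode == 0x05:          # END_LOCAL
            _, pc = read_sleb128(program, pc)   # register_num (in instruction)
        elif opcode in (0x09, 0x0A):  # SET_FILE / SET_SOURCE_CODE
            _, cp = read_uleb128(pool, cp)
        elif opcode == 0x0B:          # SET_COLUMN
            column, cp = read_uleb128(pool, cp)
            entries.append((address, line, column))
        elif opcode >= OPCODE_BASE:   # special operation code
            adjusted = opcode - OPCODE_BASE
            address += adjusted // LINE_RANGE
            line += LINE_BASE + adjusted % LINE_RANGE
            entries.append((address, line, column))
        else:
            raise ValueError(f"operation code {opcode:#x} is not in the table")
```

For example, `run_line_program(bytes([0x02, 0x01, 0x0B, 0x21, 0x00]), bytes([0x03, 0x02, 0x05]), line_start=1)` advances the line by 3 and the address by 2, sets the column to 5, and then applies special operation code `0x21`.
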
421**Note:**<br>
422The special operation code is calculated by using the following formula: **(line_increment - LINE_BASE) + (address_increment * LINE_RANGE) + OPCODE_BASE**.
423
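The formula in the note is the inverse of steps 1 to 3. The following sketch shows both directions. The function names `encode_special` and `decode_special` are hypothetical, and returning `None` for a pair that does not fit into one byte is a choice made for this illustration; a producer would presumably fall back to `ADVANCE_PC` and `ADVANCE_LINE` in that case.

```python
OPCODE_BASE, LINE_BASE, LINE_RANGE = 0x0C, -4, 15


def encode_special(line_increment, address_increment):
    """Return the special operation code, or None if the pair does not fit."""
    if not LINE_BASE <= line_increment < LINE_BASE + LINE_RANGE:
        return None
    if address_increment < 0:
        return None
    opcode = (line_increment - LINE_BASE) + address_increment * LINE_RANGE + OPCODE_BASE
    return opcode if opcode <= 0xFF else None


def decode_special(opcode):
    """Return (line_increment, address_increment) for a special operation code."""
    adjusted = opcode - OPCODE_BASE
    return LINE_BASE + adjusted % LINE_RANGE, adjusted // LINE_RANGE
```

With these constants, a line increment of 2 and an address increment of 1 give `(2 + 4) + 1 * 15 + 12 = 33`, that is, `0x21`, and decoding `0x21` returns the same pair.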
424
425### IndexSection
426Generally, structures in a bytecode file reference each other by 32-bit offset: when a structure references another structure, it records the 32-bit offset of the referenced structure. To reduce the file size, a bytecode file is divided into multiple index regions, and structures within an index region use 16-bit indexes instead. The **IndexSection** structure describes the collection of index regions.
427
428- Align in four bytes.
429- Format
430
431| **Name**| **Format**| **NOTE**      |
432| -------------- | -------------- | --------- |
433| `headers`        | `IndexHeader[]`  | An array. Each element in the array is of the [IndexHeader](#indexheader) type. Elements in the array are sorted based on the start offset of the region. The array length is specified by **num_index_regions** in [Header](#header).|
434
435
436### IndexHeader
437Each **IndexHeader** structure describes an index region. Each index region has two types of indexes: indexes pointing to [Type](#type) and indexes pointing to methods, strings, or literal arrays.
438
439- Align in four bytes.
440- Format
441
442| **Name**       | **Format**| **NOTE**   |
443| -------------- | -------------- | ---------- |
444| `start_off`                             | `uint32_t`       | Start offset of the region.                                        |
445| `end_off`                               | `uint32_t`       | End offset of the region.                                        |
446| `class_region_idx_size`                 | `uint32_t`       | Number of elements in [ClassRegionIndex](#classregionindex) of the region. The maximum value is **65536**.|
447| `class_region_idx_off`                  | `uint32_t`       | An offset that points to [ClassRegionIndex](#classregionindex).|
448| `method_string_literal_region_idx_size` | `uint32_t`       | Number of elements in the [MethodStringLiteralRegionIndex](#methodstringliteralregionindex) of the region. The maximum value is **65536**.|
449| `method_string_literal_region_idx_off`  | `uint32_t`       | An offset that points to [MethodStringLiteralRegionIndex](#methodstringliteralregionindex).|
450| `reserved`                              | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
451| `reserved`                              | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
452| `reserved`                              | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
453| `reserved`                              | `uint32_t`       | Reserved field used internally in the Ark Bytecode file.                          |
454
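Because every field of **IndexHeader** is a little-endian `uint32_t`, one header occupies 40 bytes and can be unpacked in a single call. The following is a minimal sketch: `read_index_headers` is a hypothetical helper, and its `index_section_off` parameter (the file offset of the **IndexSection**) is an assumption about how the caller locates the array.

```python
import struct
from typing import NamedTuple

# Ten little-endian uint32_t fields: six meaningful, four reserved.
INDEX_HEADER = struct.Struct("<10I")


class IndexHeader(NamedTuple):
    start_off: int
    end_off: int
    class_region_idx_size: int
    class_region_idx_off: int
    method_string_literal_region_idx_size: int
    method_string_literal_region_idx_off: int


def read_index_headers(data, index_section_off, num_index_regions):
    """Unpack num_index_regions consecutive IndexHeader records."""
    headers = []
    for i in range(num_index_regions):
        fields = INDEX_HEADER.unpack_from(data, index_section_off + i * INDEX_HEADER.size)
        headers.append(IndexHeader(*fields[:6]))  # the four reserved fields are dropped
    return headers
```

The value of `num_index_regions` comes from [Header](#header), as described for the `headers` array.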
455
456### ClassRegionIndex
457The **ClassRegionIndex** structure is used to find the corresponding [Type](#type) through a more compact index.
458
459- Align in four bytes.
460- Format
461
462| **Name**| **Format**| **NOTE**                                              |
463| -------------- | -------------- | ------------------------------------------------------------ |
464| `types`          | `Type[]`         | An array. Each element in the array is of the [Type](#type) type. The array length is specified by **class_region_idx_size** in [IndexHeader](#indexheader).|
465
466
467### Type
468A 32-bit value that holds either a basic type code or an offset pointing to [Class](#class).
469
470Basic types are encoded as follows.
471
472| **Type**      | **Code**       |
473| -------------- | -------------- |
474| `u1`           | `0x00`         |
475| `i8`           | `0x01`         |
476| `u8`           | `0x02`         |
477| `i16`          | `0x03`         |
478| `u16`          | `0x04`         |
479| `i32`          | `0x05`         |
480| `u32`          | `0x06`         |
481| `f32`          | `0x07`         |
482| `f64`          | `0x08`         |
483| `i64`          | `0x09`         |
484| `u64`          | `0x0a`         |
485| `any`          | `0x0c`         |
486
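A reader can classify a **Type** value by looking it up in the table above. The sketch below assumes that any value not listed as a basic type code is an offset pointing to [Class](#class); this section does not state that rule explicitly, so treat it as an assumption. The names `BASIC_TYPE_CODES` and `read_type` are hypothetical.

```python
import struct

BASIC_TYPE_CODES = {
    0x00: "u1", 0x01: "i8", 0x02: "u8", 0x03: "i16", 0x04: "u16",
    0x05: "i32", 0x06: "u32", 0x07: "f32", 0x08: "f64",
    0x09: "i64", 0x0A: "u64", 0x0C: "any",
}


def read_type(data, offset):
    """Read one 32-bit little-endian Type value and classify it."""
    (value,) = struct.unpack_from("<I", data, offset)
    name = BASIC_TYPE_CODES.get(value)
    if name is not None:
        return ("basic", name)
    return ("class", value)  # assumed to be an offset pointing to Class
```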
487
488### MethodStringLiteralRegionIndex
489The **MethodStringLiteralRegionIndex** structure is used to find the corresponding methods, strings, or literal arrays through a more compact index.
490
491- Align in four bytes.
492- Format
493
494| **Name**| **Format**| **NOTE**                                              |
495| -------------- | -------------- | ------------------------------------------------------------ |
496| `offsets`      | `uint32_t[]`   | An array. The value of each element is an offset pointing to a method, a string, or a literal array. The array length is specified by **method_string_literal_region_idx_size** in [IndexHeader](#indexheader).|
497
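Resolving a 16-bit index to a 32-bit offset can be sketched in two steps: pick the index region, then read the corresponding element of `offsets`. The sketch rests on two assumptions that this section does not spell out: the region is selected by the file offset of the structure that holds the index, and `end_off` is treated as exclusive. The helper names and the dictionary representation of an [IndexHeader](#indexheader) are hypothetical.

```python
import struct


def find_region(headers, referrer_off):
    """Return the index region whose range contains referrer_off."""
    for header in headers:
        # Assumption: the range is [start_off, end_off).
        if header["start_off"] <= referrer_off < header["end_off"]:
            return header
    raise LookupError(f"no index region covers offset {referrer_off:#x}")


def resolve_entity_offset(data, headers, referrer_off, index16):
    """Map a 16-bit index to the 32-bit offset of a method, string, or literal array."""
    region = find_region(headers, referrer_off)
    if index16 >= region["method_string_literal_region_idx_size"]:
        raise IndexError("index exceeds method_string_literal_region_idx_size")
    table_off = region["method_string_literal_region_idx_off"]
    (offset,) = struct.unpack_from("<I", data, table_off + 4 * index16)
    return offset
```

Because the headers are sorted by start offset, a binary search could replace the linear scan when a file contains many regions.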
498
499### LiteralArray
500Describes the literal array in the bytecode file.
501
502- Align in single byte.
503- Format
504
505| **Name**| **Format**| **NOTE**                                              |
506| -------------- | -------------- | ------------------------------------------------------------ |
507| `num_literals`   | `uint32_t`       | Length of the **literals** array.                                        |
508| `literals`       | `Literal[]`      | An array. Each element of the array is of the [Literal](#literal) type.|
509
510
511### Literal
512Describes a literal in the bytecode file. There are four encoding formats, distinguished by the size of the literal in bytes.
513
514| **Name**| **Format**| **Alignment Type**| **NOTE**|
515| -------------- | -------------- | ------------------ | -------------- |
516| ByteOne        | `uint8_t`        | 1 byte           | Single-byte value.  |
517| ByteTwo        | `uint16_t`       | 2 bytes           | Double-byte value.  |
518| ByteFour       | `uint32_t`       | 4 bytes           | Four-byte value.  |
519| ByteEight      | `uint64_t`       | 8 bytes           | Eight-byte value.  |
520
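The four formats map directly onto fixed-width integer reads. The following sketch assumes little-endian byte order for all multi-byte formats (this document states it for `uint16_t` and `uint32_t`; it is assumed here for `uint64_t`). It does not insert alignment padding, and it does not cover how a reader decides which format applies to a given literal. `LITERAL_FORMATS` and `read_literal` are hypothetical names.

```python
import struct

LITERAL_FORMATS = {
    "ByteOne": struct.Struct("<B"),
    "ByteTwo": struct.Struct("<H"),
    "ByteFour": struct.Struct("<I"),
    "ByteEight": struct.Struct("<Q"),
}


def read_literal(data, offset, kind):
    """Read one literal of the given format; return (value, next offset)."""
    fmt = LITERAL_FORMATS[kind]
    (value,) = fmt.unpack_from(data, offset)
    return value, offset + fmt.size
```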