diff --git a/docs/library/uctypes.rst b/docs/library/uctypes.rst index c938d74a8ef5d530560e3f6f8d6f8c5c85c985f8..dce8caecb6e09c99a39e27dc71c2904d0d13a7fb 100644 --- a/docs/library/uctypes.rst +++ b/docs/library/uctypes.rst @@ -11,19 +11,91 @@ module is to define data structure layout with about the same power as the C language allows, and then access it using familiar dot-syntax to reference sub-fields. +.. warning:: + + ``uctypes`` module allows access to arbitrary memory addresses of the + machine (including I/O and control registers). Uncareful usage of it + may lead to crashes, data loss, and even hardware malfunction. + .. seealso:: Module :mod:`ustruct` Standard Python way to access binary data structures (doesn't scale well to large and complex structures). +Usage examples:: + + import uctypes + + # Example 1: Subset of ELF file header + # https://wikipedia.org/wiki/Executable_and_Linkable_Format#File_header + ELF_HEADER = { + "EI_MAG": (0x0 | uctypes.ARRAY, 4 | uctypes.UINT8), + "EI_DATA": 0x5 | uctypes.UINT8, + "e_machine": 0x12 | uctypes.UINT16, + } + + # "f" is an ELF file opened in binary mode + buf = f.read(uctypes.sizeof(ELF_HEADER, uctypes.LITTLE_ENDIAN)) + header = uctypes.struct(uctypes.addressof(buf), ELF_HEADER, uctypes.LITTLE_ENDIAN) + assert header.EI_MAG == b"\x7fELF" + assert header.EI_DATA == 1, "Oops, wrong endianness. Could retry with uctypes.BIG_ENDIAN." + print("machine:", hex(header.e_machine)) + + + # Example 2: In-memory data structure, with pointers + COORD = { + "x": 0 | uctypes.FLOAT32, + "y": 4 | uctypes.FLOAT32, + } + + STRUCT1 = { + "data1": 0 | uctypes.UINT8, + "data2": 4 | uctypes.UINT32, + "ptr": (8 | uctypes.PTR, COORD), + } + + # Suppose you have address of a structure of type STRUCT1 in "addr" + # uctypes.NATIVE is optional (used by default) + struct1 = uctypes.struct(addr, STRUCT1, uctypes.NATIVE) + print("x:", struct1.ptr[0].x) + + + # Example 3: Access to CPU registers. Subset of STM32F4xx WWDG block + WWDG_LAYOUT = { + "WWDG_CR": (0, { + # BFUINT32 here means size of the WWDG_CR register + "WDGA": 7 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32, + "T": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32, + }), + "WWDG_CFR": (4, { + "EWI": 9 << uctypes.BF_POS | 1 << uctypes.BF_LEN | uctypes.BFUINT32, + "WDGTB": 7 << uctypes.BF_POS | 2 << uctypes.BF_LEN | uctypes.BFUINT32, + "W": 0 << uctypes.BF_POS | 7 << uctypes.BF_LEN | uctypes.BFUINT32, + }), + } + + WWDG = uctypes.struct(0x40002c00, WWDG_LAYOUT) + + WWDG.WWDG_CFR.WDGTB = 0b10 + WWDG.WWDG_CR.WDGA = 1 + print("Current counter:", WWDG.WWDG_CR.T) + Defining structure layout ------------------------- Structure layout is defined by a "descriptor" - a Python dictionary which encodes field names as keys and other properties required to access them as -associated values. Currently, uctypes requires explicit specification of -offsets for each field. Offset are given in bytes from a structure start. +associated values:: + + { + "field1": <properties>, + "field2": <properties>, + ... + } + +Currently, ``uctypes`` requires explicit specification of offsets for each +field. Offset are given in bytes from the structure start. Following are encoding examples for various field types: @@ -31,7 +103,7 @@ Following are encoding examples for various field types: "field_name": offset | uctypes.UINT32 - in other words, value is scalar type identifier ORed with field offset + in other words, the value is a scalar type identifier ORed with a field offset (in bytes) from the start of the structure. * Recursive structures:: @@ -41,9 +113,11 @@ Following are encoding examples for various field types: "b1": 1 | uctypes.UINT8, }) - i.e. value is a 2-tuple, first element of which is offset, and second is + i.e. value is a 2-tuple, first element of which is an offset, and second is a structure descriptor dictionary (note: offsets in recursive descriptors - are relative to the structure it defines). + are relative to the structure it defines). Of course, recursive structures + can be specified not just by a literal dictionary, but by referring to a + structure descriptor dictionary (defined earlier) by name. * Arrays of primitive types:: @@ -51,42 +125,42 @@ Following are encoding examples for various field types: i.e. value is a 2-tuple, first element of which is ARRAY flag ORed with offset, and second is scalar element type ORed number of elements - in array. + in the array. * Arrays of aggregate types:: "arr2": (offset | uctypes.ARRAY, size, {"b": 0 | uctypes.UINT8}), i.e. value is a 3-tuple, first element of which is ARRAY flag ORed - with offset, second is a number of elements in array, and third is - descriptor of element type. + with offset, second is a number of elements in the array, and third is + a descriptor of element type. * Pointer to a primitive type:: "ptr": (offset | uctypes.PTR, uctypes.UINT8), i.e. value is a 2-tuple, first element of which is PTR flag ORed - with offset, and second is scalar element type. + with offset, and second is a scalar element type. * Pointer to an aggregate type:: "ptr2": (offset | uctypes.PTR, {"b": 0 | uctypes.UINT8}), i.e. value is a 2-tuple, first element of which is PTR flag ORed - with offset, second is descriptor of type pointed to. + with offset, second is a descriptor of type pointed to. * Bitfields:: "bitf0": offset | uctypes.BFUINT16 | lsbit << uctypes.BF_POS | bitsize << uctypes.BF_LEN, - i.e. value is type of scalar value containing given bitfield (typenames are - similar to scalar types, but prefixes with "BF"), ORed with offset for + i.e. value is a type of scalar value containing given bitfield (typenames are + similar to scalar types, but prefixes with ``BF``), ORed with offset for scalar value containing the bitfield, and further ORed with values for - bit offset and bit length of the bitfield within scalar value, shifted by - BF_POS and BF_LEN positions, respectively. Bitfield position is counted - from the least significant bit, and is the number of right-most bit of a - field (in other words, it's a number of bits a scalar needs to be shifted - right to extract the bitfield). + bit position and bit length of the bitfield within the scalar value, shifted by + BF_POS and BF_LEN bits, respectively. A bitfield position is counted + from the least significant bit of the scalar (having position of 0), and + is the number of right-most bit of a field (in other words, it's a number + of bits a scalar needs to be shifted right to extract the bitfield). In the example above, first a UINT16 value will be extracted at offset 0 (this detail may be important when accessing hardware registers, where @@ -126,10 +200,11 @@ Module contents Layout type for a native structure - with data endianness and alignment conforming to the ABI of the system on which MicroPython runs. -.. function:: sizeof(struct) +.. function:: sizeof(struct, layout_type=NATIVE) - Return size of data structure in bytes. Argument can be either structure - class or specific instantiated structure object (or its aggregate field). + Return size of data structure in bytes. The *struct* argument can be + either a structure class or a specific instantiated structure object + (or its aggregate field). .. function:: addressof(obj) @@ -151,6 +226,35 @@ Module contents so it can be both written too, and you will access current value at the given memory address. +.. data:: UINT8 + INT8 + UINT16 + INT16 + UINT32 + INT32 + UINT64 + INT64 + + Integer types for structure descriptors. Constants for 8, 16, 32, + and 64 bit types are provided, both signed and unsigned. + +.. data:: FLOAT32 + FLOAT64 + + Floating-point types for structure descriptors. + +.. data:: VOID + + ``VOID`` is an alias for ``UINT8``, and is provided to conviniently define + C's void pointers: ``(uctypes.PTR, uctypes.VOID)``. + +.. data:: PTR + ARRAY + + Type constants for pointers and arrays. Note that there is no explicit + constant for structures, it's implicit: an aggregate type without ``PTR`` + or ``ARRAY`` flags is a structure. + Structure descriptors and instantiating structure objects --------------------------------------------------------- @@ -163,7 +267,7 @@ following sources: system. Lookup these addresses in datasheet for a particular MCU/SoC. * As a return value from a call to some FFI (Foreign Function Interface) function. -* From uctypes.addressof(), when you want to pass arguments to an FFI +* From `uctypes.addressof()`, when you want to pass arguments to an FFI function, or alternatively, to access some data for I/O (for example, data read from a file or network socket). @@ -181,30 +285,41 @@ the standard subscript operator ``[]`` - both read and assigned to. If a field is a pointer, it can be dereferenced using ``[0]`` syntax (corresponding to C ``*`` operator, though ``[0]`` works in C too). -Subscripting a pointer with other integer values but 0 are supported too, +Subscripting a pointer with other integer values but 0 are also supported, with the same semantics as in C. -Summing up, accessing structure fields generally follows C syntax, +Summing up, accessing structure fields generally follows the C syntax, except for pointer dereference, when you need to use ``[0]`` operator instead of ``*``. Limitations ----------- -Accessing non-scalar fields leads to allocation of intermediate objects +1. Accessing non-scalar fields leads to allocation of intermediate objects to represent them. This means that special care should be taken to layout a structure which needs to be accessed when memory allocation is disabled (e.g. from an interrupt). The recommendations are: -* Avoid nested structures. For example, instead of +* Avoid accessing nested structures. For example, instead of ``mcu_registers.peripheral_a.register1``, define separate layout descriptors for each peripheral, to be accessed as - ``peripheral_a.register1``. -* Avoid other non-scalar data, like array. For example, instead of - ``peripheral_a.register[0]`` use ``peripheral_a.register0``. - -Note that these recommendations will lead to decreased readability -and conciseness of layouts, so they should be used only if the need -to access structure fields without allocation is anticipated (it's -even possible to define 2 parallel layouts - one for normal usage, -and a restricted one to use when memory allocation is prohibited). + ``peripheral_a.register1``. Or just cache a particular peripheral: + ``peripheral_a = mcu_registers.peripheral_a``. If a register + consists of multiple bitfields, you would need to cache references + to a particular register: ``reg_a = mcu_registers.peripheral_a.reg_a``. +* Avoid other non-scalar data, like arrays. For example, instead of + ``peripheral_a.register[0]`` use ``peripheral_a.register0``. Again, + an alternative is to cache intermediate values, e.g. + ``register0 = peripheral_a.register[0]``. + +2. Range of offsets supported by the ``uctypes`` module is limited. +The exact range supported is considered an implementation detail, +and the general suggestion is to split structure definitions to +cover from a few kilobytes to a few dozen of kilobytes maximum. +In most cases, this is a natural situation anyway, e.g. it doesn't make +sense to define all registers of an MCU (spread over 32-bit address +space) in one structure, but rather a peripheral block by peripheral +block. In some extreme cases, you may need to split a structure in +several parts artificially (e.g. if accessing native data structure +with multi-megabyte array in the middle, though that would be a very +synthetic case).