    py/parse: Compress rule pointer table to table of offsets. · 0016a453
    Damien George authored
    This is the sixth and final patch in a series of patches to the parser that
    aims to reduce code size by compressing the data corresponding to the rules
    of the grammar.
    
    Prior to this set of patches the rules were stored as rule_t structs with
    rule_id, act and arg members, alongside a big table of pointers that
    allowed looking up the address of a rule_t struct given the id of that
    rule.
    
    The changes that have been made are:
    - Break up the rule_t struct into individual components, with each
      component in a separate array.
    - Remove the rule_id part of the struct because it's not needed.
    - Put all the rule arg data in a big array.
    - Change the table of pointers to rules to a table of offsets within the
      array of rule arg data.
    
    The last point is what this patch does, and it brings about the biggest
    decrease in code size, because an array of pointers is now an array of
    bytes.
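
To make the layout change concrete, here is a minimal sketch with three toy rules. It is not the code from py/parse.c: every identifier (old_rule_t, old_rule_table, new_rule_arg_offset and so on), the fixed two-entry arg array and the data values are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Old layout (hypothetical reconstruction): one struct per rule,
 * found through a table of pointers indexed by rule id. */
typedef struct {
    uint8_t rule_id;
    uint8_t act;
    uint16_t arg[2]; /* toy size; assumed fixed here for simplicity */
} old_rule_t;

static const old_rule_t old_rule_a = { 0, 1, { 11, 12 } };
static const old_rule_t old_rule_b = { 1, 2, { 21, 22 } };
static const old_rule_t old_rule_c = { 2, 1, { 31, 32 } };

static const old_rule_t *const old_rule_table[] = {
    &old_rule_a, &old_rule_b, &old_rule_c,
};

/* New layout: each component in its own array, no rule_id, all arg
 * data in one big array, and one byte-sized offset per rule. */
static const uint8_t new_rule_act[] = { 1, 2, 1 };
static const uint16_t new_rule_arg_data[] = { 11, 12, 21, 22, 31, 32 };
static const uint8_t new_rule_arg_offset[] = { 0, 2, 4 };

/* Look up the arg data of a rule in the old layout. */
const uint16_t *old_get_arg(size_t id) {
    return old_rule_table[id]->arg;
}

/* Look up the arg data of a rule in the new layout. */
const uint16_t *new_get_arg(size_t id) {
    return &new_rule_arg_data[new_rule_arg_offset[id]];
}

/* Look up the act value of a rule in the new layout. */
uint8_t new_get_act(size_t id) {
    return new_rule_act[id];
}

/* Size in bytes of the lookup table in each layout. */
size_t old_table_bytes(void) { return sizeof(old_rule_table); }
size_t new_table_bytes(void) { return sizeof(new_rule_arg_offset); }
```

In this sketch the lookup table costs one pointer per rule in the old layout and one byte per rule in the new one, so the saving grows with the pointer width of the target.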
    
    Code size changes for the six patches combined are:
    
       bare-arm:  -644
    minimal x86: -1856
       unix x64: -5408
    unix nanbox: -2080
          stm32:  -720
        esp8266:  -812
         cc3200:  -712
    
    As for parser performance: measurements on pyboard show that these six
    patches combined increase script parse time by about 0.4%.
    This is due to the slightly more complicated way of looking up the data for
    a rule (since the 9th bit of the offset into the rule arg data table is
    calculated with an if statement).  This is an acceptable increase in parse
    time considering that parsing is only done once per script (if compiled on
    the target).
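
The lookup with the conditionally computed 9th bit could look roughly like the sketch below. It assumes that offsets never decrease as the rule id increases and that all offsets stay below 512, so a single threshold rule id is enough to recover the bit dropped when the offset is stored in a byte. The names (sketch_arg_data, sketch_arg_offset, SKETCH_FIRST_RULE_ABOVE_255, sketch_get_rule_arg) and the toy data are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_NUM_RULES = 3,
    /* First rule id whose real offset is 256 or more (assumed known
     * when the tables are generated). */
    SKETCH_FIRST_RULE_ABOVE_255 = 2
};

/* Combined arg data; only the entries the three toy rules point at
 * are filled in, the rest are zero. */
static const uint16_t sketch_arg_data[264] = {
    [0] = 10, [250] = 20, [260] = 30,
};

/* Only the low 8 bits of each offset are stored. */
static const uint8_t sketch_arg_offset[SKETCH_NUM_RULES] = {
    0 & 0xff, 250 & 0xff, 260 & 0xff,
};

/* Return a pointer to the arg data of the given rule. */
const uint16_t *sketch_get_rule_arg(size_t rule_id) {
    assert(rule_id < SKETCH_NUM_RULES);
    size_t off = sketch_arg_offset[rule_id];
    if (rule_id >= SKETCH_FIRST_RULE_ABOVE_255) {
        off |= 0x100; /* restore the 9th bit of the offset */
    }
    return &sketch_arg_data[off];
}
```

For example, rule 2 stores the byte 4 (260 & 0xff), and the comparison against the threshold turns it back into 260. That extra comparison on every lookup is the kind of overhead the parse-time measurement above refers to.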