  1. Dec 29, 2017
    • Damien George's avatar
      py/mpz: In mpz_as_str_inpl, convert always-false checks to assertions. · e7842744
      Damien George authored
      There are two checks that are always false and so can be converted to
      (negated) assertions to save code space and execution time.  They are:
      
      1. The check of the str parameter, which is required to be non-NULL, as
         per the original comment stating that it has enough space in it as
         calculated by mp_int_format_size.  And for all uses of this function
         str is indeed non-NULL.
      
      2. The check of the base parameter, which is already required to be between
         2 and 16 (inclusive) via the assertion in mp_int_format_size.
    • Damien George's avatar
      py/mpz: Simplify handling of borrow and quo adjustment in mpn_div. · 9766fddc
      Damien George authored
      The motivation behind this patch is to remove unreachable code in mpn_div.
      This unreachable code was added some time ago in
      9a21d2e0, when a loop in mpn_div was copied
      and adjusted to work when mpz_dig_t was exactly half of the size of
      mpz_dbl_dig_t (a common case).  The loop was copied correctly but it wasn't
      noticed at the time that the final part of the calculation of num-quo*den
      could be optimised, and hence unreachable code was left for a case that
      never occurred.
      
      The observation for the optimisation is that the initial value of quo in
      mpn_div is either exact or too large (never too small), and therefore the
      subtraction of quo*den from num may subtract exactly enough or too much
      (but never too little).  Using this observation the part of the algorithm
      that handles the borrow value can be simplified, and most importantly this
      eliminates the unreachable code.
      
      The new code has been tested with DIG_SIZE=3 and DIG_SIZE=4 by dividing all
      possible combinations of non-negative integers with between 0 and 3
      (inclusive) mpz digits.
    • Damien George's avatar
      py/parse: Fix macro evaluation by avoiding empty __VA_ARGS__. · c7cb1dfc
      Damien George authored
      Empty __VA_ARGS__ are not allowed in the C preprocessor so adjust the rule
      arg offset calculation to not use them.  Also, some compilers (eg MSVC)
      require an extra layer of macro expansion.
  2. Dec 28, 2017
    • Damien George's avatar
      py/parse: Compress rule pointer table to table of offsets. · 0016a453
      Damien George authored
      This is the sixth and final patch in a series of patches to the parser that
      aims to reduce code size by compressing the data corresponding to the rules
      of the grammar.
      
      Prior to this set of patches the rules were stored as rule_t structs with
      rule_id, act and arg members.  And then there was a big table of pointers
      which allowed the address of a rule_t struct to be looked up given the id
      of that rule.
      
      The changes that have been made are:
      - Breaking up of the rule_t struct into individual components, with each
        component in a separate array.
      - Removal of the rule_id part of the struct because it's not needed.
      - Putting all the rule arg data in a big array.
      - Changing the table of pointers to rules into a table of offsets within
        the array of rule arg data.
      
      The last point is what is done in this patch here and brings about the
      biggest decreases in code size, because an array of pointers is now an
      array of bytes.
      
      Code size changes for the six patches combined are:
      
         bare-arm:  -644
      minimal x86: -1856
         unix x64: -5408
      unix nanbox: -2080
            stm32:  -720
          esp8266:  -812
           cc3200:  -712
      
      For the change in parser performance: it was measured on pyboard that these
      six patches combined gave an increase in script parse time of about 0.4%.
      This is due to the slightly more complicated way of looking up the data for
      a rule (since the 9th bit of the offset into the rule arg data table is
      calculated with an if statement).  This is an acceptable increase in parse
      time considering that parsing is only done once per script (if compiled on
      the target).
    • Damien George's avatar
      py/parse: Pass rule_id to push_result_rule, instead of passing rule_t*. · 815a8cd1
      Damien George authored
      Reduces code size by eliminating quite a few pointer dereferences.
    • Damien George's avatar
      py/parse: Break rule data into separate act and arg arrays. · 845511af
      Damien George authored
      Instead of each rule being stored in ROM as a struct with rule_id, act and
      arg, the act and arg parts are now in separate arrays and the rule_id part
      is removed because it's not needed.  This reduces code size, by roughly one
      byte per grammar rule, around 150 bytes.
    • Damien George's avatar
      py/parse: Split out rule name from rule struct into separate array. · 1039c5e6
      Damien George authored
      The rule name is only used for debugging, and this patch makes things a bit
      cleaner by completely separating out the rule name from the rest of the
      rule data.
    • Peter D. Gray's avatar
      stm32/spi: If MICROPY_HW_SPIn_MISO undefined, do not claim pin on init. · dfe8980a
      Peter D. Gray authored
      This permits output-only SPI use.
    • Damien George's avatar
      py/nlr: Factor out common NLR code to macro and generic funcs in nlr.c. · b25f9216
      Damien George authored
      Each NLR implementation (Thumb, x86, x64, xtensa, setjmp) duplicates a lot
      of the NLR code, specifically that dealing with pushing and popping the NLR
      pointer to maintain the linked-list of NLR buffers.  This patch factors all
      of that code out of the specific implementations into generic functions in
      nlr.c, along with a helper macro in nlr.h.  This eliminates duplicated
      code.
    • Damien George's avatar
      py/nlr: Clean up selection and config of NLR implementation. · 5bf8e85f
      Damien George authored
      If MICROPY_NLR_SETJMP is not enabled and the machine is auto-detected then
      nlr.h now defines some convenience macros for the individual NLR
      implementations to use (eg MICROPY_NLR_THUMB).  This keeps nlr.h and the
      implementation in sync, and also makes the nlr_buf_t struct easier to read.
    • Damien George's avatar
      py/nlrthumb: Fix use of naked funcs, must only contain basic asm code. · 97cc4855
      Damien George authored
      A function with a naked attribute must only contain basic inline asm
      statements and no C code.
      
      For nlr_push this means removing the "return 0" statement.  But for some
      gcc versions this induces a compiler warning so the __builtin_unreachable()
      line needs to be added.
      
      For nlr_jump, this function contains a combination of C code and inline asm
      so cannot be naked.
  3. Dec 26, 2017
    • Paul Sokolovsky's avatar
      zephyr/main: Remove unused do_str() function. · 7a9a73ee
      Paul Sokolovsky authored
      An artifact of the initial porting effort.
    • Paul Sokolovsky's avatar
      Revert "py/nlr: Factor out common NLR code to generic functions." · 096e967a
      Paul Sokolovsky authored
      This reverts commit 6a3a742a.
      
      The above commit has a number of faults, from the motivation down to the
      actual implementation.
      
      1. Faulty implementation.
      
      The original code contained functions like:
      
      NORETURN void nlr_jump(void *val) {
          nlr_buf_t **top_ptr = &MP_STATE_THREAD(nlr_top);
          nlr_buf_t *top = *top_ptr;
      ...
           __asm volatile (
           "mov    %0, %%edx           \n" // %edx points to nlr_buf
           "mov    28(%%edx), %%esi    \n" // load saved %esi
           "mov    24(%%edx), %%edi    \n" // load saved %edi
           "mov    20(%%edx), %%ebx    \n" // load saved %ebx
           "mov    16(%%edx), %%esp    \n" // load saved %esp
           "mov    12(%%edx), %%ebp    \n" // load saved %ebp
           "mov    8(%%edx), %%eax     \n" // load saved %eip
           "mov    %%eax, (%%esp)      \n" // store saved %eip to stack
           "xor    %%eax, %%eax        \n" // clear return register
           "inc    %%al                \n" // increase to make 1, non-local return
           "ret                        \n" // return
           :                               // output operands
           : "r"(top)                      // input operands
           :                               // clobbered registers
           );
      }
      
      Which clearly stated that the C-level variable should be a parameter of
      the assembly, which then moved it into the correct register.
      
      Whereas now it's:
      
      NORETURN void nlr_jump_tail(nlr_buf_t *top) {
          (void)top;
      
          __asm volatile (
          "mov    28(%edx), %esi      \n" // load saved %esi
          "mov    24(%edx), %edi      \n" // load saved %edi
          "mov    20(%edx), %ebx      \n" // load saved %ebx
          "mov    16(%edx), %esp      \n" // load saved %esp
          "mov    12(%edx), %ebp      \n" // load saved %ebp
          "mov    8(%edx), %eax       \n" // load saved %eip
          "mov    %eax, (%esp)        \n" // store saved %eip to stack
          "xor    %eax, %eax          \n" // clear return register
          "inc    %al                 \n" // increase to make 1, non-local return
          "ret                        \n" // return
          );
      
          for (;;); // needed to silence compiler warning
      }
      
      Which just tries to perform operations on an arbitrary register (%edx in
      this case).  The outcome is as expected: barring the pure luck of the
      compiler happening to put the right value in that register, there's a
      crash.
      
      2. Non-critical assessment.
      
      The original commit message says "There is a small overhead introduced
      (typically 1 machine instruction)".  That machine instruction is a call
      if the compiler doesn't perform tail-call optimization (which regularly
      happens), and it's 1 instruction only with the broken code shown above;
      fixing it requires adding more.  With the inefficiencies already present
      in the NLR code, the overhead becomes "considerable" (several times more
      than 1%), not "small".
      
      The commit message also says "This eliminates duplicated code.".  An
      obvious way to eliminate the duplication would be to factor out the
      common code into macros, without introducing overhead and breakage like
      the above.
      
      3. Faulty motivation.
      
      All this started with a report of warnings/errors happening for a niche
      compiler.  It could have been solved in one of several direct ways:
      a) fixing it just for the affected compiler(s); b) rewriting it in proper
      assembly (like it was before, BTW); c) by not doing anything at all, as
      MICROPY_NLR_SETJMP exists exactly to address minor-impact cases like that
      (where a) or b) are not applicable).  Instead, a backwards "solution" was
      put forward, leading to all the issues above.
      
      The best action thus appears to be revert and rework, not trying to work
      around what went haywire in the first place.
  4. Dec 23, 2017
  5. Dec 22, 2017
  6. Dec 20, 2017
  7. Dec 19, 2017