4 comments

  • camel-cdr 17 minutes ago

    The problem should be equivalent to: https://www.reddit.com/r/simd/comments/1hmwukl/mask_calculat...

    Falvyu's and bremac's solutions seem to be the best.

  • pestatije an hour ago

    Where's the code? Have a look at codereview[5]; the whole site is geared toward this kind of challenge.

    [5] codereview.stackexchange.com

      zokrezyl an hour ago

      I do not have one "implementation"; I have been trying different approaches, all of which delivered under 50% of memory bandwidth. I guess if anyone can propose a solution, it should be from scratch. The problem is that all the approaches I tried end up generating unpredictable branches that keep the CPU from loading text from memory at the optimal rate.

      zokrezyl an hour ago