Skip Navigation
What are you working on this week? (June. 16, 2024)
  • How are you doing a date/time library without platform dependencies like libc or windows-sys? Are you rolling your own bindings in order to get the local time zone? (Or perhaps you aren't doing that at all.)

  • Trying to invent a better substring search algorithm
  • Disclosure: I'm the author of the memchr crate.

    You mention the memchr crate, but you don't seem to have benchmarked it. Instead, you benchmarked the needle crate (last updated 7 years ago). Can you explain a bit more about your methodology?

    The memchr crate in particular doesn't just use Rabin-Karp. It also uses Two-Way. And SIMD (with support for x86-64, aarch64 and wasm32).

  • A comprehensive guide to the dangers of Regular Expressions in JavaScript
  • Both Perl and Python use backtracking regex engines and are thus susceptible to similar problems as discussed in the OP.

  • aho-corasick (and thus the regex crate too) now uses SIMD on aarch64 (e.g., Apple silicon) to greatly accelerate some searches
  • Cross-posting from reddit:

    The PR has more details, but here are a few ad hoc benchmarks using ripgrep on my M2 mac mini while searching a 5.5GB file.

    This one is just a case insensitive search. A case insensitive regex expands to something like (ignoring Unicode) [Ss][Hh][Ee][Rr]..., which means that it has multiple literal prefixes. In fact, you can enumerate them! As long as the set is small enough, this is something that the new SIMD acceleration on aarch64 can handle (and has done for a long time on x86-64):

    $ time rg-before-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055
    
    real    8.208
    user    7.731
    sys     0.467
    maxmem  5600 MB
    faults  191
    
    $ time rg-after-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055
    
    real    1.137
    user    0.695
    sys     0.430
    maxmem  5904 MB
    faults  203
    

    And of course, using multiple literals explicitly also uses this optimization:

    $ time rg-before-teddy-aarch64 -c 'Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty' OpenSubtitles2018.half.en
    3804
    
    real    9.055
    user    8.580
    sys     0.474
    maxmem  4912 MB
    faults  11
    
    $ time rg-after-teddy-aarch64 -c 'Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty' OpenSubtitles2018.half.en
    3804
    
    real    1.121
    user    0.697
    sys     0.422
    maxmem  4832 MB
    faults  11
    

    And it doesn't just work for prefixes, it also works for inner literals too:

    $ time rg-before-teddy-aarch64 -c '\w+\s+(Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)\s+\w+' OpenSubtitles2018.half.en
    773
    
    real    9.065
    user    8.586
    sys     0.477
    maxmem  6384 MB
    faults  11
    
    $ time rg-after-teddy-aarch64 -c '\w+\s+(Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)\s+\w+' OpenSubtitles2018.half.en
    
    773
    
    real    1.124
    user    0.702
    sys     0.421
    maxmem  6784 MB
    faults  11
    

    If you're curious about how the SIMD stuff works, you can read my description of Teddy here. I ported this algorithm out of the Hyperscan project several years ago, and it has been one of the killer ingredients for making ripgrep fast in a lot of common cases. But it only worked on x86-64. With the rise and popularity of aarch64 and Apple silicon, I was motivated to port it over. I just recently finished analogous work for the memchr crate as well.

  • aho-corasick (and thus the regex crate too) now uses SIMD on aarch64 (e.g., Apple silicon) to greatly accelerate some searches
    github.com add aarch64 SIMD implementation of Teddy by BurntSushi · Pull Request #129 · BurntSushi/aho-corasick

    Up until this point, Teddy was explicitly written using x86-64 SIMD routines. Specifically, ones from SSSE3 and AVX2. This PR shuffles Teddy's main implementation into code that is generic over a n...

    add aarch64 SIMD implementation of Teddy by BurntSushi · Pull Request #129 · BurntSushi/aho-corasick
    2
    burntsushi burntsushi @programming.dev

    I love to code.

    Posts 1
    Comments 6