Hi,
This adds the combining diacritical mark ranges from the Hebrew and Arabic Unicode blocks (cantillation marks, vowel points, and similar) to the list of code points that `unaccent` strips. A few punctuation code points are interspersed between these ranges, so they cannot be merged into larger contiguous blocks.
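
To illustrate why the ranges end up fragmented, here is a minimal Python sketch that groups the combining marks of a block into contiguous runs using the standard `unicodedata` module. It is not the code in this change: the helper names `combining_mark_ranges` and `strip_marks` are hypothetical, and treating every code point whose general category starts with `M` as a mark is an assumption about which code points should be stripped.

```python
import unicodedata


def combining_mark_ranges(start, end):
    """Return inclusive (first, last) runs of combining marks in [start, end].

    Assumption: a "mark" is any code point whose Unicode general
    category starts with "M" (Mn, Mc, Me).
    """
    ranges = []
    run_start = None
    for cp in range(start, end + 1):
        is_mark = unicodedata.category(chr(cp)).startswith("M")
        if is_mark and run_start is None:
            run_start = cp                      # a new run begins here
        elif not is_mark and run_start is not None:
            ranges.append((run_start, cp - 1))  # the run ended on the previous code point
            run_start = None
    if run_start is not None:
        ranges.append((run_start, end))         # run reaches the end of the block
    return ranges


def strip_marks(text):
    """Drop combining marks and keep everything else, including punctuation."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )


# Hebrew block: U+0590..U+05FF
for first, last in combining_mark_ranges(0x0590, 0x05FF):
    print(f"U+{first:04X}..U+{last:04X}")
```

In the Hebrew block the runs are interrupted by punctuation such as U+05BE (MAQAF) and U+05C3 (SOF PASUQ), which should survive stripping, so a single range covering U+0591..U+05C7 would remove too much.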