Restoring Hebrew Diacritics Without a Dictionary


Abstract in English

We demonstrate that it is feasible to diacritize Hebrew script without any human-curated resources other than plain diacritized text. We present NAKDIMON, a two-layer character level LSTM, that performs on par with much more complicated curation-dependent systems, across a diverse array of modern Hebrew sources.

Download