Yiddish OCR
Sorry, I have disabled this facility for now.
General Notes
-
The program works only with TIFF files. It rejects PDF, JPEG, and all other
formats. It is a good idea to build your TIFF files with compression (LZW
works well). The program can handle multi-page TIFF files.
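
Since anything that is not a TIFF is rejected, it may save a round trip to check a file before submitting it. The sketch below is not part of the OCR program; the function names are hypothetical. It relies only on the standard TIFF signature (the first four bytes of the file) and assumes classic TIFF rather than BigTIFF.

```python
# Hypothetical pre-submission check; not the OCR program's own validation.
# A classic TIFF file starts with a byte-order mark followed by the number 42.
TIFF_SIGNATURES = (
    b"II*\x00",  # little-endian ("Intel") byte order
    b"MM\x00*",  # big-endian ("Motorola") byte order
)


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the given leading bytes match a classic TIFF signature."""
    return header[:4] in TIFF_SIGNATURES


def file_looks_like_tiff(path: str) -> bool:
    """Read the first four bytes of a file and test them."""
    with open(path, "rb") as f:
        return looks_like_tiff(f.read(4))
```

A file that passes this check can still be unusable for other reasons (resolution, color depth), so treat it as a quick filter only.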
-
The program does not handle English letters. If you have an image of a
bilingual text, it will try to interpret each letter as a Yiddish letter.
-
Any letter that the program does not recognize appears in the output as the
"▯" character.
-
I have trained the program on printed texts. It doesn't recognize
computer fonts very well.
-
It is best to submit a very clear, high-resolution image in simple
black/white. The background must be white, the letters black.
The lines must be very close to horizontal.
Images from the National Yiddish Book Center are often good.
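
As a rough illustration of what "simple black/white" means, here is a minimal sketch that thresholds a grayscale page and flips it if the background turns out dark. It works on plain lists of pixel values so that it needs no image library. The function name, the threshold of 128, and the "mostly black means inverted" rule are all assumptions for illustration, not the program's own preprocessing.

```python
def binarize(gray, threshold=128):
    """Turn rows of 0..255 gray values into rows of 0 (black) or 255 (white).

    Assumption: a normal page is mostly background, so if more than half
    of the pixels come out black, the image is treated as white-on-black
    and inverted.
    """
    bw = [[255 if v >= threshold else 0 for v in row] for row in gray]
    total = sum(len(row) for row in bw)
    black = sum(1 for row in bw for v in row if v == 0)
    if total and black > total / 2:
        bw = [[255 - v for v in row] for row in bw]
    return bw
```

A fixed threshold is the simplest choice; unevenly lit scans may need something adaptive, which this sketch does not attempt.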
-
The program is set to ignore characters that are too small (fewer
than 10 pixels wide and high).
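
One way to read this rule is sketched below: a character is dropped only when its bounding box is under 10 pixels in both width and height. That reading, the `Box` type, and the function name are assumptions; the program's actual test may differ.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding box of one candidate character, in pixels (hypothetical type)."""
    width: int
    height: int


MIN_SIZE = 10  # pixels, taken from the note above


def is_too_small(box: Box, min_size: int = MIN_SIZE) -> bool:
    """Assumed rule: ignore a character only if it is small in both dimensions."""
    return box.width < min_size and box.height < min_size
```

Under this reading, a thin but tall stroke (say 4 by 30 pixels) is kept, while a 6 by 6 speck is ignored.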
-
Please send comments to raphael at cs dot uky dot edu.
I occasionally look at the images that you upload, and I use
your high-quality images to improve the recognition rate of my
program and to discover bugs that I need to fix.
-
The program works best if I set certain parameters to match the
characteristics of your TIFF file. This web version just uses standard
settings. If you really need a particular document analyzed well, it takes
about an hour to train the program to its peculiarities. Contact me.