Node.js: Loading corrupted language trained data does not throw an error #602
Comments
Thanks for reporting. I agree that loading corrupted language data should throw an error. Rather than throwing an exception, the Tesseract API returns "0 on success and -1 on initialization failure". We do not check this return value at present: tesseract.js/src/worker-script/index.js, line 188 in dd6c40b
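As a rough illustration of the missing check, here is a minimal sketch that turns the integer status code into an exception. It is not the actual tesseract.js source: `initOrThrow` is a hypothetical helper, the `api` object and the argument list of its `Init` method are assumptions, and the two stub objects exist only to exercise the helper without a real Tesseract build.

```javascript
// Hypothetical helper, not part of tesseract.js: converts the
// "0 on success, -1 on initialization failure" status into an exception.
// `api` is assumed to expose an Init(dataPath, lang, oem) method.
function initOrThrow(api, dataPath, lang, oem) {
  const status = api.Init(dataPath, lang, oem);
  if (status !== 0) {
    throw new Error(`Tesseract initialization failed for "${lang}" (status ${status})`);
  }
  return status;
}

// Stub APIs standing in for a healthy and a corrupted traineddata load.
const healthyApi = { Init: () => 0 };
const corruptedApi = { Init: () => -1 };

// Succeeds silently when the status is 0.
initOrThrow(healthyApi, null, 'eng', 1);

// Throws when the status is non-zero, so the caller can catch it.
try {
  initOrThrow(corruptedApi, null, 'eng', 1);
} catch (err) {
  console.log('caught error:', err.message);
}
```

With a check like this in place, the failure would surface at initialization time instead of later inside recognize.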
If the traineddata cache becomes corrupted, tesseract.js will still load it without throwing an error. Then, when the recognize function is called, it results in an uncatchable fatal error.
Steps to reproduce the behavior:
This results in the following output:
Note the absence of "caught error", indicating that the error is not being caught. The "Error opening data file" output occurs on the worker.initialize() call, but it does not result in an exception being thrown at that point.
If, however, the errorHandler function is enabled, this is what happens:
The worker's errorHandler function doesn't receive an error when the initialize function is called, but it does when recognize is called. Also, interestingly, the error triggered by calling the recognize function now becomes catchable.
I would expect the worker.recognize function to throw a catchable error, regardless of whether the user has specified an errorHandler for the worker. I would also expect the worker.initialize function to either throw an error when it can't load the specified traineddata or at least send an error to the errorHandler. Neither is currently done.
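To make the expected behavior concrete, here is a small sketch using a mock worker. Everything in it is hypothetical: `createMockWorker` and its `corrupted` option are invented for illustration and are not tesseract.js APIs. The mock only models the contract described above: both calls reject with a catchable error whether or not an errorHandler is supplied, and the errorHandler is additionally notified when one is present.

```javascript
// Hypothetical mock, not tesseract.js code: models the requested contract.
function createMockWorker({ corrupted = false, errorHandler } = {}) {
  const fail = (message) => {
    const err = new Error(message);
    // Notify the handler if the user supplied one...
    if (typeof errorHandler === 'function') errorHandler(err);
    // ...but always reject, so the error is catchable either way.
    return Promise.reject(err);
  };
  return {
    initialize: (lang) =>
      corrupted
        ? fail(`could not load traineddata for "${lang}"`)
        : Promise.resolve(),
    recognize: (image) =>
      corrupted
        ? fail('recognize called without usable traineddata')
        : Promise.resolve({ text: '' }),
  };
}

async function demo() {
  // No errorHandler supplied: the rejection must still be catchable.
  const worker = createMockWorker({ corrupted: true });
  try {
    await worker.initialize('eng');
  } catch (err) {
    console.log('caught error:', err.message);
  }
}

demo();
```

The design point is that the errorHandler acts as an extra notification channel and does not change whether the returned promise rejects.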