Promise catch not triggering #471

Closed
rvndorado opened this issue Jul 20, 2020 · 4 comments

@rvndorado

I have encountered an issue with tesseract.js version 2.1.1, Express 4.17.1, and Node 12.6.1.

The promise's catch handler does not seem to be triggered when recognition fails.

Here is my code block:

const Tesseract = require('tesseract.js');
exports.postImage = (req, res, next) => {
    
    const image = req.body.image;
    const language = req.body.language;
    let hasError = false;
    let errorMessage = '';
    let outputText = '';

    Tesseract.recognize(image, language, { })
      .then(({ data: { text } }) => {
        outputText = text;
        res.status(200).json({
          hasError: hasError,
          errorMessage: errorMessage,
          outputText: outputText,
        });
      })
      .catch((error) => {
        hasError = true;
        errorMessage = "Error Message";
        res.status(400).json({
          hasError: hasError,
          errorMessage: errorMessage,
          outputText: outputText,
        });
      });


};

Error message encountered:

Error in pixReadMem: Unknown format: no pix returned
Error in pixGetSpp: pix not defined
Error in pixGetDimensions: pix not defined
Error in pixGetColormap: pix not defined
Error in pixCopy: pixs not defined
Error in pixGetDepth: pix not defined
Error in pixGetWpl: pix not defined
Error in pixGetYRes: pix not defined
Error in pixClone: pixs not defined
Please call SetImage before attempting recognition.
@TimMun

TimMun commented Aug 11, 2020

For anyone else who stumbles across this: you need to supply an errorHandler in the options object when you create the Tesseract worker. See the PR below:

#368

Something like:

import { createWorker } from 'tesseract.js';

const worker = createWorker({
  logger: m => console.log(m),
  errorHandler: e => myErrorHandler(e)
})

@foopis23

I am attempting to OCR a batch of images, but with this kind of error handling there is no way to reuse a worker and simply skip over images that throw an error. It would be nice if the recognize function could throw the error so that you can handle it on a per-job basis.

@foopis23

foopis23 commented Jun 29, 2022

I figured out how to handle errors when using one worker to go through a large batch. I am leaving my previous comment in place because it might help someone who finds this thread. After quite a bit of digging, I realized that if you pass an error handler to the createWorker function, even one that does nothing, the error is thrown in the promise returned by the recognize function. However, because this error causes the worker thread to stop working, you still have to recover from it. My solution was to create a new worker whenever the promise fails.

My solution:

import { createWorker, Worker } from 'tesseract.js';

// Assumed values; the original snippet did not define these.
const maxRetries = 2;
let totalErrors = 0;

async function setupWorker() {
  // Passing an errorHandler, even an empty one, lets recognize() reject on failure.
  const worker = createWorker({
    // eslint-disable-next-line @typescript-eslint/no-empty-function
    errorHandler: () => { }
  });

  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');

  return worker;
}

async function cleanUpWorker(worker: Worker) {
  await worker.terminate();
}

async function main() {
  const imgs = ['./img1.png', './img2.png', './img3.png'];
  let worker = await setupWorker();
  for (const img of imgs) {
    for (let i = 0; i <= maxRetries; i++) {
      try {
        const result = await worker.recognize(img);

        // TODO: upload result to database

        break;
      } catch (err) {
        // The failed worker stops responding, so replace it before retrying.
        await cleanUpWorker(worker);
        worker = await setupWorker();

        if (i === maxRetries) {
          console.log(`Failure: Max retry reached for ${img}`);
          totalErrors++;
        }
      }
    }
  }
  // Release the last worker once the batch is done.
  await cleanUpWorker(worker);
}
main();
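
The retry-and-recreate pattern above can also be factored into a small helper that reports a per-image outcome instead of only counting failures. This is a minimal sketch, not part of tesseract.js: `recognizeBatch` is a hypothetical name, and `makeWorker` is assumed to be an async factory returning an object with `recognize(img)` and `terminate()`, such as the `setupWorker` function above.

```javascript
// Hypothetical helper: OCR a batch with one worker, replacing the worker
// after any failure. `makeWorker` is an assumed async factory that returns
// an object exposing recognize(img) and terminate().
async function recognizeBatch(imgs, makeWorker, maxRetries = 2) {
  const results = [];
  let worker = await makeWorker();
  try {
    for (const img of imgs) {
      let outcome = { img, ok: false, error: null };
      for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
          const { data } = await worker.recognize(img);
          outcome = { img, ok: true, text: data.text };
          break;
        } catch (err) {
          outcome.error = err;
          // The failed worker is assumed to be unusable, so swap in a fresh one.
          await worker.terminate();
          worker = await makeWorker();
        }
      }
      results.push(outcome);
    }
  } finally {
    // Always release the worker that is still alive at the end.
    await worker.terminate();
  }
  return results;
}
```

Calling `await recognizeBatch(imgs, setupWorker)` would then return one entry per image, so failed images can be logged or retried later without stopping the batch.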

@Balearica
Collaborator

I believe the core issue here is that (without workarounds) Tesseract.js did not throw an error when you attempted to use an invalid image. This was resolved in the latest version (4.0.1). If Tesseract is unable to read an image, an error is now thrown immediately. If anybody continues to experience this issue in the latest version, please open a new issue.
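
For the Express handler in the original report, that behavior means a plain try/catch around an awaited recognize call should be enough. The sketch below assumes tesseract.js 4.0.1 or later, so that recognize rejects on an unreadable image. `makePostImage` is a hypothetical wrapper name, and the recognize function is passed in as a parameter so the handler does not depend on a particular version.

```javascript
// Build an Express-style handler around any promise-returning recognize
// function. With tesseract.js this would be Tesseract.recognize (assumed
// >= 4.0.1, where an unreadable image rejects the promise).
function makePostImage(recognize) {
  return async function postImage(req, res) {
    const { image, language } = req.body;
    try {
      const { data: { text } } = await recognize(image, language);
      res.status(200).json({ hasError: false, errorMessage: '', outputText: text });
    } catch (error) {
      // Reached only if recognize rejects or throws.
      res.status(400).json({
        hasError: true,
        errorMessage: error instanceof Error ? error.message : String(error),
        outputText: '',
      });
    }
  };
}
```

It could be wired up as `exports.postImage = makePostImage(require('tesseract.js').recognize);`.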
