Tesseract fails when running in Firefox incognito browser #609

Closed
ronbarbosa opened this issue Apr 17, 2022 · 3 comments

@ronbarbosa
ronbarbosa commented Apr 17, 2022

Describe the bug
When Tesseract.js runs in a Firefox incognito (private browsing) window or tab, image scans do not work. Chromium-based browsers do not seem to be affected.

To Reproduce
Steps to reproduce the behavior:

  1. In a Firefox incognito/private browser window visit https://tesseract.projectnaptha.com/
  2. Notice the demo at the bottom hangs
  3. This error is thrown in the console:
    Uncaught Error: InvalidStateError: A mutation operation was attempted on a database that did not allow mutations.
    exports createWorker.js:139
    onmessage onMessage.js:3
    exports onMessage.js:2
    exports createWorker.js:122
    demo.js:28
    createWorker.js:139:14

Expected behavior
No error should be shown and the OCR process should complete.

Desktop (please complete the following information):

  • OS: Fedora 35
  • Browser: Firefox
  • Version: 99.0.1

Smartphone (please complete the following information):

  • Device: N/A
  • OS: N/A
  • Browser: N/A
  • Version: N/A

Additional context
A failure occurs in my own project as well, both in online mode and offline mode, but I do not see the same error message. In my project, online mode fails during Tesseract.recognize() and neither returns nor throws an exception (the worker is created internally by Tesseract.recognize()). Offline mode fails during createWorker() and throws a "type error" exception indicating "r is null."

In my project, both online and offline methods work properly outside incognito mode. The offline version functions properly when disconnected from the internet as well (as long as the browser is not in incognito mode). The problem only seems to be present in incognito mode regardless of whether the browser has an active connection to the internet.

I have noticed that in a Firefox incognito browser window, the demo on the Tesseract.js homepage also fails.

@WintrySnowman
Contributor

Tesseract stores the training data inside IndexedDB after it's loaded it for the first time. Firefox's private browsing mode explicitly disables IndexedDB, as per this bug report. There does not appear to be a fallback in place, but in the worst case you should be able to check for support and alert the user as to why.
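
A minimal sketch of the support check suggested above, for illustration only. `canUseIndexedDB` and the probe database name are hypothetical and not part of Tesseract.js. It only probes whether a database can be opened; depending on the browser and version, a private window might instead fail on a later write, so treat a `true` result as a hint rather than a guarantee.

```javascript
// Hypothetical helper (not a Tesseract.js API): resolves to true only if an
// IndexedDB database can actually be opened in the current context.
// `factory` defaults to the browser's global indexedDB object.
function canUseIndexedDB(factory = globalThis.indexedDB, probeName = '__idb_probe__') {
  return new Promise((resolve) => {
    if (!factory) {
      resolve(false); // IndexedDB API is not available at all
      return;
    }
    let request;
    try {
      request = factory.open(probeName);
    } catch (err) {
      resolve(false); // open() threw synchronously
      return;
    }
    // The open request failed asynchronously (e.g. with InvalidStateError).
    request.onerror = () => resolve(false);
    request.onsuccess = () => {
      // Clean up the throwaway probe database.
      request.result.close();
      factory.deleteDatabase(probeName);
      resolve(true);
    };
  });
}
```

A page could call `canUseIndexedDB()` before starting OCR and, when it resolves to `false`, tell the user that language data cannot be cached in this window.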

@Balearica
Collaborator

Thanks to @ronbarbosa for reporting and @WintrySnowman for explaining the core issue. I reviewed the code and believe this can be easily resolved. At present, it looks like the language data is successfully retrieved from the remote server, but the write to cache then fails (as modifications to IndexedDB are disabled), which causes the entire function to throw an error. Therefore, it should simply be a matter of wrapping the cache write below in a "try" block. Will implement a fix shortly.

if (['write', 'refresh', undefined].includes(cacheMethod)) {
  await adapter.writeCache(`${cachePath || '.'}/${lang}.traineddata`, data);
}
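
One way the proposed "try" wrapping could look, as a sketch rather than the actual committed fix. `writeCacheSafely` is a hypothetical helper name, and the only thing assumed about `adapter` is that it exposes an async `writeCache(path, data)` as in the snippet above. The point is that a failed cache write is logged and ignored, so the already downloaded language data can still be used.

```javascript
// Sketch only: `writeCacheSafely` is a hypothetical helper, not the code
// that was committed to Tesseract.js.
// Attempts to cache `data` under `path`; returns true on success and false
// if the write failed (for example because IndexedDB rejects mutations).
async function writeCacheSafely(adapter, path, data) {
  try {
    await adapter.writeCache(path, data);
    return true;
  } catch (err) {
    // Caching is an optimization, so report the failure and carry on.
    console.warn(`Could not cache ${path}: ${err}`);
    return false;
  }
}
```

With a helper like this, the call inside the `if` block would become `await writeCacheSafely(adapter, path, data)`, and the surrounding function would continue regardless of the result.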

Balearica pushed a commit that referenced this issue Sep 25, 2022
Balearica added a commit that referenced this issue Nov 25, 2022
See #662 for explanation of Tesseract.js Version 4 changes.  List below is auto-generated from commits. 

* Added image preprocessing functions (rotate + save images)

* Updated createWorker to be async

* Reworked createWorker to be async and throw errors per #654

* Edited detect to return null when detection fails rather than throwing error per #526

* Updated types per #606 and #580 (#663) (#664)

* Removed unused files

* Added savePDF option to recognize per #488; cleaned up code for linter

* Updated download-pdf example for node to use new savePDF option

* Added OutputFormats option/interface for setting output

* Allowed for Tesseract parameters to be set through recognition options per #665

* Updated docs

* Edited loadLanguage to no longer overwrite cache with data from cache per #666

* Added interface for setting 'init only' options per #613

* Wrapped caching in try block per #609

* Fixed unit tests

* Updated setImage to resolve memory leak per #678

* Added debug output option per #681

* Fixed bug with saving images per #588

* Updated examples

* Updated readme and Tesseract.js-core version
@Balearica
Collaborator

Closing as this was resolved in Version 4.
