NextJS project fails to build when using Tesseract.js #868

Closed
fmercille opened this issue Jan 5, 2024 · 3 comments

@fmercille

Tesseract.js version (version number for npm/GitHub release, or specific commit for repo)
5.0.4

Describe the bug
A NextJS app that uses tesseract.js fails to build with the following error: Cannot find module '/my-dev-folder/.next/server/app/worker-script/node/index.js'

To Reproduce
Steps to reproduce the behavior:

  1. Clone the following repo which I created to illustrate this bug: https://github.com/fmercille/tesseract-js-nextjs
  2. In the project folder, run yarn install to install the dependencies
  3. Run yarn build to build the NextJS app
  4. See error

Expected behavior
I would expect to be able to use tesseract.js in a NextJS application.

Device Version:

  • OS + Version: Ubuntu 22.04.3 LTS

Additional context
The project was created using npx create-next-app@latest tesseract-js-nextjs and accepting all the default values. The file that uses tesseract.js is /src/app/api/tesseract/route.ts. It includes reading a file into a buffer, creating a worker and then recognizing the buffer, but in reality, only creating the worker is enough to make the build fail.

@fmercille
Author

Another simple way to reproduce the issue without having to clone a repo:

  1. Create a new NextJS project with npx create-next-app@latest my-project and accept all the default values
  2. cd my-project
  3. yarn install
  4. yarn build --> See that the build is successful
  5. yarn add tesseract.js
  6. Create file src/app/api/route.ts and add the following code to it:
import { createWorker } from "tesseract.js";

export async function GET(req: Request) {
  const worker = await createWorker("eng");
  return new Response();
}
  7. yarn build --> See the error

@Balearica
Collaborator

When I ran your repo I got a file path (Cannot find module) error. This indicates that the worker code is not being found automatically, so the path needs to be set manually. This can be done using the workerPath argument. In your case, I think this would look like the following:

const worker = await createWorker("eng", 1, {workerPath: "./node_modules/tesseract.js/src/worker-script/node/index.js"});

For context, each Tesseract.js "worker" runs in its own web worker (browser) or worker thread (Node.js), which is independent code with its own entry point. When Tesseract.js is used on its own, this entry point should be located automatically. However, this may not hold under the build systems used by various frameworks, because they copy files around in ways that violate Tesseract.js's assumptions about where its files are located.
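As a rough illustration of setting the path manually, the sketch below builds an absolute workerPath instead of relying on a relative one. The helper names (resolveWorkerPath, workerOptions) are hypothetical and not part of Tesseract.js. It also assumes that the server process starts in the project root and that node_modules is present there at runtime, which may not be true for every deployment.

```typescript
import * as path from "node:path";

// Location of the Node.js worker entry point inside the installed package,
// taken from the workerPath suggested in the comment above.
const WORKER_SCRIPT_SEGMENTS = [
  "node_modules",
  "tesseract.js",
  "src",
  "worker-script",
  "node",
  "index.js",
];

// Hypothetical helper: build an absolute path to the worker script.
// Assumption: projectRoot (default: the current working directory)
// is the folder that contains node_modules.
function resolveWorkerPath(projectRoot: string = process.cwd()): string {
  return path.resolve(projectRoot, ...WORKER_SCRIPT_SEGMENTS);
}

// Hypothetical helper: an options object carrying the workerPath argument.
function workerOptions(projectRoot?: string): { workerPath: string } {
  return { workerPath: resolveWorkerPath(projectRoot) };
}
```

In the route handler, the result would be passed as the third argument, in the same position as the options object in the call above: createWorker("eng", 1, workerOptions()). An absolute path mainly makes the lookup independent of how a relative path gets resolved; it does not help if the build output is deployed without node_modules.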

@fmercille
Author

That is actually very helpful information. It works now. Thanks a lot :-)
