PyTesseract - recognize digits in simple image
I'm trying to use pytesseract to recognize two digits in an image.
- I have tried --psm values from 6 up to 10
- I have tried -c tessedit_char_whitelist=0123456789
None of the above returns the number 49. The closest I got was 4, without the 9.
Do you have any tips on how to make Tesseract recognize it?
python ocr tesseract pytesseract
asked Jan 1 at 20:26 by PovilasK; edited Jan 2 at 4:27 by Davide Fiocco
2 Answers
Have you tried a different --oem value? I would also try a --psm higher than 10.
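A rough sketch of how that suggestion could be tried systematically. The helper names build_configs and try_configs are hypothetical, and the ranges (--psm 6 to 13, --oem 1 to 3) are my assumption: they cover the values the asker already tried plus the ones above 10. Treat it as a starting point, not a definitive recipe.

```python
from itertools import product

DIGITS_WHITELIST = "-c tessedit_char_whitelist=0123456789"


def build_configs(psm_values=range(6, 14), oem_values=(1, 2, 3)):
    """Return one Tesseract config string per (psm, oem) combination."""
    return [
        f"--psm {psm} --oem {oem} {DIGITS_WHITELIST}"
        for psm, oem in product(psm_values, oem_values)
    ]


def try_configs(image, configs):
    """Run OCR once per config and map each config to the text it produced."""
    # Imported here so build_configs() can be used without Tesseract installed.
    import pytesseract

    results = {}
    for config in configs:
        try:
            text = pytesseract.image_to_string(image, config=config)
            results[config] = text.strip()
        except pytesseract.TesseractError as exc:
            # Some engine modes need traineddata that may not be installed.
            results[config] = f"<error: {exc}>"
    return results
```

Calling try_configs(image, build_configs()) with a PIL image returns a dict, so it is easy to scan for the configs (if any) that produce 49 on a given machine.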
Yes, I have tried --oem 3 without any luck. – PovilasK Jan 2 at 20:09
Try --psm 13 --oem 3 (--oem 1 or 2 should also work):

    import io

    import pytesseract
    import requests
    from PIL import Image

    # Download the image and open it from memory.
    response = requests.get('https://i.stack.imgur.com/oAAXR.png')
    image = Image.open(io.BytesIO(response.content))
    # --psm 13 treats the image as a single raw text line; the whitelist allows digits only.
    text = pytesseract.image_to_string(
        image, lang='eng',
        config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')
    print(text)

This yields 49, as expected, on my machine.
I get the same result by downloading the image locally and running

    tesseract oAAXR.png output --oem 3 --psm 13 -l eng

For reference, my tesseract --version gives

    tesseract 4.0.0
     leptonica-1.77.0
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.1) : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.1
     Found AVX2
     Found AVX
     Found SSE
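Since the same call reportedly gives 49 on one machine and "ay" on another, it may help to validate the result instead of printing it blindly. Below is a small sketch; clean_digits and ocr_digits are hypothetical helper names, and the default config is simply the one from this answer.

```python
def clean_digits(raw_text):
    """Return the OCR output as a digit string, or None if it is not all digits."""
    # image_to_string() output typically ends with whitespace such as "\n" or "\f".
    candidate = raw_text.strip()
    return candidate if candidate.isascii() and candidate.isdigit() else None


def ocr_digits(image, config="--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789"):
    """Run pytesseract with the given config and validate the result."""
    import pytesseract  # imported lazily; needs a local Tesseract install

    raw_text = pytesseract.image_to_string(image, lang="eng", config=config)
    return clean_digits(raw_text)
```

A None result for an image that clearly contains digits would suggest the digit whitelist is not being applied on that setup, which is worth checking before tuning --psm any further.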
Where is --psm 13 documented? I only see 1-10 here: TESSERACT(1) Manual Page. – user3169 Jan 7 at 6:55
Aw, looks like their documentation may be inconsistent or refer to different versions there; check github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage for psm > 10. – Davide Fiocco Jan 7 at 10:14
Thanks for the answer, but your code gives me the string "ay" instead of 49. My tesseract version: tesseract 4.0.0 leptonica-1.77.0 libgif 5.1.4 : libjpeg 9c : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.1 : libopenjp2 2.3.0 Found AVX2 Found AVX Found SSE. I'm also using macOS (Mojave). Maybe that has something to do with it. – PovilasK Jan 8 at 15:10
I have edited the answer with my config, not sure what could be going wrong :( – Davide Fiocco Jan 8 at 21:10
Yeah, I can see that the only difference is that your libjpeg is 8d and mine is 9c. Everything else is the same. – PovilasK Jan 10 at 12:22
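Comparing the two version strings by eye is error-prone, so here is a sketch that diffs them mechanically. The regular expression is my own assumption about the "name version" layout of the tesseract --version output quoted in this thread, and parse_versions and diff_versions are hypothetical helper names.

```python
import re

# Matches "name version" or "name-version" pairs, e.g. "libjpeg 8d" or "leptonica-1.77.0".
COMPONENT = re.compile(r"\b([A-Za-z][\w-]*?)[ -]([0-9][\w.]*)")


def parse_versions(version_output):
    """Map each component named in `tesseract --version` output to its version."""
    return dict(COMPONENT.findall(version_output))


def diff_versions(mine, theirs):
    """Return {component: (my_version, their_version)} for every mismatch."""
    a, b = parse_versions(mine), parse_versions(theirs)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# The two version strings as quoted in the answer and in the asker's comment.
ANSWER_SETUP = (
    "tesseract 4.0.0 leptonica-1.77.0 libgif 5.1.4 : libjpeg 8d "
    "(libjpeg-turbo 2.0.1) : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : "
    "libwebp 1.0.1 Found AVX2 Found AVX Found SSE"
)
ASKER_SETUP = (
    "tesseract 4.0.0 leptonica-1.77.0 libgif 5.1.4 : libjpeg 9c : libpng 1.6.36 : "
    "libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.1 : libopenjp2 2.3.0 "
    "Found AVX2 Found AVX Found SSE"
)

for name, versions in diff_versions(ANSWER_SETUP, ASKER_SETUP).items():
    print(name, versions)
```

On the two strings quoted here it reports libjpeg (8d vs 9c), plus libjpeg-turbo listed only in the answerer's build and libopenjp2 listed only in the asker's. That only shows where the builds differ; it does not establish which difference, if any, explains the "ay" result.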
First answer: answered Jan 1 at 22:37 by QuarKUS7; edited Jan 6 at 14:15 by Davide Fiocco
Second answer: answered Jan 4 at 20:27 by Davide Fiocco; edited Jan 8 at 21:13