OCRExtract Variable

screen-scraping
ocr

#1

Hi, I’ve been trying to do a really simple program:

{
  "Name": "0 test",
  "CreationDate": "2019-3-4",
  "Commands": [
    {
      "Command": "XClick",
      "Target": "500, 700",
      "Value": ""
    },
    {
      "Command": "OCRExtract",
      "Target": "capta_dpi_120.png",
      "Value": "text"
    },
    {
      "Command": "XClick",
      "Target": "500, 700",
      "Value": ""
    },
    {
      "Command": "XType",
      "Target": "${text}",
      "Value": ""
    }
  ]
}

The XType command doesn’t type the text (it actually types nothing at all). I’m using OCRExtract because the text doesn’t change much (0.7 is the matching percentage I use, and it works fine). Here’s the photo:
(image: capt)

What am I doing wrong? Thank you!


#2

The first step is to check if OCRExtract works ok.

  • What is the value of ${text}?

If OCRExtract fails, then the next tests are:

  • If you test OCRExtract command with the FIND button, does it work then?

If this also fails, then please provide the OCRExtract input image that you are using, plus a screenshot of the website where the “value to be extracted” is.
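
One way to check the value, as a minimal sketch: add an echo step right after OCRExtract so the log shows what ended up in the variable. The image name and variable name are taken from the macro in the first post; the macro name and the echo text are only illustrative.

```json
{
  "Name": "0 test debug (illustrative)",
  "CreationDate": "2019-3-4",
  "Commands": [
    {
      "Command": "OCRExtract",
      "Target": "capta_dpi_120.png",
      "Value": "text"
    },
    {
      "Command": "echo",
      "Target": "OCR result: ${text}",
      "Value": ""
    }
  ]
}
```

If the [echo] line in the log shows nothing after "OCR result:", the variable is empty and the problem is in the OCR step rather than in XType.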


#3

Thank you for your response!

The value of text:
I’m inserting the photo (shown below) into the OCRExtract, and the value of text is what it OCRs.

To my knowledge:

OCRExtract | (Image of text you want to extract) | (Variable where you want to place the text)

in which I put

OCRExtract | Image | text, with text as my variable.

So by doing XType | ${text}, I’m typing in what I have OCR’d. Is this how it works? How can I test whether it receives the proper value?

And secondly, the Find button works fine: it finds the image of what I want to extract at about 0.7.


(Sample of Screen)

image
(What I’m OCRing)

Thank you for your help!


#4

So, I tried using echo, and apparently OCRExtract doesn’t save anything to the variable, although when I click Find on OCRExtract it works fine and can locate the image. I’m using Firefox, if that helps.

Side note - How does OCRExtract work if there are multiple instances of the image to be OCR’d in the same image?


#5

Better use relative Extraction with OCRExtractRelative. Use an input image like this:

(image: gettext_dpi_120_relative)

It finds the green area and uses the pink area as input for OCR.
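
A rough sketch of how that could look in a macro, assuming the input image is saved as gettext_dpi_120_relative.png (the .png extension, the variable name text, and the macro name are assumptions for illustration, not taken from a tested setup):

```json
{
  "Name": "relative ocr test (illustrative)",
  "CreationDate": "2019-3-4",
  "Commands": [
    {
      "Command": "OCRExtractRelative",
      "Target": "gettext_dpi_120_relative.png",
      "Value": "text"
    },
    {
      "Command": "echo",
      "Target": "Relative OCR result: ${text}",
      "Value": ""
    }
  ]
}
```

The echo step is only there to show in the log whether the pink area produced any text.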


#6

I tried using it before and it had the same problem: OCRExtractRelative wouldn’t pick up anything, even though it works when I click Find.
I added echo (This is ${text}) and nothing appears:

[status] Playing macro 0 test
[info] Executing: | XClick | 500, 700 | |
[info] Executing: | OCRExtract | CaptchaR_dpi_120.png | text |
[info] OCR (eng) started (1.0 KB)
[info] OCR result received (0.5s)
[info] Executing: | XClick | 500, 700 | |
[info] Executing: | XType | ${text} | |
[info] Executing: | echo | This is ${text} | |
[echo] This is
[info] Macro completed (Runtime 6.05s)


#7

In your log file I see only OCRExtract, but it should be OCRExtractRelative - otherwise the green and pink boxes are not detected by Kantu.


#8

I’m using OCRExtract - there are no pink and green boxes in the picture I posted above. Please look at my second post - thank you


#9

Ok, but you should use OCRExtractRelative - just as @ulrich suggested. It makes the image finding and text extracting more reliable.


#10

I’ve used it before, and the problem persists. Is there something inherently wrong with my code? I can switch back, but that doesn’t solve my issue of it not working here. I don’t think it’s an issue of image finding, because that works fine when I use Find, and the log says the OCR step runs and returns a result. The problem is that nothing appears when I use echo, meaning that it’s not extracting anything into the variable.


#11

Can we test this on a public website? Then I can make a short screencast to show how it works.


#12

Sure, it doesn’t really matter. I just need to know how it works. If there’s a simple example of it extracting text, it could even be on this forum page! That would really help.


#13

Would appreciate it sometime soon, I honestly think I’m just doing something really simple wrong :frowning:


#14

If you just need a demo on any page, have a look at DemoPDFTest_with_OCR, which ships with Kantu and shows the use of OCRExtractRelative.

If you can share a link to the website you need it working on, I can do a demo on this specific page.


#15

Do you happen to have discord (it’s a messaging platform, that I would like to extract something from, and that’s where the website is - you need an account though)? I have looked at it and I’ve tried to copy it, I just dont really see where I went wrong with my code to be honest :frowning: If you could get it - add Test00#0619, otherwise from what you can see, I can switch the code to use relative with green and pink boxes, I just don’t understand what I did wrong basically, or why it isn’t copying