View full version: Capturing text shown on the screen

fffisher 2017-10-9 11:01 AM

Capturing text shown on the screen

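Hi all, is there a way to use VBS to capture the text shown on the screen and put it into Excel? The target is an options quote terminal (SPTrader), and I'd like the data to refresh at least once every 5 seconds. Is there any OBJECT that can do this?
I know SPTrader has an API that can do data feeding, but you have to sign a disclaimer, which means giving up a lot of legal protection.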

fffisher 2017-10-9 11:09 AM

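Or, to ask it another way: is there a way to control the mouse invisibly to do copy & paste? It needs to be invisible so that it doesn't interfere with using the computer at the same time.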

遺失帳戶 2020-5-25 08:40 AM

I have the HKEX daily report files for 13, 19 and 27 September 2013 (d130927c, d130919c, d130913c). Do you still need them?

form5 2020-5-25 09:47 PM

[quote]Originally posted by [i]fffisher[/i] on 2017-10-9 11:09 AM
Or, to ask it another way: is there a way to control the mouse invisibly to do copy & paste? It needs to be invisible so that it doesn't interfere with using the computer at the same time. [/quote]
Yes, you can. You can use Spy++ to locate the UI element in your application and grab the text you want.
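
As a rough illustration of the Spy++ idea, here is a minimal Python sketch that reads control text through window messages, so no mouse movement or clipboard is involved. It is only a sketch under assumptions: the function name `read_control_texts` is made up for this example, it uses the Win32 calls `EnumWindows`, `EnumChildWindows` and `SendMessageW` with `WM_GETTEXT` via `ctypes`, and nothing in this thread confirms that SPTrader exposes its quotes as standard controls. A custom-drawn grid would simply return nothing.

```python
import sys

WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E


def read_control_texts(title_keyword):
    """Return the non-empty text of every child control that belongs to a
    top-level window whose title contains title_keyword (case-insensitive)."""
    if sys.platform != "win32":
        raise OSError("window-message scraping only works on Windows")

    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    enum_proc = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    user32.EnumWindows.argtypes = [enum_proc, wintypes.LPARAM]
    user32.EnumChildWindows.argtypes = [wintypes.HWND, enum_proc, wintypes.LPARAM]
    user32.SendMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
    ]
    user32.SendMessageW.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized

    def text_of(hwnd):
        # Ask the window for its text length, then for the text itself.
        length = user32.SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, 0)
        buf = ctypes.create_unicode_buffer(length + 1)
        user32.SendMessageW(hwnd, WM_GETTEXT, length + 1, ctypes.addressof(buf))
        return buf.value

    results = []

    def on_child(hwnd, _lparam):
        text = text_of(hwnd)
        if text:
            results.append(text)
        return True  # keep enumerating

    def on_top_level(hwnd, _lparam):
        if title_keyword.lower() in text_of(hwnd).lower():
            user32.EnumChildWindows(hwnd, child_callback, 0)
        return True  # keep enumerating

    # Keep references to the callbacks alive while Windows is calling them.
    child_callback = enum_proc(on_child)
    top_callback = enum_proc(on_top_level)
    user32.EnumWindows(top_callback, 0)
    return results
```

Calling `read_control_texts("SPTrader")` would list whatever text the matching window's controls report. Spy++ is still useful first, to see which control class actually holds the numbers. Note that `SendMessageW` blocks if the target window is not responding.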

abcd5678 2020-5-26 02:36 AM

I saw a similar project years ago that used automatic screen capture and OCR.

If your text is English, OCR accuracy will be nearly 100%.

kormer 2020-5-26 06:31 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-26 02:36 AM
I saw a similar project years ago that used automatic screen capture and OCR.

If your text is English, OCR accuracy will be nearly 100%. [/quote]

For things like CAPTCHAs the accuracy may be lower.

abcd5678 2020-5-26 06:52 PM

[quote]Originally posted by [i]kormer[/i] on 2020-5-26 06:31 PM
For things like CAPTCHAs the accuracy may be lower. [/quote]
A CAPTCHA on a stock display service?

kormer 2020-5-26 07:45 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-26 06:52 PM
A CAPTCHA on a stock display service? [/quote]

Nothing was specified, so there might well be a CAPTCHA at login.

form5 2020-5-28 11:48 PM

I just built an OCR application to scrape data from another application.

[quote]
[url]https://sendeyo.com/up/d/8f30279824[/url]
[/quote]

[[i] Last edited by form5 on 2020-5-28 11:49 PM [/i]]

YjgfkHJj 2020-5-28 11:58 PM

[quote]Originally posted by [i]form5[/i] on 2020-5-28 11:48 PM
I just built an OCR application to scrape data from another application.
[/quote]

wow, hello Neo

form5 2020-5-29 12:18 AM

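With some polish it might be sellable. I think the tesseract library is under the Apache license. Its Chinese recognition is relatively poor, and different fonts may each need some training.
What features should I add? A link to Excel?

[[i] Last edited by form5 on 2020-5-29 12:36 AM [/i]]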

abcd5678 2020-5-30 01:19 PM

[quote]Originally posted by [i]form5[/i] on 2020-5-29 12:18 AM
With some polish it might be sellable. I think the tesseract library is under the Apache license. Its Chinese recognition is relatively poor, and different fonts may each need some training.
What features should I add? A link to Excel? ... [/quote]
The best Chinese OCR right now is Google Vision. It is better than ABBYY. tesseract needs training to get the best results on Chinese characters.

abcd5678 2020-5-30 01:28 PM

[quote]Originally posted by [i]form5[/i] on 2020-5-29 12:18 AM
With some polish it might be sellable. I think the tesseract library is under the Apache license. Its Chinese recognition is relatively poor, and different fonts may each need some training.
What features should I add? A link to Excel? ... [/quote]
You are a bit late: Google and Microsoft already provide document recognition in their cloud services. No programming is required, you just import and train. There are also many OpenCV projects that do a similar job.

Inspecting moving objects has a brighter future.

kormer 2020-5-30 02:35 PM

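If there is an API, you don't need to care what kind of text it is. BTW, OCR is not new tech; it has been around for many years. Not everyone likes to use cloud services like these, either.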

form5 2020-5-30 03:21 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-30 01:28 PM
You are a bit late: Google and Microsoft already provide document recognition in their cloud services. No programming is required, you just import and train. There are also many OpenCV projects that do a similar job.
Inspect movi ... [/quote]
Just curious: if you were to build a related application, what kind of product would you make? Or which kinds of product do you think have prospects? Could you elaborate? Thanks.

form5 2020-5-30 03:26 PM

[quote]Originally posted by [i]kormer[/i] on 2020-5-30 02:35 PM
If there is an API, you don't need to care what kind of text it is. BTW, OCR is not new tech; it has been around for many years. Not everyone likes to use cloud services like these, either. [/quote]
Cloud services could also be useful as a base for building on top of, but I don't really know which products are in demand and could make money, say if sold on the App Store.

[[i] Last edited by form5 on 2020-5-30 03:31 PM [/i]]

abcd5678 2020-5-30 06:50 PM

[quote]Originally posted by [i]kormer[/i] on 2020-5-30 02:35 PM
If there is an API, you don't need to care what kind of text it is. BTW, OCR is not new tech; it has been around for many years. Not everyone likes to use cloud services like these, either. [/quote]
That means you have never developed an OCR application. The difference between 90% and 95% accuracy can determine the success or failure of a project. Do you think both Google and Microsoft would redevelop an outdated technology as a cloud service?

As an example, can you recognize these two characters, "0I"? Is the first a zero or the "O" in "OLD"? Is the second a lowercase "l" as in "letter", a capital "I" as in "Ice", or the digit one? For Chinese characters it is even more difficult.

To get good OCR results, many projects now use deep learning and computer vision. tesseract 4, for example, uses LSTM, which is a kind of machine learning.

OCR and document recognition are ever-evolving technologies. A DMS, ECM or IIM can sell for millions.
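
For numeric fields such as price quotes, one common way to soften the "0I" problem described above is to repair look-alike letters after OCR and reject anything that still isn't a number. The sketch below is only an illustration: the look-alike table and the name `clean_numeric_field` are assumptions made for this example, not something taken from tesseract or from any project mentioned in this thread.

```python
import re

# Letters that OCR engines commonly confuse with digits (an assumed,
# deliberately small table; extend it for the font you actually capture).
DIGIT_LOOKALIKES = str.maketrans({
    "O": "0", "o": "0",
    "I": "1", "l": "1", "|": "1",
    "S": "5", "B": "8",
})

# Accept only plain decimal numbers such as "215", "-3" or "24510.5".
NUMERIC_SHAPE = re.compile(r"^-?\d+(\.\d+)?$")


def clean_numeric_field(raw):
    """Return raw as a float if it looks like a number once look-alike
    letters are mapped to digits; return None otherwise."""
    candidate = raw.strip().replace(",", "").translate(DIGIT_LOOKALIKES)
    if NUMERIC_SHAPE.match(candidate):
        return float(candidate)
    return None
```

With this, `clean_numeric_field("2I5.O")` gives `215.0`, while a real word such as `"OLD"` gives `None`. This only makes sense for cells that are known to hold numbers; applied to free text it would corrupt ordinary words.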

abcd5678 2020-5-30 07:11 PM

[quote]Originally posted by [i]form5[/i] on 2020-5-30 03:21 PM
Just curious: if you were to build a related application, what kind of product would you make? Or which kinds of product do you think have prospects? Could you elaborate? Thanks. [/quote]
If you want to apply for government funding, using deep learning in your project is a selling point. Adding some specific hardware is even better.

kormer 2020-5-30 10:24 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-30 06:50 PM
That means you have never developed an OCR application. The difference between 90% and 95% accuracy can determine the success or failure of a project. Do you think both Google and Microsoft would redevelop an outdated technology as a c ... [/quote]

The tech you mention has been around for several years. My point is this: if enterprises gradually go paperless, where will the documents that need OCR come from? Paperless means everything is already digitized, so an API is enough and gets you 100%. Recently another member, H, also suggested other learning methods that can greatly reduce the training data needed. That may be the real new direction, and may be more accurate than what is used now.

abcd5678 2020-5-31 12:56 AM

[quote]Originally posted by [i]kormer[/i] on 2020-5-30 10:24 PM
The tech you mention has been around for several years. My point is this: if enterprises gradually go paperless, where will the documents that need OCR come from? Paperless means everything is already digitized, so an API is enough and gets you 100%. Recently another member, H, also suggested other learning methods that can greatly reduce the training data needed. That may be the real new direction, and may be more accurate than what is used now. ... [/quote]
OCR is NOT ONLY for documents. It is also used on photos, video, CCTV surveillance and so on, for example license plate identification in car parks.

Besides, the world has more paper than ever before, thanks to advances in printers.

The biggest source of demand for OCR in Hong Kong is the government. Every OCR project pays at least 2M, not including the monthly-paid IT staff.

kormer 2020-5-31 12:22 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-31 12:56 AM
OCR is NOT ONLY for documents. It is also used on photos, video, CCTV surveillance and so on, for example license plate identification in car parks.

Besides, the world has more paper than ever before, thanks to advances in ... [/quote]

For unstructured data types such as photos and video, with tasks like information segmentation, the techniques that are popular now are "NOT ONLY OCR".

Overall, my impression is that the BFSI sector is the biggest user of OCR. "Cheque" documents still need to exist, but non-document electronic methods are already available.

abcd5678 2020-5-31 01:36 PM

[quote]Originally posted by [i]kormer[/i] on 2020-5-31 12:22 PM
For unstructured data types such as photos and video, with tasks like information segmentation, the techniques that are popular now are "NOT ONLY OCR".

Overall, my impression is that the BFSI sector is the biggest user of OCR. "Cheque" documents still need to exist, but non-document electronic methods are already available. ... [/quote]
Check the Government Technology Voucher Programme over the past 5 years: over 50% of the projects are DMS. OCR is the core part of a DMS.

Why did Google and Microsoft jump in? Because they see the market and the demand. The ABBYY or OmniPage OCR SDK sells for USD 10,000 or above.

kormer 2020-5-31 03:15 PM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-31 01:36 PM
Check the Government Technology Voucher Programme over the past 5 years: over 50% of the projects are DMS. OCR is the core part of a DMS.

Why did Google and Microsoft jump in? Because they see the market and the demand. AB ... [/quote]

I haven't studied the TVP data you mention. Documents come in different types, so using a DMS does not necessarily mean needing or using OCR. OCR by itself is nothing special; it needs other supporting pieces before it delivers operational benefit. Over the years a lot of off-the-shelf commercial enterprise software aimed at exactly this has come to market, and whether to go cloud involves many considerations.

abcd5678 2020-5-31 09:30 PM

[url=https://www.google.com/search?client=firefox-b-d&sxsrf=ALeKk02gylR0XiMFpJrGZDncI0kpy1wabQ:1590931775646&q=you+can]you [b][i]can't[/i][/b] wake a person who is pretending to be asleep[/url]

鄉貢仁 2020-6-6 12:20 AM

Think back 14 years: there was a Hollywood movie in 2006 that would give you some hints.

[attach]11234456[/attach]

abcd5678 2020-6-6 02:31 AM

[quote]Originally posted by [i]鄉貢仁[/i] on 2020-6-6 12:20 AM
Think back 14 years: there was a Hollywood movie in 2006 that would give you some hints. [/quote]
It's good to see somebody still remembers this movie.

It gave me ideas for the project I mentioned. We just replaced the scanner with automatic screen capture.

BTW, the screen scrolled too fast, so the handheld scanner did not work. We tried this for fun.

form5 2020-6-12 11:38 PM

The movie is Firewall. It's available on Netflix.

howevera 2020-6-15 09:03 AM

[quote]Originally posted by [i]abcd5678[/i] on 2020-5-26 02:36 AM
I saw a similar project years ago that used automatic screen capture and OCR.

If your text is English, OCR accuracy will be nearly 100%. [/quote]

I also think it would take similar logic to do what the original poster described, i.e. write a program that captures the screen and runs OCR.

The problem is that the original poster also wants it not to get in the way of using the computer at the same time, so there is no guarantee that the screen is showing the page you want to capture.

alee001 2020-6-15 01:28 PM

Actually, if there were a way to read the browser's internal data, the data could be synced and updated in real time...
[attach]11265677[/attach]

abcd5678 2020-6-15 06:34 PM

[quote]Originally posted by [i]alee001[/i] on 2020-6-15 01:28 PM
Actually, if there were a way to read the browser's internal data, the data could be synced and updated in real time... [/quote]
It depends on whether the service is delivered over HTTP. If it uses HTTP, it is simpler.

A browser is not required. The simplest approach is to use curl on Linux/Unix to fetch the HTML pages, then filter out the required data.
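
The same fetch-then-filter idea can be sketched in Python with only the standard library. This is an illustration under assumptions: the names `CellCollector`, `extract_rows` and `fetch_rows` are made up here, the sample markup is hypothetical, and it presumes the quotes arrive as a plain HTML table, which nobody in this thread has confirmed for any particular service.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Hypothetical markup, only to show the shape of the output.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>Strike</th><th>Bid</th><th>Ask</th></tr>"
    "<tr><td>24000</td><td> 215 </td><td>218</td></tr>"
    "</table>"
)


class CellCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in html as lists of cell strings."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_rows(url):
    """Download url (the step curl would do) and return its table rows."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return extract_rows(response.read().decode(charset, errors="replace"))
```

`extract_rows(SAMPLE_HTML)` returns the header row followed by the data row as lists of strings, which can then be written to a CSV file that Excel opens directly.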

Many target services limit the number of accesses within a given period. Using a program might be better, since you might need to inject the user name, password, etc.
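
To stay inside such an access limit while still refreshing about every 5 seconds, as the original question asks, a program can put a small throttle in front of each request. This is a minimal sketch: `Throttle` and `poll` are names invented for this example, and the 5-second figure comes from the first post, not from any service's documented limit.

```python
import time


class Throttle:
    """Allow at most one call every min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous permitted call

    def wait(self):
        """Block until min_interval has passed since the previous call.
        Return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
        if delay:
            self._sleep(delay)
        self._last = now + delay
        return delay


def poll(fetch, throttle, rounds):
    """Call fetch() rounds times, never faster than the throttle allows."""
    results = []
    for _ in range(rounds):
        throttle.wait()
        results.append(fetch())
    return results
```

For example, `poll(get_quotes, Throttle(5.0), rounds=12)` would call a user-supplied `get_quotes` function (hypothetical here) twelve times over roughly a minute. The clock and sleep functions are injectable only so the timing logic can be checked without actually waiting.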