- Added more unit tests for releasing new versions
- Added `use_batching` config option to allow disabling batching for users for whom it causes performance issues
- Added `num_threads` config option to allow manually setting the number of threads
- Added log text that appears when invalid notes are encountered during a processing run
- Fixed an issue causing a crash in Anki versions > 2.1.40
- Changed the way note IDs are processed; no longer limited to 1000 cards
- Split out `tessdata` into its own folder, allowing easier installation of new languages
- Added bundled Tesseract for Mac; there is no longer any need to install it separately
- Hotfix to include accidentally gitignored Tesseract Mac libs

- Fixed error on some Linux environments, see, thanks to user thiswillbeyourgithub for the fix!
- Fixed raising of KeyError when img src is not found, thanks for the fix!
- Dropped support for Anki versions prior to 2.1.41, though the addon should still work on them
- Other small fixes to support Anki 2.1.41 and beyond

- Improved exception display to the end user if processing fails unexpectedly, adding debug info
- Attempting to fix error where image does not exist
- Updated build script for Tesseract for Mac
- Updated readme with link to language data
- Removed Chinese, German, French and Spanish language data to reduce file size

- Note that for versions of Anki prior to 2.1.41, the addon is locked to AnkiOCR version 0.5.3, due to breaking changes in the Anki API
- If you have examples that you think should have been processed, please raise a GitHub issue so I can look into it
- Images with differently sized text, and/or images with low resolution text, may not process properly
- Will not work with handwritten text; this probably won't change, as the library it's based on is not optimised for handwritten text
- If you want to add new languages, you need to download the appropriate language data from here.
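As a rough illustration of how the `use_batching` and `num_threads` options mentioned in the changelog might be read and checked, here is a minimal Python sketch. The default values, the `load_config` and `effective_threads` helpers, and the fallback to the CPU count are all assumptions made for this example; they are not taken from the addon's source.

```python
import json
import os

# Hypothetical defaults: the addon's real default values are not stated here.
DEFAULT_CONFIG = {"use_batching": True, "num_threads": None}


def load_config(raw_json: str) -> dict:
    """Merge user-supplied JSON over the assumed defaults and validate it."""
    config = dict(DEFAULT_CONFIG)
    config.update(json.loads(raw_json))
    if not isinstance(config["use_batching"], bool):
        raise ValueError("use_batching must be true or false")
    threads = config["num_threads"]
    if threads is not None:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(threads, bool) or not isinstance(threads, int) or threads < 1:
            raise ValueError("num_threads must be a positive integer or null")
    return config


def effective_threads(config: dict) -> int:
    """Use the manual setting if given, otherwise fall back to the CPU count."""
    return config["num_threads"] or os.cpu_count() or 1


# Example: disable batching and pin processing to two threads.
user_config = load_config('{"use_batching": false, "num_threads": 2}')
print(user_config["use_batching"], effective_threads(user_config))
```

Inside Anki itself, an addon would normally obtain its configuration from the addon manager rather than from a raw JSON string; the string input here only keeps the sketch runnable on its own.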
Anki 2.1 addon to generate OCR text from images inside of Anki notes/cards. Note that this is only designed for computer-generated text, not handwritten.

The aim of this addon is to generate searchable text for image-heavy notes; it is not intended to produce high-quality, perfectly ordered text!

Multilanguage support via configuration of detection languages.

This is currently in beta stage; please submit a bug report on GitHub if bugs are found, or if you want to raise a feature request.

AnkiOCR depends on the Tesseract OCR library. If you're on Windows or Mac, Tesseract is bundled with the addon. If you're on Linux, carefully follow the instructions here.

This program is distributed in the hope that it will be useful but WITHOUT ANY WARRANTY.

1. Open the card browser and select the note(s) you want to process. Use the search bar at the top, select tags, decks, etc.
2. On the toolbar at the top, select 'Cards', then 'AnkiOCR', and select 'Run AnkiOCR on selected notes', as shown below
3. After processing, each of the images in the note will have the OCR data embedded in the `title` html tag, viewable as a tooltip:
4. If you want to remove the OCR data from any notes, select them and then use the "Remove OCR data from selected notes" option in the menu shown above

If you wish to have the OCR data output to a separate 'OCR' field on the note, which will modify the note types in your deck, you can set the `text_output_location` config option to `new_field`.
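The tooltip behaviour described in the usage steps, where OCR text is embedded in the `title` attribute of each image and can later be removed, can be sketched roughly as follows. This is a simplified illustration using regular expressions, not AnkiOCR's actual implementation; the function names `add_ocr_title` and `remove_ocr_title`, and the sample file name and text, are hypothetical.

```python
import html
import re

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r'src="([^"]*)"', re.IGNORECASE)
TITLE_ATTR = re.compile(r'\s+title="[^"]*"', re.IGNORECASE)


def add_ocr_title(field_html: str, ocr_text_by_src: dict) -> str:
    """Embed OCR text in the title attribute of each matching <img> tag."""

    def replace(match):
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        if src is None or src.group(1) not in ocr_text_by_src:
            return tag  # leave images without a src or OCR text untouched
        tag = TITLE_ATTR.sub("", tag)  # drop any earlier title first
        title = html.escape(ocr_text_by_src[src.group(1)], quote=True)
        # tag[:4] is the literal "<img"; insert the attribute right after it
        return tag[:4] + f' title="{title}"' + tag[4:]

    return IMG_TAG.sub(replace, field_html)


def remove_ocr_title(field_html: str) -> str:
    """Strip title attributes from all <img> tags (including non-OCR ones)."""
    return IMG_TAG.sub(lambda m: TITLE_ATTR.sub("", m.group(0)), field_html)


# Made-up note field and OCR result, purely for illustration.
field = '<img src="cell.png">'
tagged = add_ocr_title(field, {"cell.png": 'Mitochondria "powerhouse"'})
print(tagged)
print(remove_ocr_title(tagged))
```

Escaping the OCR text before writing it into the attribute matters because recognised text can contain quotes or angle brackets that would otherwise break the note's HTML.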