#python #document_analysis #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_parser #vlm_ocr
Dolphin is an AI model that analyzes complex document images, such as pages containing text, tables, formulas, and figures. It works in two stages: first it determines the layout and reading order of the page; then it parses each element using prompts specific to that element type. The design aims to be fast and accurate at turning document images into structured data such as JSON or Markdown. Pre-trained models and example code are available for processing single pages, PDFs, or individual elements, which saves time and effort when extracting information from complicated documents.
https://github.com/bytedance/Dolphin
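The two-stage flow described in the Dolphin post can be sketched in a few lines of Python. This only illustrates the data flow and is not Dolphin's actual API: the `Element` class, the `PROMPTS` table and its prompt strings, and the `layout_model` / `element_model` callables are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

BBox = Tuple[int, int, int, int]


@dataclass
class Element:
    """One layout element found in stage 1 (hypothetical structure)."""
    kind: str   # e.g. "text", "table", "formula"
    bbox: BBox  # (left, top, right, bottom) in pixels
    order: int  # position in the reading order


# Placeholder prompts, one per element type (not Dolphin's real prompts).
PROMPTS: Dict[str, str] = {
    "text": "Read the text in this region.",
    "table": "Parse the table in this region.",
    "formula": "Transcribe the formula in this region.",
}


def parse_page(
    page: object,
    layout_model: Callable[[object], List[Element]],
    element_model: Callable[[object, BBox, str], str],
) -> List[dict]:
    """Stage 1: find elements and reading order. Stage 2: parse each one."""
    # Stage 1: the layout model returns elements; sort them by reading order.
    elements = sorted(layout_model(page), key=lambda e: e.order)

    # Stage 2: parse every element with the prompt matching its type,
    # falling back to the plain-text prompt for unknown types.
    results = []
    for el in elements:
        prompt = PROMPTS.get(el.kind, PROMPTS["text"])
        results.append({
            "type": el.kind,
            "bbox": list(el.bbox),
            "content": element_model(page, el.bbox, prompt),
        })
    return results
```

The returned list is already in reading order, so it can be dumped with `json.dumps` or joined into Markdown as needed.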
Image to Text OCR is a utility website made by Alejandro Akbal for extracting text from any image using #OCR.
This tool was made for those moments where you take a photo of some text and wish you could have it digitally.
https://github.com/AlejandroAkbal/Image-to-Text-OCR
Online: https://image-to-text-ocr.netlify.app/
OSS Document Scanner
Open-source Android app to #scan all your #documents. You can scan either with your camera or by importing an image. The app automatically detects your document within the photo and crops the image.
Once the document is created, you can detect text within it using #OCR.
You can also share your document as a #PDF. If you want, you can synchronize the app data with a WebDAV server (like Nextcloud) so you never lose anything!
https://github.com/Akylas/com.akylas.documentscanner
https://apt.izzysoft.de/fdroid/index/apk/com.akylas.documentscanner
https://github.com/tesseract-ocr/tesseract
This package contains an #OCR (Optical Character Recognition) engine, libtesseract, and a command-line program, tesseract.
The lead developer is Ray Smith. The maintainer is Zdenko Podobny. For a list of contributors, see AUTHORS and GitHub's log of contributors.
#Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages "out of the box". It can be trained to recognize other languages. See Tesseract Training for more information.
Tesseract supports various output formats: plain text, hOCR (HTML), and PDF.
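As a rough illustration of those output formats, here is a minimal Python sketch that builds and runs a `tesseract` command line. It assumes the `tesseract` binary is on your PATH and the requested language data is installed; the helper names and the file names `scan.png` / `scan` are made up for the example.

```python
import subprocess
from typing import List, Sequence

# Output configs shipped with Tesseract: plain text, hOCR, and PDF.
KNOWN_FORMATS = {"txt", "hocr", "pdf"}


def tesseract_command(
    image: str,
    output_base: str,
    lang: str = "eng",
    formats: Sequence[str] = ("txt",),
) -> List[str]:
    """Build the argument list: tesseract IMAGE OUTPUTBASE -l LANG CONFIG..."""
    unknown = set(formats) - KNOWN_FORMATS
    if unknown:
        raise ValueError(f"unsupported output format(s): {sorted(unknown)}")
    return ["tesseract", image, output_base, "-l", lang, *formats]


def run_ocr(image: str, output_base: str, **kwargs) -> None:
    """Run Tesseract; raises CalledProcessError if it exits with an error."""
    subprocess.run(tesseract_command(image, output_base, **kwargs), check=True)


if __name__ == "__main__":
    # Tesseract appends the extension itself: scan.hocr and scan.pdf.
    run_ocr("scan.png", "scan", formats=("hocr", "pdf"))
```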
This project does not include a GUI application. If you need one, please see the 3rdParty wiki page.
Note that in many cases you'll need to improve the quality of the image you give Tesseract in order to get better OCR results.
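One preprocessing step often tried for this is binarization, i.e. turning a grayscale scan into pure black and white. Below is a minimal pure-Python sketch using Otsu's method to pick the threshold. It is an illustration under simplifying assumptions (the image is already a list of rows of 0-255 grayscale values, and the function names are made up), not what Tesseract itself requires; in practice you would more likely use an imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale pixel values, 0 (black) to 255 (white)


def otsu_threshold(image: Gray) -> int:
    """Return the threshold that maximizes the between-class variance."""
    # Histogram of pixel intensities.
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        # Pixels with value <= t form the "dark" class.
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue
        w_light = total - w_dark
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image: Gray) -> Gray:
    """Map pixels above the Otsu threshold to white, the rest to black."""
    t = otsu_threshold(image)
    return [[255 if value > t else 0 for value in row] for row in image]
```

A global threshold like this works best on evenly lit scans; photos with shadows usually need an adaptive (local) threshold instead.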
Version 3.10 of the legendary programming language is now here: https://www.python.org/downloads/release/python-3100
No rush to update, though. #Python
#Python is the main language of data science, per this analysis on 10M Jupyter Notebooks: https://blog.jetbrains.com/datalore/2020/12/17/we-downloaded-10-000-000-jupyter-notebooks-from-github-this-is-what-we-learned/
https://simpleisbetterthancomplex.com/2015/11/23/small-open-source-django-projects-to-get-started.html
Small Open-Source Django Projects to Get Started
Learning #Django and #Python can be very fun. I personally love programming with Python and for the most part, work with the Django framework. But in the beginning some stuff can be confusing, especially if you are coming from a Java or C♯ background, like me.