TGTGInsighttelegram intelligenceLIVE / telegram public index
← GitHub Trends

TGINSIGHT SIMILAR POSTS

Find similar content

Source channel @githubtrending · Post #15163 · Sep 24

#python#document_analysis#layout_analysis#ocr#parser#pdf#pdf_converter#pdf_parser#python#vlm_ocr Dolphin is a smart AI tool that can analyze and understand complex document images, like pages with text, tables, formulas, and pictures. It works in two steps: first, it figures out the layout and reading order of the page; then, it quickly parses each element using special prompts. This makes it fast and accurate for turning document images into structured data like JSON or Markdown. You can use pre-trained models and easy code to process single pages, PDFs, or specific elements. This helps you save time and effort when extracting information from complicated documents efficiently. https://github.com/bytedance/Dolphin

Results

1 similar post found

Search: #pdfanalysis

当前筛选 #pdfanalysis清除筛选
Libreware

@libreware · Post #1330 · 09/05/2024, 08:53 PM

Interactive PDF Analysis (also called IPA) allows any researcher to explore the inner details of any PDF file. PDF files may be used to carry malicious payloads that exploit vulnerabilities, and issues of PDF viewer, or may be used in phishing campaigns as social engineering artefacts. The goal of this software is to let any analyst go deep on its own the PDF file. Via IPA, you may extract important payload from PDF files, understand the relationship across objects, and infer elements that may be helpful for triage of malicious or untrusted payloads. IPA/README.md at main · seekbytes/IPA · GitHub #PDF#PDFanalysis#Malware#Security