
Post #1239

@libreware

Libreware

Views: 4,540
Posted: 02/27/2024, 04:46 AM
Post content

ArchiveBox
Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more…

https://archivebox.io
https://github.com/ArchiveBox/ArchiveBox

ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline.

Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and it can't save every type of content.

ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data. It can be used to save copies of bookmarks, preserve evidence for legal cases, back up photos from FB/Insta/Flickr or media from YT/Soundcloud/etc., save research papers, and more…

Once installed, it can be used as a CLI tool, self-hosted Web App, Python library, or one-off command.

It saves snapshots of the URLs you feed it in several redundant formats. It also detects any content featured inside pages & extracts it into a folder:

🌐 HTML/Any websites ➡️ original HTML+CSS+JS, singlefile HTML, screenshot PNG, PDF, WARC, title, article text, favicon, headers, …
🎥 Social Media/News ➡️ post content TXT, comments, title, author, images, …
🎬 YouTube/SoundCloud/etc. ➡️ MP3/MP4s, subtitles, metadata, thumbnail, …
💾 GitHub/GitLab/etc. links ➡️ clone of Git source code, README, images, …
✨ and more, see Output Formats below…
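
To make the "CLI tool" mode a bit more concrete, here is a rough Python sketch that shells out to the `archivebox add` command for a single URL. It is not taken from the post: the helper names `build_add_command` and `archive_url` are made up for illustration, and it assumes ArchiveBox is already installed and that `data_dir` is a folder previously set up with `archivebox init`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_add_command(url: str, depth: int = 0) -> List[str]:
    """Build the argv for `archivebox add` (hypothetical helper).

    depth=0 archives only the given URL; depth=1 also archives
    the pages it links to.
    """
    if depth not in (0, 1):
        raise ValueError("depth must be 0 or 1")
    cmd = ["archivebox", "add"]
    if depth:
        cmd.append(f"--depth={depth}")
    cmd.append(url)
    return cmd


def archive_url(url: str, data_dir: str, depth: int = 0) -> Optional[int]:
    """Run `archivebox add` inside data_dir (hypothetical helper).

    Returns the process exit code, or None when the `archivebox`
    executable is not found on PATH.
    """
    if shutil.which("archivebox") is None:
        return None
    result = subprocess.run(build_add_command(url, depth), cwd=data_dir, check=False)
    return result.returncode
```

Calling `archive_url("https://example.com", "/path/to/archive")` would then add one snapshot to that collection, or return `None` on a machine without ArchiveBox.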