Free Republic
Browse · Search
General/Chat
Topics · Post Article

To: Theo

This is not my site. But, again, the webmaster has not expressed any objection to saving the site's pages for personal use, and he HAS been involved in the thread about the shutdown. Multiple members have been saving individual pages with copy/paste, etc. But even for a small forum, that's not a practical way to save the whole thing or preserve any structure.

It’s ALMOST as if I heard FR was shutting down and I wanted to archive it, except that FR is MUCH, MUCH bigger. (Probably 10,000x if I had to guess.)

SFAIK it’s not a WordPress type site, but I am not sure. It IS a forum.


10 posted on 06/16/2024 5:32:01 PM PDT by Paul R. (Bin Laden wanted Obama killed so the incompetent VP, Biden, would become President!)


To: Paul R.; Theo

If it's a forum, there's a 99.9% chance it has a database for storing posts. Most any website scraper/downloader will just grab the HTML that a browser gets from that and save it as HTML pages, plus images.

I'm on Linux/Ubuntu and I've used httrack for that but, like someone mentioned about a different tool, you have to be careful what depth of links you grab. Might need 2-3 depending on how the site works. If the forum allows embedded YouTube videos it could get big unless you figure out how to filter that out.
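To make the link-depth point concrete, here is a rough Python sketch of a depth-limited crawl that stays on one host. It is not how httrack works internally; the names `LinkParser`, `crawl`, and `max_depth` are made up for illustration, and `fetch` is assumed to be any function you supply that turns a URL into HTML text (or None on failure). It only follows ordinary `<a href>` links, so it is an approximation of the idea, not a full site copier.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags on one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl limited to start_url's host and max_depth hops.

    fetch: hypothetical callable mapping a URL to HTML text, or None.
    Returns {url: html} for every page that was fetched successfully.
    """
    host = urlparse(start_url).netloc
    saved = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        saved[url] = html
        if depth == max_depth:
            continue  # depth limit reached: keep the page, ignore its links
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc != host:
                continue  # skip links to other hosts (e.g. YouTube)
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return saved
```

With `max_depth=2` the crawl keeps the start page, the pages it links to, and the pages those link to, then stops. A forum with long paginated threads may need a larger value, which is where the size can balloon.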

Like everything tech, the answer is: it depends on some pesky things like variables.

Plug the url into builtwith.com and you might get an idea what it’s ‘built with’. At any rate, you’re only going to be able to scrape the html pages that get rendered for the browser.


11 posted on 06/16/2024 5:46:13 PM PDT by Pollard (Will work for high tunnel money!)

To: Paul R.

Go to the URL of your site and add “/wp-admin” to the end. For example, if your site is www.example.com, you would go to www.example.com/wp-admin. If you get a login page, it is a WordPress site. If it is, then there are WordPress plugins that do full backups.
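The same check can be scripted. This is only a rough Python sketch using the standard library: the function names are made up for illustration, and it assumes the site lives at the root of its domain and that a stock WordPress install sends /wp-admin visitors to a page that mentions wp-login.php. A "no" answer is not conclusive, since some sites hide or move their login page.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def wp_admin_url(site):
    """Build the /wp-admin URL for a site given with or without a scheme."""
    if "://" not in site:
        site = "https://" + site
    parts = urlparse(site)
    return f"{parts.scheme}://{parts.netloc}/wp-admin"


def looks_like_wordpress_login(final_url, body):
    """Heuristic: the WordPress login page is served by wp-login.php."""
    return "wp-login.php" in final_url or "wp-login.php" in body


def check_site(site, timeout=10):
    """Fetch /wp-admin and report whether it looks like a WordPress login."""
    request = urllib.request.Request(
        wp_admin_url(site),
        headers={"User-Agent": "wp-check-sketch"},  # arbitrary label
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Read only the first 64 KiB; the login form appears early.
            body = response.read(65536).decode("utf-8", errors="replace")
            return looks_like_wordpress_login(response.geturl(), body)
    except (urllib.error.URLError, OSError):
        # 404s, refused connections, timeouts: treat as "not detected".
        return False
```

Calling `check_site("www.example.com")` makes one request to https://www.example.com/wp-admin and returns True or False.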


13 posted on 06/16/2024 5:59:21 PM PDT by bankwalker (Repeal the 19th ...)

To: Paul R.

Paul,

I started downloading the site, and after successfully downloading a dozen or so pages, the server blocked my access with a “Code 401 (forbidden).”

The server admin / webmaster would have to tweak the server’s security settings to allow the site to be downloaded by something like SiteSucker.


23 posted on 06/17/2024 8:56:15 AM PDT by Theo (FReeping since 1997 ... drain the swamp.)



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson