Scraping a site that isn't accessible to find old files

Posted on 12/8/23 at 8:40 am
Posted by Dam Guide
Member since Sep 2005
16268 posts
Long shot, but I'm trying to see if anyone has done this before. I have a site I'm trying to get files off of, but the site has been down for a few years. I believe this is due to a PHP upgrade, with the site built on deprecated PHP software. By going to the Wayback Machine and finding old snapshots of the site, I've found that the picture and PDF links still work, and all of those files still seem to be accessible through direct links. The download links for the files I want, though, go through PHP commands that no longer work, and they don't point to the file structure where the files actually are. I'm trying to figure out a way to scrape the site to see if I can find all of these files and get them. Does anyone have experience with this?
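One way to approach the "find every old link" part is to ask the Wayback Machine's CDX index for every URL it ever captured under the domain, then keep the ones that look like files. The following is only a rough sketch under assumptions: `example.com` stands in for the real site (which isn't named here), the helper names are made up for illustration, and the extension list is a guess at what is worth keeping.

```python
import json
from urllib.parse import urlencode, urlsplit
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query listing captures whose URL starts with the domain."""
    params = {
        "url": domain + "/",
        "matchType": "prefix",   # everything under this prefix
        "output": "json",        # rows come back as a JSON array of arrays
        "fl": "original,timestamp,mimetype,statuscode",
        "collapse": "urlkey",    # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def rows_to_records(raw):
    """Turn the CDX JSON (header row first) into a list of dicts."""
    if not raw:
        return []
    header, *rows = raw
    return [dict(zip(header, row)) for row in rows]


def filter_by_extension(records, extensions):
    """Keep captures whose URL path ends in one of the given extensions."""
    wanted = tuple(ext.lower() for ext in extensions)
    return [
        rec for rec in records
        if urlsplit(rec["original"]).path.lower().endswith(wanted)
    ]


def wayback_raw_url(record):
    """URL of the archived copy; the id_ suffix asks for the unmodified file."""
    return "https://web.archive.org/web/{}id_/{}".format(
        record["timestamp"], record["original"]
    )


def fetch_captures(domain):
    """Query the CDX index over the network (not called automatically)."""
    with urlopen(build_cdx_query(domain), timeout=30) as resp:
        return rows_to_records(json.load(resp))
```

Something like `filter_by_extension(fetch_captures("example.com"), [".pdf", ".zip", ".jpg"])` would then give a list of candidate links to try against the live server, with `wayback_raw_url` as a fallback for anything the archive itself saved. It only finds URLs the Wayback Machine crawled, so files that were never linked publicly will not show up.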
Posted by Fat Batman
Gotham City, NJ
Member since Oct 2019
1555 posts
Posted on 12/8/23 at 10:26 am to
I would imagine that if the site is down, then the server is as well, in which case your requests to that address to download those files won't work. What response do you get from clicking those links, or what is an example link?
Posted by Dam Guide
Member since Sep 2005
16268 posts
Posted on 12/8/23 at 12:19 pm to
quote:

i would imagine if the site is down then the server is as well. in which case your calls to the address to download those files won't work. what is the response from clicking those links or what is a link?


The server is very much still up. I think this is a case where the site owner is using a shared server host that updated PHP on them, and either they don’t know how to update their repository software or there isn’t an update that works with the newer version of PHP. All the direct links to PDFs and pictures that were on the site still work. What I want to find are some zip files stored in directories that aren’t public-facing and that were downloaded through PHP, but the site isn’t working correctly, and I can’t even grab the PHP download commands because the whole site is returning 500 errors.
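Since the live site is returning 500 errors, the old PHP download commands can only be recovered from archived copies of the links. A small sketch of pulling whatever hints those links carry in their query strings is below. It assumes you already have a list of archived URLs (for instance from the Wayback Machine), and the parameter names `file` and `id` in the usage note are purely hypothetical; the real site's links may look nothing like them.

```python
from urllib.parse import parse_qs, urlsplit


def php_link_hints(urls):
    """Collect the script path and query parameters of archived PHP links."""
    hints = []
    for url in urls:
        parts = urlsplit(url)
        # Only look at links that go through a PHP script.
        if not parts.path.lower().endswith(".php"):
            continue
        params = parse_qs(parts.query)  # each value is a list of strings
        if params:
            hints.append({"script": parts.path, "params": params})
    return hints
```

For a made-up link such as `http://example.com/download.php?file=maps.zip&id=7`, this returns the script path `/download.php` plus the `file` and `id` values. A file name recovered this way does not guarantee a working direct link, since the zips are described as sitting in directories that aren't public-facing, but it at least shows what to ask the site owner or host for.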