Scraping a site that isn't accessible to find old files
Posted on 12/8/23 at 8:40 am
Long shot, but I'm trying to see if anyone has done this before. I have a site I am trying to get files off of, but the site has been down for a few years. I believe this is due to a PHP upgrade on the host and the site being built on deprecated PHP software. Going through the Wayback Machine and looking at old snapshots of the site, I have found that the picture and PDF links still work, and all of those files still seem to be accessible through direct links. The download links for the files I want, though, go through PHP scripts that no longer work, so they don't show me where in the file structure the files actually live. I'm trying to figure out a way to scrape the site to find all of these files and get them. Does anyone have experience with this?
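One way to start on the idea above is to ask the Wayback Machine for every URL it ever captured under the domain and filter for file extensions of interest. The following is a minimal sketch, assuming the Internet Archive's public CDX search endpoint; `example.com` is a placeholder for the real site, and the function names and the extension list are hypothetical choices for illustration. It only finds files whose direct URLs were archived at some point, so it may well miss files that were only ever served through the PHP download script.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query URL listing every captured URL under a domain."""
    params = {
        "url": domain,
        "matchType": "domain",   # include subdomains and all paths
        "output": "json",        # first row of the response is a header row
        "fl": "original,mimetype,statuscode",
        "collapse": "urlkey",    # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(payload, extensions=(".zip", ".pdf")):
    """Return the sorted, de-duplicated original URLs ending in the given extensions."""
    rows = json.loads(payload) if payload.strip() else []
    if not rows:
        return []
    header, records = rows[0], rows[1:]
    url_index = header.index("original")
    found = set()
    for record in records:
        url = record[url_index]
        path = url.split("?", 1)[0].lower()  # ignore any query string
        if path.endswith(extensions):
            found.add(url)
    return sorted(found)


def list_archived_files(domain):
    """Fetch the capture list for a domain (network call) and filter it."""
    with urlopen(build_cdx_query(domain), timeout=30) as response:
        return parse_cdx_rows(response.read().decode("utf-8"))
```

Calling `list_archived_files("example.com")` with the real domain would return candidate direct links, which could then be tried against the live server since the direct file links are reported to still work.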
Posted on 12/8/23 at 10:26 am to Dam Guide
I would imagine that if the site is down, then the server is as well, in which case your calls to the address to download those files won't work. What response do you get from clicking those links, or can you post an example link?
Posted on 12/8/23 at 12:19 pm to Fat Batman
quote:
I would imagine that if the site is down, then the server is as well, in which case your calls to the address to download those files won't work. What response do you get from clicking those links, or can you post an example link?
Server is very much still up. I think this is a case where the site owner is on a shared host that upgraded PHP on them, and either they don't know how to update their repository software or there isn't an update that works with the newer version of PHP. All the direct links to PDFs and pictures that were on the site still work. The zip files I want are stored somewhere in directories that aren't public facing and are downloaded through PHP, but the site isn't working correctly, and I can't even grab the PHP download commands because the whole site is returning 500 errors.
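Since the live pages return 500 errors, the old PHP download commands could instead be read out of archived copies of the pages. This is a rough sketch, assuming you have saved the HTML of a Wayback Machine snapshot locally; the class and function names and the list of file extensions are hypothetical. It separates links that point straight at a file from links that go through a `.php` script and shows the query parameters the script was called with. Those parameters typically identify the download to the script rather than reveal the file's path on disk, so this is a starting point for guessing, not a guaranteed route to the zip files.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

# Extensions treated as "direct" file links (an illustrative assumption).
FILE_EXTENSIONS = (".zip", ".pdf", ".jpg", ".jpeg", ".png", ".gif")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def classify_links(html):
    """Split links into direct file links and PHP-mediated links.

    Returns (direct, via_php), where via_php is a list of
    (href, query_parameters) tuples.
    """
    collector = LinkCollector()
    collector.feed(html)
    direct, via_php = [], []
    for href in collector.links:
        parsed = urlparse(href)
        path = parsed.path.lower()
        if path.endswith(".php"):
            via_php.append((href, parse_qs(parsed.query)))
        elif path.endswith(FILE_EXTENSIONS):
            direct.append(href)
    return direct, via_php
```

Running `classify_links` over each saved snapshot would give a list of direct links to test against the live server, plus the script names and IDs that the broken download pages used.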
