Make an old PHP website static
At iMAL we have a couple of old websites running an old version of Wordpress. Because:
- there's no point in keeping those as dynamic websites
- these old Wordpress installs don't work well on recent PHP versions
- old Wordpress installs are insecure
it seems like a good idea to turn those into static HTML pages.
The workflow needs to be improved, but here's how I'm doing it now:
PHP4
The first thing is to make a local copy of the website files & DB, and to run them on a local Apache + MySQL + PHP server supporting PHP4. I haven't found an easy *nix way to do it, but on Windows you can install XAMPP 1.6.8: https://sourceforge.net/projects/xampp/files/XAMPP%20Windows/1.6.8/
XAMPP (and its cousins WAMP and MAMP) tend to run a DB without a password, so you might need to change the configuration of your website/CMS accordingly.
In the specific case of this Wordpress install, the home and/or siteurl were stored somewhere, so loading the website from the local server would redirect me online. I fixed this by defining home and siteurl either directly in the DB (in the _options table) or through
update_option( 'siteurl', 'http://localhost/mylocalfolder' );
update_option( 'home', 'http://localhost/mylocalfolder' );
in the functions.php of the theme.
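For the "directly in the DB" route, here is a minimal sketch that only prints the SQL statement, so you can review it before running it. It assumes the default Wordpress layout (an options table with option_name and option_value columns); the helper name wp_local_url_sql and the wp_ table prefix are illustrative assumptions, so check the prefix your install actually uses.

```shell
# Hypothetical helper: print the SQL that points a Wordpress install at a local URL.
# $1 = table prefix (assumed "wp_" here), $2 = local base URL.
wp_local_url_sql() {
  prefix="$1"
  url="$2"
  printf "UPDATE %soptions SET option_value = '%s' WHERE option_name IN ('siteurl', 'home');\n" "$prefix" "$url"
}

wp_local_url_sql wp_ 'http://localhost/mylocalfolder'
# prints: UPDATE wp_options SET option_value = 'http://localhost/mylocalfolder' WHERE option_name IN ('siteurl', 'home');
```

The printed statement can be pasted into phpMyAdmin or piped into the mysql command-line client against your local copy of the database.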
Downloading a static copy from the local server
To download the various pages of the website as HTML documents, I used wget (via Bash on Ubuntu on Windows, which lets you run a Bash environment on Windows 10). HTTrack is also sometimes recommended, but I didn't get it to work.
wget -E -r -l 10 -p -N -F --restrict-file-names=windows -nH 'http://localhost/myoldwordpress/?page_id=0'
I explicitly pointed it to the url of the first page (the page displayed at http://localhost/myoldwordpress/), just in case.
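The short flags are hard to read at a glance, so here is the same wget call spelled out with long options, wrapped in a function. This is only a sketch: the function name mirror_site is made up, and --adjust-extension is the name used by recent wget releases (older ones call it --html-extension).

```shell
# Hypothetical wrapper around the wget call above, with long option names.
#   --adjust-extension            (-E)  save HTML pages with a .html suffix
#   --recursive --level=10        (-r -l 10)  follow links up to 10 levels deep
#   --page-requisites             (-p)  also fetch images, CSS and scripts
#   --timestamping                (-N)  skip files that are not newer than the local copy
#   --force-html                  (-F)  kept for parity; only matters with an input file
#   --restrict-file-names=windows       escape characters Windows does not allow
#   --no-host-directories         (-nH) do not create a "localhost/" folder
mirror_site() {
  wget \
    --adjust-extension \
    --recursive --level=10 \
    --page-requisites \
    --timestamping \
    --force-html \
    --restrict-file-names=windows \
    --no-host-directories \
    "$1"
}

# Usage (quote the URL so the shell leaves the "?" alone):
#   mirror_site 'http://localhost/myoldwordpress/?page_id=0'
```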
In the case of this Wordpress, it got me a list of html files like this:
Making the urls relative
This Wordpress was using absolute urls, so on my local server all urls were written as "http://localhost/myoldwordpress/*".
wget has a mode to rewrite urls in the html files it creates (--convert-links), but I didn't get it to work. Instead, I just ran a search & replace on all these html files (using Notepad++ in my case) to make the urls relative ("blabla.html" instead of "http://localhost/myoldwordpress/blabla.html").
Or the command-line way:
find /staticsitefolder -type f -name '*.html' -print0 | xargs -0 sed -i 's|http://localhost/myoldwordpress/||g'
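Before running a search & replace over the whole site, it can help to try it on a throwaway folder. The sketch below does that; the function name make_urls_relative and the sample link are made up for illustration, and it writes through a temporary file rather than relying on sed -i, which behaves differently between GNU and BSD sed.

```shell
# Hypothetical helper: strip an absolute base URL from every .html file in a folder.
# Caveat: the base is used as a sed pattern, so a "." in it matches any character.
make_urls_relative() {
  dir="$1"
  base="$2"
  find "$dir" -type f -name '*.html' | while IFS= read -r file; do
    # Write to a temporary file first, then replace the original.
    sed "s|$base||g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  done
}

# Demo on a temporary folder containing one made-up page.
demo=$(mktemp -d)
printf '%s\n' '<a href="http://localhost/myoldwordpress/?page_id=2">About</a>' > "$demo/page_id=0.html"
make_urls_relative "$demo" 'http://localhost/myoldwordpress/'
cat "$demo/page_id=0.html"
# prints: <a href="?page_id=2">About</a>
```

Note that the link still carries its query string (?page_id=2) after the replacement, so the server still has to map such requests to html files.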
Redirecting requests to the html documents
So, links from the menus or from the outside point to addresses like http://localhost/myoldwordpress/?page_id=0.
We want to redirect these requests to page_id=0.html. We can do that with a RewriteRule; some references:
- https://www.digitalocean.com/community/tutorials/how-to-set-up-mod_rewrite
- http://httpd.apache.org/docs/current/rewrite/
- https://www.addedbytes.com/articles/for-beginners/url-rewriting-for-begi...
Here's what's working for me now (this can surely be improved):
DirectoryIndex page_id=0.html
Options -Indexes
RewriteEngine On
RewriteBase /myoldwordpress/
RewriteCond %{QUERY_STRING} ^(.*)=(.*)$ [NC]
RewriteRule ^(.*) %1=%2.html [NC,L]
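Because both captures are greedy, %1=%2 is simply the whole query string, so a request for ?page_id=0 is served from page_id=0.html. One way to sanity-check the static copy is to confirm that every such link has a matching file. The sketch below assumes the html files sit in one folder and are named after the query string, as the rule implies; the function name check_static_links is made up, and links containing &amp; would need extra handling.

```shell
# Hypothetical check: every href="?key=value" link should have a key=value.html file.
check_static_links() {
  dir="$1"
  grep -oh 'href="?[^"]*"' "$dir"/*.html | sort -u | while IFS= read -r link; do
    query=${link#href=\"\?}   # drop the leading href="?
    query=${query%\"}         # drop the trailing quote
    [ -f "$dir/$query.html" ] || echo "missing: $query.html"
  done
}

# Demo on a temporary folder with two made-up pages.
demo=$(mktemp -d)
printf '%s\n' '<a href="?page_id=2">About</a> <a href="?page_id=3">Contact</a>' > "$demo/page_id=0.html"
printf '%s\n' '<a href="?page_id=0">Home</a>' > "$demo/page_id=2.html"
check_static_links "$demo"
# prints: missing: page_id=3.html
```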
Cleaning
If you wish, after turning your site static, you can remove the PHP files.
List them first with
find . -name "*.php" -type f
Check that the list is correct, and delete them with
find . -name "*.php" -type f -delete
but please USE THIS WITH CARE (and first archive your whole site, including the PHP files, somewhere safe!)
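The archive-then-delete order can be tied together so the deletion only happens if the archive was written. A minimal sketch, tried on a temporary folder; the function name archive_then_delete_php and the file names are made up, and -delete needs GNU or BSD find.

```shell
# Hypothetical helper: archive the whole site, then delete its PHP files.
# $1 = site folder, $2 = path of the .tar.gz archive to create.
archive_then_delete_php() {
  site="$1"
  archive="$2"
  # Stop here if the archive cannot be written, so nothing gets deleted.
  tar -czf "$archive" -C "$(dirname "$site")" "$(basename "$site")" || return 1
  find "$site" -type f -name '*.php' -delete
}

# Demo on a temporary folder with one PHP file and one static page.
work=$(mktemp -d)
mkdir "$work/site"
touch "$work/site/index.php" "$work/site/page_id=0.html"
archive_then_delete_php "$work/site" "$work/site-backup.tar.gz"
ls "$work/site"
# prints: page_id=0.html
```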
Info
Difficulty: ●●●●○
Last updated: May 2017