How To Export a WordPress Blog to Text Files

The next couple of posts will be about some behind-the-scenes work I've been doing on the blog. These are mainly a way for me to record what I did for my own reference, but others may also find it interesting.

This all started because of a brute-force attack on the Walter and Ina site. Regular readers will recall that I transcribed a trove of love letters my grandparents wrote to each other during their long relationship, and that site is the result. Since the letters are all online now and the hard copies are archived at Johns Hopkins University with the rest of my grandfather's papers, I haven't been - and won't be - adding anything else to the site. Unfortunately, I still had to keep updating the WordPress installation on it to keep it secure. That seemed silly. Worse, some hacker's script happened upon the site's login page last week and began a dictionary attack to guess the administrative password. It didn't get in because I use strong passwords, but the constant reloading of the login page triggered a warning from my hosting provider that I was using too much CPU time on the server. Ugh.

What I really wanted to do was simply archive the site, keeping it online but doing away with WordPress. I found a nifty tool for it: SiteSucker. It's a $4.99 Mac application that does exactly what it says. About ten minutes after discovering it, I had it chewing away on walterandina.com. The site is pretty big, but after another 20 minutes or so I had a folder on my hard drive with the whole thing stored as static HTML and image files. Uploading those to the server, moving the old WordPress site out of the way, and making a couple of changes to the .htaccess file yielded a perfect copy of the old site, but in static HTML and CSS - the closest thing the internet has to an archival format. I'll never need to update WordPress on that site again, the pages load an order of magnitude faster, and there's no login page to hack.

The Googling that led me to SiteSucker also put me on the trail of some other tools, and eventually I discovered that I could export my other WordPress blogs' posts and pages into a neat pile of plain text files. That sounded great.

Obviously there are lots of ways to export a blog, including the built-in "Export" tool in WordPress, but those are aimed at moving your content to another blogging engine. What I wanted - what I've wanted for quite some time, actually - was an easy way to access all of my blog posts outside of WordPress. All of the articles I've written professionally are on my hard drive for me to search and reopen whenever I like, and I wanted my blog archive to live in that environment, too.

After a few false starts, here's the procedure I finally discovered.

  1. Download the Jekyll-export plugin for WordPress, then install and activate it on your blog.

  2. Select Export to Jekyll from the Tools menu in your dashboard. Don't worry about who Jekyll is; we're just going to use his exporter.

  3. Wait. It may take quite a while for the exporter to process your whole blog, depending on how many posts you have. Mine took almost half an hour. At the end of the process you'll be prompted to download a file. Open it, and your posts and pages should all be in there, with file names based on the dates of the posts.
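
Once the download is unpacked, a short script can confirm that everything made it across. The following is only a rough Python sketch, not part of the exporter. It assumes the post files follow Jekyll's usual YYYY-MM-DD-title.md naming, and the folder name jekyll-export is a hypothetical stand-in for wherever the download was unpacked.

```python
import re
from datetime import date
from pathlib import Path

# Assumed file name pattern: YYYY-MM-DD-slug.md (Jekyll's convention for posts).
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.md$")


def parse_post_name(filename):
    """Return (date, slug) for a dated post file name, or None if it doesn't match."""
    match = POST_NAME.match(filename)
    if match is None:
        return None
    year, month, day, slug = match.groups()
    try:
        return date(int(year), int(month), int(day)), slug
    except ValueError:
        # Digits were present but did not form a real calendar date.
        return None


def list_posts(folder):
    """Return (date, slug, path) for every dated .md file under folder, oldest first."""
    posts = []
    for path in Path(folder).rglob("*.md"):
        parsed = parse_post_name(path.name)
        if parsed is not None:
            posts.append((parsed[0], parsed[1], path))
    return sorted(posts, key=lambda post: (post[0], post[1]))


if __name__ == "__main__":
    # "jekyll-export" is a hypothetical folder name; change it to match your own.
    for posted, slug, path in list_posts("jekyll-export"):
        print(posted.isoformat(), slug)
```

Files without a date in the name, which is how pages would typically appear, are skipped, so the output is a chronological list of posts only.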

You'll notice the files are all in .md format. That's Markdown, and if you're not familiar with it you should go read the description. It's an excellent way to write. Tell your operating system to open all .md files with your favorite text editor. One of the many great things about Markdown is that it's made of plain, human-readable text.

The files all have headers at the top that are designed to work with the Jekyll static site generator. I'll talk about static site generators in another post.
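
To read or search just the text of a post, the header can be peeled off first. Here is a minimal Python sketch, assuming the header is the block Jekyll calls front matter: lines of "key: value" settings sitting between two lines of three hyphens at the very top of the file. The function name is made up for illustration, and the parsing is deliberately crude. It only understands simple one-line settings, so a real YAML parser would be the safer choice for anything more involved.

```python
def split_front_matter(text):
    """Split a post into (header, body). Returns ({}, text) if no header is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text

    # Look for the second "---" line, which closes the header.
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening marker without a closing one

    header = {}
    for line in lines[1:end]:
        # Only simple "key: value" lines are handled; indented lines,
        # list items, and comments are skipped.
        if line[:1] in (" ", "\t", "-", "#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        header[key.strip()] = value.strip().strip("'\"")

    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return header, body


if __name__ == "__main__":
    sample = "---\ntitle: 'Hello World'\nlayout: post\n---\n\nFirst paragraph.\n"
    header, body = split_front_matter(sample)
    print(header.get("title"))
    print(body)
```

The sample post in the last few lines is invented; the fields in a real export may differ.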
