WordPress Static Site Generator – Categories, Tags, Feeds, Sitemap

This is a continuation of “WordPress Static Site Generator – The PHP Script for Pages and Posts”.

If you don’t want categories or tags, you can skip them. Both sections are almost the same.

The two feeds are static like everything else, and their file names differ a little from the usual WordPress ones. They work whether or not they’re redirected to FeedBurner.

Categories and Tags

As I mentioned in the previous article, categories and tags use a dash instead of a trailing slash. The plugin that adds an extension only does so for pages, so I have to add it manually here. That’s why you see “$ext” in the loops.
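To make the naming scheme concrete, here is a minimal sketch of how a category or tag file name is assembled. The helper term_filename() is hypothetical (the script simply concatenates the pieces inline), and the '.html' value for $ext is an assumption based on the post links used elsewhere in the script.

```php
<?php
// Hypothetical helper, not part of the original script: builds the flat
// file name used for a category or tag page.
function term_filename(string $type, string $slug, string $ext): string
{
    // Type, a dash, the term slug, then the extension added by hand.
    return $type . '-' . $slug . $ext;
}

$ext = '.html'; // assumed value; the real one is set earlier in the script

echo term_filename('category', 'linux', $ext), "\n";   // category-linux.html
echo term_filename('tag', 'static-site', $ext), "\n";  // tag-static-site.html
```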

Here’s the category section:

$categories = get_terms(array('taxonomy' => 'category', 'orderby' => 'name', 'hide_empty' => true));
foreach ($categories as $cat) {
  echo '.'; // progress indicator continues
  $posts_array = get_posts(array('posts_per_page' => -1, 'category_name' => $cat->slug, 'orderby' => 'date', 'order' => 'DESC', 'post_type' => 'post', 'post_status' => 'publish'));
  $new_content  = '<article class="group page type-page status-publish hentry">' . "\n";
  $new_content .= '<div class="entry themeform">' . "\n";
  $new_content .= 'This is a list of all the articles filed under the "' . $cat->name . '" category, in reverse date order.' . "\n";
  $new_content .= '<ul>';
  foreach ($posts_array as $post) {
    $a = explode(' ', $post->post_date);
    $new_content .= '<li><a href="' . $real_site_url . $post->post_name . '.html">' . $post->post_title . '</a> (' . $a[0] . ')</li>';
  }
  $new_content .= '</ul>' . "\n";
  $new_content .= '</div>' . "\n";
  $new_content .= '</article>';
  $page = $saved; // (based on about page)
  $page = str_replace($site_url, $real_site_url, $page);
  $page = str_replace('<title>About RTCXpression</title>', '<title>Category: ' . $cat->name . ' - RTCXpression</title>', $page);
  $page = str_replace($real_site_url . 'about.html', $real_site_url . 'category-' . $cat->slug . $ext, $page);
  $page = str_replace('<h1>About RTCXpression</h1>', '<h1>Category: ' . $cat->name . '</h1>', $page);
  $page = my_str_replace('<article class', '</article>', $new_content, $page);
  $temp = explode("\n", $page);
  $page = '';
  foreach ($temp as $line) {
    if (strstr($line, '<meta name="desc')) continue;
    if (strstr($line, '<meta property')) continue;
    if (strstr($line, '>About</a>')) $line = '<li id="menu-item-3160" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-3160"><a href="https://www.rtcx.net/about.html">About</a></li>';
    if (stristr($line, '<link rel="canonical"')) $line = '<meta name="robots" content="noindex,follow">' . "\n" . $line;
    $page .= $line . "\n";
  }
  $page = trim($page) . "\n";
  $file = $pagesdir . '/category-' . $cat->slug . $ext;
  if ($pagesdir == $master) {
    my_file_put_contents($file, $page, $user, $group);
  } else {
    if (file_exists($master . '/category-' . $cat->slug . $ext)) {
      $master_page = file_get_contents($master . '/category-' . $cat->slug . $ext);
    } else {
      $master_page = '';
    }
    if ($page != $master_page) {  // compare new file (in memory) to original file
      my_file_put_contents($new . '/category-' . $cat->slug . $ext, $page, $user, $group);
    }
  }
}
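The tail of that loop either writes straight into the master directory or, when building elsewhere, only writes a file that differs from the master copy. The same pattern is easier to follow in isolation, so here is a rough sketch. The function write_if_changed() is hypothetical, and it uses plain file_put_contents() where the script calls my_file_put_contents() (defined in the previous article), so the $user and $group ownership handling is left out. The meaning of the three directory arguments is inferred from how the script uses $pagesdir, $master and $new.

```php
<?php
// Hypothetical helper, not in the original script. Ownership handling
// ($user, $group) is deliberately omitted in this sketch.
function write_if_changed(string $name, string $content, string $pagesdir, string $master, string $new): bool
{
    if ($pagesdir === $master) {
        // Building directly into the master directory: always write.
        return file_put_contents($pagesdir . '/' . $name, $content) !== false;
    }
    // Otherwise compare against the master copy; a missing file counts as empty.
    $existing = $master . '/' . $name;
    $old = file_exists($existing) ? file_get_contents($existing) : '';
    if ($content === $old) {
        return false; // unchanged, nothing written
    }
    return file_put_contents($new . '/' . $name, $content) !== false;
}

// Demo with throwaway directories (paths are made up for illustration).
$base   = sys_get_temp_dir() . '/ssg-demo-' . uniqid();
$master = $base . '/master';
$new    = $base . '/new';
mkdir($master, 0777, true);
mkdir($new, 0777, true);
file_put_contents($master . '/category-linux.html', "old\n");

var_dump(write_if_changed('category-linux.html', "old\n", $new, $master, $new)); // bool(false)
var_dump(write_if_changed('category-linux.html', "new\n", $new, $master, $new)); // bool(true)
```

With a helper along these lines, only changed pages land in the second directory, which keeps the set of files to upload small.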


Here’s the tag section:

$tags = get_terms(array('taxonomy' => 'post_tag', 'orderby' => 'name', 'hide_empty' => true));
foreach ($tags as $tag) {
  echo '.'; // progress indicator continues
  $posts_array = get_posts(array('posts_per_page' => -1, 'orderby' => 'date', 'order' => 'DESC', 'post_status' => 'publish', 'tax_query' => array(array('taxonomy' => 'post_tag', 'field' => 'slug', 'terms' => $tag->slug))));
  $new_content  = '<article class="group page type-page status-publish hentry">' . "\n";
  $new_content .= '<div class="entry themeform">' . "\n";
  $new_content .= 'This is a list of all the articles filed under the "' . $tag->name . '" tag, in reverse date order.' . "\n";
  $new_content .= '<ul>';
  foreach ($posts_array as $post) {
    $a = explode(' ', $post->post_date);
    $new_content .= '<li><a href="' . $real_site_url . $post->post_name . '.html">' . $post->post_title . '</a> (' . $a[0] . ')</li>';
  }
  $new_content .= '</ul>' . "\n";
  $new_content .= '</div>' . "\n";
  $new_content .= '</article>';
  $page = $saved; // (based on about page)
  $page = str_replace($site_url, $real_site_url, $page);
  $page = str_replace('<title>About RTCXpression</title>', '<title>Tag: ' . $tag->name . ' - RTCXpression</title>', $page);
  $page = str_replace($real_site_url . 'about.html', $real_site_url . 'tag-' . $tag->slug . $ext, $page);
  $page = str_replace('<h1>About RTCXpression</h1>', '<h1>Tag: ' . $tag->name . '</h1>', $page);
  $page = my_str_replace('<article class', '</article>', $new_content, $page);
  $temp = explode("\n", $page);
  $page = '';
  foreach ($temp as $line) {
    if (stristr($line, '<meta name="desc')) continue;
    if (stristr($line, '<meta property')) continue;
    if (stristr($line, '>About</a>')) $line = '<li id="menu-item-3160" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-3160"><a href="https://www.rtcx.net/about.html">About</a></li>';
    if (stristr($line, '<link rel="canonical"')) $line = '<meta name="robots" content="noindex,follow">' . "\n" . $line;
    $page .= $line . "\n";
  }
  $page = trim($page) . "\n";
  $file = $pagesdir . '/tag-' . $tag->slug . $ext;
  if ($pagesdir == $master) {
    my_file_put_contents($file, $page, $user, $group);
  } else {
    if (file_exists($master . '/tag-' . $tag->slug . $ext)) {
      $master_page = file_get_contents($master . '/tag-' . $tag->slug . $ext);
    } else {
      $master_page = '';
    }
    if ($page != $master_page) {  // compare new file (in memory) to original file
      my_file_put_contents($new . '/tag-' . $tag->slug . $ext, $page, $user, $group);
    }
  }
}

As with the 404 page, I remove the original meta description and the original Open Graph tags from each file, and I add a meta robots line just before the “canonical” line. Category and tag pages should be followed, but not indexed.
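The per-line filtering in both loops can be pulled out into a standalone function. This is only a sketch: filter_head_lines() is a hypothetical name, the sample markup is made up, and it uses stripos() for the matching where the script uses strstr() and stristr(). As in the loops, the robots line is placed directly ahead of the canonical link, which is kept.

```php
<?php
// Sketch only: the same per-line filtering as a reusable function.
function filter_head_lines(string $page): string
{
    $out = '';
    foreach (explode("\n", $page) as $line) {
        // Drop the meta description and Open Graph tags inherited from the template page.
        if (stripos($line, '<meta name="desc') !== false) continue;
        if (stripos($line, '<meta property') !== false) continue;
        // Put a robots directive just before the canonical link.
        if (stripos($line, '<link rel="canonical"') !== false) {
            $line = '<meta name="robots" content="noindex,follow">' . "\n" . $line;
        }
        $out .= $line . "\n";
    }
    return trim($out) . "\n";
}

// Made-up sample input for illustration.
$sample = implode("\n", [
    '<title>Category: Linux</title>',
    '<meta name="description" content="About page text">',
    '<meta property="og:title" content="About">',
    '<link rel="canonical" href="https://example.com/category-linux.html">',
]);
echo filter_head_lines($sample);
```

For this sample, the output keeps the title, then the new robots line, then the canonical link.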

These sections produce lists. You can click on the category at the top of the page, or one of the tags after the article to see what they look like.

Static Feeds

The RSS version is named “feed.rss”. The Atom version is named “feed.atom”. The only URL you need to display on your site is the RSS feed.

$queue = array();
$queue[] = 'feed.atom';
$queue[] = 'feed.rss';
foreach($queue as $slug) {
  echo "."; // progress indicator continues
  $url = $site_url . 'feed';
  if ($slug == 'feed.atom')
    $url .= '/atom';
  $feed = file_get_contents($url);
  $feed = str_replace($site_url, $real_site_url, $feed);
  $feed = str_replace(trim($site_url, '/'), trim($real_site_url, '/'), $feed);
  if ($slug == 'feed.atom') {
    $feed = str_replace($real_site_url . 'feed/atom', $real_site_url . 'feed.atom', $feed);
  } else {
    $feed = str_replace($real_site_url . 'feed', $real_site_url . 'feed.rss', $feed);
  }
  $feed = str_replace('type="application/rss+xml">', 'type="application/rss+xml"></atom:link>', $feed);
  $feed = str_replace($real_site_url . 'wp-atom.php', $real_site_url, $feed);
  if (substr($feed, 0, 4) == '1.0"') $feed = '<?xml version="' . $feed;
  $file = $pagesdir . '/' . $slug;
  if ($pagesdir == $master) {
    my_file_put_contents($file, $feed, $user, $group);
  } else {
    if (file_exists($master . '/' . $slug)) {
      $master_page = file_get_contents($master . '/' . $slug);
    } else {
      $master_page = '';
    }
    if ($feed != $master_page) {  // compare new file (in memory) to original file
      my_file_put_contents($new . '/' . $slug, $feed, $user, $group);
    }
  }
}
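The order of the replacements in the feed loop matters, so here is a small sketch of just the URL rewriting. The function name rewrite_feed_urls() and the sample host names are made up for illustration; $site_url and $real_site_url follow the script's naming, and both are assumed to end with a slash.

```php
<?php
// Sketch of the URL rewriting only; the other fix-ups in the loop are left out.
function rewrite_feed_urls(string $feed, string $site_url, string $real_site_url, string $slug): string
{
    // Swap the build host for the public host, with and without the trailing slash.
    $feed = str_replace($site_url, $real_site_url, $feed);
    $feed = str_replace(rtrim($site_url, '/'), rtrim($real_site_url, '/'), $feed);
    // Point the feed's own link at the static file name.
    if ($slug === 'feed.atom') {
        return str_replace($real_site_url . 'feed/atom', $real_site_url . 'feed.atom', $feed);
    }
    return str_replace($real_site_url . 'feed', $real_site_url . 'feed.rss', $feed);
}

// Made-up sample fragment of an RSS feed.
$rss = '<atom:link href="http://build.example/feed" rel="self" />' . "\n"
     . '<link>http://build.example</link>' . "\n";
echo rewrite_feed_urls($rss, 'http://build.example/', 'https://www.example.com/', 'feed.rss');
```

Note that the plain 'feed' replacement matches any URL that starts with that prefix, so a post whose slug begins with “feed” would be rewritten as well.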

The Sitemap

The sitemap only lists pages and posts.

echo "."; // progress indicator continues
date_default_timezone_set($timezone);
$today = date('Y-m-d');
$results = get_posts(array('numberposts' => -1, 'orderby' => 'modified', 'post_type' => array('post','page')));
$text = '<?xml version="1.0" encoding="UTF-8"?>' . "\n" .
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n" .
  "\t<url>\n" .
  "\t\t<loc>$real_site_url</loc>\n" .
  "\t\t<lastmod>$today</lastmod>\n" .
  "\t\t<changefreq>monthly</changefreq>\n" .
  "\t</url>\n";
foreach ($results as $post) {
  $slug = $post->post_name;
  if (strstr($slug, '?')) continue;
  $a = explode(' ', $post->post_modified);
  $postdate = $a[0];
  $text .= "\t" . '<url>' . "\n" .
    "\t\t<loc>$real_site_url$slug$ext</loc>\n" .
    "\t\t<lastmod>$postdate</lastmod>\n" .
    "\t\t<changefreq>daily</changefreq>\n" .
    "\t</url>\n";
}
$text .= '</urlset>' . "\n";
$file = $pagesdir . '/sitemap.xml';
if ($pagesdir == $master) {
  my_file_put_contents($file, $text, $user, $group);
} else {
  if (file_exists($master . '/sitemap.xml')) {
    $master_page = file_get_contents($master . '/sitemap.xml');
  } else {
    $master_page = '';
  }
  if ($text != $master_page) {  // compare new file (in memory) to original file
    my_file_put_contents($new . '/sitemap.xml', $text, $user, $group);
  }
}
echo "\n"; // progress indicator ends
echo "\nDone!\n\n";
?>

And that’s the end of the script. Even with repetition, it still weighs in at only 264 lines.
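Going back to the sitemap for a moment, every entry has the same shape, so it could be built by one small function. This is a sketch under stated assumptions: sitemap_entry() is a hypothetical name, the sample URL and date are made up, and the XML escaping of the location is an addition that the script itself does not perform.

```php
<?php
// Sketch: one <url> entry. The location is XML-escaped here, which is an
// addition; the original script inserts the URL as-is.
function sitemap_entry(string $loc, string $lastmod, string $changefreq): string
{
    return "\t<url>\n" .
        "\t\t<loc>" . htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</loc>\n" .
        "\t\t<lastmod>$lastmod</lastmod>\n" .
        "\t\t<changefreq>$changefreq</changefreq>\n" .
        "\t</url>\n";
}

// post_modified has the form "YYYY-MM-DD HH:MM:SS"; keep only the date part.
$modified = '2020-05-17 09:30:00'; // made-up sample value
$date = explode(' ', $modified)[0];
echo sitemap_entry('https://example.com/about.html', $date, 'monthly');
```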

Articles in this Series

This is a list of all the articles in this series. You should read the articles in the order they’re presented. You could miss something important if you skip around.
