WordPress and Hostname Problems

One of the problems with large-scale development, staging, production (DSP) deployment models is the WordPress database. WordPress takes over the network routing (URL processing) as soon as you connect to a service running the WordPress application. One of the biggest problems with the WordPress design is the hundreds of ways WordPress, themes, and plugins mangle URLs. The entire selling point of WordPress is its extensive plugin and theme system, and with all that mostly unvetted code there is NO STANDARD for URL routing. That means the URL you typed in to get to the WordPress site is often munged into whatever is stored in the underlying WordPress database.

This is a major problem for DSP deployments. In a standard DSP deployment you often have a mirror of the production environment in staging and development environments; for proper testing, however, these use different hostnames. The hostname is the first part of the URL, and WordPress often grabs it from database settings in multiple places. The wp_options table, with its siteurl and home settings, is one. On multisite, the entries in the wp_blogs table are another. In addition, many plugins and themes store fully-qualified URLs in metadata strings instead of relative URLs, which raises the question: why have a standard for storing the base host + domain + TLD and then allow all on-site references to store a full URL?

Enough griping; we need solutions.

The Large Data Migration Problem

The main issue with a standard WordPress DSP deployment with large data is the “home URL”. This is the main URL that WordPress thinks the site will be running at and, first thing, it will redirect any incoming browser request that brings up the WordPress app to that URL. That means if you copy your production database over to staging for testing, anyone surfing to staging.storelocatorplus.com will end up being routed back to production.storelocatorplus.com.

WP CLI Quick Fix & Problems

The quick fix is to use WP CLI and the search-replace tool to replace the domain name. When used properly it changes every reference in every table from production.storelocatorplus.com to staging.storelocatorplus.com after you’ve copied the latest production data to staging.

The problem with this solution is that dealing with all those rogue plugins and themes writing full URLs anywhere in the database, often inside huge JSON strings, requires a full text replace, which is slow as heck even on a normal “cat blog” website. It is painfully slow on a large database and impossible to use in rapid-turnaround production environments on a large SaaS application like Store Locator Plus®. Doing this process on a multisite install with thousands of tables, even on a moderate subset of 6GiB of data, literally takes HOURS on a 64GB / 16-core dedicated server posting against an XXLarge RDS cluster. It is useless.

The Large Data Copy Problem

The other issue with a large multisite SaaS installation is database replication. The production database is updated constantly, so the 6GiB+ of data is out-of-date within minutes of replication. That alone is not an issue, but on the two-week heartbeat of our dev cycle it means our staging (testing) data set drifts far enough out-of-sync with production that it is closer to a 90% representation of live data. We want to do better. When we start testing a new version we want 99%+ coverage of real production data.

At the start of each testing lifecycle we need to replicate production to staging. The current method of “restore from snapshot” is efficient, but it could be better. While these snapshots are essentially real-time backups thanks to our RDS configuration with a multi-zone cluster and multi-state backup and snapshot logging, it still takes 1-2 hours to replicate the production data to our staging RDS servers.

Then we are still left having to run the WP CLI to update the URLs.

There has to be a better way.

WordPress Multisite Blogname Problem

One of the early issues we ran into with setting the environment variables to override the database values has to do with the wp_blogs table on a WordPress multisite installation.

When WordPress multisite first connects to the database it almost immediately runs this query on the wp_blogs table:

    [func_call] => $db->query("
        SELECT wp_blogs.blog_id
        FROM wp_blogs
        WHERE domain = 'local.storelocatorplus.com' AND path = '/'
        ORDER BY wp_blogs.blog_id ASC
        LIMIT 1
    ")

Notice the domain name exact match.

Turns out if no records are returned from this query, the site crashes with a super generic error message.

Yup, super helpful. This is from the dead_db() function in wp-includes/functions.php which really doesn’t tell you shit. The details were from a hacked version of WP to provide a print_r of the $wpdb global to see what it was doing last when this failed.

WP 6.4.2 stack:

  • index.php (17)
  • wp-blog-header.php (13)
  • wp-load.php (50)
  • wp-config.php (133)
  • wp-settings.php (141)
  • wp-includes/ms-settings.php (77) : ms_not_installed(…)
    • ms_load_current_site_and_network( $domain, $path, is_subdomain_install() )
  • wp-includes/ms-load.php (470): dead_db()

Turns out there is a lesser-known multisite feature that can help with this, SUNRISE – as documented on the WordPress VIP sunrise.php page. They even have a page on allowing multiple domains to resolve to the same network site.

This methodology seems to work well for applying filters and allowing the domain name to be variable for typical 3-stage DSP domain names.

For our initial ECS build attempt this will be part of the image via a Dockerfile copy command, ensuring define('SUNRISE', true) is part of our composer startup file configuration for WordPress.
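To make the idea concrete, here is a minimal sketch of what a wp-content/sunrise.php could look like. It is not the WordPress VIP implementation; the slp_* names and the alias map are hypothetical, and it assumes the copied database stores production.storelocatorplus.com in wp_blogs. It uses the core pre_get_site_by_path filter to look up the stored domain whenever a staging or local hostname comes in.

```php
<?php
// Sketch of wp-content/sunrise.php. All slp_* names and the alias map are
// hypothetical; adjust them to your own environments.

/**
 * Map an incoming request host to the domain stored in wp_blogs.
 * Returns the (lowercased, port-stripped) host when no alias is configured.
 */
function slp_stored_domain_for(string $host, array $aliases): string
{
    $host = strtolower($host);
    // Drop a trailing ":port" so local.example.com:8080 still matches.
    $host = preg_replace('/:\d+$/', '', $host) ?? $host;
    return $aliases[$host] ?? $host;
}

// Assumed map: environment hostname => domain stored in the copied data.
$slp_domain_aliases = [
    'staging.storelocatorplus.com' => 'production.storelocatorplus.com',
    'local.storelocatorplus.com'   => 'production.storelocatorplus.com',
];

// Only register the hook when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter(
        'pre_get_site_by_path',
        function ($site, $domain, $path, $segments) use ($slp_domain_aliases) {
            $stored = slp_stored_domain_for($domain, $slp_domain_aliases);
            if ($stored === $domain) {
                return $site; // Not an alias: let core do its normal lookup.
            }
            // Re-run the lookup against the domain that is actually stored.
            return get_site_by_path($stored, $path, $segments);
        },
        10,
        4
    );
}
```

The network lookup has a matching pre_get_network_by_path filter that would likely need the same treatment, and a subdomain-style network would need a smarter mapping than this exact-match table.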

Investigating Hostname Change Options

There are several options noted in the WordPress documentation for changing the URL, but given the plugin and theme munging, mileage may vary.

  • Edit wp-config.php
  • Edit functions.php
  • Relocation method (code definition)
  • Database update

So far the fastest option for running multisite on different domains without changing the entire data set is to use the sunrise.php approach and add custom hooks and filters to munge the data before processing.

Environment Variables

One thing we’ve come across in the past is using environment variables and/or wp-config files to force the hostname. This will likely solve at least half of our domain name resolution issues as it SHOULD override the db-driven get_option() calls that force the URL rewrites based on the stored home and siteurl options in the wp_options table.
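A minimal sketch of the wp-config.php side of this, assuming a hypothetical environment variable named SLP_HOSTNAME that each environment (ECS task, local Docker, and so on) sets to its own hostname. WP_HOME and WP_SITEURL are the standard WordPress constants; the helper function is made up for illustration.

```php
<?php
// wp-config.php fragment (sketch). SLP_HOSTNAME and slp_env_base_url()
// are hypothetical names, not part of WordPress.

/**
 * Build the base URL for this environment from an environment variable.
 * Returns null when the variable is unset or empty.
 */
function slp_env_base_url(string $var = 'SLP_HOSTNAME', string $scheme = 'https'): ?string
{
    $host = getenv($var);
    if ($host === false || trim($host) === '') {
        return null;
    }
    return $scheme . '://' . strtolower(trim($host));
}

$slp_base_url = slp_env_base_url();
if ($slp_base_url !== null && !defined('WP_HOME') && !defined('WP_SITEURL')) {
    // Override the home and siteurl values stored in wp_options.
    define('WP_HOME', $slp_base_url);
    define('WP_SITEURL', $slp_base_url);
}
```

One caveat worth checking before relying on this: as far as I recall, multisite core removes the filters that apply WP_HOME and WP_SITEURL (see wp-includes/ms-default-filters.php), so on a network install the same environment value probably has to be applied through a custom filter instead.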

WordPress Core Filters and Hooks

The other advantage we have is we have full control over the SaaS application. Very little is “out of house” code other than WordPress core. Our theme and our plugins are mostly “in-house” Store Locator Plus® code which means we can control (or revise) any code that tries to store full URLs. Our standard is to only ever use relative URLs when possible, but WordPress core has other ideas when you save posts (we store location details in post meta) so we are not always in full control of what is written in the final database updates.

Thankfully WordPress is good about providing lots of hooks and filters that let you intercept data along the way before it is written to the database. We may need to dive deeper into this realm as well.
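As a rough illustration of intercepting data on the way in, the sketch below strips our own scheme and host from URLs so only root-relative paths get stored. The function name and host list are hypothetical. It hooks content_save_pre, which core applies to post content before saving; post meta goes through different hooks and would need its own wiring.

```php
<?php
// Sketch: turn absolute URLs on our own hosts into root-relative URLs
// before they are saved. slp_make_urls_relative() is a hypothetical helper.

function slp_make_urls_relative(string $text, array $hosts): string
{
    foreach ($hosts as $host) {
        // Match http(s)://host only when a path follows; keep the slash.
        $pattern = '#https?://' . preg_quote($host, '#') . '(?=/)#i';
        $text = preg_replace($pattern, '', $text) ?? $text;
    }
    return $text;
}

// Only register the hook when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('content_save_pre', function ($content) {
        return slp_make_urls_relative((string) $content, [
            'production.storelocatorplus.com',
            'staging.storelocatorplus.com',
        ]);
    });
}
```

Note that JSON-encoded values usually escape slashes (https:\/\/...), so this pattern would not catch URLs buried in JSON strings; those would need a separate pattern or a decode, rewrite, re-encode pass.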

Image by Ivelin Donchev from Pixabay
