The KAGG Fast Post Generator plugin is a new tool that allows you to create 1 million random posts in 2 minutes. It was prompted to be written by the need to test the product under WordPress on large databases containing hundreds of thousands and millions of posts/pages.
Existing tools don’t allow you to create the right number of random posts in a reasonable amount of time. For example, the FakerPress plugin generates about 1,000 posts per hour. The wpastronaut/wp-cli-seeder package is considerably faster, creating 1,000 posts in 35 seconds. One million posts this tool will create in about 10 hours. If you need a database containing millions and tens of millions of posts, you’re in trouble.
How is this speed achieved? The plugin uses the technique of loading .csv files quickly via the LOAD DATA INFILE command. It is the basic idea of the code. The plugin creates a temporary .csv file, writes a chunk of posts into it (default size 50,000 posts) and then loads them into the database with one MySQL command. Creating and writing 50,000 posts with PHP to a temporary .csv file takes about 1 sec, and it takes about 5 sec to load those 50,000 posts into the database with a single MySQL command. When tasked to generate 1,000,000 posts, the plugin splits the task into chunks and executes 20 ajax requests by default.
Below is the picture of the settings page and the plugin result.
We see that the fast post generator plugin created 1 million posts in 3 minutes. The speed on my test machine is slightly slower than on an entry-level Digital Ocean server. There the same procedure takes 2 minutes. Before generation, the database size is 250 MB; after generation, it is 2.9 GB.
Fast post generator plugin – optimisation
I created a database with 2 million posts and 200,000 pages for my tests. Its total size exceeded 5 GB. I should note here that with 200,000 pages in the database, WordPress starts to stall mercilessly due to errors both at the front and in the admin. The 2022 theme runs a database query on the front page, trying to load all 200,000 pages. In the admin on the all posts page, the core tries to load all pages to create a drop-down list of pages. So it’s better to use post generation to create large databases.
To achieve maximum speed, the plugin uses the WordPress constant SHORTINIT in the ajax calls to produce minimal core loading and bypass the loading of all plugins. Such an approach saves 1-2 seconds per generation step.
During development, it turned out that the plugin does not work on some hosting and servers. An error occurred when running LOAD DATA INFILE.
The point is that LOAD DATA INFILE can only read from a specific directory, which is set read-only by the system variable
secure_file_priv. Usually, it has the value of
/var/lib/mysql-files. On Windows, it’s empty, so I’ve been able to load the file from anywhere.
On Linux, understandably, there are problems. PHP can’t write to this folder, and MySQL can’t read from the PHP temporary file. The way out is to use LOAD DATA LOCAL INFILE. In this case, MySQL establishes a connection to PHP, overwrites the file in its folder and then loads it.
For this to work, you need to execute the command
SET GLOBAL local_infile = 'ON'. We can do it easily on the fly.
The second condition is that
mysqli.allow_local_infile = On must be in the
php.ini (and only there). As of PHP 7.1, this option is disabled by default.
The plugin notices that this option in the
php.ini is needed if it is disabled when activated. A developer can handle it, and housewives don’t need this plugin.
Deleting generated posts
Quick post deletion is a separate topic. To provide an easy deleting of the generated posts, the plugin writes in the
guid field of the post (which is seldom used). The initial slug of the post in the `guid` field looks as follows:
It takes 4 minutes to delete 1 million posts with a
DELETE FROM wp_posts p WHERE p.guid like 'https://generator.kagg.eu/%'; query. That’s a long time.
It turns out that there is a quicker and more elegant solution. To do the same thing via
START TRANSACTION; CREATE TABLE wp_posts_copy LIKE wp_posts; INSERT INTO wp_posts_copy SELECT * FROM wp_posts p WHERE p.guid not like 'https://generator.kagg.eu/%'; DROP TABLE wp_posts; RENAME TABLE wp_posts_copy TO wp_posts; COMMIT;
it only takes five seconds.
This performance gain during deletion is because INSERT is many times faster in MySQL than DELETE. It is easier to rewrite the posts to be saved into another table, delete the old one, and then rename the new one. After that, by the way, the database works much faster. If you delete rows, then the size of the
wp_posts table, which is almost empty, is kept at 2.9 GB. The database starts to falter. But with table tricks, it works well.