What happens on wp_insert_post()?
This question came up in the Advanced WordPress Facebook group: when converting non-WP data to WP data, should we use direct database inserts, or go through the APIs?
On one hand, using the WP API ensures that data will be sanitised and checked throughout the process. On the other, it might mean that the process is going to be long running because there are a lot of database queries involved, and if the initial import is a fairly large chunk of stuff, the servers might just choke.
I was curious to know what exactly happens on wp_insert_post().
Here's the entire code for reference: WP 3.9.2, wp-includes/post.php, line 2909, and here's the WordPress Codex page about wp_insert_post.
Assumptions
We are creating a new post. The data we have available:
array(
'post_title' => 'Some Title',
'post_content' => 'Some Content'
);
With that, let's hurl that data at wp_insert_post. I'll count the database queries we'll encounter.
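The hand count can be cross-checked against what WordPress itself reports. The following is a minimal sketch, not part of the original walkthrough: it assumes it runs inside a loaded WordPress environment, and count_insert_post_queries is a hypothetical helper name. It relies on the num_queries counter that the $wpdb object maintains, and it really does insert a post, so it belongs on a throwaway install.

```php
<?php
// Hypothetical helper (not part of WordPress). Returns the number of queries
// issued while inserting $postarr, or null when WordPress is not loaded.
function count_insert_post_queries( array $postarr ) {
    if ( ! function_exists( 'wp_insert_post' ) || ! isset( $GLOBALS['wpdb'] ) ) {
        return null; // not inside WordPress, nothing to measure
    }

    $wpdb   = $GLOBALS['wpdb'];
    $before = $wpdb->num_queries; // wpdb keeps a running total of queries

    wp_insert_post( $postarr );

    return $wpdb->num_queries - $before;
}

$count = count_insert_post_queries( array(
    'post_title'   => 'Some Title',
    'post_content' => 'Some Content',
) );

echo null === $count ? "WordPress is not loaded\n" : "Queries issued: {$count}\n";
```

The measured number depends on what is already cached when the function is called, so it may well differ from a count made by reading the source.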
The process
Let's get the current user by calling get_current_user_id. That calls wp_get_current_user. That calls get_currentuserinfo() (which populates the $current_user global, and makes sure it's an instance of WP_User). It might call wp_set_current_user, which creates a new WP_User, which might fire off a number of database calls:
- (1) $wpdb->get_blog_prefix in WP_User::init to get the capability keys
- (2) then get_user_meta to get the capabilities by the key in WP_User::_init_caps, which then calls WP_User::get_role_caps, which creates a new WP_Roles
- (3) $wpdb->get_blog_prefix in WP_Roles::_init to get the role key
- (4) get_option with the role key, which calls a $wpdb->get_row (unless it's in the cache) after calling wp_load_alloptions
- (5) wp_load_alloptions fires off a request at the database that gets all the options that are to be autoloaded, unless they're in the cache (which is an instance of WP_Object_Cache, a global array with stuff in it).
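The "unless it's in the cache" caveat that keeps coming up follows a simple read-through pattern: look in an in-memory array first, and only go to the database on a miss. Here is a rough sketch of that idea in plain PHP. Tiny_Cache and the $loader callback are made up for illustration; this is not how WP_Object_Cache is actually written, only the general shape of the behaviour.

```php
<?php
// Hypothetical stand-in for a per-request object cache; not WordPress code.
class Tiny_Cache {
    private $data = array();
    public $db_hits = 0; // how many times we had to "query the database"

    // Return the cached value, or run $loader once and remember the result.
    public function get( $key, $loader ) {
        if ( ! array_key_exists( $key, $this->data ) ) {
            $this->db_hits++;
            $this->data[ $key ] = call_user_func( $loader, $key );
        }
        return $this->data[ $key ];
    }
}

$cache  = new Tiny_Cache();
$loader = function ( $key ) {
    // Pretend this is a SELECT against the options table.
    return 'value-of-' . $key;
};

$first  = $cache->get( 'default_category', $loader ); // miss: runs the loader
$second = $cache->get( 'default_category', $loader ); // hit: served from the array

echo $cache->db_hits . "\n"; // prints 1
```

This is why the same get_option call can cost a query the first time and nothing afterwards within the same request.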
At this point we have our current user, and we're only on line 3 of wp_insert_post.
It populates the defaults, parses our data, sanitizes the data, and extracts stuff, all PHP. We're now on line 22 locally, or line 2932 in the file.
We're creating a new post, so $ID is empty (because our initial array did not have a key called 'ID', thus PHP's extract did not convert that to $ID), therefore we're jumping to line 2947.
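To see why a missing 'ID' key leaves $ID empty, here is a small standalone sketch of the extract step. The function id_after_extract is a hypothetical helper that mimics only that one step, and the EXTR_SKIP flag is my assumption about how the call is made; it is not a copy of the WordPress source.

```php
<?php
// Hypothetical helper that mimics only the extract() step; not wp_insert_post.
function id_after_extract( array $postarr ) {
    // Turn array keys into local variables, skipping names that already exist.
    extract( $postarr, EXTR_SKIP );

    // $ID only exists if the array had an 'ID' key.
    return isset( $ID ) ? $ID : null;
}

$new_post = array( 'post_title' => 'Some Title', 'post_content' => 'Some Content' );
$existing = array( 'ID' => 42, 'post_title' => 'Some Title' );

var_dump( id_after_extract( $new_post ) ); // NULL
var_dump( id_after_extract( $existing ) ); // int(42)
```

With our two-key array no $ID variable is ever created, which is what sends the function down the "new post" path.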
Line 2969: if we hadn't given a title, content or excerpt, the script would stop here with an error that we're trying to insert an empty post. We did supply a title and content, so carry on.
The next few lines mostly set defaults, until line 2989.
- (6) get_option('default_category') on line 2989. It may or may not hit the database, depending on whether the option we're looking for is already in the cache. The first time it's requested, it won't be.
3011 will call get_post_field, but since we don't have an existing post yet, it's going to jump a few functions deep and return false. This is all PHP. It does a lot of sanitizing, accent removing, etc.
The next bunch of lines are all to do with datetimes, all PHP.
- (7) 3062 has another get_option, which may or may not be cached.
- (8) 3065 has another get_option, which may or may not be cached.
3100 returns the $post_name, which is already sanitized at this point (because we're creating a new post and did not specify the new post_status, therefore the status is draft, therefore it doesn't matter at this point).
3117 checks whether we're updating (we're not), so let's skip to 3134.
And finally, 3143 inserts the post. We're 234 lines deep into the function.
- (9) $wpdb->insert finally happens with all our sanitized, default data.
- (10) after we've inserted the post, we're going to do an update on the record just inserted, and make sure that post_name is set too. Our original data array did not have post_name, therefore it has to be created from post_title.
- (11) if our post type has category as a taxonomy, line 3161 calls wp_set_post_categories. That calls get_post_type, which calls get_post, which will either return the global $post, or fire off WP_Post::get_instance, which is a database read.
- (12) wp_set_post_categories will also call get_post_status, which also calls get_post.
- (13) there's also a get_option call, which may or may not be returned from cache.
- (14) wp_set_post_terms is called, which calls wp_set_object_terms, which fires a SELECT to ensure data parity.
- (15) there's a $wpdb->insert to the term relationships table between the default category and the post.
- (16) there are a number of database queries relating to term counting. _update_post_term_count is one, and for each term it will fire off two SELECT queries. This is to ensure that when you are looking at the taxonomy pages, and it says Uncategorized has 14 posts, that 14 is accurate.
- (17) there's an INSERT INTO near the end of wp_set_object_terms that deals with term orders, which may or may not fire.
- Steps 12-17 would be repeated for tags. Since we didn't supply any, this is skipped. Tags end up going through wp_set_post_terms as well, so the process is identical.
Then we get the guid, which is empty, and we're not updating, therefore:
- (18) $wpdb->update on line 3177 to update the guid (which is the permalink usually).
- (19) after deleting cache, we're doing a get_post again. It may or may not use the cache. Probably not as we've just cleared it, therefore a full SELECT.
- (20) the function _transition_post_status is hooked into the transition_post_status hook, which is fired by calling the wp_transition_post_status function on line 3199. It's a simple $wpdb->update on the row that we've just inserted.
At this point it fires off a bunch of other hooks, but nothing interesting happens.
If there was no cache, a simple wp_insert_post would have 20 database queries. With a warm cache, my guess is that it's down to about 5, depending on what had happened before we called the function.
Updating an existing post instead of inserting a new one, or supplying more data, might change the number of queries in either direction.
The downside is that it can be quite taxing on the server. The upside is that data will make sense, and WordPress will not let you add bad, inconsistent data.
Note: get_post will return stuff from the cache, if it's available.