How can we add more than 15K items to the WordPress database without overwhelming the server?



























We have built a crawler using GuzzleHTTP and other associated libraries, hosted on AWS servers. It crawls and gives us around 5,000 products from one site alone, and we have four sites in total, so the item count comes to around 15K+.



The crawler itself is working fine: we are able to crawl all four sites in under an hour and build JSON files.



We then export the data from those JSON files into the WordPress database, with each item becoming a post and any additional data stored as post meta, terms, and taxonomies. We currently do this using WP AJAX hooks and filters and a loop (of course).



But the export is taking a very long time, and the chance of the server timing out is high, since a typical Apache setup is not meant to take such a load.



We need to know the best possible way to do this. Our options as we see them:




  1. Do we create a database on AWS itself and somehow connect it to WordPress? If so, how will we manage the relationships between the custom posts and their meta and terms? (If we add the data on the server where WordPress is hosted, we can use WordPress functions to create posts and associate the data accordingly.)


  2. Do we run a cron job on the WordPress server's end and give the server more resources, so that the timeout issue goes away? We are on SiteGround's servers.


  3. Or is there a better way to do this?
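
One way to sidestep the AJAX/timeout problem entirely is a batched import run from the command line (e.g. `wp eval-file import.php` with WP-CLI), so no HTTP request is involved at all. A minimal sketch, assuming the crawler's JSON has been copied to the WordPress server; the file name, post type, meta key, and taxonomy below are illustrative, not from our setup:

```php
<?php
/**
 * Minimal batched-import sketch, intended to be run via WP-CLI
 * (`wp eval-file import.php`) rather than through admin-ajax.
 */

// Defer expensive bookkeeping until the end of the bulk run.
wp_defer_term_counting( true );
wp_suspend_cache_invalidation( true );

$items = json_decode( file_get_contents( __DIR__ . '/products.json' ), true );

foreach ( array_chunk( $items, 500 ) as $batch ) {
    foreach ( $batch as $item ) {
        // Second argument = true makes failures return WP_Error.
        $post_id = wp_insert_post( array(
            'post_type'   => 'product',
            'post_title'  => $item['title'],
            'post_status' => 'publish',
        ), true );

        if ( is_wp_error( $post_id ) ) {
            error_log( 'Import failed: ' . $post_id->get_error_message() );
            continue;
        }

        update_post_meta( $post_id, '_price', $item['price'] );
        wp_set_object_terms( $post_id, $item['categories'], 'product_cat' );
    }
    wp_cache_flush(); // keep memory flat between batches
}

// Recalculate term counts once, instead of on every insert.
wp_defer_term_counting( false );
wp_suspend_cache_invalidation( false );
```

Deferring term counting and cache invalidation is the main win here: without it, every `wp_insert_post()` triggers extra queries that dominate the runtime of a 15K-item import.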



Any help would be appreciated.



Thanks!




























  • 15K items shouldn't even cause your server to break a sweat. You might need to set your timeouts in php.ini higher (even if temporarily), but there's no reason you shouldn't be able to handle hundreds of thousands of records.

    – Difster
    Nov 21 '18 at 11:43











  • How are you handling your inserts? I had a long-running ETL job (read from a file, read from the DB, compare the two sets of 50K records, write out the differences, then update the DB with data from the file) that would take a long time to run: 10 minutes or more at times. I cut it down to an average of just under 3 minutes by collecting all the data, opening a single DB connection, looping through the insert statements, and then disconnecting. The prior method connected/disconnected for each insert.

    – ivanivan
    Nov 21 '18 at 12:06











  • @Difster What timeout value do you think would be good enough? We have set_time_limit(3000); Thanks!

    – Rupak Dhiman
    Nov 22 '18 at 12:13













  • @ivanivan We are handling our inserts with WordPress AJAX hooks and filters and the default WordPress function wp_insert_post. I chain the AJAX requests, firing the next one only when the previous one has completed, so that we can cancel the action midway if we wish. Thanks!

    – Rupak Dhiman
    Nov 22 '18 at 12:15
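
ivanivan's single-connection batching can be sketched with `$wpdb` (which reuses WordPress's existing connection) and one multi-row `INSERT` per batch instead of one query per row. The table targeted is `wp_postmeta`; the row data below is illustrative:

```php
<?php
/**
 * Sketch: collect rows first, then issue one prepared multi-row
 * INSERT per batch, rather than a round trip per row.
 */
global $wpdb;

$rows = array(
    // array( post_id, meta_key, meta_value )
    array( 101, '_price', '19.99' ),
    array( 102, '_price', '24.50' ),
);

$placeholders = array();
$values       = array();
foreach ( $rows as $row ) {
    $placeholders[] = '(%d, %s, %s)';
    array_push( $values, ...$row );
}

// One prepared multi-row INSERT instead of one query per row.
$sql = "INSERT INTO {$wpdb->postmeta} (post_id, meta_key, meta_value) VALUES "
     . implode( ', ', $placeholders );
$wpdb->query( $wpdb->prepare( $sql, $values ) );
```

Note that writing to `wp_postmeta` directly bypasses WordPress's object cache, so this suits one-off imports; a `wp_cache_flush()` afterwards keeps subsequent reads consistent.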


















php mysql ajax wordpress guzzlehttp






asked Nov 21 '18 at 11:33









Rupak Dhiman

1 Answer
In my experience: I have created more than 50,000 products on WordPress/WooCommerce.



The first time, I used the WooCommerce REST API to create products from an external server. It's very easy to do, but it takes too much time. Here is the documentation: http://woocommerce.github.io/woocommerce-rest-api-docs/#introduction



The best way, for me, is to use WordPress hooks; it is faster than the API. You can set the PHP execution time limit to 0 (unlimited) so the timeout error no longer appears.
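
A rough sketch of this suggestion, lifting PHP's limits and moving the import onto WP-Cron so it runs outside a browser request. The hook and callback names (`my_bulk_import_event`, `my_run_bulk_import`) are hypothetical:

```php
<?php
/**
 * Sketch: run the import from a WP-Cron event with execution
 * limits lifted, instead of from an AJAX request.
 */

function my_run_bulk_import() {
    set_time_limit( 0 );              // 0 = no execution time limit
    wp_raise_memory_limit( 'admin' ); // lift WP's memory cap for this run

    // ... loop over the JSON files and call wp_insert_post() here ...
}
add_action( 'my_bulk_import_event', 'my_run_bulk_import' );

// Schedule a one-off run; WP-Cron (or a real server cron hitting
// wp-cron.php) will pick it up on the next tick.
if ( ! wp_next_scheduled( 'my_bulk_import_event' ) ) {
    wp_schedule_single_event( time(), 'my_bulk_import_event' );
}
```

On shared hosting such as SiteGround, a real crontab entry running `wp cron event run --due-now` can be more reliable than the default traffic-triggered WP-Cron.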



In my opinion, WordPress is not the best choice for dealing with huge amounts of data.



Good luck!






























  • Hey @xhuljo, I don't think WordPress is the issue here. What I may be missing is the server configuration, or the way I am approaching this; there is something wrong with my plan. I crawl the data on AWS, create JSON, and then import that JSON into WordPress using async AJAX in a loop. Thanks!

    – Rupak Dhiman
    Nov 22 '18 at 12:21











answered Nov 21 '18 at 11:51









xhuljo
