Make Scrapy send POST data from Javascript function
I'm playing with Scrapy and following this tutorial. Things look good, but I noticed Steam changed their age check so there is no longer a form in the DOM, so the suggested solution will not work:
form = response.css('#agegate_box form')
action = form.xpath('@action').extract_first()
name = form.xpath('input/@name').extract_first()
value = form.xpath('input/@value').extract_first()
formdata = {
    name: value,
    'ageDay': '1',
    'ageMonth': '1',
    'ageYear': '1955'
}
yield FormRequest(
    url=action,
    method='POST',
    formdata=formdata,
    callback=self.parse_product
)
Checking an example game that forces the age check, I noticed the View Page button is no longer part of a form:
<a class="btnv6_blue_hoverfade btn_medium" href="#" onclick="ViewProductPage()"><span>View Page</span></a>
And the function being called will eventually call this one:
function CheckAgeGateSubmit( callbackFunc )
{
    if ( $J('#ageYear').val() == 2019 )
    {
        ShowAlertDialog( '', 'Please enter a valid date' );
        return false;
    }
    $J.post(
        'https://store.steampowered.com/agecheckset/' + "app" + '/9200/',
        {
            sessionid: g_sessionID,
            ageDay: $J('#ageDay').val(),
            ageMonth: $J('#ageMonth').val(),
            ageYear: $J('#ageYear').val()
        }
    ).done( function( response ) {
        switch ( response.success )
        {
            case 1:
                callbackFunc();
                break;
            case 24:
                top.location.reload();
                break;
            case 15:
            case 2:
                ShowAlertDialog( 'Error', 'There was a problem verifying your age. Please try again later.' );
                break;
        }
    } );
}
So basically this makes a POST with some data. What would be the best way to do this in Scrapy, now that there is no form? I'm thinking of skipping the code that reads the form and simply sending the request with a FormRequest object, but is this the way to go? An alternative could be setting the age cookies and passing them on every single request, so that the age check is possibly skipped altogether?
Thanks!
scrapy
asked Jan 2 at 19:42
AlejandroVK
Alternatively you should look into cookies. The last time I crawled Steam they had some cookie that you could just set to pass the age gate, like ageverified=1. You can set cookies in your Request objects to replicate that.
– Granitosaurus
Jan 3 at 13:57
1 Answer
You should probably just set an appropriate cookie and you'll be let right through!
If you take a look at the cookies your browser has when entering the page, you can replicate them in Scrapy:
from scrapy import Request

cookies = {
    'wants_mature_content': '1',
    'birthtime': '189302401',
    'lastagecheckage': '1-January-1976',
}
url = 'https://store.steampowered.com/app/9200/RAGE/'
# cookies must be passed by keyword; the second positional argument of Request is the callback
Request(url, cookies=cookies)
lastagecheckage should probably be enough on its own but I haven't tested it.
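As a rough sketch of how those cookies could be generated instead of hard-coded: the `birthtime` value above looks like a Unix timestamp (189302400 is midnight UTC on 1 January 1976, one second earlier than the value in the answer), so it can be derived from a birth date. The helper names `age_gate_cookies` and `product_request` are hypothetical, and whether Steam still honors these cookies is untested here.

```python
import calendar
from datetime import date


def age_gate_cookies(birth_date=date(1976, 1, 1)):
    """Build the age-gate cookies from the answer for a given birth date.

    Hypothetical helper; the cookie names come from the answer above.
    """
    # Seconds since the Unix epoch at midnight UTC on the birth date.
    birthtime = calendar.timegm(birth_date.timetuple())
    return {
        'wants_mature_content': '1',
        'birthtime': str(birthtime),
        # Same day-Month-year shape as the answer's '1-January-1976'.
        'lastagecheckage': f'{birth_date.day}-{birth_date:%B}-{birth_date.year}',
    }


def product_request(url, callback=None):
    """Return a scrapy.Request for `url` that carries the age-gate cookies."""
    # Imported here so age_gate_cookies() can be used without Scrapy installed.
    from scrapy import Request
    return Request(url, callback=callback, cookies=age_gate_cookies())
```

In a spider this could be used as `yield product_request('https://store.steampowered.com/app/9200/RAGE/', self.parse_product)`. With Scrapy's cookie middleware enabled (the default), cookies set this way should also be sent on later requests to the same domain.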
answered Jan 3 at 14:07


Granitosaurus
Hello @Granitosaurus, yeah, that's what I was thinking. Still, I'd like to know what would be a good approach, Scrapy-wise, when there's no form in a specific page and we need to make a POST request to some other page. I assume Selenium could be an option, but just wondering. In any case, will test your approach, thanks!
– AlejandroVK
Jan 3 at 16:06
@AlejandroVK You don't need Selenium and can replicate the age gate post request as it should be very simple; I could write up an answer for that but it seems a bit silly to replicate the request to get a cookie when you can just set the cookie manually and avoid unnecessary crawling :)
– Granitosaurus
Jan 4 at 1:32
Agreed although for the sake of learning, still wondering how I can send a POST request when there's no form using vanilla Scrapy :)
– AlejandroVK
Jan 4 at 17:29
Feel free to email (it's in my profile) with any cases you have - it's usually quite simple. You have to figure out what sort of body the server is expecting, set the Content-Type header and make a request with the method attribute, like: Request(method='POST', body='{"body":"hello"}', headers={'Content-Type': 'application/json'}) <- would be an example of POSTing JSON data.
– Granitosaurus
Jan 5 at 1:41
Will do, thank you sir!
– AlejandroVK
Jan 6 at 22:46
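For the learning case discussed in these comments, here is a minimal sketch of replicating the POST from CheckAgeGateSubmit() without a form. It makes several assumptions: that the page source defines the session id as `g_sessionID = "...";` in an inline script, that the endpoint accepts the same field values as the old form in the question ('1', '1', '1955'), and that all helper names here are hypothetical. jQuery's `$J.post` with an object sends URL-encoded form data, which is what `FormRequest` produces; the last helper shows the JSON variant from the comment above.

```python
import json
import re

AGE_CHECK_URL = 'https://store.steampowered.com/agecheckset/app/{app_id}/'


def extract_session_id(html):
    """Return the g_sessionID value found in the page source, or None.

    Assumes the page contains `g_sessionID = "...";` in an inline script.
    """
    match = re.search(r'g_sessionID\s*=\s*"([^"]+)"', html)
    return match.group(1) if match else None


def age_check_post(app_id, session_id, day='1', month='1', year='1955'):
    """URL and form fields mirroring what CheckAgeGateSubmit() posts."""
    return {
        'url': AGE_CHECK_URL.format(app_id=app_id),
        'formdata': {
            'sessionid': session_id,
            'ageDay': day,
            'ageMonth': month,
            'ageYear': year,
        },
    }


def json_post_kwargs(url, payload):
    """Keyword arguments for scrapy.Request that POST `payload` as JSON."""
    return {
        'url': url,
        'method': 'POST',
        'body': json.dumps(payload),
        'headers': {'Content-Type': 'application/json'},
    }


def age_check_request(response, app_id, callback):
    """Build the age-gate POST from a Scrapy response for the gated page."""
    # Imported here so the helpers above can be used without Scrapy installed.
    from scrapy import FormRequest
    session_id = extract_session_id(response.text)
    if session_id is None:
        raise ValueError('g_sessionID not found in page source')
    # FormRequest URL-encodes formdata and defaults to POST when it is given.
    return FormRequest(callback=callback, **age_check_post(app_id, session_id))
```

In a spider callback that receives the age-gate page, `yield age_check_request(response, '9200', self.after_age_check)` would send the POST; the hypothetical `after_age_check` callback could then check that the JSON response has `success` equal to 1 before requesting the product page again.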