Coming from C++ background, trying to understand what callback is doing in this function from the Scrapy...
I'm following the basic Scrapy tutorial and have limited Python experience. This looks like a recursive function, and I have some questions about what is happening.
This is in the Scrapy tutorial: https://doc.scrapy.org/en/latest/intro/tutorial.html
This ran the same when I specified callback=self.parse and when I left it out.
Here's the code (the last line is where my question is coming from):
    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').extract_first(),
                'author': quote.css('small.author::text').extract_first(),
                'tags': quote.css('div.tags a.tag::text').extract(),
            }

        next_page = response.css('li.next a::attr(href)').extract_first()
        if next_page is not None:
            next_page = response.urljoin(next_page)
            yield scrapy.Request(next_page, callback=self.parse)
The function performs identically when I omit callback=self.parse and when I leave it in.
Is this callback implicit, and not necessary? Is there a reason you need to have it in there?
Thanks in advance.
python web-scraping scrapy
asked Jan 3 at 9:20 by Trevor
edited Jan 3 at 11:02 by stranac
I read this somewhere in the Scrapy documentation: "If callback is None follow defaults to True, otherwise it defaults to False." Even if you do not explicitly define a callback in your script, the request will still go to the default callback, which is parse() in this case. Hope that clears up your confusion.
– robots.txt, Jan 3 at 9:38
1 Answer
The documentation you linked explains what's happening in the "A shortcut to the start_requests method" section:

"parse() is Scrapy’s default callback method, which is called for requests without an explicitly assigned callback."

The Scrapy tutorial just shows the basic method, and then tries to ease you into using alternatives.
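To make the quoted rule concrete, here is a toy model in plain Python. It is only a sketch of the idea, not Scrapy's actual implementation: `Request`, `ToySpider`, `dispatch` and `parse_author` are hypothetical names invented for this illustration, and the "responses" are just strings. It shows why omitting `callback` and passing `callback=self.parse` behave the same, and what changes when a different method is passed.

```python
# Toy model of "parse() is the default callback". NOT Scrapy's code:
# Request, ToySpider and dispatch are hypothetical stand-ins.


class Request:
    """A URL plus an optional callback, which defaults to None."""

    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


class ToySpider:
    def parse(self, response):
        return f"parse handled {response}"

    def parse_author(self, response):
        return f"parse_author handled {response}"


def dispatch(spider, request, response):
    """Call the request's callback, falling back to spider.parse if unset."""
    callback = request.callback
    if callback is None:
        callback = spider.parse
    return callback(response)


spider = ToySpider()

# No callback given: the fallback picks spider.parse.
implicit = dispatch(spider, Request("page-2"), "page-2 body")

# Callback given explicitly as spider.parse: same result.
explicit = dispatch(spider, Request("page-2", callback=spider.parse), "page-2 body")

# A different callback: this is the case where the argument matters.
other = dispatch(
    spider, Request("author-1", callback=spider.parse_author), "author-1 body"
)

print(implicit == explicit)
print(other)
```

In this model the callback argument only makes a difference when it names something other than `parse`. For a C++ reader, it is roughly a member-function pointer with a default value: the framework calls it later when the response arrives, so `parse` is not calling itself recursively.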
answered Jan 3 at 10:59 by stranac
The reason the documentation leaves it in is so that you understand that you can change the callback name there to use a different callback. Otherwise, someone reading the documentation might wonder how to use a different callback, or worse, think it is not possible to use a different callback.
– Gallaecio, Jan 16 at 11:33