Coming from C++ background, trying to understand what callback is doing in this function from the Scrapy...

I'm following the basic Scrapy tutorial and have limited Python experience. This looks like a recursive function to me, and I have some questions about what is happening.

This is in the Scrapy tutorial: https://doc.scrapy.org/en/latest/intro/tutorial.html

This ran the same when I specified callback=self.parse and when I left it out.

Here's the code (the last line is where my question is coming from):

def parse(self, response):
    for quote in response.css('div.quote'):
        yield {
            'text': quote.css('span.text::text').extract_first(),
            'author': quote.css('small.author::text').extract_first(),
            'tags': quote.css('div.tags a.tag::text').extract(),
        }

    next_page = response.css('li.next a::attr(href)').extract_first()
    if next_page is not None:
        next_page = response.urljoin(next_page)
        yield scrapy.Request(next_page, callback=self.parse)

The function performs identically when I omit callback=self.parse and when I leave it in.
Is this callback implicit, and not necessary? Is there a reason you need to have it in there?

Thanks in advance.

asked Jan 3 at 9:20 by Trevor
edited Jan 3 at 11:02 by stranac

I read this somewhere in the Scrapy documentation: "If callback is None follow defaults to True, otherwise it defaults to False." Even if you do not explicitly define a callback in your script, the request will still be handled by the default callback, which is parse() in this case. Hope that clears up your confusion.

– robots.txt
Jan 3 at 9:38

python web-scraping scrapy

1 Answer

The documentation you linked explains what's happening, in the section "A shortcut to the start_requests method":

"parse() is Scrapy's default callback method, which is called for requests without an explicitly assigned callback."

So the callback is implicit here: leaving out callback=self.parse gives the same result because parse() is used anyway. The Scrapy tutorial just shows the explicit form first, and then tries to ease you into using the alternatives.
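To make the quoted sentence concrete, here is a toy model of the dispatch step, written in plain Python so that it runs without Scrapy installed. The `Request`, `ToySpider`, `PAGES`, and `crawl` names are stand-ins invented for this sketch, not Scrapy's real classes, and the real engine is asynchronous and passes a response object rather than a URL. The one idea it tries to capture is the fallback: a request with no callback is handed to the spider's `parse` method. It also suggests why this is not recursion in the C++ sense: `parse` never calls itself, it only yields a request, and a loop outside the spider decides what to call next.

```python
from collections import deque


class Request:
    """Toy stand-in for scrapy.Request: a URL plus an optional callback."""

    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


# Hypothetical three-page site: url -> (quotes on the page, next page or None).
PAGES = {
    "/page/1": (["q1", "q2"], "/page/2"),
    "/page/2": (["q3"], "/page/3"),
    "/page/3": (["q4"], None),
}


class ToySpider:
    def __init__(self, explicit_callback):
        self.explicit_callback = explicit_callback

    def parse(self, url):
        # Real Scrapy passes a response here; the toy passes the URL directly.
        quotes, next_page = PAGES[url]
        for text in quotes:
            yield {"text": text}
        if next_page is not None:
            if self.explicit_callback:
                yield Request(next_page, callback=self.parse)
            else:
                yield Request(next_page)  # no callback given


def crawl(spider, start_url):
    """Toy engine: pull requests from a queue and dispatch each one."""
    items = []
    queue = deque([Request(start_url)])
    while queue:
        request = queue.popleft()
        # The fallback this sketch is about: no callback means spider.parse.
        callback = request.callback or spider.parse
        for result in callback(request.url):
            if isinstance(result, Request):
                queue.append(result)  # scheduled for later, not called now
            else:
                items.append(result)
    return items


with_callback = crawl(ToySpider(explicit_callback=True), "/page/1")
without_callback = crawl(ToySpider(explicit_callback=False), "/page/1")
print(with_callback == without_callback)
print([item["text"] for item in with_callback])
```

Both runs collect the same four items, which mirrors what the question observed: with or without `callback=self.parse`, the next page ends up in `parse`.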

answered Jan 3 at 10:59 by stranac

The reason the documentation leaves it in is so that you understand that you can change the callback name there to use a different callback. Otherwise, someone reading the documentation might wonder how to use a different callback, or worse, think it is not possible to use a different callback.

– Gallaecio
Jan 16 at 11:33
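For readers coming from C++, it may help to see what `self.parse` actually is when it is passed as `callback=`: a bound method object, roughly comparable to a member-function pointer already paired with its `this`. The sketch below is plain Python and does not use Scrapy; `MiniSpider`, `parse_author`, and `dispatch` are hypothetical names chosen for illustration. It shows that using a different callback amounts to passing a different callable.

```python
class MiniSpider:
    """Hypothetical spider with two handlers; not a real scrapy.Spider."""

    def parse(self, page):
        return "quotes from " + page

    def parse_author(self, page):
        return "author details from " + page


def dispatch(spider, page, callback=None):
    """Call the given callback, or fall back to spider.parse if none is given."""
    handler = callback if callback is not None else spider.parse
    return handler(page)


spider = MiniSpider()

# A bound method remembers the instance it was taken from.
bound = spider.parse_author
print(bound.__self__ is spider)

# No callback: the default handler runs.
print(dispatch(spider, "/page/1"))

# Explicit callback: a different handler runs for this one request.
print(dispatch(spider, "/author/x", callback=spider.parse_author))
```

In the spider from the question, the analogous change would be passing something like `callback=self.parse_author` to `scrapy.Request`, assuming such a method had been defined on the spider.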
