What is the workers parameter in Word2Vec in NLP?
In the code below, I don't understand the meaning of the workers parameter:
model = Word2Vec(sentences, size=300000, window=2, min_count=5, workers=4)
python machine-learning nlp word2vec
edited Nov 21 '18 at 17:14
desertnaut
asked Nov 21 '18 at 17:07
Vishal Suryavanshi
2
workers is the number of threads for the training of the model, higher number = faster training.
– mcoav
Nov 21 '18 at 17:10
radimrehurek.com/gensim/models/… - idownvotedbecau.se/noresearch
– desertnaut
Nov 21 '18 at 17:14
1
I'm voting to close this question as off-topic because this is a case of RTFM.
– Matthieu Brucher
Nov 21 '18 at 17:17
2 Answers
workers = use this many worker threads to train the model (faster training on multicore machines).
If your system has 2 cores and you specify workers=2, training is split across two threads running in parallel.
By default, gensim's Word2Vec uses workers=3; set workers=1 for no parallelization.
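As a minimal sketch of the point above, assuming gensim is installed: cap the requested thread count at the number of cores Python reports, then pass the result as workers. The helper names choose_workers and train_word2vec are illustrative and not part of gensim; note that os.cpu_count() reports logical cores.

```python
import os


def choose_workers(requested, cores=None):
    """Clamp a requested thread count to the range [1, available cores]."""
    if cores is None:
        # os.cpu_count() can return None when the count cannot be determined.
        cores = os.cpu_count() or 1
    return max(1, min(requested, cores))


def train_word2vec(sentences, requested_workers=4):
    """Train a model with a capped worker count (hypothetical helper)."""
    # Imported lazily so choose_workers() is usable without gensim installed.
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences,
        window=2,
        min_count=5,
        workers=choose_workers(requested_workers),
    )
```

On a 2-core machine, choose_workers(4) would yield 2, so asking for workers=4 would not oversubscribe the CPU.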
answered Nov 21 '18 at 17:22
SUBHOJEET
As others have mentioned, workers controls the number of independent threads doing simultaneous training.
In general, you'll never want to use more workers than the number of CPU cores.
But further, the gensim Word2Vec implementation faces a bit more thread-to-thread bottlenecking due to issues like the Python "Global Interpreter Lock" (GIL) and some of its IO/corpus-handling design decisions.
So on systems with a large number of cores, such as more than 16, the optimal workers value for maximum throughput is usually less than the full count of cores, often in the 3-12 range. (The exact number will depend on other aspects of your corpus-handling and chosen metaparameters, and for now is most often discovered through trial-and-error.)
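The trial-and-error search could be sketched roughly as follows. This is only an illustration: train_fn is a hypothetical callable you supply that trains once with the given worker count, and time_worker_counts and fastest are names invented here, not gensim functions.

```python
import time


def time_worker_counts(train_fn, candidates, timer=time.perf_counter):
    """Run train_fn(workers) once per candidate; return {workers: seconds}."""
    timings = {}
    for workers in candidates:
        start = timer()
        train_fn(workers)  # assumed to perform one full training run
        timings[workers] = timer() - start
    return timings


def fastest(timings):
    """Return the worker count with the smallest elapsed time."""
    return min(timings, key=timings.get)
```

With gensim, train_fn might be something like lambda w: Word2Vec(sentences, min_count=5, workers=w), run on a representative sample of the corpus so each trial stays short. A single run per candidate is noisy, so repeating the measurement is advisable before settling on a value.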
If your corpus is already in a specific text format, the latest gensim release, 3.6.0, offers a new input mode that allows better scaling of workers all the way up to the count of CPU cores. See this section of the release notes about the new corpus_file parameter for details.
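A rough sketch of that input mode, assuming gensim 3.6.0 or later and assuming the text format in question is gensim's LineSentence layout (one sentence per line, tokens separated by spaces). The helper names write_line_sentence_file and train_from_file are made up for this example.

```python
import os


def write_line_sentence_file(sentences, path):
    """Write tokenized sentences one per line, tokens joined by spaces."""
    with open(path, "w", encoding="utf-8") as handle:
        for tokens in sentences:
            handle.write(" ".join(tokens) + "\n")
    return path


def train_from_file(path, workers=None):
    """Train from a file path rather than a Python iterable (hypothetical helper)."""
    # Imported lazily so the file-writing helper works without gensim installed.
    from gensim.models import Word2Vec

    # Default to the full core count, which is the case corpus_file targets.
    return Word2Vec(
        corpus_file=path,
        min_count=5,
        workers=workers or os.cpu_count() or 1,
    )
```

The corpus is passed as a path through corpus_file instead of as the sentences iterable, which is what lets the worker threads scale closer to the core count.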
answered Nov 21 '18 at 20:58
gojomo