spaCy MemoryError












I managed to install spaCy, but when I try to use nlp I get a MemoryError for some weird reason.



The code I wrote is as follows:



import spacy
import re
from nltk.corpus import gutenberg

def clean_text(astring):
    # replace newlines with spaces
    newstring = re.sub(r"\n", " ", astring)
    # remove title and chapter headings
    newstring = re.sub(r"\[[^\]]*\]", " ", newstring)
    newstring = re.sub(r"VOLUME \S+", " ", newstring)
    newstring = re.sub(r"CHAPTER \S+", " ", newstring)
    # collapse runs of whitespace into a single space
    newstring = re.sub(r"\s\s+", " ", newstring)
    return newstring.strip()

nlp = spacy.load('en')
alice = clean_text(gutenberg.raw('carroll-alice.txt'))
nlp_alice = list(nlp(alice).sents)


The error I am getting is as follows



The error message



However, when my code is something like this, it works:



import spacy

nlp=spacy.load('en')
alice=nlp("hello Hello")


If anybody could point out what I am doing wrong, I would be very grateful.










  • Hmm... can you look at your computer's memory usage and check whether it is being used up entirely? Use Task Manager on Windows or Activity Monitor on a Mac.

    – Joe B
    Jan 2 at 23:20











  • Probably because list(nlp(alice).sents) uses all your memory...

    – juanpa.arrivillaga
    Jan 2 at 23:35
















python nlp spacy






asked Jan 2 at 23:16

Marion Laanemae

1 Answer

I'm guessing you truly are running out of memory. I couldn't find an exact number, but Carroll's Alice's Adventures in Wonderland probably has sentences in the low thousands and tens of thousands of tokens. Each sentence becomes a spaCy Span, and every token gets annotated: unless you change the pipeline, nlp() determines everything from POS tags to dependencies for the string passed to it. Moreover, the sents property returns an iterator, which you should take advantage of rather than immediately expanding it into a list.
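One quick check before anything else, sketched here with only the standard library: a 32-bit Python build can address only a few gigabytes no matter how much RAM is installed, so a MemoryError can show up long before the machine itself is full. The helper name python_bits is made up for this illustration; it is not part of spaCy.

```python
import struct
import sys


def python_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    version = sys.version.split()[0]
    print(f"Python {version}, {python_bits()}-bit build")
```

If this reports a 32-bit build, switching to a 64-bit interpreter is worth trying before changing any spaCy code.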



Basically, you're attempting a computation that may well be running into a memory constraint. How much memory does your machine have? In the comments, Joe suggested watching your machine's memory usage, and I second this. Keep in mind that nlp(alice) still builds the whole Doc in one go, so iterating only avoids the extra list on top of it. My recommendations: check whether you are actually running out of memory, limit the functionality of nlp(), or consider doing your work with the iterator functionality:



for sentence in nlp(alice).sents:
    pass  # work with one sentence at a time here
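If the single nlp(alice) call is itself what exhausts memory, the loop above will not help, because the whole Doc exists before iteration starts. A possible workaround, sketched below under assumptions, is to feed the text to nlp in smaller pieces. The helper names iter_chunks and iter_sentences and the 10,000-character default are invented for this illustration, and the regex split is only a rough guess at sentence boundaries, used so that chunks do not end mid-sentence.

```python
import re


def iter_chunks(text, max_chars=10_000):
    """Yield pieces of `text`, each at most `max_chars` long where possible.

    Breaks happen only after '.', '!' or '?' followed by whitespace, so a
    single piece longer than `max_chars` is still yielded whole.
    """
    pieces = re.split(r"(?<=[.!?])\s+", text)
    chunk = ""
    for piece in pieces:
        if chunk and len(chunk) + 1 + len(piece) > max_chars:
            yield chunk
            chunk = piece
        else:
            chunk = f"{chunk} {piece}" if chunk else piece
    if chunk:
        yield chunk


def iter_sentences(nlp, text, max_chars=10_000):
    """Run `nlp` on one chunk at a time and yield sentence strings."""
    for chunk in iter_chunks(text, max_chars):
        doc = nlp(chunk)
        for sent in doc.sents:
            # Yield plain text so no reference to `doc` is kept alive.
            yield sent.text
```

With the question's variables this would be used as `for text in iter_sentences(nlp, alice): ...`. Yielding sent.text rather than the Span lets each chunk's Doc be freed once it has been consumed. spaCy also offers nlp.pipe() for processing an iterable of texts, and spacy.load() accepts a disable argument for switching off pipeline components you do not need, which is one way to limit what nlp() computes.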





answered Jan 3 at 21:56

Alex L
