Pattern recognition in Pandas column(s)












I don't know whether Stack Overflow is the right place to post this; if it isn't, please suggest an alternative.



I have a dataset in which I want to classify the data in an "opmerking" (Dutch for "comment") column. I've created a dummy dataset to explain what I mean.



[image]
Each number represents a byte in which each bit has its own meaning. I have a script that takes the most important bits from those bytes, and the resulting dataset looks like this:



[image]
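The byte-to-bit step described above could look roughly like the following. This is only a sketch: the column name `status_byte`, the helper `extract_bits`, and the bit positions in `BIT_POSITIONS` are made up for illustration, since the real bit layout is not given in the question.

```python
import pandas as pd

# Hypothetical layout: which bit of the byte carries which signal.
BIT_POSITIONS = {"closeout": 0, "openout": 1, "photocells": 2}


def extract_bits(byte_values: pd.Series, layout: dict) -> pd.DataFrame:
    """Return one 0/1 column per named bit of an integer byte column."""
    values = byte_values.to_numpy(dtype="int64")
    return pd.DataFrame(
        # Shift the wanted bit to position 0, then mask everything else off.
        {name: (values >> pos) & 1 for name, pos in layout.items()},
        index=byte_values.index,
    )


raw = pd.Series([0b000, 0b001, 0b110, 0b101], name="status_byte")
bits = extract_bits(raw, BIT_POSITIONS)
print(bits)
```

Working on the extracted bit columns rather than the raw byte values keeps each signal separately addressable, which is what the filter further down relies on.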



The bytes/bits indicate the cycles of a piece of equipment, and I need to detect where a cycle isn't normal.
So far I can detect the rows where abnormal behaviour occurs, but I would like to flag the full cycle.



This is the code I use to find abnormalities in the bit data:



# Rows matching the abnormal "photocells" bit pattern within the date range
mask = (
    (dfsamen.closeout == 0) & (dfsamen.openout == 0)
    & (dfsamen.limitswitchopen == 0) & (dfsamen.limitswitchclose == 0)
    & (dfsamen.prelimit == 0) & (dfsamen.photocells == 1)
    & (dfsamen.lampgroen == 1) & (dfsamen.lamprood == 1)
    & (dfsamen.halfopen == 0)
    & (dfsamen["DateTime"] >= '2018-06-1 00:00:00')
    & (dfsamen["DateTime"] <= '2018-10-10 00:00:00')
)
# .copy() avoids SettingWithCopyWarning when the new column is added
dfphotocells = dfsamen[mask].copy()
dfphotocells['opmerking'] = 'photocells'


This obviously only gives me the rows where photocells = 1.
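One possible way to extend a row-level match to the whole cycle is to give every row a cycle id and then broadcast the flag across each group. The sketch below assumes, purely for illustration, that a new cycle starts whenever `openout` rises from 0 to 1; the real cycle boundary may well be a different signal. The data is made up, and `cycle_id` is a hypothetical helper column.

```python
import pandas as pd

# Made-up bit data: two cycles, the second one contains a photocell event.
df = pd.DataFrame({
    "openout":    [1, 1, 0, 0, 1, 1, 0, 0],
    "photocells": [0, 0, 0, 0, 0, 1, 0, 0],
})

# Assumption: a new cycle begins on each rising edge (0 -> 1) of openout.
previous = df["openout"].shift(fill_value=0)
cycle_start = (df["openout"] == 1) & (previous == 0)
df["cycle_id"] = cycle_start.cumsum()

# Simplified stand-in for the full row-level filter.
abnormal_row = df["photocells"] == 1

# A cycle is abnormal if any of its rows is abnormal; transform("any")
# broadcasts that per-cycle result back onto every row of the cycle.
abnormal_cycle = abnormal_row.groupby(df["cycle_id"]).transform("any")

df["opmerking"] = ""
df.loc[abnormal_cycle, "opmerking"] = "photocells"
print(df)
```

With this layout the "opmerking" value lands on all rows of the affected cycle, not only on the one or two rows where the bit pattern matched.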



I was thinking of using a naive Bayes classifier, but I don't know how accurate that would be.



So my full question is:
Should I look at my cycles bytewise or bitwise, and which method should I use for detecting a full cycle? (I am not looking for someone to code this for me, just a nudge in the right direction.)
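As one possible nudge on the bitwise side: if the rows already carry a cycle id, each cycle can be reduced to the ordered sequence of bit states it passes through and compared against a known-good sequence. Everything in this sketch is an assumption made for illustration: the `cycle_id` column, the three-bit subset in `BIT_COLUMNS`, and especially `NORMAL_SEQUENCE`, which would have to come from knowledge of the equipment.

```python
import pandas as pd

BIT_COLUMNS = ["openout", "closeout", "photocells"]  # subset, for illustration

# Hypothetical "normal" cycle: opening, idle, closing, idle.
NORMAL_SEQUENCE = [(1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0)]


def state_sequence(cycle: pd.DataFrame) -> list:
    """Ordered bit states of one cycle, with consecutive repeats collapsed."""
    states = list(cycle[BIT_COLUMNS].itertuples(index=False, name=None))
    return [s for i, s in enumerate(states) if i == 0 or s != states[i - 1]]


# Made-up data: cycle 1 follows the normal sequence, cycle 2 does not.
df = pd.DataFrame({
    "cycle_id":   [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "openout":    [1, 1, 0, 0, 0, 1, 0, 0, 0, 0],
    "closeout":   [0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    "photocells": [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
})

is_normal = {
    cycle_id: state_sequence(cycle) == NORMAL_SEQUENCE
    for cycle_id, cycle in df.groupby("cycle_id")
}
df["opmerking"] = df["cycle_id"].map(lambda c: "" if is_normal[c] else "abnormal")
print(df)
```

Collapsing repeated states makes the comparison independent of how long the equipment stays in each state, which may or may not be desirable depending on whether timing itself counts as abnormal.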










  • First you need to decide what your output classes are, and then use the bytewise data as features to classify the test/new records into one of these classes. You can use multinomial logistic regression to achieve this.

    – min2bro
    Nov 22 '18 at 11:26








  • Martijn, first of all, the image data isn't really helpful; it would be better to paste the first 10 lines of the data here, which would be easier to read. You can also add what you have tried so far.

    – pygo
    Nov 22 '18 at 11:45











  • I wanted to paste the data as text, but that came out messy and even less readable in my opinion, and I needed to show more than 10 lines to illustrate what I mean.

    – Martijn van Amsterdam
    Nov 22 '18 at 12:01











  • I've added the line of code that I use to find the abnormalities, but it only shows me the one or two rows where that is the case. I am looking for a way to apply the comment/opmerking to the whole cycle.

    – Martijn van Amsterdam
    Nov 22 '18 at 12:07











  • Multinomial logistic regression looks like it might work; thanks for the suggestion.

    – Martijn van Amsterdam
    Nov 22 '18 at 12:41
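As an illustration of the multinomial logistic regression suggested in the comments, a minimal sketch with scikit-learn might look like this. It assumes the rows have already been grouped into cycles and that some cycles have been labelled by hand; the data, the `cycle_id` column, the choice of per-cycle bit counts as features, and the class names are all made up for the example.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up per-row bit data with a hypothetical cycle id.
rows = pd.DataFrame({
    "cycle_id":   [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "photocells": [0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
    "halfopen":   [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0],
})

# Hand-assigned label per cycle (the output classes decided up front).
labels = pd.Series(
    ["normal", "photocells", "halfopen", "photocells", "normal", "halfopen"],
    index=[1, 2, 3, 4, 5, 6],
)

# One feature row per cycle: how many rows of the cycle had each bit set.
features = rows.groupby("cycle_id")[["photocells", "halfopen"]].sum()

# With more than two classes, the default solver fits a multinomial model.
model = LogisticRegression(max_iter=1000)
model.fit(features, labels.loc[features.index])

# Classify a new, unlabelled cycle described by the same features.
new_cycle = pd.DataFrame({"photocells": [2], "halfopen": [0]})
prediction = model.predict(new_cycle)
print(prediction)
```

Because the classifier sees one row per cycle, its prediction applies to the cycle as a whole, which is what the "opmerking" column is meant to express. How accurate it is would depend on how many labelled cycles are available.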
















python pandas






edited Nov 22 '18 at 12:04







Martijn van Amsterdam

















asked Nov 22 '18 at 11:14









Martijn van Amsterdam












