How do I parse the elements in this list?












I have a list to parse (and I am looking for a generic way to parse any list like this):



dev-libs/icu-63.1-r1 alpha amd64 arm arm64 ia64 ppc ppc64 x86 hppa s390
dev-libs/icu-layoutex-63.1 alpha amd64 ia64 ppc ppc64 x86 hppa sparc
dev-lang/perl-5.28-r1 s390
virtual/ruby_gems-0.3_pre24 amd64 x86



This seems to fail sometimes, because it tries to parse the architectures list, starting with alpha and running to the end of the line. I really want to ignore everything after the package version, while allowing for a possible space after the version.



My code is as follows (the print calls are just for debugging):



for line in args.list:
    print(line)
    package_category = re.search(r'((?<==)\w+-\w+|\w+-\w+|\w+)', line).group(0)
    print(package_category)
    package_name = re.search(r'(?<=/)[a-z]+.[a-z]+', line).group(0)
    print(package_name)
    package_version = re.search(r'(?<=-)\d+.\d-*\w*\s?', line).group(0)


I expect this to do the following:



package_category variable should contain a category like:



dev-libs
dev-lang
virtual



package_name should contain a package name, like:



icu
icu-layoutex
perl
ruby_gems



package_version should contain a version, like:



63.1-r1
63.1
0.3_pre24



The rest should just be ignored.



Currently I somehow hit the architectures list, with this output:



dev-libs/icu-63.1-r1
dev-libs
icu
alpha
alpha
Traceback (most recent call last):
  File "./repomator.py", line 47, in <module>
    package_name = re.search(r'(?<=/)[a-z]+.[a-z]+', line).group(0)
AttributeError: 'NoneType' object has no attribute 'group'







python regex














edited Jan 1 at 13:00







Misha Lavrov

















asked Jan 1 at 12:47









Misha Lavrov














  • I ran your code and it gave me exactly what you expect. Where are you failing?

    – Aaron_ab
    Jan 1 at 12:54











  • Why not a list of package_category that contains all the strings before the /, a list of package_name that contains all the strings after the / but before the second occurrence of -, and a list of package_version that contains all the strings after the second occurrence of - and before the first space " "?

    – DirtyBit
    Jan 1 at 12:54











  • @Aaron_ab I've edited the OP to include the error I hit

    – Misha Lavrov
    Jan 1 at 13:01
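
The split-based idea from the comments can be sketched without a regex. Note that "second occurrence of -" would not work as stated for a name like icu-layoutex, so this sketch assumes instead that the version starts at the first hyphen that is followed by a digit. That rule, and the function name split_atom, are assumptions made for illustration; they are not stated in the question.

```python
def split_atom(line):
    """Split 'category/name-version arch ...' into (category, name, version).

    Assumption: the version begins at the first '-' followed by a digit.
    Returns None when the line does not look like a package entry.
    """
    fields = line.split()
    if not fields or "/" not in fields[0]:
        return None  # e.g. a bare architecture token such as 'alpha'
    category, _, rest = fields[0].partition("/")
    for i, ch in enumerate(rest):
        if ch == "-" and rest[i + 1:i + 2].isdigit():
            return category, rest[:i], rest[i + 1:]
    return None  # no version part found


for line in [
    "dev-libs/icu-63.1-r1 alpha amd64 arm arm64 ia64 ppc ppc64 x86 hppa s390",
    "dev-libs/icu-layoutex-63.1 alpha amd64 ia64 ppc ppc64 x86 hppa sparc",
    "dev-lang/perl-5.28-r1 s390",
    "virtual/ruby_gems-0.3_pre24 amd64 x86",
]:
    print(split_atom(line))
```

Only the first whitespace-separated field is inspected, so the architecture list after the version is ignored regardless of its contents.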































1 Answer
Is this what you want?



(?P<category>\w+(?:-\w+)?)/(?P<name>[a-z]+(?:[-_][a-z]+)?)-(?P<version>\S+)


Demo



Explanation:



(?P<category>        # named group category
  \w+                # 1 or more word characters
  (?:-\w+)?          # optional: a dash, then 1 or more word characters
)                    # end group
/                    # a slash
(?P<name>            # named group name
  [a-z]+             # 1 or more lowercase letters
  (?:[-_][a-z]+)?    # optional: dash or underscore, then 1 or more lowercase letters
)                    # end group
-                    # a dash
(?P<version>         # named group version
  \S+                # 1 or more non-space characters
)                    # end group
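
In Python, a commented layout like the one above can be compiled directly by passing re.VERBOSE, which makes the engine ignore whitespace and # comments inside the pattern. A minimal sketch, using Python's (?P<name>...) spelling for named groups; the variable name ATOM_RE and the sample line are illustrative choices, not part of the original answer:

```python
import re

ATOM_RE = re.compile(r"""
    (?P<category>         # named group category
        \w+               # 1 or more word characters
        (?:-\w+)?         # optional: a dash, then 1 or more word characters
    )
    /                     # a slash
    (?P<name>             # named group name
        [a-z]+            # 1 or more lowercase letters
        (?:[-_][a-z]+)?   # optional: dash or underscore, then 1 or more letters
    )
    -                     # a dash
    (?P<version>          # named group version
        \S+               # 1 or more non-space characters
    )
""", re.VERBOSE)

m = ATOM_RE.search("dev-lang/perl-5.28-r1 s390")
print(m.groupdict())
```

Keeping the comments inside the pattern means the explanation and the code cannot drift apart.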


code:



import re

lines = [
    'dev-libs/icu-63.1-r1 alpha amd64 arm arm64 ia64 ppc ppc64 x86 hppa s390 ',
    'dev-libs/icu-layoutex-63.1 alpha amd64 ia64 ppc ppc64 x86 hppa sparc',
    'dev-lang/perl-5.28-r1 s390',
    'virtual/ruby_gems-0.3_pre24 amd64 x86',
]
# category/name-version; the version runs up to the first whitespace
pattern = re.compile(r'(?P<category>\w+(?:-\w+)?)/(?P<name>[a-z]+(?:[-_][a-z]+)?)-(?P<version>\S+)')
for line in lines:
    res = pattern.search(line)
    print("cat:", res.group('category'), "name:", res.group('name'), "version:", res.group('version'))


Output:



cat: dev-libs name: icu version: 63.1-r1
cat: dev-libs name: icu-layoutex version: 63.1
cat: dev-lang name: perl version: 5.28-r1
cat: virtual name: ruby_gems version: 0.3_pre24














edited Jan 1 at 13:17

























answered Jan 1 at 13:03









Toto














  • Hmm, the regex itself works, but when I try to put the results into a dictionary like this (for further use): packages.append({ "category": res.group('category'), "name": res.group('name'), "version": res.group('version') }) I get the error AttributeError: 'NoneType' object has no attribute 'group'. Why does it try to perform the search again, if this is already in a group? :(

    – Misha Lavrov
    Jan 1 at 13:53











  • @MishaLavrov: I don't know much Python, but I think a dictionary doesn't have an append method. How do you use it? Are you doing the assignment inside or outside the for loop? Inside the for loop, just add packages["category"] = res.group('category'), and the same for name and version. I've just tried it and it works fine.

    – Toto
    Jan 1 at 16:13
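
On the AttributeError raised in the comments: re.search is not run again when .group() is called. The error means res itself was None, because some input string did not match at all (for example a bare architecture token such as alpha, as in the question's output). A minimal sketch that collects matches into a list of dicts and skips non-matching strings follows. The name packages comes from the comment above, while parse_atoms and the sample input are assumptions made for illustration:

```python
import re

ATOM_RE = re.compile(
    r"(?P<category>\w+(?:-\w+)?)/(?P<name>[a-z]+(?:[-_][a-z]+)?)-(?P<version>\S+)"
)


def parse_atoms(items):
    """Return one dict per item that looks like category/name-version."""
    packages = []
    for item in items:
        res = ATOM_RE.search(item)
        if res is None:  # e.g. 'alpha' or 'amd64': nothing to extract
            continue
        packages.append(res.groupdict())
    return packages


# Hypothetical input: the list has already been split on whitespace
items = ["dev-libs/icu-63.1-r1", "alpha", "amd64", "virtual/ruby_gems-0.3_pre24"]
for pkg in parse_atoms(items):
    print(pkg)
```

Here packages is a list, so append is valid; each element is the dict built by groupdict() from the three named groups.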






































