Choose dictionary keys only if their values don't have a certain number of duplicates

-1

Given a dictionary and a limit for the number of keys in a new dictionary, I would like the new dictionary to contain the keys with the highest values.

The given dict is:

dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }

I want to get a new dictionary with at most limit entries, holding the keys with the highest values.

For instance for limit=1 the new dict is

{'apple':5}

if the limit=2

{'apple':5, 'pears':4}

I tried this:

return dict(sorted(dictation.items(),key=lambda x: -x[1])[:limit])

but when I try limit=3, I get

{'apple':5, 'pears':4, 'orange':3}

But it shouldn't include orange:3, because orange and kiwi have the same priority. Including both kiwi and orange would exceed the limit, so neither of them should be included. It should return

{'apple':5, 'pears':4}
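To make the behaviour concrete, here is a minimal, runnable sketch of the attempt above. The function name top_naive is hypothetical (it is not part of the question); the body is the one-liner from the question, unchanged.

```python
def top_naive(dictation, limit):
    # Sort items by value, highest first, and keep the first `limit` of them.
    # sorted() is stable, so tied values keep their original order and the
    # slice can cut straight through a group of equal values.
    return dict(sorted(dictation.items(), key=lambda x: -x[1])[:limit])


dictation = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}

print(top_naive(dictation, 2))  # {'apple': 5, 'pears': 4}
print(top_naive(dictation, 3))  # {'apple': 5, 'pears': 4, 'orange': 3}
```

With limit=3 the slice keeps orange but drops kiwi, which is exactly the split of a tied group that should be avoided.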

edited Nov 21 '18 at 18:55 by Conner

asked Nov 21 '18 at 18:43 by Comp

1

I followed this question until you said "because I can't add orange. If I add it will be more than the limit.".

– timgeb
Nov 21 '18 at 18:47

1

So you're saying if there are multiple items with the same count, you should only take any if all of them fit within the limit? Have you looked into whether Counter.most_common does what you need? I'd recommend not trying to fit it into one line.

– jonrsharpe
Nov 21 '18 at 18:49

python dictionary

3 Answers

One way to go is to use a collections.Counter and most_common(n+1). Take one item more than the limit and, if that extra item exists, keep popping from the end until the last value changes:

from collections import Counter

dct = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
n = 3

# Take one item more than the limit so a tie at the boundary is visible.
items = Counter(dct).most_common(n + 1)
if len(items) > n:
    # Drop the extra item and everything tied with it.
    last_val = items[-1][1]
    while items and items[-1][1] == last_val:
        items.pop()

new = dict(items)
# {'apple': 5, 'pears': 4}
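The same idea can be wrapped in a function so that several limits are easy to try. This is only a sketch under assumptions: the name top_whole_ties and the early return for non-positive limits are illustrative additions, not part of the answer.

```python
from collections import Counter


def top_whole_ties(dct, n):
    """Return up to n highest-valued items, never splitting a tie group."""
    if n <= 0:
        return {}
    # Ask for one extra item so a tie across the boundary can be detected.
    items = Counter(dct).most_common(n + 1)
    if len(items) > n:
        # The extra item's value marks the group that does not fit.
        cut_val = items[-1][1]
        while items and items[-1][1] == cut_val:
            items.pop()
    return dict(items)


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
print(top_whole_ties(data, 3))  # {'apple': 5, 'pears': 4}
print(top_whole_ties(data, 4))  # {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
```

The `while items and ...` guard matters when every fetched item shares one value: the list is then emptied completely, and indexing items[-1] without the guard would raise an IndexError.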

edited Nov 21 '18 at 18:54

answered Nov 21 '18 at 18:49 by schwobaseggl

But OP says for limit=3 the dict should still have 2 keys, for reasons I don't understand.

– timgeb
Nov 21 '18 at 18:50

@timgeb I added the necessary bumpiness. Lost all of its appeal :(

– schwobaseggl
Nov 21 '18 at 18:57

Still shorter than mine.

– Patrick Artner
Nov 21 '18 at 19:03


This is not computationally efficient, but it works. It creates a Counter object to get your data sorted by value, plus an inverted defaultdict that maps each score to the list of keys holding it, and builds the result from both with a little arithmetic:

from collections import defaultdict, Counter

def gimme(d, n):
    c = Counter(d)
    # Invert the mapping: value -> list of keys that hold this value.
    grpd = defaultdict(list)
    for key, value in c.items():
        grpd[value].append(key)

    result = {}
    for key, value in c.most_common():
        if key in result:
            continue  # already added together with its tie group
        if len(grpd[value]) + len(result) <= n:
            result.update({k: value for k in grpd[value]})
        else:
            break
    return result

Test:

data = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }

for k in range(10):
    print(k, gimme(data, k))

Output:

0 {}
1 {'apple': 5}
2 {'apple': 5, 'pears': 4}
3 {'apple': 5, 'pears': 4}
4 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
5 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
6 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
7 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
8 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
9 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
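The grouping step can also be sketched with itertools.groupby, which visits each run of equal values exactly once. This is a swapped-in alternative rather than the answer's own code, and the name top_by_groups is hypothetical. It sorts the whole input, so it is no cheaper than the version above.

```python
from itertools import groupby
from operator import itemgetter


def top_by_groups(d, n):
    """Add whole groups of equal values, highest first, while they fit in n."""
    result = {}
    ordered = sorted(d.items(), key=itemgetter(1), reverse=True)
    # groupby yields runs of consecutive items sharing the same value.
    for _, group in groupby(ordered, key=itemgetter(1)):
        group = list(group)
        if len(result) + len(group) > n:
            break  # this tie group does not fit, so stop here
        result.update(group)
    return result


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
for k in (3, 4, 5):
    print(k, top_by_groups(data, k))
```

Because a group is either added whole or ends the loop, no key is ever counted twice against the limit.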

answered Nov 21 '18 at 19:01 by Patrick Artner


As you note, taking the top n items does not, by default, exclude a group of equal values that would exceed the stated cap. This is by design.

The trick is to consider the (n+1)th highest value and ensure the values in your dictionary are all strictly higher than this number:

from heapq import nlargest

dictation = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}

n = 3
# Fetch one item more than the limit; its value is the cut-off.
largest_items = nlargest(n + 1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]

# Keep only items strictly above the cut-off, so tied groups are never split.
res = {k: v for k, v in largest_items if v > n_plus_one_value}

print(res)
# {'apple': 5, 'pears': 4}

We assume here that len(dictation) > n, so that an (n+1)th item exists; otherwise you can just take the input dictionary as the result.
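That assumption can be made explicit with a guard. The following is a sketch only: the function name top_above_cutoff and the handling of non-positive limits are assumptions added for illustration.

```python
from heapq import nlargest
from operator import itemgetter


def top_above_cutoff(dictation, n):
    """Keep items whose value is strictly above the (n+1)th highest value."""
    if n <= 0:
        return {}
    if len(dictation) <= n:
        # Nothing has to be cut, so the whole input fits.
        return dict(dictation)
    largest_items = nlargest(n + 1, dictation.items(), key=itemgetter(1))
    n_plus_one_value = largest_items[-1][1]
    return {k: v for k, v in largest_items if v > n_plus_one_value}


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
print(top_above_cutoff(data, 3))  # {'apple': 5, 'pears': 4}
print(top_above_cutoff(data, 5))  # the full dictionary
```

Without the length check, a limit of 5 on this input would treat banana's value as the cut-off and wrongly drop it.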

The dictionary comprehension compares every fetched item against the cut-off, which seems expensive. For larger inputs, you can locate the cut-off with bisect instead, something like:

from heapq import nlargest
from operator import itemgetter
from bisect import bisect

dictation = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}

n = 3
largest_items = nlargest(n + 1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]

# bisect needs ascending order, so search the reversed list of values;
# index is the number of trailing items tied with the cut-off value.
index = bisect(list(map(itemgetter(1), largest_items))[::-1], n_plus_one_value)

res = dict(largest_items[:len(largest_items) - index])

print(res)
# {'apple': 5, 'pears': 4}
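A variation on the bisect idea, offered as a sketch rather than as the answer's method: negating the values turns the descending output of nlargest into an ascending sequence, so bisect_left returns the cut position directly. The name top_with_bisect is hypothetical, and the code assumes the values are numeric so that negation is meaningful.

```python
from bisect import bisect_left
from heapq import nlargest
from operator import itemgetter


def top_with_bisect(dictation, n):
    """Cut the n+1 largest items at the first one tied with the cut-off."""
    if n <= 0:
        return {}
    if len(dictation) <= n:
        return dict(dictation)
    largest_items = nlargest(n + 1, dictation.items(), key=itemgetter(1))
    # Values arrive in descending order; negated, they are ascending,
    # which is the order bisect requires.
    negated = [-v for _, v in largest_items]
    # Leftmost position of the cut-off value = number of items to keep.
    cut = bisect_left(negated, negated[-1])
    return dict(largest_items[:cut])


data = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
print(top_with_bisect(data, 3))  # {'apple': 5, 'pears': 4}
```

Building the negated list is still linear in n+1, so any gain over the comprehension is likely small; the main benefit is that the cut index needs no further arithmetic.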

edited Nov 22 '18 at 2:28

answered Nov 21 '18 at 19:03 by jpp
