
I have a list like the following:

m=[['abc','x-name',222],['pqr','y-name',333],['mno','j-name',333],['qrt','z-name',111],['dcu','lz-name',999]]

Let's say I want to get the top 2 entries from this list by the 3rd column (i.e. 222, 333, etc.).

I know I can get the max one like this:

>>> m=[['abc','x-name',222],['pqr','y-name',333],['mno','j-name',333],['qrt','z-name',111],['dcu','lz-name',999]]
>>> print max(m, key=lambda x: x[2])
['dcu', 'lz-name', 999]

But what if I have to get the top 2 (counting duplicates)? My result should be:

['dcu', 'lz-name', 999] ['pqr','y-name',333] ['mno','j-name',333]

Is it possible? My head is spinning trying to figure it out. Could you please have a look and help me?

Or, I just got an idea:

You could tell me how to delete the max element, so that I can get the top 2 elements by iterating (duplicates will be a problem, though).

Manojcode

1 Answer


You can sort and slice instead:

>>> from operator import itemgetter
>>> sorted(m, key=itemgetter(2), reverse=True)[:3]
[['dcu', 'lz-name', 999], ['pqr', 'y-name', 333], ['mno', 'j-name', 333]]

Or, using heapq.nlargest():

>>> import heapq
>>> heapq.nlargest(3, m, key=itemgetter(2))
[['dcu', 'lz-name', 999], ['pqr', 'y-name', 333], ['mno', 'j-name', 333]]

Neither approach handles the duplicates nicely, though: the 3 has to be known in advance. Also, sorted() is not linear-time (it is O(n log n)) and it creates a sorted copy of the initial list in memory. Please see the following threads for linear-time and more memory-efficient solutions:
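To make the duplicate handling concrete, here is a minimal sketch, assuming "top 2" means "every row whose 3rd column is one of the 2 largest distinct values". The helper name top_n_with_ties is made up for illustration and is not from any library. It makes two passes over the list and sorts only the (usually small) result rather than the whole input, though it does build a set of the distinct values.

```python
import heapq
from operator import itemgetter


def top_n_with_ties(rows, n, key=itemgetter(2)):
    """Return every row whose key is among the n largest distinct key values.

    Hypothetical helper for illustration only.
    """
    # First pass: collect the distinct key values and keep the n largest.
    top_values = set(heapq.nlargest(n, {key(row) for row in rows}))
    # Second pass: keep all rows with one of those values, so ties survive.
    matches = [row for row in rows if key(row) in top_values]
    # Sort only the matches; sorted() is stable, so ties keep input order.
    return sorted(matches, key=key, reverse=True)


m = [['abc', 'x-name', 222], ['pqr', 'y-name', 333], ['mno', 'j-name', 333],
     ['qrt', 'z-name', 111], ['dcu', 'lz-name', 999]]

print(top_n_with_ties(m, 2))
# [['dcu', 'lz-name', 999], ['pqr', 'y-name', 333], ['mno', 'j-name', 333]]
```

Unlike slicing with a fixed [:3], the number of returned rows here depends on how many ties there are, so it does not need to be known up front.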

alecxe
  • Actually, my list is very big, around 10 MB to 100 MB. Will sorting be a good idea? – Manojcode Apr 09 '16 at 13:41
  • So alecxe, just to be clear, are you saying I should go with sorted()? – Manojcode Apr 09 '16 at 14:04
  • @Manojcode I would experiment with the linear-time solutions that do not produce an extra list in memory, like [this one](http://stackoverflow.com/a/16226255/771848). Plus, the sample solutions I've posted don't handle the duplicates... not sure how strict of a requirement it is. Thanks. – alecxe Apr 09 '16 at 14:08
  • In terms of requirements, I cannot skip any data. What I am seeing is that both approaches handle duplicates fine for my case, since the list is sorted first and I take the top 5, which is pretty much what I need. Still, from a complexity perspective, I am not sure whether there is any other solution to this. – Manojcode Apr 09 '16 at 15:44