I am trying to get the index of the row with the second highest value within each group after a groupby, but I am not getting the right result.
    import pandas as pd

    df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                       'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                       'Value': [1, 2, 3, 4, 5, 6],
                       'count': [3, 2, 5, 10, 10, 6]})

Doing this:

    df.iloc[df.groupby(['Mt'])['Value'].apply(lambda x: (x != max(x)).idxmax())]

returns:

       Mt Sp  Value  count
    0  s1  a      1      3
    2  s2  c      3      5
    5  s3  f      6      6

For group s2, index 3 of the original dataframe should be returned.
Recommended answer

Since 'Value' is already sorted, you can use nth:
    In [11]: g = df.groupby("Mt", as_index=False)

    In [12]: g.nth(-2)
    Out[12]:
       Mt Sp  Value  count
    0  s1  a      1      3
    3  s2  d      4     10

Otherwise I'd first sort by Value, df = df.sort_values("Value").
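To illustrate the sort-first variant, here is a minimal sketch that starts from a deliberately unsorted copy of the question's frame. The names `unsorted` and `second` are my own, and the index and column layout that nth returns may differ between pandas versions, so the sketch only reads the 'Sp' and 'Value' columns.

```python
import pandas as pd

df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                   'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                   'Value': [1, 2, 3, 4, 5, 6],
                   'count': [3, 2, 5, 10, 10, 6]})

# Hypothetical unsorted input: the same rows in reverse order.
unsorted = df.iloc[::-1]

# Sort by Value first, so that position -2 within each group
# really is the second highest value of that group.
g = unsorted.sort_values("Value").groupby("Mt", as_index=False)
second = g.nth(-2)

# Group s3 has a single row, so it contributes nothing here.
print(second[["Sp", "Value"]])
```

Without the sort_values call, nth(-2) would pick the second-to-last row in whatever order the rows happen to be stored.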
If you also want a row for groups with fewer than two rows (falling back to the last row), you can grab that too:
    In [21]: g = df.groupby("Mt")

    In [22]: res = g.nth(-1)

    In [23]: res.update(g.nth(-2))

    In [24]: res
    Out[24]:
       Sp  Value  count
    Mt
    s1  a      1      3
    s2  d      4     10
    s3  f      6      6
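The update trick relies on nth returning a frame indexed by the group key. As far as I know, nth behaves as a filter that keeps the original row labels from pandas 2.0 on, so the two frames may no longer align there. As a hedged alternative, the sketch below computes the wanted index labels directly with apply. The helper `second_highest_index` is my own name, not a pandas function, and it ranks by position rather than by distinct values, so ties count as separate ranks.

```python
import pandas as pd

df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                   'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                   'Value': [1, 2, 3, 4, 5, 6],
                   'count': [3, 2, 5, 10, 10, 6]})


def second_highest_index(values):
    """Return the label of the second highest value in a group,
    or the label of the only value if the group has a single row."""
    ordered = values.sort_values(ascending=False, kind="stable")
    return ordered.index[1] if len(ordered) > 1 else ordered.index[0]


# One index label per group, keyed by Mt.
idx = df.groupby("Mt")["Value"].apply(second_highest_index)

# Look the rows up by label in the original frame.
result = df.loc[idx.tolist()]
print(result)
```

Because idx holds labels of the original frame, df.loc is the matching lookup here, not df.iloc.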
A related function is tail (to get the last two elements of each group):
    In [31]: g.tail(2)
    Out[31]:
       Mt Sp  Value  count
    0  s1  a      1      3
    1  s1  b      2      2
    3  s2  d      4     10
    4  s2  e      5     10
    5  s3  f      6      6
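Building on tail, one possible way to get "second highest, or the last row if the group is too small" in a single chain is to take head(1) of each group's last two rows. This is only a sketch: the variable names are mine, and it assumes the frame is sorted by Value first.

```python
import pandas as pd

df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                   'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                   'Value': [1, 2, 3, 4, 5, 6],
                   'count': [3, 2, 5, 10, 10, 6]})

# Keep the two highest rows of every group (or one, if that is all there is).
last_two = df.sort_values("Value").groupby("Mt").tail(2)

# The first of those rows is the second highest, or the only row of the group.
second_or_last = last_two.groupby("Mt").head(1)

print(second_or_last.index.tolist())
```

Both tail and head keep the original row labels, so the resulting index can be used directly against the original dataframe.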