Data mining - Unique column values in Python NumPy - 吾爱随笔录

My array looks like this:

a=np.array([[ 25, 29, 19, 93],
       [27, 59, 23, 345],
       [24, 426, 15, 593],
       [24, 87, 50.2, 139],
       [13, 86, 12.4, 139],
       [13, 25, 85, 142],
       [62, 62, 68.2, 182],
       [27, 25, 20, 150],
       [25, 53, 71, 1850],
       [64, 67, 21.1, 1570],
       [64, 57, 73, 1502]])

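For each unique value in column 0, I want to return the minimum value of column 2, so that column 0 of the result contains only unique values. I tried the following code, but it does not give me the correct result. Can someone help me solve this? Thanks.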

sidx = np.lexsort(a[:,[2,0]].T)
idx = np.append(np.flatnonzero(a[2:,0] >a[:-2,0]), a.shape[0]-1)
result = a[sidx[idx]]
print(result)

I want to get a result like this:

[25...
 27
 24
 13
 62
 64...]

a=[[196512 28978 Decimal('12.7805170314276')]
 [196512 34591 Decimal('12.8994111000000')]
 [196512 13078 Decimal('12.9135746000000')]
 [196641 114569 Decimal('12.9267705000000')]
 [196641 118910 Decimal('12.8983353775637')]
 [196641 100688 Decimal('12.9505091000000')]]
This is a large list.
I used,
df = pd.DataFrame(a)
df.columns = ['a','b','c']
df.index = df.a.astype(str)
dd=df.groupby('a').min()['c']

But I got,

195556 12.7805170314276
195937 12.7805170314276
196149 12.7805170314276
196152 12.7805170314276
196155 12.7805170314276
196262 12.7805170314276

import pandas as pd
df = pd.DataFrame(a)
df.columns = ['a','b','c','d']
df.index = df.a.astype(str) # to preserve correspondence
df.groupby('a').min()['b']

a
13.0    25.0
24.0    87.0
25.0    29.0
27.0    25.0
62.0    62.0
64.0    57.0
Name: b, dtype: float64
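The snippet above reduces column 'b' (index 1), while the question asks for the minimum of column 2. A NumPy-only alternative can be sketched as follows. It is not the answerer's method: it sorts with np.lexsort and then takes the first row of each group via np.unique. The names order, sorted_a, keys, first, min_rows and min_col2 are illustrative only.

```python
import numpy as np

a = np.array([[25, 29, 19, 93],
              [27, 59, 23, 345],
              [24, 426, 15, 593],
              [24, 87, 50.2, 139],
              [13, 86, 12.4, 139],
              [13, 25, 85, 142],
              [62, 62, 68.2, 182],
              [27, 25, 20, 150],
              [25, 53, 71, 1850],
              [64, 67, 21.1, 1570],
              [64, 57, 73, 1502]])

# np.lexsort treats the LAST key as the primary one:
# sort by column 0 first, then break ties by column 2.
order = np.lexsort((a[:, 2], a[:, 0]))
sorted_a = a[order]

# In the sorted array, the first row of every column-0 group
# carries that group's smallest column-2 value.
keys, first = np.unique(sorted_a[:, 0], return_index=True)
min_rows = sorted_a[first]   # full rows, one per unique column-0 value
min_col2 = min_rows[:, 2]    # just the minima of column 2

# keys     -> 13, 24, 25, 27, 62, 64
# min_col2 -> 12.4, 15, 19, 20, 68.2, 21.1
```

Note that the keys come back in sorted order, not in order of first appearance in the original array.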

from decimal import Decimal
y=np.array([[196512, 28978, Decimal('12.7805170314276')],
            [196512, 34591, Decimal('12.8994111000000')],
            [196512, 13078, Decimal('12.9135746000000')],
            [196641, 114569, Decimal('12.9267705000000')],
            [196641, 118910, Decimal('12.8983353775637')],
            [196641, 100688, Decimal('12.9505091000000')]])
df = pd.DataFrame(y)
df.columns = ['a','b','c']
df.index = df.a.astype(str)
dd=df.groupby('a').min()['c']

In [210]: dd
Out[210]:
a
196512    12.7805170314276
196641    12.8983353775637
Name: c, dtype: object
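A slightly more compact variant of the same idea is sketched below, assuming the data is available as a plain list of rows (the name rows is hypothetical). It selects column 'c' before aggregating, so only that column is reduced. It also leaves the index untouched, because recent pandas versions appear to treat a groupby key that names both an index level and a column as ambiguous.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical input: the six sample rows from the question as a list of lists.
rows = [
    [196512, 28978, Decimal('12.7805170314276')],
    [196512, 34591, Decimal('12.8994111000000')],
    [196512, 13078, Decimal('12.9135746000000')],
    [196641, 114569, Decimal('12.9267705000000')],
    [196641, 118910, Decimal('12.8983353775637')],
    [196641, 100688, Decimal('12.9505091000000')],
]

df = pd.DataFrame(rows, columns=['a', 'b', 'c'])

# Group by 'a' and take the minimum of 'c' only; the Decimal values
# are compared as Python objects, so the result keeps dtype object.
dd = df.groupby('a')['c'].min()
```

If every group shows the same minimum, as in the output quoted in the question, it may be worth checking that the grouping key and the value column are still aligned row by row before the groupby call.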