I have the following CSV data:
shot_id,round_id,hole,shotType,clubType,desiredShape,lineDirection,shotQuality,note
48,2,1,tee,driver,straight,straight,good,
49,2,1,approach,iron,straight,right,bad,
50,2,1,approach,wedge,straight,straight,bad,
51,2,1,approach,wedge,straight,straight,bad,
52,2,1,putt,putter,straight,straight,good,
53,2,1,putt,putter,straight,straight,good,
54,2,2,tee,driver,draw,straight,good,
55,2,2,approach,iron,draw,straight,good,
56,2,2,putt,putter,straight,straight,good,
57,2,2,putt,putter,straight,straight,good,
58,2,3,tee,driver,draw,straight,good,
59,2,3,approach,iron,straight,right,good,
60,2,3,chip,wedge,straight,straight,good,
61,2,3,putt,putter,straight,straight,good,
62,2,4,tee,iron,straight,straight,good,
63,2,4,putt,putter,straight,straight,good,
64,2,4,putt,putter,straight,straight,good,
65,2,5,tee,driver,straight,left,good,
66,2,5,approach,wedge,straight,straight,good,
67,2,5,putt,putter,straight,straight,bad,
68,2,5,putt,putter,straight,straight,good,
69,2,6,tee,driver,draw,straight,bad,
70,2,6,approach,hybrid,draw,straight,good,
71,2,6,putt,putter,straight,straight,good,
72,2,6,putt,putter,straight,straight,good,
73,2,7,tee,driver,straight,straight,good,
74,2,7,approach,wood,fade,straight,good,
75,2,7,approach,wedge,straight,straight,bad,long
76,2,7,putt,putter,straight,straight,good,
77,2,7,putt,putter,straight,straight,good,
78,2,8,tee,iron,straight,right,bad,
79,2,8,approach,wedge,straight,straight,good,
80,2,8,putt,putter,straight,straight,bad,
81,2,9,tee,driver,straight,straight,good,
82,2,9,approach,iron,straight,straight,good,
83,2,9,approach,wedge,straight,straight,bad,
84,2,9,putt,putter,straight,straight,good,
85,2,9,putt,putter,straight,straight,good,
86,2,10,tee,driver,straight,left,good,
87,2,10,approach,iron,straight,left,good,
88,2,10,chip,wedge,straight,straight,good,
89,2,10,putt,putter,straight,straight,good,
90,2,10,putt,putter,straight,straight,good,
91,2,11,tee,driver,draw,straight,good,
92,2,11,approach,iron,draw,straight,good,
93,2,11,putt,putter,straight,straight,good,
94,2,11,putt,putter,straight,straight,good,
95,2,12,tee,iron,draw,straight,good,
96,2,12,putt,putter,straight,straight,good,
97,2,12,putt,putter,straight,straight,good,
98,2,13,tee,driver,draw,straight,good,
99,2,13,approach,wood,straight,straight,bad,topped
100,2,13,putt,putter,straight,straight,good,
101,2,13,putt,putter,straight,straight,good,
102,2,14,tee,driver,draw,straight,good,
103,2,14,approach,wood,straight,straight,bad,
104,2,14,approach,iron,draw,straight,good,
105,2,14,approach,wedge,straight,straight,bad,
106,2,14,putt,putter,straight,straight,bad,
107,2,14,putt,putter,straight,straight,good,
108,2,15,tee,iron,draw,right,bad,
109,2,15,approach,wedge,straight,straight,good,
110,2,15,putt,putter,straight,straight,good,
111,2,15,putt,putter,straight,straight,good,
112,2,16,tee,driver,draw,right,good,
113,2,16,approach,iron,straight,left,bad,
114,2,16,approach,wedge,straight,left,bad,
115,2,16,putt,putter,straight,straight,good,
116,2,17,tee,driver,straight,straight,good,
117,2,17,approach,wood,straight,right,bad,
118,2,17,approach,wedge,straight,straight,good,
119,2,17,putt,putter,straight,straight,good,
120,2,17,putt,putter,straight,straight,good,
121,2,18,tee,driver,fade,right,bad,
122,2,18,approach,wedge,straight,straight,good,
123,2,18,approach,wedge,straight,straight,good,
124,2,18,putt,putter,straight,straight,good,
125,2,18,putt,putter,straight,straight,good,
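I would like to determine which combinations of values occur most frequently across these fields:
- Club type: driver, wood, iron, wedge, putter
- Shot type: tee, approach, chip, putt
- Line direction: left, center, right
- Shot quality: good, bad, neutral
Ideally, I would be able to identify a sweet-spot (no pun intended) combination such as "driver" + "tee" + "straight" + "good".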
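To make the goal concrete, here is a rough sketch of the kind of tally I have in mind. It is plain counting rather than clustering, and I am not sure it is the right approach. It assumes only base R; the names `csv_text`, `shots`, and `combos` are my own, and the inlined rows are just the first two holes from the data above so that the snippet runs on its own.

```r
# Excerpt of the data above (holes 1 and 2 only), inlined so the sketch is self-contained.
csv_text <- "shot_id,round_id,hole,shotType,clubType,desiredShape,lineDirection,shotQuality,note
48,2,1,tee,driver,straight,straight,good,
49,2,1,approach,iron,straight,right,bad,
50,2,1,approach,wedge,straight,straight,bad,
51,2,1,approach,wedge,straight,straight,bad,
52,2,1,putt,putter,straight,straight,good,
53,2,1,putt,putter,straight,straight,good,
54,2,2,tee,driver,draw,straight,good,
55,2,2,approach,iron,draw,straight,good,
56,2,2,putt,putter,straight,straight,good,
57,2,2,putt,putter,straight,straight,good,"

shots <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Cross-tabulate the four columns of interest and flatten the result
# into one row per combination, with the count in column `n`.
combos <- as.data.frame(
  table(shots[, c("clubType", "shotType", "lineDirection", "shotQuality")]),
  responseName = "n"
)

# Drop combinations that never occur, then sort by descending count.
combos <- combos[combos$n > 0, ]
combos <- combos[order(-combos$n), ]

print(combos, row.names = FALSE)
```

On the full dataset, the top rows of `combos` would be the most common combinations, which is roughly the "sweet spot" I am after.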
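I only want to measure this against a static dataset, not against any future values or predictions. So my thinking is that this is probably a clustering/k-means problem. Is that correct?
If so, how would I get started with a k-means analysis on these kinds of values in R?
If this is not a k-means problem, then what is it?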