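I am building a linear model with 13 variables, including the target variable (revenue from online purchases of an item). I first built Model 1 using the raw variables, then built Model 2 after normalizing the data. The coefficients of both models are copied below: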
Model 1 (without normalized data)
Coefficients:
               (Intercept)                       xid                  xcartadd
                 6.386e+01                -4.301e-03                -1.229e+02
              xcartuniqadd           xcartaddtotalrs               xcartremove
                 1.239e+02                 7.788e-02                -1.424e+02
         xcardtremovetotal       xcardtremovetotalrs             xproductviews
                 5.588e+02                -3.445e-02                 1.369e+01
             xuniqprodview             xprodviewinrs      xsizeselecteduniview
                -1.530e+01                 5.401e-04                -1.299e+02
   xsizeselectedtotalviews xsizeselectedtotalviewsrs
                 6.280e+01                -2.453e-02
Model 2 (with normalized data)
Coefficients:
                 (Intercept)                         xid
                   3.900e+02                  -4.301e-03
                  xcartadd_n              xcartuniqadd_n
                  -2.623e+03                   2.069e+03
           xcartaddtotalrs_n               xcartremove_n
                   1.785e+03                  -1.721e+02
         xcardtremovetotal_n       xcardtremovetotalrs_n
                   4.474e+02                  -5.360e+01
             xproductviews_n             xuniqprodview_n
                   7.979e+03                  -7.378e+03
             xprodviewinrs_n      xsizeselecteduniview_n
                   4.757e+02                  -1.218e+03
   xsizeselectedtotalviews_n xsizeselectedtotalviewsrs_n
                   1.044e+03                  -5.374e+02
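A minimal sketch of the comparison above, on synthetic data and in Python with NumPy rather than the original tooling. It assumes that "normalized" means z-scoring (subtract the mean, divide by the standard deviation); the post does not say which method was used. The data, the `fit_ols` helper, and all variable names are hypothetical. The first column is left unscaled, mirroring xid, whose coefficient (-4.301e-03) is identical in both outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real data (hypothetical; not the poster's dataset).
n = 200
X = rng.normal(loc=[50.0, 5.0, 1000.0], scale=[10.0, 2.0, 300.0], size=(n, 3))
y = 3.0 + X @ np.array([0.5, -2.0, 0.01]) + rng.normal(scale=1.0, size=n)


def fit_ols(features, target):
    """Ordinary least squares with an intercept; returns (coefficients, fitted values)."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef, design @ coef


# Model 1: raw predictors.
coef_raw, fitted_raw = fit_ols(X, y)

# Model 2: leave the first column untouched (like xid) and z-score the others.
# Assumption: z-scoring stands in for whatever normalization the post used.
mu = X[:, 1:].mean(axis=0)
sd = X[:, 1:].std(axis=0, ddof=1)
X_norm = X.copy()
X_norm[:, 1:] = (X[:, 1:] - mu) / sd
coef_norm, fitted_norm = fit_ols(X_norm, y)

# Both models produce the same fitted values.
same_fit = np.allclose(fitted_raw, fitted_norm)

# Each z-scored slope equals the raw slope times that column's standard deviation.
slopes_rescale = np.allclose(coef_norm[2:], coef_raw[2:] * sd)

# The unscaled column keeps its coefficient unchanged.
untouched_same = np.isclose(coef_norm[1], coef_raw[1])

print(same_fit, slopes_rescale, untouched_same)
```

Under these assumptions the two fits describe the same model: the predictions match, and only the units of the rescaled coefficients and the intercept differ.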
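My questions are:
Is it appropriate to include only normalized data, or only non-normalized data, in the model?
Is it appropriate to combine normalized and non-normalized data in the same model?
How do I choose the most suitable predictors from among them for the model?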