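I am a senior C# developer, and my task is to try to predict the potential of each new customer, that is, how much each customer will be worth. I have no machine learning experience, but I have played with accord-framework.net and got some decent results on simple tasks.
My training data model is: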
GeoLocation, // country of the IP at registration, ISO code string
Age, // number
DateRegistered, // date time
Email, // string; can be broken down by vendor as a categorical (gmail, yahoo, microsoft and such)
EmailValidated, // whether the email really exists, bool
PhoneNumber, // string
PhoneNumberValidated, // whether the phone number really exists
CampaignName, // string (may be categorical)
UserAgent, // string; should I make it categorical? (long string with info about browser, device, vendor, operating system and such)
LandedOnPage, // string, first URL the customer entered from
RegisteredFromPage, // string, URL of the page the user registered from
RefererUrl, // string, URL the client came to our site from
NumberOfPurchases, // number of times the customer purchased something on our site
CustomerValueUsd, // total amount in USD the customer spent on our site
The output should be CustomerValueUsd.
I have a lot of historical data, so I can backtest.
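By backtesting I mean something like this: train on customers who registered before a cutoff date, then evaluate on those who registered after it. Below is a minimal sketch in Python (used here only for brevity, not because my code is in Python). The `split_by_date` helper and the sample rows are hypothetical and made up for illustration.

```python
from datetime import datetime


def split_by_date(rows, cutoff):
    """Split rows into (train, test) by DateRegistered relative to cutoff."""
    train = [r for r in rows if r["DateRegistered"] < cutoff]
    test = [r for r in rows if r["DateRegistered"] >= cutoff]
    return train, test


# Made-up rows, only to show the shape of the data.
rows = [
    {"DateRegistered": datetime(2016, 1, 10), "CustomerValueUsd": 120.0},
    {"DateRegistered": datetime(2016, 6, 2), "CustomerValueUsd": 0.0},
    {"DateRegistered": datetime(2017, 2, 14), "CustomerValueUsd": 45.5},
]

# Everything registered before 2017 is training data, the rest is held out.
train, test = split_by_date(rows, cutoff=datetime(2017, 1, 1))
```

The split is by date rather than random so that the model is always evaluated on customers it could not have seen at training time.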
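My questions:
- Does it make sense to take on this task even though I have no machine learning experience? Given that I am using a well-known framework, how complex is this task?
- Assuming I take on the task, which algorithm should I choose for this kind of problem?
- How should I structure the training data? Looking at my comments above, do you think they are a reasonable starting point, or can I just feed the data in as it is?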
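To make the last question concrete, this is roughly the preprocessing my comments describe, sketched in Python rather than C# only for brevity. The helper names (`email_vendor`, `browser_family`, `url_host`, `one_hot`), the vendor table, and the user-agent substring checks are all assumptions made up for illustration; none of this is Accord.NET API.

```python
from urllib.parse import urlparse

# Hypothetical vendor buckets; any other domain falls into "other".
KNOWN_VENDORS = {
    "gmail.com": "gmail",
    "yahoo.com": "yahoo",
    "outlook.com": "microsoft",
    "hotmail.com": "microsoft",
}


def email_vendor(email):
    """Reduce an email address to a coarse vendor category."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return KNOWN_VENDORS.get(domain, "other")


def browser_family(user_agent):
    """Very rough user-agent bucketing by substring.

    Order matters: Edge and Chrome user agents also contain "Safari".
    """
    ua = user_agent.lower()
    for token, family in (("edg", "edge"), ("chrome", "chrome"),
                          ("firefox", "firefox"), ("safari", "safari")):
        if token in ua:
            return family
    return "other"


def url_host(url):
    """Keep only the host part of a URL, e.g. for RefererUrl."""
    return urlparse(url).netloc.lower()


def one_hot(value, categories):
    """Encode a category as a 0/1 vector over a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]
```

For example, `one_hot(email_vendor(row["Email"]), ["gmail", "yahoo", "microsoft", "other"])` would turn the Email column into four numeric inputs. Is this the right direction, or is there a better way to handle long strings such as UserAgent?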