I am trying to use CategoryEncoder to encode categorical data for prediction.
In my case, I want to predict value based on caller and value.
The CSV header and sample data are as follows, where caller is categorical data and is marked with the special flag C in the header:
func,caller,callee,timestamp,value
string,string,string,datetime,float
,C,,T,
ord_IInvQueryCSV_funcQ:127.0.1.1,funcK,com.gyl.scm.center.query.service.impl.InvQueryCSVImpl.funcQ:127.0.1.1,2018-01-01 03:26:13.960,48
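To make the layout above concrete, here is a minimal sketch that reads the three header rows (field names, field types, special flags) with Python's standard csv module. It is plain Python 3 and does not use the library's own file reader; the helper name read_header and the SAMPLE constant are hypothetical, introduced only for illustration.

```python
import csv
import io

# Sample content copied from the post: three header rows, then one data row.
SAMPLE = """func,caller,callee,timestamp,value
string,string,string,datetime,float
,C,,T,
ord_IInvQueryCSV_funcQ:127.0.1.1,funcK,com.gyl.scm.center.query.service.impl.InvQueryCSVImpl.funcQ:127.0.1.1,2018-01-01 03:26:13.960,48
"""


def read_header(text):
    """Return (fields, first_record).

    fields is a list of (name, type, flag) tuples, one per column.
    first_record maps each field name to its raw string value.
    """
    rows = csv.reader(io.StringIO(text))
    names = next(rows)   # row 1: field names
    types = next(rows)   # row 2: field types
    flags = next(rows)   # row 3: special flags (empty string means no flag)
    fields = list(zip(names, types, flags))
    first_record = dict(zip(names, next(rows)))
    return fields, first_record


fields, first_record = read_header(SAMPLE)
for name, type_, flag in fields:
    print(name, type_, repr(flag))
```

This only shows which column carries which flag (caller has C, timestamp has T); it says nothing about how the library interprets those flags.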
Here is the corresponding encoder section for the categorical field caller, taken from the model params in JSON format:
"caller": {
    "name": "caller",
    "fieldname": "caller",
    "w": 21,
    "categoryList": ["funcK", "funcQ", "funcM"],
    "type": "CategoryEncoder"
}
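For context, my understanding of what these parameters mean is that each category gets its own non-overlapping block of w active bits, with one extra block reserved for values that are not in categoryList. The sketch below is a plain-Python illustration of that idea under that assumption; it is not the library's implementation, and the function encode_category is hypothetical.

```python
def encode_category(value, category_list, w):
    """Return a list of 0/1 bits with exactly one block of w active bits.

    Assumption: block 0 is reserved for values not found in category_list,
    and the known categories occupy blocks 1..len(category_list).
    """
    if value in category_list:
        index = category_list.index(value) + 1  # skip the reserved block
    else:
        index = 0                               # unknown value
    n_blocks = len(category_list) + 1
    bits = [0] * (w * n_blocks)
    start = index * w
    bits[start:start + w] = [1] * w
    return bits


# Values taken from the encoder params in the post.
CATEGORY_LIST = ["funcK", "funcQ", "funcM"]
W = 21

encoded = encode_category("funcK", CATEGORY_LIST, W)
print(len(encoded), sum(encoded), encoded.index(1))
```

Under these assumptions the output has (3 + 1) * 21 = 84 bits, of which 21 are active, and two different categories never share an active bit.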
Running the program produces an error. However, the error disappears when the special flag C is not set in the CSV header. I am not sure whether the caller field is still used in prediction without the C flag.
Could anyone tell me the correct way to use categorical data?
Thanks