Multivariate inputs with multiple output prediction

I have a dataset having 5 input columns and 5 output columns like below.

pr-1 pr-2 pr-3 pr-4 pr-5 action-1 action-2 action-3 action-4 action-5

pr-1 to pr-5 are will be the process names and action-1 to action5 are the actions to be taken.
The process names will not be a string, it can be of multiple strings like “run-parts --lsbsysinit /etc/update-motd.d”. The actions will be of categorical values.

I have a couple of queries about the implementation. kindly guide me on this.

1> how will i encode input of multiple strings?
2> I know using nupic we can not able to produce multiple output…is there any way around or improvement to achieve this?

Hello @Millan. What is your goal here? Do you want to predict something or get an anomaly indication?

It looks like your strings might be pretty complex. Are there a near-infinite number of them? If there are many commands per string, maybe it would be an idea to encode the commands independently and combine them into one representation for each string input?

I want to predict what will be the actions

How many action categories are there? Do you have any idea what factors in the data might affect an action? Do you want to predict all the actions? If so, you will need to have a model for each action you want to predict.

1 Like