Inference speedup with respect to a non-sparse model

I have two models trained on MNIST. The first is the model provided in the nupic.torch library, which uses sparse convolution and sparse linear layers and a KWinners activation function. The second replaces KWinners with ReLU and replaces all the sparse layers with the corresponding dense layers from PyTorch.
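For context, here is a minimal sketch of the kind of comparison I mean, using SparseWeights / SparseWeights2d / KWinners / KWinners2d from nupic.torch.modules. The layer sizes and sparsity values below are placeholders rather than my exact configuration, and the sparsity argument name may differ across nupic.torch versions:

```python
import torch.nn as nn
from nupic.torch.modules import (
    KWinners, KWinners2d, SparseWeights, SparseWeights2d
)

# Sparse variant: sparse weights + k-winners activations
# (sparsity / percent_on values are placeholders).
sparse_model = nn.Sequential(
    SparseWeights2d(nn.Conv2d(1, 32, 5), sparsity=0.5),
    nn.MaxPool2d(2),
    KWinners2d(channels=32, percent_on=0.1),
    nn.Flatten(),
    SparseWeights(nn.Linear(32 * 12 * 12, 700), sparsity=0.5),
    KWinners(n=700, percent_on=0.1),
    nn.Linear(700, 10),
)

# Dense variant: same shapes, plain PyTorch layers and ReLU.
dense_model = nn.Sequential(
    nn.Conv2d(1, 32, 5),
    nn.MaxPool2d(2),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 12 * 12, 700),
    nn.ReLU(),
    nn.Linear(700, 10),
)
```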
I trained both models and benchmarked them on a GPU, expecting the sparse one to be faster; instead, I found that the non-sparse version is almost 2x faster.
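Here is roughly how I measured the forward-pass time; a sketch assuming a synthetic MNIST-shaped batch, with eval mode, torch.no_grad(), a warm-up loop, and torch.cuda.synchronize() around the timed region so the GPU timings are meaningful:

```python
import time
import torch

def benchmark(model, device="cuda", batch_size=64, iters=100):
    """Return the mean forward-pass time in milliseconds."""
    model = model.to(device).eval()
    x = torch.randn(batch_size, 1, 28, 28, device=device)  # MNIST-shaped input
    with torch.no_grad():
        for _ in range(10):           # warm-up so CUDA kernels are cached
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()      # wait for all queued kernels to finish
    return (time.perf_counter() - start) / iters * 1000

print("sparse:", benchmark(sparse_model), "ms")
print("dense: ", benchmark(dense_model), "ms")
```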
My question is whether I am doing something wrong, or whether sparse layers simply do not improve a model's inference time.