Question re: odd quirk of Swig

I’m working on improving introspection of networks and ran into a weird quirk involving Swig that I don’t quite know how to interpret. I was hoping someone might be able to explain it.

So, I have a region:

(Pdb) id(region), region
(4505485008, <nupic.bindings.engine_internal.Region; proxy of <Swig Object of type 'nupic::Region *' at 0x10c8c44b0> >)

Let’s say I want to inspect its inputs, starting with the zeroth:

(Pdb) spec = region.getSpec()
(Pdb) inputPair = spec.inputs.getByIndex(0)
(Pdb) inputPair
('externalApicalInput', <nupic.bindings.engine_internal.InputSpec; proxy of <Swig Object of type 'nupic::InputSpec *' at 0x10c9e97b0> >)

Here’s where it gets weird:

(Pdb) inputPair = spec.inputs.getByIndex(0)
(Pdb) inputSpec = inputPair[1]
(Pdb) inputSpec, inputPair[1]
(<nupic.bindings.engine_internal.InputSpec; proxy of <Swig Object of type 'nupic::InputSpec *' at 0x10c9e97b0> >, <nupic.bindings.engine_internal.InputSpec; proxy of <Swig Object of type 'nupic::InputSpec *' at 0x10c9e97e0> >)
(Pdb) id(inputSpec), id(inputPair[1])
(4506675472, 4506675344)

inputSpec and inputPair[1] are different objects, seemingly breaking familiar convention re: accessors.

Consider this contrived example:

(Pdb) foo = {1: object()}
(Pdb) bar = foo[1]
(Pdb) id(bar), id(foo[1])
(4460603824, 4460603824)

In this case, bar and foo[1] are the same object. Similarly, when using a list:

(Pdb) mylist = [object(), object()]
(Pdb) baz = mylist[1]
(Pdb) id(baz), id(mylist[1])
(4460601792, 4460601792)

This raises a few questions:

  1. Python is allocating a new object here. What mechanism is at work that differs from the list and dict examples I included? I’ve never seen anything like it, and it struck me as odd.
  2. What sort of performance hit do we take when allocating a new nupic.bindings.engine_internal.InputSpec object?
  3. Does it even matter? That is, can I safely ignore this, or is it worth investigating?

Perhaps somewhat related: you can’t do a tuple expansion of inputPair. Although its repr() looks like a tuple, it’s not actually a tuple but an instance of nupic.bindings.engine_internal.InputPair, so you can’t do the following, which would be perfectly legal for a tuple or other sequence:

name, inputSpec = inputPair

Cheers!

My 2c:

  1. The getByIndex function is implemented in C++ so the return value has to be wrapped in a Python object. Even if the C++ return value is the same memory pointer, I think a new PyObject wrapper will be created. There may be some way to tell SWIG to keep track somehow (or do so manually in the SWIG layer).
  2. Very good question to be asking. There are different ways you can avoid creating new Python objects in performance-sensitive cases. I don’t think this is such a case though (how often do we pull out an InputSpec?). I’d be concerned if it was something we needed to do inside a loop that runs very quickly.
  3. My intuition is that it doesn’t matter in this case (see above) but it’s good that you are noticing this and you will have to use your judgement on whether you think there is a performance or correctness concern with the SWIG interface.
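
To illustrate point 1 without SWIG, here is a minimal pure-Python sketch. Proxy, Accessor and CachingAccessor are hypothetical stand-ins, not nupic or SWIG names. Accessor wraps the same underlying object in a fresh proxy on every call, so identity differs even though the wrapped target is the same. CachingAccessor shows one possible way to "keep track", assuming a weak-value cache keyed by index is acceptable.

```python
import weakref


class Proxy(object):
    """Hypothetical stand-in for a SWIG-generated proxy class."""

    def __init__(self, target):
        self.target = target  # stands in for the wrapped C++ pointer


class Accessor(object):
    """Returns a brand-new proxy on every access."""

    def __init__(self, targets):
        self._targets = targets

    def get_by_index(self, i):
        return Proxy(self._targets[i])


class CachingAccessor(Accessor):
    """Hands back the existing proxy for an index while it is still alive."""

    def __init__(self, targets):
        super(CachingAccessor, self).__init__(targets)
        # Weak values: a cached proxy disappears once nobody else holds it.
        self._cache = weakref.WeakValueDictionary()

    def get_by_index(self, i):
        proxy = self._cache.get(i)
        if proxy is None:
            proxy = Proxy(self._targets[i])
            self._cache[i] = proxy
        return proxy


targets = [object(), object()]

plain = Accessor(targets)
a, b = plain.get_by_index(0), plain.get_by_index(0)
print(a is b, a.target is b.target)  # different proxies, same target

cached = CachingAccessor(targets)
c, d = cached.get_by_index(0), cached.get_by_index(0)
print(c is d)  # the cache returns the same proxy
```

If identity of the underlying object is what matters, comparing the wrapped targets (here a.target is b.target) sidesteps the question of how many proxies exist.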

Oh and it should be simple to add tuple expansion in the SWIG interface. I believe unpacking uses the iterator interface[1].

[1] https://www.python.org/dev/peps/pep-0234/
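
A rough sketch of the idea in plain Python, assuming the pair type only needs to yield its two elements. PairProxy is a hypothetical stand-in for InputPair, not the real class. In the actual binding the __iter__ method would presumably be attached in the SWIG interface file (for example through %extend with %pythoncode), which has not been checked against nupic's sources.

```python
class PairProxy(object):
    """Hypothetical stand-in for InputPair: indexable, but not a tuple."""

    def __init__(self, name, spec):
        self._name = name
        self._spec = spec

    def __getitem__(self, i):
        if i == 0:
            return self._name
        if i == 1:
            return self._spec
        raise IndexError(i)

    def __iter__(self):
        # Yield exactly two items so that `name, spec = pair` works.
        yield self[0]
        yield self[1]


pair = PairProxy("externalApicalInput", object())
name, spec = pair
print(name)
```

With __iter__ defined, anything that consumes an iterable works as well, for example tuple(pair).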

Thanks. Seems impact on performance is a non-issue. Multiple repeated inputSpec = inputPair[1] calls yield alternating instances. It’s not allocating a new object every time, but reusing the same two objects. Weird in its own right, but whatevs.
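
One possible explanation for the alternating instances, offered as an assumption rather than something verified against nupic: a new wrapper may still be created on every call, but rebinding inputSpec frees the previous wrapper, and CPython tends to reuse a freed memory block for the next allocation of the same size, so id() flips between two addresses. The sketch below uses a hypothetical Proxy class and weak references to show that, in CPython, objects can be distinct over time even when their ids repeat.

```python
import weakref


class Proxy(object):
    """Hypothetical stand-in for a freshly created wrapper object."""


def get_item():
    # Always builds a new object, like an accessor that wraps on each call.
    return Proxy()


current = get_item()
old_refs = []
seen_ids = []
for _ in range(6):
    old_refs.append(weakref.ref(current))
    # The new object is created while the old one is still alive;
    # the old one is released only when the name is rebound.
    current = get_item()
    seen_ids.append(id(current))

# Under CPython reference counting, every earlier wrapper is already gone.
print([ref() is None for ref in old_refs])
# The count of distinct ids is often much smaller than the number of objects created.
print(len(set(seen_ids)))
```

Since id() is only unique among objects alive at the same time, repeated ids alone do not show that the same object is being returned.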