Re-Organisation of the C++, PY Community repos

community
structure
repo
hackers

#21

Correct.

In a nupic installation the bindings are actually a mixture of Python and C++ modules (algorithm, engine, math).

As far as I can see it’s just used for configuration.

I think for now we are good. In the future we might restructure. I’m still stuck on “small” issues like recompiling nupic.core with 64-bit floats vs. 32-bit floats. Or even 128-bit floats…
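For illustration, here is a minimal sketch of how such a precision switch could look in a shared types header. It assumes a compile-time flag named NTA_DOUBLE_PRECISION (my assumption, modeled loosely on how nupic.core selects its Real typedef):

// Sketch of a compile-time precision switch (flag name is an assumption).
typedef float  Real32;
typedef double Real64;

#ifdef NTA_DOUBLE_PRECISION
typedef Real64 Real;   // build everything with 64-bit floats
#else
typedef Real32 Real;   // default: 32-bit floats
#endif

// 128-bit floats would need long double or __float128, which is
// compiler- and platform-dependent; part of why this is not a
// one-line change.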

Hope you don’t take it the wrong way. Right now I really like having everything in one Visual Studio solution, including nupic. Great for testing and tinkering.
CMake is on the todo list!

Yes, let’s brainstorm!


#22

I’m eager to try this on Linux.


#23

@David_Keeney we are using the C++ Network API and Region for working with different hierarchies in htmresearch.
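For readers who have not used it, here is a rough sketch of what building a small hierarchy with the C++ Network API looks like; the region type names and parameter strings below are illustrative placeholders, not the exact registered names:

#include <nupic/engine/Network.hpp>

int main() {
  nupic::Network net;

  // Two regions; type names and YAML-style parameter strings are
  // placeholders here, not the exact registered region types.
  net.addRegion("sensor", "ScalarSensor", "{n: 120, w: 21}");
  net.addRegion("sp", "SPRegion", "{columnCount: 2048}");

  // Feed the sensor's output into the spatial pooler region.
  net.link("sensor", "sp", "UniformLink", "");

  net.initialize();
  net.run(10);  // run 10 compute iterations through the hierarchy
  return 0;
}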


#24

I am looking for people (C++, Py, …) who would like to participate in the community Reviewers team. I think it’s important to have more eyes on the code (not only for bugs, but for design and usability). But I also hope for quick feedback and rapid development in the community repo.

  • I’ve set “rules” for merging so that at least one review is required to get new code in.

#25

Goals

Q: What would you like the goals and direction of a community repo to be?
After we get more ideas, I’ll make a poll again.

For me it is:

  • to start, continue with active development based on numenta/nupic.core/py
  • active PR review, fixes, openness to big changes, …
  • keep multiplatform support
  • keep a high level of code quality (tests, reviews)
  • gradual development of new big features (optimization, py3, restructuring, …)
  • ease of use for the programmers/developers (= us :wink:)

#26

All of this discussion is good. But we must keep in mind what this library is for. This is not a product or production application. This is a framework for experimentation.

From my viewpoint this library is intended for people who want to experiment with their own implementations of any of the components.

  • So the first priority is understandability of how the algorithms in the library work.
  • The second priority is flexibility in how the parts can be connected.
  • The third priority is flexibility on platform and programming language of the user.
  • Performance is important but not at the expense of the other three goals.

We should assume that the number of algorithms and variations on those algorithms will continue to grow as people find new things that work… after all, that is the purpose of the library. Consequently, do not expect any of the API to remain constant. As new ideas are presented we should try to incorporate them. For example, if someone discovers a new way to do SP and has a working implementation in Python, we should help them port it to C++ and add it to the library if they are unable to do that themselves. I would not expect to see polished code being offered, and that should be OK. Every module in the library must have a corresponding unit test module, but don’t expect offered modules to come with one, so we must help contributors provide one.
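As a concrete illustration of that last point, here is the kind of minimal googletest skeleton we could help a contributor write; the MySpatialPooler class and its API are hypothetical stand-ins for a contributed module:

#include <vector>
#include <gtest/gtest.h>
#include <nupic/algorithms/MySpatialPooler.hpp>  // hypothetical contributed module

// Minimal per-module test: every module in the library ships with at
// least one such file exercising its core behavior.
TEST(MySpatialPoolerTest, OutputIsSparse) {
  MySpatialPooler sp(/*inputSize=*/1024, /*columnCount=*/2048);

  std::vector<unsigned> input(1024, 0);
  input[3] = input[42] = input[777] = 1;  // a toy input pattern

  std::vector<unsigned> active = sp.compute(input);  // active column indices
  EXPECT_LE(active.size(), 2048u / 20u);  // expect at most ~5% active columns
}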

@breznak your thing seems to be C++ optimization… that is great. You can help optimize the submitted modules as long as it does not make them harder to understand or lose flexibility.

Having said that, the actual layout of the library should be focused on how easy it is to understand, even by someone who is not a professional programmer. In my opinion it is not mandatory for the community core library to be a clone of nupic’s production core, as long as we can identify the changes they have made so we can incorporate them into our library. I expect there to be considerable deviation.


#27

@thanh-binh.to Thank you. That is reason enough to include it in the C# interface.


#28

We use it heavily internally. The OPF is built on top of it. I suggest you keep it; it provides the flexibility to construct layers and columns.


#29

Thank you for your points, I will definitely add them to the poll.

This is not a product or production application. This is a framework for experimentation.

I agree, in a way. I’m open to more rapid and extreme changes, but on the other hand, I’d like to have a fork that is “an actively developed continuation of Numenta’s repositories”. That way Numenta can try to sync once in a while, if they wish, and people who build their apps on top of it can continue to use a bug-fixed, actively developed descendant. So I’ll also add:

  • compatibility (more or less) with the current Numenta API
  • rapid (vs. conservative) development (API breakage)
  • unit-test coverage (each new feature is tested)
  • keep C++/Py feature+API parity (vs. letting the repos diverge and live on their own)

Note: I’m collecting ideas to ask about here; it does not mean I agree with all the points I’m listing.

library should be focused on how easy it is to understand even by someone that is not a professional programmer

I’m not sure about this one. Either they are scientists who focus mainly on the papers/neuroscience, or programmers who focus on (and know) the internal workings, or application users who use products based on HTM (Grok, HTM School, …)… imho

I would not really care so much about whitespace and coding style (Matt always had to punch me to do that :wink: )

…if someone discovers a new way to do SP and has a working implementation in Python we should help them port that to C++ and add it to the library if they are unable to do that themselves

Careful with this; of course we’ll do it if we like it or it’s an uber-cool feature, but you might soon end up porting code you are not interested in.

More points and ideas for the poll: what do you want from a future nupic?


#30
  • pure Python functionality (no dependency on the bindings; for quick prototyping)
  • focus on the Py repo
  • focus on the C++ repo
  • provide releases for binary installs (PyPI)

#31

Great job coordinating, you all. It would be a Very Good Thing to get everyone working on the same forked codebase with a set of objectives. You seem to be heading in the right direction.

Just be careful about letting “new algorithms” into the project. When this happens, be very clear about where they originated and whether they are biologically inspired or not (cite papers). It will help in the future.


#32

@rhyolight Ah, yes, I agree. These should be HTM algorithms. I was thinking in terms of some of the variations of the HTM modules listed in the API specifications… like backtracking TM, and perhaps some more encoders and classifiers, or even some monitoring tools. Hopefully some new things will eventually come out of Numenta’s current research that we can add.


#33

I know you already set up the repositories, but…
My stab at it…

// core repository (C++)
/nupic.core
  /packages // [machine]/[vendor]/[operating system]/[package].[tar.gz|zip]
    /x64
      /gcc
        /linux (*.tar.gz)
          ...
        /windows (*.zip)
          ...
      /msvc
        /windows (*.zip)
          ...
    /x86
      /gcc
        /linux (*.tar.gz)
          ...
        /windows (*.zip)
          ...
      /msvc
        /windows (*.zip)
          ...
  /modules // can these go in separate repositories?
    /nupic.core
      /include
        /nupic // headers
          ...
      /src
        ...
    /nupic.core.cs // bindings
      /include
        /nupic // headers
          ...
      /src // header-only?
        ...
    /nupic.core.py // bindings
      /include
        /nupic // headers
          ...
      /src // header-only?
        ...
  /tests
    ...
  CMakeLists.txt

// client repository (C++)
/nupic -> [nupic.core]
  /include
    /nupic // headers
      ...
  /src
    ...

// client repository (C#)
/nupic.cs -> [nupic.core, nupic.core.cs]
  ...

// client repository (Python)
/nupic.py -> [nupic.core, nupic.core.py]
  ...

#34

The reorganization has not happened yet; there’s been a discussion with good points on both sides.
So far the repo tries to keep changes minimal and evolve gradually.

The related topics seem to be repo structure & how to manage dependencies.

/gcc
    /linux (*.tar.gz)
      ...

Instead of stuffing the binaries into the repo, can we just point to a public release and use CMake’s file(DOWNLOAD ...), as you’ve suggested elsewhere?

What, again, is the difference between nupic.core (C++) and nupic.core.client (C++)?

Seeing the problems we ran into in the Py3 PR, I’d like to ask: what are the real benefits this brings us, compared to the status quo? Each of these changes is quite a lot of work.