It depends on how much control you want/need over the other aspects of clustering. Sockets will do the job and leave you in full control to optimize, but you’ll have to take care of scaling, fault tolerance, security, etc.
At the very least, you probably want to use container orchestration like Kubernetes to handle deployment, scaling, and management in a cross-platform manner.
If you want to abstract away the protocol and also get resiliency and service discovery, you might want to look at a service mesh pattern like Istio (on GitHub) or others. This means that if you’re willing to conform to the pattern, you don’t need to think about network communication; you just build the business logic.
Disclaimer: I’ve worked with containers and orchestration layers quite a lot and they are fantastic for stateless containers; however, I have yet to build a service mesh app and take it to production!
I guess the next question is: what are the global state requirements? What data will each node consume and produce, and to what extent do the other nodes need to know about it?
I am thinking that in the human case we have 100 or so maps, each of which can be considered as transforming an input image into an output image.
These “images” are the tokens exchanged.
I expect this process to be peristaltic in nature, so there would have to be coordinated execution across the network.
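To make that concrete, here is a very rough sketch of what one of those tokens might look like in C. All names and sizes here are hypothetical placeholders, not a committed design:

```c
#include <stdint.h>

#define MAP_WIDTH  64   /* hypothetical map dimensions */
#define MAP_HEIGHT 64

/* A token: the "image" passed between maps, plus metadata for
 * coordinating the peristaltic wave of execution. */
typedef struct {
    uint32_t source_map;  /* which of the ~100 maps produced it */
    uint32_t sequence;    /* position in the coordinated wave */
    uint8_t  pixels[MAP_WIDTH * MAP_HEIGHT];  /* the image payload */
} token_t;
```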
It is highly likely that at first everything would run off a single workstation.
If I start with plain sockets, are there general guidelines to make it easy to step up to a more comprehensive system as development progresses?
Just so we’re on the same page, are we talking about a Temporal Memory style where transformation logic changes dynamically in response to new input? (As an illustration, I’ll borrow @Paul_Lamb’s diagram rather than draw my own.) It would be handy to know how your architecture differs in terms of processing distribution and state transfer.
The short answer is that you can always refactor and insert service mesh logic later, and there are implementations like linkerd-tcp that can somewhat squeeze in between existing TCP applications. Without reigniting the functional vs imperative holy war, I would just make sure you’ve been a good dev and separated the transport code out into a single function that accepts your token data and takes care of the socket calls; that separation will of course make refactoring easier later (see the sketch below).
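As a sketch of that separation (the function name and length-prefix framing are my own invention, assuming plain POSIX sockets): everything socket-related hides behind one call, so swapping in gRPC or a mesh sidecar later only touches this one spot.

```c
#include <arpa/inet.h>   /* htonl */
#include <stdint.h>
#include <sys/socket.h>  /* send */
#include <sys/types.h>   /* ssize_t */

/* The single transport seam: callers hand over token bytes and never
 * touch sockets directly. The payload is length-prefixed so the
 * receiver can frame it. Returns 0 on success, -1 on error. */
int send_token(int sock, const uint8_t *data, size_t len)
{
    uint32_t netlen = htonl((uint32_t)len);
    if (send(sock, &netlen, sizeof netlen, 0) != (ssize_t)sizeof netlen)
        return -1;

    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n <= 0)
            return -1;
        sent += (size_t)n;
    }
    return 0;
}
```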
If you’re planning to focus on transformation processing and having a simple, static topology doesn’t compromise the design, then I’d say strike while the iron’s hot rather than spend the next two days reading Kubernetes/Linkerd doco. However, if routing logic and state synchronisation are (or become) a major factor, then I think it’s worth pausing and looking at how this has been solved already.
Oh, and in terms of easing containerisation/orchestration later: if you can build something that can work off a minimalist Docker image like Alpine, you’ll do a lot less waiting.
In other words, don’t build too much toward a specific Linux distribution or kernel feature.
Actually, I will make one strong recommendation on the protocol front: rather than pure sockets, start with gRPC (https://grpc.io/).
This keeps the learning curve shallow and the upfront effort small, but gives you a better abstraction that you can plug something like Envoy into later.
I am working in C and Perl, neither of which seems to be listed in the fast start section
Guess you won’t be starting fast then!
Juniper Networks have written a C implementation, but it’s pre-alpha, so it may not turn out to be a good idea (worth a quick shot though?). There don’t seem to be any mature Perl implementations either.
The other option would be to use HTTP/2, e.g. with nghttp2, along with a serialisation library like FlatBuffers. I haven’t used either of these, as I haven’t used C since programming a robot to follow a white line once in a university lab. Using HTTP/2 would give you some flexibility down the track vs a pure TCP socket, as many of the service meshes will have better support for it.
Having said all of that, I doubt many people will have tried to combine the low-level world of C (typically for embedded/high-performance applications) with the high-level world of web apps, web APIs, middleware, proxies, etc. that occupy the service mesh space and are typically written in OO or functional languages. So if there will ever be a good time for you to give up some control and venture a bit higher level, that time might be now.
I tend to believe you, as far as modern www-oriented solutions go (in fact I know almost nothing of them). But industry-level solutions to distributed applications do exist for the close-to-the-wire world. Even for C old-timers.
You’re probably right that the web universe and OO would be a faster way to prototype stuff. And maybe that’s what @bitking is looking for here, after all.
I would have believed, seeing Matt linuxing his way through NuPIC-configuration concerns (ugh) and maybe server farms (?), that some of that had already been tackled by someone. But I may have been mistaken.
First, write a single-process app, constraining yourself to structure your different areas so that each communication between areas is a serialized message.
Still single-process, it should then be relatively straightforward to multithread (one thread per area), each thread awaiting a signal before crunching on said message and sending its output as a message too.
Wrapping that message-handling part into actual socket concerns (TCP please) is then guaranteed to be straightforward; a sketch of these first steps follows.
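A minimal sketch of those first two steps, assuming POSIX threads (the names here are made up, not from any framework): each area is a thread blocked on a mailbox, and deliver() is the single seam where the TCP calls, e.g. the send_token() sketch above, would later slot in.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MSG_BYTES 4096

/* One mailbox per area: holds the latest serialized message.
 * Initialise each with PTHREAD_MUTEX_INITIALIZER / PTHREAD_COND_INITIALIZER. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             has_msg;
    size_t          len;
    uint8_t         buf[MSG_BYTES];
} mailbox_t;

/* The only place that would later grow TCP calls (e.g. send_token()). */
static void deliver(mailbox_t *to, const uint8_t *data, size_t len)
{
    pthread_mutex_lock(&to->lock);
    memcpy(to->buf, data, len);
    to->len = len;
    to->has_msg = 1;
    pthread_cond_signal(&to->ready);
    pthread_mutex_unlock(&to->lock);
}

/* One of these threads per area; `arg` is the area's own mailbox. */
static void *area_thread(void *arg)
{
    mailbox_t *in = arg;
    uint8_t msg[MSG_BYTES];
    for (;;) {
        pthread_mutex_lock(&in->lock);
        while (!in->has_msg)                 /* await the signal */
            pthread_cond_wait(&in->ready, &in->lock);
        size_t len = in->len;
        memcpy(msg, in->buf, len);
        in->has_msg = 0;
        pthread_mutex_unlock(&in->lock);

        /* crunch on msg here, then pass the result downstream:
         * deliver(next_mailbox, msg, len); */
    }
    return NULL;
}
```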
You may afterwards start to fiddle with a multi-process approach on the same workstation, and straightforwardly after that a multi-process approach on several stations. Once in either of these multi-process worlds, look for supervision/config solutions. (Here maybe SNMP after all, #forgetwhatisaidinpm.)
Or you can scrap what I just wrote if you want to take advantage of a GPU, in which case having thrown multithreading into the mix will only get you a ‘meh’.
Then set up two stations, two areas, and… devise a scalable anything.
I’m sure they do exist, but to start a greenfields project in 2018 in such a way would have to be pretty rare. In a world of Scala, Erlang/Elixir, Julia or libraries on top of the more mainstream languages, you could spend a lot of time reinventing the wheel.