
I'm writing an app in Python and wish to write it without any state variables (this rules out properties and any other variables that reside outside a function).

Analogy: From my Erlang experience I know that Erlang has a neat actor model and, as part of it, a process can block on a 'receive' construct. As I understand it, this is the underlying mechanism of gen_server, which allows storing state as a parameter rather than in an external variable. Is there something like this in Python? Or am I on the wrong track?

Specific question: Is there a way in Python that will allow me to store state (a DB connection handle, in my case) without using any variable that resides outside a function? I'm okay with using any actively maintained framework that might be required to achieve this.
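For concreteness, this is roughly the shape I have in mind, translated into plain Python. It is only a hypothetical sketch: sqlite3 stands in for whatever DB driver I end up using, and the names are made up. The connection handle exists only as a local inside a generator, much like the state parameter threaded through a gen_server loop.

```python
# Rough sketch: the connection handle lives only as a local inside the
# generator, analogous to the state parameter in a gen_server loop.
# sqlite3 is just a stand-in for any DB driver.
import sqlite3


def db_server(db_path):
    """Generator that owns the connection; callers send it requests."""
    conn = sqlite3.connect(db_path)   # state held as a local, not a module-level variable
    reply = None
    try:
        while True:
            query, params = yield reply                   # block here, like 'receive'
            reply = conn.execute(query, params).fetchall()
    finally:
        conn.close()


# Usage: prime the generator, then send it (query, params) messages.
server = db_server(":memory:")
next(server)
server.send(("CREATE TABLE t (x INTEGER)", ()))
server.send(("INSERT INTO t VALUES (?)", (42,)))
print(server.send(("SELECT x FROM t", ())))               # [(42,)]
```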

ameyazing
  • The only thing that comes to mind is something like Twisted framework: http://twistedmatrix.com/trac/ – Tomasz Maciejewski Sep 07 '17 at 12:40
  • Hi Tomasz, Thanks for the reply and yes, your suggestion meets my requirement, but Twisted seems to be a networking library. If you are implying that I open a socket connection from each module that wants to write to / read from DB, that would be a really 'twisted' way to achieve my aim (pun intended) :D. I'm curious to know if there is a more elegant way to do it. – ameyazing Sep 07 '17 at 16:36

1 Answer


I think what you are looking for is a pointer on how to write Python in a functional way that avoids use of global state as much as possible and, in particular, something that will make it "reactive" in the way that Erlang is inherently.

Is it possible to generally avoid global state in Python? Yes, for the most part. Some things are going to be global simply because they represent real-world resources, though, and you're not going to escape from that (but that's what using a queue to serialize communication is all about). That doesn't mean that you have to have two concurrent threads writing to the same socket, of course. It does mean, however, that you can have threads spawned or processes forked without much visibility of global state, and write them in a way where they develop their own state internally (which is the normal thing to do anyway).
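For illustration, here is a minimal sketch of that idea (the names are made up, and the log file is just a stand-in for a DB handle or socket): one writer thread owns the shared resource and drains a queue, while the workers develop their state locally and share nothing but the queue they put messages on.

```python
# One owner, everyone else sends messages: the writer thread owns the
# shared resource; workers never touch it directly.
import queue
import threading


def writer(q):
    """Owns the shared resource; drains the queue sequentially."""
    with open("results.log", "a") as log:        # stand-in for a DB handle or socket
        while True:
            item = q.get()
            if item is None:                     # sentinel: shut down
                break
            log.write(f"{item}\n")


def worker(q, n):
    """Develops its own state internally; shares nothing but the queue."""
    local_total = sum(range(n))                  # purely local state
    q.put(f"worker({n}) -> {local_total}")


def main():
    q = queue.Queue()
    writer_thread = threading.Thread(target=writer, args=(q,))
    writer_thread.start()

    workers = [threading.Thread(target=worker, args=(q, n)) for n in (10, 100, 1000)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    q.put(None)                                  # tell the writer to stop
    writer_thread.join()


main()
```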

Is it possible to make Python work in a "reactive" sort of way? Sure. There are a few "reactive" frameworks for Python, and many coders (myself included) find Python to be most maintainable and understandable when written in a functional style that makes limited (or no) use of the class keyword. Put the two together and you can probably get somewhat close to your goal. I've never used it, but ReactiveX for Python seems aimed in this direction. There is also an answer here on SO that might give you some food for thought.
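For a rough flavour of the reactive style (again, I haven't used it myself, and the exact import path depends on the installed version -- RxPY 3.x imports as rx, later releases as reactivex), a trivial pipeline looks something like this:

```python
# RxPY 3.x sketch: values flow through a pipeline of pure operators and
# only the subscriber at the end performs the side effect.
import rx
from rx import operators as ops

rx.of(1, 2, 3, 4, 5).pipe(
    ops.map(lambda x: x * x),          # pure transformation
    ops.filter(lambda x: x % 2 == 1),  # pure predicate
).subscribe(print)                     # side effect confined to the edge
# prints 1, 9, 25
```

The point being that every transformation step is a pure function of its input and the side effect is confined to the subscriber at the end.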

So doable? Yes. But desirable? That depends entirely on the actual problem you are trying to solve, which you have not expressed. That could easily mean this is an X-Y problem, and you'll need to solve for X before you can move ahead in a sane way.

Beware paradigms that are an ill fit for a language. I personally do not believe that Python's strengths lie in this area, nor do I find Python's (lack of an) error recovery strategy to be conducive to concurrent network programming. There are many big, scary monsters in the concurrent programming closet, and shared data is only one of them -- scheduling, access, resource control, queue overflow management, sequencing, sequenced bottlenecks, etc. will all rear their heads at some point.

There is a reason Python has a GIL, and there is a reason that the best solution people have found to massive concurrency in most single-threaded languages like Python is to use OS processes combined with something like Docker to force arbitrary system partitioning, have some sort of recovery approach, and defer scheduling and resource management back down to the underlying host OS (even if these effects are largely unconscious achievements for most of the folks in the Docker community).

Those are super hard problems and you don't want to have to write solutions to those super hard problems into every project you develop.

It just turns out that the Erlang runtime already provides all of this as a matter of course because these are the exact problems it was designed to solve. The differences here are much deeper than OOP vs FP coding paradigms and really touch on the underlying runtime and resource management paradigm (which is somehow mysteriously omitted from discussions of this nature in the normal case). If a need for reactive, massive concurrency is your problem, I would recommend using Erlang for your project -- but maybe Python for the parts that are better expressed in Python (especially if there are already Python libs that do whatever heavy lifting you need).

zxq9
  • Hi zxq9, thanks for the excellent explanation. Your first paragraph sums up my requirement pretty well. I want to write FP code which avoids global state. I was in love with Erlang's no-side-effects guarantee that comes from writing functions that depend purely on their arguments. I'm not trying to achieve any concurrency, as I'm writing a simple crawler (client, not server). I want to write Python code in a no-side-effects way. What do you suggest for that? ReactiveX? Will have a look at it today and revert. – ameyazing Sep 12 '17 at 04:26
  • @ameyazing If you are on the client side then this is *much* easier to sort out. If you are parallelizing access, consider having each thread spawn with its request target, open its own socket, and terminate when its job is done (while, at most, sharing only a reference to a logger/sequential-worker that has a queue to which you append). If you are just writing a single-threaded program then *socket programming is already reactive* and you've got nothing to worry about. Write in a functional style, iterating over your targets, and that's that. No magic library or framework needed! – zxq9 Sep 12 '17 at 09:50
  • I did not understand your comment "having each thread spawn its request target". I thought Python did not support multithreading. Anyway, at this point performance is not on my priority list. I'm okay with single-threaded access. Rx looks promising. Will give it a try soon. Thanks. – ameyazing Sep 17 '17 at 04:37
  • @ameyazing Multithreading in Python works, but there are certain constraints related to the GIL and data access. IMO, given the existence of Erlang and the ease of interoperation via sockets (BERT is nice), I just haven't found multithreading and multiprocessing in Python to be a big win. It's just not a strength of OOPy languages. – zxq9 Sep 17 '17 at 11:55
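To put the client-side suggestion from the comments above into code, here is a rough sketch (hypothetical names; urllib is only a stand-in for whatever fetch library the crawler uses, and error handling is omitted for brevity): each thread is spawned with its target, opens its own connection, terminates when its job is done, and shares nothing but a results queue that a single sequential worker drains.

```python
# Sketch of a thread-per-target client: each fetcher owns its own
# connection and shares only a results queue with a sequential worker.
import queue
import threading
import urllib.request


def fetch(url, results):
    """Spawned with its target; opens its own connection, then terminates."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        results.put((url, resp.status, len(resp.read())))


def log_results(results, expected):
    """The single sequential worker that owns the side-effecting end."""
    for _ in range(expected):
        url, status, size = results.get()
        print(f"{url}: HTTP {status}, {size} bytes")


def crawl(urls):
    results = queue.Queue()
    threads = [threading.Thread(target=fetch, args=(u, results)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    log_results(results, expected=len(threads))


crawl(["https://example.com", "https://www.python.org"])
```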