Sunday, 6 April 2008

COP - The better way of OOP?

Over the last few weeks I haven't had as much time for developing my Erlang software as I wanted, but I've been busy refactoring the Tideland CEL Lightweight Message Bus. The goal was to reach beta state, but after some tests and the development of a first real service I realized where I still have work to do. Now the system follows the OTP design principles more closely than before, supports stateful services, behaves better under load, scales better, and has a simpler API. *smile* Currently I'm running stability tests in which services are restarted automatically when they die. But that's not the focus of my entry today.

I've been developing software the object-oriented way for about 20 years now, mostly in Smalltalk, Python, and Java. The common paradigm in many tutorials has been that objects are things with some knowledge that communicate with each other by sending messages and receiving answers. But when programming in those systems it feels more like calling functions with an invisible record as the first argument, used to access the record's fields. The program itself is still a kind of imperative programming. OK, there is inheritance, the overriding of methods, and polymorphism. So there's a bit more, yes. And with threads or processes there are classes which allow asynchronous execution. But after all there's still no feeling of really independent objects populating a common world, living together, acting autonomously, communicating through real messages.

But now, after learning concurrency-oriented programming with Erlang, it's different. At first glance Erlang seems to be a strange language, with its Prolog roots and its functional style. So you have to get accustomed to the pattern matching (it's great) and to the fact that every variable can only be bound once. A process is just a function that is spawned to work in the background. It can run once or endlessly through tail recursion. But the really important fact is that every process has a queue for asynchronously receiving real messages. The receive construct uses the typical Erlang pattern matching, so a process can handle different messages differently. Additionally the construct can contain a timeout clause for automatic tasks after some time of idling. Links and monitors allow processes to be notified when another process dies, once again through messages sent to the linked or monitoring process.

my_object(State) ->
    receive
        {method_a, Arg1, Arg2} ->
            NewState = do_method_a(State, Arg1, Arg2),
            my_object(NewState);
        {method_b, Arg1} ->
            NewState = do_method_b(State, Arg1),
            my_object(NewState);
        %% Exit signals only arrive as messages if the process traps exits.
        {'EXIT', Pid, Reason} ->
            NewState = handle_exit(State, Pid, Reason),
            my_object(NewState)
    end.
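To make the pieces described above concrete (spawning, asynchronous sends, the receive timeout, and a monitor), here is a minimal self-contained sketch. The module name counter_object and all of its functions are hypothetical and only serve as illustration; they are not part of the Lightweight Message Bus.

```erlang
-module(counter_object).
-export([start/0, increment/2, value/1, stop/1, loop/1, demo/0]).

%% Spawn the "object": a process running loop/1 with 0 as initial state.
start() ->
    spawn(?MODULE, loop, [0]).

%% Asynchronous message, no answer expected.
increment(Pid, N) ->
    Pid ! {increment, N},
    ok.

%% A request/reply pair built from two asynchronous messages.
%% The unique reference matches the answer to this request.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Value} -> Value
    after 1000 ->
        timeout
    end.

stop(Pid) ->
    Pid ! stop,
    ok.

%% The tail-recursive receive loop; Count is the state of the object.
loop(Count) ->
    receive
        {increment, N} ->
            loop(Count + N);
        {value, From, Ref} ->
            From ! {Ref, Count},
            loop(Count);
        stop ->
            ok
    after 5000 ->
        %% Place for an automatic task after 5 seconds of idling;
        %% this sketch simply keeps waiting.
        loop(Count)
    end.

%% Monitor the process, talk to it, stop it, and wait for the
%% 'DOWN' message that the monitor delivers when it terminates.
demo() ->
    Pid = start(),
    MonRef = erlang:monitor(process, Pid),
    increment(Pid, 2),
    increment(Pid, 3),
    Value = value(Pid),
    stop(Pid),
    receive
        {'DOWN', MonRef, process, Pid, Reason} ->
            {Value, Reason}
    after 1000 ->
        {Value, no_down_message}
    end.
```

Calling counter_object:demo() from the shell should return {5, normal}: the two increments are handled before the value request because messages between two processes keep their order, and the process ends with the reason normal.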

This may look inconvenient. But generic OTP modules like gen_server, together with their callbacks, hide this mechanism and allow quick, convenient, and powerful implementations. Custom generic modules can build on other ones, so a kind of inheritance can be implemented. In the case of my Lightweight Message Bus the services are simply modules subscribing to the bus through

cellmb:subscribe(service_name, my_service_module)

In the case of a stateful service, a call of

cellmb:publish(Ctx, service_name, do_something, Args)

would lead to the execution of

my_service_module:do_something(Ctx, Args, State).

This way implementing your own services is really simple. After some time of learning, working with these kinds of objects becomes more and more natural. You don't even have to do very much to distribute those processes over multiple cores, processors, nodes, or computers. But you have to rethink what you know about application design to make optimal use of this concurrency-based behaviour. It is still not trivial to find out whether and how a problem can be solved through parallel execution.
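As a rough sketch of how gen_server hides the hand-written receive loop shown earlier, a small stateful "object" could be written as a callback module like the following. The module counter_server and its API are hypothetical and have nothing to do with the actual implementation of the bus; only the gen_server calls themselves are standard OTP.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Public API (hypothetical names, for illustration only).
-export([start_link/0, increment/2, value/1]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, 0, []).

%% Asynchronous "method": the caller does not wait for an answer.
increment(Pid, N) ->
    gen_server:cast(Pid, {increment, N}).

%% Synchronous "method": the caller waits for the reply.
value(Pid) ->
    gen_server:call(Pid, value).

%% The state of this object is just the current count.
init(Count) ->
    %% Trap exits so that 'EXIT' messages of linked processes
    %% reach handle_info/2.
    process_flag(trap_exit, true),
    {ok, Count}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast({increment, N}, Count) ->
    {noreply, Count + N}.

%% All other messages end up here, including 'EXIT' messages.
handle_info({'EXIT', _Pid, _Reason}, Count) ->
    {noreply, Count};
handle_info(_Other, Count) ->
    {noreply, Count}.
```

The receive loop, the message queue handling, and the reply protocol all live inside gen_server; the module only describes how each message changes the state.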

What's missing: Not everything is an object, only those processes working with receive and tail recursion. So you can't ask a string for its length, as you can in Smalltalk. Instead Erlang provides helpful libraries for working with the standard and higher-level data types. If a pure object-oriented language is required, Erlang doesn't fit. But for me this doesn't hurt. I'm a fan of the clean style of Smalltalk, but Erlang is productive and expressive nevertheless. So why care whether everything is an object? And the question hasn't been about the language; it has been about concurrency-oriented programming as a way of object-oriented programming. A typical Erlang system consists of up to several thousand processes on each node, processes like the ones I've described above. They are based on the generic server, the event handler, the finite state machine, the supervisor, or custom implementations. And they all work like objects in a real world, truly in parallel, communicating with each other when needed. To me this behaviour seems to be the better way of object-oriented programming.

2 Comments:

Anonymous said...

weird that you did not mention ruby ;)

especially .send and method_missing should give a lot of freedom to really talk only with objects therein

What about the Io language?
I think its syntax is worse than ruby, but it has one really huge idea going for it - prototypes (plus that it has very few keywords)

Its kinda like modelling your objects without the need to make a class first, and then clone from that class

BTW i think if one does not mind Erlang's syntax (i mind, but i am a bad programmer anyway) then the syntax of Io is appealing even more

mue said...

It's not about the language. I know Ruby, but I like Smalltalk more.

Here I talk about the behaviour, the real parallel execution of thousands of processes, and their asynchronous communication through messages and message queues.