Sunday, 27 April 2008

Number 16

Last Friday we had our last spring whisky tasting at Kottkamp; the next one will be in September. Once again it was a very warm feeling to be welcomed by my friends. It's really a very good community.

Because of the long time until we meet again I wanted to buy a new malt for my collection. My first thought was the Benromach Peat Smoke I had tasted in February. But this time it wasn't as good as I remembered it from the first tasting. So I tried the new Dallas Dhu Signatory 1975. This 31-year-old Speyside is very dry, with notes of vanilla and butterscotch, salt, and citrus fruits. Very nice, but not the one I wanted. It seemed that my taste this evening was for Islays. So the next one was the Bowmore Vintage 1989, a very good 16-year-old malt matured in a bourbon cask and non-chill-filtered. The flavour has notes of butterscotch, candied fruits, marzipan, and smoke. The palate contains vanilla, soft fruits, and some peat. And the finish is soft and very elegant.

I had almost decided to take this malt when I got a hint to taste the Bruichladdich Infinity. This Islay has no age statement; it is a mix of three malts from the years 1989 to 1991, all matured in sherry casks. The flavour contains creamy notes of sherry, dried fruits, and some chocolate, very nice. It continues with more fruits, mostly citrus, malt, sherry, and traces of pepper. The finish is complex, spicy, with dark fruits. A very good Islay, so I decided to make it my number 16.

The last one this evening was a Banff Signatory 1975, an old and special Speyside. It was good, with plum, honey, hay, cinnamon, and sweet herbs. The finish is warm and soft. But after the Islays I had no real feeling for this malt.

So at the end of the tasting I said goodbye to my friends, took my Bruichladdich, and cycled home to test whether the contents of my bottle taste like the one I had tasted before. *smile*

Saturday, 26 April 2008

Erlang for the OO-minded

Somehow my thoughts about concurrency-oriented programming as a better way of object-orientation seem to have jolted the community. I've never had so much traffic on my site. The discussion remained undecided. Some had problems with the functional nature of Erlang, others with the single assignment of variables. Those who understood the ideas behind Erlang messaging and the generic server were able to follow me. So I'll try to make it clearer.

So let's take a small - and useless *smile* - Java class:

public class Adder {
    private int a, b;

    public void setA(int anA) {
        a = anA;
    }

    public void setB(int aB) {
        b = aB;
    }

    public int result() {
        return a + b;
    }
}

The usage would be really simple:

Adder myAdder = new Adder();

myAdder.setA(1);
myAdder.setB(2);

System.out.println(myAdder.result());

And now the same thing in Erlang. I will show it in two different ways. The first one works without processes. It is simple, straightforward, and can be realized this way in many languages. But it doesn't use the advantages of Erlang's processes.

create() ->
{0, 0}.

set_a(A, {_oldA, B}) ->
{A, B}.

set_b(B, {A, _oldB}) ->
{A, B}.

result({A, B}) ->
A + B.

If the module of this code is called adder the usage would be

A1 = adder:create(),

A2 = adder:set_a(1, A1),
A3 = adder:set_b(2, A2),

io:format("~w", [adder:result(A3)]).

That's no real object-oriented way; it only shows how many Erlang modules work, e.g. those that create and manage dictionaries. The data is managed using the basic and higher-level Erlang data types like tuples and lists. Every change creates a new value or data structure because variables in Erlang can only be assigned once. Besides optimization, the major reason is the prevention of side effects. More about that later.
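As a small illustration of this style, here is a sketch using the standard dict module. The module name dict_demo and the function run/0 are made up for this example; only the dict calls come from the standard library.

```erlang
-module(dict_demo).
-export([run/0]).

%% Every dict function returns a new dictionary; the old one is untouched.
run() ->
    D1 = dict:new(),
    D2 = dict:store(a, 1, D1),
    D3 = dict:store(b, 2, D2),
    %% D1 is still empty and D2 still holds only one entry.
    Sum = dict:fetch(a, D3) + dict:fetch(b, D3),
    {dict:size(D1), dict:size(D2), dict:size(D3), Sum}.
```

Calling dict_demo:run() returns the sizes of the three dictionaries and the sum of the two stored values, which shows that the earlier versions stay valid after each "change".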

The implementation as a process looks a bit different:

%% loop/2 has to be exported so that spawn/3 can call it.
create() ->
    spawn(?MODULE, loop, [0, 0]).

set_a(Pid, A) ->
    Pid ! {set_a, A}.

set_b(Pid, B) ->
    Pid ! {set_b, B}.

result(Pid) ->
    Pid ! {result, self()},
    receive
        {response, Value} -> Value
    end.

%% Variables start with an uppercase letter; newA would be an atom.
loop(A, B) ->
    receive
        {set_a, NewA} ->
            loop(NewA, B);
        {set_b, NewB} ->
            loop(A, NewB);
        {result, Pid} ->
            Pid ! {response, A + B},
            loop(A, B)
    end.

The major parts are the creation of the process using spawn and the process function loop. Inside this function messages are received and processed. After that the function calls itself tail-recursively, so there'll be no stack overflow. The state of the process - in object-oriented languages called attributes, instance variables, or properties - is maintained in the arguments A and B. Instead of several single arguments, one tuple or record containing a complex data structure can be used. The other functions above are just helper functions, especially the result function. This is due to the asynchronous message handling, where returning the result is also a message send and receive. So the usage will be

Pid = adder2:create(),

adder2:set_a(Pid, 1),
adder2:set_b(Pid, 2),

io:format("~w", [adder2:result(Pid)]).

Here you can easily see how this time only one object - the process - is created and modified. Because a process has only one message queue, all messages are handled sequentially. So there are no problems with synchronization, semaphores, or locks. Multiple processes can use this process with no problems. And here the old metaphor of sending messages to objects is really true. You may think "Nah, doing dispatching on my own, bullshit." But OTP modules like gen_server and the callback mechanism allow you to simplify that. They are generic, like abstract classes, and provide all the needed stuff so that you can concentrate on the business logic and some comfort functions.
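To give an impression of how gen_server takes over the dispatching, here is a minimal sketch of the adder as a gen_server callback module. The module name adder3 and the choice of casts for the setters and a call for the result are my assumptions for this example, not the only way to do it.

```erlang
-module(adder3).
-behaviour(gen_server).

%% Public API, the "methods" of the object.
-export([create/0, set_a/2, set_b/2, result/1]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2]).

create() ->
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    Pid.

%% Setters are asynchronous casts, the result is a synchronous call.
set_a(Pid, A) -> gen_server:cast(Pid, {set_a, A}).
set_b(Pid, B) -> gen_server:cast(Pid, {set_b, B}).
result(Pid)   -> gen_server:call(Pid, result).

%% The state is the tuple {A, B}.
init([]) ->
    {ok, {0, 0}}.

handle_call(result, _From, {A, B} = State) ->
    {reply, A + B, State}.

handle_cast({set_a, A}, {_OldA, B}) ->
    {noreply, {A, B}};
handle_cast({set_b, B}, {A, _OldB}) ->
    {noreply, {A, B}}.
```

The receive loop, the reply message, and the tail recursion are all hidden inside gen_server; the callback module only describes how each message changes the state.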

So where's the advantage? Surely not in those tiny processes; I would implement them as standard modules like the first example. But the strength of Erlang is concurrency, the parallel processing. Spawned processes do not work sequentially but really in parallel, on one core, on multiple cores, on multiple processors, and on multiple systems. And that's the big advantage. Think about a special architecture like pipes and filters for the processing of a larger amount of data.


In a typical sequential way each retrieved insurance holder would be processed step by step and typically on a single processor. The processing of a large number of insurance holders and their contracts would take a long time. One solution could be the usage of multi-threading for the filters together with synchronized data queues for the pipes. Using inheritance simplifies the implementation. This solution would use multiple cores and processors. But there's still a limitation in the distribution of the filters or groups of filters. For example, everything up to the premium filter could be on a first system and the two branches behind it on two further systems. With most languages you would need special pipes for the remote communication, which again would make the whole solution more complex.

To distribute processes in Erlang it would be just necessary to add another function:

create(Node) ->
spawn(Node, ?MODULE, loop, [0, 0]).

This way the adder - or in the scenario above a filter - could be started on a different node and be used as if it were working locally.

Pid = adder2:create('my_node@my-server.in.my.net').
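Here is a self-contained sketch of such a module that can be tried without a second machine. The module name adder_dist is an assumption for this example. It uses node() as the target, which is the local node, so the remote variant can be tested on a single VM; for a real remote node both VMs need a name and the same cookie.

```erlang
-module(adder_dist).
-export([create/0, create/1, set_a/2, set_b/2, result/1, loop/2]).

%% Without an argument the process is started on the local node.
create() ->
    create(node()).

%% spawn/4 starts the process on the given node.
create(Node) ->
    spawn(Node, ?MODULE, loop, [0, 0]).

set_a(Pid, A) -> Pid ! {set_a, A}, ok.
set_b(Pid, B) -> Pid ! {set_b, B}, ok.

%% Works the same way for local and remote pids.
result(Pid) ->
    Pid ! {result, self()},
    receive
        {response, Value} -> Value
    end.

loop(A, B) ->
    receive
        {set_a, NewA} -> loop(NewA, B);
        {set_b, NewB} -> loop(A, NewB);
        {result, From} ->
            From ! {response, A + B},
            loop(A, B)
    end.
```

The helper functions never look at where the pid lives, which is why the calling code stays identical for local and remote adders.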

Besides that there's no need for more implementation. Only the VMs have to be started with a name, a cookie for securing the networking, and a host file with the names of all the nodes. It's funny how simple it is. The example above also shows how Erlang handles polymorphism. One way is the arity of the functions. That's the reason why the export of functions also contains the number of arguments. Here's one small example defining a function in the adder module.

-module(adder).
-export([add/1]).

add(List) ->
add(List, 0).

add([Head|Tail], Acc) ->
add(Tail, Acc + Head);
add([], Acc) ->
Acc.

Only the add function with one argument is exported; the other one is internal. It also shows the second way of polymorphism, pattern matching. While there are elements in the list, the first of the two add clauses with two arguments is executed. It adds the head element to the accumulator and continues recursively with the tail. If the list is empty the second clause is called, which returns the accumulator as the result. Besides function heads, pattern matching also works in case and receive expressions. I've shown this already in the adder process function above. It's easy to see how the received tuples could contain the same command atom as the first element and then a different number of arguments as the further elements.
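The last point can be sketched with a case expression. The module name shape_demo and the shapes themselves are only made up for this illustration.

```erlang
-module(shape_demo).
-export([area/1]).

%% The same command atom with a different number of further elements
%% selects a different clause.
area(Shape) ->
    case Shape of
        {rect, Side}          -> Side * Side;
        {rect, Width, Height} -> Width * Height;
        {circle, Radius}      -> math:pi() * Radius * Radius;
        _Other                -> {error, unknown_shape}
    end.
```

The two rect clauses differ only in the size of the tuple, so the caller decides by the shape of the data which code runs.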

Another way to realize polymorphism is guards. Those are constraints which can be added to function definitions and pattern matches. One major task of guards is type checking. Erlang uses duck typing, so the arity is sometimes not enough, e.g. for a function that appends anything in its string representation to a string.

string_append(String, Float) when is_float(Float) ->
...;
string_append(String, Integer) when is_integer(Integer) ->
...;
string_append(String, Tuple) when is_tuple(Tuple) ->
...

Multiple guards can also be combined using a semicolon (or) and a comma (and). They will be evaluated short-circuited to increase the performance. Their flexible definition additionally allows a more powerful polymorphism than in traditional languages. Think about a process for withdrawals which shall do this differently for different amounts:

loop(State) ->
receive
{withdraw, Amount, Account, Lo, Hi} when Amount =< Lo ->
% Perform a standard withdraw.
...;
{withdraw, Amount, Account, Lo, Hi} when Amount > Hi ->
% Perform a special customer approval before the withdraw.
...;
{withdraw, Amount, Account, Lo, Hi} ->
% Perform a simple customer approval for withdrawals between lo and hi.
...
end.
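A runnable sketch of the same decision as a plain function, together with the comma and semicolon combination mentioned above, could look like this. The module name withdraw_demo and the function names approval/3 and kind/1 are assumptions for this example.

```erlang
-module(withdraw_demo).
-export([approval/3, kind/1]).

%% Clauses are tried from top to bottom; the first matching guard wins.
approval(Amount, Lo, _Hi) when Amount =< Lo -> standard;
approval(Amount, _Lo, Hi) when Amount > Hi  -> special;
approval(_Amount, _Lo, _Hi)                 -> simple.

%% A comma means "and", a semicolon means "or".
kind(X) when is_integer(X), X >= 0      -> natural;
kind(X) when is_integer(X); is_float(X) -> number;
kind(_)                                 -> other.
```

Because the clauses are tried in order, the last clause of approval/3 needs no guard: it only sees amounts between the two limits.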

One big part of object-orientation is still missing: inheritance. Here Erlang has no real solution in the sense of deep hierarchies based on one root class. But with behaviours and callbacks you can at least realize something like abstract classes and their children. The OTP libraries use this for several powerful modules. Here's my very small implementation of the generic server. The original one is far more sophisticated.

-module(server).
-export([start/2, stop/1, call/2, loop/2]).

start(Module, Args) ->
% Call init/1 in Module.
% It has to return an initial state.
State = Module:init(Args),
spawn(?MODULE, loop, [Module, State]).

stop(Pid) ->
Pid ! stop,
ok.

call(Pid, Msg) ->
Pid ! {call, Msg, self()},
receive
{response, Value} -> Value
end.

loop(Module, State) ->
receive
{call, Msg, Pid} ->
% Call function handle/2 in Module.
% It has to return {Value, NewState}.
{Value, NewState} = Module:handle(Msg, State),
Pid ! {response, Value},
loop(Module, NewState);
stop ->
% Call function terminate/1 in Module.
Module:terminate(State)
end.

This way the developer just has to implement the three functions init/1, handle/2 for each message, and terminate/1.

-module(account_server).
-export([init/1, handle/2, terminate/1]).

init(Args) ->
% Create an initial state, e.g. a database connection.
...

handle({open, Account}, State) ->
...,
{0, NewState};
handle({withdraw, Amount, Account}, State) ->
...,
{Balance, NewState};
handle({deposit, Amount, Account}, State) ->
...,
{Balance, NewState};
handle({balance, Account}, State) ->
...,
{Balance, NewState}.

terminate(State) ->
...,
ok.

So a simple session could be:

Pid = server:start(account_server, DatabaseName),

server:call(Pid, {open, 4711}),
server:call(Pid, {deposit, 1000.0, 4711}),
server:call(Pid, {withdraw, 250.0, 4711}),

Balance = server:call(Pid, {balance, 4711}),

% Balance now should be 750.0.

server:stop(Pid).

As written above, this is typically handled in a more powerful and elegant way, but this example should be enough to let you understand how Erlang processes can be seen as a kind of object. In addition to the features I mentioned here, the receive construct also knows a time-based action which is executed when no message has arrived for a given time. And through a simple mechanism parent processes can be notified if a child dies. Both of these features again allow more powerful solutions. Maybe this is reason enough for you to be as interested as I am in developing with Erlang.
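These two features can be sketched in a few lines. The module name watch_demo and both function names are made up for this example, and the notification is done here with a monitor via spawn_monitor/1; links with trapped exits would be the other option.

```erlang
-module(watch_demo).
-export([wait/1, watch/0]).

%% Wait for a ping; give up after Timeout milliseconds without one.
wait(Timeout) ->
    receive
        ping -> got_ping
    after Timeout ->
        timeout
    end.

%% Start a child that exits at once and wait for the notification.
watch() ->
    {Pid, Ref} = spawn_monitor(fun() -> exit(done) end),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {child_died, Reason}
    after 1000 ->
        no_notification
    end.
```

The death of the child arrives as an ordinary message, so it is handled with the same receive and pattern matching as every other message.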

Friday, 18 April 2008

Exciting days

These days are a bit exciting, with little spare time for me. It started last Friday with a quick flight to Poland and the birthday party of my brother-in-law in the evening. On Sunday we celebrated the confirmation of our niece. Then from Monday till Thursday I took two courses in software architecture by the CMU SEI - Software Architecture: Principles and Practices (SAPP) and Documenting Software Architectures (DSA). On Monday and Tuesday Software Product Lines (SPL) and in May Software Architecture Design and Analysis (SADA) will follow. Last Tuesday I finished the work on my Erlang article which will be published next month. Today our little daughter Vanessa has her 12th birthday and on Sunday our older daughter Janina has her confirmation. *phew*

But I also had the chance to start the next improvement on the Tideland CEL Lightweight Message Bus. I'm currently adding a registry for the dynamic resolution of service names in a set of networked nodes. So if a publish can't be addressed to a local service the broker will retrieve a reference to an instance from the other nodes and cache this information. In the next step I'll add some kind of aspect orientation. So functions for cross-cutting concerns can be assigned to services so that they are executed before, after or around a service function.

My postings on COP and OOP led to much interest and response. Some of the comments in other forums showed that the writers don't know Erlang at all. In their eyes functional programming and object-orientation are diametrically opposed. So how about CLOS? Hmmm. But others could follow. So I'll write a small introduction, "Erlang for the OO-minded".

Monday, 7 April 2008

Ideas for an Erlang Object System

Using the Erlang concurrency-oriented style for object-oriented programming lacks the elegance of languages like Smalltalk. One way to solve this problem would be a pre-processor with its own syntax generating the Erlang code. But I don't like this solution because it would feel like a foreign body. Additionally this language would have to be complete, documented, and able to use the Erlang libs. So a different idea would be a simple library, almost like gen_server, but more with OO ideas in mind. It would rely on callback modules together with the dynamic invocation of functions. First a call of ObjRef = eos:new(my_module) or ObjRef = eos:new(my_module, Args) would create an instance as a new linked process. The initialization could use my_module:new(Args, InitialState). The initial state would be the result of a recursive initialization through calling my_module:parent_module() and initializing those modules. This behaviour, calling parent_module/0 until it returns undefined or doesn't exist in a module, would be the basis for inheritance.

To simplify life the EOS should only support synchronous method invocations and ignore all other Erlang messages. The call of Result = eos:invoke(ObjRef, my_method, Args) would lead to the call of my_module:my_method(Args, State) and has to be answered with Result or {Result, NewState}. The function invoke/3 would look for the function inside the module and, if it isn't exported in that module, recursively in the parent modules. If it can't be found it would try to invoke does_not_understand/2 the same way. If even this function can't be found the system should raise an error.
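The lookup described here could be sketched as follows. Everything in it is hypothetical, since the EOS doesn't exist yet: the module name eos_sketch, the function find_module/3, and the return values are only assumptions. A method my_method(Args, State) would be looked up with arity 2.

```erlang
-module(eos_sketch).
-export([find_module/3]).

%% Walk up the parent_module/0 chain until a module exports Method/Arity.
find_module(undefined, _Method, _Arity) ->
    not_found;
find_module(Module, Method, Arity) ->
    %% function_exported/3 only sees loaded modules, so load first.
    _ = code:ensure_loaded(Module),
    case erlang:function_exported(Module, Method, Arity) of
        true ->
            {ok, Module};
        false ->
            case erlang:function_exported(Module, parent_module, 0) of
                true  -> find_module(Module:parent_module(), Method, Arity);
                false -> not_found
            end
    end.
```

An invoke/3 built on top of it would first look for the method itself, then in the same way for does_not_understand/2, and raise an error if both lookups return not_found.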

The disposal of the object could be done manually through eos:dispose(ObjRef) or automatically using the typical Erlang mechanism of linked processes and their notification. Altogether this system is really simple and it definitely doesn't compete with the standard OO languages. But it may help some experienced OO developers to feel more at home in the Erlang world. What do you say?

Sunday, 6 April 2008

Second 2008 tasting

So much work, so little time. The second single malt tasting at Kottkamp already took place on March 28th. But I needed until today to write some notes about it. *sigh* Once again it was great fun to meet friends, talk, and taste good whiskies. Since you now need an entrance card the audience has changed a bit. The limited cards - the store is simply too small - sell out very fast. So you've got to reserve your card early.

Sadly I forgot my little form for my tasting notes. So I'll try to remember how the malts have been. I hope I'll be better organized next time. *smile*.

The first one was the best one, once again a Glenrothes 33y Signatory, sherry butt, cask strength. A rich complex nose with orange and honey, the palate is warm and fruity with some vanilla, and the finish is warm, creamy and long lasting. A great malt, but sadly also expensive. The second one was also very good, but not as good as the Glenrothes. It was a Clynelish 1973 33y Signatory with 54%. The nose contains notes of peat, nuts, and citrus fruits, the palate contains honey, fruits, salt, and smoke, and the finish is very long with some salt. The third one was a Highland Park 1985 21y, always a good choice, followed by a Laphroaig 10y. The last one this evening was a good Speyside again, a Banff 1975 29y. But I wasn't able to pay the right attention to it after the Islays before. *smile*.

At the end of April we'll have the last spring tasting. You'll read about it.

COP - The better way of OOP?

In the last weeks I haven't had as much time for the development of my Erlang software as I wanted, but I've been busy refactoring the Tideland CEL Lightweight Message Bus. The goal was to reach the beta state, but after some tests and the development of a first real service I realized where I still have to do some work. Now the system follows the OTP design principles more than before, supports stateful services, has an improved load behaviour, scales better, and the API is simpler. *smile* Currently I'm doing some stability tests where services are restarted automatically when they die. But that's not the focus of my entry today.

I have now been developing software the object-oriented way for about 20 years, mostly in Smalltalk, Python, and Java. The common paradigm in many tutorials has been that objects are a kind of thing with some knowledge, communicating with each other by sending messages and receiving the answers. But when programming in those systems it feels more like calling functions with an invisible record as the first argument to access the record fields. The program itself is a kind of imperative programming. OK, there are inheritance, the overriding of methods, and polymorphism. So there's a bit more, yes. And with threads or processes there are classes which allow an asynchronous execution. But after all there's still no feeling of really independent objects populating a common world, living together, acting autonomously, communicating through real messages.

But now, after learning concurrency-oriented programming with Erlang, it's different. At first glance Erlang seems to be a strange language, with Prolog roots and a functional way of working. So you have to get accustomed to the pattern matching - it's great - and the fact that every variable can only be written once. A process is just a function that is spawned to work in the background. It can run once or endlessly through tail-recursion. But the really important fact is that every process has a queue for the asynchronous receiving of real messages. The receive construct uses the typical Erlang pattern matching so that a process can handle different messages differently. Additionally the construct can contain a timeout clause for automatic tasks after some time of idling. Links and monitors allow processes to get notified if another process is dying, once again through sending messages to the monitoring process.

my_object(State) ->
receive
{method_a, Arg1, Arg2} ->
NewState = do_method_a(State, Arg1, Arg2),
my_object(NewState);
{method_b, Arg1} ->
NewState = do_method_b(State, Arg1),
my_object(NewState);
{'EXIT', Pid, Reason} ->
NewState = handle_exit(Pid, Reason),
my_object(NewState)
end.

This may look inconvenient. But the generic OTP modules like the gen_server and callbacks hide this mechanism and allow quick, convenient, and powerful implementations. Own generic modules can integrate other ones, so some kind of inheritance can be implemented. In case of my Lightweight Message Bus the services are simply modules subscribing to the bus through

cellmb:subscribe(service_name, my_service_module)

In case of a stateful service a call of

cellmb:publish(Ctx, service_name, do_something, Args)

would lead to the execution of

my_service_module:do_something(Ctx, Args, State).

This way the implementation of your own services is really simple. After some time of learning, working with those kinds of objects gets more and more natural. You don't even have to do very much to distribute those processes over multiple cores, processors, nodes, or computers. But you've got to rethink your knowledge about application design to optimally use this concurrency-based behaviour. It is still not trivial to find out whether and how problems can be solved through parallel execution.

What's missing: Not everything is an object, only those processes working with receive and tail-recursion. So you can't ask a string for its length, like it can be done in Smalltalk. Instead Erlang provides helpful libraries for the work with the standard and higher-level data types. If a pure object-oriented language is required, Erlang doesn't fit. But for me this doesn't hurt. I'm a fan of the clean style of Smalltalk. Nevertheless Erlang is productive and expressive. So why care if everything is an object? And the question hasn't been about the language, it has been about concurrency-oriented programming as a way of object-oriented programming. A typical Erlang system consists of up to several thousand processes on each node, processes like the ones I've described above. They are based on the generic server, the event handler, the finite state machine, the supervisor, or their own implementations. And they all work like objects in a real world, really in parallel, communicating with each other when needed. For me this behaviour seems to be the better way of object-oriented programming.