My criticism of OOP
Introduction
I used to be an OOP fanboy, but then I had to work with procedural code. I found it easier to understand, and bolting new features onto an unfamiliar code base was easier too. That said, the procedural approach is quite different and may look alien to those who have never used it.
In this article I compare the traditional procedural approach with the more recent class-based one, and show with examples why forcing everything into classes is no better than following a consistent procedural style: it does not give you anything extra. It just lets you say the same old thing in a different way.
Method calls
You write something like this in your OOP language:
result = object.method(param1, param2);
The same in a fully procedural style would look like this (I use the C#-style out and ref keywords here):
method(out result, ref object, param1, param2);
But what do you do if you have a situation like this:
proc(out result1, out result2, ref object1, ref object2, param1, param2);
In OOP style you would write something like this:
(result1, result2) = (object1, object2).proc(param1, param2);
Mainstream OOP languages do not support multiple return values, but this is not a problem: you can define a struct for that. If you want to call a method on multiple objects, however, you are in trouble. Here is a case where you would want to: let's assume you are writing a Breakout-like game. Various kinds of balls can collide with various kinds of bricks (both are separate polymorphic types), and the outcome of each collision can be different. It changes the state of both objects at the same time, and the new state is calculated from the previous state of both.
OOP languages do not provide a good solution for this. You need to either break encapsulation or work around the problem with visitors. Neither solution is elegant. In a procedural language there is no need to worry much about encapsulation between your own types: one function simply takes both structs.
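Here is a rough C sketch of the proc(out result1, out result2, ref object1, ref object2, ...) shape from above. C has no out or ref, so plain pointers stand in for both. The Ball and Brick structs, their fields, the function name collideBallBrick and the collision rule itself are all made up for illustration; only the shape of the call matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical game objects; the fields are illustrative only. */
typedef struct { int power; } Ball;
typedef struct { int hitPoints; } Brick;

/* Both objects are passed by pointer ("ref") and both results are
   written through pointers ("out"). Neither object owns the operation. */
void collideBallBrick(int *brickDestroyed, int *scoreGained,
                      Ball *ball, Brick *brick, int scorePerHit)
{
    assert(brickDestroyed != NULL && scoreGained != NULL);
    assert(ball != NULL && brick != NULL);

    /* Read the previous state of both objects first... */
    int oldPower = ball->power;
    int oldHitPoints = brick->hitPoints;

    /* ...then update both from it in one place. */
    brick->hitPoints = oldHitPoints - oldPower;
    ball->power = (oldPower > oldHitPoints) ? oldPower - oldHitPoints : 0;

    *brickDestroyed = (brick->hitPoints <= 0);
    *scoreGained = *brickDestroyed ? 2 * scorePerHit : scorePerHit;
}
```

A caller declares two result variables and passes their addresses, for example collideBallBrick(&destroyed, &score, &ball, &brick, 10). No visitor is needed and neither struct needs to know about the other.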
Encapsulation
Encapsulation is a good thing if you provide an interface to a third party. In OOP languages it is done with private members and functions. But this does not really hide the internals of your class, since they live between the same braces as your public interface. In the procedural world, handles are used. Consider the following C code:
In header (to publish):
    /* Declaring FooHandle as a pointer to an incomplete struct. */
    typedef struct FooStruct *FooHandle;

    FooHandle createFoo(void);
    void doSomthingWithFoo(FooHandle foo, int x, int y);
    void destroyFoo(FooHandle foo);
In the C file (not to publish):
    #include <stdlib.h>
    #include "foo.h"

    /* Completes the struct type that foo.h only declares.
       The tag must be FooStruct, otherwise this would be an unrelated type. */
    struct FooStruct { int x, y; };

    FooHandle createFoo(void)
    {
        FooHandle foo = malloc(sizeof(*foo));
        if (foo != NULL) {
            foo->x = 0;
            foo->y = 0;
        }
        return foo;
    }

    void doSomthingWithFoo(FooHandle foo, int x, int y)
    {
        foo->x = x;
        foo->y = y;
    }

    void destroyFoo(FooHandle foo)
    {
        free(foo);
    }
As you can see, a third party cannot learn the internals of FooStruct from the header; the pointer is completely opaque. This truly encapsulates the internals of your data, doesn't it? C's type checking is looser than that of most OOP languages, but since FooHandle points to a distinct struct type, compilers diagnose an attempt to pass a pointer to a different type (a void * still converts silently).
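To make the point concrete, here is a single-file sketch of how client code sees such a handle. The three sections mimic the header, the client and the private C file. The accessor getFooSum and the function clientUsesFoo are hypothetical additions of mine so that the client has something observable to return; the rest follows the listing above.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What foo.h would publish: an incomplete type and functions. ---- */
typedef struct FooStruct *FooHandle;
FooHandle createFoo(void);
void doSomthingWithFoo(FooHandle foo, int x, int y);
int getFooSum(FooHandle foo);   /* hypothetical accessor, not in the original */
void destroyFoo(FooHandle foo);

/* ---- Client code: only the declarations above are visible here. ---- */
int clientUsesFoo(void)
{
    FooHandle foo = createFoo();
    if (foo == NULL)
        return -1;
    doSomthingWithFoo(foo, 2, 3);
    /* foo->x = 5; would not compile here: struct FooStruct is incomplete. */
    int sum = getFooSum(foo);
    destroyFoo(foo);
    return sum;
}

/* ---- What foo.c would keep private: the actual layout. ---- */
struct FooStruct { int x, y; };

FooHandle createFoo(void)
{
    FooHandle foo = malloc(sizeof(*foo));
    if (foo != NULL) {
        foo->x = 0;
        foo->y = 0;
    }
    return foo;
}

void doSomthingWithFoo(FooHandle foo, int x, int y)
{
    assert(foo != NULL);
    foo->x = x;
    foo->y = y;
}

int getFooSum(FooHandle foo)
{
    assert(foo != NULL);
    return foo->x + foo->y;
}

void destroyFoo(FooHandle foo)
{
    free(foo);
}
```

The client function sits before the struct definition on purpose: at that point the compiler knows only that struct FooStruct exists, which is exactly what a third party sees through the header.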
Inheritance
It is often said that you should prefer composition over inheritance. The problem with inheritance is that it creates a strong coupling between two types that cannot be changed at runtime. So if you would use inheritance only to bring in some functionality, use a member in your class instead. And if inheritance should not be used for the sole sake of reuse, its remaining use is implementing polymorphism.
In a procedural language like C, you don't have inheritance, only composition:
    struct A {
        struct B reusedData;
        /* more fields. */
    };
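A slightly fuller sketch of the same idea, with made-up names (Position, Player, movePosition, movePlayer): the reused part is an ordinary member, and reusing its behavior is just a function call on that member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reusable part: a 2D position with its own function. */
struct Position { int x, y; };

void movePosition(struct Position *pos, int dx, int dy)
{
    assert(pos != NULL);
    pos->x += dx;
    pos->y += dy;
}

/* The composite type embeds the reusable part as an ordinary member. */
struct Player {
    struct Position position;   /* reused data */
    int lives;                  /* more fields */
};

/* Reuse is a call on the embedded member; nothing is inherited. */
void movePlayer(struct Player *player, int dx, int dy)
{
    assert(player != NULL);
    movePosition(&player->position, dx, dy);
}
```

If Player later needs a different kind of position, only the member and movePlayer change; callers of movePlayer are not coupled to Position at all.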
Polymorphism
If inheritance should not be used for the sole purpose of reuse, then it exists in OOP languages to implement polymorphism, that is, to let users add behavior to an existing (abstract) class. Now let's see how this is done in the procedural world.
Letting users add behaviors to your library
In an OOP world this is done by using abstract classes and letting the users implement them. In the procedural world, the same is done by letting the users set custom callback functions. Let's see it in an example:
In foo.h:
    typedef struct FooStruct *FooHandle;
    typedef void (*NotificationCallback)(void *cbData, int notification);
    /*...*/
    FooHandle createFoo(void);
    void setNotificationCallBack(FooHandle handle, NotificationCallback callback, void *cbData);
    void doOperation(FooHandle handle);
    /*...*/
In foo.c:
    #include <stdlib.h>
    #include "foo.h"

    typedef struct FooStruct {
        /*...*/
        NotificationCallback notifyCallback;
        void *cbData;
        /*...*/
    } FooStruct;

    FooHandle createFoo(void)
    {
        FooStruct *foo = malloc(sizeof(*foo));
        if (foo == NULL)
            return NULL;
        /*...*/
        foo->notifyCallback = NULL;
        foo->cbData = NULL;
        /*...*/
        return foo;
    }

    void setNotificationCallBack(FooHandle handle, NotificationCallback callback, void *cbData)
    {
        FooStruct *foo = handle;
        foo->notifyCallback = callback;
        foo->cbData = cbData;
    }

    void doOperation(FooHandle handle)
    {
        FooStruct *foo = handle;
        /* Do some expensive operation while calling the notify callback. */
        /*...*/
        while (/*...*/) {
            /*...*/
            if (foo->notifyCallback != NULL)
                foo->notifyCallback(foo->cbData, 42); /* Execute callback if set. */
            /*...*/
        }
        /*...*/
    }
The user of the FooHandle can pass the address of a function and a pointer to custom data that will be used by the callback. Basically, the pointer to the custom data represents the data members of a class derived from an abstract class, and the function pointer is the derived method.
If there are multiple callbacks, each one and its data can be set independently in a similar way. This is much more flexible than an abstract/derived class pair: you immediately get a flexible strategy pattern!
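Here is what the user's side could look like, as a self-contained sketch. The library part is condensed from the foo.h and foo.c listings above; since the original loop condition is elided, I simply assume three steps of work. NotificationLog, recordNotification and runWithLog are hypothetical user-side names.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Library side, condensed from the listings above. ---- */
typedef struct FooStruct *FooHandle;
typedef void (*NotificationCallback)(void *cbData, int notification);

struct FooStruct {
    NotificationCallback notifyCallback;
    void *cbData;
};

FooHandle createFoo(void)
{
    FooHandle foo = malloc(sizeof(*foo));
    if (foo != NULL) {
        foo->notifyCallback = NULL;
        foo->cbData = NULL;
    }
    return foo;
}

void setNotificationCallBack(FooHandle foo, NotificationCallback callback, void *cbData)
{
    foo->notifyCallback = callback;
    foo->cbData = cbData;
}

void doOperation(FooHandle foo)
{
    /* Assumption: three steps of work, one notification per step. */
    for (int step = 0; step < 3; step++) {
        if (foo->notifyCallback != NULL)
            foo->notifyCallback(foo->cbData, 42);
    }
}

/* ---- User side: the "derived class" is a struct plus a function. ---- */
typedef struct {
    int calls;
    int lastNotification;
} NotificationLog;   /* hypothetical user data */

void recordNotification(void *cbData, int notification)
{
    NotificationLog *record = cbData;   /* recover the user's own data */
    record->calls++;
    record->lastNotification = notification;
}

/* Installs the callback, runs the operation, returns the call count. */
int runWithLog(NotificationLog *record)
{
    FooHandle foo = createFoo();
    assert(foo != NULL);
    setNotificationCallBack(foo, recordNotification, record);
    doOperation(foo);
    free(foo);
    return record->calls;
}
```

The library never learns what NotificationLog is; it only hands the void pointer back. Swapping in another function and another struct at runtime changes the behavior without touching the library.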
Internal subtyping
In this case it is not the user who adds new types, but the developers of the library. Here, using a switch is fine in the procedural world:
    /*...*/
    switch (type) {
    case T_FOO:
        doFoo();
        break;
    case T_BAR:
        doBar();
        break;
    default:
        assert(0); /* You forgot to add a case. */
    }
    /*...*/
In the OOP world you would use polymorphism here. But it has a disadvantage that the writers of the code often forget: the control flow becomes harder to follow for someone else reading the code. All you see is:
object->virtualMethod(param1, param2);
The control flow disappears into the virtual method. If you want to know what can happen here, you will probably look up the object's type, and you will probably find an abstract class with the virtualMethod. Not too helpful...
Now you will need to find all the derived classes to find out what can happen. If there are many derived classes, you will need to find all source files, which takes more time than scrolling past a huge switch.
There is a heated debate about switch vs. polymorphism (e.g. here: http://c2.com/cgi/wiki?SwitchStatementsSmell). But there is a fundamental problem underneath: we have a Cartesian product of subtypes and operations, which is a two-dimensional table, and we want to represent this table in one-dimensional text. This is the so-called expression problem.
So in procedural programming you have:
    Operation1
        SubType1
        SubType2
        SubType3
    Operation2
        SubType1
        SubType2
        SubType3
    Operation3
        SubType1
        SubType2
        SubType3
In an OOP language you have:
    SubType1
        Operation1
        Operation2
        Operation3
    SubType2
        Operation1
        Operation2
        Operation3
    SubType3
        Operation1
        Operation2
        Operation3
You just traded complexities here. With switches you repeat subtypes, with polymorphism you repeat operations in classes.
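The procedural layout can be sketched in C like this. The shapes, their fields and the two operations are invented for the example; the point is only that each operation is one function and the list of subtypes is repeated inside every switch.

```c
#include <assert.h>

/* Hypothetical subtypes, tagged with an enum instead of a class hierarchy. */
typedef enum { T_SQUARE, T_RECT } ShapeType;

typedef struct {
    ShapeType type;
    int width;
    int height;   /* ignored for T_SQUARE */
} Shape;

/* Operation 1: every subtype appears once inside this function. */
int shapeArea(const Shape *shape)
{
    switch (shape->type) {
    case T_SQUARE:
        return shape->width * shape->width;
    case T_RECT:
        return shape->width * shape->height;
    default:
        assert(0);   /* a case was forgotten */
        return 0;
    }
}

/* Operation 2: the subtype list is repeated, the operation is not. */
int shapePerimeter(const Shape *shape)
{
    switch (shape->type) {
    case T_SQUARE:
        return 4 * shape->width;
    case T_RECT:
        return 2 * (shape->width + shape->height);
    default:
        assert(0);
        return 0;
    }
}
```

Adding a third operation means writing one new function and touching nothing else. Adding a third subtype means visiting every switch, which is the mirror image of what a class hierarchy makes you do.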
Multidimensional polymorphism
This is the case of the balls and bricks in a Breakout-like game. It is also called multiple dispatch. As far as I know, none of the mainstream OOP languages supports this feature. C doesn't support it either, but you can use switches again:
    void collide(Ball *ball, Brick *brick)
    {
        /*...*/
        switch (ball->type) {
        case BAT_NORMALBALL:
            switch (brick->type) {
            case BRT_NORMALBRICK:
                collideNormalNormal(ball, brick);
                break;
            case BRT_CONCRETEBRICK:
                collideNormalConcrete(ball, brick);
                break;
            default:
                assert(0);
            }
            break;
        case BAT_FIERYBALL:
            switch (brick->type) {
            case BRT_NORMALBRICK:
                collideFieryNormal(ball, brick);
                break;
            case BRT_CONCRETEBRICK:
                collideFieryConcrete(ball, brick);
                break;
            default:
                assert(0);
            }
            break;
        default:
            assert(0);
        }
        /*...*/
    }
Default asserts are important here. Visitors wouldn't be much simpler than this.
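If the nested switches grow too long, one possible variation (not what the listing above does) is a two-dimensional table of function pointers indexed by both types. The sketch below reuses the type and handler names from the listing; the struct fields, the BAT_COUNT and BRT_COUNT sentinels and the handler bodies are placeholders I made up.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BAT_NORMALBALL, BAT_FIERYBALL, BAT_COUNT } BallType;
typedef enum { BRT_NORMALBRICK, BRT_CONCRETEBRICK, BRT_COUNT } BrickType;

typedef struct { BallType type; int bounces; } Ball;      /* fields are hypothetical */
typedef struct { BrickType type; int hitPoints; } Brick;

typedef void (*CollideFn)(Ball *ball, Brick *brick);

/* Placeholder collision rules, invented for the sketch. */
static void collideNormalNormal(Ball *ball, Brick *brick)
{
    ball->bounces++;
    brick->hitPoints -= 1;
}
static void collideNormalConcrete(Ball *ball, Brick *brick)
{
    ball->bounces++;
    (void)brick;              /* concrete ignores a normal ball */
}
static void collideFieryNormal(Ball *ball, Brick *brick)
{
    (void)ball;               /* a fiery ball flies straight through */
    brick->hitPoints = 0;
}
static void collideFieryConcrete(Ball *ball, Brick *brick)
{
    ball->bounces++;
    brick->hitPoints -= 1;
}

/* One row per ball type, one column per brick type. */
static const CollideFn collideTable[BAT_COUNT][BRT_COUNT] = {
    [BAT_NORMALBALL] = { [BRT_NORMALBRICK]   = collideNormalNormal,
                         [BRT_CONCRETEBRICK] = collideNormalConcrete },
    [BAT_FIERYBALL]  = { [BRT_NORMALBRICK]   = collideFieryNormal,
                         [BRT_CONCRETEBRICK] = collideFieryConcrete },
};

void collide(Ball *ball, Brick *brick)
{
    /* These asserts play the role of the default branches. */
    assert((int)ball->type >= 0 && (int)ball->type < BAT_COUNT);
    assert((int)brick->type >= 0 && (int)brick->type < BRT_COUNT);

    CollideFn fn = collideTable[ball->type][brick->type];
    assert(fn != NULL);       /* a table entry was forgotten */
    fn(ball, brick);
}
```

The table makes the Cartesian product visible at a glance, and a missing entry is a NULL that the assert catches at runtime. The price is one more indirection when reading the code, so for a small grid the nested switch may still be the clearer choice.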
But why is OOP so popular?
- Education
  - Students learn the basics of programming in a procedural language. Then they learn the "correct way" of programming, which usually means learning OOP, patterns, etc., while ignoring other paradigms such as procedural or functional programming.
- Can be sold
  - GUI designer tools are visual programs built on an OO language, and they can be sold to non-technical managers more easily than a bare language.
- Frameworks, libraries
  - The most feature-rich frameworks are written in OOP languages.
- Inertia
  - Once students have learned OOP, they cannot imagine programming any other way.
Conclusion
OOP is not better than the procedural paradigm. It's just a way of organizing code, and it's too popular...