Separate Data And Procedures
2021.08.06
Last updated
Before classes, there were procedures and data. Data could be structured. In C, such a structure is called a struct.
But developers noticed that for the same data sets they usually needed the same procedures. So they packed the data together with their procedures and called it a class.
So a class contains:
member variables, which I will call 'data' from now on
member functions, which I will call 'procedures'
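To make the two kinds of members concrete, here is a minimal sketch of a traditional class. The Account class and its member names are only illustrative assumptions, they do not come from any real code base.

```java
// A traditional class: data and procedures are bundled together.
// Account and its members are hypothetical names used for illustration.
class Account {
    // member variable ('data')
    private long balanceInCents;

    Account(long initialBalanceInCents) {
        this.balanceInCents = initialBalanceInCents;
    }

    // member function ('procedure') that reads and writes the bundled data
    void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balanceInCents += amountInCents;
    }

    long getBalanceInCents() {
        return balanceInCents;
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account(1000);
        account.deposit(250);
        System.out.println(account.getBalanceInCents());
    }
}
```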
But was it really a good idea?
Thinking of a large, real-life code base, is it possible to put all the code into the class where the data is located? I think not. It would mean huge classes. We never do that.
It would also mean that the same classes would change again and again. Every developer would always be working on the same few classes.
From a clean code aspect, bundling data and procedures is not a goal either.
We would like to organize our code by features and not by data. We want to create many small and independent classes.
Procedures should also have well-defined input and output data, rather than being coupled with the data. When they are coupled with the data, which they read and write at the same time, then it is impossible to make a distinction between input and output data.
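The difference can be sketched in a few lines. This is only an illustration under my own assumptions: CoupledInvoice, VatCalculator and the fixed 20 percent rate are made-up names and values.

```java
// Coupled style: the procedure reads and writes fields of its own object,
// so the signature shows neither the input nor the output.
// All names and the 20 percent rate are hypothetical.
class CoupledInvoice {
    long netInCents;
    long grossInCents;

    void calculateGross() {
        grossInCents = netInCents + netInCents * 20 / 100;
    }
}

// Separated style: the input arrives as parameters,
// the output leaves as the return value, nothing else is touched.
class VatCalculator {
    long grossInCents(long netInCents, int vatPercent) {
        return netInCents + netInCents * vatPercent / 100;
    }
}

class VatDemo {
    public static void main(String[] args) {
        CoupledInvoice invoice = new CoupledInvoice();
        invoice.netInCents = 1000;
        invoice.calculateGross();
        System.out.println(invoice.grossInCents);

        System.out.println(new VatCalculator().grossInCents(1000, 20));
    }
}
```

In the second variant a reader can tell from the signature alone what goes in and what comes out.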
The final nail in the coffin of the class is the widespread use of dependency injection. We already separate data and procedures and handle them in different ways.
We put procedures in stateless singleton classes, which are instantiated by the injection framework.
We usually organize data in domain models or data models. They should not contain business logic.
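A minimal sketch of this split could look like the following. Customer, MailSender and GreetingService are hypothetical names, and the manual wiring in main only stands in for what an injection framework would do.

```java
// Data class: carries state, contains no business logic.
// All names in this sketch are illustrative assumptions.
class Customer {
    private String name;
    private String email;

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }
}

// Collaborator hidden behind an interface so that it can be injected.
interface MailSender {
    void send(String to, String text);
}

// Procedural class: no state of its own, only an injected collaborator
// that is never reassigned. One instance can serve the whole application.
class GreetingService {
    private final MailSender mailSender;

    GreetingService(MailSender mailSender) {
        this.mailSender = mailSender;
    }

    void greet(Customer customer) {
        mailSender.send(customer.getEmail(), "Hello, " + customer.getName() + "!");
    }
}

class GreetingDemo {
    public static void main(String[] args) {
        // Manual wiring stands in for the injection framework here.
        MailSender consoleSender = (to, text) -> System.out.println(to + ": " + text);
        GreetingService service = new GreetingService(consoleSender);

        Customer customer = new Customer();
        customer.setName("Ada");
        customer.setEmail("ada@example.com");
        service.greet(customer);
    }
}
```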
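So we usually end up with two types of classes, procedural classes and data classes, which are compared in the first table at the end of this article.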
Separated procedural and data classes are no classes!
According to the original concept of classes, these are not classes at all, since they don't couple data and procedures.
Unfortunately, today's languages like Java or C++ still call them classes. Not only is the keyword the same, but both kinds also seem to have data and procedure members. So, by the syntax, there is no difference between them.
The reason is simply that these are class-based languages; in Java in particular, everything lives in a class. That's why we programmers still treat them as classes.
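By syntax, both kinds are called class and both can have data members and procedure members (see the second table at the end of this article).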
Do procedural classes really have data members? No. They only have other components injected. Does that make them stateful? Well, yes, but we normally never change the injected components. So we use them differently from how they were intended.
Do data classes really have procedural members? No. They only have accessors and mutators (getters and setters), which add nothing to the class beyond access to the data members. Does that mean they have procedures? Well, yes, but we normally don't write procedures in them, at least no complex business logic. So we use them differently from how they were intended.
We misuse data members in procedural classes and procedural members in data classes.
Why do we do this? Because we don't have other choices due to the syntax.
In Java 14 the data structure was brought back with the record keyword (as a preview feature; it became a standard feature in Java 16). Such a class is defined entirely by the data it carries:
It features automatically generated accessors, equals, hashCode, toString, etc.
It has only read accessors and no setters, because all fields are automatically final.
The class is not inheritable; it is final too.
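Here is a small sketch of these properties. The Person record is a made-up example, and the code assumes Java 16 or later (or Java 14/15 with preview features enabled).

```java
// Requires Java 16+ (records were a preview feature in Java 14 and 15).
// Person is a hypothetical record used only for illustration.
record Person(String name, int age) { }

class RecordDemo {
    public static void main(String[] args) {
        Person a = new Person("Ada", 36);
        Person b = new Person("Ada", 36);

        // Accessors are generated and named after the components.
        System.out.println(a.name() + " is " + a.age());

        // equals and hashCode are generated from the components.
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());

        // toString is generated as well.
        System.out.println(a);

        // There are no setters: a "change" means creating a new record.
        Person older = new Person(a.name(), a.age() + 1);
        System.out.println(older.age());
    }
}
```

Note that the generated accessors are called name() and age(), not getName() and getAge().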
This also supports the idea that data classes are no classes, and it also gives a new language keyword for them.
We should see and treat separated data and procedural classes as something new. What if we at least imagined different names for them?
Procedural classes could be called unit or component.
unit would make clear what we test via unit testing.
component advantages:
It would remind us to use composition instead of OOP.
With Java/Spring we would usually mark a component with the @Component annotation.
Also, their members should be called procedures.
Data classes could be called records. I have just taken the name from the new Java records.
Their accessors and mutators should be called attributes.
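So in reality we have units (or components) that consist of procedures, and records that consist of data and attributes. The last table at the end of this article summarizes this.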
Object-oriented programming is based on traditional classes. Should we still do OOP if we have no more classes? Logically, the answer is no.
If we separate procedural and data classes, we should not write programs in the classical OOP way. We should also revisit the OOP principles, since some of them are no longer valid.
In another article, I suggest not using inheritance anymore.
With this, most of the encapsulation, polymorphism, and open-closed principles are gone.
Many design patterns that use inheritance become unusable. Or we should redesign them without inheritance.
Some other rules, like the dependency inversion principle, can remain valid. (Allowing the usage of interfaces.)
Others, like the Law of Demeter (LoD), remain valid for only one of the two kinds: either for the procedural classes or for the data classes.
As an interesting example, let's take a look at the LoD. It is also simplified as a "dot counting rule". So we should not write the following code:
getContractService().getUserService().getAddressService().getAddress()
It is still valid for procedural classes. What about data classes?
If we process a data structure, then we need to know the entire structure. That is what it was created for. The following kind of code is very common and perfectly valid:
getContract().getUser().getAddress().getZipCode()
There are long articles on the internet that struggle with this issue and sometimes come close to the conclusion that LoD is not useful when we access data members.
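The data case can be sketched like this. Contract, User and Address follow the getter chain shown earlier; the ZipCodeReader class and the sample values are my own assumptions for the sake of a runnable example.

```java
// Data classes forming a nested structure. Their shape is the whole point,
// so code that processes them is allowed to know it.
class Address {
    private final String zipCode;
    Address(String zipCode) { this.zipCode = zipCode; }
    String getZipCode() { return zipCode; }
}

class User {
    private final Address address;
    User(Address address) { this.address = address; }
    Address getAddress() { return address; }
}

class Contract {
    private final User user;
    Contract(User user) { this.user = user; }
    User getUser() { return user; }
}

// Hypothetical procedural class that navigates the data structure.
class ZipCodeReader {
    String zipCodeOf(Contract contract) {
        // Three dots, yet nothing here reaches through another service:
        // every step only reads data.
        return contract.getUser().getAddress().getZipCode();
    }
}

class ZipCodeDemo {
    public static void main(String[] args) {
        Contract contract = new Contract(new User(new Address("1010")));
        System.out.println(new ZipCodeReader().zipCodeOf(contract));
    }
}
```

The chain couples ZipCodeReader to the shape of the data, but that coupling is intended. What the LoD warns about is reaching through one service to get to another service.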
See also: the criticism of OOP in the Wikipedia article on object-oriented programming.
Types of classes we usually have:

| | Procedural | Data |
|---|---|---|
| Statefulness | stateless | stateful |
| Cardinality | singleton | prototype |
| Instantiation | by injection framework | by application, ORM framework, etc. |

What we have by syntax:

| | Procedural | Data |
|---|---|---|
| Name | class | class |
| Data members | Yes | Yes |
| Procedure members | Yes | Yes |

What we have in reality:

| | Procedural | Data |
|---|---|---|
| Name | unit or component | record |
| Data members | No | Yes |
| Procedure members | procedure | attribute |
| Statefulness | stateless | stateful |
| Cardinality | singleton | prototype |
| Instantiation | by injection framework | by application, ORM framework, etc. |