Controlling the program flow with artificial null values is the Insider Information antipattern. It means that "if Value is null then the Data was empty". So it "finds out" the original situation, instead of receiving it as clear information.
The last two issues together also make it a spaghetti code: we pass the nulls with the purpose, to make branching in the other method. So the first method "continues" in the second one, instead of completing the branches.
Call the real common code from the use cases, which do not contain branches.
classFeature {voidcalculate() {Collection<Data> dataCollection =loadData();calculate(dataCollection); }Collection<Data> loadData() {return...; }// The only branching for the use casesvoidcalculate(Collection<Data> dataCollection) {for (Data data : dataCollection) {if (data.first.isPresent() &&data.second.isPresent()) {doUseCaseBoth(data.first.get(),data.second.get()); } elseif (data.first.isPresent() &&!data.second.isPresent()) {doUseCaseFirst(data.first.get()); } else {doUseCaseSecond(data.second.get()); } } }// The use casesvoiddoUseCaseBoth(Value first,Value second) {calculateBoth(first, second);registerFirst(first);registerSecond(second); }voiddoUseCaseFirst(Value first) {calculateFirst(first);registerFirst(first); }voiddoUseCaseSecond(Value second) {calculateSecond(second);registerSecond(second); }voidcalculateBoth(Value first,Value second) {// ... }voidcalculateFirst(Value first) {// ... }voidcalculateSecond(Value second) {// ... }// The real common code, without branchingvoidregisterFirst(Value first) {// ... }voidregisterSecond(Value second) {// ... }}
Separate Data Creation and Processing
But there are more subtle problems with our code:
The calculation loads data.
It is not clear what is the input of the calculation.
It is not clear when the input data is complete.
The use cases are actually defined by the data, depending on which Value is missing. Why not reflect them directly in the data? In other words, we could define the use cases earlier, before the calculation starts. The calculation could contain only the implementations for the use cases.
For this, we create data transfer objects that clearly contain the data by use cases. And we add an extra step between the data loading and the calculation that creates these DTOs.
classValuePair {Value first;Value second;ValuePair(Value first,Value second) {this.first= first;this.second= second; }}classValuePairs {Collection<ValuePair> both =newArrayList<>();Collection<Value> firsts =newArrayList<>();Collection<Value> seconds =newArrayList<>();}classFeature {voidcalculate() {Collection<Data> dataCollection =loadData();// Create the DTOsValuePairs valuePairs =createValuePairs(dataCollection);calculate(valuePairs); }Collection<Data> loadData() {return...; }// The only branching for the use casesValuePairscreateValuePairs(Collection<Data> dataCollection) {ValuePairs valuePairs =newValuePairs();for (Data data : dataCollection) {if (data.first.isPresent() &&data.second.isPresent()) {valuePairs.both.add(newValuePair(data.first.get(),data.second.get())); } elseif (data.first.isPresent() &&!data.second.isPresent()) {valuePairs.firsts.add(data.first.get()); } else {valuePairs.seconds.add(data.second.get()); } }return valuePairs; }voidcalculate(ValuePairs valuePairs) {doUseCaseBoth(valuePairs.both);doUseCaseFirst(valuePairs.firsts);doUseCaseSecond(valuePairs.seconds); }// The use casesvoiddoUseCaseBoth(Collection<ValuePair> boths) {for (ValuePair both : boths) {calculateBoth(both.first,both.second);registerFirst(both.first);registerSecond(both.second); } }voiddoUseCaseFirst(Collection<Value> firsts) {for (Value first : firsts) {calculateFirst(first);registerFirst(first); } }voiddoUseCaseSecond(Collection<Value> seconds) {for (Value second : seconds) {calculateSecond(second);registerSecond(second); } }voidcalculateBoth(Value first,Value second) {// ... }voidcalculateFirst(Value first) {// ... }voidcalculateSecond(Value second) {// ... }// The real common code, without branchingvoidregisterFirst(Value first) {// ... }voidregisterSecond(Value second) {// ... }}
Note, how the if commands have disappeared from the calculate() method. Why don't we need branching here? Because we have to process all use cases. No decisions are needed. Actually, this was done by the original code too, but it was "disguised" as a branching. So this new code is more expressive.
As a consequence of the new data structure, we could also re-organize the for-each loops, since we already get the distinct collections for the use cases.
Create Data Model And Inputs/Outputs
What we really did in the previous step is that we created a data model for the feature. We also made the new data as a clear input of the calculation.
Now it's time to re-organize our code into the proper units.
I suggest creating a sub-package called model and put the DTOs and their factories here. I also suggest naming the factory classes exactly after the data classes they create.
Additionally, ValuePairs could be made less mutable and created clearly by a constructor. This would also better express and separate the use cases.
packagefeature;publicclassCalculator {publicvoidcalculate(ValuePairs valuePairs) {doUseCaseBoth(valuePairs.both);doUseCaseFirst(valuePairs.firsts);doUseCaseSecond(valuePairs.seconds); }// The use casesprivatevoiddoUseCaseBoth(Collection<ValuePair> boths) {for (ValuePair both : boths) {calculateBoth(both.first,both.second);registerFirst(both.first);registerSecond(both.second); } }privatevoiddoUseCaseFirst(Collection<Value> firsts) {for (Value first : firsts) {calculateFirst(first);registerFirst(first); } }privatevoiddoUseCaseSecond(Collection<Value> seconds) {for (Value second : seconds) {calculateSecond(second);registerSecond(second); } }privatevoidcalculateBoth(Value first,Value second) {// ... }privatevoidcalculateFirst(Value first) {// ... }privatevoidcalculateSecond(Value second) {// ... }// The real common code, without branchingprivatevoidregisterFirst(Value first) {// ... }privatevoidregisterSecond(Value second) {// ... }}
Of course, the package must not be literally called "feature". Instead, it should have the actual feature name.
Simpler Or Shorter?
You may say that the resulting code is pretty much longer than the original one. Yes, it's true. But how can it be better, and how can we say that it is simpler?
Think of the daily work of a programmer. We need to develop and modify features and use cases. Imagine that we need to do a change or fix a bug in one of the use cases.
How could we find it in the code, if it has no name? If it is just embedded somewhere?
How could we find it, if it is implemented in more places? If it is "hiding" in more code parts?
How could we modify it and be sure, we do not change something else at the same time?
So, such code is hard to maintain and our changes will be prone to errors. A good code can be longer, but it is easier to find something in it, and it is simpler to maintain.
Everything we implement is a deal: it has a price and a benefit. Yes, we pay an extra price to write clean and safe code. But we save many times more effort and problems with it, through the lifetime of the program. And that's why I not recommend using inheritance: the highest price with the lowest benefit. Did you notice that there is no inheritance in this example?
You can also think of the extra code that they were implicitly in the original code too, but it was not expressed, not visible. The extra code had to be there but it wasn't.
Most of the new code lines contain names. And this is not a "by-product" of our refactoring. Having names is a goal. We use names to separate the parts and to find them.
Also, note that by writing more code lines we also have moved them into more small units. Each unit we created is smallerand simpler than the original code.
And remember, this is a simplified example. A real-life code is much more complicated. And if the parts of a complicated code are mixed and not named either, then we will have problems with maintaining that code.