Encapsulation is not Information Hiding

A Gemini space capsule in orbit

I have recently had problems integrating code written on other projects into my own application. In each case, the problem was caused by a reference to an external resource (a file name, for example) being hard-coded in the class of some object in an internal package within the code I was trying to use. The explanation for the design of these packages, and why I shouldn't change things, was that the dependency was "encapsulated" within the object in question.

Many articles and books use the word "encapsulation" as a synonym for "information hiding". However, encapsulation and information hiding are two separate, orthogonal concepts:

Information hiding lets one build higher level abstractions on top of lower level details. Good information hiding lets one ignore details of the system that are unrelated to the task at hand.

Encapsulation ensures that there are no unexpected dependencies between conceptually unrelated parts of the system. Good encapsulation lets one easily predict how a change to one object will, or will not, affect other parts of the system. Achieving encapsulation requires the use of common coding techniques: defining immutable value types, avoiding global variables and singletons, copying collections or mutable value objects when storing them in instance variables or returning them from methods, and so forth.
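As a minimal sketch of two of those techniques (an immutable value type, and copying a collection when storing and returning it), consider the Java below. The `Trade` and `Portfolio` classes are hypothetical names invented for illustration; they do not come from the code discussed in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical immutable value type: all fields are final and there are no setters.
final class Trade {
    private final String symbol;
    private final int quantity;

    Trade(String symbol, int quantity) {
        this.symbol = symbol;
        this.quantity = quantity;
    }

    String symbol() { return symbol; }
    int quantity() { return quantity; }
}

// Hypothetical class that holds a collection without sharing references to it.
final class Portfolio {
    private final List<Trade> trades;

    Portfolio(List<Trade> trades) {
        // Copy on the way in: later changes to the caller's list cannot affect this object.
        this.trades = new ArrayList<>(trades);
    }

    List<Trade> trades() {
        // Copy on the way out: callers cannot modify the internal list.
        return new ArrayList<>(trades);
    }

    int totalQuantity() {
        int total = 0;
        for (Trade t : trades) {
            total += t.quantity();
        }
        return total;
    }
}
```

With this shape, code that adds to the list it passed in, or clears the list it got back, has no effect on the `Portfolio`, so a change in one place cannot silently alter behaviour in another.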

When hiding information it is important that the right information is hidden in the right place. The problems I encountered arose because information about the application's environment, which should have been specified at the application scope, was hidden in a lower-level class. That class should have been passed the information, not known it a priori.

The problem I have with using the word "encapsulation" to mean "information hiding" is that encapsulation is always a good thing to do but hiding information in the wrong place is not. Suppose I say:

"Let's encapsulate the exact data structure used by the cache in the CachingStockLoader class."

"Let's encapsulate the name of the application's log file in the CalculationProgressListener class."

That sounds good to me, but is it really?

I find it much easier to make good decisions when I am clear about when I am doing encapsulation and when I am doing information hiding. For example, I would restate the statements above to be explicit that they really refer to information hiding:

"Let's hide the exact data structure used by the cache in the CachingStockLoader class."

"Let's hide the name of the application's log file in the CalculationProgressListener class."

That will make me realise that the first decision is correct but the second is suspect. Code that loads stocks should not have to care whether the loader caches previous requests in a hash table or red-black tree or whatever. The name of the application's log file, on the other hand, should probably not be hidden away in that CalculationProgressListener class; it should be specified by the application and passed to instances when they are constructed.
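The two decisions might look something like the following Java sketch. Only the class names `CachingStockLoader` and `CalculationProgressListener` come from the text above; the `StockLoader` interface, the method names, the use of a `String` to stand in for a stock, and the file-writing details are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; a String stands in for whatever a "stock" really is.
interface StockLoader {
    String load(String symbol);
}

// Information hidden in the right place: callers see only load(). The cache's
// data structure is private and could be swapped for a TreeMap without
// affecting any caller.
final class CachingStockLoader implements StockLoader {
    private final StockLoader delegate;
    private final Map<String, String> cache = new HashMap<>();

    CachingStockLoader(StockLoader delegate) {
        this.delegate = delegate;
    }

    @Override
    public String load(String symbol) {
        // Ask the underlying loader only on the first request for a symbol.
        return cache.computeIfAbsent(symbol, delegate::load);
    }
}

// The log file is chosen by the application and passed in at construction,
// rather than being hard-coded inside this class.
final class CalculationProgressListener {
    private final Path logFile;

    CalculationProgressListener(Path logFile) {
        this.logFile = logFile;
    }

    void progress(String message) {
        byte[] line = (message + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        try {
            // Create the file if needed and append one line per progress message.
            Files.write(logFile, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this arrangement, the code that starts the application decides where the log lives and hands that path to each listener it constructs, so the same class can be reused in another application with a different log location.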

I find it essential to keep the distinction between "encapsulation" and "information hiding" in mind when thinking about design decisions or discussing them when pair programming. When I use the wrong word to think or talk about a design decision I find it harder to realise when the decision is incorrect. When I admit that I am doing information hiding, not encapsulation, I can better decide whether the information I am hiding should be hidden at all and if so, where I should hide it.

Copyright © 2005 Nat Pryce. Posted 2005-03-17.