Sunday, May 06, 2007

A Day of Reckoning

As a cunning linguist, I count the Online Etymology Dictionary among my favorite web sites. It is a perfect example of the incredible potential of the Internet for benefiting mankind as a whole, and any human being as an individual. Humans are networking organisms. Our brains are networks. And our brains store and seek information using networking. But that is a topic for another discussion, I reckon.

What I figured I would discuss today is abstraction. As a programmer, I am keenly aware of people's perceptions of computers: what they think they are, and what they think they do. Misperceptions about computers extend even to the ranks of those who call themselves "developers," or even "programmers," and this is a subject of no small concern to me. While abstraction is an incredibly useful tool, one that our brains employ naturally, it can become a source of confusion, and as much of an impediment to progress as it is an aid.

The problem is exemplified by an increasingly common phenomenon in the programming business: the ignorant developer. Technical schools, tools, and high-level programming languages enable people to employ the tools of programming without understanding what those tools are, or why they work. While this perhaps fills a business need, providing a less-expensive pool of "professional developers" for certain types of tasks, I think that ultimately it may produce more problems than it solves. An ignorant developer is much more likely to build unstable software, and the instability of software is not necessarily immediately apparent. In the business world, a short-term savings can develop into a long-term albatross.

Getting back to the Online Etymology Dictionary, there is a correlation between the parsing of language and the understanding of it. We often think we understand the meaning of words when, in fact, we have only a vague and partial sense of what they mean. Sometimes we even think we understand the meaning of words because we employ them successfully, when in fact we don't understand them at all. This too is a long-term problem. And with the nearly instant availability of information on the Internet today, there is no excuse for ignorance.

The word "computer" comes from the Latin "computare," which means literally "to count." There is a reason for this. When most of us think of computers, we envision a box with a monitor, a mouse, a keyboard, and perhaps some other peripherals attached to it, either inside or outside of it. This is a fallacy. In fact, a computer is nothing more than the processor inside the box. The rest of the machine is an interface to the processor, and a set of tools for doing such things as storing and organizing data produced by the processor.

The processor of a computer, or more accurately, the computer itself, does only one thing, and it does it very well. It counts. A computer is roughly the same thing as the ancient Chinese abacus, the earliest known mechanical computer, which was itself an extension of the first computer of all: the human hand. We have a base 10 numbering system because we have 10 fingers on our 2 hands. Before the abacus, people used their fingers (and possibly their toes) for counting.

All of mathematics is based on counting, even calculus. Addition is counting up. Subtraction is counting down. Multiplication is repeated addition (counting up). Division is repeated subtraction (counting down). And all mathematical operations are composed of various combinations of addition, subtraction, multiplication, and division.
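To make the point concrete, here is a minimal sketch in C (the function names, and the restriction to non-negative whole numbers, are mine, purely for illustration) showing multiplication built from repeated addition and division built from repeated subtraction:

```c
#include <stdio.h>

/* Multiplication as repeated addition: count up by 'a', 'b' times. */
static int multiply(int a, int b)
{
    int result = 0;
    for (int i = 0; i < b; i++)
        result += a;                /* counting up */
    return result;
}

/* Division as repeated subtraction: count how many times 'b' fits into 'a'. */
static int divide(int a, int b)
{
    int count = 0;
    while (a >= b) {
        a -= b;                     /* counting down */
        count++;
    }
    return count;                   /* whatever is left in 'a' is the remainder */
}

int main(void)
{
    printf("6 * 4 = %d\n", multiply(6, 4));   /* 24 */
    printf("24 / 6 = %d\n", divide(24, 6));   /* 4  */
    return 0;
}
```

Everything a processor does, no matter how elaborate it appears, ultimately reduces to this kind of counting.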

But like mathematics, computing has evolved abstractions which enable long operations to be expressed more concisely. Mathematical abstractions have proven to be extremely useful, enabling us to create an ever-expanding toolbox which we employ in performing the practical calculations used in nearly every aspect of our lives. Anything in the universe can be expressed using mathematics. And this is because everything we perceive, we perceive by mathematical means.

We identify a thing by defining it. And the word "define" is derived from the Latin "finire," meaning "to bound, to limit." The bounds of anything are determined by measuring it, or by expressing what is that, and what is not that. Measuring involves using some form of mathematical expression, and the simplest form of mathematical expression is binary: that is 0; not that is not 0. In other words, similarity is determined by measuring difference. When the difference between 2 things is measured as 0, they are identical. Therefore, that is 0, and not that is not 0. In a binary number system, these 2 ideas are expressed completely, and may be used to express any mathematical idea.
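A tiny C fragment (mine, nothing special) makes the same point: a test for identity is really a measurement of difference against zero:

```c
#include <stdio.h>

/* Two things are identical when the measured difference between them is 0. */
static int is_same(int a, int b)
{
    return (a - b) == 0;    /* "that" is 0; "not that" is not 0 */
}

int main(void)
{
    printf("%d\n", is_same(5, 5));  /* 1: the difference is 0, so they are identical */
    printf("%d\n", is_same(5, 7));  /* 0: the difference is not 0, so they are not   */
    return 0;
}
```

This is, incidentally, more or less how a processor performs a comparison: it subtracts and checks whether the result is zero.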

I have always been grateful that the first programming language I learned was C. C is a procedural, relatively low-level language whose structure resembles mathematics. Much of the C language looks like algebra, while some of it, such as the definition of functions, more closely resembles the notation of trigonometry.
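For example, a bit of algebra such as y = 2x + 1, or a bit of trigonometry such as h = c * sin(theta), translates into C almost symbol for symbol. The names below are mine, chosen only to show the resemblance (and compiling this requires linking the math library):

```c
#include <stdio.h>
#include <math.h>   /* link with -lm */

/* Algebra: y = 2x + 1 */
static double f(double x)
{
    return 2 * x + 1;
}

/* Trigonometry: the side opposite an angle, h = c * sin(theta) */
static double opposite(double theta, double hypotenuse)
{
    return hypotenuse * sin(theta);
}

int main(void)
{
    printf("f(3) = %f\n", f(3.0));                                /* 7.0 */
    printf("opposite = %f\n", opposite(0.5235987756, 2.0));       /* pi/6 radians: 1.0 */
    return 0;
}
```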

As in mathematics, the seeds of more abstract languages are present in the semantics of C. And because of the demand for software that performs increasingly complex operations, more abstract programming concepts, such as object-oriented programming, have been built on this foundation, which is itself an abstraction.

Object-oriented programming is actually an abstraction of procedural programming, as all programming is ultimately procedural. A processor performs exactly 1 mathematical operation at a time. A function is an abstract concept that encapsulates many single procedural operations as a single atomic unit; an object is a similar encapsulation of processes, an encapsulation of encapsulations as it were, a convenience we use for the sake of expediency.

It is this very abstraction which provides the power of object-oriented programming. By encapsulating groups of operations within other groups of operations (ad infinitum) which perform the same or similar tasks, we can omit the details, which do not change from one use to another, and accomplish much more with much less physical work (writing code, that is). In addition, because our brains employ abstraction "to distraction," we find the abstraction of object-oriented programming more "intuitive" when dealing with concepts which seem less mathematical to our perception, due to our own advanced abstraction of those concepts in our minds.
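Nothing about this requires a special language, which is rather the point. Here is a rough sketch, in plain C, of what an "object" amounts to underneath: a piece of data together with the procedures that act on it, grouped behind one name. The counter here is my own toy example; real object-oriented languages add inheritance, polymorphism, and so on, on top of this basic grouping.

```c
#include <stdio.h>

/* A toy "counter object": data plus the procedures that operate on it. */
typedef struct {
    int count;
} Counter;

/* These "methods" are still ordinary procedural code; the object merely
 * encapsulates them so that the caller can ignore the details. */
static void counter_init(Counter *c)        { c->count = 0; }
static void counter_increment(Counter *c)   { c->count += 1; }   /* counting up, again */
static int  counter_value(const Counter *c) { return c->count; }

int main(void)
{
    Counter c;
    counter_init(&c);
    counter_increment(&c);
    counter_increment(&c);
    printf("%d\n", counter_value(&c));  /* 2 */
    return 0;
}
```

The caller accomplishes its task without ever touching what is inside the struct, which is exactly the convenience, and exactly the danger, described next.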

However, this also exposes a danger, and a great one. It is now possible to employ these abstractions without fully understanding the underlying mathematical principles that they encapsulate. A real-world example of this can be seen in nearly any convenience store or fast-food restaurant when the clerk makes change for you. I am old enough to remember when such people would "count your change" to you. If you paid for something that cost $2.50 and produced a $5.00 bill, the clerk would count into your hand, starting from $2.50, adding each amount as he or she counted: "$2.75 (quarter), $3.00 (quarter), $4.00 (dollar bill), $5.00 (dollar bill)." When the clerk reached $5.00, you had your change, and it was correct. What's more, you knew it was correct, because it had been counted out as you watched. Today, a clerk punches in a price (or scans a bar code), types in the amount received from the customer, and the cash register does a subtraction to reveal the change required. Unfortunately, most of these clerks couldn't make change if their lives depended on it.
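For what it's worth, even the old way of counting change is just a small procedure. The sketch below is my own rough model of it in C, with amounts in cents and a simple rule of thumb (add the largest coin or bill that divides the running total evenly without overshooting); it is only an illustration, not a claim about how any real register works:

```c
#include <stdio.h>

/* Count change up from the price to the amount tendered, the old way. */
static void count_up_change(int price, int tendered)   /* amounts in cents */
{
    const int denoms[] = { 2000, 1000, 500, 100, 25, 10, 5, 1 };
    const int n = (int)(sizeof denoms / sizeof denoms[0]);

    int running = price;
    while (running < tendered) {
        for (int i = 0; i < n; i++) {
            /* Use the largest denomination that keeps the running total
             * "round" and does not overshoot the amount tendered. */
            if (running % denoms[i] == 0 && running + denoms[i] <= tendered) {
                running += denoms[i];                      /* counting up */
                printf("$%d.%02d\n", running / 100, running % 100);
                break;
            }
        }
    }
}

int main(void)
{
    count_up_change(250, 500);  /* prints $2.75, $3.00, $4.00, $5.00 */
    return 0;
}
```

The modern register, by contrast, simply computes tendered minus price and leaves the counting to no one.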

The same danger exists in the world of programming. Most developers have little or no education in computer science. Instead, they have gone to some technical school (perhaps), where they were taught "how to do" various types of things, but not "why to do" them. The end result is a developer whose ability has a built-in limit. Once a problem falls outside the limited realm of what they have been taught, one that requires a more low-level approach, or that would be better solved with a low-level understanding, they are lost.

The difference here is that, unlike convenience store and fast-food restaurant clerks, these are supposedly "professional" people who should not be stopped cold at any point in the development process. And because of the demands imposed by an increasingly complex set of software requirements, problems that require a low-level understanding of the mathematical principles of computing are almost inevitable in the career of any developer. A convenience store clerk is not expected to solve complex problems. Their job is to collect money, keep the store shelves stocked with merchandise, and perform similarly simple tasks, and they are paid in accordance with the skill level required. But a developer faces a much higher expectation of skill and knowledge, and above all, an expectation of the ability to solve complex problems.

So, we find ourselves on the horns of a dilemma here. The solution, as I see it, can only be applied on a personal level. It is important to understand the basics before attempting to enter the realm of abstraction. If one does this, one will be successful. If one does not, there is likely to come a point at which one will have to learn the basics remedially. The former is more desirable: one picks the moment. In the latter case, the day of reckoning tends to arrive at the least convenient time.

At least, that's how I figure it, I reckon.
