Contrary to appearances, computers are really quite stupid machines. Or, better said, quite simple, because they only “understand” simple instructions such as “add these two numbers together” or “tell me which number is greater.” The set of these instructions is what we call machine language. The great advantage of machine language is that a computer processor can run these simple operations at full throttle.
During information technology’s early days, computer programmers, a majority of whom were women, had the tiresome task of writing in this language. Think for a moment about the attention to detail required to write a program that launches a man into space using only these means. It would be like explaining to someone how to get from one street to another not with phrases such as “go straight to the intersection and turn right” but with “raise your right leg, raise your left leg, repeat ten times, turn your body 90 degrees to the right …”
It goes without saying that we humans are terrible with this level of detail; exclusively using machine language would have relegated computers to the domain of specialists. In the early 1950s, the American Grace Hopper, who was not only a programmer but also a serving member of her country’s navy, had a brilliant idea: why not write a program that would translate commands written in English into machine language? Thus the compiler was born.
In the world of computing, programs that translate one programming language into another are called compilers. The most frequent practice is to compile a language that more closely resembles a human language, commonly called a high-level language, into machine language, although it is now also possible to find compilers that translate between different high-level languages. Building on our previous example, the compiler translates our phrase “go straight to the intersection” into the necessary sequence of movements.
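To make the idea of translation concrete, here is a toy sketch in Python. It is not how a real compiler works internally; it only shows a short high-level command being expanded into many low-level steps. The table TRANSLATIONS and the function compile_route are hypothetical names invented for this illustration.

```python
# Hypothetical table: each high-level command maps to its low-level steps.
# "go straight" repeats the two leg movements ten times, as in the text.
TRANSLATIONS = {
    "go straight": ["raise right leg", "raise left leg"] * 10,
    "turn right": ["turn body 90 degrees right"],
}


def compile_route(commands):
    """Translate high-level commands into one flat list of low-level steps."""
    steps = []
    for command in commands:
        steps.extend(TRANSLATIONS[command])
    return steps


program = compile_route(["go straight", "turn right"])
print(len(program))  # 21 low-level steps from 2 high-level commands
```

Two commands written by a human become twenty-one steps for the "machine", which is exactly the kind of tedious detail a compiler takes off the programmer's hands.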
Names, variables, and functions
The most important feature of high-level programming languages is that they provide abstractions so we can avoid grappling with the most minute level of detail. Different languages work with different levels and forms of abstraction. Nonetheless, almost all of them share some general concepts.
Data inside a computer lives in one of three places: in a small storage slot inside the processor called a register, in memory, or on an external device. But from a human point of view, a command such as “divide the value in register A by the value found in memory position 42” doesn’t reveal anything about what is happening. This is why programming languages allow the elements that constitute a program to be named; if we call register A “distance” and memory position 42 “time”, it is easier for us to understand that the program is calculating velocity. A name that is given to represent a data point is commonly called a variable. Programmers thus talk about “declaring a variable” when they mean “give a name to the place that contains the data point” or “assigning a variable” when they want to say “give a specific value to this data point.”
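As a minimal sketch of the naming idea above, the following Python lines perform the same division using the names “distance” and “time”. The numeric values and units are assumptions chosen only for illustration.

```python
# Hypothetical figures, chosen only for illustration.
distance = 150.0  # kilometers traveled
time = 2.0        # hours elapsed

# With names in place, the purpose of the division is obvious.
velocity = distance / time
print(velocity)  # 75.0
```

Nothing here says where the values are physically stored; the language and its compiler or interpreter take care of that.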
Let’s go back to our example. If we are developing a navigation program, it is likely that we will have to repeat the same series of commands many times in order to verify the name of a street. One possibility is to copy and paste the code that performs this task every time we need it. But a better approach is to assign a name to a sequence of commands; this is what we call a function, procedure, or subroutine (the slight differences between these elements are not important at this point).
Using functions has two key advantages. First, it makes the program easier to understand by documenting the purpose behind a series of commands. Second, it lets us correct or modify the program from a single, central point: if we discover an error in the way we are checking the name of a street, we only have to make the correction in the function, and not in each location where it is used.
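The street-name check could be sketched as a function along these lines. The name is_valid_street_name and the rule it applies (the text must not be blank and must contain at least one letter) are assumptions made up for this example; a real navigation program would check against actual map data.

```python
def is_valid_street_name(name):
    """Return True if the text looks like a street name (hypothetical rule)."""
    cleaned = name.strip()
    # Assumed rule: not blank, and contains at least one letter.
    return len(cleaned) > 0 and any(ch.isalpha() for ch in cleaned)


# The same check is reused wherever it is needed, without copying its code.
for street in ["Main Street", "   ", "5th Avenue"]:
    print(repr(street), is_valid_street_name(street))
```

If the rule later turns out to be wrong, only the body of is_valid_street_name has to change; every place that calls it picks up the correction automatically.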
Apart from its significant complexity, there is another problem with machine language: there is not a single “machine language”; rather, each new processor model introduces variations. Different models can share a large part of their machine language (which is when we talk about them having the same architecture), but programming directly in machine language would still mean learning a new programming language each time a new model of processor hit the market. To put it another way, it would be like having to re-learn how to sweep each time we bought a new broom.
The introduction of compilers as intermediaries between a high-level language and machine language solves a large part of this problem: we can create various compilers that understand the same language on one side of the communication chain but produce different machine language output. This way we don’t need anything more than a small group of experts to keep the compilers up to date, allowing the majority of programmers to ignore the whims of the market and keep coding the same way. This explains why languages like C or Lisp that were already in use in the 1970s can still be used today.
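The following toy sketch illustrates the idea of one source and several outputs. The two targets, “processor_a” and “processor_b”, and their instructions are entirely made up; they do not correspond to any real processor. What matters is that the same request produces different low-level output depending on the target.

```python
def compile_division(result, dividend, divisor, target):
    """Emit made-up instructions that divide two named values for a target."""
    if target == "processor_a":
        # Invented register-style instructions.
        return [
            f"LOAD R1, {dividend}",
            f"DIV R1, {divisor}",
            f"STORE {result}, R1",
        ]
    if target == "processor_b":
        # Invented stack-style instructions.
        return [
            f"push {dividend}",
            f"push {divisor}",
            "divide",
            f"pop {result}",
        ]
    raise ValueError(f"unknown target: {target}")


# The programmer's request stays the same; only the target changes.
for target in ["processor_a", "processor_b"]:
    print(target, compile_division("velocity", "distance", "time", target))
```

Supporting a new processor means adding one more branch to the translator, while every program written in the high-level language remains untouched.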
In this article we have talked about programming languages in the plural because, just as human language allows for infinite variations, so too do programming languages. In our next installment, we will try to give a broad description of the most important families of languages. As we mentioned above, the differences lie in the abstractions provided by each language.