Home » Uncategorized

SQL: A Supposed English-like Language

SQL is envisioned as an English-like language. Simple SQL statements read like English sentences. SQL writes a statement in an English way with English prepositions unnecessarily attached, while other main programming languages only use English words as the mnemonic of a certain concept or operation, producing formal program statements instead of English sentences. For example, the FROM clause is the main part of a query but it is put in the end of a SQL statement, and the BY in the SQL GROUP BY clause is in fact unnecessary.  

Originally, the English-like design aimed to enable non-IT end users who can read and write English to make queries in SQL. In reality, the idea, however well-intended, fails, if not completely. Most of the non-IT users can only write very simple queries in SQL. But for such queries, applications are usually equipped with convenient and interactive visual interface to assist the process. There’s no need to hand-code them, making the design lose its significance. In fact it is the programmers who use SQL most frequently to perform queries. They also use the language to program, and it doesn’t make much difference for them in understanding an English-like language and a non-English-like language. Actually the English-like design is more a hindrance than an advantage.

SQL is extremely strict about the syntax implementation. Any deviation will result in a refusal from the interpreter, so SQL users must learn to observe the rule of syntax. This is the common feature of all programming languages. By contrast, natural languages are flexible about the grammar and thus can accommodate some degree of ambiguities. SQL lacks this flexibility natural languages have. It wasn’t easy to achieve such a feature in the emerging era of SQL when the technology wasn’t so advanced.

The English-like nature of SQL has had big consequences, besides the lack of the merit of the English language. The simulation of English grammar doesn’t make SQL easier to grasp; rather, it serves to complicate it.

The biggest consequence of English-like design is that SQL becomes non-procedural. The program logic is performed step by step, during which variables are used to store intermediate values for use in later steps. The English language, however, uses a small number of conjunctions or relative pronouns to connect and define a relationship between clauses or between a main clause and a subordinate clause. It writes as much information as possible in one sentence. To express complex meanings exactly, it uses one or more subordinate clauses introduced by relative pronouns. SQL imitates this feature of English language by putting many actions like SLECT, WHERE and GROUP in one statement, producing multi-level nested subqueries for complicated queries. For example, SQL uses the WHERE clause and HAVING clause for the same action. The case is rare in other programming languages.

Stepwise coding effectively facilitates understanding and execution. A query is easy to understand if done in steps but becomes difficult to understand if handled in one statement. An example is to find clients whose sales are two times higher than the average. Naturally we will compute the average and then find the eligible clients. This needs only two statements. To do it in SQL, a long statement with a subquery is needed. The example isn’t difficult as it only involves the two-layer logic. For the English-like SQL, its description of a two-level relationship isn’t so difficult to understand. In reality, there are many queries that have three to five layers of subqueries that are real obstacles to understanding.

SQL’s lack of interest in stepwise coding leads to long statements. The complicated SQLs programmers have to write are often measured by K, instead of by lines. There’s a world difference in understanding a 100-line piece of code written in a single one statement and in 100 statements. It’s not only laborious to produce an extremely long statement, but it’s possible that the programmer themselves can’t understand it after a long time. A non-stepwise long statement is also difficult to debug and takes long to write.

About the procedurality, fans of SQL defend the language by saying that SQL users only need to take care of what they want and don’t need to care about how to do it. Machines will find the method of doing it, making it unnecessary for syntax to support the procedurality.

That’s nonsense!

Every programming language supports the “how” at some degree. Coding in assembly language needs to take care of the actions of the register and memory, but doesn’t need to concern about the actions of the lower-level NAND gate. With SQL there’s no need to handle data operations at the physical storage level (file system, internal memory and hard disk), but we still need to handle operations at the logical level (tables and fields). In a sense a SQL statement is a description of the computing logic. A complicated multi-level nested query also offers the execution steps while expressing the computing target. In other words, SQL describes a computing target by stating the execution steps, though the way is rather sophisticated.  

Not an enthusiast of stepwise computation though, SQL is to some extent procedure-oriented. The stored procedure is a way of executing a SQL statement in a step-by-step fashion. Invoking a SQL statement with an external program can also achieve procedural computing. Without taking into consideration the low performance resulted from generating temporary tables (for storing intermediate values) and performing database I/O (used to get computing results during the invocation of SQL statements by an external program), both stored procedures and invocation by external programs provide equal functionalities. But to deal with the performance issue in handling huge amounts of data, we still need long SQL statements. The techniques are a valuable supplement to the SQL’s insufficient procedurality ability when the amount of data being handled isn’t large and the performance issue can be ignored.