author: ache <ache@ache.one> 2024-03-19 12:03:49 +0100
committer: ache <ache@ache.one> 2024-03-19 12:03:49 +0100
commit: ad28468e8c6418d7bd8c1e891c538857469e57e0
tree: 9afd72d2794bb9e3f6e4337de0051f4cdaa58076
parent: Use HTTPS for git server link
Errors in en version of bizarreries du langage C
-rw-r--r-- articles/c-language-quirks.md 86
1 files changed, 41 insertions, 45 deletions
diff --git a/articles/c-language-quirks.md b/articles/c-language-quirks.md
index 6d84890..4a9343e 100644
--- a/articles/c-language-quirks.md
+++ b/articles/c-language-quirks.md
@@ -19,9 +19,9 @@ The quirks of the C language
============================
![The C Programming Language logo](res/c_language.svg)
-C is a language with a simple syntaxe.
+C is a language with a simple syntax.
The only complexity of this language come from the fact that it acts in a machine-like way.
-However, a part of the C syntaxe is almost never taught.
+However, a part of the C syntax is almost never taught.
Let's tackle these mysterious cases! 🧞
:::note
@@ -36,7 +36,7 @@ Table of contents
The uncommon operators
----------------------
-There is two operators in the C language that are almost never used.
+There are two operators in the C language that are almost never used.
The first is the comas operator.
In C, the comma is used to separate the elements of a definition or to separate the elements of a function. In short, it is a punctuation element.
But not only ! It's also an operator.
@@ -53,8 +53,7 @@ printf("%d", (5,3) );
It prints 3.
The operator `,` is used to juxtapose expressions.
-La valeur de l'expression complète est égale à la valeur de la dernière expression.
-The value of the complete expression is equal to the value of the last expression.
+The value of the whole expression is equal to the value of the last expression.
This operator is very useful in a `for` loop to multiply the iterations.
For example, to increment `i` and decrement `j` in the same iteration of a `for` loop, we can do:
@@ -65,7 +64,7 @@ for( ; i < j ; i++, j-- ) {
}
~~~
-Or again, in small `if` to simplifiate
+Or again, in small `if` to simplify
~~~c
if( argc > 2 && argv[2][0] == '0' )
@@ -73,7 +72,7 @@ if( argc > 2 && argv[2][0] == '0' )
~~~
Here, we assign `action` and `color`.
-Normaly to do 2 assignations, we should use curly braces.
+Normally to do 2 assignations, we should use curly braces.
We can also use the comma operator to remove parentheses.
@@ -98,7 +97,7 @@ The "ternary" for the intimates.
The only one operator of the language that takes 3 operands.
It is used to simplify conditional expressions.
-For example to print the minimum of 2 nombers, without ternary, we could do:
+For example to print the minimum of 2 numbers, without ternary, we could do:
~~~c
if (a < b)
@@ -123,7 +122,7 @@ printf("%d", a<b ? a : b);
~~~
Thanks to the use of ternary, we saved a repetition as well as a few lines.
-More importantly, we make it more lisible. When one know how to read a ternary...
+More importantly, we make it more readable. When one know how to read a ternary...
To read a ternary expression, you need to split it into 3 parts:
@@ -134,7 +133,7 @@ expression_1 ? expression_2 : expression_3
The value of the whole expression `expression_2` is the `expression_1` is evaluated to true and `expression_3` otherwise.
So shorts expressions become easier to read.
-The expression `a<b ? a : b` reads « If `a` is less or equal to `b` then `a` else `b` ».
+The expression `a<b ? a : b` reads “If `a` is less or equal to `b` then `a` else `b`”.
I still insist on the fact that this operator, if it is badly used, can harm the readability of the code.
Note that a ternary expression can be used as the operand of another ternary expression:
@@ -153,7 +152,6 @@ The most readable way is to use a macro :
printf("%d", MIN(a,MIN(b,c)));
~~~
-That's it ! Two operators that now
Et voilà ! Two operators who will now gain some interest.
By the way, even the operators that we already know, we do not necessarily master the syntax.
@@ -162,25 +160,25 @@ Access to an array
------------------
-If you ever learn C, you must have learn that to access the third element of an array, you can do:
+If you ever learn C, you must have learned that to access the third element of an array, you can do:
~~~c
int arr[5] = {0, 1, 2, 3, 4, 5};
printf("%d", arr[2]);
~~~
-2 not 3, because a array in C starts at 0.
+2 not 3, because an array in C starts at 0.
If an array start at 0, it's a story of address[^address]
The address of an array is the one of the first element of that array.
By pointer arithmetic, the address of the third element of an array is therefor `arr+2`.
-So we may have write:
+So we might have written:
~~~c
printf("%d", *(arr+2));
~~~
-Them, since addition is commutative, `arr+2` and `2+arr` are equivalent.
+Then, since addition is commutative, `arr+2` and `2+arr` are equivalent.
In fact, we could even have done:
~~~c
@@ -190,7 +188,7 @@ printf("%d", 2[arr]);
It's perfectly valid.
With good reason: the syntax `E[F]` is strictly equivalent to `*((E)+(F))`, no more, no less.
Suddenly the name of this section seams misleading.
-This operator have nothing to deal with arrays.
+This operator has nothing to deal with arrays.
In fact, it's sugar syntax for pointer arithmetic.[^sugar]
For example to print the character `=` for `1`, `!` for `0` and `~` for `2`.
@@ -215,8 +213,8 @@ printf("%c", is_good["!=~"] ); // Prints '!' if is_good is 0
// '~' if is_good is 2
~~~
-Since everybody write `arr[3]`, please don't write `3[arr]`, it's useful to know, not to do.
-[^address]: It's a little more complicated than that. When it was necessary to choose if an array must starts at 0 or 1, the compilation time of a program was of particular importance. It had been decided to start at 0 to gain time (processor cycle actually) by not doing the indices translation needed for 1 based array. The [Citation Needed](http://exple.tive.org/blarg/2013/10/22/citation-needed/) article from Mike Hoye explains a lot better then I could.
+Since everybody writes `arr[3]`, please don't write `3[arr]`, it's useful to know, not to do.
+[^address]: It's a little more complicated than that. When it was necessary to choose if an array must start at 0 or 1, the compilation time of a program was of particular importance. It had been decided to start at 0 to gain time (processor cycle actually) by not doing the indices translation needed for 1 based array. The [Citation Needed](http://exple.tive.org/blarg/2013/10/22/citation-needed/) article from Mike Hoye explains a lot better than I could.
[^sugar]: For the story `[]` is inherited from B language where the concept of array wasn't even a thing. An array (or vector like it was called back them) was just the address of the first element of a bytes sequence.
Only pointers arithmetic allowed to access to all elements of a vector, it was an evolution compared to BCPL, the B ancestor, who use the syntax `V!4` to access to the fifth element of an array, but I digress...
@@ -227,7 +225,7 @@ Initialisation
The initialisation is something we master in C.
It is the fact of giving a value to a variable during its declaration.
-Basically, we define it's value.
+Basically, we define its value.
For an array[^array] :
~~~c
@@ -244,7 +242,7 @@ If we only want to initialise the third element, since the initialisation of an
int arr[10] = {0, 0, 5};
~~~
-But in fact, there is a another syntax for that:
+But in fact, there is an another syntax for that:
~~~c
int arr[10] = {[2] = 5};
@@ -295,9 +293,8 @@ message to_send = { .msg="Code 10", .dst="23:12:23", .src=""};
message to_send = { .dst="23:12:23", .msg="Code 10"};
~~~
-With theses syntaxes we could also make the code more verbose or/and easier to read..
-Avec ces syntaxes on peut également alourdir le code, mais généralement, on gagne en lisibilité.
-Sometimes theses syntaxes can be used very wisely !
+With these syntaxes we could also make the code more verbose or/and easier to read...
+Sometimes these syntaxes can be used very wisely !
As here in this base64 decoder:
>~~~c
>static int b64_d[] = {
@@ -354,7 +351,7 @@ Well, I can do that:
printf("%d", ((int[]){5,4,5,2,1}) [i] ); // With i set to something >=0 and <5
~~~
-It's not very readable, but in many case this syntax is useful.
+It's not very readable, but in many cases this syntax is useful.
For example with a structure:
~~~c
@@ -375,7 +372,7 @@ We call these expressions *compound literals* (which is a pain to translate in a
Introduction to VLAs
--------------------
-Variable Lenght Arrays are arrays with length only know at runtime.
+Variable Length Arrays are arrays with length only know at runtime.
If never encounter VLA, this should clink you:
~~~c
@@ -387,7 +384,7 @@ for(int i = 0 ; i < n ; i++)
arr[i] = 0;
~~~
-A lot of teachers must have reprove that code.
+A lot of teachers must have repressed that code.
We have been taught that an array must have a know size at compile time.
VLA are the exception.
Introduced with the C99 standard, VLAs have a bad reputation.
@@ -395,7 +392,7 @@ There are several reasons for this, which I won't go into here[^reasons].
I'm just going to talk about the non-intuitive behaviors introduced with VLAs.
-To define a VLA, it's the same syntax as a classical array but the size of the array is an non constant expression.
+To define a VLA, it's the same syntax as a classical array, but the size of the array is a non-constant expression.
But first, let's see what and how to use a VLA (the normal way).
~~~c
@@ -408,7 +405,6 @@ unsigned int arr3[foo()]; // avec foo une fonction définie ailleurs
~~~
A variable length array can not be initialised nor declared `static`.
-Un VLA ne peut pas être initialisé, de plus, il ne peut être déclaré `static`.
Thus, both of these statements are incorrect:
~~~c
@@ -453,7 +449,7 @@ The VLAs exceptions
The most known deviant behavior of VLAs is their relation to `sizeof`.
-`sizeof` is an unary operator that retrieves the size of a type from an expression or from the name of a type surrounded by parentheses.
+`sizeof` is a unary operator that retrieves the size of a type from an expression or from the name of a type surrounded by parentheses.
~~~c
/* How sizeof works using examples */
@@ -473,9 +469,9 @@ The third one is the size of the type of the expression `a` (which is `float`[^s
The fourth is the size of a `double`[^system] (the type of `a*2.`[^float]).
The last one is the size of the type `size_t` since it is the type of the expression `b++`[^system].
-Ici, we don't care about the value of the expression since `sizeof` doesn't care more.
-The value of the sizeof expression is determined at compile time.
-The operations inside the sizeof statements aren't executed.
+Here, we don't care about the value of the expression since `sizeof` doesn't care more.
+The value of the `sizeof` expression is determined at compile time.
+The operations inside the `sizeof` statements aren't executed.
Since the expression must be valid, its type is determined at compile time.
~~~c
@@ -484,7 +480,7 @@ printf("%zu", sizeof(char[++n])); // Prints 6
~~~
*Ouch* ! Here are the VLAs.
-In the type `int[++n]`, `++n` is a non constant expression.
+In the type `int[++n]`, `++n` is a non-constant expression.
So the array is a VLA.
To know the size of the array, the compiler must execute the expression inside the bracket.
This, `n` holds 6 now and `sizeof` indicates that an array of `char` declared within this expression should have a size of `6`.
@@ -507,13 +503,13 @@ Note that there are other exceptions induced by the standardisation of VLAs such
And even some conditional branches are forbidden when using a VLA.
[^system]: On my computer.
-[^float]: Here, the implementation follow the IEEE 754 standard, where the size of floating number “simple” is 4 bytes and “double” is 8. `2.` has type `double` so `2.*a` has the same type as its greater operand.
+[^float]: Here, the implementation follows the IEEE 754 standard, where the size of floating number “simple” is 4 bytes and “double” is 8. `2.` has type `double` so `2.*a` has the same type as its greater operand.
A flexible array
----------------
-You may never heard something like “flexible arrays member”.
+You may never hear something like “flexible array members”.
This is normal, these respond to a very specific and uncommon problem.
The objective is to allocate a structure but with one field (an array) of unknown size at compile time and all on a contiguous space[^why].
@@ -541,7 +537,7 @@ And use it like that:
~~~
But here the array may not be next to the structure in memory.
-As a consequence, if we copy the structure we the value of `arr` will be the same for the copy and for the original one.
+As a consequence, if we copy the structure the value of `arr` will be the same for the copy and for the original one.
To avoid that, we must copy the structure, reallocate the array and copy it.
Let's see another way.
@@ -580,7 +576,7 @@ In C, if there's one thing we shouldn't talk about, it's *labels*.
We use it with the `goto` statement. The forbidden one !
To hide them, we replace the `goto` by named statement more explicit like `break` or `continue`.
-So we don't have `goto` anymore and we never lear what is a label anymore when we learn the C syntax.
+So we don't have `goto` anymore, and we never learn what is a label anymore when we learn the C syntax.
This is how `goto` and a label are used:
@@ -665,7 +661,7 @@ while( count-- > 0 )
*to = *from++;
~~~
-Dividing by 8 (arbitrary number) we also divides the number of tests and decrements by 8.
+Dividing by 8 (arbitrary number) we also divide the number of tests and decrements by 8.
However, if `count` is not divisible by 8, we have a problem, we don't do all the instructions.
It would be nice to be able to jump directly to the 2nd instruction, if you only have 6 instructions left.
@@ -673,7 +669,7 @@ And this is where labels can help us!
Thanks to the `switch` we can jump directly to the right instruction.
We only have to label every instruction with the number of instruction remaining to do.
-Them we jump to that instruction with the `switch` statement.
+Then we jump to that instruction with the `switch` statement.
It is very rare to have to use this type of trick.
@@ -686,9 +682,9 @@ Complex numbers
Once again, we will study a syntax introduced by C99.
More exactly there are 3 types that have been introduced which are complex numbers.
-The type of a complex number is `double _Complex` (the other two follow the same pattern, I will only write about the `double` version)
+The type of complex number is `double _Complex` (the other two follow the same pattern, I will only write about the `double` version)
-Thus in C, it is possible to declare a complex number like this:
+Thus, in C, it is possible to declare a complex number like this:
~~~c
@@ -814,7 +810,7 @@ Here is a table summarising the trigraph and digraph sequences and their charact
| `##` | `%:%:` | |
-The **main** difference between the digraphes and trigraphes are inside a string:
+The **main** difference between the digraphs and trigraphs is inside a string:
~~~c
puts("??= is a hashtag");
@@ -828,13 +824,13 @@ So this line of code is perfectly valid:
~~~
The only use of this syntax nowadays is to obfuscate a source code very easily.
-A combination of a ternary with a trigraph and a digraph and you have an absolutely unreadable code 😉
+With a combination of a ternary with trigraphs and a digraph you get an absolutely unreadable code 😉
~~~c
printf("%d", a ?5??((arr):>:0);
~~~
-[^C23]: In C23, trigraphes are deprecated and doesn't work anymore.
+[^C23]: In C23, trigraphs are deprecated and doesn't work anymore.
**Never use it in a serious code**.
@@ -844,7 +840,7 @@ printf("%d", a ?5??((arr):>:0);
That's it, I hope you've learned something from this post. Don't forget to use these syntaxes sparingly.
-I would like to thank [Taurre](https://zestedesavoir.com/membres/voir/Taurre/) for validating this article, but also for his pedagogy on the forums for years, as well as [blo yhg](https://zestedesavoir.com/membres/voir/blo%20yhg/) for his careful proofreading.
+I would like to thank [Taurre](https://zestedesavoir.com/membres/voir/Taurre/) for validating this article in French, but also for his pedagogy on the forums for years, as well as [blo yhg](https://zestedesavoir.com/membres/voir/blo%20yhg/) for his careful proofreading.
Note that you can (re)discover a lot of code abusing the C language syntax at [IOCCC](http://ioccc.org/winners.html). 😈