---

pubDate = 2018-11-18
tags = ['language', 'obfuscation', 'programmation']

[author]
name = "ache"
email = "ache@ache.one"

[[alt_lang]]
lang = "fr"
url = "/articles/bizarreries-du-langage-c"

---



The quirks of the C language
============================

![The C Programming Language logo](res/c_language.svg)  
C is a language with a simple syntax.
The only complexity of this language comes from the fact that it works in a machine-like way.
However, part of the C syntax is almost never taught.
Let's tackle these mysterious cases! 🧞

:::note
To understand this post, it is necessary to have a basic knowledge of a language with a syntax and operation close to C.
:::


Table of contents
----------


The uncommon operators
----------------------

There are two operators in the C language that are almost never used.
The first is the comma operator.
In C, the comma is used to separate the elements of a declaration or the arguments of a function call. In short, it is a punctuation element.
But not only that! It is also an operator.


### The comma operator

The following instruction, although unnecessary, is quite valid:

~~~c
printf("%d", (5,3) );
~~~

It prints 3.
The `,` operator chains expressions, which are evaluated from left to right.

The value of the whole expression is the value of the last expression.

This operator is very useful in a `for` loop to update several variables at each iteration.
For example, to increment `i` and decrement `j` in the same iteration of a `for` loop, we can do:

~~~c
for( ; i < j ; i++, j-- ) {
 // [...]
}
~~~
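
To make the loop above concrete, here is a minimal sketch that reverses an array in place. The function name `reverse_in_place` is my own, not something from a library.

```c
#include <stddef.h>

/* Reverses arr in place: i walks up from the start while j walks down from the end. */
void reverse_in_place(int *arr, size_t len) {
    if (len < 2)
        return;                      /* nothing to swap, and avoids len - 1 wrapping around */
    /* The comma in "i = 0, j = len - 1" only separates two declarators;
       the one in "i++, j--" is the comma operator. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```

Calling it on `{1, 2, 3, 4, 5}` with `len` set to 5 leaves the array as `{5, 4, 3, 2, 1}`.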

Or again, in a small `if`, to avoid braces:

~~~c
if( argc > 2 && argv[2][0] == '0' )
    action = 4, color = false;
~~~

Here, we assign both `action` and `color`.
Normally, two assignments under the same `if` would require curly braces.

We can also use the comma operator to remove parentheses.

~~~c
while( c = getchar(), c != EOF && c != '\n' ) {
  // [...]
}
// Is strictly equivalent to :
while( (c = getchar()) != EOF && c != '\n' ) {
  // [...]
}
~~~


But above all, do not abuse this operator!
You can very quickly end up with unreadable code.  
This remark is also valid for the next operator!

### The ternary operator

Just "the ternary" to its friends.
It is the only operator of the language that takes 3 operands.
It is used to simplify conditional expressions.

For example to print the minimum of 2 numbers, without ternary, we could do:

~~~c
if (a < b)
    printf("%d", a);
else
    printf("%d", b);
~~~

Or simply:

~~~c
int min = a;
if( b < a)
  min = b;
printf("%d", min);
~~~

And using the ternary operator:

~~~c
printf("%d", a<b ? a : b);
~~~

Thanks to the ternary, we saved a repetition as well as a few lines.
More importantly, we made the code more readable, at least once you know how to read a ternary...

To read a ternary expression, you need to split it into 3 parts:

~~~c
expression_1 ? expression_2 : expression_3
~~~

The value of the whole expression is `expression_2` if `expression_1` evaluates to true, and `expression_3` otherwise.

So short expressions become easier to read.
The expression `a<b ? a : b` reads “If `a` is less than `b` then `a`, else `b`”.

I still insist on the fact that this operator, if it is badly used, can harm the readability of the code.
Note that a ternary expression can be used as the operand of another ternary expression:

~~~c
printf("%d", a<b ? a<c ? a : b<c ? b : c : b < c ? b : c); 
~~~

Now we take the minimum of three numbers.
It is compact, but impossible to follow.
The most readable way is to use a macro:

~~~c
#define MIN(a,b) ((a) < (b) ? (a) : (b))

printf("%d", MIN(a,MIN(b,c)));
~~~
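
One thing to keep in mind with such a macro: the argument that wins is evaluated twice, so `MIN(i++, j)` may increment `i` twice. When the arguments have side effects, a small function is a safer sketch of the same idea. The names `min_int` and `min3` are made up for this example.

```c
/* Returns the smaller of two ints; each argument is evaluated exactly once. */
int min_int(int a, int b) {
    return a < b ? a : b;
}

/* Minimum of three numbers, built from the two-argument version. */
int min3(int a, int b, int c) {
    return min_int(a, min_int(b, c));
}
```

The ternary is still there, but it appears once and stays readable.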

Et voilà! Two operators that should now be of more interest.
By the way, even for the operators we already know, we do not necessarily master the syntax.


Access to an array
------------------


If you have ever learned C, you must have learned that to access the third element of an array, you can do:

~~~c
int arr[5] = {0, 1, 2, 3, 4};
printf("%d", arr[2]);
~~~

2, not 3, because arrays in C start at 0.
The reason arrays start at 0 is a matter of addresses.[^address]
The address of an array is the address of its first element.
By pointer arithmetic, the address of the third element is therefore `arr+2`.

So we might have written:

~~~c
printf("%d", *(arr+2));
~~~

Then, since addition is commutative, `arr+2` and `2+arr` are equivalent.
In fact, we could even have done:

~~~c
printf("%d", 2[arr]);
~~~

It's perfectly valid.
With good reason: the syntax `E[F]` is strictly equivalent to `*((E)+(F))`, no more, no less.
Suddenly the name of this section seems misleading.
This operator has nothing to do with arrays.
In fact, it is syntactic sugar for pointer arithmetic.[^sugar]

For example, say we want to print the character `=` for `1`, `!` for `0` and `~` for `2`.

We may do:
~~~c
if( is_good == 1 )
  printf("%c", '=');
else if( is_good == 0 )
  printf("%c", '!');
else
  printf("%c", '~');
~~~

But it can be done more simply:
~~~c
printf("%c", "!=~"[is_good]);

// As we saw :
printf("%c", is_good["!=~"] ); // Prints '!' if is_good is 0
                               //        '=' if is_good is 1
                               //        '~' if is_good is 2
~~~
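
The one-liner above trusts `is_good` to be 0, 1 or 2. Here is a hedged sketch of the same lookup with a range check. The function name `status_char` and the `'?'` fallback are assumptions of mine, not part of the original snippet.

```c
/* Maps 0 -> '!', 1 -> '=', 2 -> '~'; anything else falls back to '?'. */
char status_char(int is_good) {
    static const char symbols[] = "!=~";
    if (is_good < 0 || is_good > 2)
        return '?';                  /* hypothetical fallback for out-of-range values */
    return symbols[is_good];         /* same as *(symbols + is_good) or is_good[symbols] */
}
```

With this, `printf("%c", status_char(is_good));` prints the same characters as before for valid values.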

Since everybody writes `arr[3]`, please don't write `3[arr]`: it is useful to know, not to do.

[^address]: It's a little more complicated than that. When it had to be decided whether arrays should start at 0 or 1, the compilation time of a program was of particular importance. It was decided to start at 0 to save time (processor cycles, actually) by not doing the index translation needed for 1-based arrays. The [Citation Needed](http://exple.tive.org/blarg/2013/10/22/citation-needed/) article by Mike Hoye explains it a lot better than I could.

[^sugar]: For the record, `[]` is inherited from the B language, where the concept of array wasn't even a thing. An array (or vector, as it was called back then) was just the address of the first element of a byte sequence.
Only pointer arithmetic gave access to all the elements of a vector. It was an evolution compared to BCPL, the ancestor of B, which uses the syntax `V!4` to access the fifth element of an array, but I digress...


Initialisation
--------------

Initialisation is something we master in C.
It means giving a value to a variable in its declaration.
Basically, we define its value.

For an array[^array] :
~~~c
int arr[10] = {0};
arr[2] = 5;
~~~

In the first line, we initialise the whole array to 0, because every value not specified is set to 0 by default.
The next line is an assignment, not an initialisation.

If we only want to initialise the third element, since an array initialiser follows the order of the values, we should write:

~~~c
int arr[10] = {0, 0, 5};
~~~

But in fact, there is another syntax for that:

~~~c
int arr[10] = {[2] = 5};
~~~

We simply say that the third element value is 5.
The rest is 0 by default.
An equivalent syntax also exists for structures and unions.

:::information
For the examples, we will use the structures `point` and `message`, both of which come up many times in this article.
:::

We can initialise a point based on its components.

~~~c
typedef struct point {
  int x,y;
} point;


point A = {.x = 1, .y = 2};
~~~

Here, there is no ambiguity.
But for a more complex structure, this syntax is really helpful.

~~~c
typedef struct message {
   char src[20], dst[20], msg[200];
} message;

// [...]

message to_send = {.src="", .dst="23:12:23", .msg="Code 10"};

// Is way more self-explanatory than :

message to_send = {"", "23:12:23", "Code 10"};

// We don't need to follow the declaration order of the structure fields

message to_send = { .msg="Code 10", .dst="23:12:23", .src=""};

// Also, any field that is not explicitly initialised is set to its null value,
// so we can omit the `src` field.

message to_send = { .dst="23:12:23", .msg="Code 10"};
~~~
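
The "null value" rule is easy to check. The sketch below builds a message without `src` and verifies that every byte of that field is zero. The helper names `make_broadcast` and `src_is_zeroed` are invented for the example.

```c
#include <stddef.h>

typedef struct message {
    char src[20], dst[20], msg[200];
} message;

/* Builds a message with only dst and msg set; src is left out on purpose. */
message make_broadcast(void) {
    message m = { .dst = "23:12:23", .msg = "Code 10" };
    return m;
}

/* Returns 1 if every byte of the omitted src field is zero, 0 otherwise. */
int src_is_zeroed(const message *m) {
    for (size_t i = 0; i < sizeof m->src; i++)
        if (m->src[i] != 0)
            return 0;
    return 1;
}
```

The same rule applies to the unused tail of `dst` and `msg`: the bytes after the string are zero as well.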

These syntaxes make the code more verbose, but often easier to read.
Sometimes they can be used very wisely!
As here, in this base64 decoder:
>~~~c
>static int b64_d[] = {
>    ['A'] =  0, ['B'] =  1, ['C'] =  2, ['D'] =  3, ['E'] =  4,
>    ['F'] =  5, ['G'] =  6, ['H'] =  7, ['I'] =  8, ['J'] =  9,
>    ['K'] = 10, ['L'] = 11, ['M'] = 12, ['N'] = 13, ['O'] = 14,
>    ['P'] = 15, ['Q'] = 16, ['R'] = 17, ['S'] = 18, ['T'] = 19,
>    ['U'] = 20, ['V'] = 21, ['W'] = 22, ['X'] = 23, ['Y'] = 24,
>    ['Z'] = 25, ['a'] = 26, ['b'] = 27, ['c'] = 28, ['d'] = 29,
>    ['e'] = 30, ['f'] = 31, ['g'] = 32, ['h'] = 33, ['i'] = 34,
>    ['j'] = 35, ['k'] = 36, ['l'] = 37, ['m'] = 38, ['n'] = 39,
>    ['o'] = 40, ['p'] = 41, ['q'] = 42, ['r'] = 43, ['s'] = 44,
>    ['t'] = 45, ['u'] = 46, ['v'] = 47, ['w'] = 48, ['x'] = 49,
>    ['y'] = 50, ['z'] = 51, ['0'] = 52, ['1'] = 53, ['2'] = 54,
>    ['3'] = 55, ['4'] = 56, ['5'] = 57, ['6'] = 58, ['7'] = 59,
>    ['8'] = 60, ['9'] = 61, ['+'] = 62, ['/'] = 63, ['='] = 64
>};
>~~~
Source: [Taurre](https://openclassrooms.com/forum/sujet/defis-8-tout-en-base64-19054?page=1#message-6921633)
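
One subtlety with such sparse tables: unlisted entries are 0, which is also the value of `'A'` in the table above. Here is a minimal sketch of the same technique for hexadecimal digits, storing each value plus one so that 0 can mean "not a digit". The names `hex_plus_one` and `hex_value` are assumptions for this example.

```c
/* Sparse lookup table: unlisted entries default to 0, so each digit is stored
   as value + 1 to tell the digit '0' apart from "not a hex digit". */
static const unsigned char hex_plus_one[256] = {
    ['0'] = 1,  ['1'] = 2,  ['2'] = 3,  ['3'] = 4,  ['4'] = 5,
    ['5'] = 6,  ['6'] = 7,  ['7'] = 8,  ['8'] = 9,  ['9'] = 10,
    ['a'] = 11, ['b'] = 12, ['c'] = 13, ['d'] = 14, ['e'] = 15, ['f'] = 16,
    ['A'] = 11, ['B'] = 12, ['C'] = 13, ['D'] = 14, ['E'] = 15, ['F'] = 16
};

/* Returns 0..15 for a hexadecimal digit, or -1 for any other character. */
int hex_value(char c) {
    return (int)hex_plus_one[(unsigned char)c] - 1;
}
```

The cast to `unsigned char` keeps the index in range even where `char` is signed.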


[^array]:
    `{0}` can also be used for a number. Any value initializing a simple variable (pointer, number, ...) can optionally take braces.

    ~~~c
    int a = {11};
    float pi = {3.1415926};
    char* s = {"unicorn"};
    ~~~

    The main feature of the array is the use of commas between the braces.


The compound literals
---------------------


Since we are talking about arrays, there is a simple syntax for single-use arrays.

I would like to use this array:
~~~c
int arr[5] = {5, 4, 5, 2, 1};
printf("%d", arr[i]); // With i set to something >=0 and <5
~~~


However, I only use this array once...
It's a bit annoying to have to introduce an identifier just for that.


Well, I can do that:
~~~c
printf("%d", ((int[]){5,4,5,2,1}) [i] ); // With i set to something >=0 and <5
~~~

It's not very readable, but in many cases this syntax is useful.
For example with a structure:

~~~c

// To send our message:
send_msg( (message){ .dst="192.168.11.1", .msg="Code 11"} );

// To print the distance between two points
printf("%d", distance( (point){1, 2}, (point){2, 3} )  );

// Or on Linux, in system programming
execvp( "bash" , (char*[]){"bash", "-c", "ls", NULL} );
~~~

We call these expressions *compound literals* (a term that is a pain to translate into any other language).
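
The snippets above rely on functions defined elsewhere. Here is a self-contained sketch to play with, where `manhattan_distance`, `sum_ints` and `compound_literal_demo` are hypothetical helpers written only for this example.

```c
#include <stddef.h>

typedef struct point {
    int x, y;
} point;

/* Hypothetical helper: Manhattan distance between two points. */
int manhattan_distance(point a, point b) {
    int dx = a.x < b.x ? b.x - a.x : a.x - b.x;
    int dy = a.y < b.y ? b.y - a.y : a.y - b.y;
    return dx + dy;
}

/* Sums the first len elements of values. */
int sum_ints(const int *values, size_t len) {
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += values[i];
    return total;
}

/* Both calls build their arguments in place with compound literals. */
int compound_literal_demo(void) {
    int d = manhattan_distance((point){1, 2}, (point){.x = 4, .y = 6});
    int s = sum_ints((int[]){5, 4, 5, 2, 1}, 5);
    return d + s;   /* 7 + 17 */
}
```

Note that designated initialisers work inside a compound literal too, as in `(point){.x = 4, .y = 6}`.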


Introduction to VLAs
--------------------

Variable Length Arrays (VLAs) are arrays whose length is only known at runtime.
If you have never encountered a VLA, this should shock you:

~~~c
int n = 11;

int arr[n];

for(int i = 0 ; i < n ; i++)
  arr[i] = 0;
~~~

A lot of teachers must have rejected code like that.
We have been taught that an array must have a size known at compile time.
VLAs are the exception.
Introduced with the C99 standard, VLAs have a bad reputation.
There are several reasons for this, which I won't go into here[^reasons].

I'm just going to talk about the non-intuitive behaviors introduced with VLAs.

But first, let's see what a VLA is and how to use one (the normal way).
A VLA is defined with the same syntax as a classical array, except that its size is a non-constant expression.

~~~c
int n = 50;
int arr[n];

double arr2[2*n];

unsigned int arr3[foo()]; // with foo a function defined elsewhere
~~~

A variable length array can neither be initialised nor declared `static`.
Thus, both of these declarations are incorrect:

~~~c
int n = 30;

int arr[n] = {0};       // error: a VLA cannot be initialised
static int arr2[n];     // error: a VLA cannot have static storage duration
~~~

In a function, we can use this syntax to refer to a VLA:

~~~c
void bar(int n, int arr[n]) {

}
~~~

In C, the size of the first dimension of an array parameter is not really of interest, since the parameter is adjusted to a pointer.
A real-life case is passing a 2-dimensional VLA, where the second dimension *must* be specified:

~~~c
void foo( int n, int m, int arr[][m]) {

}
~~~
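
As an illustration, here is a minimal sketch of such a function with a body. The name `sum_matrix` is mine. The important detail is that `n` and `m` are declared before `arr`, so that they can be used in its type.

```c
/* Sums an n-by-m matrix; m must be declared before arr so that it can size the rows. */
long sum_matrix(int n, int m, int arr[n][m]) {
    long total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            total += arr[i][j];
    return total;
}
```

A plain `int grid[2][3]` can be passed as `sum_matrix(2, 3, grid)`: the caller does not need a VLA, only the parameter is variably modified.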

Note that it is possible to use the character `*` (yet another use ...) instead of the size of one or more dimensions of a VLA, but *only* within a prototype.

~~~c
void foo(int, int, int[][*]);
~~~

Well, after that short introduction, let's talk about the interesting cases.
The quirks and eccentricities that the VLAs have introduced !


[^reasons]: I can nevertheless give you two references: [“Is it safe to use variable-length arrays?”](https://stackoverflow.com/questions/7326547/is-it-safe-to-use-variable-length-arrays) on Stack Overflow and the article [“The Linux Kernel Is Now VLA-Free”](https://www.phoronix.com/scan.php?page=news_item&px=Linux-Kills-The-VLA).


The VLAs exceptions
-------------------

The best-known deviant behavior of VLAs is their relationship with `sizeof`.

`sizeof` is a unary operator that gives the size of a type, either from an expression or from the name of a type surrounded by parentheses.

~~~c
/*  How sizeof works using examples */ 
float a;
size_t b = 0;

printf("%zu", sizeof(char)); // Prints 1
printf("%zu", sizeof(int));  // Prints 4
printf("%zu", sizeof a);     // Prints 4
printf("%zu", sizeof(a*2.)); // Prints 8
printf("%zu", sizeof b++);   // Prints 8
~~~

The first result is not very surprising: the size of a `char` is defined to be 1, so `sizeof(char)` must return 1 (as per the C standard).
The second one is the size of `int`[^system].
The third one is the size of the type of the expression `a` (which is `float`[^system]).
The fourth is the size of a `double`[^system] (the type of `a*2.`[^float]).
The last one is the size of the type `size_t` since it is the type of the expression `b++`[^system].

Here, we don't care about the value of the expression, and neither does `sizeof`.
The value of the `sizeof` expression is determined at compile time.
The operations inside the `sizeof` operand aren't executed: after `sizeof b++`, `b` is still 0.
Since the expression must be valid, its type is known at compile time.

~~~c
int n = 5;
printf("%zu", sizeof(char[++n])); // Prints 6
~~~

*Ouch*! Here come the VLAs.
In the type `char[++n]`, `++n` is a non-constant expression.
So the array is a VLA.
To know the size of the array, the program must evaluate the expression inside the brackets at runtime.
Thus, `n` now holds 6, and `sizeof` indicates that an array of `char` declared with this expression would have a size of `6`.

This is not very intuitive, since VLAs introduce here an *exception to the rule* that the expression passed to `sizeof` is not executed.
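
The two behaviours can be put side by side. This is a small sketch with made-up function names (`sizeof_plain`, `sizeof_vla`), assuming a compiler that supports VLAs.

```c
#include <stddef.h>

/* Ordinary operand: it is not evaluated, so *n is left untouched. */
size_t sizeof_plain(int *n) {
    return sizeof((*n)++);           /* size of an int, no increment happens */
}

/* VLA type: the size expression is evaluated, so *n is incremented. */
size_t sizeof_vla(int *n) {
    return sizeof(char[++*n]);
}
```

Starting with `n` equal to 5, `sizeof_plain(&n)` leaves `n` at 5, while `sizeof_vla(&n)` returns 6 and leaves `n` at 6. Some compilers warn about the side effect inside the first `sizeof`, which is rather the point.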

Another odd behaviour introduced by VLAs is the execution of expressions related to the size of a VLA in the definition of a function. Thus:

~~~c
void foo( char arr[printf("bar")] ) {
   printf("%zu", sizeof arr);
}
~~~

Assuming that the output does not fail, calling this function displays `bar` followed by the size of a pointer, for example `bar8` on a typical 64-bit system.
The `printf("bar")` size expression is evaluated first, and only then is the body of the function executed. Since an array parameter is adjusted to a pointer, `sizeof arr` is the size of a `char*`, not 3.


Note that there are other exceptions induced by the standardisation of VLAs, such as, as I already stated, the impossibility of allocating VLAs statically (quite logical), or the impossibility of using VLAs in a structure (GCC supports it anyway, as an extension).
Even some jumps are forbidden when using a VLA: a `goto` or a `switch` may not jump into the scope of a VLA.

[^system]: On my computer.
[^float]: Here, the implementation follows the IEEE 754 standard, where a single-precision floating-point number takes 4 bytes and a double-precision one takes 8. `2.` has type `double`, so `a*2.` has the type of its larger operand, `double`.


A flexible array
----------------

You may never have heard of “flexible array members”.
This is normal: they answer a very specific and uncommon problem.

The objective is to allocate a structure with one field (an array) whose size is unknown at compile time, all in one contiguous space[^why].

There are no VLAs here because, as already stated, VLAs are forbidden as structure fields.
We must use dynamic allocation.
We could write this:

~~~c
struct foo {
  int* arr;
};
~~~

And use it like that:

~~~c
  struct foo* contiguous = malloc( sizeof(struct foo) );
  if (contiguous) {
    contiguous->arr = malloc( N * sizeof *contiguous->arr );
    if (contiguous->arr) {
      contiguous->arr[0] = 11;
    }
  }
~~~

But here the array may not be next to the structure in memory.
Moreover, if we copy the structure, the value of `arr` will be the same in the copy and in the original: both point to the same array.
To avoid that, we must copy the structure, allocate a new array and copy the contents.
Let's see another way.

~~~c
struct foo {
  /* ... At least another field because the C standard say so ... */
  int flexiArr[];
};
~~~

Here, the field `flexiArr` is a flexible array member.
Such an array must be the last member of the structure and must not specify a size[^zero]. It is used like this:

~~~c
  struct foo* contiguous = malloc( sizeof(struct foo) + N * sizeof *contiguous->flexiArr );
  if (contiguous) {
    contiguous->flexiArr[0] = 11;
  }
~~~


This syntax responds as much to a need for portability on architectures imposing a particular alignment (the array is contiguous to the structure) as to the need to show a semantic link between the array and the structure (the array belongs to the structure).
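
Put together, a complete version could look like the sketch below. The structure `int_vec` and the function `int_vec_new` are names I made up, and the size computation is not checked for overflow, which real code should do.

```c
#include <stdlib.h>

struct int_vec {
    size_t len;        /* number of elements stored in data */
    int data[];        /* flexible array member: must come last, no size given */
};

/* Allocates the header and len zeroed elements in one contiguous block.
   Returns NULL on allocation failure; release with a single free(). */
struct int_vec *int_vec_new(size_t len) {
    struct int_vec *v = calloc(1, sizeof *v + len * sizeof v->data[0]);
    if (v)
        v->len = len;
    return v;
}
```

One `free(v)` releases both the structure and its array, since they live in the same allocation.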


[^why]: We may want this space to be contiguous for many reasons.
One is to optimise the use of the processor cache.
Another is the handling of network layers, which are well suited to flexible arrays.

[^zero]: Prior to the specification of flexible array members in C99, it was common practice to use arrays of size one to replicate the concept.


A story of labels
-----------------

In C, if there's one thing we shouldn't talk about, it's *labels*.
We use them with the `goto` statement. The forbidden one!

To hide it, we replace `goto` with more explicit named statements like `break` or `continue`.
So we don't use `goto` anymore, and we no longer learn what a label is when we learn the C syntax.

This is how `goto` and a label are used:


~~~c
goto end;


end: return 0;
}
~~~

Basically, a label is a name given to a statement.
Nowadays we mainly use them in `switch` statements.

~~~c

switch( action ) {
    case 0:
        do_action0();
    case 1:
        do_action1();
        break;
    case 2:
        do_action2();
        break;
    default:
        do_action3();
}
~~~

Here each `case` and the `default` are in fact labels.
Except that you can't use them with `goto`, since they have no name to refer to.


:::question
Why are you telling us this?
:::

Firstly, it's good to know that it's called a label.
Secondly, because I'm going to tell you about a classic.
*Duff's device*.

It's a kind of unrolled, optimised loop.
The goal is to reduce the number of loop tests (as well as the number of decrements).

Here is the historical version written by Tom Duff:

~~~c
{
    register n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}
~~~


It doesn't matter what `register` means. Also, `to` is a special pointer (which is why it is never incremented), but it doesn't really matter.

Here, what I want to tell you about is that `do-while` loop in the middle of a `switch`.

The test we try to avoid is `--n > 0`.

Normally, `n` would simply be `count`.
And we would have to test `count` times.
The same goes for the decrement.


That is to say:

~~~c
while( count-- > 0 )
    *to = *from++;
~~~

By unrolling the loop 8 times (an arbitrary number), we also divide the number of tests and decrements by 8.
However, if `count` is not divisible by 8, we have a problem: a whole number of passes does not execute the right number of instructions.
It would be nice to be able to jump directly to the third instruction if only 6 instructions are left.

And this is where labels can help us!
Thanks to the `switch` we can jump directly to the right instruction.

We only have to label every instruction with the number of instructions remaining to do.
Then we jump to that instruction with the `switch` statement.
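
To try the idea at home, here is a variant that copies into an ordinary buffer. It is only a sketch: unlike the historical version, it increments `to`, it uses `int` elements, and it returns early when `count` is 0 because the device assumes at least one element. The name `duff_copy` is mine.

```c
#include <stddef.h>

/* Copies count ints from `from` to `to` with a Duff's-device-style unrolled loop. */
void duff_copy(int *to, const int *from, size_t count) {
    if (count == 0)
        return;                      /* the device itself assumes count > 0 */
    size_t n = (count + 7) / 8;      /* number of passes through the do-while */
    switch (count % 8) {             /* jump into the middle of the first pass */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

With `count` equal to 10, `n` is 2 and the `switch` jumps to `case 2`: the first pass copies 2 elements, the second pass copies the remaining 8.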


It is very rare to have to use this type of trick.
It's mostly an optimisation from another time.
But since I wanted to talk about that syntax, it was necessary to talk about labels (or was it?)


Complex numbers
---------------

Once again, we will study a syntax introduced by C99.
More precisely, 3 complex number types were introduced.
The main one is `double _Complex` (the other two follow the same pattern, so I will only write about the `double` version).

Thus, in C, it is possible to declare a complex number like this:


~~~c
double complex point = 2 + 3 * I;
~~~

Here we find the special macros `complex` and `I` (defined in the `<complex.h>` header).
The former expands to `_Complex` and is used to name a complex type, while the latter is the imaginary unit, used to write the imaginary part of a complex number.

In memory, a complex variable takes up twice the space of the real type on which it is based.
A complex variable is used like a normal variable.
The arithmetic is intuitive since it is based on the real type.
Note that it is recommended to use the macro `CMPLX` (available since C11) to initialise a complex number:


~~~c
double complex cplx = CMPLX(2, 3);
~~~


This gives better handling of the cases where the imaginary part (the one multiplied by `I`) would be `NAN`, `INFINITY` or a signed zero (`+0.0` or `-0.0`).

The `<complex.h>` header offers us a really simple way to use complex numbers.
Indeed, many common functions for manipulating complex numbers are available.
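
As a small taste, here is a sketch that multiplies two complex numbers and extracts the parts of the result with `creal` and `cimag`. The wrapper names are made up for the example. Depending on the platform, other `<complex.h>` functions such as `cabs` may require linking with the math library (`-lm`).

```c
#include <complex.h>

/* Multiplies two complex numbers; the usual arithmetic operators just work. */
double complex multiply(double complex a, double complex b) {
    return a * b;
}

/* Real part of (2 + 3i) * (1 - i), which is 5 + i. */
double product_real_part(void) {
    return creal(multiply(2 + 3 * I, 1 - 1 * I));
}

/* Imaginary part of the same product. */
double product_imag_part(void) {
    return cimag(multiply(2 + 3 * I, 1 - 1 * I));
}
```

By hand: (2 + 3i)(1 - i) = 2 - 2i + 3i - 3i² = 5 + i, so the two functions return 5 and 1.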


Generic macros
--------------

There is a way in C to have macros that are defined differently depending on the type of one of their arguments.
This syntax is however "new" since it dates from the C11 standard.

This genericity is achieved with generic selections based on the syntax ` _Generic ( /* ... */ )`.  
To understand the syntax, let's look at a simplistic example:

~~~c
#include <stdio.h>
#include <limits.h>

#define MAXIMUM_OF(x) _Generic ((x), \
                                char: CHAR_MAX, \
                                int:  INT_MAX,  \
                                long: LONG_MAX  \
                                )

int main(int argc, char* argv[]) {
    int i = 0;
    long l = 0;
    char c = 0;

    printf("%i\n", MAXIMUM_OF(i));
    printf("%d\n", MAXIMUM_OF(c));
    printf("%ld\n", MAXIMUM_OF(l));
    return 0;
}
~~~


Here we print the maximum that can be stored by each of the types we use.
This is something that would not have been possible without the use of this new keyword `_Generic`.
To use this syntax, we give the keyword `_Generic` 2 parameters.
The first is an expression whose type selects the expression that is finally evaluated; it is not evaluated itself.
The second is a sequence of associations (`type: expression`) separated by commas.
In the end, only the expression associated with the type of the first expression is evaluated.

A real-world example could be:

~~~c
int powInt(int,int);

#define POW(x,y) _Generic ((y), double: pow, float: powf, long double: powl, int: powInt)((x), (y))
~~~


There's not much more to say except that it's possible to have the word `default` in the list of types, which will then correspond to all unmentioned types.
So a cleaner definition of the `POW` macro from earlier could be :

~~~c
int powIntuInt(int a, unsigned int b);
double powIntInt(int a, int b);
double powFltInt(double a,int b) { return pow (a,b); }
float powfFltInt(float a,int b) { return powf(a,b); }
long double powlFltInt(long double a,int b) { return powl(a,b); }


#define POWINT(x) _Generic((x), double: powFltInt,          \
                                float : powfFltInt,         \
                                long double: powlFltInt,    \
                                unsigned int: powIntuInt,   \
                                default: powIntInt)
#define POW(x,y) _Generic ((y), double: pow, float: powf, long double: powl, default: POWINT((x)) )((x), (y))
~~~
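
To see `default` at work in a smaller setting, here is a sketch of a macro that names the type of an expression. `TYPE_NAME` and `name_of_increment` are invented for this example.

```c
/* Maps the type of an expression to a readable name; unlisted types hit default. */
#define TYPE_NAME(x) _Generic((x),        \
        char:    "char",                  \
        int:     "int",                   \
        long:    "long",                  \
        double:  "double",                \
        default: "something else")

/* The controlling expression is only inspected for its type, never evaluated. */
const char *name_of_increment(int *counter) {
    return TYPE_NAME((*counter)++);   /* *counter keeps its value */
}
```

A quirk inside the quirk: `TYPE_NAME('a')` gives `"int"`, because a character constant has type `int` in C, whereas a variable declared as `char` gives `"char"`.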


Too special characters
----------------------


Let's go back in time once more, to mention one last thing.
It was a time when not all characters were as accessible as they are today on so many types of keyboards.

Keyboards didn't necessarily have compose keys.
Thus, on some of them, it was impossible to type the `#` character.

The `#` character could then be replaced by the sequence `??=`.
And for each character used in the C language but missing from the keyboard, there was a sequence starting with `??`, called a trigraph.
Another, *more readable* version based on 2 characters is called a digraph.

Here is a table summarising the trigraph and digraph sequences and their character representation.

| Character | Digraph | Trigraph |
| --------- | ------- | -------- |
|     `#`   |    `%:` |    `??=` |
|     `[`   |    `<:` |    `??(` |
|     `]`   |    `:>` |    `??)` |
|     `{`   |    `<%` |    `??<` |
|     `}`   |    `%>` |    `??>` |
|     `\`   |         |    `??/` |
|     `^`   |         |    `??'` |
|     `\|`  |         |    `??!` |
|     `~`   |         |    `??-` |
|     `##`  |  `%:%:` |          |


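The **main** difference between digraphs and trigraphs shows inside a string: trigraphs are replaced there, digraphs are not.

~~~c
puts("??= is a hashtag"); // The trigraph is replaced: prints "# is a hashtag"
puts("%:");               // The digraph is left alone: prints "%:"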
~~~
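
Digraphs are the easier ones to try, since compilers accept them without any special option (GCC, for one, only replaces trigraphs when asked to, with `-trigraphs` or a strict ISO `-std=` mode). Here is a small sketch written with digraphs only. The names `FIRST_INDEX`, `first_of` and `digraph_text` are made up.

```c
%:define FIRST_INDEX 0

/* Digraphs stand for the brackets and braces of this function. */
int first_of(const int arr<::>) <%
    return arr<:FIRST_INDEX:>;
%>

/* Inside a string literal a digraph is left alone: this returns the two
   characters '%' and ':' and not "#". */
const char *digraph_text(void) <%
    return "%:";
%>
```

Once the digraphs are replaced, `first_of` is simply `int first_of(const int arr[]) { return arr[0]; }`.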

These medieval mechanisms are still valid today in C. [^C23]
So this line of code is perfectly valid:
~~~c
??=define FIRST arr<:0]
~~~

The only use of this syntax nowadays is to obfuscate source code very easily.
With a combination of a ternary, trigraphs and a digraph, you get absolutely unreadable code 😉

~~~c
printf("%d", a ?5??((arr):>:0);
~~~

**Never use them in serious code**.

[^C23]: In C23, trigraphs have been removed from the language; digraphs are still valid.

## To conclude


That's it, I hope you've learned something from this post. Don't forget to use these syntaxes sparingly.

I would like to thank [Taurre](https://zestedesavoir.com/membres/voir/Taurre/) for validating this article in French, but also for his pedagogy on the forums for years, as well as [blo yhg](https://zestedesavoir.com/membres/voir/blo%20yhg/) for his careful proofreading.

Note that you can (re)discover a lot of code abusing the C language syntax at [IOCCC](http://ioccc.org/winners.html). 😈