Macros to the rescue!
In C, the only way to do metaprogramming is to use macros. Perhaps the reason we could not reify such abstractions as tagged unions until now is that C macros can only work with individual items and cannot operate on sequences thereof; put simply, macros cannot loop or recurse. Without loops or recursion at our disposal, we could not generate something from a series of functions comprising a software interface, or from a series of variants comprising a tagged union. But we can enrich the preprocessor in such a way that it becomes possible. Read on.
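To see the "cannot recurse" part in action, here is a minimal sketch in plain C. The names STR, XSTR, SUM_TO, and inner_call_survives are hypothetical and exist only for this illustration. The macro SUM_TO mentions itself in its own body; the preprocessor expands the outer invocation once and then refuses to expand the inner one, so the leftover call stays in the output as ordinary text.

```c
#include <assert.h>
#include <string.h>

/* Two-step stringification: XSTR expands its argument, then STR quotes it. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Hypothetical macro that tries to call itself. */
#define SUM_TO(n) n + SUM_TO(n - 1)

/* Text of SUM_TO(3) after the preprocessor is done with it. */
static const char *const sum_to_expansion = XSTR(SUM_TO(3));

/* Returns 1 if the inner SUM_TO(...) was left unexpanded, 0 otherwise. */
int inner_call_survives(void) {
    assert(sum_to_expansion != NULL);
    return strstr(sum_to_expansion, "SUM_TO") != NULL;
}
```

Calling inner_call_survives() yields 1: the string still contains the name SUM_TO, which means the expansion stopped after a single step instead of unrolling down to zero.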
Metalang99 is the solution I came up with.
Metalang99 is a language to write macros. Sorry, it is a language to write recursive macros!
What was previously impossible soon became possible:
datatype(
    BinaryTree,
    (Leaf, int),
    (Node, BinaryTree *, int, BinaryTree *)
);

int sum(const BinaryTree *tree) {
    match(*tree) {
        of(Leaf, x) return *x;
        of(Node, lhs, x, rhs) return sum(*lhs) + *x + sum(*rhs);
    }

    return -1;
}
Adapted from Datatype99, a library for tagged unions.
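For comparison, this is roughly what the same tree looks like when the tagged union is written by hand in plain C. It is only a sketch of the manual pattern, not the code that Datatype99 actually generates; the names BinaryTreeTag, LeafTag, NodeTag, the field layout, and sum_example are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BinaryTree BinaryTree;

/* The tag records which member of the union is currently valid. */
typedef enum { LeafTag, NodeTag } BinaryTreeTag;

struct BinaryTree {
    BinaryTreeTag tag;
    union {
        int leaf;
        struct {
            BinaryTree *lhs;
            int x;
            BinaryTree *rhs;
        } node;
    } data;
};

int sum(const BinaryTree *tree) {
    assert(tree != NULL);
    switch (tree->tag) {
    case LeafTag:
        return tree->data.leaf;
    case NodeTag:
        return sum(tree->data.node.lhs) + tree->data.node.x +
               sum(tree->data.node.rhs);
    }
    /* Invalid input: the tag matches no known variant. */
    return -1;
}

/* Builds Node(Leaf 1, 2, Leaf 3) on the stack and sums it. */
int sum_example(void) {
    BinaryTree l = {.tag = LeafTag, .data.leaf = 1};
    BinaryTree r = {.tag = LeafTag, .data.leaf = 3};
    BinaryTree n = {.tag = NodeTag, .data.node = {&l, 2, &r}};
    return sum(&n);
}
```

Here sum_example() returns 6. Nothing stops a careless reader of this struct from touching data.node while the tag says LeafTag; keeping the tag and the payload in sync is exactly the bookkeeping that a datatype-style macro takes over.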
Oh, sorry again, I am a bit sleepy today. I forgot one crucial detail: to make the code above work, you must #include <datatype99.h>. Let me correct myself this time:
#include <interface99.h>

#define Shape_IFACE                      \
    vfunc( int, perim, const VSelf)      \
    vfunc(void, scale, VSelf, int factor)

interface(Shape);

typedef struct {
    int a, b;
} Rectangle;

int  Rectangle_perim(const VSelf) { /* ... */ }
void Rectangle_scale(VSelf, int factor) { /* ... */ }
impl(Shape, Rectangle);

typedef struct {
    int a, b, c;
} Triangle;

int  Triangle_perim(const VSelf) { /* ... */ }
void Triangle_scale(VSelf, int factor) { /* ... */ }
impl(Shape, Triangle);
Adapted from Interface99, a library for software interfaces.
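To appreciate what the macros save you from, here is a hand-rolled sketch of the same idea in plain C: a table of function pointers plus a pointer to the object. It is not the literal expansion of Interface99; the names ShapeVTable, Rectangle_as_Shape, Triangle_as_Shape, scaled_perim, and the two demo functions are assumptions made for this illustration, and the function bodies are filled in with the obvious arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* The interface: one function pointer per operation. */
typedef struct {
    int (*perim)(const void *self);
    void (*scale)(void *self, int factor);
} ShapeVTable;

/* A "fat pointer": the object together with its table of operations. */
typedef struct {
    void *self;
    const ShapeVTable *vptr;
} Shape;

typedef struct { int a, b; } Rectangle;

static int Rectangle_perim(const void *self) {
    const Rectangle *r = self;
    return (r->a + r->b) * 2;
}
static void Rectangle_scale(void *self, int factor) {
    Rectangle *r = self;
    r->a *= factor;
    r->b *= factor;
}
static const ShapeVTable Rectangle_as_Shape = {Rectangle_perim, Rectangle_scale};

typedef struct { int a, b, c; } Triangle;

static int Triangle_perim(const void *self) {
    const Triangle *t = self;
    return t->a + t->b + t->c;
}
static void Triangle_scale(void *self, int factor) {
    Triangle *t = self;
    t->a *= factor;
    t->b *= factor;
    t->c *= factor;
}
static const ShapeVTable Triangle_as_Shape = {Triangle_perim, Triangle_scale};

/* Works for any Shape without knowing the concrete type behind it. */
int scaled_perim(Shape shape, int factor) {
    assert(shape.self != NULL && shape.vptr != NULL);
    shape.vptr->scale(shape.self, factor);
    return shape.vptr->perim(shape.self);
}

int rectangle_demo(void) {
    Rectangle r = {2, 3};
    Shape s = {&r, &Rectangle_as_Shape};
    return scaled_perim(s, 2);
}

int triangle_demo(void) {
    Triangle t = {3, 4, 5};
    Shape s = {&t, &Triangle_as_Shape};
    return scaled_perim(s, 2);
}
```

rectangle_demo() returns 20 and triangle_demo() returns 24. Every new implementer needs its own casts and its own table kept in the right order by hand; that repetitive part is what an interface-style macro is meant to generate.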
Everything is correct now.
Everything you need to make it work is a one-liner #include <interface99.h>.
Believe it or not, this little detail is the exact purpose of the preprocessor. Let me explain: preprocessor macros are embedded into the language for a reason. After all, macros are just a means of code generation, so why can we not generate code using external tools, given that they are often more advanced and so on? Because aside from being “advanced and so on”, they are also less natural.
What is wrong with external codegen?
The thing is that with native macros, you can interleave their invocations with the actual code, the business logic, right in the files in which you usually write your code. With third-party code generators, you cannot. You can only fscanf some code from file.blah and fprintf the generated code to generated.h.
Okay, even if you had a ready C parser to read macro invocations of the form X(...) directly from source.c, where X is defined as #define X(...) /* Consume all arguments! */ so as not to break the real compilation, where would you generate code? Please, do not tell me that you are going to fprintf right into source.c! Because, you know, the placement of functions and types matters a great deal in C, and you cannot simply fprintf the generated code for X(...) to generated.h and include it in source.c. Things might break: the generated code may depend on declarations that precede the invocation, and the code after the invocation may depend on what was generated. And yes, you cannot just swallow the whole source.c and output source-generated.c somewhere because your IDE would then unironically say “goodbye, good luck” to you; at the very least, constructions generated by such macros would no longer be visible when you write code.
That is, with third-party code generators, you are forced to separate the files in which you write ordinary code from the files to be fed to the code generator.
With native macros, you write code as usual.
With native macros, you do not violate the normal order in which linguistic constructions cooperate with each other. When you write struct Vect { ... }, you write it in the same file as Vect_add, Vect_remove, and so on. Why, then, should you write datatype(T, ...) in a separate file when it is also a linguistic construction? Elaborating further, why should we treat software interfaces as an alien spacecraft fallen to Earth?
With Datatype99 and Interface99, you generate the stuff in-place. Tagged unions and software interfaces are the kind of abstractions that ought to be considered parts of the host language, i.e., C. Therefore, they should be treated in the same way as we treat struct, union, functions, and variables.
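The in-place property is not unique to Metalang99; even the vanilla preprocessor has it. As a small illustration, here is the classic "X-macro" trick, which is a different and much simpler technique than the one Datatype99 and Interface99 rely on. The names COLORS, AS_ENUM, AS_STRING, Color, and color_name are hypothetical. The list of items, the generated enum, the generated name table, and the ordinary function that uses them all live side by side in one file.

```c
#include <assert.h>
#include <string.h>

/* The single source of truth: a list of items applied to a macro X. */
#define COLORS(X) X(Red) X(Green) X(Blue)

/* Generated in place: enum { Red, Green, Blue, ColorCount }. */
#define AS_ENUM(name) name,
typedef enum { COLORS(AS_ENUM) ColorCount } Color;

/* Generated in place: { "Red", "Green", "Blue" }. */
#define AS_STRING(name) #name,
static const char *const color_names[] = {COLORS(AS_STRING)};

/* Ordinary code, written right next to the generated declarations. */
const char *color_name(Color c) {
    assert((int)c >= 0 && (int)c < ColorCount);
    return color_names[c];
}
```

color_name(Green) returns "Green", and adding a fourth colour means touching only the COLORS line. An external generator would need a separate input file and a separate output file to do the same.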
No, I am not claiming that external codegen is useless. It has applications in a build process and other areas; for example, sometimes it is perfectly fine to separate files (OpenSSL, n.d.). What I am trying to convey is that you should use the right tool for the job. But wait, the suggested libraries rely on some heavy-duty macros, and it is crystal clear that the vanilla C preprocessor is not meant for this kind of abuse, right?
This is the turning point of our spontaneous discussion.
The side effects of aggressive macros
Instead of thinking philosophically, I encourage you to think pragmatically.
Instead of thinking about what is good and what is bad, I encourage you to think about benefits and possible side effects.
The benefits include more concise, safe, clean code.
The side effects might include scary compilation errors and preposterous compilation times.
Not really.
When I started designing Metalang99, I was aware of how metaprogramming can go insane. Metalang99 is an attempt to make it less insane. With some unhealthy curiosity, you might accidentally summon Satan, and he will kindly produce gigabytes of error messages for you, dear. Not kidding, I have experienced it myself:
In the above error, I asked a compiler to show a full backtrace of macro expansions. Most of the time, it is just a senseless bedsheet of macro definitions, so I always turn it off with -ftrack-macro-expansion=0 (GCC) or -fmacro-backtrace-limit=1 (Clang).
But how to produce errors that people understand?
This question is out of the scope bla-bla-bla. I will just show you some real errors you can get from Datatype99 real quick:
Looks nice?
I know how to break this wonderful world. Look:
$ gcc playground.c -Imetalang99/include -Idatatype99 -ftrack-macro-expansion=0
playground.c:3:1: error: static assertion failed: "invalid term `ML99_PRIV_IF_0 ~(ML99_PRIV_listFromTuplesError, ML99_PRIV_listFromTuplesProgressAux) (DATATYPE99_PRIV_parseVariant, 2, (Foo, int) ~, (Bar, int), ~)`"
    3 | datatype(A, (Foo, int) ~, (Bar, int));
      | ^~~~~~~~
Looks less nice?
Bad news: it is impossible to handle all kinds of errors in macros gracefully. But we do not need to handle all of them. It would be sufficient to handle most of them. Now I shall convince you that even Rust, a language that sells itself as a language with comprehensible errors, even Rust sometimes produces complete nonsense:
(Kindly given by Waffle Lapkin.)
(I believe some of them were on stable Rust.)
Even so, most of the time, Rust performs well enough.
Even so, most of the time, Datatype99 & Interface99 perform well enough.
Rust exemplifies perfectly that a system need not be ideal to be practically useful. The same holds for the macros: I rarely see complete nonsense from my macros, but whether you like it or not, it might happen. Surely, it is not a reason to abandon the whole approach; as you can see, your computer is still there, your terminal did not die under tons of error messages, and all you need to do is carefully look at the macro invocation and perhaps run your compiler with -E [2]. The funny fact is that even in Rust, I was forced to cargo-expand some macros several times to get a sense of what was wrong, so why is no one saying that Rusty macros are totally unusable?
Regarding compilation times, they are just fine.
Final words
Let me sum up.
The purpose of the preprocessor is to enable seamless integration.
The purpose of the preprocessor is to allow your macros to be conveniently interleaved with the rest of your code.
The purpose of the preprocessor is not to break the normal order in which linguistic abstractions cooperate with each other.
The purpose of the preprocessor is to be natural… and this is what external codegen cannot offer, no matter how you try.
Links:
- Installation instructions for Metalang99, Datatype99, Interface99.
- The motivational post for the above libraries.
- The mailing list for the above libraries. Join and talk with us!
Afterword
I wrote this blog post as a single answer to remarks like “Your macros are nice, but the preprocessor is a wrong tool for them”. I find it contradictory when people appeal to one unpleasant aspect to dismiss the whole approach, given that we have several examples in the industry where the method works just fine.
Before I started to implement Datatype99, I searched for prior art. There were only such projects as adt4c. The problem with them is that they use external code generation; you can just compare how the same functionality is done using this approach and using Datatype99. Additionally, I made Datatype99 minimalistic, meaning that it provides only ADTs and nothing more – adt4c, on the contrary, comes with a type polymorphism implementation.
Before I started to implement Metalang99, I tried to use Boost/Preprocessor (hirrolot, n.d.a). Owing to its fundamental limitations, I abandoned the project. These drawbacks include macro bluepainting (when the preprocessor blocks macro recursion) and very poor diagnostics. I fixed both of them in Metalang99.
Some people state that these macros (Datatype99 & Interface99) are opaque, meaning you do not know what they generate. It is simply not true; from the very beginning, I stated explicitly what they do generate (hirrolot, n.d.b, n.d.d); I do not try to fool you with a nifty interface and a bloody cannibalistic massacre under the hood. I designed these libraries in such a way that they do not even require libc, they are FFI-friendly, and they do not impose any restrictions on your environment.
I do encourage you to use such well-documented and well-tested libraries as Datatype99 & Interface99, but I discourage you from polluting your codebase with fancy macro-based DSLs implemented upon Metalang99. Currently, I do not see much need for Metalang99 except these libraries; they provide a facade for your APIs, so reuse them, and do not try to express something by defining more macros if you can do that with already existing linguistic abstractions. If you nonetheless decide to use the recursive macros of Metalang99, at least use them sanely: always prove to yourself why you need new macros instead of ordinary functions and/or just simpler macros, and you will thank me one day. Macros can only be a solution when you have “run out of your language”; please, do not.
If you ask me, “Why use C if there are Zig/Rust/C++/etc?”, then you can afford another language; this is cool but not always possible. You can find the information elsewhere on the Internet (Simucal, n.d.; hirrolot, n.d.c; Drew DeVault, n.d.).
References
1. I believe that the C preprocessor was initially put into the language as a temporary workaround. With the preprocessor, you can do conditional compilation, foreach-macros, generics, etc. Nowadays, most of this stuff is done by “the right tools” but back in the 70’s, it was unclear how to solve such problems.
2. -E stands for “preprocess only”. It is supported at least by GCC and Clang but other compilers should have the same option as well (probably under a different name).