migmit: (Default)
[personal profile] migmit
It's XXI century. We have flying cars now. What we also have, is a lot of research in programming languages. We have heavy books on refactoring. We have tons of "best practices". And then...

You know, a while ago I bitched about generics in Rust not being really generics, but rather templates, as they are specialized for every type they are used with. In fact, they are more limited than templates, as they don't allow partial specialization.

A few days ago I got wind that they finally woke up and implemented generics as, well, generics. I've checked immediately, and, no, we are still at the same spot.

But then another thought came: how can they specialize generic functions for types they know nothing about? I mean, if module A defines a generic function, and module B uses it with the type also defined in B, and A is compiled first — there is no possible way to specialize this function for the type defined in B, right? C++ avoids this by having all templates implemented in header files, so that they are available when compiling B — which really means that the function is NOT compiled with A (unless it uses it with some other type), but rather with B. Rust can't do that, after compiling the module you can safely remove the source code, and it will work fine.

And somebody answered my question with this: http://www.reddit.com/r/rust/comments/2c6pn0/how_does_rust_implement_calling_generic_functions/cjcio0p

Yep. They keep the source code in the compiled file.

Well, not EXACTLY the source code — the AST. But the difference is, in my book, négligeable.

Yeah. It's THAT bad.

On the other hand, there is Go — which is another new language, intended to fill the same niche as Rust — or at least a close one. It solves the problem with generics elegantly and efficiently — by not having generics. At all.

Which is fine, if there were hope that generics would emerge in some later version — like it was with Java. But it seems that Go people rather enjoy not having generics: "Lack of generics has bothered me a little, but copy and paste with a multiple-cursor editor really makes short work of it."

Yes, its XXI century, and the earth is doomed.

Date: 2014-09-15 02:13 pm (UTC)
From: [identity profile] juan-gandhi.livejournal.com
Funny.

I just don't see what's wrong with passing around the AST.

Date: 2014-09-15 04:11 pm (UTC)
From: [identity profile] migmit.livejournal.com
Do you think there is something wrong with passing around source code?

Date: 2014-09-15 04:48 pm (UTC)
From: [identity profile] nealar.livejournal.com
В общем случае - ничего. Но в "скомпилированном модуле" - это какой-то детский сад.

Date: 2014-09-15 05:21 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
Там сказали "stored in the metadata of a library". Не вижу принципиального отличия от хаскельных .hi файлов.

Date: 2014-09-15 09:11 pm (UTC)
From: [identity profile] juan-gandhi.livejournal.com
No definite opinion on this.

Date: 2014-09-16 06:01 pm (UTC)
From: [identity profile] migmit.livejournal.com
OK, how about this: RELYING on source code being passed around. So that nothing can ever work without it.

Separate compilation, anyone?

Date: 2014-09-16 06:46 pm (UTC)
From: [identity profile] juan-gandhi.livejournal.com
Well, I see your point now, and consider the practice sinful.

Date: 2014-09-15 04:06 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
So what do you think would be the right approach to implementing generics in a "systems" compiled language?

Date: 2014-09-15 04:15 pm (UTC)
From: [identity profile] migmit.livejournal.com
I have a bit vague idea of what a "systems" language is.

But I thing the best way would be to borrow Haskell's approach, when a record of functions is passed to the function as an implicit argument.

Date: 2014-09-15 04:34 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
How would inlining work then?

Date: 2014-09-15 04:50 pm (UTC)
From: [identity profile] migmit.livejournal.com
If type is completely known at compile time, then this implicit parameter is also known, and I can't see the problem with inlining it.

Date: 2014-09-15 05:00 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
Take your own example. We compile module A with a generic function f. Then we make module B with some type defined and pass it to the function f from A. Imagine we don't have source code of A at this moment. How will f be inlined?

Date: 2014-09-15 05:03 pm (UTC)
From: [identity profile] migmit.livejournal.com
It won't. That a very simple fact of separate compilation: cross-module inlining doesn't work. Which is an optimization issue, not a semantic one.

Date: 2014-09-15 05:16 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
Well, that's an unsatisfactory answer, completely unsuitable for languages like Rust in which we want to write browsers, kernels, drivers and other low-level things. This is why both Rust and GHC store definitions for inlinable functions.

"A consequence is that GHC must be able to do cross-module, and indeed cross-package, inlining. The idea is simple:

* When compiling a Haskell module Lib.hs, GHC produces object code in Lib.o and an "interface file" in Lib.hi. This interface file contains information about all the functions that Lib exports, including both their types and, for sufficiently small functions, their definitions.
* When compiling a module Client.hs that imports Lib, GHC reads the interface Lib.hi. So if Client calls a function Lib.f defined in Lib, GHC can use the information in Lib.hi to inline Lib.f."


- Simon Marlow and Simon Peyton-Jones, http://www.aosabook.org/en/ghc.html

Date: 2014-09-15 05:44 pm (UTC)
From: [identity profile] nealar.livejournal.com
Да, точно, как же драйверам жить без генериков!
browsers, kernels - немного другая история. ЧСХ, они не столь низкоуровневые.

Date: 2014-09-15 05:54 pm (UTC)
From: [identity profile] thedeemon.livejournal.com
А генерики тут не принципиальны, речь про инлайнинг и простых функций тоже.
Но в типичном коде на Расте и подобных ему генериков будет куча, ибо тайпклассы на каждом шагу. Что до низкоуровневости, то тут речь больше о требованиях к эффективности. Если простые операции вдруг будут требовать кучу непроинлайненных вызовов мелких генерик функций (вроде монадных операций), то ничего не взлетит, проект можно закрывать.

Date: 2014-09-15 07:33 pm (UTC)
From: [identity profile] nealar.livejournal.com
А генерики тут не принципиальны, речь про инлайнинг и простых функций тоже.
А попилить язык на подмножества: тут инлайним, тут не инлайним, тут рыбу заворачивали?
генериков будет куча, ибо тайпклассы на каждом шагу
Тады ой.

Date: 2014-09-16 03:31 am (UTC)
From: [identity profile] thedeemon.livejournal.com
>А попилить язык на подмножества: тут инлайним, тут не инлайним, тут рыбу заворачивали?

Страуструп улыбается и целует в макушку.

Date: 2014-09-15 07:15 pm (UTC)
From: [identity profile] migmit.livejournal.com
You do realize, that in C cross-module inlining doesn't work either, right?

Date: 2014-09-15 07:34 pm (UTC)
From: [identity profile] nealar.livejournal.com
Это возражение бьётся контраргументом "Ц - плохой язык".

Date: 2014-09-16 06:01 pm (UTC)
From: [identity profile] migmit.livejournal.com
Но удовлетворительный.

Date: 2014-09-16 02:28 am (UTC)
From: [identity profile] thedeemon.livejournal.com
"Link Time Optimization (LTO) gives GCC the capability of dumping its internal representation (GIMPLE) to disk, so that all the different compilation units that make up a single executable can be optimized as a single module. This expands the scope of inter-procedural optimizations to encompass the whole program (or, rather, everything that is visible at link time)."

And C++ inlines "generics" on source code level, as you mentioned.

Date: 2014-09-16 05:03 am (UTC)
From: [identity profile] migmit.livejournal.com
Thanks. I'll have to experiment with this before I can formulate the answer.

Anyway, "This is why both Rust and GHC store definitions for inlinable functions" is still wrong. For GHC it might be an optimization; for Rust it's semantics. It's generics just won't work otherwise.
Edited Date: 2014-09-16 05:04 am (UTC)

Date: 2014-09-16 06:03 am (UTC)
From: [identity profile] thedeemon.livejournal.com
You're right, this optimization is probably not the main reason for doing it in Rust. But it's inevitable for a language aiming to replace C++ anyway.

Date: 2014-09-16 08:27 am (UTC)
From: [identity profile] migmit.livejournal.com
I doubt it's "inevitable". Of 4 machines I have quick access to - my own VM at work, corporate VM, my home Mac and my home kinda-server - only the last one even have GCC compiled with LTO. If it's that inevitable, why isn't it everywhere already?

Date: 2014-09-16 10:40 am (UTC)
From: [identity profile] thedeemon.livejournal.com
Several reasons come to mind:
1. A lot of things being written in C++ where inlining is performed earlier, so LTO is not so important there.
2. Typical C code doesn't do so many calls to little functions as would a language full of type classes do, so again, not so crucial for C as for Rust-like.
3. Inertia.
4. Probably low gain from LTO for C. Which means C++ and other aggressively inlining and optimizing languages are bound to be faster than C. Canonical example is sorting: C++ blows C out of the water, being able to inline the comparison instead of calling a function each time.

Date: 2014-09-16 03:29 pm (UTC)
From: [identity profile] thesz.livejournal.com
LTO is dauntingly complex. They still consider it experimental (it is in works from 2007, I guess).

Date: 2014-09-16 06:03 pm (UTC)
From: [identity profile] migmit.livejournal.com
Well, my Ubuntu 14.04.1 has it compiled into gcc by default.

Date: 2014-09-16 07:22 pm (UTC)
From: [identity profile] thesz.livejournal.com
Good to know. I thought they were further away from enabling it by default.