
SELF: Anatomy of an (alleged) failure


June 23, 2010

This article was contributed by Joe 'Zonker' Brockmeier.

Like most community-run events, the second SouthEast LinuxFest (SELF) featured the standard set of positive talks on Linux and open source. It also featured a somewhat more controversial talk by Ryan "icculus" Gordon about failed attempts to get features merged into the Linux kernel.

[Ryan Gordon]

The talk was delivered without slides, and Gordon started by admitting that the talk was biased — and unlikely to win him many friends in the kernel community. The focus of the talk, which was not terribly obvious from the schedule, was on several high-profile attempts to add new features to the Linux kernel that failed, due — according to Gordon — to kernel politics or personality conflicts. He said he had spoken to a number of kernel developers who had experienced such failures, but few were willing to go "on the record" for the talk. As examples he used his own experiences, Eric S. Raymond's CML2, Con Kolivas's scheduler work, and Hans Reiser's Reiser4 filesystem.

Gordon is behind icculus.org, and says that he spends most of his time porting video games to Linux. In the course of working with Linux, Gordon says that he discovered that "Linux sucks at a lot of important tasks". He noted that Apple has solved a number of the things that Linux does poorly (though he conceded that Mac OS X also does many things badly), and that Linux developers should be "stealing some stuff" from Apple. Gordon pointed to Time Machine backups and to universal binaries, which allow users to install software on PowerPC or Intel-based machines and have it "just work".

Gordon wasn't using universal binaries as an idle example; he attempted to solve that problem himself by creating FatELF, a universal binary format for Linux. He described the process of creating the patch, sending it to the kernel list for acceptance, and expecting success. Instead, he said, it was a "spectacular failure".

The problem? Gordon says that he misinterpreted a response from Alan Cox as being in favor of the patch, when actually Cox was against it. According to Gordon, "the worst thing you can do is have a kernel maintainer tell you what they don't like and ignore it. Once Alan Cox was openly hostile, people came out of the woodwork." He then drew an analogy between the kernel community and the movie Mean Girls, likening Cox to one of the popular girls, with other community members following his example. Gordon says that the kernel community has a "herd mentality" and that "you can't move the kernel maintainers".

After recounting his own experience with FatELF, Gordon talked about Raymond and the problems getting CML2 — a replacement for the kernel configuration system — into the Linux kernel. Gordon highlighted the fact that CML2 going into the kernel seemed a foregone conclusion to Raymond, and that it was originally supposed to go into the 2.5.1 kernel. He also talked at length about Raymond and Linus Torvalds's discussions about CML2 and Torvalds's general agreement that CML2 could go into the kernel.

Gordon said that the kernel community was hostile towards Python and Raymond, but little was mentioned about Raymond's sometimes caustic responses to the kernel developers. Gordon also mentioned a little, but very little, of the technical problems with CML2, such as the 30-second wait to compile its kernel-rules.cml into a binary format for use with the cmlconfigure program. He did discuss the — perhaps not entirely reasonable — objection that CML2 was a major change to the kernel, where the maintainers typically prefer a set of small incremental changes. He wondered: how does one implement a new configuration language as an incremental change?

One might question the wisdom of using Hans Reiser as an example of the kernel development process gone wrong, as Gordon did. Reiser had a contentious relationship with the kernel community to begin with, and the fact that it was necessary to cite correspondence from Reiser's prison cell tends to undermine his credibility. But Gordon is, at least, to be credited with being thorough in interviewing his sources; he showed several letters from Reiser, gathered over months of correspondence, about the failure to get Reiser4 into the mainline kernel.

He described the problems that Reiser had in attempting to get Reiser4 into the kernel. In an unfortunate analogy, he said that Reiser failed to get a "fair trial" from the kernel community in its consideration of Reiser4. Gordon also hinted that corporate interests may have sabotaged Reiser4's adoption into the mainline — a longstanding contention of Reiser's — because of conflicting interests on the part of the maintainers, though he cited no evidence for it. Gordon painted the kernel community as abusive towards Reiser, but omitted any mention that Reiser could also be caustic in return. Certainly, Gordon's point that the FAQ on KernelNewbies is unnecessarily personal is well-taken.

But little was said about any of the real technical problems with Reiser4. Personality issues aside, real technical problems stood (and still stand) in the way of merging Reiser4 into the mainline kernel.

Due to time constraints, Gordon rushed through the discussion of Kolivas's failure to get his scheduler work into the kernel. He characterized Kolivas's treatment as "rude", and suggested that it was particularly bad treatment that Kolivas's ideas made it into the kernel (in the form of the Completely Fair Scheduler) while his actual code didn't.

Gordon ended his talk by throwing out a few ideas to improve the kernel development process. He suggested that the audience, and other Linux users, join the kernel mailing list and lobby for features that they want. He also challenged the idea that developers should have a "thick skin" to participate in kernel development, and suggested that the atmosphere of lkml needs to improve.

Despite the focus on the "spectacular failures", Gordon did acknowledge that this was a sporadic problem at worst — not a systemic failure. He also took great pains to say, both during and after the talk, that everyone he spoke with held Andrew Morton in the highest regard. Gordon said that developers should "study Andrew Morton" with great intensity.

The talk was interesting because it provided a different view of the kernel development process than is normally given to general audiences. The kernel development community, and the larger FOSS development community, understand that the development process is imperfect. It is well known that it can be political and personal, as Torvalds himself pointed out in response to Kolivas's departure. The SELF audience, by and large, was composed of Linux enthusiasts who are far removed from the development process. It will be interesting to see if any of the audience decides to take Gordon's advice and begin lobbying the kernel list.

Kernel development is often held up to larger audiences as a shining example of the open source development process. Gordon's talk presented a narrative that demonstrates the less pleasant side of kernel hacking, and the disappointment that developers feel when their contributions are rejected. It might have been more valuable had it presented a less biased view, but it is a story that should be heard.


Index entries for this article
GuestArticles: Brockmeier, Joe
Conference: SouthEast LinuxFest/2010



SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 19:19 UTC (Wed) by Kamilion (subscriber, #42576) [Link] (28 responses)

It's a shame, because I really would have liked to have the FatELF system. I have a large USB flash drive that I move between several systems. Right now, I'm stuck with a 32-bit Lucid simply because I can't boot 64-bit on every system. It would have been nice to build in PPC and ARM support too.

Storage is cheap -- time is not always.

Don't give up, Ryan! Perseverance and patience is rewarded eventually.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:01 UTC (Wed) by Lovechild (guest, #3592) [Link] (23 responses)

I remain very excited about FatELF, and I hope that the general idea of a universal binary will still arrive at some point in the near future. It is a really useful feature for reaching beyond where we are now and making things just work.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:35 UTC (Wed) by dlang (guest, #313) [Link]

before you worry about getting a binary that will work on ARM, i386, PowerPC, AMD64, Itanium, etc., a bigger issue is that even if you stick with a single architecture you end up with different libraries on different systems, so you frequently need different binaries to support the different options they use.

even within a single distro you have different versions using different libraries depending on what features are compiled in (does it use mysql or sqlite for a database for example)

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:56 UTC (Wed) by aliguori (subscriber, #30636) [Link] (1 responses)

If you install qemu's linux-user support, then you can run non-native Linux binaries just as you run native Linux binaries. This works by installing a binfmt_misc hook and requires no new kernel support. That said, we can't get a single binary that works universally across Linux distributions on a single architecture, so I'm not sure that multi-arch binaries really make that much sense.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:26 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

What about /lib/ld-linux.so.2 ?

BTW: At least on Debian Squeeze / Sid, you just need to use:

aptitude install qemu-user-static binfmt-support

The rest of the setup is done automagically. But then again, if you want to use this for non-static binaries, you need to set up a chroot.
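To see roughly what that sets up, here is a hedged sketch of how to inspect it afterwards (registration names can differ slightly by distro; hello-armel stands in for whatever foreign-architecture binary you happen to have):

update-binfmts --display | grep -A2 '^qemu-'   # handlers registered by binfmt-support
cat /proc/sys/fs/binfmt_misc/qemu-arm          # the kernel's view of one registration
./hello-armel                                  # an ARM binary now runs through qemu-arm-static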

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:45 UTC (Wed) by xav (guest, #18536) [Link] (17 responses)

FatELF is a dead-on-arrival project. Nobody wants to pay the price of compiling an executable as many times as there are architectures.
And then there are more differences than just the architecture. What's the use of a Fedora ARM binary, or an Ubuntu Itanium one? If the matching distro doesn't exist, it's a waste.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 1:08 UTC (Thu) by cesarb (subscriber, #6266) [Link] (10 responses)

What would be interesting would be a bytecode architecture, like some sort of LLVM IR, compiled via a JIT at run time.

This way the same "executable" would run on ARM, x86-32, x86-64...

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 9:50 UTC (Thu) by tzafrir (subscriber, #11501) [Link] (1 responses)

Someone already has. It's called qemu.

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 0:11 UTC (Sat) by waucka (guest, #63097) [Link]

If you want to do JIT, you should do it on an intermediate representation (IR) designed for the purpose. Deliberately using x86 (or any native code, really) for that purpose is ridiculous. Besides, we wouldn't necessarily have to do JIT all the time. With a good IR, we could have live CDs and USB sticks use JIT and convert to native code at install-time.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 14:09 UTC (Thu) by pcampe (guest, #28223) [Link] (1 responses)

>What would be interesting would be a bytecode architecture [...]

We have something definitely better: source code.

Allowing the distribution of a (more or less) universally-runnable, self-sustained and self-contained component will ultimately make it easier to distribute closed-source programs, which is something we should resist.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:20 UTC (Thu) by Spudd86 (guest, #51683) [Link]

yes, but unless you run a source distro like Gentoo you may not have the dev files for everything on your system, and lots of users are fairly averse to compiling things themselves. Plus, a source distro can hit problems that DO NOT EVER hit binary distros, including misbehaving build systems and "automagic" dependencies (whether a package ends up depending on another package changes based on whether that package happens to be installed at build time, with no way to turn the behavior off).

Also people are going to want to use pre-compiled code, and most people don't really want to learn how to package their stuff for every distro ever, let alone actually compile it 20 or 30 times.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:13 UTC (Thu) by Spudd86 (guest, #51683) [Link] (1 responses)

If you go look at some of the older LLVM papers, they pretty much describe doing this... (I don't know if anyone implemented it, but given that they DO have a JIT compiler for LLVM IR already, I think you could probably already do this in a limited form; see http://llvm.org/cmds/lli.html for lli, the current LLVM command that will run an LLVM IR bytecode file with the LLVM JIT.)

The papers talk about profiling and optimizing the IR and writing that back to the binary, so you get a binary optimized for your workload.

This still has the issue of library incompatibilities across architectures (even within the same distro), since the library may not have all the same options compiled in, or may export a slightly different set of symbols, or all kinds of other things...
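To make the lli route mentioned above concrete, a minimal sketch (assuming clang and the LLVM tools are installed; hello.c is any small C program):

clang -O2 -emit-llvm -c hello.c -o hello.bc      # build LLVM bitcode instead of native code
lli hello.bc                                     # JIT-compile and run the bitcode directly
llc hello.bc -o hello.s && cc hello.s -o hello   # or translate to native code "at install time"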

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 16:03 UTC (Sun) by nix (subscriber, #2304) [Link]

IIRC this is currently being done by ClamAV (using LLVM, natch).

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 21:20 UTC (Thu) by dw (subscriber, #12017) [Link] (1 responses)

This has been tried many times, including (but not limited to) TDF (http://en.wikipedia.org/wiki/TenDRA_Distribution_Format) and ANDF (http://en.wikipedia.org/wiki/Architecture_Neutral_Distrib...).

I believe ANDF was the basis for some failed UNIX standard in the early 90s, but that's long before my time.

There's at least one more recent attempt along the same lines (forgotten its name).

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 16:13 UTC (Mon) by salimma (subscriber, #34460) [Link]

GNUstep? also, the ROX (RISC OS on X) Desktop.

SELF: Anatomy of an (alleged) failure

Posted Jul 1, 2010 7:45 UTC (Thu) by eduperez (guest, #11232) [Link]

> What would be interesting would be a bytecode architecture, like some sort of LLVM IR, compiled via a JIT at run time.
> This way the same "executable" would run on ARM, x86-32, x86-64...

You mean, like... Java?

SELF: Anatomy of an (alleged) failure

Posted May 4, 2012 19:10 UTC (Fri) by ShannonG (guest, #84474) [Link]

This is why the kernel mailing list is hostile.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 2:18 UTC (Thu) by PaulWay (guest, #45600) [Link] (2 responses)

I think you're seeing it as a solve-everything idea, where really it's a solve-specific-things idea.

Obviously installing a system with every binary containing code for every possible architecture is going to be horribly large. But that's not what you use FatELF for.

Imagine, however, a boot CD or USB key that can boot and run on many architectures. That would be a case where the extra space used would be compensated by its universality. A live or install CD could then drag architecture-specific packages from the relevant distribution. A system rescue CD would work anywhere. You wouldn't worry about the overhead because the benefit would be one medium that would work (just about) everywhere. Likewise, an application installer could provide an initial FatELF loader that would then choose from the many supplied architecture-specific binaries to install.

In these circumstances I think FatELF makes a lot of sense. And, as Apple seems to be proving, the overhead is something that people don't notice (or, at least, are willing to cope with).

Have fun,

Paul

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 20:44 UTC (Thu) by vonbrand (guest, #4458) [Link]

If it really were for "many architectures" (how many do you even see in a normal week? For me it's 3: x86_64, i386, SPARC64; very rarely a PPC Mac. And of those I'd futz around with x86_64 and i386 almost exclusively.) it would be at most some 100MiB for each architecture on a normal CD. Better to get a USB thumb drive for each (or carry 5 or 6 CDs around).

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 1:49 UTC (Fri) by dvdeug (subscriber, #10998) [Link]

Can you even get one CD to boot on both ix86 and ARM or PowerPC? Even if you can, along with getting the right kernel to boot up, you should be able to symlink in the correct binary directories for the architecture.

I'm having a lot of trouble finding any case where FatELF can't be replaced by shell scripts and symlinks. You want to support both ix86 and AMD-64 with your game; have the executable name run a shell script that runs the right binaries.
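Such a wrapper is only a few lines; a minimal sketch of the idea (the bin/game.* names are made up for illustration):

#!/bin/sh
# pick a binary for the current machine and hand control over to it
here=$(dirname "$0")
case "$(uname -m)" in
    x86_64)  exec "$here/bin/game.amd64" "$@" ;;
    i?86)    exec "$here/bin/game.x86"   "$@" ;;
    *)       echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac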

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:34 UTC (Thu) by mjthayer (guest, #39183) [Link] (2 responses)

> What's the use of a Fedora ARM binary, or an Ubuntu Itanium one ? If the matching distro doesn't exists, it's a waste.
Speaking from personal experience, binary compatibility is a lot easier than most Linux people think. I think that the focus they have on source makes them terrified of binary issues. (Source is good of course, but that doesn't mean that binary is always bad.) I help maintain a rather complex piece of software for which we have (among other options) a universal binary installer. It was a bit of work to start with to work out what issues we had to solve (the autopackage webpage is a very good resource here, whatever you may think of their package format), but once we got past that we have had very few issues over many years.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:35 UTC (Thu) by Spudd86 (guest, #51683) [Link] (1 responses)

Yes, binary compatibility is easier than most people seem to think, but it is also very frequently done wrong (including by Mr. Gordon; I generally have to make stuff he's packaged stop using at least two or three bundled libraries before it works, if I'm not using the Gentoo ebuild of it). He tends to bundle libstdc++ and libgcc, as well as SDL. All of these have had a stable ABI for a while now, and if the ABI changes, so does the soname, so distros can ship a compat package (which they generally do); there's no need to distribute them. The only people who benefit would be people with old versions that are missing newer API that the game uses. It's irritating. (I bought the Humble Bundle, and all the games that weren't Flash-based didn't start, due to the bundled libraries causing breakage.)

These days you mostly have to worry about making sure you compile on a system that has old enough versions of everything, so that you're not using newer versions of stuff than your users will have (eg use g++ 4.3 so that Gentoo users that use the stable compiler don't have to install a newer gcc and mess about with LD_LIBRARY_PATH so your app can find the newer libstdc++). It's nice since g++ 4.4 and 4.5's libstdc++ is backwards ABI compatible with all the older ones (4.0 and later; 3.x is a separate issue, but you can have both available at once, so you just need a compat package). You don't even need to statically link many things, unless you have reason to believe they will not be packaged by your user's distro.
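One quick way to check what such a build will actually demand of users' systems; a hedged sketch (./mygame is a placeholder, and sort -V needs a reasonably recent GNU coreutils):

objdump -T ./mygame | grep -o 'GLIBCXX_[0-9.]*' | sort -uV | tail -n1   # newest libstdc++ ABI version required
objdump -T ./mygame | grep -o 'GLIBC_[0-9.]*'   | sort -uV | tail -n1   # newest glibc symbol version required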

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 18:08 UTC (Thu) by jwakely (subscriber, #60262) [Link]

> it's nice since g++ 4.4 and 4.5's libstdc++ is backwards ABI compatible with all the older ones (4.0 and later, ...

3.4 and later

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:49 UTC (Thu) by jengelh (subscriber, #33263) [Link] (1 responses)

FatELF is an excuse to provide source code.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 8:33 UTC (Thu) by dlang (guest, #313) [Link]

I think you mean that it's an excuse to NOT provide source code.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:10 UTC (Wed) by gnb (subscriber, #5132) [Link] (3 responses)

One thing I may have missed in the original lkml discussion of FatELF is why this required any significant kernel work. ELF allows a file to contain plenty of sections, the section naming is flexible, and the file can specify its own interpreter. So what does a new file format achieve that can't be done by installing an ELF program loader that knows how to find the sections for the correct arch, plus a standard ELF file that includes section data for all the supported arches and a .interp section that names said loader as its interpreter?

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 15:03 UTC (Thu) by flewellyn (subscriber, #5047) [Link]

Or, even if a new format IS necessary, why not use binfmt_misc to load a custom dynamic linker that chooses the correct sections for the architecture? That strikes me as the obvious solution.
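For illustration, registering a user-space loader for a hypothetical fat format via binfmt_misc would look roughly like this; the magic bytes and the /usr/local/bin/fatelf-loader path are placeholders, not the real FatELF values:

mount -t binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc    # if not already mounted
# format is :name:type(M = match on magic):offset:magic:mask:interpreter:flags
echo ':fatbin:M::\xfa\x70\x0e\x1f::/usr/local/bin/fatelf-loader:' > /proc/sys/fs/binfmt_misc/register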

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:39 UTC (Thu) by Spudd86 (guest, #51683) [Link] (1 responses)

FatELF isn't a new format; it's just ELF with a special section set up for multi-arch stuff, and the kernel needs to understand this so it can load such binaries properly.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:05 UTC (Thu) by gnb (subscriber, #5132) [Link]

That's the bit I don't understand. Why can't it just hand the thing off to a custom program loader, either by specifying one in the .interp section or, as others have said, by using binfmt support?

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 19:44 UTC (Wed) by bronson (subscriber, #4806) [Link] (2 responses)

I feel certain that at least two of these are crowning successes.

I have the most first-hand experience with CML2. One weekend I used it to hack together a project I called Compache. It used the CML2 GUI to configure all the settings, modules, and patches desired for a particular Apache build, then built it. (anyone remember how painful compiling a full-featured Apache 1.3 could be back in the day? Some of the worst dependency hell I've ever experienced)

In concept it worked great. The GUI was pretty and the dependencies were decent (it did require a few gross hacks, I forget why). In practice it was horrid, mostly due to CML2 limitations. Initially I had good correspondence with ESR and tried to help finish some important features but with each release it got slower, crankier, and the finish line felt further away.

In my experience Reiser3 was a failure too: it was merged before it was ready, Hans moved on to other stuff, and it took years for others to stabilize it. Since Reiser4 was going to be a significantly MORE intrusive patchset, I think the kernel team was prudent not to make that mistake again. Disclosure: I suffered data loss due to a well-known and long-lived Reiser3 bug, so I'm not the most impartial party here.

So, I see CML2 and Reiser4 as perfect examples of how the kernel process keeps questionable ideas on the sidelines until they've been shaken out. Shame about all the Aunt Tillie vitriol on LKML, of course.

I don't see much need for FatELF on any of my systems, but I definitely do see a need for a better scheduler! Wish I knew more about the CFS story.

Great article.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:50 UTC (Wed) by xxiao (guest, #9631) [Link] (1 responses)

Can't agree with you more.
CFS sucks so much that I have to patch BFS instead these days.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 6:54 UTC (Thu) by mingo (subscriber, #31122) [Link]

> Can't agree with you more. CFS sucks so much that I have to patch BFS instead these days.

Please also take a minute to report your observations to us upstream scheduler maintainers as well.

The scheduler is evolving like most other Linux core kernel subsystems: the rate of change is pretty healthy, the spectrum of contributors is broad and we try to address every bugreport and every patch that gets submitted to us.

In the past year alone we merged more than 250 patches to the upstream scheduler, written by more than 50 different authors - improving the code / adding features and addressing bugs/complaints.

Note that as of v2.6.34 there is little left of the original 'CFS' code that we merged in v2.6.22 - if you follow upstream scheduler commits you will see that Peter Zijlstra has rewritten most of it. (but if there's any bugs left in the scheduler then feel free to blame them on me ;-)

Thanks,

Ingo

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:00 UTC (Wed) by dlang (guest, #313) [Link] (6 responses)

'lobbying on the kernel list' is not very productive.

it is productive to test the things that you are interested in (and especially be willing to test them in ways other than your 'normal' workload so that you can try and find ways that they break), but you also need to be willing to test alternatives proposed by kernel folks and be able to document (not just state) why what's being proposed is better.

it's especially not useful to show up out of the blue strongly advocating the acceptance of any one feature and be completely silent on everything else

Reiser4, Con's scheduler, devfs, and wakelocks haven't had any lack of people pushing to get them accepted. In fact, they are all cases where the advocates went beyond being a benefit, and many of them got to the point where their voices actually hurt the feature.

repeated demands to get feature X added to the kernel don't work

repeated statements that it's being shipped with distro X and therefore it should be merged don't work

continued statements that 'it works for me so it should be merged' don't work.

except for the outright demands, experience from people using a particular patch is helpful, and examples of it working well for people are useful data points, but they are not enough because there is always the potential that it's not going to work for other people or other situations. The more comprehensive the testing is the more useful the report is (i.e. desktop use only is not very helpful, especially after the first dozen or so people speak up and say it helps them)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 2:02 UTC (Thu) by NicDumZ (guest, #65935) [Link]

I came here to say this. Exactly.

Lobbying a community never does good. It's annoying and just adds unnecessary noise that makes me feel like running away from the topic.
If you're unhappy with the development process, get involved, gain more importance by providing insightful reviews, and alternatives to patches that you dislike, and then people will listen.

But jumping in with "+1 I could use that" or "the review process here is awfully broken"-type comments is just not going to help

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 3:35 UTC (Thu) by vladimir (guest, #14172) [Link] (1 responses)

> repeated statements that it's being shipped with distro X and therefore it should be merged don't work

Why should this be true? After all, it's gotten heavily tested on a variety of workloads, etc., etc.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:41 UTC (Thu) by dlang (guest, #313) [Link]

stated once, it's a datapoint that has value, stated 100 times it's nagging that does more harm than good.

in no way is it a definitive statement in any case. Every distro, at some point, applies patches that are inappropriate for the long term, so just because they shipped it doesn't mean it's the right thing to do.

As an example of this, look at AppArmor. It's being shipped with Ubuntu, but the developers aren't leaning on that constantly, they are improving the code in response to complaints and getting closer to being merged. The fact that it is being shipped with ubuntu is one line in their multi-hundred line patch summary (in other words, it's not being ignored, but they aren't posting daily or even weekly asking for it to be applied because Ubuntu ships it).

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:12 UTC (Thu) by mingo (subscriber, #31122) [Link] (2 responses)

> repeated statements that it's being shipped with distro X and therefore it should be merged don't work
While i agree with the gist of your posting, i'd like to insert a qualification to this statement: if a piece of out-of-tree code is in a distribution then that certainly strengthens that code, and strengthens the case for upstream inclusion as well.

Especially if a piece of out-of-tree code is included in a big Linux distribution then upstream maintainers do not ignore it. There's reasons why distributions get big, out of the pool of literally hundreds of baby distributions - and technical incompetence is certainly not amongst those reasons.

So upstream kernel maintainers definitely must not ignore cases where a distribution chooses to include a big chunk of out-of-tree code. Distribution developers are often closer to users/customers and feel the pain of user suffering more directly than upstream maintainers.

So distribution developers asking for upstream inclusion is very much material. (And if upstream is being stupid then the requests should be repeated ;-) Many of our best features were first test-driven in distributions.

On the other hand, non-developer users of those distributions asking for inclusion, especially if they lack the technical expertise to make the case for upstream inclusion (and i suspect this was the main case you meant) is certainly counter-productive.

Thanks,

Ingo

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 8:32 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

yes, that was part of the point I was making.

"it works for me" can be a useful posting, especially if a patch hasn't had much coverage, or you have an unusual workload/machine to test it on, but there is a huge difference between "it works for me" and "it works for me so that means that you should merge it". one "it doesn't work for me" will out shout a thousand "it works for me" posts

people 'lobbying' for something tend to not provide the testing that is so useful.

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 15:09 UTC (Mon) by rjw@sisk.pl (subscriber, #39252) [Link]

Well, in fact none of the examples given in the original article are about code submitted by a distribution. All of them, and also the ones from your previous comments, fall into the common category where someone, apparently from the "outside", brings us a feature (not being a device driver) complete with bells and whistles and wants us to take it. People generally don't react well to that, which is not surprising (to me at least).

Moreover, such "complete features" often do much more than is really necessary to address the particular problem their submitters had in mind when they started to work on them. In many cases this "extra stuff" makes them objectionable. In some other cases they attempt to address many different problems with one, supposedly universal, feature which confuses things. It also often happens that the feature submitters are not willing to drop anything or redesign, because of the amount of work it took them to develop their code, so the objections cause the entire feature to be rejected eventually.

Now, if you do something that people are not going to react well to and you give them good technical reasons to object to it, you shouldn't be surprised too much when it fails in the end, should you?

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:10 UTC (Wed) by cmccabe (guest, #60281) [Link] (107 responses)

FatELF seems unnecessary. Why not just put your 32-bit binaries in one filesystem and your 64-bit ones in another, then use UnionFS to merge one or the other into your root filesystem, depending on which architecture you're on? No need for a big new chunk of potentially insecure and buggy kernel code.
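A hedged sketch of that approach with one of the out-of-tree union filesystems of the day (aufs shown here; unionfs-fuse would be similar, and the directory layout is made up for illustration):

# overlay the architecture-specific tree for this machine on top of a shared base
mount -t aufs -o br=/arch/$(uname -m)=ro:/base=ro none /merged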

The reason why Apple invented FAT binaries is because they were interested in maintaining extensive binary compatibility with their old systems. Linux has never had this policy. Binaries that worked great on Fedora Core 9 probably won't work on Fedora Core 12, or Ubuntu 9.04, or whatever.

> One might question the wisdom of using Hans Reiser as an example of the
> kernel development process gone wrong

This just might be the understatement of the day!

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:13 UTC (Wed) by jzb (editor, #7867) [Link] (2 responses)

Thanks. I was going for understated. ;-)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:27 UTC (Thu) by fuhchee (guest, #40059) [Link] (1 responses)

Isn't it ad hominem to discount someone's technical ideas merely because of homicide?

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:22 UTC (Thu) by jldugger (guest, #57576) [Link]

Ad hominem isn't always a fallacy. If the argument is that the LKML doesn't play well with others, and you use Reiser as an example, demonstrating that Reiser also doesn't play well with others makes it harder to assign blame.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:38 UTC (Wed) by drag (guest, #31333) [Link] (50 responses)

> FatELF seems unnecessary. Why not just put your 32-bit binaries in one filesystem and your 64-bit ones in another, then use UnionFS to merge one or the other into your root filesystem, depending on which architecture you're on? No need for a big new chunk of potentially insecure and buggy kernel code.

The point of it is to make things easier for users to deal with... forcing them to deal with UnionFS (especially when it's not part of the kernel and does not seem likely to ever be incorporated) and using layered filesystems by default on every Linux install sounds like a huge PITA to deal with.

Having 'fat' binaries is really the best solution for OSes that want to support multiple arches in the easiest and most user-friendly way possible (especially on x86-64, where it can run 32-bit and 64-bit code side by side).

It's not just a matter of supporting Adobe Flash or something like that; it's simply a superior technical solution at all levels, from both a user's and a system administrator's perspective.

> The reason why Apple invented FAT binaries is because they were interested in maintaining extensive binary compatibility with their old systems. Linux has never had this policy. Binaries that worked great on Fedora Core 9 probably won't work on Fedora Core 12, or Ubuntu 9.04, or whatever.

Actually, Apple is very average when it comes to backwards compatibility. They certainly are no Microsoft. The point of fat binaries is just to make things easier for users and developers... which is exactly the point of having an operating system in the first place.

Some Linux kernel developers like to maintain that they support a stable ABI for userland and brag that software written for Linux in the 2.0 era will still work on 2.6. In fact it seems that maintaining the userspace ABI/API is a high priority for them. (Much higher than for typical userland developers, anyway. Application libraries are usually a bigger problem than anything in the kernel in terms of compatibility issues.)

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:40 UTC (Wed) by dlang (guest, #313) [Link] (48 responses)

why would you need a fat binary for an AMD64 system? if you care, you just use the 32-bit binaries everywhere.

using a 64 bit kernel makes a huge difference in a system, but unless a single application uses more than 3G of ram it usually won't matter much to the app if it's 32 bit or 64 bit. there are some apps where it will matter, but those are special cases and probably not where a universal binary would be applicable.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 22:10 UTC (Wed) by drag (guest, #31333) [Link] (32 responses)

> using a 64 bit kernel makes a huge difference in a system

I do actually use a 64bit kernel with 32bit userland. With Fat binaries I would not have to give a shit one way or the other.

> but unless a single application uses more than 3G of ram it usually won't matter much to the app if it's 32 bit or 64 bit. there are some apps where it will matter, but those are special cases and probably not where a universal binary would be applicable.

Here are some issues:

* The fat binary solves the problems you run into during the transition to a 64-bit system. This makes it easier for users and Linux distribution developers to cover the multitude of corner cases. For example: installing 'Pure 64' versions of Linux for a period of time meant that you had to give up the ability to run OpenOffice.org. This is solved now, but it's certainly not an isolated issue.

* People who actually need to run 64-bit software for performance or memory reasons will have their applications 'just work' (completely regardless of whether they are 32-bit or 64-bit), with no need for complicated multilib setups, chroots, and other games that users have to play. They just install it and it will 'just work'.

* Currently, if you do not need 64-bit support you will probably want to install only 32-bit binaries. However, if in the future you run into software that requires 64-bit support, the status quo would require you to re-install the OS.

* Distributions would not have to supply multiple copies of the same software packages in order to support the arches they need to support.

* Application developers (both OSS and otherwise) can devote their time more efficiently to meet the needs of their users and can treat 64-bit compatibility as an optional feature that they can support when it's appropriate for them, rather than being forced to move to 64-bit as dictated by Linux OS design limitations.

Yeah, fat binaries only really solve 'special case' issues with supporting multiple arches, but the special cases are actually numerous and diverse. When you examine the business market, where everybody uses custom in-house software, the special cases are even more numerous than the typical problems you run into with home users.

Sure, it's not absolutely required, and there are lots of workarounds for each issue you run into. On a scale of 1-10 in terms of importance (where 10 is most important and 1 is least) it ranks about a 3 or a 4. But the point is that fat binaries are simply a superior technical solution to what we have right now, would solve a lot of usability issues, and the proposal comes from an application developer who has to deal with _real_world_ issues caused by the lack of fat binaries, and who works with software that is really desirable to a significant number of potential Linux users.

He would not have spent all this time and effort implementing FatELF if it did not solve a severe issue for him.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 22:52 UTC (Wed) by cmccabe (guest, #60281) [Link] (10 responses)

> * Currently, if you do not need 64-bit support you will probably want to
> install only 32-bit binaries. However, if in the future you run into
> software that requires 64-bit support, the status quo would require you
> to re-install the OS.

When you get a new computer, normally you reinstall the OS and copy over your /home directory. For all but a few highly technical users, this is the norm. Windows even has a special "feature" called Windows Genuine Advantage that forces you to reinstall the OS when the hardware has changed. You *cannot* use your previous install.

Anyway, running a Linux installer and then doing some apt-get only takes an hour or two.

> * Application developers (both OSS and otherwise) can devote their time
> more efficiently to meet the needs of their users and can treat 64-bit
> compatibility as an optional feature that they can support when it's
> appropriate for them, rather than being forced to move to 64-bit as
> dictated by Linux OS design limitations.

FATELF has nothing to do with whether software is 64-bit clean. If some doofus is assuming that sizeof(long) == 4, FATELF is not going to ride to the rescue. (Full disclosure: sometimes that doofus has been me in the past.)

> He would not have spent all this time and effort implementing FatELF if
> it did not solve a severe issue for him.

I can't think of even a single issue that FATELF "solves," except maybe to allow people distributing closed-source binaries to have one download link rather than two. In another 3 or 4 years, 32-bit desktop systems will be a historical curiosity, like dot-matrix printers or commodore 64s, and we will be glad we didn't put some kind of confusing and complicated binary-level compatibility system into the kernel.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 0:25 UTC (Thu) by drag (guest, #31333) [Link]

> Windows even has a special "feature" called Windows Genuine Advantage that forces you to reinstall the OS when the hardware has changed. You *cannot* use your previous install.

Windows sucks in a lot of ways, but Windows sucking has nothing to do with Linux sucking also. You can improve Linux and make it more easy to use without giving a crap what anybody in Redmond is doing.

If I am your plumber and you pay me money to fix your plumbing and I do a really shitty job at fixing it, and you complain to me about it... does it comfort you when I tell you that whenever your neighbor washes his dishes, his basement floods? Does it make your plumbing better knowing that somebody else has it worse than you?

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 12:07 UTC (Thu) by nye (guest, #51576) [Link] (4 responses)

>When you get a new computer, normally you reinstall the OS and copy over your /home directory. For all but a few highly technical users, this is the norm. Windows even has a special "feature" called Windows Genuine Advantage that forces you to reinstall the OS when the hardware has changed. You *cannot* use your previous install.

I know FUD is the order of the day here at LWN, but this has gone beyond that point and I feel the need to call it:

You are a liar.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 8:26 UTC (Fri) by k8to (guest, #15413) [Link]

I'm confused. FUD is the order of the day?

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 12:12 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

Well, to be charitable, WGA is an appalling intentionally-user-hostile mess that MS keep very much underdocumented, so it is reasonable to believe that this is what WGA does without being a liar. One could simply be mistaken.

(Certainly when WGA fires, it does make it *appear* that you have to reinstall the OS, because it demands that you pay MS a sum of money equivalent to a new OS install. But, no, they don't give you a new OS for that. You pay piles of cash and get a key back instead, which makes your OS work again -- until you have the temerity to change too much hardware at once; the scoring system used to determine which hardware is 'too much' is documented, but not by Microsoft.)

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 10:03 UTC (Mon) by nye (guest, #51576) [Link] (1 responses)

For the record, my experience of WGA is as follows:

I've never actually *seen* WGA complain about a hardware change; the only times I've ever seen it are when reinstalling on exactly the same hardware (eg 3 times in a row because of a problem with slipstreaming drivers).

In principle, though, if you change more than a few items of hardware at once (obviously this would include transplanting the disk into another machine), or whenever you reinstall, Windows will ask to be reactivated. If you reactivate too many times over a short period, it will demand that you call the phone number to use automated phone activation. At some point it will escalate to non-automated phone activation where you actually speak to a person. This is the furthest I've ever seen it go, though I believe there's a further level where you speak to the person and have to give them a plausible reason for why you've installed the same copy of Windows two dozen times in the last week. If you then can't persuade them, this would be the point where you have to pay for a new license.

This is obnoxious and hateful, to be sure, but it is entirely unlike the behaviour described. The half-truths and outright untruths directed at Windows from some parts of the open source community make it hard to maintain credibility when describing legitimate grievances or technical problems, and this undermines us all.

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 13:25 UTC (Mon) by nix (subscriber, #2304) [Link]

Well, that's quite different from my experience (it fired once and demanded I phone a number where a licensing goon tried to extract the cost of an entire Windows license from me despite my giving them a key: 'that key is no longer valid because WGA has fired', wtf?).

I suspect that WGA's behaviour (always ill-documented) has shifted over time, and that as soon as you hit humans on phone lines you become vulnerable to the varying behaviour of those humans. I suspect all the variability can be explained away that way.

Still, give me free software any day. No irritating license enforcer and hackability both.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 12:28 UTC (Thu) by Cato (guest, #7643) [Link]

Windows does make it hard to re-use an existing installation on new hardware, but it is certainly possible. Enterprises do this every day, and some backup tools make it possible to restore Windows partition images onto arbitrary hardware, including virtual machines.

Linux is much better at this generally, but this ability is not unique to Linux.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 17:26 UTC (Thu) by jschrod (subscriber, #1646) [Link] (2 responses)

> When you get a new computer, normally you reinstall the OS and copy over
> your /home directory.

And if you use it for anything beyond office/Web surfing, you configure the system for a few days afterwards... (Except if you have a professional setup with some configuration management behind it, which the target group of this proposal most probably doesn't have.)

> Windows even has a special "feature" called Windows Genuine Advantage
> that forces you to reinstall the OS when the hardware has changed. You
> *cannot* use your previous install.

OK, that shows that you are not a professional. This is bullshit, plain and simple: For private and SOHO users, WGA may trigger reactivation, but no reinstall. (Enterprise-class users use deployment tools anyhow and do not come in such a situation.)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:04 UTC (Thu) by cmccabe (guest, #60281) [Link] (1 responses)

> OK, that shows that you are not a professional. This is bullshit, plain
> and simple: For private and SOHO users, WGA may trigger reactivation, but
> no reinstall. (Enterprise-class users use deployment tools anyhow and do
> not come in such a situation.)

Thank you for the correction. I do not use Windows at work. It's not even installed on my work machine. So I'm not familiar with enterprise deployment tools for Windows. I wasn't trying to spread FUD-- just genuinely did not know there was a way around WGA in this case.

However, the point I was trying to make is that most home users expect that a new computer means a new OS install. Some people in this thread have been claiming that Linux distributions need to support moving a hard disk between 32-bit and 64-bit machines in order to be a serious contender as a desktop operating system. (And they're unhappy with the obvious solution of using 32-bit everywhere.)

I do not think that most home users, especially nontechnical ones, are aware that this is even possible with Windows. I certainly don't think they would view it as a reason not to switch.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:50 UTC (Thu) by vonbrand (guest, #4458) [Link]

It is much simpler than that: Very few people do move disks from one computer to the next. And those who do have the technical savvy to handle any resulting mess.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:29 UTC (Wed) by dlang (guest, #313) [Link] (19 responses)

you can happily run 32-bit userspace on a 64-bit kernel; you already don't have to care about this.

as for transitioning, install a 64-bit system and 32-bit binaries; as long as you have the libraries on the system they will work. fatelf doesn't help you here (it may help if your libraries were all fat, but I fail to see how that's really much better than having /lib32 and /lib64; your hard drive may be large enough to double the size of everything stored on it, but mine sure isn't)

distros would still have to compile and test all the different copies of their software for all the different arches they support, they would just combine them together before shipping (at which point they would have to ship more CDs/DVDs and/or pay higher bandwidth charges to get people copies of binaries that don't do them any good)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 0:37 UTC (Thu) by drag (guest, #31333) [Link] (3 responses)

> you happily run 32 bit userspace on a 64 bit kernel, you already don't have to care about this.

I do have to care about it if, in the future, I want to run an application that benefits from 64-bit-ness.

Some operations are faster in 64bit and many applications, such as games, already benefit from the larger address space.

> (it may help if your libraries were all fat, but I fail to see how that's really much better than having /lib32 and /lib64; your hard drive may be large enough to double the size of everything stored on it, but mine sure isn't)

Yes. That is what I am talking about. Getting rid of architecture-specific directories and going with FatElf for everything.

You're wrong in thinking that having 64-bit and 32-bit support in a binary means that you're doubling your system's footprint. Generally speaking, the architecture-specific files in a software package are small compared to the overall size of the application. Most ELF files are only a few KB; only rarely do they get past half a dozen MB.

My user directory is about 4.1GB. Adding FatELF support for 32-bit/64-bit applications would probably only plump it up by 400-600 MB or so.
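A rough way to sanity-check that kind of estimate on a running system; a sketch only (breaks on filenames containing ':', and xargs may batch the list into more than one du run, so sum the 'total' lines):

# total size of everything under /usr that 'file' identifies as ELF
find /usr -xdev -type f -exec file {} + 2>/dev/null | grep 'ELF ' | cut -d: -f1 | tr '\n' '\0' | xargs -0 du -ch | grep 'total$'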

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:56 UTC (Thu) by dlang (guest, #313) [Link] (2 responses)

If it really is such low overhead and as useful as you say, why don't you (alone or with help from others who believe this) demonstrate it for us on a real distro?

take some distro (say Ubuntu, since it supports multiple architectures), download the repository (when I did this a couple of years ago it was 600GB; nowadays it's probably larger, so it may take $150 or so to buy a 2TB USB drive, and it will take you a while to download everything), then create a unified version of the distro, making all the binaries and libraries 'fat', and advertise the result. I'm willing to bet that if you did this as a plain repackaging of Ubuntu with no changes you would even be able to get people to host it for you (you may even be able to get Canonical to host it if your repackaging script is simple enough)

I expect that the size difference is going to be larger than you think (especially if you include every architecture that Ubuntu supports, not just i486 and AMD64), and this size will end up costing performance as well as having effects like making it hard to create an install CD, etc.

I may be wrong and it works cleanly, in which case there will be real ammunition to go back to the kernel developers with (although you do need to show why you couldn't just use additional ELF sections with a custom loader instead as was asked elsewhere)

If you could do this and make a CD like the ubuntu install CD, but one that would work on multiple architectures (say i486, AMD64, powerPC) that would get people's attention. (but just making a single disk that does this without having the rest of a distro to back it up won't raise nearly the interest that you will get if you can script doing this to an entire distro)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 12:12 UTC (Thu) by nye (guest, #51576) [Link] (1 responses)

>If it really is such low overhead and as useful as you say, Why don't you (alone or with help from others who believe this) demonstrate this for us on a real distro.

Because the subject of this article already did that: http://icculus.org/fatelf/vm/

It's not as well polished as it could be - I got the impression that he didn't see much point in improving it after it was dismissed out of hand.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 20:47 UTC (Thu) by MisterIO (guest, #36192) [Link]

IMO the problem with FatELF isn't that nobody proved that it was doable (because they did that), but that nobody really acknowledged that there's any real problem with the current situation.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 1:24 UTC (Thu) by cesarb (subscriber, #6266) [Link] (14 responses)

> your hard drive may be large enough to double the size of everything stored on it, but mine sure isn't

And some are very small indeed. One of my machines has only a 4 gigabyte "hard disk" (gigabyte, not terabyte). It is an EeePC 701 4G. (And it is in fact a small SSD, thus the quotes.)

There is also the Live CDs/DVDs, which are limited to a fixed size. Fedora is moving to use LZMA to cram even more stuff into its live images (https://fedoraproject.org/wiki/Features/LZMA_for_Live_Images). Note also that installing from a live image, at least on Fedora and IIRC on Ubuntu, is done by simply copying the whole live image to the target disk, so the size limitations of live images directly influence what is installed by default.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 9:06 UTC (Thu) by ncm (guest, #165) [Link] (2 responses)

This, very incidentally, is one of the reasons I object to Gnome forcing a dependency on Nautilus into gnome-session. In practice, Gnome works fine without Nautilus, once you jimmy the gnome-session package install and poke exactly one gconf entry. That saves 60M on disk, and a gratifying amount of RAM/swap. It's only arrogance and contempt that makes upstream keep the dependency.

Disconnecting nautilus from gnome session

Posted Jun 24, 2010 20:56 UTC (Thu) by speedster1 (guest, #8143) [Link] (1 responses)

I know this is off-topic... but would you mind giving a little more detail on how to remove the nautilus dependency?

Disconnecting nautilus from gnome session

Posted Jun 26, 2010 9:56 UTC (Sat) by ncm (guest, #165) [Link]

In gconf-editor, go to desktop/gnome/session/, and change required_components_list to "[windowmanager,panel]".

While we're way, way off topic, you might also want to go to desktop/gnome/interface and change gtk_key_theme to "Emacs" so that the text edit box keybindings (except in Epiphany, grr) are Emacs-style.
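Both tweaks can also be scripted; a sketch using gconftool-2 (same keys and values as above, untested):

gconftool-2 --type list --list-type string --set /desktop/gnome/session/required_components_list '[windowmanager,panel]'
gconftool-2 --type string --set /desktop/gnome/interface/gtk_key_theme Emacs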

Contempt, thy name is Gnome.

Getting back on topic, fat binaries make perfect sense for shared libraries, so they can all go in /lib and /usr/lib. However, there's no reason to think anybody would force them on you for an EEE install.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 12:30 UTC (Thu) by Cato (guest, #7643) [Link]

Nobody is suggesting that everyone should have to double the size of their binaries - most distros would use single architecture binaries. FatELF is a handy feature for many special cases, that's all.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 17:03 UTC (Thu) by chad.netzer (subscriber, #4257) [Link] (9 responses)

Which reminds me, why aren't loadable binaries compressed on disk, and uncompressed on the fly? Surely any decompression overhead is lower than a rotating storage disk seek, and common uncompressed binaries would get cached anyways. I suppose it's because it conflicts with mmap() or something.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 19:40 UTC (Thu) by dlang (guest, #313) [Link] (8 responses)

it all depends on your system.

do you have an SSD?

are you memory constrained? (decompressing requires room for both the compressed and the uncompressed image)

do you page out parts of the code and want to read in just that page later? (if so, you would have to uncompress the entire binary to find the appropriate page)

what compression algorithm do you use? many binaries don't actually compress that well, and some decompression algorithms (bzip2 for example) are significantly slower than just reading the raw data.

I actually test this fairly frequently in dealing with processing log data. in some conditions having the data compressed and uncompressing it when you access it is a win, in other cases it isn't.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 0:14 UTC (Fri) by chad.netzer (subscriber, #4257) [Link] (7 responses)

Yeah, I made reference to some of the gotchas (spindles, mmap/paging). Actually, it sounds like the kind of thing that, should you care about it, is better handled by a compressed filesystem mounted onto the bin directories, rather than some program loader hackery.

Still, why the heck must my /bin/true executable take 30K on disk? And /bin/false is a separate executable that takes *another* 30K, even though they are both dynamically linked to libc??? Time to move to busybox on the desktop...

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 0:38 UTC (Fri) by dlang (guest, #313) [Link] (4 responses)

re: size of binaries

http://www.muppetlabs.com/~breadbox/software/tiny/teensy....

A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 2:41 UTC (Fri) by chad.netzer (subscriber, #4257) [Link] (3 responses)

Yes, I'm familiar with that old bit of cleverness. :) Note that the GNU coreutils stripped /bin/true and /bin/false executables are more than an order of magnitude larger than the *starting* binary that is whittled down in that demonstration. Now, *that* is code bloat.

To be fair, getting your executable much smaller than the minimal disk block size is just a fun exercise, whereas coreutils /bin/true may actually benefit from an extent-based filesystem. :) Anyway, it's just a silly complaint I'm making, though it has always annoyed me a tiny bit.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 12:25 UTC (Fri) by dark (guest, #8483) [Link] (2 responses)

Yes, but GNU true does so much more! It supports --version, which tells you all about who wrote it and about the GPL and the FSF. It also supports --help, which explains true's command-line options (--version and --help). Then there is the i18n support, so that people from all over the world can learn about --help and --version. You just don't get all that with a minimalist ELF binary.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 15:38 UTC (Fri) by intgr (subscriber, #39733) [Link] (1 responses)

Indeed, I use those features every day! ;)

PS: Shells like zsh actually ship builtin "true" and "false" commands

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 29, 2010 23:03 UTC (Tue) by peter-b (guest, #66996) [Link]

So does POSIX sh. The following command is equivalent to true:

:

The following command is equivalent to false:

! :

I regularly use both when writing shell scripts.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 27, 2010 16:42 UTC (Sun) by nix (subscriber, #2304) [Link]

There are two separate binaries because the GNU Project thinks it is confusing to have single binaries whose behaviour changes depending on what name they are run as, even though this is ancient hoary Unix tradition. Apparently people might go renaming the binaries and then get confused when they don't work the same. Because we do that all the time, y'know.

(I think this rule makes more sense on non-GNU platforms, where it is common to rename *everything* via --program-prefix=g or something similar, to prevent conflicts with the native tools. But why should those of us using the GNU toolchain everywhere be penalized for this?)
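
For readers following the multi-call discussion: the "behaviour changes depending on what name they are run as" trick is just a switch on argv[0]. A minimal sketch in C, assuming hypothetical "true" and "false" hard links pointing at the same executable (essentially what busybox does for its applets):

    #include <libgen.h>
    #include <string.h>

    /* One binary, several names: exit with failure when invoked as
       "false", and with success otherwise (e.g. when invoked as "true"). */
    int main(int argc, char *argv[])
    {
        const char *name = (argc > 0) ? basename(argv[0]) : "true";
        return strcmp(name, "false") == 0 ? 1 : 0;
    }

Built once and hard-linked under both names, it replaces two separate ~30K executables with one.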

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 27, 2010 16:46 UTC (Sun) by nix (subscriber, #2304) [Link]

The size, btw, is probably because the gnulib folks have found bugs in printf which the glibc folks refuse to fix (they only cause buffer overruns or bus errors on invalid input, after all, how problematic could that be?) so GNU software that uses gnulib will automatically replace glibc's printf with gnulib's at configure time. (That this happens even for things like /bin/true, which will never print the class of things that triggers the glibc printf bugs, is a flaw, but not a huge one.)

And gnulib, because it has no stable API or ABI, is always statically linked to its users.

26KB for a printf implementation isn't bad.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 12:08 UTC (Sun) by nix (subscriber, #2304) [Link]

> Currently, if you do not need 64-bit compatibility you will probably want to install only 32-bit binaries. However, if in the future you run into software that requires 64-bit compatibility, the status quo would require you to re-install the OS.
So, because some distribution's biarch support sucks enough that it can't install a bunch of 64-bit dependencies into /lib64 and /usr/lib64 when you install a 64-bit binary, we need a kernel hack?

Please. There are good arguments for FatELF, but this is not one of them.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:34 UTC (Wed) by cortana (subscriber, #24596) [Link] (11 responses)

> why would you need a fat binary for a AMD64 system? if you care you just use the 32 bit binaries everywhere.

So I could use Flash.

So I could buy a commercial Linux game and run it without having to waste time setting up an i386 chroot or similar.

Both areas that contribute to the continuing success of Windows and Mac OS X on the desktop.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 2:27 UTC (Thu) by BenHutchings (subscriber, #37955) [Link] (9 responses)

FatELF might make it somewhat easier for Adobe or the game developer to distribute x86-64 binaries to those that can use them, but if they don't intend to build and support x86-64 binaries then it doesn't solve your problem.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 9:10 UTC (Thu) by cortana (subscriber, #24596) [Link] (8 responses)

FatELF would have made it easier for distributors to ship a combined i386/amd64 distro. This would have made it possible for them to ship i386 libraries that are required to support i386 web browsers, for Flash, and i386 games and other proprietary software.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 10:43 UTC (Thu) by michich (guest, #17902) [Link] (7 responses)

But what you describe already works today and FatELF is not needed for it. It's called multilib.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 11:00 UTC (Thu) by cortana (subscriber, #24596) [Link] (6 responses)

But it doesn't work today, at least on any Debian-derived distribution. You have to rely on the ia32-libs conglomeration package having picked up the right version of the library you want when it was last updated, which is not a regular occurrence (bugs asking for the libraries needed by Flash 10 are still open 2 years later).

Even if Debian did have an automatic setup for compiling all library packages with both architectures, you are then screwed because they put the amd64 libraries in /lib (with a symlink at /lib64) and the i386 libraries in /lib32. So your proprietary i386 software that tries to dlopen files in /lib fails because they are of the wrong architecture.

You could argue that these are Debian-specific problems. You might be right. But they are roadblocks to greater adoption of Linux on the desktop, and now that the FatELF way out is gone, we're back to the previous situation: waiting for the 'multiarch' fix (think FatELF but with all libraries in /lib/$(arch-triplet)/libfoo.so rather than the code for several architectures in a FatELF-style, single /lib/libfoo.so), which has failed to materialise in the 6 years since I first saw it mentioned. And which still won't fix proprietary software that expects to find its own architecture's files at /lib.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 17:44 UTC (Thu) by vonbrand (guest, #4458) [Link]

That multilib doesn't work on Debian is squarely Debian's fault (my Fedora here is still not completely 32-bit free, but getting there). No need to burden the kernel for that.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 12:31 UTC (Sun) by nix (subscriber, #2304) [Link] (4 responses)

> Even if Debian did have an automatic setup for compiling all library packages with both architectures, you are then screwed because they put the amd64 libraries in /lib (with a symlink at /lib64) and the i386 libraries in /lib32. So your proprietary i386 software that tries to dlopen files in /lib fails because they are of the wrong architecture.
I've run LFS systems with the /lib plus /lib32 layout for many years (because I consider /lib64 inelegant on a principally 64-bit system). You know how many things I've had to fix because they had lib hardwired into them? *Three*. And two of those were -config scripts (which says how old they are right then and there; modern stuff would use pkg-config). Not one was a dlopen(): they all seem to be using $libdir as they should.

This simply is not a significant problem.
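
The "$libdir as they should" point is easy to illustrate. A minimal C sketch, with a hypothetical libexample library (link with -ldl); only the hardwired call cares whether this architecture's libraries live in /lib, /lib64 or /lib32:

    #include <dlfcn.h>
    #include <stdio.h>

    #ifndef LIBDIR
    #define LIBDIR "/usr/lib"  /* normally injected by the build system, e.g. -DLIBDIR='"/usr/lib64"' */
    #endif

    int main(void)
    {
        /* fragile: assumes this architecture's libraries live in /lib */
        void *hardwired = dlopen("/lib/libexample.so.1", RTLD_NOW);

        /* robust: uses whatever directory the package was configured with */
        void *configured = dlopen(LIBDIR "/libexample.so.1", RTLD_NOW);

        printf("hardwired /lib: %s\n", hardwired ? "loaded" : "not found");
        printf("configured " LIBDIR ": %s\n", configured ? "loaded" : "not found");
        return 0;
    }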

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 13:14 UTC (Sun) by cortana (subscriber, #24596) [Link] (1 responses)

I'm very happy that you did not run into this problem, but I have. IIRC it was with Google Earth. strace clearly showed it trying to dlopen some DRI-related library, followed by it complaining about 'wrong ELF class' and quitting.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 17:48 UTC (Sun) by nix (subscriber, #2304) [Link]

Well, DRI is a whole different kettle of worms. I suspect a problem with your OpenGL implementation, unless Google Earth has a statically linked one (ew).

(Words cannot express how much I don't care about statically linked apps.)

SELF: Anatomy of an (alleged) failure

Posted Jul 10, 2010 12:31 UTC (Sat) by makomk (guest, #51493) [Link] (1 responses)

Yeah, dlopen() problems with not finding libraries in /lib32 don't tend to happen, mostly because it's just easier to do it the right way from the start and let dlopen() search the appropriate directories. (Even on pure 32-bit systems, some libraries are in /lib on some systems, /usr/lib on others, and perhaps even in /usr/local/lib or $HOME/lib if they've been manually installed.)

SELF: Anatomy of an (alleged) failure

Posted Jul 10, 2010 20:24 UTC (Sat) by nix (subscriber, #2304) [Link]

dlopen() doesn't search directories for you, does it? Programs generally want to look in a subdirectory of the libdir, anyway. Nonetheless they almost all look in the right place.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 18:43 UTC (Thu) by Spudd86 (guest, #51683) [Link]

FatELF won't have any effect on the flash situation at all. It has nothing to do with shipping one or two binaries: Adobe just doesn't care enough about 64-bit Linux to ship flash for it, and FatELF won't change that. It also won't magically make 32<->64 dynamic linking work; they are different ABIs and you'd still need a shim layer.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:55 UTC (Thu) by jengelh (subscriber, #33263) [Link] (2 responses)

>but unless a single application uses more than 3G of ram it usually won't matter much to the app if it's 32 bit or 64 bit.

Hell it will. Unless the program in question directly uses hand-tuned assembler, the 32-bit build will usually not run with SSE2, just the olde x87, which is slower, as will be any computations involving integers larger than 32 bits.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 18:08 UTC (Thu) by pkern (subscriber, #32883) [Link] (1 responses)

Which is only partly true. Look into (/usr)?/lib/i686 and you'll see libs that will be loaded by the linker in preference to the plain ia32 ones if the hardware supports more than the least common denominator. It even works with /usr/lib/sse2 here on Debian if the package has support for it (see ATLAS or speex).

But of course, normally you don't rely on newer features everywhere, since that breaks support for older machines. Ubuntu is going i686 now, and Fedora's already there, I think; if you want more optimization I guess Gentoo is the way to go, because you don't have to think portable. ;-)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 18:25 UTC (Thu) by jengelh (subscriber, #33263) [Link]

Indeed, but that is for libraries only, it does not catch code inside programs or dlopened plugins.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:43 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

If I make a multi-arch CD (i386+powerpc, for instance) I already have to work around a number of issues. The ability to use standard binaries from packages, rather than rebuilding my own as fat ones, is a Good Thing. I have to mess with a unionfs anyway, for the writable file system.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 20:47 UTC (Wed) by Frej (guest, #4165) [Link]

> FatELF seems unnecessary. Why not just put your 32-bit binaries in one filesystem, and your 64 bit ones in another? Then use unionFS to merge one or the other into your rootfs, depending on which architecture you're on. No need for a big new chunk of potentially insecure and buggy kernel code.

Assuming it's not a sin to distribute binaries, how would unionfs help when you download a binary?

>The reason why Apple invented FAT binaries is because they were interested in maintaining extensive binary compatibility with their old systems. Linux has never had this policy. Binaries that worked great on Fedora Core 9 probably won't work on Fedora Core 12, or Ubuntu 9.04, or whatever.

Well, without FatELF you need two binaries for each Fedora Core release. But of course if you just want Linux for servers and admins, FatELF won't matter that much.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:32 UTC (Wed) by RCL (guest, #63264) [Link] (17 responses)

Those who want the "year of the Linux desktop" (i.e. adoption by the wide masses) to come should treat maintaining binary compatibility (backward and/or between major distros) as the most important goal...

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:53 UTC (Wed) by dlang (guest, #313) [Link] (14 responses)

you have to establish compatibility before you can worry about maintaining it ;-)

what features are you willing to give up to get your universal compatibility?

as a trivial example, if an application needs to store some data and the upstream supports sqlite, mysql, postgresql, flat files, or various 'desktop storage' APIs, which one should the universal binary depend on? and why?

KDE and Gnome each have their own 'standard' tool for storing contact information; should Gnome users be forced to load KDE libraries and applications (or KDE users forced to use the Gnome ones) to maintain compatibility?

what if someone comes up with something new, should that be forbidden/ignored so that a universal binary can work on older systems that don't have the new software?

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 22:53 UTC (Wed) by RCL (guest, #63264) [Link] (4 responses)

I don't really have a well-thought-out solution, but more or less it's like this:

1) A single entity (with dictatorship rights) is designated to maintain core "system" in a way similar to how Linux kernel itself (or *BSD) is maintained. A new platform name is defined (or, ideally, "Linux" is redefined to mean kernel + core system).

2) The entity picks a set of core libraries which it is actually capable of maintaining (and guaranteeing backward compatibility for), and no compatible system is allowed to replace/enhance/modify them in any way (even by recompiling the kernel locally) without losing (official) compatibility and the ability to use the platform name (which should be made a trademark).

3) Versioning policy is similar to Apple or Windows: every update (other than security fixes) should have a given name and version (with means to check that from code). The platform is updated in its entirety only, bugfixes are accumulated and introduced all at once.

I think that it is sufficient for the above set of libraries to include only functionality needed to write a game (generally speaking, any application with low-latency interactive video and audio).

In some ways it is similar to creating another distro dedicated to binary stability and binary multimedia applications, but it is not intended to be a full-blown distro with its own package management and policies, just a well-defined set of binary libraries + kernel.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:16 UTC (Wed) by dlang (guest, #313) [Link] (1 responses)

this is exactly what the Linux Standard Base is attempting to do.

unfortunately in practice it just doesn't work. This may be because they don't have sufficient dictatorial powers, but nobody wants to give them that much power ;-)

as for redefining what 'Linux' means, good luck with that windmill.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:41 UTC (Wed) by RCL (guest, #63264) [Link]

I think that LSB is not exactly the same, but is much wider in scope - it even dictates the installer. And it is a certification board, not a vendor so it cannot produce/maintain the aforementioned core system.

And overall... well, I'm not going to fight for that binary compatibility. I'm a game developer, sympathetic to Linux, but my target platforms are wildly different.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:29 UTC (Thu) by sorpigal (guest, #36106) [Link] (1 responses)

This sounds an awful lot like Debian, to me.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 12:14 UTC (Sun) by nix (subscriber, #2304) [Link]

More like a sort of really crippled and inflexible FreeBSD, with all ports forced to update only when the OS has a major version number bump: if you want a bugfix you have to wait for another giant mass of features to land. Great idea, not.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:03 UTC (Wed) by drag (guest, #31333) [Link] (8 responses)

> as a trivial example, if an application needs to store some data and the upstream support sqlite, mysql, postgresql, flat files, or various 'desktop storage' APIs, which one should the universal binary depend on? and why?

Well presumably with 'Fat Binary Support' the Linux distribution will take advantage of that to provide Fat binaries for their main OS.

That way you avoid ugly hacks like maintaining separate */lib and */lib64 trees, so the application author should not have to deal with issues like that (unless I am missing some aspect of SQL database datatype differences between 32- and 64-bit arches).

A distro moving to a "fat binary support" model should simultaneously be able to support backwards compatibility with 32-bit legacy applications and be prepared for the shiny new 64-bit future, without forcing users and application developers to deal with the details.

--------------------------------

From my personal experience sharing my home directory between multiple versions of Debian with different arches (64-bit/32-bit, and PPC 32-bit), the only big compatibility issue with application storage was X-Moto and its use of sqlite to store game information. It had to do with endianness issues between x86 and PPC, but I think it was actually fixed at a later date...

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:19 UTC (Wed) by dlang (guest, #313) [Link] (7 responses)

you don't need fat binaries to run 32 bit binaries on 64 bit systems, you just need the right libraries on the system, and a fat binary doesn't help you there (if you are on a 64 bit system but only have 32 bit versions of some library that the app needs, should you run the 32 bit version?)

the OP was wanting a single fat binary that would run on every distro; doing that requires that all distros agree on what datastore to use when the application can be compiled to work with many different ones.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:49 UTC (Wed) by drag (guest, #31333) [Link] (6 responses)

> you don't need fat binaries to run 32 bit binaries on 64 bit systems,

Of course not. It just makes it more difficult and irritating for users, developers and distribution makers to support multi-arch, and forces them to play games with multiple locations and packages for the same pieces of software.

There is a reason I run a 64-bit kernel with a 32-bit userland on my Linux systems nowadays. I tried running 64-bit only and things like that, but it's a PITA to do in Linux, while 64-bit application support is trivial (for end users) in OS X...

> you just need the right libraries on the system, and a fat binary doesn't help you there (if you are on a 64 bit system but only have 32 bit versions of some library that the app needs, should you run the 32 bit version?)

Well, if I only have a 32-bit-only version of a library (instead of the vastly preferable 32/64 fat library), then that would presume that only 32-bit versions of that library exist.

Therefore a 64-bit version (or a 32-bit/64-bit 'fat binary' version) of an application that depends on that library would be impossible to have in the first place... right?

Either way it's not really any different from what we have to deal with now, but with fat binary support it would be handled intelligently by the system, whereas right now it requires a lot of manual intervention and a significant technical understanding on the part of end users to deal with these sorts of compatibility issues.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 23:53 UTC (Wed) by drag (guest, #31333) [Link]

> I tried running 64bit only

Well for a better understanding; I was running Debian while attempting to juggle both 64bit and 32bit compatibility for the various applications I needed to run. Fedora is a bit better...

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 0:59 UTC (Thu) by dlang (guest, #313) [Link] (4 responses)

I run 64-bit-only systems everywhere (although Ubuntu uses nspluginwrapper to run 32-bit flash in firefox) and don't run into problems.

I will admit I don't run binary-only software (i.e. commercial games) on most of my systems, but that's more due to the lack of commercial games available for linux than anything else.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 1:11 UTC (Thu) by RCL (guest, #63264) [Link] (3 responses)

Just by the way... Given the above discussion, what would you recommend for a developer that wants to ship a binary-only Linux game (or game-like application) and to target as wide userbase as possible?

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:59 UTC (Thu) by vonbrand (guest, #4458) [Link]

I'd guess 32 bit (for oldish machines and netbooks). But some serious gamers I know spend more on their graphics card than I do on a complete machine, so for the high end, 64 bit with 2 (or even 4) cores is probably the way to go.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 21:41 UTC (Thu) by MisterIO (guest, #36192) [Link] (1 responses)

I may be somewhat naive here, but what about 32 and 64 bit versions of .deb and .rpm packages?

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 13:50 UTC (Fri) by vonbrand (guest, #4458) [Link]

On current Fedora, you can install 32- and 64-bit versions happily (most of the time); the installed packages share the architecture-independent stuff (like manpages and whatnot). Yes, it does require some delicate juggling when building the packages to make sure said manpages and such are exactly equal, among other considerations.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 1:37 UTC (Thu) by akumria (guest, #7773) [Link]

What counts as "wide masses" in your view?

1% of the global population?

10% of all computer users?

100% of all operating systems?

In some areas it has been the "year of the Linux desktop" since 1997, for others, they are just starting.

An example is in this week's LWN. Poseidon Linux. Year of the scientific Linux desktop since 2004.

Anand

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 8:08 UTC (Thu) by jengelh (subscriber, #33263) [Link]

Linux is backward compatible - you can still run binaries once compiled for Linux 2.0, and there are even reports that some dated to 0.99 work. Ask tytso.

SELF: Anatomy of an (alleged) failure

Posted Jun 23, 2010 21:53 UTC (Wed) by Tara_Li (guest, #26706) [Link] (29 responses)

Um... FatELF - Ok, we're going to have one binary that runs on... i386, x86-64, itanium, a dozen different ARMs, Sparcs... I think Alpha support got dropped somewhere along the way... IBM Sys/390s... You know, somewhere along that line, you're dropping a 2 or 3 gigabyte binary file on my machine just to run Mozilla?

Bah. I really don't see a good case for FatELF.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 3:23 UTC (Thu) by ccurtis (guest, #49713) [Link] (5 responses)

It seems fairly plain to me. Look at all the different flavors of ARM and MIPS and VIA and A3 and Atom cores that people carry around in their handheld computers. When the day comes that you don't have to depend on an iStore or the App Market or Obj-C or Dalvik or whatever, and you just want to ship your 5MB game binary with its 500MB of textures without making your customers dig through lists of every cell phone model in existence, FatELF might actually be rather handy.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 5:01 UTC (Thu) by bronson (subscriber, #4806) [Link] (2 responses)

IF that day comes (I'm skeptical -- architectural diversity seems to be increasing), I expect kernel devs will be more receptive to it.

Until then, it seems like you're trying to merge the solution before the problem even exists.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 13:37 UTC (Thu) by ccurtis (guest, #49713) [Link] (1 responses)

I'm not necessarily arguing for FatELF, but isn't anticipating the market and having a solution before something becomes a problem the very definition of innovation?

Personally, I like the idea of having solutions rather than problems.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 17:11 UTC (Thu) by chad.netzer (subscriber, #4257) [Link]

Except when you guess wrong, and burden everyone with a worse problem. (Many examples exist, though RAMBUS jumps to mind)

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 20:02 UTC (Thu) by vonbrand (guest, #4458) [Link]

Then ship the application as a .jar (or whatever the virtual machine du jour might be) file. Problem solved.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 15:45 UTC (Fri) by intgr (subscriber, #39733) [Link]

As has been mentioned above, this problem is already solved. Shell scripts run on pretty much all Linux devices and are perfectly adequate for choosing the right binary to execute.
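
That dispatch really is a couple of lines of shell around uname -m; roughly the same idea in C, assuming hypothetical per-architecture binaries named game.x86_64 and game.i686 shipped next to the launcher:

    #include <stdio.h>
    #include <string.h>
    #include <sys/utsname.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        struct utsname u;
        char target[64];

        (void)argc;  /* unused */
        if (uname(&u) != 0) {
            perror("uname");
            return 1;
        }

        /* pick the binary matching the kernel's reported machine type */
        const char *arch = (strcmp(u.machine, "x86_64") == 0) ? "x86_64" : "i686";
        snprintf(target, sizeof(target), "./game.%s", arch);

        execv(target, argv);  /* only returns on failure */
        perror(target);
        return 1;
    }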

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 17:49 UTC (Thu) by pj (subscriber, #4506) [Link] (3 responses)

I'm less worried about Mozilla and more interested in lib* so that a FatELF-aware gcc/linker can do cross compiles easily. Ever tried doing a build for an embedded box like ppc or arm on a non-ppc or arm machine? The toolchain is *painful* because you have to make sure to link all the right libs from all the right places for the destination arch, plus you have to tell them that although on the current system they're found in /lib/arch-foo/, on the destination system they'll be in /lib ... total PITA. FatELF would provide a solution to that: all the libs are in... /lib. Done, period, end of story, picking the right segment out of the ELF file is something that the linker should do (and complain if it's not found!).

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:13 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

What will it take to create them?

Specifically, I have libfoo installed for i386 from my distro. I now want to install libfoo for mips (or even worse: the powerpc variant of the day). Does it mean I have to modify /usr/lib/libfoo.so.1 as shipped by my distro?

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 19:43 UTC (Thu) by dlang (guest, #313) [Link]

If only it worked that easily.

sometimes you need different versions of compilers for different architectures.

go read Rob Landley's blog for ongoing headaches in cross compiling.

having the results all in one file is trivial compared to all the other problems.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 15:59 UTC (Fri) by vonbrand (guest, #4458) [Link]

Your "FatELF aware toolchain" is the sum total of the separate cross-toolchains, so there is no real gain here. That said, GCC has been the cross compiler of choice for most of its life, so it has quite a set of options for doing what you want, cleanly. Not your everyday use, sure, so it can be rough going.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 23:36 UTC (Thu) by Tet (subscriber, #5433) [Link] (18 responses)

> You know, somewhere along that line, you're dropping a 2 or 3 gigabyte binary file on my machine just to run Mozilla? Bah. I really don't see a good case for FatELF.

Yeesh. Everyone is bringing up countless examples of where FatELF could be abused and claiming that it's therefore useless. But no one has mentioned that FatELF solves some very real problems, problems that I encounter on a fairly regular basis. Here's a hint: if you don't want to use fat binaries, then don't. I'll guarantee you that even if it were included upstream, Fedora/Debian/OpenSUSE/Ubuntu etc would continue to release architecture specific images. But for some of us, that's not good enough, and FatELF is one solution to the problem. If people want to suggest others, I'm all ears...

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 13:54 UTC (Fri) by vonbrand (guest, #4458) [Link] (17 responses)

Please enlighten us to the recurring problems you have that FatELF would solve.

For my part, I haven't run into any situation that didn't have a simple solution which did not involve changing the kernel and the whole buildchain. Doing so adds so much overhead that the problem would have to be humongous to make it worth my while, but you can leave that consideration out if you like.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 18:57 UTC (Fri) by Tet (subscriber, #5433) [Link] (16 responses)

> Please enlighten us to the recurring problems you have that FatELF would solve

There's only one ~/.mozilla/plugins

Since my $HOME is NFS mounted across a mix of 32-bit and 64-bit OSes, I'm basically screwed. 32-bit plugins won't work with a 64-bit Firefox and vice versa. Yes, you could argue the application should be fixed, but the same applies to gimp and to countless other apps, which means an awful lot of applications are out there to fix. If I could get a fat libflashplayer.so, for example, everything would Just Work™. I'm not suggesting that the whole OS should be fat binaries/shared libraries. But I'd like the option to use them where they make sense, as I believe they do here. Again, if you have a simple solution that doesn't involve FatELF or something similar, please let me know.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 19:35 UTC (Fri) by dlang (guest, #313) [Link] (10 responses)

before you worry about getting a fat libflashplayer.so, there first needs to be a 64-bit libflashplayer.so for you to use and merge with the 32-bit one.

users of 64-bit desktops still use the 32-bit libflashplayer.so run through an nspluginwrapper layer.

so this is not a case where FatELF would help in practice.

even in theory, firefox doesn't have to have the plugin binaries under ~/; if you don't install them there, and instead install them in one of the other places they can live, you would be able to NFS mount $HOME without a problem.

the same thing goes for any application that uses plugins. the plugin binaries should be able to be installed outside of $HOME. If they can be, then the application can work if $HOME is shared.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 21:16 UTC (Fri) by Tet (subscriber, #5433) [Link] (9 responses)

Ye gods, does it really take much effort to see past the lack of a 64-bit flash plugin (which incidentally, I do have, even if it's been discontinued by Adobe)? The same applies to any plugin. Forget that I mentioned flash, and think instead about a java plugin or an acroread plugin, or any other plugin you care to think of.

> the same thing goes for any application that uses plugins. the plugin binaries should be able to be installed outside of $HOME. If they can be, then the application can work if $HOME is shared.

Yes, but here in the real world, they're not. Even if you could find a suitable location that would be writeable by a non-privileged user, it would mean changing the applications, and as I mentioned, there are many, many of those. Simply making a fat shared library possible would be a much easier and cleaner solution, and would have negligible impact on those that didn't want or need to use it. I don't understand why so many are opposed to it.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 21:27 UTC (Fri) by dlang (guest, #313) [Link] (8 responses)

re: 64 bit flash

you do realize that the version you have is vulnerable to exploits that are being used in the wild? Adobe decided to discontinue the 64-bit version instead of fixing it.

a fat plugin is only useful if you also have fat libraries everywhere. This directly contradicts earlier posts that said not to worry about the bloat because the distros would still ship non-fat distros.

by the way, do you expect plugins to work across different operating systems as well, so that you can have your $HOME NFS mounted on Mac OS too? where do you draw the line at what you insist is needed?

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 21:40 UTC (Fri) by Tet (subscriber, #5433) [Link] (7 responses)

Yes, I do know about the vulnerabilities with 64-bit flash. But like I said, this conversation isn't about flash. Despite your claim, I don't need fat libraries everywhere. On the 32-bit machines, I would already have the corresponding 32-bit libraries installed. And on the 64-bit machines, I'd have the 64-bit libraries installed. No, I don't use OS X, nor is it relevant to this discussion. If a particular approach solves a problem (as it will here), it's probably worthwhile, even if it doesn't solve every problem. I say again, why are you so anti fat binaries/libraries?

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 22:17 UTC (Fri) by dlang (guest, #313) [Link] (5 responses)

people supporting this want a fat binary that's only fat enough for their particular system, but claim that having fat binaries would solve distribution problems because there wouldn't need to be multiple copies.

these two conflict: if you ship fat binaries for every possible combination of options, the distribution problem gets much larger; if you want one fat binary that supports every possible system, that binary is going to be substantially larger.

it's not that I am so opposed to the idea of fat binaries as that I don't see them as being that useful/desirable. the problems they are trying to address seem to be solvable by other means pretty easily, and there is not much more than hand waving over the cost.

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 7:33 UTC (Sat) by Tet (subscriber, #5433) [Link] (4 responses)

I'm not claiming fat binaries solve any particular distribution problem, nor do I believe that their existence means that fat binaries must cover every possible combination. In fact, just shipping a combined ia32 and x86_64 binary would cover 99% of the real world machines. But even if you don't want to ship a fat binary, it's not hard to envisage tools that would allow an end user to create a fat binary from two (or more) slim ones.

I've outlined a case where it would be both useful and desirable to have them, and to date, I haven't seen any sensible alternatives being proposed.

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 15:08 UTC (Sat) by vonbrand (guest, #4458) [Link] (3 responses)

Just use two binary packages, with the non-architecture-dependent stuff exactly the same, and arrange for the package manager to manage files belonging to several packages. RPM does this, and it works.

No need to screw around with the kernel, no need to have 3 versions of the package (arch 1, arch 2, fat).

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 18:20 UTC (Sat) by tzafrir (subscriber, #11501) [Link] (1 responses)

It works for rpm when all those shared files are identical in every package.

But what you want here is for rpm to be able to merge files from different packages into a single file on disk. That won't work.

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 21:49 UTC (Sat) by dlang (guest, #313) [Link]

and fat binaries won't work in cases where you need different config file options for different architectures and the config file is on a shared drive

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 16:11 UTC (Sun) by Tet (subscriber, #5433) [Link]

<paxman>
Answer the question!
</paxman>

Your "solution" doesn't solve the problem where application plugins are concerned. Firstly, the majority of them are not installed using the system package manager in the first place, and secondly, it's utterly irrelevant anyway. You can't package the architecture-specific bits separately, because the application only looks for them in one place. As I said right at the start, it would be good to fix the applications, but there are a hell of a lot of them. Fat binaries would solve the problem. Your suggestions wouldn't, without first also patching the apps.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 17:17 UTC (Sun) by nix (subscriber, #2304) [Link]

FatELF wouldn't help with OSX anyway: OSX doesn't use ELF.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 17:15 UTC (Sun) by nix (subscriber, #2304) [Link] (4 responses)

Ah, right. So we don't need completely arbitrary fat binaries at all: we need a 'fat dlopen()'.

I suspect -- though it's a kludge -- you could do this with an LD_PRELOADed wrapper around dlopen() which tweaks the filename appropriately, and slight changes to the downloading parts of e.g. firefox to put its dynamically loaded stuff in per-arch subdirectories when autodownloaded. dlopen() is not a hidden symbol so should be vulnerable to interposition.
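
A rough sketch of that interposition idea (untested, and assuming the application keeps its plugins in hypothetical per-architecture subdirectories named after uname's machine string); build with something like gcc -shared -fPIC -o archshim.so archshim.c -ldl and run the application with LD_PRELOAD=./archshim.so:

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/utsname.h>

    /* Interposed dlopen(): try "dir/<machine>/lib.so" before "dir/lib.so". */
    void *dlopen(const char *filename, int flags)
    {
        static void *(*real_dlopen)(const char *, int);
        if (!real_dlopen)
            real_dlopen = (void *(*)(const char *, int))dlsym(RTLD_NEXT, "dlopen");

        if (filename) {
            const char *slash = strrchr(filename, '/');
            struct utsname u;
            char patched[4096];

            if (slash && uname(&u) == 0) {
                /* insert the machine name (e.g. "x86_64") before the basename */
                snprintf(patched, sizeof(patched), "%.*s/%s/%s",
                         (int)(slash - filename), filename, u.machine, slash + 1);
                void *handle = real_dlopen(patched, flags);
                if (handle)
                    return handle;
            }
        }
        return real_dlopen(filename, flags);  /* fall back to the original path */
    }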

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 17:55 UTC (Sun) by Tet (subscriber, #5433) [Link] (3 responses)

> So we don't need completely arbitrary fat binaries at all: we need a 'fat dlopen()'

To solve this particular problem, yes. But then Ryan's FatELF release supported dlopen()ing fat shared libraries.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 20:36 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

Yes, but if that's all you need to do, the kernel side of FatELF is superfluous.

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 8:53 UTC (Mon) by Tet (subscriber, #5433) [Link] (1 responses)

Oh agreed, and in this particular case, it's not necessary. However, there are other situations where full fat binaries might be a win. I just get extremely annoyed by people claiming that the whole concept of multi-arch binaries is useless just because they happen to not have a valid use for them, and are unable to see that others might have.

SELF: Anatomy of an (alleged) failure

Posted Jun 28, 2010 13:22 UTC (Mon) by nix (subscriber, #2304) [Link]

The attitude appears to be 'distributors don't need them therefore they are useless'. This seems, to me, more than a little shortsighted...

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 7:35 UTC (Thu) by epa (subscriber, #39769) [Link] (1 responses)

> The reason why Apple invented FAT binaries is because they were interested in maintaining extensive binary compatibility with their old systems. Linux has never had this policy.
Might this not change? Perhaps one reason Linux has never kept backwards compatibility as well as Apple (or Windows, or Solaris) is because we haven't had the infrastructure and tools to do so easily. A mechanism for fat binaries might be one piece of the puzzle.

Be careful not to fall into the classic trap of equating 'my favourite system cannot support X' with 'X is unworkable' or even 'X is morally the wrong thing to do'.

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 3:36 UTC (Fri) by ajf (guest, #10844) [Link]

> Perhaps one reason Linux has never kept backwards compatibility as well as Apple (or Windows, or Solaris) is because we haven't had the infrastructure and tools to do so easily.
It's a misunderstanding to say that Apple uses fat binaries because they care about backward compatibility; what they cared about, and implemented fat binaries to support, was cross-platform compatibility. (The distinction is that they wanted new software to work with new operating system releases on both old and new hardware; they're less interested in new software working with old operating systems.)

FatELF?

Posted Jun 24, 2010 17:35 UTC (Thu) by vonbrand (guest, #4458) [Link]

Au contraire. I do believe the a.out binaries from the very first days of Linux still run fine on current kernels. What has changed is the environment: The currently most popular binary format is completely different, new libraries, languages have new ABIs, new ways to communicate among components are common today, ... A "FatELF binary" doesn't do any good if the right libraries, configuration files, devices, ... aren't available. Adding all that in would result in GargantuanELF.

In any case, the idea makes no sense, as this can be handled in other ways: just pack stuff up into a cpio(1) or some such file package plus a custom header, and create a special loader that handles that header. No kernel change needed (heck, if you can run Java or Win32 apps as if they were native, you certainly can do this). The binary format will be different in any case, so use that freedom to create something that doesn't require kernel changes.

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 17:34 UTC (Sun) by da4089 (subscriber, #1195) [Link]

> The reason why Apple invented FAT binaries is because they were interested
> in maintaining extensive binary compatibility with their old systems.

Actually, Apple inherited fat binaries in Mach-O from NeXT.

NeXT supported fat binaries because NeXTSTEP (and later OpenStep) was available for multiple CPU architectures (68k, x86, SPARC32, PARISC), and they wanted to enable ISVs to ship binary applications that worked on all platforms.

To make that work effectively, they
a) Maintain ABIs, use weak-linking, etc
b) Distribute applications as a bundle
c) Support fat binaries

This is a viable approach, as Apple has recently demonstrated.

But it's a very different model to the usual Linux distribution, which needs none of those things, and relies on the dependency resolution of the packaging system and rigorous testing of API compatibility when building the consistent package set.

I don't think attempting to move Linux towards the NeXT/Apple model is useful, but I also don't see why those that want to can't maintain an out-of-tree patch to make it work.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 1:28 UTC (Thu) by neilbrown (subscriber, #359) [Link] (1 responses)

> Gordon said that developers should "study Andrew Morton" with great intensity.

Very sound advice. Study Linus too. And Alan and Al and Dave and ...

Working with Linux isn't just about technology, it is also about people communicating with each other and working together. To succeed we need to learn how others communicate so we can effectively communicate with them.

There is a great wealth of experience - both technical and social - in the kernel community. Study it to your benefit - ignore it at your peril.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 6:55 UTC (Thu) by nikanth (guest, #50093) [Link]

So, is there a list of good people to try and emulate and list of bad people...? ;-)

Just kidding. I agree with you.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 16:03 UTC (Thu) by error27 (subscriber, #8346) [Link] (5 responses)

In the end aren't you still going to use rpm to put your fatelf binaries on the filesystem in the right place? Could the rpm format be extended to include more than one arch? You'd put two binaries into the same rpm and only install the right one.

The thing about it is that if you don't add a feature, you can change your mind later and add it. But if you do add a feature, you can't remove it. Looking through this list, in retrospect most of the decisions were the correct ones.

The sad bit is when you end up pissing off a developer. It's especially sad that Con was upset.

SELF: Anatomy of an (alleged) failure

Posted Jun 24, 2010 22:01 UTC (Thu) by MisterIO (guest, #36192) [Link]

IMO Con reacted in a somewhat childish way, but I can nonetheless see some truth in this post:
http://lkml.org/lkml/2007/8/1/7

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 9:54 UTC (Fri) by paulj (subscriber, #341) [Link]

That's pretty much how IRIX pkg worked, to deal with the various ABIs (o32, n32 and 64bit).

SELF: Anatomy of an (alleged) failure

Posted Jun 25, 2010 13:59 UTC (Fri) by vonbrand (guest, #4458) [Link] (2 responses)

The solution adopted is not a fat RPM containing several versions, but separate (though coordinated) RPMs for the different architectures. Look at the current Fedora repositories for x86_64: they carry packages for both 32 and 64 bits. If I want to install one or the other or both, I can do so. No need for 32-bit people to get 64-bit versions that are useless to them, and no need for pure 64-bit systems to be burdened with 32-bit stuff.

SELF: Anatomy of an (alleged) failure

Posted Jun 26, 2010 18:13 UTC (Sat) by tzafrir (subscriber, #11501) [Link] (1 responses)

So which package includes /usr/bin/hello ? The i386 one? The amd64 one? The i686-with-uclibc one? The armel (v4) one?

SELF: Anatomy of an (alleged) failure

Posted Jun 27, 2010 11:59 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

All of them, the package management software is aware of your preferred architecture and will install that package. Packages which provide dependencies (e.g. a library) can be multiply installed, so that both the 32-bit and 64-bit library are installed, each from its own package.

This stuff has all been working and in everyday use for some time.

Where do rejected patches go?

Posted Jul 2, 2010 7:33 UTC (Fri) by milki (guest, #68231) [Link] (1 responses)

Btw, I've wondered this for a while. Back in the 2.0 days, there was a small repository of patches you could apply (iBCS2, PC speaker sound card emulation, dual monitor support and co.), which weren't included in the mainline kernel. The external hosting was never an official feature, though.

But what about today? Where do patches go if they are rejected? Is everything just thrown away? Shouldn't there be a public review and archival repository somewhere? Or even a separate git tree for incoming patches? (I can't believe it's all sent by email.)

If user-contributed patches don't make it INTO the kernel, it would help if they were at least AVAILABLE somewhere. Otherwise there is little chance for user lobbying, or at least patching your own kernel.

Where do rejected patches go?

Posted Jul 6, 2010 9:35 UTC (Tue) by dlang (guest, #313) [Link]

the biggest problem is that there is no one place where rejected patches are collected today.

there are many ways patches can be sent, to many different people (and in some cases the patches themselves aren't sent, just a request to pull from a git repository that may disappear in a few days)

it would be a significant amount of work to gather such patches, let alone maintain them.

there are patches like this today, but they get hosted and maintained by the people who care about them.


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds