From: Charles Oliver Nutter on
On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com> wrote:
> Charlie, you're making a great case against using FFI.

FFI is much better than writing any C code at all, due to the
security, stability, and portability problems of writing your own C
bindings. If you are permitted to load a given library and that
library is available and you *must* use that library, FFI is the only
logical choice. But it doesn't get around the fact that you need the
library you're binding to be available and loadable on your target
platform. FFI > C bindings, but [platform-independent binary] > FFI.
And that usually means Java-based.

I should also point out that you don't necessarily have to write JVM
libraries in Java; you could also use Scala or Fan or similar
languages, and it would be just as portable (albeit a bit larger due
to the runtime dependency on those languages' runtime libraries).

But yes, at the end of the day, I believe writing stuff in a portable
binary format like JVM bytecode (or CLR bytecode) is a better choice
than writing in a language that has to be recompiled for every target
system. You ought to know that already...would I be working on JRuby
if I believed any differently? :)

And yes...I'd love to be able to recommend that everyone just use Ruby
for everything. But I don't think it's simply a performance issue;
there are some pretty amazing things you can get for free with a rich
static type system.

- Charlie

From: Aaron Patterson on
On Tue, Jan 26, 2010 at 03:12:17AM +0900, Charles Oliver Nutter wrote:
> On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com> wrote:
> > Charlie, you're making a great case against using FFI.
>
> FFI is much better than writing any C code at all, due to the
> security, stability, and portability problems of writing your own C
> bindings.

References please.

Last I checked, it was just as easy to segv from an FFI library as a C
library. Plus with FFI you don't get any benefits of compile time
checks. You can't, for example, check for #define constants.

With FFI you must:

1. Duplicate header files (see below for more problems)
2. Understand struct layouts and the sizeof() for each member
3. Do runtime checking of library features
4. Worry about weak ref maps when using void pointers (see the id2ref
problem in nokogiri)
5. Pay a runtime conversion price from Ruby data types to FFI types
6. Educate users on LD_LIBRARY_PATH
7. Worry about 32-bit and 64-bit issues (like Tony mentioned)

The duplication of header files becomes an even larger problem if the
library you're wrapping changes its struct layout. Where a simple
recompile would have solved the problem, now (without warning) you're getting
surprising values in your FFI program. Plus typical debugging tools
like gdb get you nowhere.

Example:

Library "foo" ships with a struct like this:

struct awesome {
float hello;
char * world;
};

Then later changes to:

struct awesome {
char * world;
float hello;
};

You wrap the first one, upgrade the library, then boom: it doesn't work.
With a compiled extension, a recompile against the new header picks up the
change and you wouldn't care.

Unfortunately, none of the problems I've just listed are
theoretical. I have personally run into every one of them and can
provide you with real world examples. FFI is awesome for certain,
confined, small, stable use cases. I use FFI, and I enjoy it. But
saying that it's "the only logical choice" seems wrong.

I am curious what your experience has been, and why you haven't run into the
same problems? How do other people overcome these issues?

--
Aaron Patterson
http://tenderlovemaking.com/

From: Mike Dalessio on

On Mon, Jan 25, 2010 at 1:12 PM, Charles Oliver Nutter
<headius(a)headius.com> wrote:

> On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com>
> wrote:
> > Charlie, you're making a great case against using FFI.
>
> FFI is much better than writing any C code at all, due to the
> security, stability, and portability problems of writing your own C
> bindings. If you are permitted to load a given library and that
> library is available and you *must* use that library, FFI is the only
> logical choice. But it doesn't get around the fact that you need the
> library you're binding to be available and loadable on your target
> platform. FFI > C bindings, but [platform-independent binary] > FFI.
> And that usually means Java-based.
>
> I should also point out that you don't necessarily have to write JVM
> libraries in Java; you could also use Scala or Fan or similar
> languages, and it would be just as portable (albeit a bit larger due
> to the runtime dependency on those languages' runtime libraries).
>
> But yes, at the end of the day, I believe writing stuff in a portable
> binary format like JVM bytecode (or CLR bytecode) is a better choice
> than writing in a language that has to be recompiled for every target
> system. You ought to know that already...would I be working on JRuby
> if I believed any differently? :)
>

I agree with everything you're saying, more or less.

However, none of that relates at all to what I think is the crux of the
issue, which is that everyone writing a non-pure-Ruby gem today is forced to
choose one of these options:

1) Support nearly everyone by maintaining two ports of your code: FFI for
JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
2) Support everyone by maintaining two ports of your code: JVM for JRuby and
GAE; C for MRI, Rubinius and MacRuby.
3) Maintain only a single port, FFI, and force everyone on MRI to take a
performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
GAE.
4) Don't support JRuby or GAE. Just write it in C.
5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.

Complicated? Yes. I've summed it all up in a nice matrix here:
http://gist.github.com/286126

I personally think these choices all suck, and I refuse to paint a happy
face on any of them.

We chose option 1 for Nokogiri (you're welcome, intarnets), but everyone
who's writing a gem today has to make this decision for themselves.

My point is that any of these choices contains a tradeoff, and stating that
one in particular "hurts" people more than another is just disingenuous. I'd
rather help people understand the tradeoffs.

From: Chuck Remes on

On Jan 25, 2010, at 1:12 PM, Mike Dalessio wrote:

> I agree with everything you're saying, more or less.
>
> However, none of that relates at all to what I think is the crux of the
> issue, which is that everyone writing a non-pure-Ruby gem today is forced to
> choose one of these options:
>
> 1) Support nearly everyone by maintaining two ports of your code: FFI for
> JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
> 2) Support everyone by maintaining two ports of your code: JVM for JRuby and
> GAE; C for MRI, Rubinius and MacRuby.
> 3) Maintain only a single port, FFI, and force everyone on MRI to take a
> performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
> GAE.
> 4) Don't support JRuby or GAE. Just write it in C.
> 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.

FFI originated with Rubinius, so I would wager that it will work once the FFI APIs get synced up again. Also, MacRuby has FFI support on its roadmap. That changes your picture a bit.

cr


From: Aaron Patterson on
On Tue, Jan 26, 2010 at 04:17:32AM +0900, Chuck Remes wrote:
>
> On Jan 25, 2010, at 1:12 PM, Mike Dalessio wrote:
>
> > I agree with everything you're saying, more or less.
> >
> > However, none of that relates at all to what I think is the crux of the
> > issue, which is that everyone writing a non-pure-Ruby gem today is forced to
> > choose one of these options:
> >
> > 1) Support nearly everyone by maintaining two ports of your code: FFI for
> > JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
> > 2) Support everyone by maintaining two ports of your code: JVM for JRuby and
> > GAE; C for MRI, Rubinius and MacRuby.
> > 3) Maintain only a single port, FFI, and force everyone on MRI to take a
> > performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
> > GAE.
> > 4) Don't support JRuby or GAE. Just write it in C.
> > 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.
>
> FFI originated with Rubinius, so I would wager that it will work once the FFI APIs get synced up again. Also, MacRuby has FFI support on its roadmap. That changes your picture a bit.

Rubinius implements enough of the MRI C API that it will run Nokogiri
today. MacRuby will follow suit, and I expect that to happen sooner
than it supports FFI (though this is conjecture). With minor tweaks to your C
code, you can have a native extension that runs on all three *today*.

--
Aaron Patterson
http://tenderlovemaking.com/