Using iText to generate PDFs in Rails; JRuby vs. Ruby Java Bridge

So I need to generate a PDF, using data coming out of a Rails-based app. I started looking at Prawn and Prawnto, but I need to prepend the PDF with some boilerplate material, ideally another PDF. In other words, I'd like to programmatically generate a PDF, then combine the generated PDF with the boilerplate PDF.

Pete suggested I look at iText. iText is a popular, free, PDF writing library written in Java. So, using this means connecting Ruby and Java in some way. Pete (also helpful) passed on some links and suggested I look at Rjb.

All I have to do is get Java and Rjb up and running.

Gathering ingredients

This is the easy part.

First, install rjb: gem install rjb

Next, download iText from lowagie.com. At a minimum, you want the "jar" file (it might be zipped). Unpack it to somewhere convenient, maybe your app's lib directory.

Done!

Try talking to iText

Try this in irb:

    require 'rubygems'
    require 'rjb'
    load_path = File.join('path', 'to', 'your', 'railsapp', 'lib', 'itext-2.0.6.jar')
    options = []
    Rjb::load load_path, options
    doc = Rjb::import("com.lowagie.text.Document")

If this actually works, you'll see something like #<Rjb::Com_lowagie_text_Document:0x2b0c8a8b6390>

I had trouble getting this to work on my first several tries. Here are a few errors you might encounter:

RuntimeError: can't create Java VM

I got this error when I didn't have a JAVA_HOME set for my environment. On my Mac, this was pretty easy, because Apple fixes the location of Java to /Library/Java/Home. Edit (or create, if you don't have one) your ~/.bash_profile and add:

export JAVA_HOME=/Library/Java/Home

On a Unix system (like, say, our production server), this is a little trickier. On our system, JAVA_HOME is /usr/java/jdk1.5.0_17.

You may also need to set LD_LIBRARY_PATH. This should be a directory or two away from your JAVA_HOME, and can typically be set in a similar way (i.e., in your ~/.bash_profile). The general pattern is "JAVA_HOME/jre/lib/ARCH", where ARCH refers to your system's architecture (typically i386, but amd64 on some systems). (Some guides suggest that you also specify a 'client' directory, but our server didn't have one, and leaving it out didn't seem to hurt.)
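The JAVA_HOME/jre/lib/ARCH pattern can be sketched as a small Ruby helper that prints an export line for your ~/.bash_profile. The name jvm_library_path and its default arch are made up for illustration, and the directory layout varies between JDK versions, so check that the directory really exists before relying on it.

```ruby
# Hypothetical helper: builds JAVA_HOME/jre/lib/ARCH from the pattern above.
# Illustration only; the layout differs between JDK versions and vendors.
def jvm_library_path(java_home, arch = 'i386')
  raise ArgumentError, 'JAVA_HOME is not set' if java_home.nil? || java_home.empty?
  File.join(java_home, 'jre', 'lib', arch)
end

# Print an export line for ~/.bash_profile when JAVA_HOME is available.
java_home = ENV['JAVA_HOME']
if java_home && !java_home.empty?
  puts "export LD_LIBRARY_PATH=#{jvm_library_path(java_home)}"
end
```

On a 64-bit server you would pass the architecture explicitly, for example jvm_library_path(ENV['JAVA_HOME'], 'amd64').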

NoClassDefFoundError: com/lowagie/text/Document

I got this error when I tried to use any relative path to the iText jar. I had to provide an exact, explicit path to the iText jar file, or NoClassDefFoundError would strike. Inside a script or rake task, I was able to use File.expand_path like so:

    my_path = File.dirname(File.expand_path(__FILE__))
    load_path = File.join(my_path, 'lib', 'itext-2.0.6.jar')
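Since a bad path only surfaces later as NoClassDefFoundError, it can help to check the jar before handing it to Rjb. Here's a minimal sketch; absolute_jar_path is a hypothetical helper, not part of Rjb or iText.

```ruby
# Hypothetical helper: resolve the jar to an absolute path and fail early,
# with a readable message, if the file is not there.
def absolute_jar_path(*parts)
  path = File.expand_path(File.join(*parts))
  raise ArgumentError, "iText jar not found at #{path}" unless File.file?(path)
  path
end

# Usage inside a script or rake task (not run here):
#   my_path   = File.dirname(File.expand_path(__FILE__))
#   load_path = absolute_jar_path(my_path, 'lib', 'itext-2.0.6.jar')
```

This doesn't change how Rjb loads the jar. It just turns a confusing Java error into a plain Ruby one that names the path it tried.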

Can't start the AWT because Java was started on the first thread

I got this error sporadically, especially when trying to write tests. Based on a comment here, I was able to get it running like this:

    require 'rubygems'
    require 'rjb'
    load_path = File.join('path', 'to', 'your', 'railsapp', 'lib', 'itext-2.0.6.jar')
    options = ['-Djava.awt.headless=true']
    Rjb::load load_path, options
    doc = Rjb::import("com.lowagie.text.Document")

Step Three: Combining files

Once I had Rjb::import working, I was able to quickly get basic PdfWriter operations running. But I needed to combine files, so I transliterated a sample program I found on the iText Tutorial pages from Java to Ruby+Rjb+Java, ending up with something like what you see here.

Except that it wasn't working. At all. The errors I got suggested that either copier was nil, or that it didn't have an add method.

A random but not totally useless side-track – JRuby

I couldn't tell if Rjb was the source of my problems, but I had a hunch that if I could run the above script in JRuby, then I'd at least know that my program was fine, that the problem was somewhere else.

In theory, installing JRuby is easy: download the latest release from JRuby, unzip it somewhere useful, put that somewhere in your PATH. I had a problem with jirb (the JRuby irb) where env couldn't find jruby. Using the explicit path to JRuby (instead of ~/jruby/bin, say) made everything work.

The only real change to get this running in JRuby is to change from Rjb::import to include_class, like so:

    require 'java'
    my_path = File.dirname(File.expand_path(__FILE__))
    load_path = File.join(my_path, '..', 'iText-2.1.4.jar')
    require load_path
    include_class "java.io.FileOutputStream"
    include_class "com.lowagie.text.pdf.PdfWriter"
    include_class "com.lowagie.text.Document"
    include_class "com.lowagie.text.Paragraph"

But there was still a problem with copier.add. Luckily, though, JRuby's error messages are much more useful than the ones I got from Rjb:

NameError: no add with arguments matching [class com.lowagie.text.pdf.PdfImportedPage] on object Java::ComLowagieTextPdf::PdfCopy:0xacbf5c @java_object=com.lowagie.text.pdf.PdfCopy@362a7b

So, there's no 'add' method! Okay, armed with this knowledge, I march off into the iText API docs. There, I find another method that seems to do what I want: addPage. With a little hackery, I end up with something like this:

    document.open
    n_pages = reader.getNumberOfPages
    n_pages.times do |i|
      copier.addPage( copier.getImportedPage(reader, i+1) ) if copier
    end
    document.close

Which works: I can generate a PDF now. Ooooh, so close!
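To prepend a boilerplate PDF to a generated one, the same loop can run over several readers in order. Here's a minimal sketch; append_all_pages is a hypothetical helper. It assumes copier behaves like iText's PdfCopy and each reader like PdfReader, and it only calls getNumberOfPages, getImportedPage and addPage.

```ruby
# Hypothetical helper: copies every page of each reader into the copier,
# in the order the readers are given, and returns the number of pages copied.
def append_all_pages(copier, readers)
  total = 0
  readers.each do |reader|
    # iText page numbers start at 1, not 0.
    1.upto(reader.getNumberOfPages) do |page_number|
      copier.addPage(copier.getImportedPage(reader, page_number))
      total += 1
    end
  end
  total
end

# Usage between document.open and document.close (not run here):
#   append_all_pages(copier, [boilerplate_reader, generated_reader])
```

Because the helper only depends on those three method names, it can be exercised with plain Ruby stand-ins before any Java is involved.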

Now, I want to put some of our app's data into the generated PDFs. But JRuby hates our gems! We have a bunch of weird, wacky, crazy gems here, like RMagick and Ruby2Ruby. I could maybe create a separate environment configuration with different gem requirements, but… ugh, that's so messy.

Back to Rjb!

Totally random aside

I haven't really done a lot of programming with Java, so I didn't know this before today, but in Java you can write two different methods with the same names as long as they have different parameters! So, add(integer) could be totally different than add(string), say.

From what I can gather, systems that bridge Java and Ruby (like Rjb) do their best to "guess" at the arguments' types, but sometimes they need help.
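One way to give that help: Rjb's documentation describes _invoke and new_with_sig, which take an explicit type signature string such as 'Ljava.lang.String;' alongside the arguments. The sketch below builds such a string from Java type names. java_signature is a hypothetical helper, and the dotted format is assumed from Rjb's documented examples, so check it against the Rjb version you have installed.

```ruby
# Hypothetical helper: turns Java type names into a signature string,
# e.g. java_signature('java.lang.String', 'int') gives 'Ljava.lang.String;I'.
def java_signature(*type_names)
  primitive_codes = {
    'boolean' => 'Z', 'byte' => 'B', 'char' => 'C', 'short' => 'S',
    'int' => 'I', 'long' => 'J', 'float' => 'F', 'double' => 'D'
  }
  type_names.map do |name|
    # Each trailing "[]" becomes a leading "[" in the descriptor.
    depth = name.scan('[]').length
    base = name.gsub('[]', '')
    ('[' * depth) + primitive_codes.fetch(base) { "L#{base};" }
  end.join
end

# Assumed usage with Rjb (not run here):
#   sig = java_signature('com.lowagie.text.pdf.PdfImportedPage')
#   copier._invoke('addPage', sig, page)
```

Most calls won't need this. It's only for cases where the bridge picks the wrong overload or can't find one at all.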

Finally: Combining PDF Files with iText

Actually, once I had the JRuby version of the script working, all I had to do was change all the include_class lines back to Rjb::import.

Here's a reasonable facsimile of the final rake task: pdf:build and pdf:combine


Comments

Chris Dolan Jan 20, 2009

If you get sick of Java and want to try Perl instead, let me know. I'm the author of one of the leading Perl PDF libraries. :-)

Joshua Wehner Jan 20, 2009

I was sick of Java before I started. ;)

Seriously, though, is there a Perl-Ruby bridge, or something equivalent? We initially looked at command-line options (like pdftk) but decided we wanted something more or less inline.

I know that eventually Perl 6 will assimilate all of us, but I have no idea the present status of the borgification.

Chris Dolan Jan 20, 2009

I've never tried bridging to Ruby, but there appear to be solutions on CPAN:
http://search.cpan.org/search?mode=all&query=ruby
If they work like other Perl bridges I've seen, they probably use a pipe between persistent processes, FastCGI-like.

I'd be willing to help you navigate these waters — let me know offline.