Showing posts with label docx4j. Show all posts
Showing posts with label docx4j. Show all posts

Thursday, January 26, 2012

Representing Java classes in diagrams

You probably wanted to display relationships from the Java classes in diagram if you needed to understand how the projects and files are related to each other. Eclipse can show you Call and Type hierarchy, but seeing it all in one graph can help you understand how project is structured. I have created the diagram for the docx4j using small application (using Apache BCEL 5.2) to create it (to import it I converted CSV file to XLSX and used yEd):


Application code that is creating CVS is this:

Once you have run this code, open file in Excel, save as XLSX and then open file in yEd. After that, choose Edge List and fill in your ranges. Import as Circular and then change Layout to Organic (I think that it is faster this way and it looks better). With graphs big as this one, some layouts will be extremely slow. What is useful in the end is that you can select neighboring nodes, and create new document out of that. That should give you a good overview of what your class is using or where it is being used.

Sunday, January 01, 2012

Merging Word documents with docx4j

Recently I needed to merge some Word .docx  documents and the tools that we chose for this was docx4j (www.docx4java.org). This is library for Java for working with Microsoft Open XML. There we two things that we needed to accomplish:

  1. Binding XML to various templates
  2. Merging documents as a result of the binding

I will write about merging documents as binding them is already explained well in docx4j site. Author of docx4j also offers commercial package for merging documents but if you want to try it for yourself, here are couple of things that I managed to do and got pretty decent results.

In version 2.7.1 docx4j you can work with java.util.File or java.io.InputStream. First one will do a god job if you have file present in your drive and second one if you keep content in the database (for example).  When merging Word documents you have to take care of relationships in the document itself. There are several elements that have relationships that can span through the document, but we were interested in just a few of them (images, footers and headers). It is worth to mention that if you miss one relationship, your document will be unreadable (in most cases). These are listed as resources and have references that you can use in your paragraphs.

So, to start we would:

  1. Load our initial file in WordprocessingMLPackage (this is the file where we want to attach the rest of the files, so in the end they look as one)
  2. Create unique section template
  3. Reset sections (this will serve the purpose of removing all references from the existing template, remember that section defines page layout)
  4. Remove body section (we can add this in the end)
  5. Loop through the attachment files (if you do not have sections separating pages, you might add page breaks)
  6. Copy relationships that you are interested in
  7. Copy elements
  8. If you do not want page breaks, then you can add empty section
  9. Add body section
  10. Reapply all headers and footers to empty sections

This all might sound complicated, but in the end, once you get to know the structure of the WordprocessingMLPackage, it becomes easier.

These are the code snippets that might be useful:

Note: All code displayed in upper window is property of Sapiens North America