Do it JSelf

With Michael Kay’s recent announcement that he is trying to create a JavaScript version of his Saxon processor, it is quite likely that we will be getting XSLT 2.0 support in the web browser relatively soon. Of course, everybody would prefer that browser vendors implement XSLT 2.0 natively, but that is, sadly, not the reality of today. We all remember how long it took some browser vendors to implement XSLT 1.0, and that was at a time when there were still relatively many who believed that an XML-based web was the way to go – a position that has become increasingly difficult to defend in the current HTML5 frenzy.

Only time will tell what the future holds for XML in the browser. Right now, it seems that XML has become the unwanted child that nobody wants to play with, because it seems inept when it comes to creating flashy rotating logos.

Or maybe it is not that bad; we will have to wait and see. And in the meantime, the best we can do is to make sure that XML support in the browser evolves and catches up with today’s state of the technology to remain relevant. Browser vendors are clearly not (or so it seems) going to assist with this, so the solution is either to write browser extension plugins (which, in truth, never really worked in practice) or to use JavaScript to implement the missing functionality.

Luckily, and almost paradoxically, browser vendors have given us a helping hand when it comes to JavaScript: with the almost total war for the fastest web browser and the best level of HTML5 support, we are witnessing tremendous improvements in the JavaScript engines of virtually all relevant web browsers of today. Not only are they able to execute JavaScript much faster than before, but they are also able to handle much larger JavaScript libraries, and even complete applications.

For the first time – and rather remarkably – it is now possible to consider JavaScript to be robust enough for such tasks as, for example, implementing a conformant XSLT 2.0 processor or…

…or an XProc implementation. For the last couple of months, I have been working (when other priorities allow) on creating a self-contained JavaScript version of Calumet, a Java XProc processor that I have been developing at EMC. In August 2010, I presented a live demo at the Balisage XML conference in Montréal and expressed my hope that future versions of Calumet would be distributed as a dual Java/JavaScript package. If you are interested in the technical details, you can find the slides from the presentation, as well as the conference paper (which, coincidentally, seems to have been one of the impulses that convinced Michael Kay to attempt a similar thing with Saxon) here.

So, how has the JavaScript version of Calumet progressed in the last two or three months? I must say that we are still not there, but we are getting quite close. Take a look for yourself:

http://xmldemo.emc.com:8080/calumet

I still consider the JavaScript processor “alpha” quality, as there are still compatibility or performance issues with various web browsers (yes, I know that the p:validate-with-schematron example does not work in Safari/Chrome) and some features are not (or not fully) supported yet (for instance, the p:http-request implementation is really, really simple and limited). But perhaps the processor is already good enough to be released so that users can play with it, test it – and most importantly, figure out if they can do anything useful with it.

Christmas sounds like a good plan.

Posted in JavaScript, Programming, XML, XProc | 2 Comments

EMC XProc Engine announces itself

…and how else than in XProc:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:output port="result" sequence="true"/>
  <p:try name="announcement">
    <p:group>
      <p:output port="result"/>
      <p:identity>
        <p:input port="source">
          <p:inline>
            <p xmlns="http://www.w3.org/1999/xhtml">
              I am happy to announce the availability of version
              1.0.8 of EMC XProc Engine, EMC's XProc processor
              implementation. The processor is free for developer
              use and can be downloaded - together with other XML
              tools - from the EMC XML Developer Community website.
            </p>
          </p:inline>
        </p:input>
      </p:identity>
    </p:group>
    <p:catch>
      <p:output port="result"/>
      <p:identity>
        <p:input port="source">
          <p:inline>
            <p xmlns="http://www.w3.org/1999/xhtml">
              This is the first public release of the processor. Note
              that while version 1.0.8 supports most of the required
              features of XProc, the implementation is not yet complete.
            </p>
          </p:inline>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
  <p:identity name="feedback">
    <p:input port="source">
      <p:inline>
        <p xmlns="http://www.w3.org/1999/xhtml">
          We are looking forward to your feedback - please use the EMC
          XML Developer Community forum (or the xproc-dev mailing list)
          for any comments, suggestions, feature requests, or bug reports.
        </p>
      </p:inline>
    </p:input>
  </p:identity>
  <p:http-request name="download">
    <p:input port="source">
      <p:inline>
        <c:request xmlns:c="http://www.w3.org/ns/xproc-step" method="GET"
            href="https://community.emc.com/community/edn/xmltech"/>
      </p:inline>
    </p:input>
  </p:http-request>
  <p:wrap-sequence name="features" wrapper="features">
    <p:input port="source" xmlns="http://www.w3.org/1999/xhtml">
      <p:inline>
        <p>XProc Engine implements most of the required
          features of the XProc specification. For an overview of the
          supported features, and the progress of the implementation,
          see the XProc Test Suite website.
        </p>
      </p:inline>
      <p:inline>
        <p>Plug-ins can extend the processor and customize
          the default behavior or provide new functionality, such as
          extension XProc steps.
        </p>
      </p:inline>
      <p:inline>
        <p>The XProc Engine distribution comes with a number
          of plug-ins that can be used with the processor. Software
          developers can use the XProc Engine API to create custom
          plug-ins.
        </p>
      </p:inline>
      <p:inline>
        <p>The Java application programming interface makes
          it easy to embed the XProc Engine in other Java applications.
        </p>
      </p:inline>
      <p:inline>
        <p>XProc Engine provides an interface for running
          XProc pipelines from the command-line.</p>
      </p:inline>
      <p:inline>
        <p>XProc Engine can be integrated with the EMC
          Documentum xDB XML database (via a plug-in). The plug-in
          provides a number of xDB-specific XProc steps, and allows
          developers to combine the benefits of a state-of-the-art
          native XML database and XProc.
        </p>
      </p:inline>
    </p:input>
  </p:wrap-sequence>
  <p:compare>
    <p:input port="source">
      <p:document
          href="http://tests.xproc.org/results/calumet/report.xml"/>
    </p:input>
    <p:input port="alternate">
      <p:document
          href="http://tests.xproc.org/results/calabash/report.xml"/>
    </p:input>
    <p:documentation>
      <p xmlns="http://www.w3.org/1999/xhtml">
        How does it compare to the other XProc processor? :)</p>
    </p:documentation>
  </p:compare>
  <p:identity>
    <p:input port="source">
      <p:pipe step="announcement" port="result"/>
      <p:pipe step="feedback" port="result"/>
      <p:pipe step="download" port="result"/>
      <p:pipe step="features" port="result"/>
    </p:input>
  </p:identity>
</p:declare-step>

Posted in Work, XML | 1 Comment

EMC XML Developer Community Launching

At last! I am happy to announce that EMC has launched the new XML Developer Community, a developer forum for sharing information about XML and EMC’s XML technology. I was never good at making flashy announcements, so I will just say: go to http://developer.emc.com/xmltech and see for yourself; there is some seriously interesting stuff there. (And expect more to come soon – after all, we are just taking off!)

Besides the content (articles, videos, blogs), the real highlight for me is that EMC is going to release a whole suite of its XML tools through the site, free for unlimited developer use. And the first of the downloads is nothing less than EMC Documentum xDB, our native XML database (formerly known as X-Hive/DB).

Personally, I have been waiting for this to happen for a long time. I know that I am biased, but I really think that xDB is a killer application, with so much value and potential. In the 10 years of its existence (yes: birthday party!), xDB has grown into a really robust and mature product with excellent support for standards, and it is no wonder that it has become so successful in the commercial sphere. Sadly, it has also successfully stayed out of reach of the wider community. But this is changing now, and I am expecting lots of positive reactions to this; I, for one, have always found xDB great to work with.

But the EMC XML Community site is not only about promoting EMC software. Its main focus is XML and XML technologies in general, and the goal is to provide a place for dialog between us – EMC’s XML developers – and the developer community, where useful information about XML (tutorials, best practices, support, etc.) can be shared and discussed.

Looking forward to your feedback, either on this blog, or better, on the EMC XML Community website!

Posted in XML, Work

Back from XML Prague (and bed, almost)

Finally recovered enough from XML Prague to write a short post. The conference was great (hey, the strudel was fantastic!), and I had a chance to chat with quite a few interesting people there (…some of whom probably infected me with a nasty bug that I have been struggling with until now).

The organizers attempted to make the event a bit special this year: they were broadcasting a live feed from the conference, to make it accessible also for people who could not participate in person. Quite a unique touch, and surely something that will attract a wider audience in the following years of the conference, should the organizers keep doing the same thing (and they should!).

Recordings of the presentations are now starting to show up on the conference website. Only slides with audio at the moment, but the organizers are planning to upload full video recordings of the speakers once these are processed and polished (…and let’s hope not – using Robin Berjon’s words – censored :-).

My presentation about XProc was on the second day of the conference (right after another XProc talk by Norm Walsh, who showed some ingenious real-world pipelines – see his blog if you are interested in knowing more), and it was kind of motivating for me to see the positive reaction to EMC’s plans to release its XML tools to the wider public.

The most frequent question after my presentation seemed to be: When? – and at that moment, the only answer I could give was: Soon, because, in all honesty, I didn’t know. Today, I think, I can reveal more: Very soon.

Posted in XML, Work

Presenting at XML Prague 2009

On March 22nd, I am presenting at XML Prague. Topic: Optimizing XML Content Delivery with XProc. There will be some interesting names there, like Michael Kay, Murata Makoto, Norman Walsh, or Jeni Tennison to name a few. Should be fun.

Posted in Work, XML

Introduction to XProc

If you are interested in learning or understanding XProc, don’t miss the excellent Introduction to XProc written by Dave Pawson. It is still a work in progress, but it already contains a lot of useful information.
Another nice place to get started with XProc is James Sulak’s blog.

Posted in XML

First XProc test report for Calumet published

We have gone public – at last! Yesterday, I submitted the first test report for Calumet, which is the code name for our XProc implementation.
As of today, we pass 95% of the required tests and about half of the optional tests, which makes for a success rate of about 90% overall. Compared to Calabash, which scores 97% at the moment, we still lag behind a bit, but we have made a good start, I think, all the more so since Calumet is getting more mature and compliant every day (literally). Stay tuned.

Posted in XML, Work

XProc – The next rockstar?

While working on an XProc implementation, and especially after using XProc in real life and seeing its true power, I am more and more confident that XProc will soon become one of the most popular (and useful) XML technologies out there. For me, the reasons are simple:

  • XProc integrates a whole plethora of XML technologies, ranging from XPath, XSL and XSLT, to XInclude processing, schema validation and XQuery support. The good thing about XProc is that you don’t have to learn the details of the different programming APIs and models. XProc shields you from that. To me, this is the most significant benefit of using XProc – and in fact, it is the very reason why XProc exists, after all: to make manipulating XML content simple, transparent, and easy to understand.
  • XProc can make application development simpler and faster. No more tedious XML programming (I guess we all know that “how many times did I write this code before?” feeling…), no more low-level dances around constructing/navigating/updating the DOM tree. Not any more: Here, this is my XProc pipeline, run it and give me the results I want.
  • XProc can make applications more reliable and less buggy (once the XProc processors get good enough, of course :). This is related to the previous point. You see, manual XML programming is potentially dangerous, especially in the hands of inexperienced developers who are not aware of all the nitty-gritty details. I have seen too many examples of badly written code for performing an XSLT transformation, or for just parsing an XML document… And I am really glad that finally there is a tool that, to put it bluntly, can shield us from crappy programming. And once there are visual tools for building XProc pipelines – and I am sure there will be some soon – we will be even safer.
  • XProc is simple. Querying content using XQuery, transforming the results to XSL-FO, and generating a final PDF document has never been easier:
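    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
      <p:input port="source"/>
      <p:input port="parameters" kind="parameter"/>

      <!-- Run the query against the source document -->
      <p:xquery>
        <p:input port="query">
          <p:data href="stats.xq" content-type="application/xquery"/>
        </p:input>
      </p:xquery>

      <!-- Transform the query results to XSL-FO -->
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="stats2fo.xsl"/>
        </p:input>
      </p:xslt>

      <!-- Render the XSL-FO document to PDF -->
      <p:xsl-formatter href="out/stats.pdf"/>
    </p:declare-step>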
    
  • XProc is extensible. XProc comes with a library of standard steps, but one of the core features of the language is that it allows you to declare custom steps that provide more complex (or not supported by default) functionality. You can organize your custom steps in libraries, which you can then import in your XProc pipelines in a way similar to importing stylesheets in XSLT.
    <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
      <p:import href="sq-library.xpl"/>
      ...
      <sq:get-stock-quote ticker="GOOG"
          xmlns:sq="http://www.foo.com/sq/ns/"/>
      ...
    </p:pipeline>
    
  • …and finally, XProc is fun!
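
Going back to the extensibility point above: the imported sq-library.xpl is not shown, so here is only a rough sketch of what such a library might contain. The step body, the required ticker option, and the www.example.com quote service URL are all assumptions made up for illustration; only the step name and namespace come from the example above.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:c="http://www.w3.org/ns/xproc-step"
           xmlns:sq="http://www.foo.com/sq/ns/"
           version="1.0">

  <!-- Hypothetical custom step: fetches a quote for the given ticker -->
  <p:declare-step type="sq:get-stock-quote">
    <p:output port="result"/>
    <p:option name="ticker" required="true"/>

    <!-- Build the c:request document; the service URL is a placeholder -->
    <p:add-attribute match="/c:request" attribute-name="href">
      <p:input port="source">
        <p:inline>
          <c:request method="GET"/>
        </p:inline>
      </p:input>
      <p:with-option name="attribute-value"
          select="concat('http://www.example.com/quotes?ticker=', $ticker)">
        <!-- The expression needs no context document -->
        <p:empty/>
      </p:with-option>
    </p:add-attribute>

    <!-- Send the request; the response becomes the step's result -->
    <p:http-request/>
  </p:declare-step>
</p:library>
```

A pipeline that imports this file can then call sq:get-stock-quote like any standard step, without knowing how the request is built.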
Posted in XML, Work | 2 Comments

XProc goes to CR!

On November 26, W3C announced that XProc has moved to Candidate Recommendation status.

For those who don’t know what XProc, or XML Pipeline Language, is, I think it’s best to quote the specification:

[XProc is] a language for describing operations to be performed on XML documents.

An XML Pipeline specifies a sequence of operations to be performed on zero or more XML documents. Pipelines generally accept zero or more XML documents as input and produce zero or more XML documents as output. Pipelines are made up of simple steps which perform atomic operations on XML documents and constructs similar to conditionals, iteration, and exception handlers which control which steps are executed.
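
To make the quoted definition a bit more concrete, here is a minimal sketch of a pipeline that combines iteration (p:for-each), exception handling (p:try/p:catch), and a conditional (p:choose) around two atomic steps. The draft and failed element names and the status attribute are hypothetical, chosen only for illustration; they do not come from the specification.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source" sequence="true"/>
  <p:output port="result" sequence="true"/>

  <!-- Iteration: process each input document in turn -->
  <p:for-each>
    <!-- Exception handling: if the group fails, p:catch runs instead -->
    <p:try>
      <p:group>
        <!-- Conditional: choose a branch based on the root element -->
        <p:choose>
          <p:when test="/draft">
            <p:add-attribute match="/*" attribute-name="status"
                             attribute-value="unpublished"/>
          </p:when>
          <p:otherwise>
            <p:identity/>
          </p:otherwise>
        </p:choose>
      </p:group>
      <p:catch>
        <!-- Emit a placeholder document for the failed input -->
        <p:identity>
          <p:input port="source">
            <p:inline><failed/></p:inline>
          </p:input>
        </p:identity>
      </p:catch>
    </p:try>
  </p:for-each>
</p:declare-step>
```

Each input document yields exactly one output document: marked drafts, untouched non-drafts, or the placeholder if something goes wrong.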

The specification is maintained by the XML Processing Model Working Group, which I happen to be a member of. The group is chaired by Norman Walsh, who – in parallel to his great editorial work (and I mean it, Norm; it has been a pleasure for me so far!) – is also responsible for the development of Calabash, the reference XProc implementation.

My involvement with XProc (and the WG) began in early 2008, when I started implementing an XProc processor for my employer. Since then, the specification has undergone a number of changes of varying magnitudes, so at times it was quite a challenge to keep our implementation in sync. But I endured (hey, I get paid for that!), and now I am happy to see that both Calabash and our – yet unnamed – processor are getting more mature every day.

At the moment, one of the most important tasks for the WG is to come up with a comprehensive test suite that would cover most of the language. And the sooner we are done with this the better, since the test suite not only helps to guarantee a reasonable level of interoperability among XProc implementations, but the whole process of writing test cases also helps us to detect and fix defects or ambiguities in the specification while there is still time. There are quite a number of tests already, and I would say we are roughly halfway through, so there is definitely still some work to be done… But we are making good progress, I think, and we will get there soon.

In the meantime, feel free to get in touch with us on the xproc-dev@w3.org mailing list. It is the perfect place to find out more about XProc and to ask questions. You can subscribe by sending an e-mail with the subject subscribe to xproc-dev-request@w3.org. See you there!

Posted in XML

S1000D Logic Engine Presentation

In October 2008, I attended the joint ATA e-Business Forum/S1000D User Forum in Budapest, where I talked about the S1000D Process Data Module and about our Logic Engine implementation.

The presentations from the Forum are now available online, so if you are curious what this weird S1000D stuff is all about, you can view a PDF version of my presentation here.

Posted in XML, Work