Karakas Online

10.3.2. HTML and RTF

Math processing for HTML and RTF documents is done partly in the stylesheets[1] (Section 4.2) and partly in the texmath2pngbmp.pl script.

10.3.2.1. Math processing in the HTML stylesheets

The following code in the HTML stylesheets

(root
 (make sequence
;   (literal
;    (debug (node-property 'gi
;                         (node-property 'document-element (current-node)))))
;(define (docelem node)
;  (node-property 'document-element
;    (node-property 'grove-root node)))
   (process-children)
   (process-math)
   (with-mode manifest
     (process-children))
   (if html-index
       (with-mode htmlindex
         (process-children))
       (empty-sosofo))))

(found in the docbook.dsl file of the original DocBook stylesheet package) initiates exactly the same processing as in the standard stylesheets, with one addition: the (process-math) instruction starts an additional processing step after the standard one. The process-math routine is defined in the code as follows:

;; Write equation info to equation-list.sgml
(define (process-math)
  (make entity
    system-id: "equation-list.sgml"
    (make element gi: "equation-set"
          attributes: (list
                       (list "latexopt" $latexopt$)
                       (list "density" $density$)
                       (list "usepackage" $usepackage$))
          (with-mode htmlmath (process-children)))))

This creates a new SGML file, equation-list.sgml, in the current directory, containing a single element of type "equation-set". That element is simply a container for the equations and for some LaTeX options passed to it from the stylesheet: it carries the options in the “latexopt”, “density” and “usepackage” attributes, and the TeX code of each equation enclosed between <texequation>/</texequation> tags[2]. Here is what the equation-list.sgml file may look like[3]:

<equation-set
latexopt="12pt"
density="96x96"
usepackage=""
><texequation
fileref="images/math/11074.png"
>\[
\sum _{n=1}^{\infty }\frac{x^{n}}{n}=\ln \left(\frac{1}{1-x}\right)\]
  </texequation
><texequation
fileref="images/math/15280.png"
>\begin{equation}
f(x)=\left\{ \begin{array}{cc}
 \log _{8}x &#38; x&#62;0\\
 0 &#38; x=0\\
 \sum _{i=1}^{5}\alpha _{i}+\sqrt{-\frac{1}{x}} &#38; x&#60;0\end{array}
\right.\label{eq3}\end{equation}
  </texequation
></equation-set
>
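To make the structure of this file concrete, here is a minimal Python sketch that reads such a file back. It is not part of the original toolchain (which uses Perl), and it rests on one assumption: the output shown above happens to be well-formed XML, so Python's standard xml.etree.ElementTree parser can read it. General SGML would need a real SGML parser. The sample is abbreviated, and the function name read_equation_list is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the shape of equation-list.sgml (assumption: the
# file is well-formed XML, as the listing above happens to be).
SAMPLE = r'''<equation-set
latexopt="12pt"
density="96x96"
usepackage=""
><texequation
fileref="images/math/11074.png"
>\[
\sum _{n=1}^{\infty }\frac{x^{n}}{n}\]
  </texequation
><texequation
fileref="images/math/15280.png"
>\begin{array}{cc}
 \log _{8}x &#38; x&#62;0\end{array}
  </texequation
></equation-set
>'''


def read_equation_list(text):
    """Return (options, equations) parsed from an equation-set document.

    options   -- dict of the LaTeX options on <equation-set>
    equations -- list of (fileref, TeX code) pairs, one per <texequation>
    """
    root = ET.fromstring(text)
    options = dict(root.attrib)
    equations = [
        (eq.get("fileref"), (eq.text or "").strip())
        for eq in root.iter("texequation")
    ]
    return options, equations


options, equations = read_equation_list(SAMPLE)
for fileref, tex in equations:
    # Show the target image and the first line of its TeX source.
    print(fileref, "<-", tex.splitlines()[0])
```

Note that the parser resolves the character references &#38;, &#62; and &#60; to &, > and < on its own, so the TeX code comes back in the form LaTeX expects.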

How is the TeX code extracted from the <alt> elements and inserted in equation-list.sgml? This is the core work and is done by the following code in the HTML stylesheets:

;; How to write out an equation into the equation listing file
(define (write-eqn nd)
  (let ((texmath (select-elements (children (current-node))
                                  (normalize "alt")))
        (graphic (select-elements (children (current-node))
                                  (normalize "graphic"))))
    (make element gi: "texequation"
          attributes:
          (list
           (list "fileref" (attribute-string (normalize "fileref") graphic)))
          (literal (data texmath)))))
;; Special processing mode to extract equations
(mode htmlmath
  (default
    (let ((infeqns (select-elements (descendants (current-node))
                                    (normalize "informalequation")))
          (eqns (select-elements (descendants (current-node))
                                 (normalize "equation")))
          (inleqns (select-elements (descendants (current-node))
                                    (normalize "inlineequation"))))
      (with-mode htmlmath
        (process-node-list
         (node-list infeqns eqns inleqns)))))
  (element equation (write-eqn (current-node)))
  (element informalequation (write-eqn (current-node)))
  (element inlineequation (write-eqn (current-node))))

Basically, the above code does the following: it processes only <informalequation>, <equation> and <inlineequation> elements and, within them, only the text of the <alt> element. It writes this text (the TeX code of the equation) between <texequation> tags in equation-list.sgml, and copies the fileref attribute of the accompanying <graphic> element into the fileref attribute of <texequation>.
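The extraction logic can be mimicked outside DSSSL. The following Python sketch is only an illustration, not the stylesheet's actual mechanism: it assumes a small, well-formed XML DocBook fragment with hypothetical file names, and a hypothetical function collect_equations. It visits the three equation element types in the same grouped order in which the node-list call above appears to concatenate them (informal equations first, then equations, then inline equations), and takes <alt> and <graphic> only from the direct children of each equation.

```python
import xml.etree.ElementTree as ET

# Element types handled by the htmlmath mode, in the order of the
# (node-list infeqns eqns inleqns) call.
EQUATION_TAGS = ("informalequation", "equation", "inlineequation")

# Hypothetical DocBook fragment; the file names are made up.
SAMPLE = r'''<article>
  <para>Inline <inlineequation><alt>$a^2$</alt>
    <graphic fileref="images/math/3.png"/></inlineequation> text.</para>
  <equation><alt>\begin{equation}E=mc^2\end{equation}</alt>
    <graphic fileref="images/math/2.png"/></equation>
  <informalequation><alt>\[x+y\]</alt>
    <graphic fileref="images/math/1.png"/></informalequation>
</article>'''


def collect_equations(docbook_text):
    """Return one {'fileref', 'tex'} record per equation element."""
    root = ET.fromstring(docbook_text)
    records = []
    for tag in EQUATION_TAGS:
        for eqn in root.iter(tag):
            alt = eqn.find("alt")          # direct child only
            graphic = eqn.find("graphic")  # direct child only
            records.append({
                "fileref": graphic.get("fileref") if graphic is not None else None,
                "tex": "".join(alt.itertext()) if alt is not None else "",
            })
    return records


records = collect_equations(SAMPLE)
for rec in records:
    print(rec["fileref"], "<-", rec["tex"])
```

Everything outside the three equation element types, such as the surrounding paragraph text, is ignored, which matches the behaviour described above.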

This completes the Mathematics processing done by openjade. We will use the equation-list.sgml file to create PNG and BMP images of each equation.

10.3.2.2. Math processing with texmath2pngbmp.pl

The texmath2pngbmp.pl is called after openjade has processed the SGML file with the stylesheet for one HTML file. It takes one argument, the file to process. It expects a file with the structure of equation-list.sgml (see Section 10.3.2.1). It basically does the following:

At the end of the processing, the PNG and BMP images are in images/math, while the HTML and RTF documents contain links to them for each equation. We are ready! We can enjoy Mathematics on the Web in TeX quality!

Note:

We only need to call texmath2pngbmp.pl once. The PNG and BMP equation images for the whole SGML file are created in the right directory and need not be recreated for each HTML or RTF run: our trick with the output.print.xxx entities ensures that each run will INCLUDE the graphic element with the right filename in its fileref attribute (see Section 10.3.1.2, Section 7.1.4.1, Section 7.2.2).

Notes

[1]

Both HTML stylesheets contain the same code for Mathematics.

[2]

More precisely, the equation-list.sgml file is itself an SGML document, which should validate against the following DTD:

<!DOCTYPE equation-set [
<!ELEMENT equation-set - - (texequation+)>
<!ATTLIST equation-set
latexopt CDATA #IMPLIED
density CDATA #IMPLIED
usepackage CDATA #IMPLIED>
<!ELEMENT texequation - - (#PCDATA)>
<!ATTLIST texequation fileref CDATA #REQUIRED>
]>

[3]

You can see the equation-list.sgml file for this document here: equation-list.sgml.

[4]

It also unescapes some characters in the TeX code:

sub unescape {
    $eqn =~ s/&#38;/&/g;
    $eqn =~ s/&#62;/\>/g;
    $eqn =~ s/&#60;/\</g;
}

[5]

This is new here. The original texmath2png.pl script did not produce BMP versions of the equation images.

Last updated Mon Sep 24 01:19:25 CEST 2007 Permalink: http://www.karakas-online.de/mySGML/explain-math-html-rtf.html All contents © 2002-2007 Chris Karakas