Karakas Online

Document processing with LyX and SGML

A quest for the Holy Grail of technical documentation

Chris Karakas

www.karakas-online.de
Revision History
Revision 1.4 20.09.2007 Revised by: CK
Added sections on start and end files, HTML headers and footers. Changed chapters resp. sections: Shortcomings and bugs, Main part, LaTeX errors. New files: example.start, example.end, part1, part2, part3, keycombos, keycombos2, acronyms, acronyms2, productnames, productnames2, applications, applications2. Changed files: lyxtox.
Revision 1.3 12.03.2006 Revised by: CK
Files changed: sedscr, lyxtox, lyxtox-print-pdf.dsl, lyxtox-print.dsl, awkscr_insert_index_items, ck-style.css, jadetex.cfg, addd. New sed scripts, sedscr_apa and sedscr_ima, take care of alt and title attributes in the HTML format, so that the resulting files remain W3C compliant. Previous versions might not be, because a phrase element inside a textobject was not used for the alt attribute, due probably to a bug in the DSSSL stylesheets. sgmltools is not needed anymore. Corrected a bug in awkscr_insert_index_items that would break index entry insertion after the first entry. Added code in sedscr that will take care of keycombos, acronyms, productnames and index entries of key combinations. Use of a new script, coolthumbs, for the creation of antialiased thumbnails in PDF. Further changes in: The final step, Unprintable characters, SP_ENCODING in "Set environment vars", Using Type 1 fonts, Choosing the right font encoding, Using True Type fonts, Optimal PDF - Figures, Acrobat Reader 5 does not display thumbnails. Added a whole chapter on Localization (work in progress). Simplified addd script. Generated new Index (almost 2000 entries!).
Revision 1.2 25.06.2004 Revised by: CK
The tidy scripts have been deactivated in lyxtox. They mess up other areas like callouts or displayed code — but they are still in the package. Corrections to sedscr and awkscr_math scripts to handle inequalities correctly: now writing a < b > c in Math Mode will not result in an Openjade parser error about an "undefined element b". The jadetex.cfg file now contains examples of how to get customized headers and footers in PDF through the fancyhdr package (works only partly — ideas welcome) and also an example of using the underscore package to get underscores correctly in links — but this messes up the smiley names which also contain underscores.
Revision 1.1 13.06.2004 Revised by: CK
Discussion of newer LyX versions (newer than 1.2.0), as well as new errors and warnings. Inclusion of sedscr_tidy and sedscr_tidy2 sed scripts that tidy up the SGML code. The lyxtox script contains calls to those two scripts, otherwise no changes have been made to the scripts.
Revision 1.0 19.02.2004 Revised by: CK
Initial public release.

A method for single-source publishing using LyX and SGML is presented: LyX is used as a comfortable graphical SGML editor. Once the document is exported to SGML from LyX, it undergoes a series of transformations through sed and awk scripts that correct and enhance the SGML markup, compute the Index, insert the Bibliography and the Appendix and take care of the correct invocation of openjade, pdftex, pdfjadetex and all the other necessary programs for the generation of HTML (chunked or not), PDF (with images, bookmarks, thumbnails and hyperlinks), PS, RTF and TXT versions. All aspects of document processing are handled, including automatic Index generation, display of Mathematics in TeX quality both online and in print formats, as well as the use of bibliographic databases with RefDB. Special care is taken so that the document processing is as transparent to the user as possible - the aim being that the user writes in LyX, then presses a button, and the lyxtox script does the rest.


Table of Contents
1. Terms of distribution
1.1. Disclaimer
1.2. Formats
1.3. License
1.4. Availability of sources and support
1.5. Credits
1.6. Acknowledgements
1.7. Conventions
1.8. Abbreviations
2. Introduction
2.1. The general idea
2.2. Line of attack
3. Required software
3.1. LyX
3.2. DocBook
3.3. sgmltools
3.4. Openjade, pdfTeX and JadeTeX
3.5. TeX and LaTeX
3.6. Dvips, Ghostscript and ImageMagick
3.7. thumbpdf
3.8. Sed and awk
3.9. Lynx
3.10. HTML tidy
3.11. Refdb
4. Required preliminary steps
4.1. Reconfigure LyX
4.2. Adapt the DocBook DSSSL stylesheets
4.3. Adapt pdftex.cfg
4.4. Adapt jadetex.cfg
4.5. Check paths of catalog files
4.6. Adapt the preamble
4.7. Admonitions
4.8. Callouts
4.9. Add density to images
4.10. Run sed and awk scripts
4.11. Set up your start and end scripts
4.12. Set up custom headers and footers
4.13. Set up your bibliographic database
4.14. Use a CSS for DocBook
4.15. Use coolthumbs
5. Writing in LyX, thinking in SGML
5.1. LyX environments
5.2. Authors, Credits, Roles
5.3. Keywords
5.4. Revision history
5.5. Paragraphs
5.6. Cross references
5.6.1. Mass insertion of cross-references in LyX
5.7. Images
5.7.1. Inline graphics
5.8. Admonitions
5.9. Callouts
5.10. Tables
5.11. Table of contents
5.12. List of figures, tables and equations
5.13. Epigraphs
5.14. SGML code in program listings
5.15. Filenames
5.15.1. Labels as filenames
5.15.2. Cool labels don't change!
5.16. Examples
5.17. Mathematics
5.18. Appendix
5.19. Bibliography
5.19.1. Bibliography without RefDB
5.19.2. Bibliography with RefDB
5.20. Index
5.20.1. Automatic Index generation
5.21. The final step: invoking lyxtox
6. Errors and warnings
6.1. LyX errors
6.2. Openjade errors
6.3. TeX errors
6.3.1. The structure of TeX errors
6.3.2. LaTeX errors
6.3.3. TeX capacity exceeded
6.3.4. Fatal format file error; I'm stymied
6.3.5. Corrupted NFSS tables
6.3.6. Missing $ inserted
6.3.7. Unprintable characters
6.4. Other errors
6.4.1. Keywords not present in HTML
6.4.2. thumbpdf fails
6.4.3. sed segmentation fault
6.4.4. Acrobat Reader 5 does not show thumbnails in Linux
6.4.5. URLs with underscore display '&lowbar;' instead of '_'
6.4.6. sed: file sedscr_img line 2: Unknown option to `s'
7. Explaining the magic: the details
7.1. Document processing
7.1.1. Check number of parameters
7.1.2. Set program locations
7.1.3. Set environment variables
7.1.4. Main part
7.1.5. DSSSL stylesheets
7.1.6. Inline graphics
7.1.7. Catalogs
7.1.8. CSS
7.1.9. Appendix
7.1.10. Bibliography
7.1.11. Index
7.2. Optimal PDF
7.2.1. From .lyx to .pdf
7.2.2. Figures
7.2.3. Using Type 1 Fonts
7.2.4. Choosing the right font encoding
7.2.5. Using True Type fonts
7.2.6. The hyperref package
7.2.7. Hyphenation
7.2.8. Bookmarks
7.2.9. PDF view options
7.2.10. Links to internet sites
7.2.11. Thumbnails
7.2.12. Configuring pdfjadetex
7.2.13. Further enhancements
7.3. Optimal PS
7.3.1. Embedding Computer Modern fonts
8. HTML validation
9. Accessibility
9.1. Priority 1 accessibility errors
9.2. Priority 2 accessibility errors
9.3. Priority 3 accessibility errors
10. Mathematics
10.1. DBTeXMath
10.2. Writing Mathematics in LyX
10.3. The magic behind the math
10.3.1. SGML math code correction
10.3.2. HTML and RTF
10.3.3. PDF and PS
10.4. Problems of the DBTeXMath method
11. Localization
11.1. Shell localization
11.2. sed localization
11.3. awk localization
11.4. Perl localization
11.5. Keyboard localization
11.5.1. xmodmap and xkeycaps
11.5.2. Modifiers and Mode_switch
11.5.3. Helpful Hints and Tips
11.6. LyX localization
11.6.1. Layout Language Options
11.6.2. Keyboard mapping configuration
11.6.3. Character Tables
11.6.4. International Spellcheck Support
11.7. Openjade localization
11.8. dvips localization
11.9. DSSSL stylesheet localization
11.10. TeX localization
11.11. lynx localization
11.12. Open localization problems
12. Shortcomings and bugs
13. Other methods
14. Bibliography
A. Appendix
A.1. The GNU Free Documentation Licence
A.1.1. PREAMBLE
A.1.2. APPLICABILITY AND DEFINITIONS
A.1.3. VERBATIM COPYING
A.1.4. COPYING IN QUANTITY
A.1.5. MODIFICATIONS
A.1.6. COMBINING DOCUMENTS
A.1.7. COLLECTIONS OF DOCUMENTS
A.1.8. AGGREGATION WITH INDEPENDENT WORKS
A.1.9. TRANSLATION
A.1.10. TERMINATION
A.1.11. FUTURE REVISIONS OF THIS LICENSE
A.1.12. ADDENDUM: How to use this License for your documents
Reference List
Index
List of Tables
4-1. ISO/DIN paper sizes
7-1. Paper sizes with hyperref
7-2. Link colours with hyperref
7-3. PDF view options
11-1. latin1 character set
List of Figures
4-1. General document info.
4-2. ISO-DIN paper sizes.
6-1. Insert URL with underscores in LyX.
7-1. CSS page area model.
7-2. Document Info: Fonts.
List of Equations
5-1. (eq1)
10-1. (eq2)
10-2. (eq3)
10-3. (eq4)
10-4. (eq5)
10-5. (eq6)
10-6. (eq7)
10-7. (eq8)
10-8. (eq9)
10-9. (eq10)
10-10. (eq11)
10-11. (eq12)
10-12. (eq13)
10-13. (eq14)
10-14. (eq15)
10-15. (eq16)

Chapter 1. Terms of distribution

1.1. Disclaimer

No liability for the contents of this document can be accepted. Use the concepts, examples and other content at your own risk. As this is a new edition of this document, it may contain errors and inaccuracies that could damage your system. Although this is highly unlikely, proceed with caution; the author does not take any responsibility for such damage.

All copyrights are held by their respective owners, unless specifically noted otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark.

Naming of particular products or brands should not be seen as endorsements.


1.2. Formats

IMPORTANT: Script and Stylesheet Downloads!

If you want to download only the scripts and stylesheets of this project, without any documentation, get the following archive:

This document is available in the following formats:

Note: RTF page numbers

In order to get correct page numbers in Microsoft Word, type the following after opening the document:

  1. CTRL-END

  2. CTRL-A

  3. F9

In Word Viewer 97, you must instead do:

  1. CTRL-END

  2. ALT

  3. V

  4. N

  5. ALT

  6. V

  7. P

See The OpenJade RTF backend for more details.

IMPORTANT: Downloads for offline reading!

If you want to download the HTML or RTF formats for offline reading, you will need to download the images as well - PNG for HTML and BMP for RTF, including the callouts! To save you the hassle, I have compiled the following zipped tar archives for offline reading (these contain the scripts and stylesheets too):

A tarball containing all the above formats, including images and scripts, is also available:


1.3. License

Copyright © 2004-2006 Chris Karakas. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license can be found in Section A.1, as well as at the GNU Free Documentation License (2).


1.4. Availability of sources and support

See Section 1.2 for the modifiable sources of this document.

Important: Pose your questions in my Linux Forum!

I've been working on this project since 2000. It is still work in progress. I don't have an "installation script" or similar - you will have to read this document carefully and try the solutions offered. If you have questions, patches, suggestions, problems or (better) solutions, come to my Linux Forum and post them there. I look forward to your feedback!


1.5. Credits

The DSSSL stylesheets (Section 4.2, Section 7.1.5) owe a lot to the DocBook Guide of the Debian Newbiedoc Project, Mandrake's manual-print.dsl file (see Customizing Document Production for a detailed description) and the DBTeXMath method. Also, the CSS file for DocBook, ck-style.css, got important elements from the Newbiedoc CSS file for DocBook and Mark Pilgrim's influential dive into Accessibility.

RefDB gave me a real solution to my bibliography problem (see Section 5.19, Section 7.1.10).

Part of Section 5.1 is taken from the LyX Tutorial, section 2.2 (“Environments”).

Part of Section 3.9 is taken from the lynx manpage.

The introduction text to Section 6.3 is taken from the TeX FAQ item on How to approach errors. The material in Section 6.3.1 is taken from the TeX FAQ item on the structure of TeX errors. Section 6.3.4 contains material from the TeX FAQ item on the fatal format file error.

The description of the common LaTeX error messages and warnings in Section 6.3.2 uses material from the chapter on “LyX and LaTeX errors” of the Extended Features manual for LyX, available from LyX' Help menu.

Many other sources have been used for this work. See Chapter 14 for some of them. Although not all of them are present in Chapter 14, they are all quoted at the appropriate place inside the document. Please follow the links given there.

Figure 7-1 is taken from W3C's working draft CSS3 Paged Media Module, version of Dec. 18th 2003 and is Copyright © 2003 W3C (MIT, ERCIM, Keio), All Rights Reserved. Used with permission according to W3C document licence.

The CSS file for DocBook that is used in this document, ck-style.css, uses QBullets in links. See Section 7.1.8 on how to do this. Thanks to Matterform Media for providing QBullets for free. If you plan to use them on your website, please observe the QBullets usage terms.

The examples for admonition in the Conventions Section (Section 1.7) were taken from the Section on admonitions of the DocBook Guide of the Debian Newbiedoc Project.

The method I present is an original work of mine that arose out of the desire to “write once, create many” (see how it all started in my Jade installation notes and how it ended in Section 5.21). I wanted to have a document processing chain with all the bells and whistles, including the ability to process Mathematics (Chapter 10), bibliography (see Section 5.19) and Index (see Section 5.20), controlled from one source and one click of a button. My solution uses many well-known methods and packages, but the “glue” is original.


1.6. Acknowledgements

Thanks to all authors, whose software is used in this work. Their example is a continuous source of inspiration to me:

Many thanks to Gareth Anderson and Jeremy Malcolm for their comments, suggestions and bug reports. Jeremy Malcolm pointed to missing .start and .end files in the online 1.3 version - many thanks! Thanks also to Gloomy, for the moral support during the hard times.


1.7. Conventions

admonitions

Admonitions are little pictures used to emphasize something of importance to the reader. The five types used are:

Note

Using a hammer to put together your computer is bad.

Tip

Do not hit your thumb with the hammer, it hurts!

Important

Watch where you're swinging that hammer!

Caution

Hitting your thumb with a hammer may lead to an unwanted trip to the hospital!

Warning

Do not, under any circumstances, admit that you hit your own thumb with a hammer. The ridicule you will face is astounding!

access keys

Access keys enable navigation through the document, without relying on a mouse. The following keys have been given special meaning in this document:

P

Previous page.

N

Next page.

H

Home of the document (Table of Contents).

U

Up (takes you one level up the section hierarchy).

If you also happen to be reading the document from its original location, then the following access keys can also be used:

S

Start (takes you to the author's start page).

T

The current (“This”) page, without the Sitemenu on the left.

M

The current page in a frameset, where the left frame contains a Menu.

To use the access keys, you have to simultaneously press a modifier key, which may vary from browser to browser. For example, in NN6+/Mozilla the modifier key is ALT, so you have to use ALT-N to go to the next page, and ALT-P to come back. In other browsers such as IE6, the access keys just give focus to the associated link, so the sequence becomes ALT-N Enter. Try it, you'll like it! :-)


1.8. Abbreviations

SGML

(Standard Generalized Markup Language) The parent of markup languages such as HTML and XML, SGML is a generalized scheme for tagging structured data in an application-independent way.

XML

(eXtensible Markup Language) XML is a simplified subset of SGML for tagging structured data according to an accompanying DTD, which defines the collection of tags and the rules for ordering and nesting them. This allows domain-specific data to be labelled in a consistent way, so that new applications can match the input file against the DTD to parse, validate and manipulate the data stream.

DSSSL

(Document Style Semantics and Specification Language) A Scheme-based language for rendering SGML documents. DSSSL code specifies the placement and font manipulations for each tag in the specified DTD. While many shops prefer XSL for the same formatting function, DSSSL is the most common method in the DocBook and OpenJade world.

XSL

(eXtensible Stylesheet Language) A text format specification language which is itself an XML application; many sites prefer XSL to DSSSL because the same editing tools used for the document can also be used on the stylesheet.

DocBook

A document definition markup language which defines a set of SGML tags for the specific purpose of producing technical documentation. DocBook is used with a DSSSL or XSL stylesheet to create document source files which are portable across different display methods: for example, one document which renders in PostScript, Windows and Java Help, PDF, RTF, ASCII and HTML.


Chapter 2. Introduction

Do you want to create professionally formatted documents? Tired of always having to change the font settings, to insert or delete pagebreaks, to format your text for printing, monitor, or web view? Do you find yourself spending hours of your life on formatting issues that you wish you never had to be confronted with? Did you, during your editing efforts, ever get the uncomfortable feeling that you are reinventing the wheel for the 39th time?

Well, in fact you are! This document will show you how to use the power of Open Source tools like LyX (3) and sgmltools (4) to create the documents you've dreamed of, while, as a nice side effect, concentrating on what deserves most of your attention: Content, not Formatting! I will describe a method by which all you have to do is to write your document in LyX, then run the .lyx file through a script that will produce the SGML source and, from it, the HTML, TXT, RTF, PDF and PS formatted documents, complete with table of contents, embedded pictures, fonts, thumbnails (for PDF) and other goodies.


2.1. The general idea

When writing LyX documents, formatting should be the last thing on your mind. Concentrate on writing a clear and concise document. The SGML parser will take care of the formatting. We've all heard of WYSIWYG. LyX is WYGIWYM. WYGIWYM stands for "What You Get Is What You Mean". This means that if you mean for text to represent source code, you assign it a source code environment (LyXese for style), and it formats the way you meant it to. You needn't worry about formatting during the writing of your document. If you don't like the way it looks upon finishing and printing it out, you can change the way styles map to formatting, and those styles will consistently change throughout the document.


2.2. Line of attack

The method described here follows a line of attack defined by the following:

  • We will put everything in a shell script. We don't want to bother about anything else. We want to ponder comfortably upon the meaning of life while drinking some coffee or tea, watching our computer do the work for us - personally, a very rewarding experience :-)

  • We will use sed to correct LyX' SGML output. The more we are able to correct, the more SGML features we get out of our plain vanilla LyX.

  • We use the sgmltools package which hides a lot of details from the end user, giving nonetheless all the power of the involved tools.[2]

  • We will adapt the DocBook DSSSL stylesheets to our personal needs and taste.

  • The end product shall be a directory ready to upload to our web server with all files, links, images and formats necessary.
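The line of attack above can be sketched as a small shell function. This is only an illustrative skeleton under stated assumptions, not the lyxtox script itself: the function name correct_and_collect, its three arguments and the layout of the output directory are hypothetical. It checks its parameters, corrects an exported SGML file with a sed script and places the result in a directory meant for upload. The rendering step is deliberately left out.

```shell
#!/bin/sh
# correct_and_collect FILE.sgml SEDSCRIPT OUTDIR
# Hypothetical skeleton: the name, the arguments and the directory
# layout are assumptions made for illustration only.
correct_and_collect() {
    # Refuse to run with the wrong number of parameters.
    if [ "$#" -ne 3 ]; then
        echo "usage: correct_and_collect file.sgml sedscript outdir" >&2
        return 1
    fi
    # Create the directory that will later be uploaded.
    mkdir -p "$3" || return 1
    # Correct the exported SGML with sed; the source file stays untouched.
    sed -f "$2" "$1" > "$3/$(basename "$1")" || return 1
    # Rendering to HTML, PDF, PS, RTF and TXT would follow here (not shown).
    echo "corrected SGML written to $3/$(basename "$1")"
}
```

Keeping every stage inside one function (or one script) is what makes the "press a button" workflow possible: each stage either succeeds or stops the run with a non-zero status.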


Chapter 3. Required software

The method described in this document requires a well-configured Linux system, armed with a heavy machinery of various software packages:

In the following, I will describe the required software in more detail. Of course, I cannot cover all details. See the documentation that comes with each tool and, for an alternative description, the Appendix (5) of DocBook: The Definitive Guide.

Important: Version-specific tweaks!

The method I will describe (more precisely, the sed script that corrects LyX' SGML output) is tailored to LyX 1.2.0-91, a rather dated version I use from a SuSE RPM. If you have a newer version of LyX, you will almost certainly have to tweak sedscr, since SGML output has been corrected in the newer 1.3.x versions. But the important thing is the method, not the version-specific tweaks, which you should be able to figure out yourself with a firm knowledge of sed, regular expressions and the help of this guide! ;-)

Currently, you should stick to the rather dated 1.2.0 version from one of the RPMs in Section 3.1, for reasons discussed in LyX 1.3.4 not suitable for Mathematics work in DocBook.

I have not investigated the portability of the method across operating systems. I have developed and tested it on a (rather dated) SuSE Linux 7.3 system and, more recently, on a SuSE Linux 9.0 system. Portability to other operating systems is dependent on the availability of the software needed and the scripting facilities offered. Any porting efforts are welcome.
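Since the method depends on many separate programs, it can help to verify that they are installed before starting. The following is a minimal sketch, not part of the author's scripts: the function name check_tools is hypothetical, and the program names you pass to it are assumptions about how the tools are installed on your system.

```shell
#!/bin/sh
# check_tools NAME...
# Prints one "missing: NAME" line for every program that cannot be
# found on the PATH and returns non-zero if any program is missing.
# (Hypothetical helper, shown for illustration only.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as `check_tools lyx openjade pdfjadetex dvips gs sed awk lynx thumbpdf` would then list whatever still has to be installed; adjust the names to your distribution.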


3.1. LyX

In case you are wondering what LyX is, here is what http://www.lyx.org says on the subject:

LyX is an advanced open source document processor that encourages an approach to writing based on the structure of your documents, not their appearance. LyX lets you concentrate on writing, leaving details of visual layout to the software.

LyX runs on many Unix platforms, OS/2, and under Windows/Cygwin (this port requires an X server). It can also run natively on Mac OS X, thanks to the Qt/Mac library.

LyX produces high quality, professional output -- using LaTeX, an industrial strength typesetting engine, in the background; LyX is far more than a front-end to LaTeX, however. No knowledge of LaTeX is necessary to use LyX, although it will give a user more power.

LyX is stable and fully featured. It has been used for documents as large as a thesis, or as small as a business letter. Despite its simple GUI interface (available in many languages), it supports tables, figures, and hyperlinked cross-references, and has a best-of-breed math editor.

Get a suitable version of LyX available for your distribution. I prefer to get the source RPM, like lyx-1.2.0-91.src.rpm and then compile it with

rpm --rebuild /usr/src/lyx-1.2.0-91.src.rpm

Update: Versions 1.3.2 and 1.3.4 of LyX do NOT work for our purposes! Version 1.3.2 brings the error Counter does not exist: sect1 and version 1.3.4 does not contain the “begin{equation}” commands in the alt part of the equation element (see LyX 1.3.4 not suitable for DocBook and Mathematics work), thus defeating all our efforts to use the DBTeXMath method (see Section 10.1). I am currently in contact with the LyX development team to iron these problems out. In the meantime, if you are having difficulties with your own LyX version, you can use the following RPMs to install version 1.2.0-91 which is guaranteed to work:

If you want to compile version 1.2.0-91 for your own Linux system, here are the source RPMs:

Note that the two source RPMs are practically identical, up to the lyx.spec file and an extra dif file for src/frontends/xforms/GUIRunTime.C. This is due to the renaming of some packages under SuSE 9.0, which makes changes in the “# usedforbuild” part of the spec file necessary, as well as to the check for the xforms package version that accepts only versions 0.88 and 0.89 - while in SuSE 9.0 we have already arrived at xforms 1.0-137! See How to compile an older version for a newer system in RPM for the details.

You might have to install doxygen (SuSE: Series d, install with YaST). The rebuild process of rpm creates a new RPM package of LyX, as one can see from the last lines of the long output:

Processing files: lyx-1.2.0-91 
Finding Provides: (using /usr/lib/rpm/find-provides)... 
Finding Requires: (using /usr/lib/rpm/find-requires)... 
Requires: tetex te_latex /bin/bash /bin/sh /usr/bin/perl
/usr/bin/python ld-linux.so.2 libICE.so.6 libSM.so.6 libX11.so.6
libXpm.so.4 libc.so.6 libc.so.6(GLIBC_2.0) libc.so.6(GLIBC_2.1)
libc.so.6(GLIBC_2.1.3) libforms.so.0.89 libjpeg.so.62 libm.so.6
libm.so.6(GLIBC_2.0) libstdc++-libc6.2-2.so.3 
Wrote: /usr/src/packages/RPMS/i386/lyx-1.2.0-91.i386.rpm

Install the newly created RPM package as usual, either with YaST, or with

rpm -Uvh /usr/src/packages/RPMS/i386/lyx-1.2.0-91.i386.rpm

Note: Please note:

As you can see from the above output of the rebuild command, LyX requires the following packages and libraries to be already installed on your system:

Requires: tetex te_latex /bin/bash /bin/sh /usr/bin/perl
/usr/bin/python ld-linux.so.2 libICE.so.6 libSM.so.6 libX11.so.6
libXpm.so.4 libc.so.6 libc.so.6(GLIBC_2.0) libc.so.6(GLIBC_2.1)
libc.so.6(GLIBC_2.1.3) libforms.so.0.89 libjpeg.so.62 libm.so.6
libm.so.6(GLIBC_2.0) libstdc++-libc6.2-2.so.3

On SuSE systems, if you install LyX with the above rpm command, it is a good idea to either run SuSEconfig, or just the part of SuSEconfig that is relevant to LyX, /sbin/conf.d/SuSEconfig.lyx:

/sbin/conf.d/SuSEconfig.lyx
Running LyX configure script ...

On other systems, you may have to reconfigure LyX by hand: This is done from the menu Edit-->Reconfigure. See Section 4.1.

Now your LyX is up-to-date. You just have to write your document with it. Since LyX comes with a well written Tutorial (itself written in LyX), as well as a User Guide (both easily accessed from the Help menu), I will not delve into the details of writing with LyX here.


3.2. DocBook

Here is what the LDP-Author-Guide says about DocBook:

To explain what DocBook is, we must first take a look at what SGML and XML are, and their relationship to DocBook.

The Standard Generalized Markup Language (SGML) is a language that is based on embedding codes within a document. In this way, it is similar to HTML, but there is where any similarities end. The power of SGML is that unlike WYSIWYG (What You See Is What You Get), you don't define things like colors, or font sizes, or even some kinds of formatting. Instead, you define elements (paragraph, section, numbered list) and let the SGML processor and the end program worry about placement, colors, fonts, and so on. HTML does the same thing, and is actually a subset of SGML. SGML has really three parts that make it up. First is the Structure, which is what is commonly called the DTD, or Document Type Definition. The DTD defines the relationship between each of the elements (or tags). The DocBook DTD, used to create this document, is an example of this. The DTD lists the rules that the content must follow. Second is the DSSSL or Document Style Semantics and Specification Language. The DSSSL tells the program doing the rendering how to convert the SGML into something that a human can read. It tells the renderer to convert a title tag into 14 point bold if it is going to RTF format, or to turn it into a <h1> tag if it is going to HTML. Finally there is the Content, which is what gets rendered by the SGML processor and is eventually seen by the user. This paragraph is content, but so is a graphic image, a table, a numbered list, and so on. Content is surrounded by tags to separate each element.

The following features must be installed to make DocBook usable (the RPM names and versions I mention refer to the ones I have installed on my SuSE Linux 7.3 system):

  • The DocBook DTD version 4.1 or version 3.1. LyX, starting from version 1.2.0, uses version 4.1 while older versions use version 3.1 (RPM: docbook_4-4.1-97 and docbook_3-3.1-98 respectively; I also have html-dtd-2001.11.7-0 installed). More recent versions, like docbook_3-3.1-468 and docbook_4-4.2-362, also seem to work.

  • The ISO entities, that define some standard SGML entities (e.g. &gt;, &lt;, etc.) (RPM: iso_ent-2000.11.03-122. The newer iso_ent-2000.11.03-531 also seems O.K.)

  • Norman Walsh's modular DocBook DSSSL stylesheets (RPM: docbook-dsssl-stylesheets-1.78-78). The 1.72 and 1.78 versions are known to work. The 1.79 version has some problems (for example, it does not display the information in <othercredit> elements, see Trouble with new version of lyxtox scripts), so it is not recommended. You can get the 1.78 version from docbook-dsssl-stylesheets-1.78-78.

  • Jade/jadetex/openjade (version 3.0) (RPM: openjade-1.3-289, jade_dsl-1.2.1-369, jadetex-3.11-65). See also Section 3.4.

  • SGMLtools-lite (RPM: sgmltools-lite-3.0.2-164). Currently not needed, see also Section 3.3.

  • Of course, you have to satisfy all the dependencies of the above packages. This can be quite a nightmare if you choose to do it “by hand”, as you can read in my Jade installation notes.


3.3. sgmltools

sgmltools-lite is a package whose purpose is to simplify the document creation process from SGML to some other format. It takes the complexity of the various commands involved and takes care of all the invocations and the options required. I used sgmltools-lite-3.0.2-164. However, the current version of the scripts does not use sgmltools anymore, so you don't need it. Some code of the print.dsl file of this package was used in the lyxtox stylesheets, but again, no changes are needed on your part.


3.4. Openjade, pdfTeX and JadeTeX

Openjade renders the SGML documents (that we will export from LyX) to the various other formats, like HTML, RTF, LaTeX etc. I use:

  • openjade-1.3-289: renders and validates the SGML code based on the DTD and DSSSL stylesheet.

  • pdfTeX-0.13d. I had to upgrade to this version for the same reason I had to upgrade openjade.

  • JadeTeX-3.11-65 (needed to process the TeX format created by openjade when called with the “-b tex” option).

Due to an error in a rather exotic situation (see Section 6.3.4), I recently had to upgrade pdfTeX to version 1.11b, or more precisely (Web2C 7.5.2) 3.141592-1.11b, and JadeTeX to version 3.13 - definitely recommended.

See the LDP Author's Guide (7) for more details on these programs.


3.5. TeX and LaTeX

LyX is a LaTeX front end. It was primarily designed with TeX and LaTeX in mind. It provides a more or less WYSIWYG environment for TeX/LaTeX. Further, since we will use the TeX format to create PDF documents through pdfjadetex, we will need as much TeX related machinery as we can get. This practically means that you should install all the TeX and LaTeX packages that come with your distribution. For example:

  • te_etex-1.0.7-319

  • texinfo-4.0-268

  • tetex-1.0.7-319

  • te_latex-1.0.7-319

  • te_pdf-1.0.7-476

  • db2latex-0.5.1-15


3.6. Dvips, Ghostscript and ImageMagick

You'll need dvips for the creation of the PostScript®[3] format. You will need Ghostscript for the creation of thumbnails for PDF (see Section 3.7), as well as for various conversions of your images (see Section 4.9), where ImageMagick will also play a central role:

  • dvips[4]

  • ghostscript

  • xdvi[5]

  • ghostview (package gv-3.5.8-718)

The latter two programs are previewers for files in DVI and PostScript® format. If you don't know what a DVI file is, you've probably also never worked with LaTeX and should read the Tutorial document before proceeding further.


3.7. thumbpdf

The thumbpdf package by Heiko Oberdiek installs the Perl program thumbpdf on your system. With the help of thumbpdf and Ghostscript (which should also be installed), you can create thumbnails for the PDF document (to be seen when you click on the “Thumbnails” tab in Acrobat® Reader[6]). Thumbnails are embedded images of the document's pages, drawn in small size and resolution. Their purpose is to facilitate navigation through the document (of course only if the PDF viewer supports them).

Thumbnails will be created automatically by the lyxtox script and will be embedded in the PDF document without any user intervention (see Section 7.1.4.7 for a detailed description). You just have to take care that the thumbpdf package is installed.

You can download thumbpdf from CTAN: thumbpdf. After download and extraction of the package, the files

  • readme.txt (documentation)

  • thumbpdf.tex (pdftex)

  • thumbpdf.sty (pdf(e)tex, pdf(e)latex, (e)tex, (e)latex)

should be moved to

  • texmf/doc/generic/thumbpdf/readme.txt

  • texmf/tex/generic/thumbpdf/thumbpdf.tex

  • texmf/tex/generic/thumbpdf/thumbpdf.sty

respectively. The Perl script itself, thumbpdf.pl, may be renamed to thumbpdf:

mv thumbpdf.pl thumbpdf

Ensure that the execute permission is set:

chmod +x thumbpdf

then move the file to a directory where the shell can find it (according to the PATH environment variable, e.g. /usr/local/bin/):

mv thumbpdf /usr/local/bin/thumbpdf

Requires:

  • Perl5 (version 5 of the perl interpreter).

  • Ghostscript:

    • Thumbnail generation: version 5.50 or, better, 6.0.

    • Thumbnail inclusion with ps2pdf: version 6.0.

  • pdfTeX.
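
Before running lyxtox, it can be worth confirming that thumbpdf and the programs it requires are actually on the PATH. The following is only a minimal sketch: the function name missing_tools is made up for illustration, and the executable names in the usage line below (perl, gs, pdftex, thumbpdf) are assumptions about how these programs are installed on your system.

```shell
# Hypothetical helper (not part of the lyxtox package):
# print every command name given as an argument that cannot be
# found on the PATH, one per line. Prints nothing if all are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

A call such as "missing_tools perl gs pdftex thumbpdf" prints nothing when all four programs are found and lists the missing ones otherwise.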


3.8. Sed and awk

sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). We will use sed extensively through scripts like runsed, a wrapper around sed that takes a “sed script” (like sedscr) containing sed commands as its first argument and the file to be transformed as its second.
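
To illustrate the idea of such a wrapper, here is a minimal sketch. It is not the runsed script that ships with the package; the function name apply_sed_script and the use of a ".tmp" file next to the original are assumptions made for this example.

```shell
# Hypothetical stand-in for a runsed-like wrapper (not the packaged script).
# Usage: apply_sed_script SCRIPT FILE
apply_sed_script() {
    script=$1
    file=$2
    # Write to a temporary file first, so that FILE is not truncated
    # while sed is still reading it; replace FILE only if sed succeeded.
    sed -f "$script" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

With a script file containing the single line "s/foo/bar/g", calling apply_sed_script on a document replaces every "foo" in it with "bar".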

awk searches files for lines (or other units of text) that contain certain patterns. When a line matches one of the patterns, awk performs specified actions on that line. awk keeps processing input lines in this way until it reaches the end of the input files. We will use awk to split the sed-processed files into header, body and footer, in order to be able to manipulate these parts separately, before reassembling them into a final document for further processing.
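
The splitting into header, body and footer can be sketched as follows. This is an illustration only, not one of the awk scripts from the package: the function name split_parts, the output file names and the two patterns passed by the caller are all assumptions of this example.

```shell
# Hypothetical helper: split FILE into FILE.header, FILE.body and FILE.footer.
# START and END are awk regular expressions chosen by the caller:
# the first line matching START begins the body, the first later line
# matching END begins the footer.
# Usage: split_parts FILE START END
split_parts() {
    file=$1
    start=$2
    end=$3
    awk -v start="$start" -v end="$end" -v base="$file" '
        BEGIN { part = "header" }
        part == "header" && $0 ~ start { part = "body" }
        part == "body"   && $0 ~ end   { part = "footer" }
        { print > (base "." part) }
    ' "$file"
}
```

After manipulating the three parts separately, they can be concatenated again with cat in the order header, body, footer.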

Most probably, your Linux distribution has already installed sed and awk for you. In case it hasn't, use the package management tool of the distribution to install them, or compile them from source if you feel like it.


3.9. Lynx

Lynx is a fully-featured World Wide Web (WWW) client for users running cursor-addressable, character-cell display devices (e.g., vt100 terminals, vt100 emulators running on Windows 95/NT or Macintoshes, or any other "curses-oriented" display). It will display hypertext markup language (HTML) documents containing links to files residing on the local system, as well as files residing on remote systems running Gopher, HTTP, FTP, WAIS, and NNTP servers. Current versions of Lynx run on Unix, VMS, Windows 95/NT, 386DOS and OS/2 EMX.

We will use Lynx to transform the generated HTML version of our document to the plain text version, so you should have Lynx installed on your system.


3.10. HTML tidy

When editing HTML it's easy to make mistakes. Wouldn't it be nice if there was a simple way to fix these mistakes automatically and tidy up sloppy editing into nicely laid out markup? Well now there is! Dave Raggett's HTML TIDY is a free utility for doing just that. It also works great on the atrociously hard to read markup generated by specialized HTML editors and conversion tools (like the ones we will be using), and can help you identify where you need to pay further attention to making your pages more accessible to people with disabilities (see Chapter 9).

Tidy is able to fix up a wide range of problems (which is very good if you are trying to produce valid HTML, as described in Chapter 8) and to bring to your attention things that you need to work on yourself. Each item found is listed with the line number and column so that you can see where the problem lies in your markup. Tidy won't generate a cleaned up version when there are problems that it can't be sure of how to handle. These are logged as "errors" rather than "warnings".

See htmltidy for more details. We call htmltidy from within the lyxtox script (see Section 7.1.4.6 for the details). Download it and install it somewhere in your path.


3.11. Refdb

Note Alternative way for Bibliography
 

You are not confined to using RefDB with my lyxtox script. If you don't feel like building your own bibliographic database, you can skip this section and just supply a bibliography.lyx file together with your LyX document. Set the process_RefDB variable in lyxtox to "0" and it will use your own bibliography.lyx to produce a bibliography.sgml file, instead of trying to create one automatically through RefDB. The bibliography.lyx file should then contain the SGML code for the references list, in the SGML environment of LyX. The GNU/Linux Command-Line Tools Summary HOWTO uses this approach, for example. See the bibliography.lyx and bibliography.sgml files in the Formats section of GNU/Linux Command-Line Tools Summary HOWTO.

Be warned, however, that writing a bibliography file with all your references in SGML is not fun and does not solve the problem of formatting those references in the style of the journal (or medium) to which you are submitting your work (for this you would need an extra DSSSL stylesheet). In the long run, these two disadvantages will work against you to the point of being a real pain, especially if you submit the same work to more than one journal with conflicting formatting style guidelines regarding references.

Read Who should use refdb? - it will help you decide whether you need RefDB or not.

Important RefDB problems
 

Currently, version 0.9.4-pre5 of RefDB worked fine for some time, then, after adding a few citations and a few pages more to this document, refdbxp started to segfault. I tried 0.9.4, just released, but got installation problems and a "could not read from refdbd" message each time I tried anything. However, this does not mean that the following is untested or that RefDB does not work - during the short time I got it to work, it worked fine, as did all the scripts and stylesheets I present here. It also does not mean that refdbxp will segfault on your system - as always, YMMV.

RefDB is a reference database and bibliography tool for SGML, XML, and LaTeX/BibTeX documents. It allows users to share databases over a network. It is lightweight and portable to basically all platforms with a decent C compiler. And it's released under the GNU General Public License (2).

RefDB is currently known to build and run out of the tarball on at least these platforms:

  • Linux

  • FreeBSD

  • NetBSD

  • Solaris (using gcc)

  • OSX/Darwin

  • Windows+Cygwin

RefDB appears to be the only available tool to create HTML, PostScript, PDF, DVI, MIF, or RTF output from DocBook or TEI sources with fully formatted citations and bibliographies according to publisher's specifications. If you want to include bibliographies in LyX, RefDB is the way to go. The standard way, using BibTeX, will NOT work in our SGML context.

To install RefDB, it is highly recommended to get the newest version currently available. Due to various bugs, versions older than 0.9.4-pre5 will not work. By the time you read this, version 0.9.4 should be out and you should use that one - here I will describe the installation procedure for the 0.9.4-pre5 version:

Download RefDB from the RefDB downloads page, along with the Perlmod package, currently RefDB-perlmod-0.3.tar.gz (you can skip the latter if you don't require the MARC and Pubmed import filters). First, install any Perl modules you need, such as RefDB-perlmod, MARC::Record and MARC::Charset:

For RefDB-perlmod (adapt the version number in the commands to the archive you actually downloaded):

tar -xzvf RefDB-perlmod-0.2.tar.gz
cd RefDB-perlmod-0.2/
perl Makefile.PL

You get an output like:

Checking if your kit is complete...
Looks good
Writing Makefile for RefDB::perlmod

After that, install perlmod with

make install

For the other two modules, MARC::Record and MARC::Charset, the installation can be done in the usual manner for CPAN modules:

perl -MCPAN -eshell
install MARC::Record
install MARC::Charset

I could not install MARC::Charset because it needs Perl 5.8.0 - however, this does not interfere with my method here, so you can safely skip it.

After the Perl modules, you must install the libdbi package: Go to the libdbi homepage and download the latest libdbi version libdbi-0.7.2.tar.gz. Then go to the libdbi-drivers homepage and download the latest libdbi-drivers package libdbi-drivers-0.7.1.tar.gz. Then, uninstall any previous libdbi or RefDB package with

make uninstall

from their source directory and you are ready for the installation of libdbi:

cd /usr/src/libdbi-0.7.2
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --infodir=/usr/share/info \
  --mandir=/usr/share/man --with-gnu-ld
make
make install

Adapt the configure options to your situation; those above are the ones I use on my old SuSE system. The installation of libdbi-drivers is accomplished similarly:

cd /usr/src/libdbi-drivers-0.7.1
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --infodir=/usr/share/info \
  --mandir=/usr/share/man --with-gnu-ld --with-mysql --with-pgsql
make
make install
make check

Note the “--with-mysql” and “--with-pgsql” options in the configure command above: for every libdbi database driver you plan to use, you have to insert the appropriate “--with-” option. Type “./configure --help” to see all options. However, it makes no sense to say you want libdbi-drivers to be compiled with the pgsql driver, for example, if you don't have PostgreSQL installed and running.
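
The rule "one --with- option per driver" can be sketched as a small helper that builds the option list from the driver names. The function name driver_flags is made up for this example; which driver names are valid is decided by the configure script itself, so check “./configure --help” before relying on any name.

```shell
# Hypothetical helper: turn a list of libdbi driver names into the
# corresponding "--with-NAME" configure options, separated by blanks.
# Usage: driver_flags mysql pgsql
driver_flags() {
    flags=""
    for driver in "$@"; do
        flags="${flags:+$flags }--with-$driver"
    done
    printf '%s\n' "$flags"
}
```

For example, "driver_flags mysql" prints "--with-mysql", which could then be appended to the configure command line if MySQL is the only database server you have installed and running.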

In the "make check", you must give the database administrator's name and password (not that of a user, as in the 0.6.x version). Just accept the other values offered:

database hostname? [(blank for local socket if possible)]
database name? [libdbitest]

The output of “make check” starts with:

Plugin information:
-------------------
        Name:       mysql
        Filename:   /usr/lib/dbd/libmysql.so
        Desc:       MySQL database support (using libmysqlclient)
        Maintainer: Mark M. Tobenkin <mark@brentwoodradio.com>
        URL:        http://libdbi-drivers.sourceforge.net
        Version:    dbd_mysql v0.7.1
        Compiled:   Jan 30 2004
Successfully connected!

and continues with the outcome of various tests (create a database, select a database, list tables etc.). After all tests are completed successfully, you will (hopefully) see

SUCCESS! All done, disconnecting and shutting down libdbi. Have a nice day.
PASS: test_dbi
==================
All 1 tests passed
==================

Finally, you are ready for the installation of RefDB (instructions pertain to version 0.9.4-pre5):

cd /usr/src/refdb-0.9.4-pre5
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --infodir=/usr/share/info \
  --mandir=/usr/share/man --with-sgml-declaration=/usr/share/sgml/docbook_4/docbook.dcl \
  --with-xml-declaration=/usr/share/sgml/openjade/xml.dcl \
  --with-docbook-xsl=/usr/share/sgml/docbook/docbook-xsl-stylesheets \
  --with-refdb-url=http://midas/refdb --with-libdbi-lib=/usr/lib
make
make install

You must adapt the options of the configure command to your situation. Read the excellent RefDB documentation for more details.

This completes the RefDB installation - but you are not done yet! You must create the refdb database and grant yourself all privileges on it. In MySQL, on the mysql prompt, you type:

CREATE DATABASE refdb;
grant all privileges on refdb.* to chris@localhost identified by 'password';

On the shell prompt, type:

mysql -u root -p refdb < /usr/share/refdb/sql/refdb.dump

to populate the refdb database. Note that the refdb database is NOT the database you will use for your bibliography entries, but a central repository of information necessary for RefDB's internal functions.

Caution The refdb name has changed between versions!
 

For version 0.9.4-pre5, the database is called refdb, NOT refdb1! That "central repository" of RefDB used to be called "refdb", then it became "refdb1" and in 0.9.4-pre5 "refdb" again. I didn't know this and I kept on getting errors saying

failed to connect to database server

without telling me why! After I changed

 if (dbi_conn_connect(conn) < 0) { /* -1 and -2 indicate errors */
 LOG_PRINT(LOG_WARNING, "failed to connect to database server");
 dbi_conn_close(conn);
 return NULL;
 }

in src/refdba.c to

 const char *errmsg;
 if (dbi_conn_connect(conn) < 0) { /* -1 and -2 indicate errors */
 dbi_conn_error(conn, &errmsg);
 printf("\nUnable to connect! Error message: %s\n", errmsg);
 LOG_PRINT(LOG_WARNING, "failed to connect to database server");
 dbi_conn_close(conn);
 return NULL;
 }

I was able to see that the reason was that it was trying to connect to refdb, and NOT to refdb1:

Unable to connect! Error message: 1044: Access denied for user: 'chris@localhost' to database 'refdb'
failed to connect to database server
command processing done, finish dialog now
child finished client on fd 5
child exited with code 0
server waiting n_max_fd=4

The next step is to create the global configuration files. There are half a dozen of them, one for each RefDB tool. They are very well commented, so you will not encounter any problems in setting up the options there.

Further, you will have to import some citation styles. There are only two readily available: J.Biol.Chem.xml and Eur.J.Pharmacol.xml, both in the /usr/src/refdb-0.9.4-pre5/styles directory. To import them, start the RefDB administration tool refdba and type:

addstyle /usr/src/refdb-0.9.4-pre5/styles/J.Biol.Chem.xml
addstyle /usr/src/refdb-0.9.4-pre5/styles/Eur.J.Pharmacol.xml

in the refdba prompt. Then, you can check which styles you have using the liststyle command in refdba:

refdba: liststyle .*

which will produce:

J.Biol.Chem.
Eur.J.Pharmacol.

You must also create at least one database that will hold your bibliographic entries. For example, to create the ck_refdb database, start refdba and type on the refdba prompt:

createdb ck_refdb
adduser -d ck_refdb chris -N newpassword

If you want to enter the bibliographic entries via a web interface and not from the command line, you have to configure Apache by inserting the following in its configuration file (adapt it to your situation accordingly):

Alias /refdb/ /usr/share/refdb/www/
<Directory "/usr/share/refdb/www">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
# Allow from all
Allow from 192.168.0.0/24
</Directory>

then copy some programs in the cgi-bin directory:

cd /usr/bin/
cp refdbc bib2ris nmed2ris /usr/local/httpd/cgi-bin/

and restart Apache.

To test RefDB, start the refdbd daemon with:

refdbd -s -e 0 -l 7

You should see something like:

refdbd -s -e 0 -l 7
dbi_driver_dir went to:
libdbi: Failed to load driver: /usr/lib/dbd/libpgsql.so
dbi is up using default driver dir
Available libdbi database drivers:
mysql
application server started
use /tmp/refdbd_fifo25577 as fifo
server waiting n_max_fd=4

If you don't have a database server correctly configured, although you asked for support of it in the libdbi-drivers configure command at compilation time, you will get an error like the above for PostgreSQL (libpgsql.so). In refdba, type

viewstat

If the server output looks like the following:

You are served by: refdb 0.9.4-pre5
Client IP: 127.0.0.1
Connected via mysql driver (dbd_mysql v0.7.1)
to: 3.23.44-log
serverip: localhost
timeout: 60
dbs_port: 3306
logfile: /var/log/refdbd.log
logdest: 0
loglevel: 7
remoteadmin: off
pidfile: /var/run/refdbd.pid

then you know that everything is working fine. See the RefDB documentation for more details.
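
If you have saved the refdbd start-up messages shown earlier in this section to a file, the drivers that failed to load can be picked out mechanically. This is a minimal sketch that assumes the messages have exactly the "libdbi: Failed to load driver:" form shown above; the function name failed_drivers is made up for this example.

```shell
# Hypothetical helper: read refdbd start-up messages on standard input
# and print the path of every driver that libdbi failed to load.
failed_drivers() {
    sed -n 's/^libdbi: Failed to load driver: //p'
}
```

An empty result means that no such failure message was found in the saved output.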


Chapter 4. Required preliminary steps

Just installing the required software (see Chapter 3) will not do the trick. There are quite a few details that need your attention and have to be configured correctly. In this chapter I will discuss all those preliminary steps. Don't worry, you will need to do them only once. Then, you may forget them forever.


4.1. Reconfigure LyX

Before you write anything in LyX, and after having installed all the required packages (see Chapter 3), you should reconfigure LyX to let it take into account all this software. Just do Edit-->Reconfigure. Check the output of this command (if you did not put LyX in the background, you should see it in the x terminal you started LyX in). You should see among others:

+checking for docbook class docbook-algo... yes 
+checking for docbook class docbook-book... yes 
+checking for docbook class docbook-chapter... yes 
+checking for docbook class docbook... yes 
+checking for docbook class docbook-section... yes 

If you don't, you are missing the DocBook DSSSL stylesheets (see Section 3.2) or something else went wrong. Don't continue before you fix this.
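
If you capture the reconfigure output in a file, the check can be automated. The sketch below assumes that a failed check ends in something other than "yes" (for example "no"); that is an assumption about LyX's output, since only the successful lines are shown above. The function name failed_class_checks is made up for this example.

```shell
# Hypothetical helper: read the LyX reconfigure output on standard input
# and print the name of every docbook class whose check did not end
# in "yes".
failed_class_checks() {
    awk '/checking for docbook class/ && $NF != "yes" {
        name = $(NF - 1)
        sub(/\.\.\.$/, "", name)   # strip the trailing "..."
        print name
    }'
}
```

No output means that every docbook class check in the captured text ended in "yes".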


4.2. Adapt the DocBook DSSSL stylesheets

There are a lot of changes that must be done in the DocBook DSSSL stylesheets. The best way to incorporate them is to keep copies of the following files in your working directory:

See Section 7.1.5 for an explanation of the changes that have been incorporated in the above files. Make sure that their location is correctly set in lyxtox:

  HTML_CHUNKS_DSL="lyxtox-html.dsl"
  HTML_NOCHUNKS_DSL="lyxtox-onehtml.dsl"
  PRINT_PDF_DSL="lyxtox-print-pdf.dsl"
  PRINT_PS_DSL="lyxtox-print-ps.dsl"
  PRINT_RTF_DSL="lyxtox-print-rtf.dsl"
  PRINT_TXT_DSL="lyxtox-print-txt.dsl"

Also, you must insert "PDF" in the notation.class of the /usr/share/sgml/docbook_4/dbnotn.mod file:

<!ENTITY % notation.class "BMP| CGM-CHAR | CGM-BINARY | CGM-CLEAR | DITROFF | DVI | EPS | EQN | FAX | GIF | GIF87a | GIF89a | JPG | JPEG | IGES | PCX | PDF | PIC | PNG | PS | SGML | TBL | TEX | TIFF | WMF | WPG | linespecific %local.notation.class;"> 

If you still use LyX v.1.1.x, you should change /usr/share/sgml/docbook_3/dbnotn.mod to include “PDF” in the list of accepted file extensions:

<!ENTITY % notation.class
                "BMP| CGM-CHAR | CGM-BINARY | CGM-CLEAR | DITROFF | DVI
                | EPS | EQN | FAX | GIF | GIF87a | GIF89a
                | JPG | JPEG | IGES | PCX
                | PIC | PS | SGML | TBL | TEX | TIFF | WMF | WPG | PDF | PNG
                | linespecific
                %local.notation.class;">

If you omit this change, you will get a 'value of attribute "FORMAT" cannot be "PDF"' error when trying to produce a PDF (due to a 'format=”PDF”' attribute in the code for images, see Section 7.2.2). This error is also discussed in Chapter 6.
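
Whether a given dbnotn.mod already lists “PDF” can be checked before running lyxtox. The following is a sketch under the assumption that the declaration starts with the literal text "<!ENTITY % notation.class" and ends at the first ">" that follows, as in the two examples above; the function name has_pdf_notation is made up for this example.

```shell
# Hypothetical helper: succeed (exit status 0) if the notation.class
# declaration in FILE contains PDF as a separate word, fail otherwise.
# Usage: has_pdf_notation /usr/share/sgml/docbook_4/dbnotn.mod
has_pdf_notation() {
    awk '
        /<!ENTITY % notation\.class/ { inside = 1 }
        inside { decl = decl " " $0 }      # collect the declaration text
        inside && />/ { exit }             # stop at the closing ">"
        END { exit (decl ~ /(^|[^A-Za-z0-9])PDF([^A-Za-z0-9]|$)/) ? 0 : 1 }
    ' "$1"
}
```

The function works for both the one-line and the multi-line form of the declaration, because it collects all lines up to the closing ">" before looking for PDF.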

Open each one of the DSSSL stylesheets that you copied in your working directory and check if the paths used there are correct. The affected files are:

Check whether the paths in the ENTITY declarations are correct for your system. For example, in lyxtox-html.dsl you must check whether the following ENTITY declarations are correct:

<!ENTITY refdblib SYSTEM "/usr/share/refdb/dsssl/lib/refdblib.dsl">
<!ENTITY refdbvar SYSTEM "/usr/share/refdb/dsssl/lib/refdbvar.dsl">

Practically, this amounts to checking the right path for refdblib.dsl and refdbvar.dsl in your system. Repeat this check for the other two files.
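
The path check can be sketched as a small helper that reads the ENTITY declarations of a stylesheet and reports the files that do not exist. It only looks at declarations written on one line with an absolute path, exactly in the form shown above; the function name missing_entity_files is made up for this example.

```shell
# Hypothetical helper: print every absolute SYSTEM path declared in an
# ENTITY line of FILE that does not exist on this system.
# Usage: missing_entity_files lyxtox-html.dsl
missing_entity_files() {
    sed -n 's|^<!ENTITY [^ ]* SYSTEM "\(/[^"]*\)">.*$|\1|p' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}
```

An empty result means that all paths found in the stylesheet point to existing files.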

Note Stylesheet location and RefDB
 

Currently, if you use RefDB, the above stylesheets have to be in the current directory (i.e. the directory in which lyxtox resides). This is because in this case the refdb-html.dsl and refdb-print.dsl will be used. They are automatically generated from the RefDB stylesheet (e.g. J.Biol.Chem.dsl) which, in turn, is also automatically generated by RefDB (see Section 7.1.10.2 for all the details). refdb-html.dsl and refdb-print.dsl will point to the above lyxtox-*.dsl stylesheets for further processing, but they don't do it through a catalog (at the moment). Thus the lyxtox-*.dsl files have to be in the same directory, unless you change the generating awkscr_refdb_html and awkscr_refdb_print scripts.


4.3. Adapt pdftex.cfg

pdftex (part of the te_pdf package which you hopefully installed in Section 3.5) uses a file pdftex.cfg (located in /var/lib/texmf/pdftex/config/pdftex.cfg on my system), which contains amongst other things the names of the so-called “map files”. There is currently only one of these files there, pdftex.map (located in /var/lib/texmf/dvips/config/pdftex.map on my system). It contains the mapping between the short names (like hlsu8r) and the long PostScript® names (like LucidaSans-Bold) of the fonts. It must contain the mappings for the Computer Modern fonts (lines like “ cmr9 CMR9 <cmr9.pfb”), in order to be able to use them in PDF. Normally, this is already done by your distribution and you don't need to change anything.

On other systems, the mapping for the Computer Modern fonts may be located in a different file (perhaps under the name cm.map). Check your TeX/LaTeX documentation.


4.4. Adapt jadetex.cfg

jadetex.cfg is the configuration file for pdfjadetex (see Section 3.4). You should copy it into your working directory. You will definitely want to adapt jadetex.cfg to reflect the right author, keywords etc. for the PDF document (scroll to the end of jadetex.cfg to find the relevant code):

baseurl={http://www.karakas-online.de/mySGML/}, (1)
pdftitle={Document processing with LyX and SGML}, (2)
pdfsubject={Linux,document formatting}, (3)
pdfauthor={Copyright \textcopyright 2003, Chris Karakas}, (4)
pdfkeywords={Linux SGML LyX DSSSL DocBook} (5)
(1)
The baseurl will be added in front of any relative WWW link that you have in your PDF document.
(2)
The pdftitle will appear as the title of your PDF document in Acrobat® Reader under File-->Document Info-->General.
(3)
The pdfsubject will appear as the subject of your PDF document in Acrobat® Reader under File-->Document Info-->General.
(4)
The pdfauthor will appear as the author of your PDF document in Acrobat® Reader under File-->Document Info-->General. You may add copyright information as shown here.
(5)
The pdfkeywords will appear as a list of keywords for your PDF document in Acrobat® Reader under File-->Document Info-->General. Keywords are separated by blanks.

You can see the above information when you choose File-->Document Info-->General, see Figure 4-1.

Figure 4-1. General document info.


These are the minimum required changes for jadetex.cfg. You may change a lot of other settings for the PDF output. The jadetex.cfg file itself contains a lot of code which you may use as a point of departure for your explorations (don't forget to backup the original before you change it!). Some of the more advanced settings are discussed in Section 7.2. But if you are not interested in the gory details, you are already done with the above.


4.5. Check paths of catalog files

Check the lyxtox script and adapt the catalog files paths (see Section 7.1.7) to your situation. You may already have a master catalog installed on your system by your distribution. In my SuSE system, I have one in /etc/sgml/catalog. Open the file and pick the catalogs you need, adding them to the SGML_CATALOG_FILES variable as follows:

# openjade needs this! It is just the content of
# the /etc/sgml/catalog file. Please modify accordingly.
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.iso_ent"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.mathml-2.0"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.svg-1.1"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook_4"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/openjade/catalog"

Not all catalogs from /etc/sgml/catalog should be added to SGML_CATALOG_FILES, although in theory, you should be able to just add the master catalog and let it do the rest (FIXME: I had problems with that, need to investigate why). Also, add the RefDB catalog only if you have RefDB installed:

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/refdb/refdb.cat"
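
Since the catalog locations differ between distributions, one way to avoid pointing openjade at files that do not exist is to append a catalog only when it is readable. This is a sketch, not code taken from lyxtox; the function name add_catalog is made up for this example.

```shell
# Hypothetical helper: append a catalog file to SGML_CATALOG_FILES,
# but only if the file exists and is readable. The colon separator is
# added only when the variable already has a value.
# Usage: add_catalog /usr/share/refdb/refdb.cat
add_catalog() {
    if [ -r "$1" ]; then
        SGML_CATALOG_FILES="${SGML_CATALOG_FILES:+$SGML_CATALOG_FILES:}$1"
    fi
}
```

Used this way, the RefDB catalog is simply skipped on a system where RefDB is not installed.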

4.6. Adapt the preamble

Put the following lines in the preamble (menu Layout-->Preamble) of your LyX document:

<!entity index SYSTEM "index.sgml">
<!entity bibliography SYSTEM "bibliography.sgml"> 
<!entity appendix SYSTEM "appendix.sgml">
<!ENTITY % output.print.png "IGNORE"> 
<!ENTITY % output.print.pdf "IGNORE"> 
<!ENTITY % output.print.eps "IGNORE">
<!ENTITY % output.print.bmp "IGNORE">

If you don't want to do this for each and every document you write in LyX, you can put the above lines at the very start of the template file docbook_article.lyx (located in /usr/share/lyx/templates in my system) as follows:

#LyX 1.2 created this file. For more info see http://www.lyx.org/
\lyxformat 220
\textclass docbook
\begin_preamble
<!entity index SYSTEM "index.sgml">
<!entity bibliography SYSTEM "bibliography.sgml"> 
<!entity appendix SYSTEM "appendix.sgml">
<!ENTITY % output.print.png "IGNORE">
<!ENTITY % output.print.pdf "IGNORE">
<!ENTITY % output.print.eps "IGNORE">
<!ENTITY % output.print.bmp "IGNORE">
\end_preamble
\language english
\inputencoding latin1
\fontscheme default
\graphics default
...

Important IMPORTANT:
 

If you use RefDB (Section 3.11), you have to change the bibliography entity to basename.bib.sgml, where basename is the name of your file without the extension. Example: if your file is myTemplate.lyx and you use RefDB, then the bibliography entity should be declared as follows in the preamble:

<!entity bibliography SYSTEM "myTemplate.bib.sgml">

See Section 7.1.9, Section 7.1.10, Section 7.1.11 and Section 7.2.2 for an explanation of this magic regarding the Appendix, Bibliography, Index and the figures respectively.
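
The naming rule for the RefDB bibliography entity described above can be sketched as a one-line derivation from the name of the LyX file. The function name bibliography_entity is made up for this example; it only prints the declaration and leaves it to you to paste it into the preamble.

```shell
# Hypothetical helper: print the bibliography entity declaration that
# corresponds to a given LyX file when RefDB is used.
# Usage: bibliography_entity myTemplate.lyx
bibliography_entity() {
    base=$(basename "$1" .lyx)
    printf '<!entity bibliography SYSTEM "%s.bib.sgml">\n' "$base"
}
```

Called with myTemplate.lyx, it prints the same declaration as in the example above.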


4.7. Admonitions

For graphical admonitions (see Section 1.7), you will need to copy the admonitions graphics. These are expected in /usr/share/sgml/docbkdsl/images by pdfjadetex, as the following error shows:

Error: pdfjadetex (file /usr/share/sgml/docbkdsl/images/important.pdf): cannot open image file

This is because the lyxtox-print.dsl file (see Section 4.2) contains the lines

    (define %admon-graphics-path%
    "/usr/share/sgml/docbkdsl/images/")

The images themselves are installed in /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/images by the package docbook-dsssl-stylesheets-1.72-34 on my SuSE 7.3 system. Instead of changing the location in lyxtox-print.dsl, I have decided to copy them:

cp /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/images/* /usr/share/sgml/docbkdsl/images/

Warning Warning:
 

DO NOT use a relative path, as in the commented line below! You will get errors of the form

LaTeX Warning: File `./images/important.pdf' not found on input line 5223.

although the files will be there. Most curiously, the images may nevertheless be included, but thumbpdf will fail!

The above is valid only for the print formats (like PDF, PS, RTF etc.). For the HTML output we use the lyxtox-html.dsl and lyxtox-onehtml.dsl, for many HTML files and one HTML file respectively. There, the directory of the admonition graphics is specified with the following code:

(define %admon-graphics-path%
;; REFENTRY admon-graphics-path
;; PURP Path to admonition graphics
;; DESC
;; Sets the path, probably relative to the directory where the HTML
;; files are created, to the admonition graphics.
;; /DESC
;; AUTHOR N/A
;; /REFENTRY
"./images/")

Create a directory with the same name as the .lyx file in your working directory, but without the .lyx ending:

mkdir myTemplate

Change to that directory and extract the admonitions archive:

tar -xzvf admonitions.tar.gz

This will create a subdirectory “images”, with all admonition graphics in it, in all the formats needed. Alternatively, you could copy the admonition images in the various formats into the images directory. The images contained are:

images/caution.bmp
images/caution.eps
images/caution.pdf
images/caution.png
images/important.bmp
images/important.eps
images/important.pdf
images/important.png
images/tip.bmp
images/tip.eps
images/tip.pdf
images/tip.png
images/note.bmp
images/note.eps
images/note.pdf
images/note.png
images/warning.bmp
images/warning.eps
images/warning.pdf
images/warning.png
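
To confirm that the images directory is complete, the list above can be checked mechanically. The following sketch assumes exactly the five admonition names and four formats listed above; the function name missing_admonition_images is made up for this example.

```shell
# Hypothetical helper: print every admonition image (five names, four
# formats each) that is missing from the directory DIR.
# Usage: missing_admonition_images myTemplate/images
missing_admonition_images() {
    dir=$1
    for name in caution important tip note warning; do
        for ext in bmp eps pdf png; do
            [ -e "$dir/$name.$ext" ] || printf '%s\n' "$dir/$name.$ext"
        done
    done
}
```

No output means that all twenty files are present.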

If you have only the GIF versions of the admonitions, you can use the adddscr script as follows:

cd images
adddscr gif

Instead of “gif”, you can use whatever version you happen to have (like “png”, “bmp” etc). The adddscr script will convert the format you designated into PNG first, then use the addd script to add density (see Section 4.9 and Section 7.2.2) and finally will create all the other formats with ImageMagick (see Section 3.6). See Section 7.2.2 for the background.


4.8. Callouts

You can control various parameters regarding callouts in the .dsl files (see Section 4.2 and Section 7.1.5). You should especially check the path to the callout images. In our case, it has to be “./images/callouts”, i.e. the images will be located in a subdirectory of the images directory, which in turn will be located inside the directory of the HTML files.

(define %callout-graphics%
  ;; If true, callouts are presented with graphics (e.g., reverse-video
  ;; circled numbers instead of "(1)", "(2)", etc.).
  ;; Default graphics are provided in the distribution.
  #t)
(define %callout-graphics-path%
  ;; Sets the path, probably relative to the directory where the HTML
  ;; files are created, to the callout graphics.
  "./images/callouts/")
(define %callout-graphics-extension%
  ;; REFENTRY callout-graphics-extension
  ;; PURP Extension for callout graphics
  ;; DESC
  ;; Sets the extension to use on callout graphics.
  ;; /DESC
  ;; AUTHOR N/A
  ;; /REFENTRY
  ".png")
(define %callout-graphics-number-limit%
  ;; If '%callout-graphics%' is true, graphics are used to represent
  ;; callout numbers. The value of '%callout-graphics-number-limit%' is
  ;; the largest number for which a graphic exists. If the callout number
  ;; exceeds this limit, the default presentation "(nnn)" will always
  ;; be used.
  10)

Change to the myTemplate directory (we created it in Section 4.7) and extract the callouts archive:

tar -xzvf callouts.tar.gz

This will create a subdirectory “images/callouts”, with all callout graphics in it, in all the formats needed. Alternatively, you could copy the callout images in the various formats into the images/callouts directory. The images contained are:

images/callouts/10.png
images/callouts/1.png
images/callouts/2.png
images/callouts/3.png
images/callouts/4.png
images/callouts/5.png
images/callouts/6.png
images/callouts/7.png
images/callouts/8.png
images/callouts/9.png
images/callouts/10.eps
images/callouts/10.bmp
images/callouts/10.pdf
images/callouts/1.eps
images/callouts/1.bmp
images/callouts/1.pdf
images/callouts/2.eps
images/callouts/2.bmp
images/callouts/2.pdf
images/callouts/3.eps
images/callouts/3.bmp
images/callouts/3.pdf
images/callouts/4.eps
images/callouts/4.bmp
images/callouts/4.pdf
images/callouts/5.eps
images/callouts/5.bmp
images/callouts/5.pdf
images/callouts/6.eps
images/callouts/6.bmp
images/callouts/6.pdf
images/callouts/7.eps
images/callouts/7.bmp
images/callouts/7.pdf
images/callouts/8.eps
images/callouts/8.bmp
images/callouts/8.pdf
images/callouts/9.eps
images/callouts/9.bmp
images/callouts/9.pdf

If you have only the GIF versions of the callouts, you can use the adddscr script:

cd images/callouts
adddscr gif

Instead of “gif”, you can use whatever version you happen to have (like “png”, “bmp” etc). The adddscr script will convert the format you designated into PNG first, then use the addd script to add density (see Section 4.9 and Section 7.2.2) and finally will create all the other formats with ImageMagick (see Section 3.6).


4.9. Add density to images

Since you will be creating various output formats from the same SGML source, you are going to need not only the PNG images, but also the PDF ones. Those need special preparation in order to be embedded in the PDF document: we need to “add density” to them. This is one of the many reasons why the tools described in this document fail to incorporate images in PDF when run “out-of-the-box”.

The gory details behind this are described in Section 7.2.2. Here, I will describe the most straightforward way to get working images for your PDF documents. For this purpose I have written a small utility, which I call addd (for add density). Download it, make it executable and put it somewhere like /usr/local/bin. It calls two programs which you should also have installed on your system: convert (package ImageMagick) and eps2png. You should also download the adddscr script, make it executable and put it in the images directory. The adddscr script expects a parameter, which should be one of the usual image file endings like “gif”, “png”, “jpg” etc. The idea is the following: you have all your images under the images directory and they are all of the same type, say GIF. Then you just call

adddscr gif

The adddscr script will convert all GIFs in the current directory to PNG format, then call the addd utility to add the right density to each image. Since addd produces .pdf and .eps from a given .png image and adddscr produces an additional .bmp, you end up with all the required image formats with the right properties for subsequent inclusion in the various documents, be it PS, PDF or RTF. You may then delete your GIFs; you will not need them anymore (see Burn All GIFs).

If your images are all of type JPG, just call

adddscr jpg

in the images directory. If you have produced just a single image, you must call addd manually and then convert the PNG file to BMP. Example: suppose you have just produced an image in JPG format, say myimage.jpg. Then do:

convert myimage.jpg myimage.png
addd myimage
convert myimage.png myimage.bmp
Caution

You will need to repeat the above steps for each and every image you produce! If you omit them, or use your own .pdf and .eps versions, they will most probably FAIL to be embedded in your PDF or PS document!
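
If you have several stray images to process, it may help to see the required steps listed before running them. The following is a minimal sketch under stated assumptions: plan_density_steps is a hypothetical name, the function is not the adddscr script, and it only prints the three commands shown above for every image with the given ending in the current directory, without executing anything.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the conversion steps for
# every image with the given ending in the current directory.
plan_density_steps() {
    ext="$1"
    for f in *."$ext"; do
        [ -e "$f" ] || continue      # no match: the glob stays literal
        base="${f%.$ext}"
        # PNG images need no initial conversion.
        [ "$ext" = "png" ] || echo "convert $f $base.png"
        echo "addd $base"
        echo "convert $base.png $base.bmp"
    done
}
```

For a directory that contains only myimage.jpg, calling plan_density_steps jpg prints exactly the three commands of the example above.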


4.10. Run sed and awk scripts

Copy the following files into your working directory:

  • sedscr (a sed script that corrects LyX's exported SGML file),

  • sedscr_top (a sed script you can use to eliminate the “_top” target attribute of links whose link text contains a given regular expression string of your choice),

  • sedscr_val (a sed script that effects all changes that are necessary for the HTML document to validate as conforming to the HTML standards, see Chapter 8 for this subject),

  • sedscr_ris (a sed script that can create full RIS datasets out of a file containing URLs - included only for your convenience and not absolutely necessary for our method, see more details in Section 3.11, Section 5.19 and Section 7.1.10),

  • sedscr_abi (a sed script that will append the SGML entities (as defined in the Preamble, see Section 4.6) for the Appendix, the Bibliography and the Index at the end of the corrected SGML file, see Section 7.1.9),

  • sedscr_app (a sed script that will insert a label and title in the Appendix, as well as change the end tag from </article> to </appendix>),

  • sedscr_cit (a sed script that will create a LyX file containing citation labels, to be used in citations),

  • sedscr_bib (a sed script that corrects the Appendix code, for the case we insert a bibliography after it, see Section 7.1.9),

  • awkscr_math (an awk script that prepares the Mathematics parts, like equations, for further processing, see Chapter 10, Section 10.1, Section 10.3),

  • awkscr_refdb_html and awkscr_refdb_print, used to create the necessary stylesheets if you are using RefDB (see Section 3.11, Section 5.19 and Section 7.1.10),

  • sedscr_tidy, a very rudimentary script that tries to reduce line length of the SGML file by inserting newlines after <para> and </para> tags. Also sedscr_tidy2, another sed script, to correct the first tidy script. You would run these two as follows:

    # Tidy up the SGML file.
    # ${RUNSED} ${SEDSCRTIDY} $1.sgml
    # ${RUNSED} ${SEDSCRTIDY2} $1.sgml
    

    However, they don't produce correct results, so the calls are commented out in the lyxtox script.

  • sedscr_ima, a sed script that is used to produce another sed script, sedscr_img (sedscr_img is not included, as it is produced dynamically from the SGML file of the document and the sed script sedscr_ima).

  • sedscr_apa, a sed script that is used to erase <acronym>, <productname> and <application> tags from the alt and title texts in the dynamically created sed script sedscr_img.

  • lyxtox, the main script that creates all documents using the above scripts and the rest of the required software (Chapter 3).

Copy runsed somewhere like /usr/local/bin. lyxtox and runsed should be executable.

runsed is a simple script that I modified from the original runsed script found in O'Reilly's Unix Power Tools, Chapter 34, Section 3 “Testing and Using a sed Script: checksed, runsed”. It simply takes two filenames as arguments and then runs sed on the second file, using the first file as a sed script:

runsed sedscript file

A sed script is a script that tells sed what to do. sed, in turn, is a powerful line editor suitable for batch processing (see Section 3.8). You don't have to worry about runsed, sedscr and lyxtox. You may want to have a look at lyxtox, just to ensure that all paths are correctly set and that you get some idea of what it does. It is very well commented. The gory details are in Section 7.1.
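
To make the idea concrete, here is a minimal sketch of a runsed-like function. It is not the runsed script from the package: the name runsed_sketch is made up, and keeping the original as a .bak copy is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical runsed-like helper: apply a sed script to one or more
# files in place, keeping each original as <file>.bak (an assumption).
runsed_sketch() {
    script="$1"
    shift
    for file in "$@"; do
        cp "$file" "$file.bak" || return 1
        # Read from the backup, write the edited text to the original name.
        sed -f "$script" "$file.bak" > "$file" || return 1
    done
}
```

Usage mirrors the call shown above: runsed_sketch sedscript file.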


4.11. Set up your start and end scripts

You are going to need two scripts, with the endings .start and .end respectively, and the same basename as your LyX document (without the LyX ending). Example: if you are processing myLyxfile.lyx, then you can create myLyxfile.start and myLyxfile.end. These files can contain code of your choice to be executed at the start and at the end of the lyxtox script. The .start file should contain at least the lines:

# Title of this document.
TITLE="LyX and SGML"
FORMATSFILE="formats.html"
COPYRIGHT="All contents <a href=\"license.html\">\&copy;<\/a> 2002-2006 <a href=\"http:\/\/www.karakas-online.de\">Chris Karakas<\/a>"
HOMEFILE="book1.html"
# Flags
# Set to "1" to process math.
process_math="1"
# Set to "1" if you have RefDB installed
# and want to create the bibliography.sgml file 
# through RefDB.
process_RefDB="1"
RefDB_db="ck_refdb"
REFDB_style="J.Biol.Chem."

The values for TITLE, FORMATSFILE, COPYRIGHT, HOMEFILE etc. are used in the HTML file generation. You may need them or not, depending on whether you use them in your part1, part2 and part3 files, which are responsible for your custom header and footer (see Section 7.1.4.6).

The values for process_math and process_RefDB are necessary. If you don't use mathematics (see Section 5.17), set process_math to 0. If you don't use RefDB (see Section 5.19.2), set process_RefDB to 0.

See example.start for an example of a .start file.
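
Since the .start file consists of plain shell variable assignments, you can check what it sets before running lyxtox. The following is a minimal sketch, assuming the file can be sourced by a POSIX shell; show_start_settings is a hypothetical helper and says nothing about how lyxtox itself reads the file.

```shell
#!/bin/sh
# Hypothetical helper: source a .start file in a subshell (so that its
# variables do not leak into the caller) and report title and flags.
# Pass a path containing a slash, e.g. ./myLyxfile.start.
show_start_settings() {
    (
        process_math=0
        process_RefDB=0
        . "$1" || exit 1
        echo "title: $TITLE"
        if [ "$process_math" = "1" ]; then
            echo "math: on"
        else
            echo "math: off"
        fi
        if [ "$process_RefDB" = "1" ]; then
            echo "RefDB: on ($RefDB_db)"
        else
            echo "RefDB: off"
        fi
    )
}
```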

The .end file (myLyxfile.end in our example above) is there for you to add whatever additional processing steps you like. Here is a typical .end file: it massages the HTML files a bit (with sedscr_val, in order to make them HTML standards compliant) and then creates all the tar archives found in Section 1.2:

# Do some changes that are necessary for the HTML file to be validated
# as a conforming one, according to the standards
# of the W3C (see http://validator.w3.org).
$RUNSED sedscr_val $1/*.html
rm $1/*.bak
rm $1/*.tar.gz
cp sedscr_top $1/
# Admonitions and Callouts.
cp -v admonitions.tar.gz $1/
cp -v callouts.tar.gz $1/
# TAR, all files.
$TAR --exclude=$1/index.html -czvf $1.tar.gz $1/
cp $1.tar.gz $1/
rm $1.tar.gz
# TAR, one big HTML file with images.
$TAR -czvf $1-onehtml.tar.gz $1/$1.html $1/images/*.png $1/images/*.gif $1/images/*/*.png $1/ck-style.css $tarfilelist
cp $1-onehtml.tar.gz $1/
rm $1-onehtml.tar.gz
# TAR, many HTML files with images.
$TAR --exclude=$1/$1.html --exclude=$1/index.html -czvf $1-html.tar.gz $1/*.html $1/images/*.png $1/images/*.gif $1/images/*/*.png $1/ck-style.css $tarfilelist
cp $1-html.tar.gz $1/
rm $1-html.tar.gz
# TAR, RTF file with images.
$TAR -czvf $1-rtf.tar.gz $1/$1.rtf $1/images/*.bmp $1/images/*/*.bmp $tarfilelist
cp $1-rtf.tar.gz $1/
rm $1-rtf.tar.gz
# TAR, SGML file with images.
$TAR -czvf $1-sgml.tar.gz $1/$1.sgml $1/images/* $1/images/*/* $1/appendix.sgml $tarfilelist
cp $1-sgml.tar.gz $1/
rm $1-sgml.tar.gz
# TAR, only the filelist.
$TAR -czvf $1-scripts.tar.gz $tarfilelist
cp $1-scripts.tar.gz $1/
rm $1-scripts.tar.gz

See example.end for an example of an .end file.
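
Each archive above follows the same pattern: create the tar file, copy it into the document directory, delete the copy left outside. If you write your own .end file, you could fold this pattern into a small function. This is only a sketch: archive_into is a hypothetical name, it moves the archive instead of copying and deleting it, and it falls back to plain tar if the TAR variable is not set.

```shell
#!/bin/sh
# Fall back to "tar" if TAR is not already set by the calling script.
TAR="${TAR:-tar}"

# Hypothetical helper: create <dir><suffix>.tar.gz from the given
# files and move it into <dir>/.
archive_into() {
    dir="$1"
    suffix="$2"
    shift 2
    "$TAR" -czf "$dir$suffix.tar.gz" "$@" &&
        mv "$dir$suffix.tar.gz" "$dir/"
}
```

With it, the RTF archive above could be written as archive_into $1 -rtf $1/$1.rtf $1/images/*.bmp $1/images/*/*.bmp $tarfilelist.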


4.12. Set up custom headers and footers

Use part1, part2 and part3 to customize your headers and footers. You will not need to touch part1, unless you wish to change the DOCTYPE setting. You will, however, certainly want to adapt part2 (for the headers) and part3 (for the footers). These files are full-blown examples that demonstrate what you can do to enhance navigation and searching:

For the header (part2):

  • Add META and LINK tags, for example for the character encoding (Chapter 11), the CSS (Section 4.14) or the favicon:

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <link rel="stylesheet" type="text/css" href="ck-style.css">
    <link rel="icon" href="images/favicon.ico" type="image/x-icon">
    <link rel="shortcut icon" href="images/favicon.ico" type="image/x-icon">
    
  • Add a logo image:

    <a href="http://www.karakas-online.de" target="_top"><img src="images/karakas-online.png" alt="Karakas Online" border="0"></a>
    
  • Add a site search field that lets your visitors do a site-specific Google search. First some JavaScript constants:

    <div align="center">
    <script type="text/javascript"><!--
    google_ad_client = "pub-2303179107222659";
    google_ad_width = 468;
    google_ad_height = 60;
    google_ad_format = "468x60_as";
    google_color_border = "B4D0DC";
    google_color_bg = "ECF8FF";
    google_color_link = "0000CC";
    google_color_url = "008000";
    google_color_text = "6F6F6F";
    //--></script>
    <script type="text/javascript"
      src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
    </script>
    </div>
    

    and then a few lines further down, the actual search field:

    <div class="BREADCRUMBS">
    <table summary="Breadcrumbs" width="100%" border="0" cellpadding="0" cellspacing="0">
    <tr>
    <td width="30%" align="left" valign="bottom">
      <a href="http://_DOMAIN_" accesskey="S">Start</a>
      <img src="images/small.arrow.outline.gif" alt="->" width="18" height="9" hspace="2" />
      <a href="_HOMEFILE_" accesskey="H" target="_top">_TITLE_</a>
      <img src="images/small.arrow.outline.gif" alt="->" width="18" height="9" hspace="2" />
      <a href="_FILENAME_" accesskey="T" target="_top">This page</a>
    </td>
    <td width="40%" align="center" valign="bottom">
      <!-- SiteSearch Google -->
      <FORM method=GET action='http://www.google.com/custom'>
      <input type=hidden name=domains value='karakas-online.de'><INPUT TYPE=text name=q size=31 maxlength=255 value=''>
      <INPUT type=submit name=sa VALUE='Search'>
      <input type=hidden name=sitesearch value='karakas-online.de'>
      <input type=hidden name=client value='pub-2303179107222659'>
      <input type=hidden name=forid value='1'>
      <input type=hidden name=channel value='5156821179'>
      <input type=hidden name=ie value='ISO-8859-1'>
      <input type=hidden name=oe value='ISO-8859-1'>
      <input type=hidden name=safe value='active'>
      <input type=hidden name=cof value='GALT:#008000;GL:1;DIV:#336699;VLC:663399;AH:center;BGC:FFFFFF;LBGC:FFFFFF;ALC:0000FF;LC:0000FF;T:000000;GFNT:0000FF;GIMP:0000FF;LH:44;LW:468;L:http://_DOMAIN_/_DIRNAME_/images/karakas-online.png;S:http://www.karakas-online.de;FORID:1;'>
      <input type=hidden name=hl value='en'>
      </FORM>
      <!-- SiteSearch Google -->
    </td>
    <td width="30%" align="right" valign="bottom">
      <a href="./" accesskey="M" target="_top">Display Sitemenu</a>
    </td>
    </tr>
    </table>
    </div>
    
  • Add translation links for the current page. But how does the header know the link to the current page? Here's where the constants DOMAIN, DIRNAME and FILENAME come into play:

    <div class="translatelink" align="right">
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_zh">chinese</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cfr&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">french</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cde&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">german</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cit&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">italian</a>
     |
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_ja">japanese</a>
     |
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_ko">korean</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cpt&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">portuguese</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Ces&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">spanish</a>
    </div>
    

    DOMAIN is set in the .start file (see Section 4.11), DIRNAME is replaced in lyxtox with the parameter you passed it:

    ${SED} -e "s/_DIRNAME_/$1/g" part2_1.tmp > part2_2.tmp
    

    The parameter you pass to lyxtox is supposed to be the basename of your .lyx file, which also becomes the name of the directory where everything is placed, so DIRNAME is replaced with the right directory name. (Strictly speaking, what is replaced is _DIRNAME_; the underscores only make sure that no other occurrence of the string DIRNAME gets replaced by accident.) Finally, FILENAME is determined in lyxtox as the basename of the HTML file:

    ${SED} -e "s/_FILENAME_/${BASENAME}/g" part2_2.tmp > part2_3.tmp
    
  • Add some “breadcrumbs”, i.e. navigation links separated by a “|”. Here the links point to the various formats of the document, with a nice example of how to mail the current page to a friend using a custom text and the right page link:

    <div class="BREADCRUMBS">
    <table summary="Breadcrumbs" width="100%" border="0" cellpadding="0" cellspacing="0">
    <tr>
    <td width="45%" align="left" valign="bottom">
    </td>
    <td width="10%" align="center" valign="bottom">
    </td>
    <td width="45%" align="right" valign="bottom">
    <a href="mailto:?Subject=Page%20recommendation&amp;Body=I%20thought%20you%20might%20find%20this%20URL%20interesting:%20http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_" onMouseover="window.status='Recommend this page to a friend'; return true;" onMouseout="window.status=''; return true" title="Recommend this page to a friend">Send to a friend&nbsp;</a>&nbsp;|&nbsp;<a href="_DIRNAME_.pdf" title="The *whole* document in PDF format">PDF&nbsp;</a>&nbsp;|&nbsp;<a href="_DIRNAME_.rtf" title="The *whole* document in RTF format">RTF&nbsp;</a>&nbsp;|&nbsp;<a href="_DIRNAME_.ps.gz" title="The *whole* document in gzipped Postscript (PS.GZ) format">PS&nbsp;</a>&nbsp;|&nbsp;<a href="_DIRNAME_.txt" title="The *whole* document in plain text (TXT) format">TXT&nbsp;</a>&nbsp;|&nbsp;<a href="_FORMATSFILE_" title="Other formats of this document">Other formats</a>
    </td>
    </tr>
    </table>
    </div>
    

For the footer:

  • Add a timestamp and a “permalink” (a permanent link, which people can use to link to the page):

    <div class="colophon">
    <table summary="Colophon" width="100%" border="0" cellpadding="0" cellspacing="0">
    <tr>
    <td width="30%" align="left" valign="bottom">
    Last updated _DATE_
    </td>
    <td width="40%" align="center" valign="bottom">
    Permalink: <a href="http://_DOMAIN_/_DIRNAME_/_FILENAME_" title="Permanent link to this page">http://_DOMAIN_/_DIRNAME_/_FILENAME_</a>
    </td>
    <td width="30%" align="right" valign="bottom">
    _COPYRIGHT_
    </td>
    </tr>
    </table>
    </div>
    
  • Add breadcrumb navigation links (“Start”, “Document title”, “This page”) and Google site search:

    <div class="BREADCRUMBS">
    <table summary="Breadcrumbs" width="100%" border="0" cellpadding="0" cellspacing="0">
    <tr>
    <td width="30%" align="left" valign="bottom">
      <a href="http://_DOMAIN_" accesskey="S">Start</a>
      <img src="images/small.arrow.outline.gif" alt="->" width="18" height="9" hspace="2" />
      <a href="_HOMEFILE_" accesskey="H" target="_top">_TITLE_</a>
      <img src="images/small.arrow.outline.gif" alt="->" width="18" height="9" hspace="2" />
      <a href="_FILENAME_" accesskey="T" target="_top">This page</a>
    </td>
    <td width="40%" align="center" valign="bottom">
      <!-- SiteSearch Google -->
      <FORM method=GET action='http://www.google.com/custom'>
      <input type=hidden name=domains value='karakas-online.de'><INPUT TYPE=text name=q size=31 maxlength=255 value=''>
      <INPUT type=submit name=sa VALUE='Search'>
      <input type=hidden name=sitesearch value='karakas-online.de'>
      <input type=hidden name=client value='pub-2303179107222659'>
      <input type=hidden name=forid value='1'>
      <input type=hidden name=channel value='5156821179'>
      <input type=hidden name=ie value='ISO-8859-1'>
      <input type=hidden name=oe value='ISO-8859-1'>
      <input type=hidden name=safe value='active'>
      <input type=hidden name=cof value='GALT:#008000;GL:1;DIV:#336699;VLC:663399;AH:center;BGC:FFFFFF;LBGC:FFFFFF;ALC:0000FF;LC:0000FF;T:000000;GFNT:0000FF;GIMP:0000FF;LH:44;LW:468;L:http://_DOMAIN_/_DIRNAME_/images/karakas-online.png;S:http://_DOMAIN_;FORID:1;'>
      <input type=hidden name=hl value='en'>
      </FORM>
      <!-- SiteSearch Google -->
    </td>
    <td width="30%" align="right" valign="bottom">
      <a href="./" accesskey="M" target="_top">Display Sitemenu</a>
    </td>
    </tr>
    </table>
    </div>
    
  • Add translation links to the current page. These are the same as the ones for the header, only that they are centered instead of being aligned to the right of the page:

    <div class="translatelink" align="center">
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_zh">chinese</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cfr&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">french</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cde&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">german</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cit&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">italian</a>
     |
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_ja">japanese</a>
     |
    <a href="http://babelfish.altavista.com/babelfish/tr?doit=done&amp;url=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;lp=en_ko">korean</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Cpt&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">portuguese</a>
     |
    <a href="http://translate.google.com/translate?u=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;langpair=en%7Ces&amp;hl=en&amp;ie=ISO-8859-1&amp;prev=%2Flanguage_tools">spanish</a>
    </div>

    Once again, the DOMAIN, DIRNAME and FILENAME constants come into play.
    
  • Finally, add some icons. Some of them point to the validator services of the W3C (see Chapter 8). These are also nice examples of how to compute the right link to send to the validator so that it can validate the current page: you have to use “http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_” for the CSS validator, but it is enough to use “http://validator.w3.org/check/referer” for the HTML validator. And don't miss the special link to “validate your browser”! :-)

    <div class="imagelink">
    <table width="100%">
    <tr align="center">
    <td width="30%" align="left" valign="middle">
    <a href="http://validator.w3.org/check/referer" target="_top">
    <img border="0" src="images/valid-html401.png" alt="Valid HTML 4.01! Click here to validate current page." title="Click here to validate the HTML code of this page"></a>
    </td>
    <td width="40%" align="center" valign="middle" colspan="3">
    <a href="http://www.anybrowser.org/campaign/" target="_top">
    <img border="0" src="images/w3c_ab.png" alt="Best viewed with ANY browser!" title="Click here to validate your browser"></a>
    </td>
    <td width="30%" align="right" valign="middle">
    <a href="http://jigsaw.w3.org/css-validator/validator?uri=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_&amp;warning=1&amp;profile=css3&amp;usermedium=all" target="_top">
    <img border="0" src="images/valid-css.png" alt="Valid CSS! Click here to validate current CSS." title="Click here to validate the CSS of this page"></a>
    </td>
    </tr>
    <tr align="center">
    <td align="left" valign="middle">
    </td>
    <td align="center" valign="middle">
    <a href="http://www.gnu.org/copyleft/fdl.html" target="_top">
    <img border="0" src="images/gnu-fdl.png" alt="This is a free document, published under the <acronym>GNU</acronym> Free Documentation Licence" title="GNU Free Documentation Licence"></a>
    </td>
    <td align="center" valign="middle">
    <a href="http://counter.li.org" target="_top">
    <img border="0" src="images/linux_user_314103.png" alt="Linux Counter icon. Chris Karakas is  <productname>Linux</productname> Counter registered user #314103" title="Linux Counter"></a>
    </td>
    <td align="center" valign="middle">
    <a href="http://www.fsf.org" target="_top">
    <img border="0" src="images/powered-by-free-software.png" alt="This is a free document, made with free software" title="Free Software"></a>
    </td>
    <td align="right" valign="middle">
    </td>
    </tr>
    </table>
    </div>
    
Tip How to set the constants in the part* files
 

Do not set the DOMAIN, DIRNAME and FILENAME constants in the part* files. Set only TITLE, DOMAIN, FORMATSFILE, COPYRIGHT and HOMEFILE in the .start file. DIRNAME and FILENAME are computed automatically in lyxtox.
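
The placeholder replacement described in this section can be tried out in isolation. The following is a minimal sketch and not the code from lyxtox: fill_placeholders is a hypothetical function, and www.example.org, mydoc and x42.html are made-up values.

```shell
#!/bin/sh
# Hypothetical helper: replace the three placeholders in the text read
# from standard input.
#   $1 = domain, $2 = directory name, $3 = HTML file name
# The values are pasted into sed's s/// command, so they must not
# contain an unescaped "/" or "&".
fill_placeholders() {
    sed -e "s/_DOMAIN_/$1/g" \
        -e "s/_DIRNAME_/$2/g" \
        -e "s/_FILENAME_/$3/g"
}
```

For example, piping the line http://_DOMAIN_/_DIRNAME_/_FILENAME_ through fill_placeholders www.example.org mydoc x42.html yields http://www.example.org/mydoc/x42.html. The restriction on “/” and “&” is presumably also why the COPYRIGHT value in Section 4.11 escapes these characters with backslashes.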


4.13. Set up your bibliographic database

If you don't want to use RefDB, there is nothing you have to do at this stage, so you can skip this section. However, you will have more work in the long run, as you will need to enter the bibliographic references each time by hand in pure SGML in a separate file, see Section 5.19.

If you want to use RefDB and take advantage of all the automated processing offered by a bibliographic database, RefDB and the lyxtox script, there is some preliminary work to do: before you can use your bibliographic treasures in citations and the reference list, you have to import them into the RefDB database you created during installation of the RefDB package (see Section 3.11). Further, you must give the name of your database as the value of the RefDB_db variable in lyxtox and also set process_RefDB to “1”, indicating that you want bibliographic processing through RefDB.

Adding references boils down to running the addref command with proper input files. The input files have to be valid RIS files. They may contain one or more RIS datasets.

An example of a RIS file is refdb.ris. A typical bibliographic entry in the RIS format looks like:

TY  - ELEC
ID  - Walsh2002
AU  - Walsh,Norman
AU  - Muellner,Leonard
TI  - DocBook: The definitive Guide - Apendix
KW  - guide
KW  - docbook
RP  - NOT IN FILE
PB  - O'Reilly & Associates, Inc.
UR  - http://docbook.org/tdg/en/html/appa.html
N1  - Accessed 29.06.2003
PY  - 2002/06/17/Version 2.0.8
SN  - 156592-580-7
ER  -

Each line starts with a two-letter tag followed by the string “  - ” (two spaces, a dash, and another space). Each RIS dataset starts with the TY (type) tag and ends with the ER (End of Reference) tag. In between, the tag sequence is arbitrary. The meaning of the tags is:

TY:

Citation type. Can take many values, some of the most common being:

  • ABST (abstract reference)

  • BOOK (whole book reference)

  • COMP (computer program)

  • DATA (data file)

  • ELEC (electronic citation)

  • GEN (generic)

  • JOUR (journal/periodical reference)

ID:

Unique citation ID string. This can either be explicitly set by you, or automatically set by the RefDB system. It plays the same role as the citation key in the standard methods provided by LyX for citation purposes (see Section 7.1.10).

AU:

Author. Synonym: A1.

TI:

Title

KW:

Keyword

RP:

Reprint status. Can be one of

  • IN FILE

  • NOT IN FILE

  • ON REQUEST MM/DD/YY

PB:

Publisher

UR:

URL (for electronic citations)

Multiple AU and KW tags are possible. For a complete list of the RIS tags and their possible values, see Writing RefDB data input. Here is a more elaborate example of a bibliographic entry in RIS format, taken from this document:

TY  - CHAP
T1  - Physiological studies of the natriuretic peptide family
A1  - Lewicki,J.A.
A1  - Protter,A.A.
Y1  - 1995///
N1  - Atrial Natriuretic Peptide   Cardiac synthesis and secretion of /
ANP   Regulation of ANP Gene Expression   Regulation of ANP Release /
  ANP Receptors   Biologic Actions of ANP Brain Natriuretic Peptide (BNP) /
  BNP Structure   Biosynthesis of BNP   Biological Actions of BNP C-Type /
 Natriuretic Peptide (CNP)   Biologic Actions of CNP Modulators of /
 Natriuretic Peptide Clearance   Effects of Clearance Receptor Blockers /
  Effects of Neutral Endopeptidase Inhibitors Role of the Natriuretic / 
 Peitedes in Physiology and Disease   Hypertension   Congestive Heart  /
Failure   Supraventricular Tachyarrhythmias   Acute Renal Dysfunction
KW  - natriuretic
KW  - ANF
KW  - ANP
KW  - receptors
KW  - BNP
KW  - CNP
KW  - hypertension
KW  - congestive heart failure
KW  - review
KW  - cardiac
KW  - regulation
KW  - gene expression
KW  - expression
KW  - brain
KW  - structure
KW  - biosynthesis
KW  - receptor
KW  - inhibitor
KW  - physiology
KW  - renal
KW  - study
KW  - Peptides
KW  - atrial natriuretic peptide
KW  - MODULATOR
KW  - secretion
KW  - Gene Expression Regulation
RP  - IN FILE
SP  - 1029
EP  - 1053
VL  - 2
T2  - Hypertension: Pathophysiology, Diagnosis, and Management
A2  - Laragh,J.H.
A2  - Brenner,B.M.
IS  - 61
CY  - New York
PB  - Raven Press, Ltd.
ER  -

To build your own bibliographic database, you thus need all your references in the RIS format. If you found your reference in the web edition of some scientific journal, or one of the specialized bibliographic databases on the Internet, like PubMed, chances are that you will be able to copy the RIS version of the bibliographic entry with a mouse click on some link. Otherwise, you will either have to import it with the use of one of the input filters shipped with RefDB, use the Web interface (which you installed in Section 3.11), or write all those tags and their values by hand.

For automatic import, RefDB offers the following input filters:

  • dos2unix: A simple shell script to convert text files like RIS documents from DOS-style line endings to Unix-style line endings. Most RefDB tools need their input files with Unix-style line endings. This is a valuable tool for importing reference databases from Windows reference managers.

  • med2ris.pl: A tool to convert Pubmed data in both the tagged and the XML format to RIS.

  • bib2ris: A tool to convert BibTeX data to RIS.

  • db2ris: A tool to convert reference data in DocBook SGML/XML documents to RIS.

  • marc2ris.pl: A tool to convert references in MARC format to RIS.

To import all references from a RIS file called, say, refdb.ris, start refdbc and then type:

addref refdb.ris

Once you have populated your own bibliographic database with entries, you should export it to a file - it will serve you as a backup and reference. To export all entries to a file called refdb.ris (this will overwrite any existing refdb.ris file, so take care):

getref -t ris -o refdb.ris -s "ALL" :ID:>0

Open the refdb.ris file with a text editor and examine it. Pay special attention to the values of the ID field - we will use this field in a somewhat tricky way to refer to the bibliographic entries from LyX. How this is done is explained in Section 5.19.
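
For a large export, a quick overview of the ID values can be more convenient than scrolling through the file. The following is a minimal sketch that relies only on the RIS layout described above (two-letter tag, two spaces, a dash, a space); ris_summary is a hypothetical helper and not part of RefDB.

```shell
#!/bin/sh
# Hypothetical helper: print the ID and type of every dataset in a RIS
# file, followed by the number of datasets found.
ris_summary() {
    awk '
        /^TY  - / { type = substr($0, 7); id = "(no ID)" }   # dataset starts
        /^ID  - / { id = substr($0, 7) }
        /^ER  -/  { print id, type; n++ }                    # dataset ends
        END       { print n + 0, "dataset(s)" }
    ' "$1"
}
```

Run as ris_summary refdb.ris. A dataset without an ID line, like the second example above, is reported as “(no ID)”.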


4.14. Use a CSS for DocBook

The following two definitions in the HTML stylesheets (see Section 4.2) specify the CSS that is to be used for HTML output:

(define %stylesheet%
  ;; Name of the stylesheet to use
  ;;#f)
  "ck-style.css")
  
(define %stylesheet-type%
  ;; The type of the stylesheet to use
  "text/css")

Of course, you may decide that you don't need a CSS, in which case you should define %stylesheet% to “#f”. But this is only part of the story! You may discover that your HTML documents still need the ck-style.css, even if you set %stylesheet% to “#f”! This is because of the extra processing that the header, body and footer are subject to, as described in Chapter 8. You will have to change part2 too to reflect the right CSS.

Talking about a “right” CSS for DocBook, you may find out that there are not many of them available - see CSS for DocBook for a rare example. The problem is that the HTML produced by the tools presented here uses its own classes, which don't seem to be widely used outside DocBook. Instead of inventing the wheel for the third time, just grab my ck-style.css and use it as is, or adapt it to your purposes. As the scripts currently use it, it has to be installed in the working directory, but you can certainly change that easily.


4.15. Use coolthumbs

coolthumbs is a small, fine script that will create antialiased thumbnails for your PDF document (those icons that look like miniature copies of your pages, in the left column of Acrobat Reader, beside the bookmarks tab). Without it, the thumbnails will look “high contrast” or “edgy”. coolthumbs produces high-quality thumbnails with the help of Ghostscript and The GIMP (which, of course, you must also have installed if you decide to use it). You can get coolthumbs from the Linux LaTeX-PDF HOW-TO (I would love to include it here, but unfortunately its copyright notice does not explicitly allow it).

Install coolthumbs in, say, /usr/local/bin. Then enter its location in lyxtox and set the use_coolthumbs parameter to 1:

# Shall we use the coolthumbs script to create the PDF thumbnails?
# You can get coolthumbs from
# http://www.ringlord.com/publications/latex-pdf-howto/
# Note that you will also need to have GIMP installed
# and that you will have to edit some lines in coolthumbs too.
use_coolthumbs="1"

These are the values I had to change in my copy of coolthumbs:

  • The location of The GIMP:

    # Program locations
    GIMP="/usr/bin/gimp"
    
  • My GIMP scripts directory:

    # GIMPSCRIPTS: Your GIMP scripts directory. the file named by
    # SCALEALL will be CREATED there and then DELETED again:
    GIMPSCRIPTS=${HOME}/.gimp-1.2/scripts
    
  • Width and height of the thumbnails:

    THUMBNAIL_W=74
    THUMBNAIL_H=105
    
  • Some explanation of the choice of those values is due here: the ISO/DIN paper sizes (in mm) are shown in Table 4-1 (taken from Paper size).

    Table 4-1. ISO/DIN paper sizes (in mm)

    Size  A         B          C
    0     841x1189  1000x1414  917x1297
    1     594x841   707x1000   648x917
    2     420x594   500x707    458x648
    3     297x420   353x500    324x458
    4     210x297   250x353    229x324
    5     148x210   176x250    162x229
    6     105x148   125x176    114x162
    7     74x105    88x125     81x114
    8     52x74     62x88      57x81
    9     37x52     44x62      40x57
    10    26x37     31x44      28x40

    From Table 4-1 we see that if, as the author of coolthumbs says, the values of 82/106 are "pretty much dead-on" for US Letter paper, then the values 74/105 must be just as "dead-on" for DIN paper (we try to match one paper side as well as possible, and here 105 is about as close to 106 as we can get). More precisely, 74/105 is dead-on for DIN A7 paper, but DIN papers have the same aspect ratio throughout the whole range (see Figure 4-2), so the ratio 74/105 for DIN A7 is the same as the ratio 210/297 for DIN A4 (the size mostly used outside the USA) and the same as that of every other DIN paper.

    Figure 4-2. ISO-DIN paper sizes.


  • DPI (dots-per-inch) value for the thumbnails:

    SNAPSHOT_DPI=133
    
    This is the DPI value of my monitor, as taken from the output of the graphic card driver in /var/log/XFree86.0.log, where it says:

    DPI set to (133, 133)

    Of course, YMMV (=Your Monitor May Vary ;-)).

  • I use thumbpdf version 3.2, so I set

    THUMBPDF_V2=0
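
The arithmetic behind THUMBNAIL_W, THUMBNAIL_H and SNAPSHOT_DPI can be checked with a few lines of shell. What follows is only a sketch and not part of coolthumbs or lyxtox: the function names iso_thumb_height and dpi_from_xlog are made up for this illustration, and the log line is assumed to have the form quoted above.

```shell
# Hypothetical helpers for illustration; neither coolthumbs nor lyxtox uses them.

# iso_thumb_height WIDTH
# Print the height matching WIDTH for an ISO/DIN A-series page, whose
# sides have the ratio 1 : sqrt(2), rounded to the nearest integer.
iso_thumb_height() {
    awk -v w="$1" 'BEGIN { printf "%d\n", w * sqrt(2) + 0.5 }'
}

# dpi_from_xlog
# Read an X server log on standard input and print the horizontal DPI
# from lines of the form "DPI set to (133, 133)".
dpi_from_xlog() {
    sed -n 's/.*DPI set to (\([0-9][0-9]*\), *[0-9][0-9]*).*/\1/p'
}
```

For example, "iso_thumb_height 74" prints 105, the value used for THUMBNAIL_H above, and "dpi_from_xlog < /var/log/XFree86.0.log" prints a candidate value for SNAPSHOT_DPI.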
    

Chapter 5. Writing in LyX, thinking in SGML

You have now installed the required software (see Chapter 3) and taken the required preliminary steps (see Chapter 4) to ensure that everything is in place and configured correctly. In this chapter I will describe how to write in LyX in order to achieve the desired results. This may look trivial at first, but it is not:

LyX is a frontend for LaTeX. It was designed with TeX/LaTeX in mind, not SGML. The TeX language and the LaTeX macros describe a document not only from the structural point of view (using markup that expresses facts like “this is a paragraph”, “this is an itemized list”), but also from the presentational one (“use 12pt here”, “indent 5cm there”). SGML, on the other hand, separates structure from style. It is clear that when you export a TeX/LaTeX document, not each and every TeX/LaTeX construct that is possible in LyX will find its equivalent in SGML. You will therefore have to use only those constructs that are common to both, or that can at least be mapped to each other with some reasonable processing of LyX's SGML (done by lyxtox using runsed and sedscr). Remember this each time you try something in LyX, only to find out that it does not work in SGML.

Tip Use the .lyx version of this document as a template
 

Even if this chapter goes into a lot of detail about writing in LyX in a manner that is compatible with SGML processing, a real example is still worth a thousand words. You should thus study real LyX documents that use the constructs discussed here and compare them with the results in the other versions (HTML, PDF, PS, RTF, TXT and of course SGML). The best starting point is the .lyx version of this document (to be found in the links of Section 1.2).

Load the LyX version of this document in LyX and study the way various elements are used. Use it as your template, your starting point for your own document! If you look attentively, you will see that I am trying out quite a few non-trivial tricks here, which you can copy for your own use. ;-)


5.1. LyX environments

Different parts of a document have different purposes; we call these parts environments. Most of a document is made up of regular text. Section (chapter, subsection, etc.) titles let the reader know that a new topic or subtopic will be discussed. Certain types of documents have special environments. A journal article will have an abstract, and a title. A letter will have neither of these, but will probably have an environment that gives the writer's address.

Environments are a major part of the “What You See Is What You Mean” philosophy of LyX. A given environment may require a certain font style, font size, indenting, line spacing, and more. This problem is aggravated, because the exact formatting for a given environment may change: one journal may use boldface, 18 point, centered type for section titles while another uses italicized, 15 point, left justified type; different languages may have different standards for indenting; and bibliography formats can vary widely. LyX lets you avoid learning all the different formatting styles.

The Environment box is located on the left end of the toolbar (just under the File menu). It indicates which environment you're currently writing in. While you were writing your first document, it said “Standard,” which is the default environment for text. Now you will put a number of environments in your new document so that you can see how they work. You'll do so with the Environment menu, which you open by clicking on the “down arrow” icon just to the right of the Environment box.

Important Don't use the "Paragraph" environment!
 

Use "Standard" instead! Using "Paragraph" interferes with the changes that runsed and sedscr try to effect in the SGML code as exported by LyX. Writing paragraphs in LyX is treated in Section 5.5, although there is actually nothing more to say on this subject for the moment. Just choose "Standard" and write. :-)

An important thing to keep in mind is that whatever environment you set in LyX, it will NOT, per se, affect the formatting of your document! LyX environments tell something about the structure of the document, never about its formatting (at least not in the context we will be using them here, i.e. as equivalent to SGML tags). Thus, an environment of “Standard” will induce the <para> tag when exported to SGML from LyX. The following quote from the sgml-tools mailing list deals with a common misconception of <para> (see Use of <Para> within <ListItem> mangles list items):

> I think the definition of <Para> means to start on a new line and break on a new line.

At the risk of being pedantic, I think you're making a mistake of interpreting DocBook markup tags as having any bearing on format, which they do not. This is unlike many HTML tags, so if that's the particular SGML DTD that you have more experience with, it may be coloring your interpretation.

DocBook tags (and to be honest, most SGML DTDs) are used to identify the type and purpose of information, but not how that information might be portrayed in a formatted fashion. That's part of the whole power of it - by separating formatting/display from content, you are free to both ignore such issues when documenting, and yet be totally flexible during formatting on how you want to present marked up text.

In DocBook, the <Para> element is just defined as a "paragraph", a container sort of element for other inline, and some block, elements. It says nothing about starting nor breaking on a newline, although of course such could be selected by a style sheet as an implementation. In that respect the more tags you have in a document (and the more granularity of the information so tagged) the better - it gives the formatter and style sheet the most flexibility in handling how the formatted output should appear.

We talk about stylesheets in Section 4.2, Section 4.14, Section 7.1.5, Section 7.1.8 and Section 10.3.2.1.


5.2. Authors, Credits, Roles

If you have a more complicated situation than just an author for your document, like affiliations, translators, contributors etc., here's the right way to enter such information in LyX, so that it can be exported to SGML:

Create an environment (Section 5.1) of type “SGML” just after the title and before the abstract. There, enter the information as in the following example:

<authorgroup>
<author>
<firstname>Chris</firstname> <surname>Karakas</surname>
<affiliation>
<jobtitle>Webmaster</jobtitle>
<orgname>www.karakas-online.de</orgname>
</affiliation>
</author>
<othercredit role="converter">
<contrib>Conversion from LyX to DocBook SGML, Index generation</contrib>
<firstname>Chris</firstname> <surname>Karakas</surname>
<affiliation><orgname>www.karakas-online.de</orgname></affiliation>
</othercredit>
<othercredit role="translator">
<contrib>Translation from Italian</contrib>
<firstname>Chris</firstname> <surname>Karakas</surname>
<affiliation><orgname>www.karakas-online.de</orgname></affiliation>
</othercredit>
</authorgroup>

5.3. Keywords

The metainformation on keywords is not as relevant today (in the context of search engine optimization) as it was a few years ago. Nevertheless it may be a good practice to incorporate some keywords in the header as there are still some search engines around that use them (Google does not). Here's how you enter keywords in LyX:

After the title and before the abstract, create an environment (Section 5.1) of type “SGML”. There, enter the keywords as in the following example:

<keywordset>
<keyword>LyX</keyword>
<keyword>SGML</keyword>
<keyword>DocBook</keyword>
</keywordset>

5.4. Revision history

You enter a revision history as follows:

Create an environment (Section 5.1) of type “SGML” that is located after the abstract and before the first chapter/section. In this environment, enter the revision history as in the following example:

<REVHISTORY>
<REVISION>
<REVNUMBER>1.2
</REVNUMBER>
<DATE>29.05.2003
</DATE>
<AUTHORINITIALS>CK
</AUTHORINITIALS>
<REVREMARK>Some remarks about this revision.
</REVREMARK>
</REVISION>
<REVISION>
<REVNUMBER>1.1
</REVNUMBER>
<DATE>13.02.2003
</DATE>
<AUTHORINITIALS>CK
</AUTHORINITIALS>
<REVREMARK>Some remarks about this revision.
</REVREMARK>
</REVISION>
</REVHISTORY>

Revision numbers should be entered in reverse order (i.e., the latest revision should appear first on the list).


5.5. Paragraphs

Do NOT use the “Paragraph” environment (Section 5.1) when writing a paragraph. Use “Standard” instead. Using “Paragraph” interferes with the changes that runsed and sedscr try to effect in the SGML code as exported by LyX.


5.6. Cross references

Cross-references work exactly as they usually do in LyX: you first insert a label (choosing Insert-->Label from the menu), then insert a cross-reference at a point of your choice (choosing Insert-->Cross-reference from the menu). BUT: you can't cross-reference just anything! Although it is certainly possible in LyX, it will not work when exported to SGML: you will get the error

xref to ANCHOR unsupported

See Chapter 6 for a discussion of this error. For the moment, just remember that the only cross-references that work for our purposes are cross-references to

  • chapters, sections, subsections, subsubsections

  • figures

  • tables

but that's quite enough for cross-referencing.

Important Always set labels!
 

Labels of chapters and sections will become the HTML filenames of the files that contain them. Choose them wisely and make it a habit to always create a label for your new chapter and section as soon as you write down their titles! See Section 5.15 for more tips on this important issue.


5.6.1. Mass insertion of cross-references in LyX

If you have to add hundreds of cross-references in just one section (e.g. more than 500, as in Credits for version 2.0 of the PHP-Nuke HOWTO), you will soon notice that, although a single cross-reference is inserted very easily in LyX (just choose Insert-->Cross-reference from the menu, then choose the label you want to refer to), entering hundreds of them this way becomes a real pain.

My solution was to write a script that reads a LyX file and outputs another LyX file containing references to all labels of the first one. It was then easier to copy the references from the file thus created, paste them into Credits for version 2.0 of the PHP-Nuke HOWTO and delete the unneeded ones than to insert all cross-references by hand through the LyX menu.

The following script, call it lyxrefs, prints a LyX file to standard output that contains cross-references to each and every label of the LyX file whose name was passed as argument on the command line:

#!/bin/bash
#
AWK="/usr/bin/awk"
function preample() {
cat <<-EOF
#LyX 1.2 created this file. For more info see http://www.lyx.org/
\lyxformat 220
\textclass article
\language english
\inputencoding auto
\fontscheme default
\graphics default
\paperfontsize default
\papersize Default
\paperpackage a4
\use_geometry 0
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
EOF
}
function label() {
n=$1
echo ""
echo "\layout Standard"
echo ""
echo ""
echo "\begin_inset LatexCommand \label{cit:$n}"
echo ""
echo "\end_inset"
}
preample
# Output LyX commands for the section "All references"
cat <<-EOF
\layout Section
All references
EOF
# Output all references.
# sub() rewrites only the first "label" on the line (the \label command
# itself), so label names that contain the word "label" are left intact.
$AWK 'BEGIN {FS=" "} /\\begin_inset LatexCommand \\label{/ {sub("label","ref");
printf("\n%s\n\n%s %s\n\n%s\n","\\layout Standard",
"\\begin_inset LatexCommand",$3,"\\end_inset")}' "$1" > all-references.tmp
cat all-references.tmp
rm all-references.tmp
# Output LyX commands for the section "All figure references"
cat <<-EOF
\layout Section
All figure references
EOF
# Output only the figures (labels starting with "fig-").
$AWK 'BEGIN {FS=" "} /\\begin_inset LatexCommand \\label{fig-/ {sub("label","ref");
printf("\n%s\n\n%s %s\n\n%s\n","\\layout Standard",
"\\begin_inset LatexCommand",$3,"\\end_inset")}' "$1" > fig-references.tmp
cat fig-references.tmp
rm fig-references.tmp
# Output LyX commands for the section "All table references"
cat <<-EOF
\layout Section
All table references
EOF
# Output only the tables (labels starting with "tab-").
$AWK 'BEGIN {FS=" "} /\\begin_inset LatexCommand \\label{tab-/ {sub("label","ref");
printf("\n%s\n\n%s %s\n\n%s\n","\\layout Standard",
"\\begin_inset LatexCommand",$3,"\\end_inset")}' "$1" > tab-references.tmp
cat tab-references.tmp
rm tab-references.tmp
echo "\the_end"

The script will even create three sections, with cross-references to all labels, all figures and all tables respectively. If you named it lyxrefs, you would call it as follows:

lyxrefs some-LyX-file.lyx > refs.lyx

Then refs.lyx will contain cross-references to all labels of some-LyX-file.lyx. You can open refs.lyx with LyX, copy all or part of the cross-references there and paste them in some-LyX-file.lyx. The cross-references in Credits for version 2.0 of the PHP-Nuke HOWTO were entered this way.


5.7. Images

You insert an image quite simply: from the menu, choose Insert-->Graphics. Enter the basename of your image (this is the name without the ending), followed by the ending “.eps”. Example: If your image is called myimage.png, enter “myimage.eps” in the file field. Don't worry that LyX cannot find your image. Don't worry that all your images are located in the images directory beneath your working directory, but you didn't enter any paths. The scripts will take care of this (see Section 7.1 for the gory details). You just enter “myimage.eps”, that's all.
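
The naming rule can be written down as a tiny shell function. This is only an illustration of the rule, not something lyxtox calls; the name eps_name is made up here.

```shell
# eps_name FILE  (hypothetical helper, for illustration only)
# Print the name to type into the LyX file field for image FILE:
# the basename without directory and without ending, followed by ".eps".
eps_name() {
    base=${1##*/}                    # drop any leading directory
    printf '%s.eps\n' "${base%.*}"   # drop the ending, append .eps
}
```

For example, "eps_name images/myimage.png" prints myimage.eps.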

If you want to see figure captions and titles (and the “alternative text” in HTML), or if you want to be able to reference a figure, or see your figure in the “List of Figures” that is created automatically, then you have to use floats. Follow the instructions exactly as given:

  • Be sure that your cursor is in the “Standard” environment.

  • From the menu, choose Insert-->Floats-->Figure.

  • In the float, insert an image as explained previously.

  • While the cursor is still beside the (empty) figure box and inside the float, change the environment to “Caption” (Section 5.1).

  • Now you see the text “Figure#:” to the left of the image box, followed by the image box. Insert a label (Insert-->Label) directly after (to the right of) the image box. Through this label you will be able to cross-reference this figure in your document.

  • Type the figure caption text directly after the label. It will become the figure title and caption text (there is currently no way to get different texts for figure title and caption text from LyX, although SGML would support this. If you find an alternative, please send it to me).


5.7.1. Inline graphics

I will disappoint you: there is no way to include inline graphics with the method described in this document - see Section 7.1.6 for the reason.

Well, almost... ;-)

You always have the option of writing SGML in LyX: just create an environment (Section 5.1) of type SGML and enter your text (in <para> elements), followed by

<inlinemediaobject>
   <![ %output.print.png; [
   <imageobject>
      <imagedata fileref="./images/1.png" format="PNG">
   </imageobject>
   ]]>
   <![ %output.print.pdf; [
   <imageobject>
      <imagedata fileref="1.pdf" format="PDF" scale="65">
   </imageobject>
   ]]>
   <![ %output.print.eps; [
   <imageobject>
      <imagedata fileref="1.eps" format="EPS">
   </imageobject>
    ]]>
   <![ %output.print.bmp; [
   <imageobject>
      <imagedata fileref="1.bmp" format="BMP">
   </imageobject>
    ]]>
   <textobject>
      <phrase>Inline graphic</phrase>
   </textobject>
</inlinemediaobject>

Substitute “1” with the basename of the inline graphic (i.e. the name without path information and without ending). Of course all necessary formats (PNG, EPS, PDF and BMP) with the right density settings must be available under the ./images directory, see Section 4.9.

Caution Caution:
 

You must insert the above SGML code without introducing newline characters, which in LyX will produce <para> elements when exported to SGML. Thus, for the above sentence and inline icon, you would have to enter:

<para>Well, almost...<inlinemediaobject> <![ %output.print.png; [ <imageobject> <imagedata fileref="./images/icon_wink.png" format="PNG"> </imageobject> ]]> <![ %output.print.pdf; [ <imageobject> <imagedata fileref="icon_wink.pdf" format="PDF" scale="65"> </imageobject> ]]> <![ %output.print.eps; [ <imageobject> <imagedata fileref="icon_wink.eps" format="EPS"> </imageobject> ]]> <![ %output.print.bmp; [ <imageobject> <imagedata fileref="icon_wink.bmp" format="BMP"> </imageobject> ]]> <textobject> <phrase>Inline graphic</phrase> </textobject> </inlinemediaobject> </para>
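
Typing such a long line by hand is error-prone, so it may be easier to generate it. The following sketch prints the whole <inlinemediaobject> on a single line for a given basename, ready to be pasted into the SGML environment. The function name inline_graphic and its optional second argument (the alternative text) are assumptions made for this illustration; the markup itself is copied from the example above.

```shell
# inline_graphic BASENAME [ALT-TEXT]  (hypothetical helper)
# Print a one-line <inlinemediaobject> for BASENAME, with no newline
# inside it, so that LyX does not wrap parts of it in <para> elements.
inline_graphic() {
    name=$1
    alt=${2:-Inline graphic}
    printf '<inlinemediaobject>'
    printf ' <![ %%output.print.png; [ <imageobject> <imagedata fileref="./images/%s.png" format="PNG"> </imageobject> ]]>' "$name"
    printf ' <![ %%output.print.pdf; [ <imageobject> <imagedata fileref="%s.pdf" format="PDF" scale="65"> </imageobject> ]]>' "$name"
    printf ' <![ %%output.print.eps; [ <imageobject> <imagedata fileref="%s.eps" format="EPS"> </imageobject> ]]>' "$name"
    printf ' <![ %%output.print.bmp; [ <imageobject> <imagedata fileref="%s.bmp" format="BMP"> </imageobject> ]]>' "$name"
    printf ' <textobject> <phrase>%s</phrase> </textobject> </inlinemediaobject>\n' "$alt"
}
```

Calling "inline_graphic icon_wink" prints the markup for the icon used above; you still have to put the surrounding <para> and your own text around it.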

5.8. Admonitions

There is no “Admonition” environment in LyX, just as there are no “Admonitions” in TeX/LaTeX. Admonitions are a DocBook SGML element. They are those text passages which alert the reader to some important fact (see Section 1.7 for examples). They carry titles like “Caution!”, “Important!”, “Note”, “Tip”, “Warning!”. There is no other way to introduce admonitions in a LyX DocBook document than by creating an SGML environment and inserting the necessary SGML commands there. Here is an example for “Caution”:

<caution>
<title>Caution</title> 
<para> 
You will need to repeat the above steps for each and every image you produce! 
If you omit it, or use your own .pdf and .eps versions, most probably they 
will FAIL to be embedded in your PDF, resp. PS document! 
</para> 
</caution>

which looks like this:

Caution Caution
 

You will need to repeat the above steps for each and every image you produce! If you omit it, or use your own .pdf and .eps versions, most probably they will FAIL to be embedded in your PDF, resp. PS document!

Since the environment is SGML, you can put any legal (from the DTD point of view) SGML tag inside <caution>/</caution>, not only <para> or <title>. Here is a more complicated example that uses an itemized list (through the <itemizedlist> tag), this time for the “Tip” admonition:

<tip>
<title>Tip</title> 
<para> 
FYI, all changes presented here refer to variables that were 
originally defined in one of the following files:
<itemizedlist>
<listitem>
<para>
/usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/dbparam.dsl
</para>
</listitem>
<listitem>
<para>
/usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/dbparam.dsl
</para>
</listitem>
<listitem>
<para>
/usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/db31.dsl
</para>
</listitem>
</itemizedlist>
As said above, you should not change these files directly, because 
you will run into a lot of work when you upgrade them.
</para>
</tip>

It looks like this:

Tip Tip
 

FYI, all changes presented here refer to variables that were originally defined in one of the following files:

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/dbparam.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/dbparam.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/db31.dsl

As said above, you should not change these files directly, because you will run into a lot of work when you upgrade them.

Another example (of a "Note" admonition) that makes use of a code example (through the <screen> tag):

<note>
<title>Please note:</title> 
<para> 
A file containing graphical callouts (see Section 4.8 and Section 5.9) 
will NOT be validated! You will get an error saying
<screen>
document type does not allow element "IMG" here
</screen>
</para>
</note>

It will look like this:

Note Please note:
 

A file containing graphical callouts (see Section 4.8 and Section 5.9) will NOT be validated! You will get an error saying

document type does not allow element "IMG" here

5.9. Callouts

Callouts are those reverse-video circled numbers or "(1)", "(2)", etc. that you see appended to selected lines in some code examples, like the following:

baseurl={http://www.karakas-online.de/mySGML/}, (1)
pdftitle={Document processing with LyX and SGML}, (2)
pdfsubject={Linux,document formatting}, (3)
pdfauthor={Copyright \textcopyright 2004, Chris Karakas}, (4)
pdfkeywords={Linux SGML LyX DSSSL DocBook} (5)
(1)
The baseurl will be added in front of any relative WWW link that you have in your PDF document.
(2)
The pdftitle will appear as the title of your PDF document in Acrobat® Reader under File-->Document Info-->General.
(3)
The pdfsubject will appear as the subject of your PDF document in Acrobat® Reader under File-->Document Info-->General.
(4)
The pdfauthor will appear as the author of your PDF document in Acrobat® Reader under File-->Document Info-->General. You may add copyright information as shown here.
(5)
The pdfkeywords will appear as a list of keywords for your PDF document in Acrobat® Reader under File-->Document Info-->General. Keywords are separated by blanks.

Just as is the case with admonitions (see Section 5.8), there is no LyX environment (Section 5.1) for callouts (there is no such environment for TeX/LaTeX either, which is the reason why it is absent from LyX too). The only way to get callouts to work with LyX is to create an SGML environment and put the SGML code there. For the example above, here is what you would have to write in the SGML environment:

<screen>
baseurl={http://www.karakas-online.de/mySGML/}, <co id="baseurl">
pdftitle={Document processing with LyX and SGML}, <co id="pdftitle">
pdfsubject={Linux,document formatting}, <co id="pdfsubject">
pdfauthor={Copyright \textcopyright 2004, Chris Karakas}, <co id="pdfauthor">
pdfkeywords={Linux SGML LyX DSSSL DocBook} <co id="pdfkeywords">
</screen>
<calloutlist>
    <callout arearefs="baseurl">
        <para>
        The baseurl will be added in front of any relative WWW link
        that you have in your PDF document.
        </para>
    </callout>
    <callout arearefs="pdftitle">
        <para>
        The pdftitle will appear as the title of your PDF document
        in Acrobat® <application>Reader</application> under File-->Document Info-->General.
        </para>
    </callout>
    <callout arearefs="pdfsubject">
        <para>
        The pdfsubject will appear as the subject of your PDF document
        in Acrobat® <application>Reader</application> under File-->Document Info-->General.
        </para>
    </callout>
    <callout arearefs="pdfauthor">
        <para>
        The pdfauthor will appear as the author of your PDF document
        in Acrobat® <application>Reader</application> under File-->Document Info-->General. You may add copyright information as shown here.
        </para>
    </callout>
    <callout arearefs="pdfkeywords">
        <para>
        The pdfkeywords will appear as a list of keywords for your PDF document
        in Acrobat® <application>Reader</application> under File-->Document Info-->General. Keywords are separated by blanks.
        </para>
    </callout>
</calloutlist>
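
Since every <co id="..."> must be matched by a <callout arearefs="...">, a mistyped id is easy to overlook. The following sketch compares the two sets of names in a file. It is not part of lyxtox; the function name check_callouts is made up, and it assumes at most one <co> or <callout> tag per line, written exactly as in the example above.

```shell
# check_callouts FILE  (hypothetical helper)
# Succeed if the ids of all <co> tags in FILE are exactly the ids
# referenced by the <callout arearefs="..."> tags, fail otherwise.
check_callouts() {
    ids=$(sed -n 's/.*<co id="\([^"]*\)">.*/\1/p' "$1" | sort)
    refs=$(sed -n 's/.*<callout arearefs="\([^"]*\)">.*/\1/p' "$1" | sort)
    if [ "$ids" = "$refs" ]; then
        echo "callouts OK"
    else
        echo "callout mismatch" >&2
        return 1
    fi
}
```

Run it on a file holding the contents of your SGML environment, or on the SGML file exported from LyX.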

5.10. Tables

For table captions and titles to be output correctly, you have to insert tables in the following way:

  • Be sure that your cursor is in the “Standard” environment (see Section 5.1).

  • Insert a table float (from the menu: Insert-->Floats-->Table)

  • You get a table float containing one line. Change the environment on that line to “Caption”.

  • You see the text “Table#:” and the cursor is positioned immediately to the right of it. Insert a label for the table (from the menu: Insert-->Label).

  • After the label, on the same line, type the table title and press <enter>. This will produce a <para> element that we will eliminate in sedscr (see Section 7.1.4.1).

  • On the new line, insert your table as usual.

A warning about an "end tag for element "TABLE" which is not open", when processing the SGML file with Jade, is the lesser evil (see Chapter 6) and is harmless (it is probably caused by a LyX bug in version 1.2.0).

A LyX related concern is that text within a cell will not wrap to fit the page, so if a line of text in a table is too long, the table will extend beyond the right margin of the page. Similarly, LyX's table inset will not split itself at the bottom of a page, and so might extend below the bottom margin. You have these options to resolve this problem (taken from the LyX User's Guide):

  1. Split it into two tables, to correctly handle pagebreaks and margins.

  2. Select the Longtable button in the Table Layout dialog. This automatically splits the table over more pages, if it is too high. After doing this, the list of Longtable buttons activate themselves and you may now define:

    1. FirstHead: The current row and all rows above that don't have any special options defined, are defined to be the header-lines of the first page of the multipage-table.

    2. Head: The current row and all rows above that don't have any special options defined, are defined to be the header-lines of all pages of the multipage-table except for the first page if FirstHead is defined.

    3. Foot: The current row and all rows above that don't have any special options defined, are defined to be the footer-lines of all pages of the multipage-table except for the last page if LastFoot is defined.

    4. LastFoot: The current row and all rows above that don't have any special options defined, are defined to be the footer-lines of the last page of the multipage-table.

    5. NewPage: This forces a pagebreak after the row where this flag is defined.

    If you define more than one flag in the same table row, be aware that only the first flag is used in the defined table rows. The others will then be defined as empty. In this context, first means first in this order: Foot, LastFoot, Head, FirstHead. See the TableExamples.lyx example file to see how this works.

  3. Use the Width entry in the Table Layout dialog to restrict the width of the table till it fits horizontally.

  4. A table can also be placed in a float, as described below, which will allow TeX to place it as well as it can within the page.

One last remark: Longtable and Rotate 90° use special LaTeX packages, so you should look into LaTeX configuration in the Help menu to see if your system supports these features.


5.11. Table of contents

LyX provides its own command for the creation of a Table of Contents (from the menu: Insert-->Lists&TOC-->Table of Contents) - but you should not use it! Whether a table of contents will be created or not is entirely controlled by the stylesheets (see Section 4.2 and Section 7.1.5). For example, the following code in a stylesheet specifies that a Table of Contents has to be created for document type “Article”:

(define %generate-article-toc%
  ;; Should a Table of Contents be produced for Articles?
  #t)

5.12. List of figures, tables and equations

A list of figures is created automatically for the document type “DocBook book SGML”. You don't need to do anything here. The same is true for lists of tables and equations. This holds as long as you use the DSSSL stylesheets (that form part of DocBook, see Section 3.2) with their default settings. In case you don't get a list when you think you should, read in Section 7.1.5 about the generate-book-lot-list stylesheet parameter.


5.13. Epigraphs

An epigraph is a short inscription at the beginning of a document or component. LyX only provides the “Quotation” environment, which is mapped onto the blockquote SGML element, when exported. For an epigraph as such, there is no provision (as is the case with all SGML tags that do not have a clear correspondence to TeX/LaTeX environments - see Section 5.1 for the notion of an environment in LyX).

However, you can always define an SGML environment and write the necessary SGML tags there. For an epigraph with Norm's favourite quote ;-), you would then write:

<epigraph>
<attribution>William Safire</attribution>
<para>
Knowing how things work is the basis for appreciation, and is
thus a source of civilized delight.
</para>
</epigraph>

See Chapter 7 for what the end result looks like.


5.14. SGML code in program listings

If you want to write executable SGML code in LyX, e.g. to implement an admonition (see Section 5.8) or a callout (see Section 5.9), there is no problem: just create an environment of type “SGML” and enter your code there. It will be part of the exported SGML file.

But if you want to show some example SGML code, as in a program listing, then it might be a better idea to move the code into a separate file and include that file by reference. This is because if the SGML code contains “<![CDATA[” and “]]>”, it is likely to interfere with the normal SGML processing, causing errors like the following:

/usr/bin/openjade:<OSFD>0:382:323:E: marked section end not in marked section declaration

To include a file, choose from the menu Insert-->Include file. In the dialog box, choose “Verbatim” and enter the file name. The file should preferably be located in the same directory as the SGML file exported from LyX (although it need not be).

LyX will use the “inlinegraphic” SGML tag to include the file. This will collide with inline graphics, if you want to include them using this method too (see Section 5.7.1 and Section 7.1.6).
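
To decide quickly whether a listing is better kept in a separate file, you can look for marked-section delimiters in it. A minimal sketch follows; the function name has_marked_section is an invention for this example, and it only looks for the two delimiters mentioned above, not for every construct that might upset the parser.

```shell
# has_marked_section FILE  (hypothetical helper)
# Succeed if FILE contains a marked-section start "<![" or end "]]>",
# i.e. if it is a candidate for inclusion through Insert-->Include file.
has_marked_section() {
    grep -q -e '<!\[' -e ']]>' "$1"
}
```

A possible use is: has_marked_section listing.sgml && echo "include listing.sgml as a Verbatim file" (listing.sgml being a placeholder name).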


5.15. Filenames

It would be nice to have a “Filename” environment for DocBook SGML articles or books in LyX; it would make computer documentation much nicer. But I was unable to find one, or a workaround for it. If you find a way, please let me know. Of course, a poor man's solution could be to mark a filename with the “!” button, making it part of an <emphasis> tag, which would then appear in italics. But this contradicts the very idea of SGML: separation of content from formatting.


5.15.1. Labels as filenames

You should always put a label after the chapter/section/subsection/subsubsection title (from the LyX menu: Insert-->Label). The text you use for the label will become the name of the HTML file that contains that chapter/section etc. See Section 7.1.5 for the code that controls this setting in the stylesheets (it is the use-id-as-filename DSSSL parameter that does this).

Tip SEO (Search Engine Optimization) tips
 

There is circumstantial evidence that the search engines (especially Google) give substantial weight to the keywords that appear between <H1>, <H2>, <H3> tags (that's the chapter and section titles), especially if they also appear as part of the HTML filename. You are thus well advised to follow some simple rules for your titles (er... filenames ;-)):

  • Always create a label. Failing to do so will produce an automatic, meaningless filename. You lose a potential (although small) ranking boost from the search engines if your section is about, say, blue widgets, but is called x543.html instead of blue-widgets.html.

  • The label should contain the same text as the title. The title you enter in LyX will automatically be the HTML title in the header part of the resulting HTML file (this is meta-information, not visible to you, but very well visible to the search engines), as well as the title inside the <H1>, <H2>, or <H3> tag that is produced automatically in the HTML body. If the label is similar to the title, then the HTML filename will be similar to the titles (meta-title and actual text title) of the text inside it. This will be rewarded with more weight in the search engine ranking, resulting in a higher place in the SERPS (Search Engine Result Pages). And believe me, even if you hate search engines, it is much easier to find a chapter by reading "talking" filenames, like blue-widgets.html, than cryptic ones like x543.html!

  • Don't use blanks. Use a hyphen “-” instead. This makes use of the fact that for most search engines “-” is equivalent to a blank. This is not the case for the underscore “_”.

  • Don't use illegal characters, i.e. characters that are not allowed in HTML filenames.
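
The rules above can be sketched as a small shell function. This is only an illustration, not part of the lyxtox package: title_to_label is a hypothetical helper, and its notion of "illegal" characters (everything except lowercase ASCII letters and digits) is an assumption that is deliberately stricter than necessary.

```shell
# Hypothetical helper, not part of the lyxtox package:
# derive a label (and hence an HTML filename) from a section title.
# Steps: lowercase the title, drop apostrophes, turn every run of other
# characters into a single hyphen, then trim hyphens at both ends.
title_to_label() {
    printf '%s\n' "$1" |
        tr '[:upper:]' '[:lower:]' |
        sed -e "s/'//g" \
            -e 's/[^a-z0-9]\{1,\}/-/g' \
            -e 's/^-//' \
            -e 's/-$//'
}

title_to_label "Blue widgets"    # prints: blue-widgets
```

You would then type the printed text into the Insert-->Label dialog by hand; the function only helps you derive a label that matches the title.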


5.15.2. Cool labels don't change!

There is one thing to keep in mind when regrouping: chapter and section labels - DON'T change existing ones! You may change their position, but please not the label. Subsection and subsubsection labels can be changed without problem.

Why? Because chapters and sections will become separate HTML documents. There is a DSSSL stylesheet setting which controls how deep a level will still produce a separate HTML document (see the description of chunk-section-depth in Section 7.1.5). The names of the documents will be the labels of the chapter and section respectively. This is a behaviour we explicitly set in the DSSSL stylesheets (see Section 7.1.5) through the use-id-as-filename DSSSL parameter.

Obviously, you can move a section around, without affecting the HTML name of the resulting file, if you don't change its label. You can of course change its contents, put it somewhere else as a section of a different chapter etc., but you should leave the label untouched.

The problem is that, if the document is already on the Web and is receiving a lot of visits, most of them (experience suggests a number around two-thirds of the total number of visits) will be from search engines. If you change the section label, the HTML name changes. Consequently, the link from the search engines is no longer valid. The same is true for private bookmarks, or public bookmark lists.

But there is more to it: it's not only a matter of waiting 2-3 weeks for the new HTML document that contains the old (a bit reorganized) content to be indexed by the search engines. It's that the old name might have been on page 1 of the SERPS (Search Engine Result Pages) of some search engine for some keyword, because a lot of other people linked to it. Now, with a changed label and, consequently, a new HTML file name, those links do not reference the new document, and it gets a ranking close to "nowhere" (because search engines, notably Google, take links to a document to mean "votes" for that document and rank that document accordingly). The result: nobody finds it.

Note It's not a question of sacrificing quality!
 

I am not trying to tie your hands in favour of a higher search engine ranking here! This discussion is not one of quality vs. ranking, but one of consistency. All I am advocating is: keep your labels (and consequently your filenames) consistent between various releases of your document! Once you have chosen a label for a chapter or section, stick with it.

Please also note that we are not talking about the title of a chapter or section, but its label. "Label" is LyXese for the "SGML id". You get a label from the "Insert -> Label" menu of LyX. Don't confuse title and label in this discussion!

For example, suppose a chapter with the title “Blue widgets” is at around place 6 out of 2.5 million (!) for "blue widgets" on Google. If you change the label of the chapter from "blue-widgets" to something else, then the original URI will disappear and you lose readers. Of course, changing the title also affects the SERPS (Search Engine Result Pages), but not so drastically as to eliminate the document altogether. However, Google likes a title that is correlated to the file name, so a title "Blue widgets" and a label "blue-widgets" are optimal from the SEO (Search Engine Optimization) point of view (see Section 5.15.1).

“But you are talking me into subjecting my writing to the whims of a search engine!”, you might counter. Nothing could be further from the truth! The point is not to restrict your writing. The point is: you write whatever you like, however you like and structure it as you please. There are rules for good writing that you might choose to observe, or not. There are also rules for good “copy”, from the point of view of keywords, search engines and ranking - which you are also free to observe or defy.

Then, at some point, you decide to put the document on the web. People will come, read it and, hopefully, find it good. Those people may like your document so much that they go to the trouble of saying something like "there's a cool document on blue widgets in this link here" - and link to it. Hundreds of people may do this perhaps - even thousands. Imagine the effort!

Now you come up with a new restructuring of your document - fine! You change the content - also fine! Then you change the label from:

<sect1 id="blue-widgets"><title>Blue widgets</title>

to:

<sect1 id="blue-widgets-2"><title>Blue widgets revisited</title>

In LyX, this is equivalent to changing the title from “Blue widgets” to “Blue widgets revisited” and the label from “blue-widgets” to “blue-widgets-2”. Perhaps you thought it would be a nice idea to change the title to reflect the reorganization. This will affect your ranking too, but then, almost everything that you write will affect it, so we will not discuss it here. (Actually, it will affect it a little more because it's in the title - but that's again not the point.)

But by changing that label from “blue-widgets” to “blue-widgets-2” you just managed to throw your document from place 6 to place 600 (or 6000, or...) in the SERPS. You just killed all the efforts of thousands of people that linked to your document!

Why?

Because labels become filenames in the document process from SGML to HTML (see Chapter 7 for a detailed explanation of this process). The document that would be blue-widgets.html now is blue-widgets-2.html. The original blue-widgets.html is nowhere to be found in your domain - hundreds, or even thousands of links on the Web now point to vacuum!

Google - and every other search engine - sees this and takes the old URL out of the index. Of course, it indexes the new one. But the new one does not have any links pointing to it - not yet. And perhaps people will not be willing to go to the trouble of changing all their documents, just because you wanted to keep your freedom of choosing (and changing!) the label (and the resulting HTML filename) at your whim. Thus, no one points to the new "reorganized" document. It is rated very low and appears at place...uhmm 1 million something, out of 2.5 million results for "blue widgets", where nobody will find it and nobody will read it. Remember, the original document ranked at place 6 out of 2.5 million!

You might think that, since the label-to-filename connection exists only for the “chunked” version (the version where openjade is instructed to split the document into separate HTML files, one per chapter or section, the so-called “chunks”, as explained in Section 7.1.4.6), the “unchunked” document will save you from this disaster. You are correct, the "single chunk documents" (single, big HTML file, TXT, PDF or PS versions) will not be affected.

If you only make the big HTML file, or the TXT, PDF and PS versions of your document available on the web, then you are not affected. But if you also made the chunked HTML version available at some point, the search engines will prefer to return results from this version rather than from the others.

There are various reasons for this, one of them being that search engines don't read a document that is too long to the end and will thus index small chunks much better than huge texts. Another reason is that you need more links to a PDF document to force a search engine to consider it important for indexing.

So forget about the huge, one-chunk docs as a search engine strategy. If you want to be found by the SEs, you must rely on the chunked versions - and perhaps a little on PDF, but only a little.

However, my point goes even further: we are not talking about a user who is searching for a unique, multiple keyword phrase that identifies the content of your reorganized document. We are talking about a user who just searches for two keywords: “blue widgets”. If you change the label, you change the filename of the chunked version. If you do so, the search engine will NOT think "Ahh...the file blue-widgets.html is not there, let's present the huge document that contains all chapters, including the one on blue widgets - at the same ranking place"! There are three reasons why this will not happen - and why you should not rely on it:

  1. First, the search engine does not know that blue-widgets.html is just a chunk of some "whole" document, book1.html. There is nothing that a search engine does to find this out - not with today's technology. The two documents are different from the search engine point of view.

  2. Second, the big one, book1.html, contains much more text, therefore the importance of the "blue widgets" chapter is "diluted" by the surrounding, irrelevant text (irrelevant to what the user is searching for with those keywords, namely "blue widgets"). This has to do with “keyword density”, titles, structure and other “on-page” factors that the search engine calculates and takes into account for each page. Therefore, the document will rank at a place that is way back - invisible to all but the most determined searchers, practically dead.

  3. Third, if you are a HOWTO author, you may put your document on The Linux Documentation Project, which is a great place with good exposure to the Web, but that alone does not guarantee good ranking. What is also important, is that people link to it. But if you change an existing label, thus changing the filename of the chunked version (which is the most important one from the search engine point of view for the reasons stated above), then you kill all the links to the previous URL. You destroy what you were able to gather up to that point in terms of search engine visibility. You start anew.

Tip Use permanent redirects, if you do choose to change the label!
 

Let me put a preemptive disclaimer here: I know that you can put an HTTP "permanent redirect" in your .htaccess file to indicate that the resource is now somewhere else, under a different name. But this makes URL management difficult for a webmaster. How on earth shall the webmaster know which labels some author, whose document he hosts on his website, changed in his last reorganization? Is he supposed to do nothing else all day long, other than chasing diff outputs and editing .htaccess files? Just because the author wants to keep his freedom of changing labels at his whim? Remember, if it is a free document, it will find its way to other people's websites. People may link to files of those websites containing your document, even more often than they do to yours. When you release a new version with changed labels (and accordingly, filenames), do you send a list of changed labels to all those webmasters who host it, with the request to update their .htaccess files?

Certainly not.

Nevertheless, if you are the author and have decided to change the label of some chapter or section, don't let your web server send an "Error 404: not found" for the old URI. Let it send a “permanent redirect” instead. See Managing URIs and the links therein, for the preferred ways to handle this situation.

But again, a redirect has to be in your .htaccess file (or web server configuration file) until the last request for the old URI has been seen in your web logs (theoretically, at least, otherwise you lose the visitor). How long is this? One year? Ten years? How big does your .htaccess become if an author starts "reorganizing" his labels(!) every other week? How much of a performance penalty will you have to pay for your web server having to read and process huge .htaccess files on every page request?

Thus, every redirect will hurt you, either in terms of visibility in the SERPS, or in terms of complexity, or both. But if you change a label, don't forget the permanent redirect.
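
If you do end up renaming labels, the redirect lines can at least be generated rather than typed. The following is a minimal sketch, not part of the lyxtox package: make_redirects is a hypothetical helper, and the /docs path and the www.example.org host are placeholders. It assumes an Apache server with mod_alias, whose Redirect directive accepts the "permanent" keyword.

```shell
# Hypothetical helper: read "old-label new-label" pairs, one pair per line,
# from standard input and print one Apache mod_alias directive per pair.
#   $1: URL path under which the chunked HTML files live (placeholder: /docs)
#   $2: base URL of the site (placeholder: http://www.example.org)
make_redirects() {
    path=${1:-/docs}
    site=${2:-http://www.example.org}
    while read -r old new; do
        # Skip empty or incomplete lines.
        [ -n "$old" ] && [ -n "$new" ] || continue
        printf 'Redirect permanent %s/%s.html %s%s/%s.html\n' \
            "$path" "$old" "$site" "$path" "$new"
    done
}
```

Feeding it the line "blue-widgets blue-widgets-2" yields a directive that sends requests for /docs/blue-widgets.html to the new file. The output would be appended to .htaccess, which is exactly the file that keeps growing with every renamed label.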

For the above reasons, most of the time you will not feel the need to change labels while reorganizing, but think twice about it if you must. Your best bet is to choose a label wisely (for the same reasons that you would Choose URIs wisely), perhaps with a name that is a bit more general than you might wish, but will still fit if you choose to change content, or even title, later on.

Cool URIs don't change. Cool labels don't change either.


5.16. Examples

There is no “Example” environment in LyX for document types of DocBook SGML book/article. You could certainly “simulate” it with a pure SGML environment, just as we did it for admonitions (Section 5.8) and callouts (Section 5.9), but I will not pursue this further here.


5.17. Mathematics

You write Mathematics in LyX the usual way (see, for example, the Tutorial and the User's Guide that come with LyX). There is absolutely nothing special you have to take care of. Here is an example:

Equation 5-1. (eq1)

Whatever you type in math, it will be typeset by TeX and be available in all formats - HTML, PDF, PS and RTF! Isn't it great?

You should nevertheless read Section 10.1 to ensure that you installed the necessary software for math processing and Chapter 10 to get an idea of what is going on in the background. There are more Mathematics examples for LyX in Section 10.2, where the subject is discussed in more detail.


5.18. Appendix

LyX offers the possibility to mark a part of the document as being the Appendix: from the top menu, choose Layout -> Start Appendix here. Unfortunately, this will not work when the document is exported to SGML. We are thus left alone and have to implement a workaround. I have implemented the following solution:

  • The writer provides an extra LyX file, with the fixed name “appendix.lyx”, in the same directory as the main document.

  • The appendix.lyx file must be of type “DocBook article (SGML)”, to be chosen from the LyX menu: Layout -> Document. A document type of “DocBook book (SGML)” will NOT work, even if the main document is of book type! This is because if you set the document type to book, you will have to start with Chapters (instead of Sections) and then you will get parsing errors from openjade saying that CHAPTER is not allowed by the DTD at that place.

  • The appendix is marked as such, using LyX' menu: Layout -> Start Appendix here, with the cursor at the very start of the Appendix.

That's all! The lyxtox script will check if a file called appendix.lyx is there and will take the appropriate steps to incorporate it into the document (see Section 7.1.9).

Tip How to insert cross-references to the Appendix
 

To insert a cross-reference to the Appendix (more precisely to a section, subsection, subsubsection, table, figure or equation in the Appendix), you use the same mechanism in LyX as you would for cross-references to labels that exist in a separate file: from the Insert menu, choose "Cross Reference", then, from the drop-down list for "Buffer", choose the appendix.lyx file (which you should have already opened in LyX). LyX will then present you with all available labels from the appendix.lyx file to choose for your cross-reference.


5.19. Bibliography

So you want to use LyX to process citations and include a reference list at the end of your document? And all this in SGML? You are at the right place!

To begin with, here is just a test citation: (8). And another one: (9). Take the time to examine them and their formatting in the version (HTML, PDF, PS, RTF or TXT) you are currently reading.


5.19.1. Bibliography without RefDB

As stated in Section 3.11, you are not confined to using RefDB with my lyxtox script. If you don't feel like building your own bibliographic database, you can just supply a bibliography.lyx file together with your LyX document. Set the process_RefDB variable in lyxtox to "0" and it will use your own bibliography.lyx to produce a bibliography.sgml file, instead of trying to create one automatically through RefDB. The bibliography.lyx file should then contain the SGML code for the references list, in the SGML environment of LyX. The GNU/Linux Command-Line Tools Summary HOWTO uses this approach, for example.

This means that if you don't want to use RefDB, but still want to have a bibliography, your bibliography.lyx file should look like:

<bibliography id="references" title="References"> 
<bibliodiv>
<biblioentry xreflabel="KARAKAS1992">
<biblioset>
<author>
<surname>Karakas</surname>
<firstname>Chris</firstname>
</author>
<title>Neuronale Lernregeln und andere Methoden für lineare Assoziation und Trennung</title>
<publisher>
<publishername>BoD GmbH, Norderstedt</publishername>
</publisher>
<abstract>
<para>
An examination of neural network methods and methods from classic optimization theory for solving the problem of linear association and separation. After a formal mathematical definition of neural networks, the linear association problem and some so-called "learning rules" for its solution are considered: 
...more Abstract text here.
</para>
</abstract>
</biblioset>
</biblioentry>
...
more <biblioentry> elements here
...
</bibliodiv>
</bibliography>

The whole code should be in the LyX SGML environment (see Section 5.1 for an explanation of LyX environments). You can even insert URLs and cross-references the usual LyX way (from the LyX Insert menu) and they will be correctly exported to SGML. The layout of bibliography.lyx should be “DocBook Chapter (SGML)” (to be set from the LyX Layout menu). You don't need anything in the preamble or elsewhere. See the bibliography.lyx file in the Formats section of GNU/Linux Command-Line Tools Summary HOWTO for an example.

To cite a reference from a reference list created this way, you have to switch to the SGML environment for the whole paragraph containing the citation, then write as in the following example:

<para>
Consult <citation>KARAKAS1992</citation> for more details on
neural learning rules.
</para>

Note that we use the value of the xreflabel attribute to refer to the reference entry.

Note Formatting your Reference List.
 

Note that this way of creating a Reference List and citations does NOT cover formatting of either citations or the entries in the Reference List itself. Note also that each journal (or medium) has its own formatting expectations as far as the Bibliography is concerned. You will have to write your own DSSSL driver file to accommodate this need. RefDB (10) (Section 5.19.2) takes this burden off your shoulders for the slight overhead of setting up a database and installing the RefDB package (Section 3.11).


5.19.2. Bibliography with RefDB

You don't need to care about a bibliography.sgml file if you use RefDB - the lyxtox script will create it automatically with the help of RefDB, totally transparently to you. All you need to do is install and set up RefDB (see Section 3.11 and Section 4.13 respectively), and set the following variables in lyxtox:

  • Process_RefDB to “1”.

  • RefDB_db to the name of the RefDB database containing the bibliographic entries you plan to cite (you created this database in Section 3.11, the name in the example was “ck_refdb” and is the default one in lyxtox - you should of course change this).

  • REFDB_style to the style of the journal you want your Bibliography to conform to.

Note that this name must contain a dot at the end, because lyxtox will create the file ${REFDB_style}dsl.

Example: for REFDB_style="J.Biol.Chem.", lyxtox will automatically create (through RefDB) the file J.Biol.Chem.dsl, i.e. the “dsl” ending is appended without any extra dot. The default value is “J.Biol.Chem.”, one of the two standard styles shipped with RefDB, and should get you started. See How to manage bibliography styles for more details on bibliography styles in RefDB.
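
Because a missing trailing dot silently produces a wrong file name, a small guard can help. This is a minimal sketch and not code taken from lyxtox: style_file is a hypothetical function that mirrors the ${REFDB_style}dsl expansion described above and refuses style names that do not end with a dot.

```shell
# Hypothetical guard, not part of lyxtox: print the name of the driver file
# that corresponds to a RefDB style name, mirroring ${REFDB_style}dsl.
style_file() {
    case $1 in
        *.) printf '%s\n' "${1}dsl" ;;
        *)  printf 'style name "%s" must end with a dot\n' "$1" >&2
            return 1 ;;
    esac
}

REFDB_style="J.Biol.Chem."
style_file "$REFDB_style"    # prints: J.Biol.Chem.dsl
```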

The only thing that remains is... cite of course!

Citing with RefDB in LyX is a bit tricky, since LyX does not allow citing the way RefDB is expecting it: with a role attribute with the value “REFDB” in the SGML citation code. Of course, we could use a pure SGML environment as in Section 5.19.1 and write:

<para>
Consult <citation role="XXXX"><xref linkend="KARAKAS1992"></citation> 
for more details on neural learning rules.
</para>

each time we want to cite something. But there is a better way: RefDB can create that SGML code automatically, if it finds something like:

<para>
Consult <citation role="XXXX">"KARAKAS1992"</citation> 
for more details on neural learning rules.
</para>

in the SGML file. But how can we convince LyX to output something like the above for each citation when it exports our file to SGML?

The solution I found to this is:

  1. Export all the bibliographic entries you plan to use in a file, say refdb.ris (see Section 4.13 on how to do this).

  2. Use the awkscr_cit AWK script to create a new LyX file from refdb.ris:

    awkscr_cit refdb.ris > citlabels.lyx
    

    The citlabels.lyx file is a LyX file containing only labels, one for each bibliographic entry in refdb.ris. The labels have the prefix "cit:" to distinguish them from true labels. To stay with the above example, the bibliographic entry with ID “KARAKAS1992” will produce a label “cit:KARAKAS1992” in the citlabels.lyx file created by awkscr_cit.

  3. Open the citlabels.lyx file in LyX, together with your document.

  4. To cite an entry, you (mis)use the “Insert Cross-Reference” feature of LyX: from the LyX menu, choose “Insert -> Cross Reference”. In the upcoming dialog box there is a drop-down list field called “Buffer”. The Buffer serves as the source of the labels you want to reference. Usually, “Buffer” is set to the current file. You change this to “citlabels.lyx” by clicking on the dropdown field and choosing the citlabels.lyx file. Now you are presented with all labels of citlabels.lyx. Choose the label containing the ID of the RefDB bibliographic entry you want to cite. In our example, you choose “cit:KARAKAS1992”, if you want to cite the entry with ID “KARAKAS1992”.

That's all! Don't worry about that “cit:” prefix, sedscr takes care of it. For every cross-reference to a label of the form “cit:IDsome”, sedscr will create a citation of the form:

<citation role="REFDB">"IDsome"</citation>

in the exported SGML file. Then, lyxtox will call the appropriate RefDB tools to transform this citation to

<citation role="REFDB"><xref linkend="IDsome"></citation>

or some more complex form according to the bibliographic style in use. Again, you don't have to worry about this procedure. Just create the citlabels.lyx file as shown above and insert cross-references to its labels in place of your citations. You don't have to bother about SGML code and you can't get the ID's wrong since you get a list of all available labels in citlabels.lyx to choose from. On the other hand, you can't get anything but “simple” citations this way, but at the moment I can live with that. See Section 7.1.10 for a detailed explanation of what lyxtox and RefDB do for you in the background to produce correctly formatted citations and a correctly formatted Reference List.
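
To make the first rewriting step concrete, here is a minimal sed sketch. It is not the actual sedscr code, and it assumes that LyX exports a cross-reference to the label “cit:IDsome” as <xref linkend="cit:IDsome">; check your own exported SGML file before relying on that form. The function name cit_to_citation is hypothetical.

```shell
# Hypothetical filter: turn every cross-reference to a "cit:" label into the
# citation form that RefDB expects. Reads SGML on standard input.
# Assumption: the cross-reference is exported as <xref linkend="cit:ID">.
cit_to_citation() {
    sed 's|<xref linkend="cit:\([^"]*\)">|<citation role="REFDB">"\1"</citation>|g'
}
```

Cross-references to ordinary labels, which lack the “cit:” prefix, pass through unchanged, which is the whole point of the prefix.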

More elaborate solutions (using a citlabels.lyx file containing two or more labels for each bibliographic ID, each one with a different prefix, serving different citation purposes, "simple" or "complex" ones) are possible, but my hope is that this work will convince the LyX development team to incorporate RefDB directly in LyX in a future version. In the meantime, the method presented works fine - here and now.


5.20. Index

The index will be generated automatically by the lyxtox script (which will call the collateindex.pl Perl script that came with the docbook-dsssl-stylesheets package you installed in Section 3.2) as a separate file. You must arrange to have this file incorporated into your document. The easiest way to do this is by file entity reference. In the preamble of your document, add an internal subset that defines the index file entity:

<!entity index SYSTEM "index.sgml">

We have done this already in Section 4.6. See Section 7.1.9 for the sed scripts that insert the appendix SGML entity at the end of the SGML file and Section 7.1.11 for the explanation of the subsequent index generation.

From the user perspective, all that remains is to enter the words that shall be included in the index. This is done as usual in LyX: from the menu, choose either Insert-->“Index entry of preceding word” (which I personally find easier), or Insert-->“Index entry”, then enter the required word.

Important Don't underestimate the importance of an Index!
 

It is easy to neglect index generation. It is so boring, having to read the whole document again, word for word, and hour after hour use the menu to "insert index entry of preceding word"! But don't underestimate the importance of a good, complete index for your document, especially if it is the size of a book! A knowledgeable reader will always appreciate the possibility to use the Index to arrive at the information he is searching for, rather than the usual time-consuming physical, or, even worse, virtual page browsing. Plan a whole working day of index generation for every 100 pages of text.


5.20.1. Automatic Index generation

LyX provides an easy way to insert an Index entry (see Section 5.20): from the menu, choose either Insert-->“Index entry of preceding word” (which I personally find easier), or Insert-->“Index entry”, then enter the required word. This method works fine - if you have a small document, with only a few keywords to insert. But what if your document has grown to hundreds of pages, with hundreds (or even thousands) of index entries to insert? See the Index of the PHP-Nuke HOWTO for an example of an Index that cannot be generated manually - unless you want to drive yourself crazy!

Clearly, for a comprehensive Index of large documents, an automatic procedure is necessary. However, the general problem of automatic Index generation is the subject of extensive (and still inconclusive) research and I am not going to address it in its full generality here. For our purposes, even a semi-automatic procedure would be very helpful. To this end, I have created the following 4 scripts:

  • sedscr_list_index_items: lists all index entries contained in a LyX document.

  • sedscr_delete_index_items: deletes all index entries from a LyX document.

  • awkscr_create_index_items: creates a list of words used in a LyX document. The list can be subsequently edited manually, mostly deleting unwanted or uninteresting words, to yield a list of words that are used in the document and are interesting enough to be part of its Index.

  • awkscr_insert_index_items: uses an externally supplied document containing a list of index entries to insert an index entry in a LyX document for every word appearing in that list.

They can be used in the following semi-automatic Index generation procedure:

  1. Optional: create a list of all existing index entries in your document. This is useful not only because you are going to eliminate all index entries from the document in the next step, but also as a backup of the index entries that were currently in use - you might want to reuse them in some later step.

    To create a list of all existing index entries in your document, type:

    sedscr_list_index_items document.lyx > indexitems
    

    The generated indexitems file will contain a list of all index entries in document.lyx, one index entry per line, with a semicolon at the end of each line. The semicolon will be used later as a record delimiter in the awk scripts that follow, so don't let it irritate you.

    To get an alphabetically sorted list of index items, without duplicate entries and with all symbols at the beginning of the list, use the sort and uniq utilities as follows:

    cat indexitems | sort | uniq > indexitems.sorted
    mv indexitems.sorted indexitems
    
  2. Remove all previous index entries from the LyX document. You need this preliminary step because, if you forget to remove already existing index entries, a subsequent run of the awkscr_insert_index_items script may substitute even the existing index terms (those already inside the LyX \index commands) with LyX \index commands. This may or may not happen, depending on the regular expressions used in the current implementation of awkscr_insert_index_items, but it is better to err on the side of caution. What will happen, however, is that repeated invocations of awkscr_insert_index_items will add index entries besides already existing ones. You will thus end up with a document that contains double index entries for each index term in your indexitems file.

    Besides, there is another reason why you might want to remove all index entries from your LyX document: a LyX text cluttered with index entries may still be a breeze to read for a computer, but quite a headache to read for humans.

    To remove all index entries from a LyX document, type:

    sedscr_delete_index_items document.lyx > document-noindexitems.lyx
    

    The generated document-noindexitems.lyx will contain everything from document.lyx - except the index entries.

  3. Create a list of all index entries to be used in the LyX document. This is the most difficult part: as said above, this problem is not trivial. We will thus content ourselves with a list of all words used in the document. Once we have all words, we can still edit the list manually and delete all unwanted entries. This is what makes this procedure semi-automatic and not automatic. The idea is that it is still better having to delete 10000 lines from a 12000 line document, than having to insert 2000 index entries from the LyX Insert menu.

    To create a list of all words used in a LyX document, type:

    awkscr_create_index_items document.lyx > words
    

    There is even some code in awkscr_create_index_items that checks whether the current word is in some “trivia” list of trivial words and discards it. In such a case, you would call the script with two arguments, as follows:

    awkscr_create_index_items trivia document.lyx > words
    

    However, this part of the code is either too slow or buggy, so it is commented out for the moment (feel free to send corrections or suggestions).

    It is a good idea to sort your words alphabetically and delete double entries, so do:

    cat words | sort | uniq > words-unique
    mv words-unique words
    

    Once the list of all words of your document is created, all you have to do is open it with a text editor and delete all unwanted words or correct the ones that are in plural or have some punctuation at the end and so on. This is still hard if your document is large, but still a faster alternative than targeting the Insert menu with the mouse 8000 times (I guess each one of my 2000 index entries appears 4 times in my document, which gives me an estimate of 8000 menu selections with the mouse - unfortunately no keyboard bindings were found to work on my system).

    You should delete all lines containing characters that could be interpreted as metacharacters of regular expressions: *, +, ?, $, &, ^, \ - and probably many others. Don't try to escape them, it will not work: awkscr_create_index_items will replace the correct string with the escaped string, adding an index entry for the escaped string too! This is not what you want. What is needed here is a mechanism that searches for the escaped string, but replaces it with the verbatim one (i.e. the string without the escaping backslashes). This is still work to be done (FIXME).

    Practically, this restriction means that you will have to add your index entries for symbols like *, +, ?, $, &, ^, \ manually, each time after you run awkscr_create_index_items.

  4. Once you have a file, say indexitems, with all words that should appear in the Index of a LyX file, type:

    awkscr_insert_index_items indexitems - < document-noindexitems.lyx > document-indexitems.lyx
    

    to create from document-noindexitems.lyx a document with index entries (document-indexitems.lyx) for all words in indexitems.

Warning Long execution time!
 

The current implementation of awkscr_insert_index_items takes really long to execute if the indexitems file is large: for 3000 words in indexitems, producing about 9000 index entries in the final document (of which 3000 are duplicates), the script may well need 1-2 hours on a 3.4 GHz Pentium - go get a cup of coffee!

Some notes on awkscr_insert_index_items's mode of operation:

  • The “-” in the above invocation is important: it forces the awk script to continue reading from standard input, after it has read indexitems. This, together with the code

    FILENAME == "indexitems" {
            n++
            indexentry[$1] = $1
            next
    }
    

    in awkscr_insert_index_items, causes the words in indexitems to be imported into the indexentry[] associative array.

  • The field separator in awkscr_insert_index_items is set to the semicolon “;” instead of the default, which is space. This makes it possible to enter index entries consisting of more than one word. Accordingly, the awkscr_create_index_items script appends a semicolon at the end of each word it prints - you should leave these untouched!

  • awkscr_insert_index_items follows a simple algorithm to insert the index entries at the right places in the document: to insert an index entry, we have to know what LyX environment (Section 5.1) we are in. In essence, this means we have to parse the LyX document. Since the \layout commands in the LyX file do NOT have what we would call “closing tags” in other markup languages, we cannot tell awk “if you are between the start and the end of the Paragraph environment, do the following”, or anything like that - there is no easy way to find the “end” of an environment, given all the environment nestings that are possible. Luckily, another easy way exists: whenever a \layout command is encountered, we are in the environment specified by that \layout command, so we only need to set a variable, call it layout, accordingly:

    /\\layout SGML/ { layout = "SGML"; print; next }
    /\\layout Chapter/ { layout = "Chapter"; print; next }
    /\\layout Section/ { layout = "Section"; print; next }
    /\\layout Subsection/ { layout = "Subsection"; print; next }
    /\\layout Subsubsection/ { layout = "Subsubsection"; print; next }
    /\\layout Standard/ { layout = "Standard"; print; next }
    

    ...and so on

  • Clearly, we should not insert index entries everywhere, e.g. in the “Code” environment. That's why we check if we are in one of the “Standard”, “Itemize”, “Enumerate”, “Quotation” or “Description” environments (warning: the way sedscr works currently, you should not insert index entries in the “Caption” environment) and, if we are (and only then), we substitute every word in the indexentry[] array with the word followed by the LyX “insert index entry” command:

    {
            if ( (  layout == "Standard" ||
                    layout == "Itemize" ||
                    layout == "Enumerate" ||
                    layout == "Quotation" ||
                    layout == "Description" ) && ( inset == 0 ) ) {
                    for (item in indexentry) {
                            if (gsub(" " item " "," " item " \n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub("^" item " "," " item " \n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item "$"," " item "\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item ":"," " item ":\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item "\\."," " item ".\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item ","," " item ",\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item "\\?"," " item "?\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item ";"," " item ";\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " item "\n"," " item "\n\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                            else if (gsub(" " "\"" item "\""," " "\"" item "\"\n\\begin_inset LatexCommand \\index{" indexentry[item] "}\n\n\\end_inset \n")) { continue }
                    }
                    { print; next }
            }
    }
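
The two mechanisms described above (reading the word list first and then continuing with standard input, and remembering the current \layout in a variable) can be tried out in isolation with a small sketch like the following. This is not the real awkscr_insert_index_items: the function name insert_index_demo and the [INDEX:...] marker are made up for illustration, only the “Standard” environment is handled, and only words surrounded by spaces are matched. It compares FILENAME with ARGV[1] instead of a hard-coded "indexitems", so that the word list may have any name.

```shell
# Hypothetical demo, not the real awkscr_insert_index_items.
# Usage: insert_index_demo WORDLIST < document.lyx
insert_index_demo() {
    awk -F';' '
        # First file on the command line: the word list, one "word;" per line.
        FILENAME == ARGV[1] { if ($1 != "") entry[$1] = $1; next }

        # From here on we are reading standard input (the "-" argument).
        # Remember the environment named by the most recent \layout command.
        /^\\layout / { layout = substr($0, 9); print; next }

        # Only touch text while we are in the Standard environment.
        layout == "Standard" {
            for (w in entry)
                gsub(" " w " ", " " w " [INDEX:" entry[w] "] ")
        }
        { print }
    ' "$1" -
}
```

As in the real script, each word is used as a regular expression, so the restrictions on metacharacters apply to this sketch too.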
    

Some tips regarding the (necessary) manual editing of the words file, the file output by awkscr_create_index_items above:

  • Leave the semicolons at the end of each line untouched! They are needed as field separators in the awk scripts.

  • You will see a lot of words (or their inflected forms) that are not useful. It is one thing to have a lot of words and another to have a set of really useful words and phrases. That's the price we pay for the simplicity of our method.

  • You may need to supply some extra terms you feel are missing from that file. Feel free to do this: awkscr_insert_index_items does not know how you created the indexitems file you give it.

  • Keep backups of your word lists from subsequent runs of the scripts. Combine word lists from other projects. No matter how long your word list is, only the terms that really appear somewhere will make it to the Index, so don't worry if your list is too long - given enough computing time, that is.

  • Take care to delete everything in your word list that looks like a regular expression with metacharacters - because it will be interpreted as such, with unpredictable results (unless you really know what you are doing). I once had “.*” on one line and I forgot to delete it. I then wondered why my document was full of index entries to “.*” while the text was almost gone! See regular expressions for a brief introduction to regular expressions.

  • Take out any “:”, “;”, “?” from the end of the words, as well as enclosing double quotes. Those characters are already taken care of when it comes to inserting the entries, i.e. the indexitems file should contain only the “pure” words, without any punctuation signs.

  • Don't leave in “config” if your LyX file contains “config.php”. If you do, the latter will look ugly in the LyX editor, as it will contain an index entry for “config” just in the middle of it. This will not affect the rendered formats, however.

  • Don't leave in words that might form parts of a LyX command. I once left “Enumerate” in my word list. The resulting LyX file contained an index entry for “Enumerate” in front of every item in every enumeration list! Clearly, the awk script awkscr_insert_index_items “sees” the LyX commands in the file that are invisible to you. This bug has been fixed in the current version of the script, but there may be others lurking around.
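
The tip about metacharacters can be partly automated. The following is a minimal sketch, not part of the package: the function name split_wordlist is hypothetical, and the set of characters it looks for (the ones named above plus brackets, parentheses, braces and the vertical bar) is my assumption of what “many others” should cover. It separates the lines that are safe to feed to awkscr_insert_index_items from those that you will have to index by hand.

```shell
# Hypothetical helper: split a word list into "safe" lines and lines that
# contain regular-expression metacharacters (to be indexed manually).
# Usage: split_wordlist WORDS SAFE_OUT UNSAFE_OUT
split_wordlist() {
    # Bracket expression: "]" comes first, everything else is literal.
    meta='[][*+?$&^\(){}|]'
    grep -v -e "$meta" "$1" > "$2" || true   # lines without metacharacters
    grep -e "$meta" "$1" > "$3" || true      # lines that need manual care
}
```

For example, split_wordlist words words-safe words-manual leaves the semicolons at the end of each line untouched, since “;” is not in the set.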

Finally, there are a few known limitations of the collateindex script that creates the index (see Automatic Indexing with the DocBook DSSSL Stylesheets):

  • Duplicate page numbers are not suppressed in the index. If the document contains three indexing hits on page 4, the generated index will contain 4, 4, 4.

  • Ranges are not automatically constructed. If the document contains indexing hits on pages 4, 5, 6, and 7, the generated index will contain 4, 5, 6, 7 instead of 4-7.
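
To make the two limitations concrete, here is a sketch of the missing logic. It is not part of collateindex and is not wired into lyxtox; the function name collapse_pages is hypothetical, and it assumes that each input line holds nothing but an ascending, comma-separated list of page numbers.

```shell
# Hypothetical illustration, not part of collateindex.
# Reads ascending, comma-separated page numbers (one list per line) and
# prints them with duplicates dropped and consecutive pages as ranges.
collapse_pages() {
    awk -F'[, ]+' '
        # Append the finished run start..prev to the output string.
        function flush_run() {
            if (prev == "") return
            run = (start == prev) ? start : start "-" prev
            out = (out == "") ? run : out ", " run
        }
        {
            out = ""; start = ""; prev = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "") continue
                p = $i + 0
                if (prev != "" && p == prev) continue                    # duplicate page
                if (prev != "" && p == prev + 1) { prev = p; continue }  # extend the run
                flush_run()
                start = p; prev = p
            }
            flush_run()
            print out
        }'
}
```

With this, echo "4, 4, 4" | collapse_pages prints 4 and echo "4, 5, 6, 7" | collapse_pages prints 4-7.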


5.21. The final step: invoking lyxtox

If you have followed all the above, your working directory should now contain:

awkscr_create_index_items
awkscr_insert_index_items
awkscr_math
awkscr_refdb_html
awkscr_refdb_print
ck-style.css
images
jadetex.cfg
lyxtox
lyxtox-html.dsl
lyxtox-onehtml.dsl
lyxtox-print.dsl
lyxtox-print-howto.dsl
lyxtox-print-pdf.dsl
lyxtox-print-ps.dsl
lyxtox-print-rtf.dsl
lyxtox-print-txt.dsl
sedscr
sedscr_abi
sedscr_app
sedscr_bib
sedscr_delete_index_items
sedscr_list_index_items
sedscr_math
sedscr_ris
sedscr_top
sedscr_val

Now, to create all the other formats from your LyX source, you just call lyxtox with one argument: the name of your .lyx file without the .lyx ending:

lyxtox myTemplate

It's time for a cup of coffee now. Relax while your computer is busy creating a whole bunch of nicely formatted documents.

For a detailed explanation of what is going on behind the scenes, see Chapter 7.

Now, let's see a more realistic example: the creation of the GNU/Linux Command-Line Tools Summary HowTo. This HowTo is an attempt to provide a comprehensive summary of useful command-line tools available to a GNU/Linux based operating system, i.e. commands needed by the majority of users. The document is authored by Gareth Anderson and I assist him in the document conversion process. So, how do I go about such a project?

Here's how[7]:

My home directory is /home/chris. This is not my working directory for a lyxtox project. A working directory is, say, myLinuxCommands, beneath my home directory. That is, I create a working directory specific to a project. In our case, let's say this is /home/chris/myLinuxCommands.

Now, I change to the working directory and make sure I have all files mentioned above. These are at least

awkscr_create_index_items
awkscr_insert_index_items
awkscr_math
awkscr_refdb_html
awkscr_refdb_print
ck-style.css
images
jadetex.cfg
lyxtox
lyxtox-html.dsl
lyxtox-onehtml.dsl
lyxtox-print.dsl
lyxtox-print-howto.dsl
lyxtox-print-pdf.dsl
lyxtox-print-ps.dsl
lyxtox-print-rtf.dsl
lyxtox-print-txt.dsl
sedscr
sedscr_abi
sedscr_app
sedscr_bib
sedscr_delete_index_items
sedscr_list_index_items
sedscr_math
sedscr_ris
sedscr_top
sedscr_val
sedscr_ima
sedscr_apa

and may well have forgotten others (please tell me if I have). I copy these files from yet another directory in my home, where I have extracted a pristine copy of mySGML.tar.gz.

So now I have created the files

/home/chris/myLinuxCommands/awkscr_create_index_items
/home/chris/myLinuxCommands/awkscr_insert_index_items
/home/chris/myLinuxCommands/awkscr_math
/home/chris/myLinuxCommands/awkscr_refdb_html
/home/chris/myLinuxCommands/awkscr_refdb_print
/home/chris/myLinuxCommands/ck-style.css
/home/chris/myLinuxCommands/images
/home/chris/myLinuxCommands/jadetex.cfg
/home/chris/myLinuxCommands/lyxtox
/home/chris/myLinuxCommands/lyxtox-html.dsl
/home/chris/myLinuxCommands/lyxtox-onehtml.dsl
/home/chris/myLinuxCommands/lyxtox-print.dsl
/home/chris/myLinuxCommands/lyxtox-print-howto.dsl
/home/chris/myLinuxCommands/lyxtox-print-pdf.dsl
/home/chris/myLinuxCommands/lyxtox-print-ps.dsl
/home/chris/myLinuxCommands/lyxtox-print-rtf.dsl
/home/chris/myLinuxCommands/lyxtox-print-txt.dsl
/home/chris/myLinuxCommands/sedscr
/home/chris/myLinuxCommands/sedscr_abi
/home/chris/myLinuxCommands/sedscr_app
/home/chris/myLinuxCommands/sedscr_bib
/home/chris/myLinuxCommands/sedscr_delete_index_items
/home/chris/myLinuxCommands/sedscr_list_index_items
/home/chris/myLinuxCommands/sedscr_math
/home/chris/myLinuxCommands/sedscr_ris
/home/chris/myLinuxCommands/sedscr_top
/home/chris/myLinuxCommands/sedscr_val
/home/chris/myLinuxCommands/sedscr_ima
/home/chris/myLinuxCommands/sedscr_apa

as exact copies of the original ones (which are, as I said, in some other directory).

Now, in the /home/chris/myLinuxCommands directory, I create (or copy) the LyX file that I want to process. In our case, this is the gnu-linux-tools-summary.lyx file, together with bibliography.lyx and appendix.lyx. In other cases, the bibliography.lyx and appendix.lyx are not needed, so they are not there.

I change to the /home/chris/myLinuxCommands directory, which, as I said, is the working directory for the specific project in our example. Since my LyX file is gnu-linux-tools-summary.lyx, I call lyxtox with the basename of gnu-linux-tools-summary.lyx, i.e. with "gnu-linux-tools-summary", as its first and only parameter:

lyxtox gnu-linux-tools-summary

The name of the parameter is $1 in the lyxtox script.

(Of course, I have taken care to adjust all parameters at the start of lyxtox to my situation, as described in Chapter 4).

If all goes well, I get another directory inside my working directory, named exactly as the parameter of the script, in our case gnu-linux-tools-summary. Everything I need is there: the one, big HTML file, the many HTML files ("chunks"), the PDF, PS.GZ, TXT files, the images directory, the math directory...simply everything. I can just tar the whole directory, upload it to my web site, extract it and I will have the whole project there!

Indeed, it is here: http://www.karakas-online.de/gnu-linux-tools-summary/.

Thus http://www.karakas-online.de/gnu-linux-tools-summary is an exact copy of my local /home/chris/myLinuxCommands/gnu-linux-tools-summary, as created by the lyxtox invocation

lyxtox gnu-linux-tools-summary

Note

Notice that you are supposed to have an "images" directory inside your working directory. Mine is /home/chris/myLinuxCommands/images. When you run lyxtox with "gnu-linux-tools-summary" as parameter, you get another one: /home/chris/myLinuxCommands/gnu-linux-tools-summary/images. This is perfectly OK: it is a copy of the original images directory. It is done this way so that you only need to copy the gnu-linux-tools-summary directory to your website to be up and running, without needing anything else.
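
Before calling lyxtox, it can save some time to verify that the working directory really contains everything listed above. The following is only a sketch: check_workdir is a hypothetical helper that is not part of the package, and its file list is simply the longer of the two lists given in this section (which, as said, may itself be incomplete).

```shell
# Hypothetical pre-flight check; not part of the mySGML package.
# Prints every expected file that is missing from the working directory
# (default: the current directory) and returns non-zero if any is missing.
check_workdir() {
    dir=${1:-.}
    status=0
    for f in awkscr_create_index_items awkscr_insert_index_items awkscr_math \
             awkscr_refdb_html awkscr_refdb_print ck-style.css images \
             jadetex.cfg lyxtox lyxtox-html.dsl lyxtox-onehtml.dsl \
             lyxtox-print.dsl lyxtox-print-howto.dsl lyxtox-print-pdf.dsl \
             lyxtox-print-ps.dsl lyxtox-print-rtf.dsl lyxtox-print-txt.dsl \
             sedscr sedscr_abi sedscr_app sedscr_bib sedscr_delete_index_items \
             sedscr_list_index_items sedscr_math sedscr_ris sedscr_top \
             sedscr_val sedscr_ima sedscr_apa
    do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return $status
}
```

In the example of this section, one would run check_workdir /home/chris/myLinuxCommands && lyxtox gnu-linux-tools-summary from inside the working directory.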


Chapter 6. Errors and warnings

There will be some errors and warnings. However, many of them are not crucial (or are even unavoidable) and can be safely ignored. Some day, in a perfect world, where all involved programs and languages will work perfectly with each other, these errors will disappear. Until then, you must learn to live with them.

Here are some examples of possible errors:


6.1. LyX errors

text class not available:

If you get:

The document uses an unknown textclass "docbook-book". LyX wil not be able to format output correctly

do "Layout-->Document" on any LyX document you have and then, in the window that will appear, check the classes available under "Class". You should see all the DocBook classes listed there, especially the "DocBook book (SGML)" class I use. You should see something like:

+checking for docbook class docbook-algo... yes 
+checking for docbook class docbook-book... yes 
+checking for docbook class docbook-chapter... yes 
+checking for docbook class docbook... yes 
+checking for docbook class docbook-section... yes 

If not, then somehow the reconfigure command above (see Section 4.1) did not find all your classes.


6.2. Openjade errors

DTDDECL catalog entries are not supported:

Jade does not support the DTDDECL catalog directive and it complains loudly if it encounters one. You may safely ignore this warning. See here for more details.

xref to ANCHOR unsupported:

This seems to be a Jade/DocBook problem and not a LyX one. LyX is capable of inserting cross-references to arbitrary positions in a text. For this purpose, it creates an anchor with the id attribute appropriately set in the SGML file:

<anchor id="homepage-fig">

The cross-reference itself is placed with the xref element:

<xref linkend="homepage-fig">

The error means that this mechanism of cross-referencing is unsupported. This is why I change

<section><anchor id="homepage-fig">

to

<section id="homepage-fig">

with sed (see Section 4.10). This way, at least cross-references to sections created by LyX will work in SGML.
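
The substitution just described can be tried on its own with a one-line sed filter. This is a sketch under the assumption that the anchor immediately follows the section start tag on the same line; the actual rule in sedscr may be written differently, and the function name fold_section_anchor is made up.

```shell
# Hypothetical stand-alone filter; the real rule lives in sedscr.
# Turns <section><anchor id="X"> into <section id="X">.
fold_section_anchor() {
    sed 's/<section><anchor id="\([^"]*\)">/<section id="\1">/g'
}
```

It reads SGML on standard input and writes the result to standard output, e.g. fold_section_anchor < in.sgml > out.sgml (both file names are placeholders).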

value of attribute "LINKEND" must be a single token:

The label you used for a section contains spaces. Change spaces to, say, underscores or dashes.

value of attribute "ID" must be a single token:

You used a cross-reference to a label that contains spaces. Do not change the cross-reference. Change the label: change spaces to, say, underscores or dashes.

end tag for "SECT3" which is not finished:

You probably have an empty subsubsection, e.g. you just outlined your document and some subsubsections have a title but no content (yet). I consider this warning to be a bug. Or is it a feature???

end tag for element "LISTITEM" which is not open:

This may be a bug in LyX's SGML generation. It happens when, in an itemize environment, you use a higher depth, possibly with nested item lists and subsequent paragraphs that you want to be on the same indentation level as the outer item. Not critical.

"keep-with-next:" is not a valid keyword in a make expression for flow object class "paragraph":

Didn't track it down, but seems harmless.

value of attribute "FORMAT" cannot be "PDF":

Openjade tells us:

must be one of "BMP", "CGM-CHAR", "CGM-BINARY", "CGM-CLEAR", "DITROFF", "DVI", "EPS", "EQN", "FAX", "GIF", "GIF87A", "GIF89A", "JPG", "JPEG", "IGES", "PCX", "PIC", "PNG", "PS", "SGML", "TBL", "TEX", "TIFF", "WMF", "WPG", "LINESPECIFIC"

This means that the list of accepted formats does not contain “PDF”. Such a list appears in the following files:

  • /usr/share/sgml/db3xml/dbnotnx.mod

  • /usr/share/sgml/docbk30/docbook.dtd

  • /usr/share/sgml/docbook_3/dbnotn.mod

  • /usr/share/sgml/sdb/sdocbook.dtd

  • /usr/share/sgml/db41xml/dbnotnx.mod

  • /usr/share/sgml/docbook_4/dbnotn.mod

Given that LyX produces a SGML file containing

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN"

as the first line, meaning that it uses DocBook V4.1 (1.1.x versions still used DocBook V3.0), we only need to insert "PDF" in the /usr/share/sgml/docbook_4/dbnotn.mod file:

<!ENTITY % notation.class "BMP| CGM-CHAR | CGM-BINARY | CGM-CLEAR | DITROFF | DVI | EPS | EQN | FAX | GIF | GIF87a | GIF89a | JPG | JPEG | IGES | PCX | PDF | PIC | PNG | PS | SGML | TBL | TEX | TIFF | WMF | WPG | linespecific %local.notation.class;"> 

If you still use LyX v.1.1.x, you should change /usr/share/sgml/docbook_3/dbnotn.mod to include “PDF” in the list of accepted file extensions:

<!ENTITY % notation.class "BMP| CGM-CHAR | CGM-BINARY | CGM-CLEAR | DITROFF | DVI | EPS | EQN | FAX | GIF | GIF87a | GIF89a | JPG | JPEG | IGES | PCX | PIC | PS | SGML | TBL | TEX | TIFF | WMF | WPG | PDF | PNG | linespecific %local.notation.class;">

Also, in the dbparam.dsl file for the print formats (located in /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/dbparam.dsl on my system), you would need to add “pdf” to the list of allowed graphic extensions:

(define %graphic-extensions%
  ;; REFENTRY graphic-extensions
  ;; PURP List of graphic filename extensions
  ;; DESC
  ;; The list of extensions which may appear on a 'fileref'
  ;; on a 'Graphic' which are indicative of graphic formats.
  ;;
  ;; Filenames that end in one of these extensions will not have
  ;; the '%graphic-default-extension%' added to them.
  ;; /DESC
  ;; AUTHOR N/A
  ;; /REFENTRY
  '("eps" "epsf" "gif" "tif" "tiff" "jpg" "jpeg" "png" "pdf" "tex"))

However, this is not necessary, since we use the technique of “customization layers” (see Section 7.1.5) and define %graphic-extensions% in lyxtox-print.dsl as shown above. Due to the nature of the DSSSL language (which bears similarities to Scheme), our change takes precedence over the definition in the standard files and we don't need to change standard code.

Reference `4' on page 1 undefined on input line 180:

Some cross-reference you use is probably misspelled. But this is somewhat difficult to achieve with LyX, since LyX provides you with a list of all the labels currently available for cross-referencing. The other occasion where you will see this error is in the first (or even second?) invocation of LaTeX for a particular format. You need at least 3 LaTeX passes for the table of contents and the cross-references to be worked out, and if you are currently on the first pass, you will see this error for every cross-reference.

LaTeX_Font_Warning: Some font shapes were not available, defaults substituted:

If you installed the Computer Modern fonts, you probably don't need to worry about this error. If you check the fonts used in the PDF file (File-->Document Info-->Fonts-->List all fonts), you will probably find out that only some seldom used characters were not rendered with CM fonts. That's O.K. for me.

Overfull \hbox (30.17416pt too wide) in paragraph at lines 5425--5425:

You will get dozens of these. It is a typical LaTeX warning informing you that a line got some points too wide (mostly because there was some word that LaTeX could not hyphenate). Read the LaTeX documentation for this (but only if you want to produce a really perfect PDF document).

/usr/share/sgml/stylesheets/sgmltools/print.dsl:53:6:E: 3rd argument for primitive "string-append" of wrong type: "#f" not a string:

This is probably because %graphic-default-extension% is set to “false” (“#f”), while in lyxtox-print.dsl we try to concatenate the %admon-graphics-path%, the name of the admonition and %graphic-default-extension% to a full name of the admonition graphic. It is harmless, due to the way we use the graphics (see Section 7.2.2).

Warning: Version of `thumbpdf.tex' does not match with perl script!:

thumbpdf complains even if the version is a newer one:

*** make `thumbpdf.pdf' / run pdfTeX ***
!!! Warning: Version of `thumbpdf.tex' does not match with perl script!
Current `thumbpdf.tex': 2002/05/26 v3.2
Please install version: 2001/01/12 v2.8

But if you have the newest version, as in the above case, there is no need to worry about this error.

end tag for "SECT1" which is not finished:

You may see this openjade error also for “SECT2”, “SECT3” and so forth. This may come up if you have written just the title of the section, subsection or subsubsection respectively, but you did not enter any text there. Just enter something, even a single word like “FIXME” (author's note to himself: this is a literal FIXME, not a meta-FIXME!), to remind you of the missing text, and the error should go away.

document type does not allow element "SECT2" here:

This is a similar error to the previous one. You most probably created a subsection which is not contained in a section, but is directly contained in a chapter. Similar errors will occur if you omit environment levels that are “in between” the current one and its parent.

length of name token must not exceed NAMELEN (44):

You have an ID on the line where this error comes up - and this ID is too long, longer than NAMELEN, which is 44 in my case. Use a shorter ID. What in SGML is the ID of a chapter, section, figure or table is a label in LyX. Thus you should check your labels and make sure they are not longer than 44 characters.

Alternatively, change the value of NAMELEN in the file pointed to by the SGMLDECL directive in your catalog files: The lyxtox script uses, among others, the following catalog: /usr/share/sgml/CATALOG.docbook-dsssl-stylesheets. You can see this in the line:

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"

In this catalog, I have changed the SGMLDECL line from:

-- SGMLDECL "dtds/decls/docbook.dcl" --

to

SGMLDECL "/usr/share/sgml/docbook/docbook-dsssl-stylesheets/dtds/decls/docbook.dcl"

to reflect the correct path to the DocBook declaration file in my system. You must thus change the value of the NAMELEN variable in the file where the SGMLDECL points to, i.e.[8] docbook/docbook-dsssl-stylesheets/dtds/decls/docbook.dcl. There, change

        QUANTITY SGMLREF
                ATTCNT    256
                GRPCNT    253
                GRPGTCNT  253
                LITLEN   8092
                NAMELEN    44
                TAGLVL    100

to, for example:

        QUANTITY SGMLREF
                ATTCNT    256
                GRPCNT    253
                GRPGTCNT  253
                LITLEN   8092
                NAMELEN    64
                TAGLVL    100

See the error 'character "_" is not allowed in the value of attribute "ID"' below for more changes you might need to make in the DocBook declaration file.

character ":" is not allowed in the value of attribute "ID":

You have a LyX label that contains “:”. Delete the “:”, as the label of a chapter, section, figure, table etc. is going to be the ID of that element and “:” is not allowed in IDs. However, this is only the short answer. For an in-depth explanation, see the next item.

character "_" is not allowed in the value of attribute "ID":

You have a LyX label that contains “_”. Delete the “_” , as the label of a chapter, section, figure, table etc. is going to be the ID of that element and “_” is not allowed in IDs. However, as said in the previous item, this is only the tip of the iceberg. A tip of the hat to Tony Graham for the following detailed information in Allowed characters in element id's!

The characters allowed in an ID are those allowed in an SGML "NAME". The characters that are allowed to appear in “names” in SGML (the id attribute is defined in SGML as such a “NAME”) are set in what is known as the “SGML declaration” of your document. By default, the first character must be a letter, and any other characters may be a letter or a digit.

You can add to this by specifying the additional characters in your SGML Declaration (and you can't take any characters away). The convention in widest use is that of the "Reference concrete syntax" included in the SGML standard itself that adds "." and "-" as "name" characters (but not as "name start" characters). This is what's used in the DocBook SGML Declaration, docbook.dcl.

The specific portion of the SGML Declaration of interest here is the "naming rules". Jade's default inferred SGML Declaration uses the same naming rules as SGML's "Reference Concrete Syntax". To allow underscores in entity names (and other SGML names), you need to supply an SGML Declaration that includes the underscore character. Using the DocBook SGML Declaration as an example, you need to add "_" to the LCNMSTRT and UCNMSTRT parameters:

        NAMING
                LCNMSTRT ""
                UCNMSTRT ""
                LCNMCHAR ".-"
                UCNMCHAR ".-"
                NAMECASE
                        GENERAL YES
                        ENTITY  NO

The NAMING portion specifies both uppercase and lowercase forms of the additional "name start" and "name" characters (since names are folded to uppercase when the "GENERAL" parameter has the value "YES"). You need to add it in two places because you are declaring the uppercase and lowercase forms, which just happen to be the same.

You can reference your SGML Declaration by including it in the Jade command line before the filename for your SGML file (or before your DTD if also including the DTD filename in the command line). You can also reference an SGML Declaration to infer by using the "SGMLDECL" keyword in your catalog file. (See "charset.htm" from the nsgmls distribution for more information on the catalog format. FIXME: URL!)

Now, what does all that mean for our specific situation?

The lyxtox script uses, among others, the following catalog: /usr/share/sgml/CATALOG.docbook-dsssl-stylesheets. You can see this in the line:

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"

In this catalog, change the SGMLDECL line from:

-- SGMLDECL "dtds/decls/docbook.dcl" --

to

SGMLDECL "/usr/share/sgml/docbook/docbook-dsssl-stylesheets/dtds/decls/docbook.dcl"

Note that we have taken away the comments and corrected the path to the declaration.

But that's NOT enough!

You must also change the file where the SGMLDECL points to, i.e.[9] docbook/docbook-dsssl-stylesheets/dtds/decls/docbook.dcl. There, change

        NAMING
                LCNMSTRT ""
                UCNMSTRT ""
                LCNMCHAR ".-"
                UCNMCHAR ".-"
                NAMECASE
                        GENERAL YES
                        ENTITY  NO

to:

        NAMING
                LCNMSTRT ""
                UCNMSTRT ""
                LCNMCHAR ".-_"
                UCNMCHAR ".-_"
                NAMECASE
                        GENERAL YES
                        ENTITY  NO

i.e. add the underscore to LCNMCHAR and UCNMCHAR.
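
If you prefer not to edit the declaration by hand, the same change can be sketched as a sed filter. The function name allow_underscore is hypothetical, and the filter assumes that the two lines look exactly like the ones shown above (the value ".-" in double quotes); run it on a copy of docbook.dcl first and compare the result.

```shell
# Hypothetical filter: adds "_" to LCNMCHAR and UCNMCHAR in an SGML
# declaration read from standard input. All other lines pass unchanged.
allow_underscore() {
    sed 's/^\([[:space:]]*[LU]CNMCHAR[[:space:]]*\)"\.-"/\1".-_"/'
}
```

A possible invocation is allow_underscore < docbook.dcl > docbook.dcl.new, followed by a diff of the two files before replacing the original.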

This solves the problem at its root! See also character "_" not allowed in value of attribute ID.
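
If you would rather leave the SGML declaration alone, you can check your labels against the default rules before running lyxtox. The following sketch assumes the rules as quoted above (first character a letter, then letters, digits, "." or "-") and a NAMELEN of 44; check_label is a hypothetical helper, and extracting the labels from the LyX file is left out.

```shell
# Hypothetical checker for a single label, assuming the default naming
# rules: a letter first, then letters, digits, "." or "-", and at most
# NAMELEN characters (second argument, 44 if omitted).
check_label() {
    label=$1
    namelen=${2:-44}
    if [ "${#label}" -gt "$namelen" ]; then
        echo "too long (${#label} > $namelen): $label"
        return 1
    fi
    if ! printf '%s\n' "$label" | grep -Eq '^[A-Za-z][A-Za-z0-9.-]*$'; then
        echo "invalid characters: $label"
        return 1
    fi
    return 0
}
```

For instance, check_label homepage-fig succeeds silently, while check_label "home page" and check_label my_label both report invalid characters.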

general entity "d_op" not defined and no default entity:

When you enter an URL in LyX, you are asked to enter the URL and the Name of the link in a window that pops up. Whatever you enter in the URL field will be automatically taken care of by LyX: if it contains special characters, like ampersands, they are replaced with their SGML equivalent (see SGML entities). But whatever you enter in the Name field will be passed “as is” to SGML. If you feel lazy and just copy the URL into the Name field, you must take care to replace special characters with their SGML entities yourself. Dynamic URLs are a classic example and are going to flood you with this error if you do not do your homework. For example, if you feel you have to write something like

http://www.karakas-online.de/forum/viewtopic.php?t=14&f=3

instead of

Welcome to the <productname>Linux</productname> Forum of Chris

then you should write it as

http://www.karakas-online.de/forum/viewtopic.php?t=14&amp;f=3

i.e. replace “&” with its SGML entity “&amp;”.
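
For longer strings, the replacement can be done mechanically before pasting the text into the Name field. Below is a minimal sketch of such a filter; sgml_escape is a made-up name, and it also handles “<” and “>”, which cause the related errors described further down.

```shell
# Hypothetical helper: escape the characters that SGML treats as markup.
# The ampersand must be handled first, otherwise the "&" of the entities
# produced for "<" and ">" would be escaped a second time.
sgml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

Feeding it the URL above, e.g. with echo 'http://www.karakas-online.de/forum/viewtopic.php?t=14&f=3' | sgml_escape, yields the escaped form shown.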

element "BODY" undefined:

BODY? Which body? You don't remember having entered anything like this in your text? Well, check for “<body>” somewhere in the text (not the code environments, as those have been taken care of already by LyX). You have to replace the “<” and “>” with their SGML entities “&lt;” and “&gt;” respectively. When you correct it, the other two errors

end tag for "BODY" omitted, but OMITTAG NO was specified
start tag was here

will disappear too, since they are just follow-ups of the first one.

character data is not allowed here:

This simply means that you did not replace some special character with its SGML entity. Have a look at the tables in SGML entities, try to guess which character is meant in the error (the SGML lines produced by LyX can become longer than Proust's), and replace it.

But the reason for this error may also lie in the fact that you forgot to change the LyX environment (see Section 5.1) from SGML to Standard and the text contains some special characters, like dashes. In this case, the text is interpreted in the SGML context and the special characters give rise to misunderstandings and errors in the parser. Just change the environment from SGML back to Standard (or whatever else it should be) and the error will disappear.

document type does not allow element "MEDIAOBJECT" here:

This error happens because sed (more precisely, the sed commands in sedscr) did not manage to change the SGML that LyX produced for a figure. The reason sedscr failed lies probably in some strange arrangement of the starting SGML tags for the figure. The best cure is to enter a “forced new line” by pressing Ctrl+Enter at the end of the line that precedes the offending figure. This will force the whole SGML code to start on a new line and will make identification by sedscr easier. The next time you run lyxtox, the correct SGML commands for the figure will be generated and the error will disappear.

an attribute value must be a literal unless it contains only name characters:

How on earth could this happen? Well, if you are like me and always try new possibilities with LyX and SGML, then you will encounter this error if you enter a URL while in the SGML environment. There, of course, you cannot just use LyX's menu Insert->URL; you have to write the SGML code for a URL yourself: <ulink url="http://www.somesite.com">. Now, if you forget to enclose the URL (the “attribute value”, as the error says, of the “url” attribute in the ulink element) in double quotes, then you will get that error.
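
If the exported SGML file is large, hunting for the missing quotes by eye is tedious. The following is a rough sketch in Python of how one might search for them; the pattern and the function name find_unquoted_attributes are my own assumptions, not something lyxtox provides. It is a heuristic only: it reports at most one unquoted attribute per start tag and can be fooled by quoted values that themselves contain a space followed by name=value.

```python
import re

# Hypothetical heuristic: inside one start tag, find name=value where
# the value does not begin with a single or double quote.
UNQUOTED_ATTR = re.compile(r"""<\w+[^<>]*?\s(\w+)=(?!["'])([^\s>]+)""")

def find_unquoted_attributes(sgml):
    """Return (attribute name, raw value) pairs that lack quotes."""
    return [(m.group(1), m.group(2)) for m in UNQUOTED_ATTR.finditer(sgml)]
```

For the ulink example above, written without the double quotes, the function returns the pair ("url", "http://www.somesite.com"); with the quotes in place it returns an empty list.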

document type does not allow element <para> (after <entry> in a table):

You shouldn't get this error when you use the scripts described here, simply because the sedscr_tidy2 script takes care that it does not occur. However, if you do get it, it is nice to know why:

Due to the way the SGML parser works, the following piece of code describing an informal table will produce a “document type does not allow element <para>” error:

<informaltable>
<tgroup cols="2" colsep="1" rowsep="1">
<colspec colname="col0" align="center">
<colspec colname="col1" align="center">
<tbody>
<row>
<entry align="center" valign="top">
<para>
EAN
</para>
</entry>
<entry align="center" valign="top">
<para>
9783898112338
</para>
</entry>
</row>

The reason is the so-called “Pernicious Mixed Content Problem”. From the Definitive Guide to DocBook on the <entry> element:

The content model of the Entry element exhibits a nasty peculiarity that we call pernicious mixed content.[18]

Every other element in DocBook contains either block elements or inline elements (including #PCDATA) unambiguously. In these cases, the meaning of line breaks and spaces are well understood; they are insignificant between block elements and significant (to the SGML parser, anyway) where inline markup can occur.

Table entries are different; they can contain either block or inline elements, but not both at the same time. In other words, one Entry in a table might contain a paragraph or a list while another contains simply #PCDATA or another inline markup, but no single Entry can contain both.

Because the content model of an Entry allows both kinds of markup, each time the SGML parser encounters an Entry, it has to decide what variety of markup it contains. SGML parsers are forbidden to use more than a single token of lookahead to reach this decision. In practical terms, what this means is that a line feed or space after an Entry start tag causes the parser to decide that the cell contains inline markup. Subsequent discovery of a paragraph or another block element causes a parsing error.

All of these are legal:

<entry>3.1415927</entry>
<entry>General <emphasis>#PCDATA</emphasis></entry>
<entry><para>
A paragraph of text
</para></entry>

However, each of these is an error:

<entry>
Error, cannot have a line break before a block element
<para>
A paragraph of text.
</para></entry>

<entry><para>
A paragraph of text.
</para>               Error, cannot have a line break between block elements

<para>
A paragraph of text.
</para></entry>

<entry><para>
A paragraph of text.
</para>               Error, cannot have a line break after a block element

</entry>

Thus, the informal table example above must be corrected to:

<informaltable>
<tgroup cols="2" colsep="1" rowsep="1">
<colspec colname="col0" align="center">
<colspec colname="col1" align="center">
<tbody>
<row>
<entry align="center" valign="top"><para>
EAN
</para></entry>
<entry align="center" valign="top"><para>
9783898112338
</para></entry>
</row>

This is done in lyxtox with a call to runsed using sedscr_tidy2 as the sed script. See also Openjade error: <para> not allowed after <entry>.
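
To make the correction concrete, here is a minimal sketch in Python of the same idea. It is not the content of sedscr_tidy2 (which is a sed script); the function name tidy_entries is hypothetical, and the sketch only handles the two cases that occur in the example above: white space between an entry start tag and <para>, and white space between </para> and the entry end tag. White space between two block elements inside one entry is left alone.

```python
import re

def tidy_entries(sgml):
    """Remove line breaks that trigger the mixed content error in <entry>."""
    # Pull <para> up onto the line of the <entry ...> start tag.
    sgml = re.sub(r"(<entry\b[^>]*>)\s+(<para>)", r"\1\2", sgml)
    # Pull </entry> up onto the line of the </para> end tag.
    sgml = re.sub(r"(</para>)\s+(</entry>)", r"\1\2", sgml)
    return sgml
```

Applied to the first informal table above, this yields the entry lines in the corrected form, with <entry ...><para> and </para></entry> each on one line.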


6.3. TeX errors

The material in this section (up to Section 6.3.1) is taken from the TeX FAQ item on How to approach errors.

Since TeX is a macroprocessor, its error messages are often difficult to understand; this is a (seemingly invariant) property of macroprocessors. Knuth makes light of the problem in the TeXbook, suggesting that you acquire the sleuthing skills of a latter-day Sherlock Holmes; while this approach has a certain romantic charm to it, it's not good for the 'production' user of (La)TeX. The following (derived, in part, from an article by Sebastian Rahtz in TUGboat 16(4)) offers some general guidance in dealing with TeX error reports, and other answers in this section deal with common (but perplexing) errors that you may encounter. There's a long list of "hints" in Sebastian's article, including the following:

  • Look at TeX errors; those messages may seem cryptic at first, but they often contain a straightforward clue to the problem. See Section 6.3.1 for further details.

  • Read the .log file; it contains hints to things you may not understand, often things that have not even presented as error messages.

  • Be aware of the amount of context that TeX gives you. The error message gives you some bits of TeX code (or of the document itself) that show where the error "actually happened"; it's possible to control how much of this 'context' TeX actually gives you. LaTeX (nowadays) instructs TeX only to give you one line of context, but you may tell it otherwise by saying

    \setcounter{errorcontextlines}{999}
    

    in the preamble of your document. (If you're not a confident macro programmer, don't be ashamed of cutting that 999 down a bit; some errors will go on and on, and spotting the differences between those lines can be a significant challenge.)

  • As a last resort, tracing can be a useful tool; reading a full (La)TeX trace takes a strong constitution, but once you know how, the trace can lead you quickly to the source of a problem. You need to have read the TeXbook (see books about TeX) in some detail, fully to understand the trace.

    The command

    \tracingall 
    

    sets up maximum tracing. It also sets the output to come to the interactive terminal, which is somewhat of a mixed blessing (since the output tends to be so vast - all but the simplest traces are best examined in a text editor after the event).

    The LaTeX trace package (first distributed with the 2001 release of LaTeX) provides more manageable tracing. Its \traceon command gives you what \tracingall offers, but suppresses tracing around some of the truly verbose parts of LaTeX itself. The package also provides a \traceoff command (there's no "off" command for \tracingall), and a package option (logonly) allows you to suppress output to the terminal.

The best advice to those faced with TeX errors is not to panic: most of the common errors are plain to the eye when you go back to the source line that TeX tells you of. If that approach doesn't work, the remaining answers in this section deal with some of the most common error messages you may encounter using LyX and a TeX system provided by a common Linux distribution as your starting point. For more on TeX errors, consult the “Joy of TeX errors” section of the TeX FAQ. You should not ordinarily need to appeal to the wider public for assistance (see TeX mailing lists), but if you do, be sure to report full backtraces (see errorcontextlines above) and so on.


6.3.1. The structure of TeX errors

The material in this section is taken from the structure of TeX errors.

TeX's error messages are reminiscent of the time when TeX itself was conceived (the 1970s): they're not terribly user-friendly, though they do contain all the information that TeX can offer, usually in a pretty concise way.

TeX's error reports all have the same structure:

  • An error message

  • Some 'context'

  • An error prompt

The error message will relate to the TeX condition that is causing a problem. Sadly, in the case of complex macro packages such as LaTeX, the underlying TeX problem may be superficially difficult to relate to the actual problem in the "higher-level" macros. Many LaTeX-detected problems manifest themselves as 'generic' errors, with error text provided by LaTeX itself (or by a LaTeX class or package).

The context of the error is a stylised representation of what TeX was doing at the point that it detected the error. As noted in approaching errors, a macro package can tell TeX how much context to display, and the user may need to undo what the package has done. Each line of context is split at the point of the error; if the error actually occurred in a macro called from the present line, the break is at the point of the call. (If the called object is defined with arguments, the "point of call" is after all the arguments have been scanned.) For example:

\blah and so on

produces the error report

! Undefined control sequence.
l.4 \blah
          and so on

while:

\newcommand{\blah}[1]{\bleah #1}
\blah{to you}, folks

produces the error report

! Undefined control sequence.
\blah #1->\bleah
                 #1
l.5 \blah{to you}
                 , folks

If the argument itself is in error, we will see things such as

\newcommand{\blah}[1]{#1 to you}
\blah{\bleah}, folks

producing

! Undefined control sequence.
<argument> \bleah
l.5 \blah{\bleah}
                 , folks

The prompt accepts single-character commands: the list of what's available may be had by typing ?. One immediately valuable command is h, which gives you an expansion of TeX's original précis message, sometimes accompanied by a hint on what to do to work round the problem in the short term. If you simply type 'return' (or whatever else your system uses to signal the end of a line) at the prompt, TeX will attempt to carry on (often with rather little success).
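
Because every error report has the same shape (a line starting with “!”, some context, and a line starting with “l.” followed by the source line number), the reports can be pulled out of a long .log file mechanically. The following Python sketch illustrates this under simple assumptions: the function name extract_errors is hypothetical, the text left of the break on the “l.” line is kept as context, and an error message that is not followed by an “l.” line is silently dropped.

```python
import re

# The line of context that names the source line, e.g. "l.5 \blah{to you}".
LINE_REF = re.compile(r"^l\.(\d+) ?(.*)$")

def extract_errors(log_text):
    """Return (message, source line number, text before the break) triples."""
    errors = []
    message = None
    for line in log_text.splitlines():
        if line.startswith("! "):
            # Start of a new error report; remember its message.
            message = line[2:]
        elif message is not None:
            m = LINE_REF.match(line)
            if m:
                errors.append((message, int(m.group(1)), m.group(2)))
                message = None
    return errors
```

For the second example report above, this returns one triple: the message “Undefined control sequence.”, the line number 5 and the text \blah{to you}.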


6.3.2. LaTeX errors

LaTeX output can be very verbose. Just for your reference, here is a (rather long) example of output (taken from the .log file, which is created anew by every program in the process that runs TeX in the background)[10]:

This is TeX, Version 3.14159 (Web2C 7.3.1) (format=jadetex 2001.10.25)  7 JAN 2004 07:26
**EN-Book.tex
(EN-Book.tex
JadeTeX 2001/07/19: 3.11
LaTeX Font Info:    Try loading font information for T1+ptm on input line 1.
(/usr/share/texmf/tex/latex/psnfss/t1ptm.fd
File: t1ptm.fd 2000/01/12 PSNFSS-v8.1 font definitions for T1/ptm. 
)
(jadetex.cfg (/usr/share/texmf/tex/generic/babel/babel.sty
Package: babel 2001/03/01 v3.7h The Babel package
(/usr/share/texmf/tex/generic/babel/english.ldf
Language: english 2001/02/07 v3.3k English support from the babel system
(/usr/share/texmf/tex/generic/babel/babel.def
File: babel.def 2001/03/01 v3.7h Babel common definitions
\babel@savecnt=\count114
\U@D=\dimen131
)
\l@canadian = a dialect from \language\l@english 
))
(/usr/share/texmf/tex/generic/thumbpdf/thumbpdf.sty
Package: thumbpdf 2001/04/02 v2.10 Inclusion of thumbnails (HO)
Package thumbpdf Warning: You need pdfTeX in PDF mode for driver `pdftex'. 
) (/usr/share/texmf/tex/latex/ae/ae.sty
Package: ae 1998/11/17 1.0 Almost European Computer Modern
(/usr/share/texmf/tex/latex/base/fontenc.sty
Package: fontenc 2000/08/30 v1.91 Standard LaTeX package
(/usr/share/texmf/tex/latex/base/t1enc.def
File: t1enc.def 2000/08/30 v1.91 Standard LaTeX file
LaTeX Font Info:    Redeclaring font encoding T1 on input line 38. 
)
LaTeX Font Info:    Try loading font information for T1+aer on input line 96.
(/usr/share/texmf/tex/latex/ae/t1aer.fd
File: t1aer.fd 1997/11/16 Font definitions for T1/aer.
)))
(/usr/share/texmf/tex/latex/ae/aecompl.sty 
Package: aecompl 1998/07/23 0.9 T1 Complements for AE fonts (D. Roegel) 
)
Package hyperref Info: Option `plainpages' set `false' on input line 128.
Package hyperref Info: Option `colorlinks' set `true' on input line 128.
Package hyperref Info: Option `bookmarksopen' set `true' on input line 128.
Package hyperref Info: Option `colorlinks' set `true' on input line 128.
Package hyperref Info: Option `breaklinks' set `true' on input line 128.
Package hyperref Warning: Option `pagebackref' has already been used,
(hyperref)                setting the option has no effect on input line 128.
)
Elements will be labelled
Jade begin document sequence at 19

As you can see from the last line, Jade just started its work.

It goes on with font checking as follows:

(EN-Book.aux)
\openout1 = `EN-Book.aux'.
LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for PD1/pdf/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for T2A/cmr/m/n on input line 19.
LaTeX Font Info:    Try loading font information for T2A+cmr on input line 19.
 (/usr/share/texmf/tex/latex/cyrillic/t2acmr.fd
File: t2acmr.fd 1999/01/07 v1.0 Computer Modern Cyrillic font definitions
)
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for TS1/cmr/m/n on input line 19.
LaTeX Font Info:    Try loading font information for TS1+cmr on input line 19.
(/usr/share/texmf/tex/latex/base/ts1cmr.fd
File: ts1cmr.fd 1999/05/25 v2.5h Standard LaTeX font definitions
)
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LECO/omseco/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LECX/omsecx/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LECY/omsecy/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LEGR/omsegr/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LEHA/omseha/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LEIP/omseip/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LELA/omsela/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
LaTeX Font Info:    Checking defaults for LETI/omseti/m/n on input line 19.
LaTeX Font Info:    ... okay on input line 19.
Package hyperref Info: Link coloring ON on input line 19.
(/usr/share/texmf/tex/latex/hyperref/nameref.sty
Package: nameref 2000/05/08 v2.18 Cross-referencing by name of section
\c@section@level=\count115
)
LaTeX Info: Redefining \ref on input line 19.
LaTeX Info: Redefining \pageref on input line 19.
LaTeX Font Info:    Try loading font information for T1+aess on input line 99.
(/usr/share/texmf/tex/latex/ae/t1aess.fd
File: t1aess.fd 1997/11/16 Font definitions for T1/aess.
) [1.0.37]
File: logo1.eps Graphic file (type eps)
 <logo1.eps>
File: logo2.eps Graphic file (type eps)
 <logo2.eps>
Overfull \hbox (2655.0092pt too wide) in alignment at lines 610--899

Somewhere from this point on, you will start seeing LaTeX processing the pages of your document one after the other - and outputting errors and warnings like the “Overfull \hbox” above.

There are some fairly common error messages and warnings. I'll cover those here using material from the chapter on “LyX and LaTeX errors” of the Extended Features manual for LyX, available from LyX's Help menu. You should look at a good LaTeX book for a complete listing.

  • “LaTeX Warning:”

    Anything beginning with these words is a warning message for the purpose of “debugging” the LaTeX code itself. You'll get messages like this if you added or changed cross-references or bibliography entries, in which case, LaTeX is trying to tell you that you need to make another run.

    You can by-and-large ignore these.

  • “LaTeX Font Warning:”

    Another warning message, this time about fonts which LaTeX couldn't find. The rest of the message will often say something about a replacement font that LaTeX used.

    You can safely ignore these.

  • “Overfull \hbox”

    LaTeX absolutely loves to spew these out. They are warning you about lines that were too long and run past the right margin. Almost always, this is unnoticeable in the final output. Or, only one or two characters extend past the margin. LaTeX seems to generate at least one of these messages for just about any document you write.

    You can ignore these stupid messages. Your eyes will tell you if there's a problem with something that's too wide; just look at the output.

  • “Underfull \hbox”

    Not quite as common as its cousin. LaTeX seems to like to print lines that are a bit too wide as opposed to ones that are a bit too narrow. We have no idea why.

    You can ignore these, too.

  • “Overfull \vbox” and “Underfull \vbox”

    Warnings about troubles breaking the page. Once again, just look at the output. Your eyes will tell you where something has gone wrong.

  • “LaTeX Error: File ‘Xxxx’ not found”

    The file “Xxxx” isn't installed on this system. This usually appears because some package your document needs isn't installed. If you didn't touch the preamble or didn't use the \usepackage{} command, then one of the packages LyX tried to load is missing. Use Help->LaTeX Configuration to get a list of packages that LyX knows about. This file is updated whenever you reconfigure LyX (using Edit->Reconfigure) and tells you which packages have been detected and what they do.

    If you did use the \usepackage{} command, and the package in question isn't installed, you'll need to install it yourself.

  • “LaTeX Error: Unknown option”

    Error messages beginning with this are trying to tell you that you specified a bad or undefined option to a package. Check the package's documentation.

  • “Undefined control sequence”

    If you've inserted LaTeX code into your document, but made a typo, you'll get one of these. You may have forgotten to load a package. In any case, this error message usually means that you used an undefined command.

  • LaTeX Error: Undefined color `rtlred'

    Theoretically, you can define your own colours in jadetex.cfg. For example, the following block of code defines three new colours, rtlred, rtlblue and rtlgreen (the green being a bit darker than usual):

    \usepackage{color}
    \definecolor{rtlred}{rgb}{0.75,0,0}
    \definecolor{rtlgreen}{rgb}{0,0.5,0}
    \definecolor{rtlblue}{rgb}{0,0,0.75}
    

    However, no matter where I position this code in jadetex.cfg (i.e. either before or after the call to hypersetup), I get the above error. For the time being, my remedy is to stick to the standard colours. Please inform me if you find a better solution.

    Andreas Ekenbäck (private communication) has found that by having double declarations of a colour:

    \newcmykcolor{abstractYellow}{0.02 0.02 0.35 0.0}
    \definecolor{abstractYellow}{cmyk}{0.02,0.02,0.35,0.0}
    

    in jadetex.cfg, it works. On his Fedora Core 4 workstation, he says, it is even enough to have the first declaration.

Further reading: see the chapter on “LyX and LaTeX errors” of the Extended Features manual for LyX for tips on how to handle LaTeX errors and warnings.
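
With a log as long as the one above, it helps to get a quick count of how many messages of each kind it contains before you start reading. The following Python sketch classifies log lines by the prefixes listed in this section. The function name summarise_log and the category labels are my own choices, and package warnings (such as those of hyperref or thumbpdf) are deliberately not counted to keep the sketch short.

```python
from collections import Counter

# Message prefixes discussed in this section, with a label for each.
PREFIXES = [
    ("! ", "error"),
    ("LaTeX Warning:", "LaTeX warning"),
    ("LaTeX Font Warning:", "font warning"),
    ("Overfull \\hbox", "overfull hbox"),
    ("Underfull \\hbox", "underfull hbox"),
    ("Overfull \\vbox", "overfull vbox"),
    ("Underfull \\vbox", "underfull vbox"),
]

def summarise_log(log_text):
    """Count the log lines that start with one of the known prefixes."""
    counts = Counter()
    for line in log_text.splitlines():
        for prefix, label in PREFIXES:
            if line.startswith(prefix):
                counts[label] += 1
                break
    return counts
```

A large number under "error" deserves attention first; the box warnings can usually wait until you have looked at the output with your own eyes.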


6.3.3. TeX capacity exceeded

If you encounter TeX errors, especially

TeX capacity exceeded, sorry [main memory size=384000]

you may have to set the following in texmf.cnf (usually located in /etc/texmf, or wherever the TEXMFCNF environment variable (see Section 7.1.3) points):

main_memory = 3839999
main_memory.jadetex = 4999999
hash_extra.jadetex = 25000
pool_size.jadetex = 500000
save_size.jadetex = 15000
save_size = 8000

FYI, here's how much memory TeX used for this document (you may see such information in the .log file created by pdfjadetex which is deleted by default in lyxtox):

Here is how much of TeX's memory you used:
 4847 strings out of 15997
 36267 string characters out of 99016
 163640 words of memory out of 384000
 15827 multiletter control sequences out of 10000+15000
 68666 words of font info for 129 fonts, out of 400000 for 1000
 14 hyphenation exceptions out of 1000
 31i,12n,40p,1588b,3161s stack positions out of 300i,100n,500p,50000b,8000s

See my Jade Odyssey for more details and ideas on this and other related errors.
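
To see which limit you are approaching, you can turn the memory statistics shown above into percentages. The following is a minimal Python sketch under a few assumptions: the function name memory_usage is hypothetical, limits written as a sum (like 10000+15000) are added up, and the stack positions line is skipped because it packs several values into one line.

```python
import re

# Matches lines such as " 163640 words of memory out of 384000".
STAT = re.compile(r"^\s*(\d+) (.+?),? out of (\d+(?:\+\d+)*)")

def memory_usage(log_text):
    """Map each statistic to (used, limit, used/limit)."""
    usage = {}
    for line in log_text.splitlines():
        m = STAT.match(line)
        if m:
            used = int(m.group(1))
            # A limit like "10000+15000" is the base size plus the extra size.
            limit = sum(int(part) for part in m.group(3).split("+"))
            usage[m.group(2)] = (used, limit, used / limit)
    return usage
```

In the statistics above, the “words of memory” line shows the same limit (384000) as the “main memory size” in the error message, so that is the entry to watch when you see this error.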


6.3.4. Fatal format file error; I'm stymied

Well... me too.

This is what the TeX FAQ says about this error:

(La)TeX applications often fail with this error when you've been playing with the configuration, or have just installed a new version.

The format file contains the macros that define the system you want to use: anything from the simplest (Plain TeX) all the way to the most complicated, such as LaTeX or ConTeXt. From the command you issue, TeX knows which format you want.

The error message

Fatal format file error; I'm stymied

means that TeX itself can't understand the format you want. Obviously, this could happen if the format file had got corrupted, but it usually doesn't. The commonest cause of the message, is that a new binary has been installed in the system: no two TeX binaries on the same machine can understand each other's formats. So the new version of TeX you have just installed, won't understand the format generated by the one you installed last year.

Resolve the problem by regenerating the format; of course, this depends on which system you are using. On a teTeX-based system, run

fmtutil --all

or

fmtutil --byfmt=<format name>

to build only the format that you are interested in.

I got this error during the PDF processing (Section 7.1.4.7). Since pdftex is the program that was running at this stage, I tried both

fmtutil --all

and

fmtutil --byfmt=pdftex

but the problem remained. I had NOT upgraded to any new version for any of the programs involved. What I had done was to add hundreds of cross-references (probably more than 500) in just one section (see it in Credits for version 2.0 of the PHP-Nuke HOWTO).

The solution was to upgrade pdftex from 0.13d to 1.11b (pdfTeX (Web2C 7.5.2) 3.141592-1.11b) and jadetex from 1.3 to 3.13. Note that upgrading pdftex was not enough, jadetex had to be upgraded too. Of course, you would still have to use fmtutil as above. After the upgrades, the error disappeared. As a nice by-product, document processing seems to be much faster now, although I didn't conduct any benchmarks.


6.3.5. Corrupted NFSS tables

(/usr/share/texmf/tex/latex/base/ts1cmr.fd)
! Corrupted NFSS tables.
wrong@fontshape ...message {Corrupted NFSS tables}
                                                  error@fontshape else let f...

I got this error after I used the RefDB (Section 3.11, Section 5.19, Section 7.1.10) DSSSL stylesheet modifications in the local DSSSL stylesheets (Section 4.2, Section 7.1.5) that control the output.


6.3.6. Missing $ inserted

You will probably get tons of warnings like this one:

! Missing $ inserted.
<inserted text> 
                $
l.17245 ...b/a/bsd/2000/09/06/FreeBSD_Basics.html}
                                                  \endNode{}\endSeq{}\endNod...
I've inserted something that you may have forgotten.
(See the <inserted text> above.)

You can see them in the standard output, as well as in the .log file that is created automatically (if you don't see any .log file, maybe lyxtox currently deletes it; check the code). They mean that TeX tried to correct you, because it saw some character that is used in mathematics mode and thought you might have forgotten to switch. So it inserts a $, which toggles math mode. If this is not what you intended (and it most probably is not, if your .tex file is created through the procedure described here), then your only salvation is to escape the character that triggered this behaviour. This means that you may have to escape

  • Slashes (/) in URLs, at least when they separate two numbers (not tested)

  • Underscores (_)

  • Carets (^)

  • and generally everything that may have a math interpretation in TeX/LaTeX/LyX

FIXME: This needs testing! Till now, I didn't resort to escaping - and still things work remarkably well...
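
Should you decide to try escaping, the following Python sketch shows one way it could be done for underscores and carets. It is untested in the lyxtox procedure and rests on assumptions of mine: the function name escape_tex_math_chars is hypothetical, \_ and \^{} are the usual LaTeX ways to typeset these two characters in text, and where in the tool chain such a replacement would have to be applied is left open. Slashes are not touched, since their effect is not verified above.

```python
def escape_tex_math_chars(text):
    """Escape underscores and carets so TeX does not read them as math."""
    # Raw strings keep the backslashes exactly as LaTeX expects them.
    replacements = {"_": r"\_", "^": r"\^{}"}
    return "".join(replacements.get(ch, ch) for ch in text)
```

For instance, awkscr_insert_index_items would become awkscr\_insert\_index\_items. Do not apply the function twice to the same text, because the braces and backslashes it adds are not protected against a second pass over “^”.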


6.3.7. Unprintable characters

Inevitably, sooner or later, you will hit a very nasty problem in document processing: some characters in your file will be unprintable! They may appear in the PDF as black boxes (for example , in the OT1 font encoding), or simply wrong. Here is what you can do to solve this problem:

If you are using the OT1 font encoding[11], i.e. if your jadetex.cfg file contains the line

\usepackage[OT1]{fontenc}

then you have to enter the following characters in math mode:

  • \ (backslash)

  • FIXME

If you are using the T1 font encoding, i.e. if your jadetex.cfg file contains the line

\usepackage[T1]{fontenc}

then you can use the ae and aecompl packages in order to be able to write almost every European character:

\usepackage{ae,aecompl}

You will not, however, manage to get the two symbols, << and >>, printed correctly. If your purpose was to get the French quote characters (the “guillemets”), then you could replace the ae and aecompl packages with the aeguill package, i.e. replace the above line in jadetex.cfg with:

\usepackage{aeguill}

If your purpose was to get two > signs, one after another, just as if you were describing the effect of the “append” operator in

cat file1 >> file2

then you must enter the > signs in math mode.

Note

Note that the above holds for the standard LyX environment (see Section 5.1) and probably all other environments where you use the same font as in the body text. You don't need to worry about characters in the code environment, probably because programming code is usually written in the ASCII encoding and LyX uses CDATA in <screen> or <programlisting> environments, where ASCII (or, maybe the value of the SP_ENCODING environment variable) is the expected input encoding for Openjade.

For more information on fonts and their encodings, see FIXME.


6.4. Other errors

6.4.1. Keywords not present in HTML

If you don't see the keywords in the HTML file, then this may be a bug in your stylesheets. There seems to be a bug in dbcommon.dsl (DocBook DSSSL stylesheets 1.73) that prevents this from happening for an article, where the <keywordset> is within <articleinfo>. The definition of "info-element" in dbcommon.dsl contains

((equal? (gi nd) (normalize "article")) 
  (select-elements (children nd) (normalize "artheader")))

but no reference to "articleinfo".

In dbcommon.dsl, which on my system resides in /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/common/dbcommon.dsl and belongs to the package docbook-dsssl-stylesheets-1.72-34, replace the definition of info-element with:

;; ======================================================================
(define (info-element #!optional (nd (current-node)))
  ;; Returns the *INFO element for the nd or (empty-node-list) if no
  ;; such node exists...
  (cond
   ((equal? (gi nd) (normalize "set"))
    (select-elements (children nd) (normalize "setinfo")))
   ((equal? (gi nd) (normalize "book"))
    (select-elements (children nd) (normalize "bookinfo")))
   ((equal? (gi nd) (normalize "section"))
    (select-elements (children nd) (normalize "sectioninfo")))
   ((equal? (gi nd) (normalize "sect1"))
    (select-elements (children nd) (normalize "sect1info")))
   ((equal? (gi nd) (normalize "sect2"))
    (select-elements (children nd) (normalize "sect2info")))
   ((equal? (gi nd) (normalize "sect3"))
    (select-elements (children nd) (normalize "sect3info")))
   ((equal? (gi nd) (normalize "sect4"))
    (select-elements (children nd) (normalize "sect4info")))
   ((equal? (gi nd) (normalize "sect5"))
    (select-elements (children nd) (normalize "sect5info")))
   ((equal? (gi nd) (normalize "refsect1")) 
    (select-elements (children nd) (normalize "refsect1info")))
   ((equal? (gi nd) (normalize "refsect2")) 
    (select-elements (children nd) (normalize "refsect2info")))
   ((equal? (gi nd) (normalize "refsect3")) 
    (select-elements (children nd) (normalize "refsect3info")))
   ((equal? (gi nd) (normalize "refsynopsisdiv"))
    (select-elements (children nd) (normalize "refsynopsisdivinfo")))
   ((equal? (gi nd) (normalize "article"))
   ;; Changed by root.
   ;; node-list-filter-by-gi and articleinfo inserted. 
   ;; Otherwise no keywords are created in articles.
    (node-list-filter-by-gi (children nd) (list
                                          (normalize "artheader")
                                          (normalize "articleinfo"))))
   (else ;; BIBLIODIV, GLOSSDIV, INDEXDIV, PARTINTRO, SIMPLESECT
    (select-elements (children nd) (normalize "docinfo")))))
;; ======================================================================

6.4.2. thumbpdf fails

Perhaps the most accurate indicator of a serious error in your document or settings is a failure of thumbpdf:

THUMBPDF 2.8, 2001/01/12 - Copyright (c) 1999-2001 by Heiko Oberdiek.
*** make png files / run Ghostscript ***
   **** This file has a corrupted %%EOF marker, or garbage after the %%EOF.
GNU Ghostscript 6.51: Unrecoverable error, exit code 1
!!! Error: Closing Ghostscript (exit status: 1)!

Although document processing will continue (and you might even get a quite usable document), I strongly recommend that you investigate the source of this failure and either eliminate it, or revert to a configuration that is known to work.

However, there is an exception to this rule of thumb: if you see the error

Ghostscript internal error

the very first time you run lyxtox, don't panic. Let it finish. It is probably missing the pk fonts for the resolution of your standard printer and must generate them on the fly. This will be done later, while creating the PostScript (PS) version of the document. After this is done once, you will never see this error again.


6.4.3. sed segmentation fault

On one occasion, I got the following very disturbing error:

/usr/local/bin/runsed: line 54: 27611 Segmentation fault sed -f $SEDSCR $x >/tmp/$y$$

Output of the corrected SGML file broke at a line for no apparent reason. Openjade complained (of course) with some errors, but processed the file. I noticed that something went wrong from the output of thumbpdf (see Section 6.4.2): thumbpdf prints the page number for each page it processes - and this time it only printed half that many. I was not able to find the reason for the segmentation fault. I guess it has to do with the exceptionally long line that sed had to process: LyX produces very long SGML lines. It does not break the lines at SGML tags and clutters them on one line for reasons unknown to me. That particular line was thus more than 3400 characters[12] long when it was output without the error. As soon as I added some innocent text that made it a little longer, it would cause the segmentation fault above. I guess this breaks some input line length limitation in sed. However, this is denied to be the case in the sed manual on the (non-)limitations on line length:

For those who want to write portable SED scripts, be aware that some implementations have been known to limit line lengths (for the pattern and hold spaces) to be no more than 4000 bytes. The POSIX.2 standard specifies that conforming SED implementations shall support at least 8192 byte line lengths. GNU SED has no built-in limit on line length; as long as SED can malloc() more (virtual) memory, it will allow lines as long as you care to feed it (or construct within it).

The only remedy if this limit does exist (as it seems to be in my case), is to shorten your text, or introduce a paragraph, admonition (see Section 4.7, Section 5.8), code or other environment (Section 5.1) that may persuade LyX to start a new SGML line in the exported SGML text.
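
Before shortening text at random, it is useful to know which lines of the exported SGML file are the long ones. Here is a small Python sketch for that. The function name long_lines is hypothetical, and the default threshold of 3400 characters is merely the length at which I observed the problem, not a documented limit of sed.

```python
def long_lines(sgml_text, limit=3400):
    """Return (line number, length) for every line longer than limit."""
    return [
        (number, len(line))
        for number, line in enumerate(sgml_text.splitlines(), start=1)
        if len(line) > limit
    ]
```

Reading the SGML file into a string and passing it to long_lines gives you the candidates; the paragraphs that produced those lines are the ones to break up with a new environment.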


6.4.4. Acrobat Reader 5 does not show thumbnails in Linux

The thumbpdf script creates thumbnails for the PDF document just fine (see Section 7.2.11). Things work fine even with a thumbpdf.tex of version 3.2 and a thumbpdf of version 2.8, i.e. even if the versions of these two files differ (in which case thumbpdf will issue a warning, see Section 6.2).

Yet, one day, I recomputed the PDF version of a document, only to discover that the thumbnails were appearing intermittently in the document: some (few) appeared, the rest showed just a blank thumbnail. After an extensive Internet search (with very few results on the subject...) and a check with both Acrobat Reader 5 for Linux and Acrobat Reader 7 for Windows, it became clear that this is a problem of Acrobat Reader 5 for Linux and large thumbnails (small thumbnails are displayed fine).

I guess there is nothing you can do about this problem, other than switch to small thumbnails (right-click on the thumbnail area to see the context menu that will allow you to choose this), or upgrade to a version of Acrobat Reader that does not suffer from this bug.


6.4.5. URLs with underscore display '&lowbar;' instead of '_'

Any URL that has an underscore in it (like the link to awkscr_insert_index_items) shows '&lowbar;' in place of '_'. To correct this, you must check the checkbox “HTML type” when you insert the URL in LyX, see Figure 6-1.

Figure 6-1. Insert URL with underscores in LyX.


When you do this, you should NOT write underscores as '&lowbar;'. You should write an underscore as is, i.e. as '_', in both the URL and the Name fields (see Figure 6-1).

FIXME: This still does not solve the problem in PDFs: there, the underscore is interpreted mathematically, making the character that follows it appear as a subscript. Moreover, if the link does not have a file ending, the standard ending for PDFs, namely “.pdf”, gets attached to it, which is definitely wrong. Try this link: awkscr_insert_index_items, in HTML and in PDF, to see the difference!


6.4.6. sed: file sedscr_img line 2: Unknown option to `s'

You get the error:

editing mySGML.html
sed: file sedscr_img line 2: Unknown option to `s'
Output written to /tmp/mySGML.html_1402221460
Sed produced an empty file
- check your sedscript.
all done

When you check the offending line 2 of sedscr_img (remember that sedscr_img is dynamically generated each time you run lyxtox), you see:

s/<img src="\.\/images\/paper-sizes\.png">/<img src=".\/images\/paper-sizes.png" alt="ISO/DIN paper sizes." title="ISO/DIN paper sizes.">/g

Now it's clear what the error is: the caption of the figure contains a slash (in "ISO/DIN"), the very character that sed uses here as the delimiter of the s command.

The easiest solution to this seems to be: just avoid using slashes in figure captions! You cannot have everything in this world, it seems...

See Figure 4-2 for the corrected caption of this specific example.
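
If you would rather keep the slash in the caption, the caption text could be escaped before it is written into the generated sed script. The following is a minimal sketch of that idea, not what lyxtox currently does: the function escape_sed_replacement is hypothetical, and the sample caption and image name are taken from the error above.

```shell
#!/bin/sh
# Hypothetical helper (not part of lyxtox): escape the characters that are
# special in the replacement part of a sed "s/.../.../" command, i.e. the
# delimiter "/", the ampersand and the backslash.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[/&\\]|\\&|g'
}

caption='ISO/DIN paper sizes.'
safe_caption=$(escape_sed_replacement "$caption")

# Apply a rule of the same shape as the one in sedscr_img to a sample tag.
html='<img src="./images/paper-sizes.png">'
result=$(printf '%s\n' "$html" |
    sed "s/<img src=\"\.\/images\/paper-sizes\.png\">/<img src=\".\/images\/paper-sizes.png\" alt=\"$safe_caption\">/g")
printf '%s\n' "$result"
```

The escaping itself is done with "|" as the delimiter, so that the slash does not have to be escaped in the very command that escapes it.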


Chapter 7. Explaining the magic: the details

 

Knowing how things work is the basis for appreciation, and is thus a source of civilized delight.

  William Safire

What makes the procedure described here appear to be “magic” is not only the “high-tecness” of the tools involved, but also the highly frustrating fact that each one of them expects its input in different directories and/or formats, making it really difficult for interfaces or pipes to match the output of one tool to the expected input of another. In this chapter I will explain the inner workings of the scripts involved that, ultimately, do one thing: ensure that what is expected will be found in its expected place, in the expected format.

Tip:

You don't need to read this chapter if you are not interested in the gory details. To get things working, Chapter 3, Chapter 4, Chapter 5 and Section 5.21 should suffice. But if you want to understand how things work, then go on!


7.1. Document processing

What happens when you type “lyxtox myTemplate”? A lot! Let's inspect it step by step:

The name of the parameter given to the lyxtox script, myTemplate, becomes $1. Whenever you see $1 in the code examples below, just replace it mentally with the name you supplied to lyxtox.


7.1.1. Check number of parameters

As in every good script, a rudimentary parameter number check is the very first thing to do:

# Check arguments and issue a help statement, if wrong
#
if [ $# -eq 0 -o $# -gt 1 ]; then
  help 
  exit 1
elif [ $1 = "-h" -o $1 = "--help" ]; then
  help 
  exit 0
fi

If the number of parameters is not exactly 1, or if the single parameter is -h or --help, the output of the help() function is printed, which looks like this:

Usage: lyxtox [-h] [FILENAME_WITHOUT_.lyx_ENDING]
Creates HTML, PDF, RTF, TXT and PS output 
from a single LYX source.
Needs: lyx, runsed, sed, sedscr, jadetex.cfg, perl, openjade,
pdfjadetex, DocBook, TeX, LaTeX, thumbpdf, gzip, tar 
and all packages required from these.
See http://www.karakas-online.de/mySGML/ for a detailed 
description.
EXAMPLE
=======
If your file is myfile.lyx, then do
lyxtox myfile
go get a cup of coffee and be happy :-)
-h, --help    Display this help text
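
The test above leaves $1 unquoted, so a file name containing blanks would confuse it. A slightly more defensive variant could look like the sketch below. The function check_args and its return codes are my own illustration, not code from lyxtox.

```shell
#!/bin/sh
# Hypothetical variant of the parameter check with quoted expansions.
# Returns 0 for exactly one file name, 1 for a wrong number of
# parameters and 2 if help was requested explicitly.
check_args() {
    if [ "$#" -ne 1 ]; then
        return 1
    fi
    case "$1" in
        -h|--help) return 2 ;;
        *)         return 0 ;;
    esac
}

# Possible use at the top of a script (help is the function shown above):
# check_args "$@"
# case $? in
#     1) help; exit 1 ;;
#     2) help; exit 0 ;;
# esac
```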

7.1.2. Set program locations

If the parameter number check is passed, some program locations are set (adapt them to your situation!):

# Program locations
# Adapt to your situation.
LYX="/usr/X11R6/bin/lyx"
SED="/usr/bin/sed"
AWK="/usr/bin/awk"
RUNSED="/usr/local/bin/runsed"
SEDSCR="sedscr"
SEDSCRMATH="sedscr_math"
SEDSCRABI="sedscr_abi"
SEDSCRAPP="sedscr_app"
SEDSCRBIB="sedscr_bib"
AWKSCRMATH="awkscr_math"
PERL="/usr/bin/perl"
COLLATEINDEX="/usr/share/sgml/docbook/docbook-dsssl-stylesheets/bin/collateindex.pl"
UNESCAPEMATH="/usr/local/bin/unescape_math.pl"
TEXMATH2PNGBMP="/usr/local/bin/texmath2pngbmp.pl"
THUMB_PDF="/usr/local/bin/thumbpdf"
OPENJADE="/usr/bin/openjade"
PDFJADETEX="/usr/bin/pdfjadetex"
JADETEX="/usr/bin/jadetex"
DVIPS="/usr/bin/dvips"
GZIP="/usr/bin/gzip"
TAR="/bin/tar"
TIDY="/usr/bin/tidy"
HTMLSPLIT="/usr/local/bin/htmlsplit.awk"
REFDBXP="/usr/bin/refdbxp"
RUNBIB="/usr/bin/runbib"
DATADIR="../"
DOMAIN="www.karakas-online.de"

Further, the stylesheet locations are set (adapt them to your situation too):

  HTML_CHUNKS_DSL="lyxtox-html.dsl"
  HTML_NOCHUNKS_DSL="lyxtox-onehtml.dsl"
  PRINT_PDF_DSL="lyxtox-print-pdf.dsl"
  PRINT_PS_DSL="lyxtox-print-ps.dsl"
  PRINT_RTF_DSL="lyxtox-print-rtf.dsl"
  PRINT_TXT_DSL="lyxtox-print-txt.dsl"

You only need to change this part, if at all. The RefDB part:

  HTML_DSL="refdb-html.dsl"
  HTML_CHUNKS_DSL="$HTML_DSL#html"
  HTML_NOCHUNKS_DSL="$HTML_DSL#onehtml"
  PRINT_DSL="refdb-print.dsl"
  PRINT_PDF_DSL="$PRINT_DSL#print-pdf"
  PRINT_PS_DSL="$PRINT_DSL#print-ps"
  PRINT_RTF_DSL="$PRINT_DSL#print-rtf"
  PRINT_TXT_DSL="$PRINT_DSL#print-txt"

works automatically, and in this case it does not matter how you name (or where you place) the HTML_DSL and PRINT_DSL files - they will be created automatically by awkscr_refdb_html and awkscr_refdb_print respectively, under whatever filename you pass to them for their output; see the source code of lyxtox.
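
Since all these locations differ from system to system, it may be worth checking them before the real work starts. The following sketch does this with the standard shell builtin command -v. The function missing_programs is an assumption of mine for illustration; lyxtox itself does not contain such a check as far as this chapter shows.

```shell
#!/bin/sh
# Hypothetical helper (not part of lyxtox): print every program from the
# argument list that cannot be found or is not executable.
missing_programs() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || printf '%s\n' "$prog"
    done
}

# Example with some of the variables set above:
# missing_programs "$LYX" "$SED" "$AWK" "$RUNSED" "$OPENJADE" "$PDFJADETEX"
```

An empty output means that all the listed programs were found. Any line of output names a location that still has to be adapted.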


7.1.3. Set environment variables

It's time to set some environment variables. This is where quite a lot of the “magic” is hidden! Although they are mentioned somewhere in the respective manpages, a newcomer will probably never have heard of them. The result: images cannot be found, no matter how “right” you did it, as well as some other annoyances regarding font mappings. This is because different tools read different environment variables for the same information, ignoring all the others! There is probably no way around this, other than to set all the relevant variables correctly, since TeX/LaTeX were created at a different time and for different needs than SGML parsers or PDF software.

Since we entered the names of images without any path information in LyX, openjade needs to be informed of their location with the environment variable SGML_SEARCH_PATH:

# Environment variables
# openjade needs this!
# Both absolute and relative paths work!
# SGML_SEARCH_PATH="$PWD/images"
SGML_SEARCH_PATH="./images"
export SGML_SEARCH_PATH

pdftex (and pdfjadetex) will look at a different environment variable for the location of the image files: TEXPSHEADERS. For some reason which I don't fully understand, \includegraphics with pdftex uses the TEXPSHEADERS environment variable for the additional paths to search. In addition, TEXPSHEADERS contains the search path where pdftex looks for the font mapping file (pdftex.map) and the encoding files (*.enc):

# pdftex (and pdfjadetex) need this.
# In my system pdftex.map is located in /var/lib/texmf/dvips/config/,
# while the .enc files are under /usr/share/texmf/dvips/base/.
# TEXPSHEADERS=":${PWD}/images//:/var/lib/texmf/dvips/config/:/usr/share/texmf/dvips/base/"
TEXPSHEADERS=":${PWD}/images//"
export TEXPSHEADERS

LaTeX, on the other hand, will look at yet another variable for the path to images: TEXINPUTS (TEXINPUTS is also defined in the texmf.cnf file, usually located in the directory pointed to by the TEXMFCNF environment variable):

# LaTeX & Co. need this!
# A relative path does NOT work!
# TEXINPUTS="$PWD/images:$TEXINPUTS"
TEXINPUTS=":${PWD}/images//"
export TEXINPUTS
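
Two details of the TeX search paths above follow Kpathsea conventions: an empty path component (here the leading colon) stands for the default search path, and a trailing "//" makes the search descend into subdirectories. Since all three variables must point to the same image directory, one could derive them from a single setting, as in the following sketch. The variable IMAGES_DIR is an assumption introduced here for illustration and does not exist in lyxtox.

```shell
#!/bin/sh
# Hypothetical single point of configuration for the image directory.
IMAGES_DIR="images"

# openjade: relative and absolute paths both work.
SGML_SEARCH_PATH="./${IMAGES_DIR}"

# pdftex/pdfjadetex and LaTeX: absolute path, with the leading colon
# (default search path) and trailing // (search subdirectories too).
TEXPSHEADERS=":${PWD}/${IMAGES_DIR}//"
TEXINPUTS=":${PWD}/${IMAGES_DIR}//"

export SGML_SEARCH_PATH TEXPSHEADERS TEXINPUTS
```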

The TEXMFCNF environment variable points to the directory that contains the configuration files for TeX:

TEXMFCNF="/etc/texmf/"
export TEXMFCNF

In /etc/texmf/texmf.cnf we read:

% pdfjadetex: Search path for font metric (.tfm) files.
TEXFONTS = .;$TEXMF/fonts/tfm//

The next environment variable we have to set is the one that passes options to thumbpdf (see Section 3.7), in case you wish to do so:

# You can pass options to thumbpdf through this environment variable
THUMBPDF=""
export THUMBPDF

Last but not least, openjade needs to know the locations of the SGML catalog files (see Section 4.5):

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.iso_ent"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.mathml-2.0"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.svg-1.1"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook_4"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/openjade/catalog"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/refdb/refdb.cat"
export SGML_CATALOG_FILES
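
The assignments above add every catalog unconditionally and produce a leading colon if SGML_CATALOG_FILES was empty to begin with. A more cautious way of building the list is sketched below. The function add_catalog is hypothetical and only meant as an illustration; the catalog paths are the ones from the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of lyxtox): append a catalog file to
# SGML_CATALOG_FILES, but only if the file is actually readable.
add_catalog() {
    [ -r "$1" ] || return 0
    if [ -n "$SGML_CATALOG_FILES" ]; then
        SGML_CATALOG_FILES="$SGML_CATALOG_FILES:$1"
    else
        SGML_CATALOG_FILES="$1"
    fi
}

add_catalog /usr/share/sgml/CATALOG.iso_ent
add_catalog /usr/share/sgml/CATALOG.docbook_4
add_catalog /usr/share/sgml/openjade/catalog
export SGML_CATALOG_FILES
```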

The SP_ENCODING environment variable tells Openjade in which encoding the input file is written:

SP_ENCODING="ISO-8859-1" 
export SP_ENCODING

Encoding names are case insensitive. The following named encodings are available (see Handling of character sets in OpenSP):

utf-8

Each character is represented by a variable number of bytes according to UCS Transformation Format 8 defined in Annex P to be added by the first proposed drafted amendment (PDAM 1) to ISO/IEC 10646-1:1993.

utf-16

Each character is represented by a variable number of bytes according to UCS Transformation Format 16 defined in Annex O to be added by the first proposed drafted amendment (PDAM 1) to ISO/IEC 10646-1:1993.

ucs-2 iso-10646-ucs-2

This is ISO/IEC 10646 with the UCS-2 transformation format. Each character is represented by 2 bytes. No special treatment is given to the byte order mark character.

ucs-4 iso-10646-ucs-4 utf-32

This is ISO/IEC 10646 with the UCS-4 transformation format. Each character is represented by 4 bytes.

unicode

Each character is represented according to the utf-16 encoding. The bytes representing the entire storage object may be preceded by a pair of bytes representing the byte order mark character (0xFEFF). The bytes representing each character are in the system byte order, unless the byte order mark character is present, in which case the order of its bytes determines the byte order. When the storage object is read, any byte order mark character is discarded.

euc-jp

This is equivalent to the Extended_UNIX_Code_Packed_Format_for_Japanese Internet charset. Each character is encoded by a variable length sequence of octets.

euc-kr

This is ASCII and KSC 5601 encoded with the EUC encoding as defined by KS C 5861-1992.

euc-cn cn-gb gb2312

This is ASCII and GB 2312-80 encoded with the EUC encoding. It is equivalent to the CN-GB MIME charset defined in RFC 1922.

sjis shift_jis

This is equivalent to the Shift_JIS Internet charset. Each character is encoded by a variable length sequence of octets. This is Microsoft's standard encoding for Japanese.

big5 cn-big5

This is equivalent to the CN-Big5 MIME charset defined in RFC 1922.

is8859-n iso-8859-n

n can be any single digit other than 0. Each character in the repertoire of ISO 8859-n is represented by a single byte.

koi8-r koi8

The koi8-r encoding as defined in RFC 1489.

xml

On input, this uses XML's rules to determine the encoding. On output, this uses UTF-8.

windows

Specify this encoding when a storage object is encoded using your system's default Windows character set. This uses the so-called ANSI code page.

wunicode

This uses the unicode encoding if the storage object starts with a byte order mark and otherwise the windows encoding. If you are working with Unicode, this is probably the best value for SP_ENCODING.

ms-dos

Specify this encoding when a storage object (file) uses the OEM code page. The OEM code-page for a particular machine is the code-page used by FAT file-systems on that machine and is the default code-page for MS-DOS consoles.
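
A typo in SP_ENCODING is easy to make, so a quick sanity check against the names listed above may be useful. The following is only a sketch: the function is_known_encoding is hypothetical and simply encodes the list above, including the rule that names are case insensitive and that n in iso-8859-n is a single digit other than 0.

```shell
#!/bin/sh
# Hypothetical helper (not part of lyxtox): succeed if the given name is
# one of the encoding names listed above, ignoring case.
is_known_encoding() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        utf-8|utf-16|unicode|xml) return 0 ;;
        ucs-2|iso-10646-ucs-2|ucs-4|iso-10646-ucs-4|utf-32) return 0 ;;
        euc-jp|euc-kr|euc-cn|cn-gb|gb2312) return 0 ;;
        sjis|shift_jis|big5|cn-big5|koi8-r|koi8) return 0 ;;
        is8859-[1-9]|iso-8859-[1-9]) return 0 ;;
        windows|wunicode|ms-dos) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# is_known_encoding "$SP_ENCODING" || echo "unknown encoding: $SP_ENCODING" >&2
```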

For OpenSP, a suite of SGML/XML processing tools related to Openjade, there are two other environment variables that are related to SP_ENCODING (see Handling of character sets in OpenSP):

SP_CHARSET_FIXED

If this variable is 1 or YES, then OpenSP will operate in fixed character set mode.

SP_SYSTEM_CHARSET

This identifies the system character set. When in fixed character set mode, this character set is used as the internal character set. When not in fixed character set mode this character set is used as the internal character set until the document character set has been read, at which point the document character set is used as the internal character set. The only currently recognized value for this is JIS. This refers to a character set which combines JIS X 0201, JIS X 0208 and JIS X 0212 by adding 0x8080 to the codes of characters in JIS X 0208 and 0x8000 to the codes of characters in JIS X 0212. The default system character set is Unicode 2.0.

But since lyxtox does not use OpenSP, we don't need to care about these two, as the manual page of Openjade asserts:

Note Openjade does not use SP_CHARSET_FIXED and SP_SYSTEM_CHARSET
 

OpenJade ignores the SP_CHARSET_FIXED and SP_SYSTEM_CHARSET environment variables and always uses Unicode as its internal character set, as if SP_CHARSET_FIXED was 1 and SP_SYSTEM_CHARSET was unset. Thus only the SP_ENCODING environment variable is relevant to OpenJade's handling of character sets.


7.1.4. Main part

In the main part, the hard work (for your computer) begins:

The document is exported from LyX to DocBook SGML:

$LYX -e docbook $1.lyx

The SGML that is produced by LyX has several shortcomings. They have to be corrected. This is done by calling runsed:

$RUNSED $SEDSCR $1.sgml

which is the subject of the next subsection.

Note Alternative commands
 

There are quite a few alternative invocations of the various tools at appropriate places in the lyxtox script, in the form of comments. These are there in order to show you how you can achieve an equivalent result through other tools.


7.1.4.1. Runsed, sed and sedscr

Runsed takes as argument the sedscript to run and the file against which to run it. It calls sed with sedscr as the “sed command file”. In the sedscr file itself there is another bunch of “magic” going on:

Important note:
 

The changes in LyX' SGML code presented here pertain strictly to LyX version 1.2.0! The 1.1.x versions needed slightly (and subtly) different changes and the same may be true for future versions of LyX. Examples of sed commands for previous LyX versions are presented in the sedscr file in comments. Use (or construct) the right sed commands for the right changes for your LyX version! The success of this method depends crucially on this.

The code

s/<\(sect[^>]*\)>\(<title>[^<]*\)<anchor \([^>]*\)>/<\1 \3>\2/g
s/<\(chapter\)>\(<title>[^<]*\)<anchor \([^>]*\)>/<\1 \3>\2/g

tells sed to substitute[13]

< sect1 >< title > some title < anchor id="some label" >

with

<sect1 id="some label" ><title>some title

and

< chapter >< title > some title < anchor id="some label" >

with

<chapter id="some label"><title>some title
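
To see the two substitutions at work, you can feed them a couple of sample lines by hand. The sed commands below are the ones shown above; the sample titles and labels are invented for this illustration.

```shell
#!/bin/sh
# The two rules from sedscr, applied to two made-up sample lines.
sect_rule='s/<\(sect[^>]*\)>\(<title>[^<]*\)<anchor \([^>]*\)>/<\1 \3>\2/g'
chap_rule='s/<\(chapter\)>\(<title>[^<]*\)<anchor \([^>]*\)>/<\1 \3>\2/g'

sect_out=$(printf '%s\n' '<sect1><title>Some title<anchor id="some-label">' |
    sed -e "$sect_rule" -e "$chap_rule")
chap_out=$(printf '%s\n' '<chapter><title>Intro<anchor id="ch-intro">' |
    sed -e "$sect_rule" -e "$chap_rule")

printf '%s\n' "$sect_out" "$chap_out"
```

Note that the title part is matched with [^<]*, so a title that itself contains markup would not be matched by these rules.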

The code

/^.*<figure><title><graphic/{
s/<figure><title><graphic fileref="\([^"]*\)">[ ]*<anchor id="\([^"]*\)">\([^<]*\)<\/title>[ ]*<\/figure>/\
<figure id="\2">\
   <title>\
   \3\
   <\/title>\
   <mediaobject>\
      <\!\[ \%output\.print\.png; \[\
      <imageobject>\
         <imagedata fileref="\.\/images\/\1.png" format="PNG">\
      <\/imageobject>\
      \]\]>\
      <\!\[ \%output\.print\.pdf; \[\
      <imageobject>\
         <imagedata fileref="\1.pdf" format="PDF" scale="65">\
      <\/imageobject>\
      \]\]>\
      <\!\[ \%output\.print\.eps; \[\
      <imageobject>\
         <imagedata fileref="\1.eps" format="EPS">\
      <\/imageobject>\
       \]\]>\
      <\!\[ \%output\.print\.bmp; \[\
      <imageobject>\
         <imagedata fileref="\1.bmp" format="BMP">\
      <\/imageobject>\
       \]\]>\
      <textobject>\
         <phrase>\3<\/phrase>\
      <\/textobject>\
      <caption>\
         <para>\3<\/para>\
      <\/caption>\
   <\/mediaobject>\
<\/figure>\
/g
}

tells sed to substitute[14]

< figure >< title >< graphic fileref="imagename" > some blanks < anchor id="some id" >some title< /title >

with the more elaborate combination of figure and mediaobject elements:

<figure id="some id">
   <title>
   some title
   </title>
   <mediaobject>
      <![ %output.print.png; [
      <imageobject>
         <imagedata fileref="./images/imagename.png" format="PNG">
      </imageobject>
      ]]>
      <![ %output.print.pdf; [
      <imageobject>
         <imagedata fileref="imagename.pdf" format="PDF" scale="65">
      </imageobject>
      ]]>
      <![ %output.print.eps; [
      <imageobject>
         <imagedata fileref="imagename.eps" format="EPS">
      </imageobject>
       ]]>
      <![ %output.print.bmp; [
      <imageobject>
         <imagedata fileref="imagename.bmp" format="BMP">
      </imageobject>
       ]]>
      <textobject>
         <phrase>some title</phrase>
      </textobject>
      <caption>
         <para>some title</para>
      </caption>
   </mediaobject>
</figure>

There are some remarks due here:

  • The title of the original SGML appears in three places of the new SGML: the title, the phrase for the alternative text and the caption. LyX uses the figure caption for the title and there is no way we can derive three different texts for the three different uses in the new SGML. That is why in the output document the figure title, the alternative text and the figure caption are identical.

  • For the PNG format we must prefix the image file name, imagename.png, with the relative path (./images) to it, even though we set all environment variables correctly (see Section 7.1.3). This is not necessary for the other formats.

  • We scale the PDF images to 100%. I used to scale them down to 65%, but this is no longer necessary: after some experimentation, the scale factors used in the addd utility seem to compensate for this need. See also Section 4.9.

  • We make use of external SGML entities like %output.print.png; This is a topic of its own which is explained in detail in Section 7.2.2.

A mediaobject similar to the above (but without figure id and caption) is inserted whenever a “simple” image, i.e. one without the float element with caption, is encountered in LyX' SGML. It substitutes a line like[15]

< graphic fileref="imagename" >

with a mediaobject like

<mediaobject>
      <![ %output.print.png; [
      <imageobject>
         <imagedata fileref="./images/imagename.png" format="PNG">
      </imageobject>
      ]]>
      <![ %output.print.pdf; [
      <imageobject>
         <imagedata fileref="imagename.pdf" format="PDF" scale="65">
      </imageobject>
      ]]>
      <![ %output.print.eps; [
      <imageobject>
         <imagedata fileref="imagename.eps" format="EPS">
      </imageobject>
       ]]>
      <![ %output.print.bmp; [
      <imageobject>
         <imagedata fileref="imagename.bmp" format="BMP">
      </imageobject>
       ]]>
</mediaobject>

Notice that the text is now simply "Figure", since there was no caption. You may change it to something else. There is also no id available for this mediaobject, therefore you cannot cross-reference it. That's why I suggested floats in Section 5.7.

The following sed code

/^.*[^<]*<programlisting/s/<programlisting\([^>]*\)>/<screen\1>/g
/^.*[^<]*<\/programlisting>/s/<\/programlisting\([^>]*\)>/<\/screen\1>/g

will substitute <programlisting> with <screen>, while this one:

# Delete the <para> before the <tgroup> tag.
s/<para><tgroup/<tgroup/g
# Delete the </para> after the </tgroup> tag.
s/<\/tgroup><\/para>/<\/tgroup>/g

will delete <para> before <tgroup> and </para> after </tgroup>.

For table captions and titles to be output correctly, you have to eliminate the <para> from any sequence </title><para><tgroup> AND you have to write a table float (see Section 5.10), inside of which you have to set the title and the caption environment on one line, then press <enter>, set the environment to "Standard" (this will produce the <para> element we eliminate here) and continue with the table normally. A warning about an "end tag for element "TABLE" which is not open" is the lesser evil we can get and is harmless (a LyX bug in 1.2.0, not openjade's):

/<\/title><para><tgroup/s/<\/title><para><tgroup/<\/title><tgroup/

Further, for the cross-references to tables to work, we have to substitute[16]

< table >< title >< anchor id="some id" >

with

<table id="some id"><title>

This is done with the following sed code:

s/<table>[ ]*<title>[ ]*<anchor \([^>]*\)>/<table \1><title>/g

Some minor issues still remain:

Substitute 'ldquo' with 'quot' and 'rdquo' with 'quot':

s/\&ldquo;/\&quot;/g
s/\&rdquo;/\&quot;/g

But we are not done with the quots yet: In <othercredit> we have to substitute[17]

& quot ;

with " :

/<othercredit/s/\&quot;/"/g

And: substitute[18]

& amp ; copy ; 

with &copy. This will produce a copyright symbol, instead of the literal text "&copy;":

s/\&amp;copy;/\&copy/g

Also, substitute each &xxxx; entity with the character it represents - somehow these entities do not work:

s/\&lsqb;/[/g
s/\&rsqb;/]/g
s/\&lcub;/{/g
s/\&rcub;/}/g
s/\&dollar;/$/g
s/\&percnt;/%/g
s/\&num;/#/g
s/\&verbar;/|/g
s/\&pound;/£/g
s/\&lowbar;/_/g
s/\&bsol;/\\/g
s/\&tilde;/~/g

Finally, for the index to be created, we have to insert the index creation command (comment it out if you don't want an index). To substitute[19]

< /book >

with[20]

&index;
< /book >

the following sed code is needed:

/<\/book>/s/<\/book>/\&index;\
<\/book>/

Similar sed code is there for the article document type. Currently, this part has been commented out and transferred to the sedscr_abi script, which inserts the entities for the Appendix, the Bibliography and the Index at once at the end of the document, before the closing </book> or </article> tags.
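
The index insertion is easy to try out in isolation. The sed command in the following sketch is the one shown above; the two sample lines that are fed to it are made up for the illustration.

```shell
#!/bin/sh
# Insert the &index; entity on a line of its own before the closing </book>.
# A backslash followed by a newline in the replacement yields a line break.
result=$(printf '%s\n' '<para>The end.</para>' '</book>' |
    sed '/<\/book>/s/<\/book>/\&index;\
<\/book>/')
printf '%s\n' "$result"
```

The ampersand has to be escaped in the replacement, because an unescaped "&" would stand for the matched text, i.e. for </book> itself.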


7.1.4.2. Tidying up the SGML code

Finally, two calls to runsed with sedscr_tidy and sedscr_tidy2 as the script files will “tidy up” the SGML file:

$RUNSED $SEDSCRTIDY $1.sgml
$RUNSED $SEDSCRTIDY2 $1.sgml

sedscr_tidy consists simply of the following lines:

# Author: Chris Karakas
# http://www.karakas-online.de
#
# Part of the LyX-to-X project.
# See http://www.karakas-online.de/mySGML/ for a detailed 
# description.
#
# Copyright (c) 2004, Chris Karakas 
# http://www.karakas-online.de
# chris at mydomain dot de (see above for my domain) 
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; see the file COPYING.  If not, write to
# the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.


{
s/\([^\n]\)</\1\
</
s/>\([^\n]\)/>\
\1/
P
D
}

It does practically nothing more than insert a newline before each opening bracket (a "<") and after each closing one (a ">"). After this transformation has taken place for the whole document, runsed is called with sedscr_tidy2 as the sed script:


/<entry/{
N
s/<entry \([^>]*\)>[ \n\t]*<para>/<entry \1><para>/
p
d
}

/<\/para/{
N
s/<\/para>[ \n\t]*<\/entry>/<\/para><\/entry>/
p
d
}

/DATA/{
N
s/\n]]>/]]>/
P
D
}

/<citation/{
N
N
s/\n//g
p
d
}

sedscr_tidy2 will do some corrections, because the changes of sedscr_tidy went a bit too far:

  • It will delete any newline before the closing "]]>" of CDATA elements

  • and it will bring <entry> and <para> elements on the same line, i.e. it will delete any newlines between them. It will do the same for the closing </para> and </entry> pairs. Both pairs occur inside tables and, due to the Pernicious Mixed Content Problem, they should not contain any line feed or space in-between, or the parser will think that the table cell contains inline markup and issue the warning that the “document type does not allow element <para>” at that place. See the discussion of the “document type does not allow element <para>” error in Section 6.2, as well as in Openjade error: <para> not allowed after <entry>.

Warning Tidy scripts mess up code snippets
 

The tidy scripts will mess up any part of your file that contains the < and > brackets. Especially code that is included verbatim (i.e. without the use of some external entity) and contains such brackets will look awkward. Callouts will also be affected. I have deactivated the call to the scripts in the lyxtox file until I find a better solution (BTW, nsgmls will break with errors so it does not lend itself to SGML code tidying either). If potentially affected code is included with the help of an external entity though, then the tidy scripts might work fine for you.


7.1.4.3. Key combinations

The sedscr file also contains code that will add markup for key combinations:

# Key combinations
#
# CTRL-X-Y
s/\([^-]\)CTRL-\([^-. &<]*\)-\([^-. &,)<]*\)/\1\
<keycombo>\
    <keycap>CTRL<\/keycap>\
    <keycap>\2<\/keycap>\
    <keycap>\3<\/keycap>\
<\/keycombo>\
<indexterm>\
    <primary>CTRL_\2_\3<\/primary>\
<\/indexterm>\
/g

The above code, for example, will substitute every occurrence of the string “CTRL-X-Y” with:

<keycombo>
    <keycap>CTRL</keycap>
    <keycap>X</keycap>
    <keycap>Y</keycap>
</keycombo>
<indexterm>
    <primary>CTRL_X_Y</primary>
</indexterm>

thus adding the right DocBook markup for the key combination, as well as an index entry for it. There is also code for “CTRL-X” or only “CTRL”. Instead of “CTRL”, you can also have “ESC” or “ALT” - there is code for them too. The user can thus just write “CTRL-ALT-DEL”, “ESC” or “ALT-F4” and the scripts will take care of markup and indexing.


7.1.4.4. Acronyms, product names, applications

There is also automatic markup insertion for acronyms, product names and applications:

For acronyms:

# Acronyms
# PHP, GNU, EOF, Python, POSIX, GUI, LDP...
s/\([ .\t\r\n]\)PHP\([ .\t\r\n]\)/\1<acronym>PHP<\/acronym>\2/g
s/\([ .\t\r\n]\)GNU\([ .\t\r\n]\)/\1<acronym>GNU<\/acronym>\2/g
s/\([ .\t\r\n]\)EOF\([ .\t\r\n]\)/\1<acronym>EOF<\/acronym>\2/g
s/\([ .\t\r\n]\)Python\([ .\t\r\n]\)/\1<acronym>Python<\/acronym>\2/g
s/\([ .\t\r\n]\)POSIX\([ .\t\r\n]\)/\1<acronym>POSIX<\/acronym>\2/g
s/\([ .\t\r\n]\)GUI\([ .\t\r\n]\)/\1<acronym>GUI<\/acronym>\2/g
s/\([ .\t\r\n]\)LDP\([ .\t\r\n]\)/\1<acronym>LDP<\/acronym>\2/g
s/\([ .\t\r\n]\)IDE\([ .\t\r\n]\)/\1<acronym>IDE<\/acronym>\2/g
s/\([ .\t\r\n]\)RPM\([ .\t\r\n]\)/\1<acronym>RPM<\/acronym>\2/g
s/\([ .\t\r\n]\)PGP\([ .\t\r\n]\)/\1<acronym>PGP<\/acronym>\2/g
s/\([ .\t\r\n]\)GPG\([ .\t\r\n]\)/\1<acronym>GPG<\/acronym>\2/g
s/\([ .\t\r\n]\)ID\([ .\t\r\n]\)/\1<acronym>ID<\/acronym>\2/g
s/\([ .\t\r\n]\)BW\([ .\t\r\n]\)/\1<acronym>BW<\/acronym>\2/g
s/\([ .\t\r\n]\)ASCII\([ .\t\r\n]\)/\1<acronym>ASCII<\/acronym>\2/g
s/\([ .\t\r\n]\)CPU\([ .\t\r\n]\)/\1<acronym>CPU<\/acronym>\2/g

For product names:

# Product names
# UNIX, Linux, Acrobat, Windows...
s/\([ .\t\r\n]\)UNIX\([ .\t\r\n]\)/\1<productname>UNIX<\/productname>\2/g
s/\([ .\t\r\n]\)Linux\([ .\t\r\n]\)/\1<productname>Linux<\/productname>\2/g
s/\([ .\t\r\n]\)Acrobat\([ .\t\r\n]\)/\1<productname>Acrobat<\/productname>\2/g
s/\([ .\t\r\n]\)Windows\([ .\t\r\n]\)/\1<productname>Windows<\/productname>\2/g
s/\([ .\t\r\n]\)Mandrake\([ .\t\r\n]\)/\1<productname>Mandrake<\/productname>\2/g
s/\([ .\t\r\n]\)SuSE\([ .\t\r\n]\)/\1<productname>SuSE<\/productname>\2/g

For applications:

# Applications
# TeX, LaTeX, Acrobat Reader, PHP-Nuke
s/\([ .\t\r\n]\)TeX\([ .\t\r\n]\)/\1<application>TeX<\/application>\2/g
s/\([ .\t\r\n]\)LaTeX\([ .\t\r\n]\)/\1<application>LaTeX<\/application>\2/g
s/\([ .\t\r\n]\)Reader\([ .\t\r\n]\)/\1<application>Reader<\/application>\2/g
s/\([ .\t\r\n]\)PHP-Nuke\([ .\t\r\n]\)/\1<application>PHP-Nuke<\/application>\2/g
s/\([ .\t\r\n]\)Perl\([ .\t\r\n]\)/\1<application>Perl<\/application>\2/g
s/\([ .\t\r\n]\)Java\([ .\t\r\n]\)/\1<application>Java<\/application>\2/g

The principle is the same for all three: substitute any occurrence of the acronym, product name or application with the appropriate DocBook markup. For example, the string “GNU” is replaced by:

<acronym>GNU</acronym>

(GNU is an acronym), while a “Linux” is replaced by:

<productname>Linux</productname>

(Linux is a product name in this case) and “TeX” (an application) is replaced by

<application>TeX</application>

It is up to you which acronyms, product names or applications you want to have automatically marked up this way (they will appear in small caps if you didn't change anything in the standard stylesheets), so feel free to add or remove your favourites from the code in sedscr.
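
Since all these rules have exactly the same shape, they do not have to be typed by hand. The following sketch generates them from a list of words. The function make_markup_rules is hypothetical and only illustrates the idea; it assumes that the words contain no characters that are special in sed regular expressions.

```shell
#!/bin/sh
# Hypothetical generator (not part of lyxtox): print one sed rule of the
# shape used in sedscr for every word given.
# $1: DocBook element name, remaining arguments: the words to mark up
make_markup_rules() {
    tag=$1
    shift
    for word in "$@"; do
        printf 's/\\([ .\\t\\r\\n]\\)%s\\([ .\\t\\r\\n]\\)/\\1<%s>%s<\\/%s>\\2/g\n' \
            "$word" "$tag" "$word" "$tag"
    done
}

# Example: print the rules for a few acronyms and product names.
make_markup_rules acronym PHP GNU EOF
make_markup_rules productname UNIX Linux
```

Keep in mind that each rule demands a blank, dot, tab or line break on both sides of the word, so a word at the very beginning or end of a line is not marked up by rules of this shape.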

This concludes the transformation of LyX' SGML (specific parts were not covered here - for the Mathematics part, see Section 10.3.1, for the citations part, see Section 7.1.10.2). runsed copies the original SGML file to a backup file with the .bak ending, then writes the output of sed to a temporary file with a random name, compares the two files and, if something has changed, replaces the original file with the temporary one, otherwise outputs a warning that the file did not change.
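
The behaviour of runsed described above can be summarized in a few lines of shell. The following is a simplified sketch under my own assumptions, not the actual runsed script: the function name apply_sed_script is invented, and mktemp is used for the temporary file.

```shell
#!/bin/sh
# Simplified sketch of what runsed does for one file (hypothetical code).
# $1: sed script, $2: file to edit in place
apply_sed_script() {
    script=$1
    target=$2
    tmp=$(mktemp) || return 1

    # Keep a backup of the original file.
    cp "$target" "$target.bak" || { rm -f "$tmp"; return 1; }

    # Run sed and refuse to continue if it failed or produced nothing.
    if ! sed -f "$script" "$target" > "$tmp" || [ ! -s "$tmp" ]; then
        echo "sed failed or produced an empty file - check your sed script" >&2
        rm -f "$tmp"
        return 1
    fi

    # Replace the original only if something has changed.
    if cmp -s "$target" "$tmp"; then
        echo "warning: $target did not change" >&2
    else
        cat "$tmp" > "$target"
    fi
    rm -f "$tmp"
}

# Example:
# apply_sed_script sedscr myTemplate.sgml
```

Writing the result back with cat instead of mv keeps the permissions of the original file.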


7.1.4.5. Alt attributes for images

Before we continue, we have to take care of a problem that seems to be caused by a bug in the DSSSL stylesheets used: although we add a <phrase> element to the <textobject>'s (see the code in sedscr), it seems that it is not used for alt attributes in the resulting images during HTML creation. This makes the resulting HTML documents fail the HTML validation test of the W3C (see Chapter 8).

I have decided to resolve this problem with another sed script, this time a dynamic one! First, the SGML file of the document is passed to sed using sedscr_ima:

${SED} -n -f ${SEDSCRIMA} $1.sgml > ${SEDSCRIMG}

This produces a sed script (${SEDSCRIMG}) called sedscr_img. We create a second sed script, ${SEDSCRGRA}, which we will use to substitute "graphXXXX" with the real name of the graphic file in ${SEDSCRIMG}:

${SED} -n -e '/<\!ENTITY/s/.*graph\([^ ]*\) "\([^>]*\)".*>/s\/graph\1\/\2\/g/p' $1.sgml > ${SEDSCRGRA}

We now use the sed script ${SEDSCRGRA} (sedscr_gra) to substitute "graphXXXX" with the real name of the graphic file in ${SEDSCRIMG}[21]:

${RUNSED} ${SEDSCRGRA} ${SEDSCRIMG}
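To make the "sed script that writes a sed script" idea more tangible, here is a small sketch that runs the same expression on a made-up entity declaration. The entity line and the helper name gen_gra_rules are assumptions for illustration; the real declarations in your SGML file may look different.

```shell
#!/bin/sh
# Sketch: turn entity declarations into substitution rules, one per graphic.
# The expression is the one used for ${SEDSCRGRA}, without the escaped "!".
gen_gra_rules() {
  sed -n -e '/<!ENTITY/s/.*graph\([^ ]*\) "\([^>]*\)".*>/s\/graph\1\/\2\/g/p'
}

# Hypothetical input: one entity declaration and one unrelated line.
printf '%s\n' '<!ENTITY graph0001 "general-info">' '<para>text</para>' | gen_gra_rules
# prints: s/graph0001/general-info/g
```

Every line of the output is itself a sed command, which is why the result can be applied to ${SEDSCRIMG} with runsed. Note that a file name containing a slash would break the generated rule.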

This last step transforms sedscr_img, but it still contains <acronym>, <productname> and <application> tags. To erase them from the alt and title texts in the sed script ${SEDSCRIMG}, we use the sed script sedscr_apa:

${RUNSED} ${SEDSCRAPA} ${SEDSCRIMG}

Finally, we add the necessary sed commands for the alt and title texts of smilies.

echo 's/"\.\/images\/icon_smile\.png">/".\/images\/icon_smile.png" alt="smile" title="smile">/g' >> ${SEDSCRIMG}
echo 's/"\.\/images\/icon_wink\.png">/".\/images\/icon_wink.png" alt="wink" title="wink">/g' >> ${SEDSCRIMG}
echo 's/"\.\/images\/icon_cool\.png">/".\/images\/icon_cool.png" alt="cool" title="cool">/g' >> ${SEDSCRIMG}
echo 's/"\.\/images\/icon_eek\.png">/".\/images\/icon_eek.png" alt="shock" title="shock">/g' >> ${SEDSCRIMG}
echo 's/"\.\/images\/icon_frown\.png">/".\/images\/icon_frown.png" alt="frown" title="frown">/g' >> ${SEDSCRIMG}

Now we have computed a sed script, ${SEDSCRIMG}, that adds alt and title attributes to the images in every HTML file it is applied to.

Here is what it looks like for this document:

s/<img src="\.\/images\/general-info\.png">/<img src=".\/images\/general-info.png" alt="General document info." title="General document info.">/g
s/<img src="\.\/images\/paper-sizes\.png">/<img src=".\/images\/paper-sizes.png" alt="ISO-DIN paper sizes." title="ISO-DIN paper sizes.">/g
s/<img src="\.\/images\/insert-url\.png">/<img src=".\/images\/insert-url.png" alt="Insert URL with underscores in LyX." title="Insert URL with underscores in LyX.">/g
s/<img src="\.\/images\/page-area-model\.png">/<img src=".\/images\/page-area-model.png" alt="CSS page area model." title="CSS page area model.">/g
s/<img src="\.\/images\/fonts\.png">/<img src=".\/images\/fonts.png" alt="Document Info: Fonts." title="Document Info: Fonts.">/g
s/"\.\/images\/icon_smile\.png">/".\/images\/icon_smile.png" alt="smile" title="smile">/g
s/"\.\/images\/icon_wink\.png">/".\/images\/icon_wink.png" alt="wink" title="wink">/g
s/"\.\/images\/icon_cool\.png">/".\/images\/icon_cool.png" alt="cool" title="cool">/g
s/"\.\/images\/icon_eek\.png">/".\/images\/icon_eek.png" alt="shock" title="shock">/g
s/"\.\/images\/icon_frown\.png">/".\/images\/icon_frown.png" alt="frown" title="frown">/g

We will use it in a moment...
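The five echo commands for the smilies differ only in the icon name and the text. As a sketch, they could be generated from a small table instead; the function names smiley_rule and smiley_rules are hypothetical and not part of lyxtox.

```shell
#!/bin/sh
# Sketch: generate the smiley rules from (icon, text) pairs.
smiley_rule() {
  # $1 = icon file stem (icon_$1.png), $2 = alt and title text
  printf 's/"\\.\\/images\\/icon_%s\\.png">/".\\/images\\/icon_%s.png" alt="%s" title="%s">/g\n' \
    "$1" "$1" "$2" "$2"
}

smiley_rules() {
  while read -r icon text; do
    smiley_rule "$icon" "$text"
  done <<EOF
smile smile
wink wink
cool cool
eek shock
frown frown
EOF
}

# To append the rules to the generated script: smiley_rules >> "${SEDSCRIMG}"
```

Adding a new smiley then means adding one line to the table rather than copying a long echo command.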


7.1.4.6. Document creation: HTML

After some cleaning

# Clean previous HTML files.
rm $1/*.html
# Clean previous image files.
rm -rf $1/images
# Clean rsync backup copies.
rm -rf $1/*~

the document creation begins:

For the one HTML file, the steps are:

  • Index initialization (-N option):

    $PERL $COLLATEINDEX -N -o index.sgml
    
  • Create one HTML file. We use openjade for that. Older versions used sgmltools as follows:

    $SGMLTOOLS -b onehtml -s $HTML_NOCHUNKS_DSL -j "-i output.print.png -V nochunks -V html-index" $1.sgml
    

    Notice that we pass "-i output.print.png -V nochunks -V html-index" to openjade through the -j option. Current versions use openjade directly:

    ${OPENJADE} -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks -V html-index $1.sgml > $1.html
    

    The -i option to openjade tells it to include the output.print.png entity (see the structure of the mediaobjects put in place by runsed above), while the preamble (see Section 4.6) tells it to ignore all such entities. Since the command line option overrides all others, the output.print.png entity is included for the HTML output, while the other ones are ignored (see also Section 7.2.2).

  • Index creation:

    $PERL $COLLATEINDEX -g -o index.sgml HTML.index
    
  • Generation of one HTML file (the index will be included). Again, older versions used sgmltools:

    $SGMLTOOLS -b onehtml -s $HTML_NOCHUNKS_DSL -j "-i output.print.png" $1.sgml
    

    but newer ones use openjade:

    ${OPENJADE} -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks $1.sgml > $1.html
    
  • Tidy the HTML code:

    $TIDY -ascii -c -wrap 200 -f /dev/null -m $1.html
    
  • Correct header and footer. First, split the HTML document in title and body parts. The title will be put in title.tmp, the body in body.tmp:

    $HTMLSPLIT < $1.html
    
  • Second, put the right header and footer in the file (see Chapter 8):

    HTMLFILE=$1.html
    BASENAME=`basename $HTMLFILE`
    cat ${DATADIR}/part1 > ${HTMLFILE}
    cat title.tmp >> ${HTMLFILE}
    echo '</title>' >> ${HTMLFILE}
    cat meta.tmp >> ${HTMLFILE}
    
  • Substitute the placeholders DOMAIN, DIRNAME, FILENAME etc. in the header (part2) and footer file (part3) with the current values:

    # Header
    ${SED} -e "s/_DOMAIN_/${DOMAIN}/g" ${DATADIR}/part2 > part2_1.tmp
    ${SED} -e "s/_DIRNAME_/$1/g" part2_1.tmp > part2_2.tmp
    ${SED} -e "s/_FILENAME_/${BASENAME}/g" part2_2.tmp > part2_3.tmp
    ${SED} -e "s/_TITLE_/${TITLE}/g" part2_3.tmp > part2_4.tmp
    ${SED} -e "s/_FORMATSFILE_/${FORMATSFILE}/g" part2_4.tmp > part2_5.tmp
    ${SED} -e "s/_COPYRIGHT_/${COPYRIGHT}/g" part2_5.tmp > part2_6.tmp
    ${SED} -e "s/_HOMEFILE_/${HOMEFILE}/g" part2_6.tmp > part2_7.tmp
    ${SED} -e "s/_DATE_/${TODAY}/g" part2_7.tmp > part2.tmp
    cat part2.tmp >> ${HTMLFILE}
    # Body
    cat body.tmp >> ${HTMLFILE}
    # Footer
    ${SED} -e "s/_DOMAIN_/${DOMAIN}/g" ${DATADIR}/part3 > part3_1.tmp
    ${SED} -e "s/_DIRNAME_/$1/g" part3_1.tmp > part3_2.tmp
    ${SED} -e "s/_FILENAME_/${HTMLFILE}/g" part3_2.tmp > part3_3.tmp
    ${SED} -e "s/_TITLE_/${TITLE}/g" part3_3.tmp > part3_4.tmp
    ${SED} -e "s/_FORMATSFILE_/${FORMATSFILE}/g" part3_4.tmp > part3_5.tmp
    ${SED} -e "s/_COPYRIGHT_/${COPYRIGHT}/g" part3_5.tmp > part3_6.tmp
    ${SED} -e "s/_HOMEFILE_/${HOMEFILE}/g" part3_6.tmp > part3_7.tmp
    ${SED} -e "s/_DATE_/${TODAY}/g" part3_7.tmp > part3.tmp
    cat part3.tmp >> ${HTMLFILE}
    

    If you have set the values of TITLE, FORMATSFILE, HOMEFILE etc. in your .start file (see Section 4.11) and you use the corresponding placeholders somewhere in your part* files, they will be replaced with those values too. The same is true for DATE: you can use it in your headers and footers to produce the timestamp automatically.

  • Add alt and title attributes to images. We use the sed script sedscr_img, that we computed in Section 7.1.4.5:

    # Add alt and title tags to the images.
    ${RUNSED} ${SEDSCRIMG} ${HTMLFILE}
    
  • Finally, do some housekeeping, removing all intermediate files:

    # Housekeeping
    rm -f body.tmp title.tmp meta.tmp part2*.tmp part3*.tmp
    

For the HTML output with many files (chunks), the procedure is analogous to the above, so I will not repeat it here.
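The placeholder substitution for part2 and part3 shown above uses eight sed calls and seven intermediate files per part. A more compact variant is sketched below. It is an assumption-laden illustration, not the code in lyxtox: the function name fill_placeholders is invented, and | is used as the delimiter so that values containing slashes (URLs, paths) do not break the expression.

```shell
#!/bin/sh
# Sketch: fill all placeholders of one template in a single sed call.
# Usage: fill_placeholders TEMPLATE DIRNAME FILENAME
# DOMAIN, TITLE, FORMATSFILE, COPYRIGHT, HOMEFILE and TODAY are taken
# from the environment, as in the original script.
fill_placeholders() {
  sed -e "s|_DOMAIN_|${DOMAIN}|g" \
      -e "s|_DIRNAME_|$2|g" \
      -e "s|_FILENAME_|$3|g" \
      -e "s|_TITLE_|${TITLE}|g" \
      -e "s|_FORMATSFILE_|${FORMATSFILE}|g" \
      -e "s|_COPYRIGHT_|${COPYRIGHT}|g" \
      -e "s|_HOMEFILE_|${HOMEFILE}|g" \
      -e "s|_DATE_|${TODAY}|g" \
      "$1"
}

# Possible use: fill_placeholders "${DATADIR}/part2" "$1" "${BASENAME}" >> "${HTMLFILE}"
```

Values that contain |, & or a backslash would still need escaping before they are passed to sed.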


7.1.4.7. Document creation: PDF

For the print formats, the index is recreated (we need page numbers instead of HTML links). Notice the -p option to collateindex and the use of the saved copy of HTML.index in the current directory - this is because the raw index data are generated with the HTML stylesheet, even for the print formats (see Section 7.1.11):

rm index.sgml
$PERL $COLLATEINDEX -p -g -o index.sgml HTML.index

and the images directory is copied under $1 (the myTemplate directory, in our example):

cp -av images $1/

For the PDF output, the steps are:

  • Generate the PDF document in a first pass:

    $OPENJADE -t tex -d $PRINT_PDF_DSL -o $1.tex -i "output.print.pdf" $1.sgml
    

    Notice that now only the output.print.pdf entities in the mediaobjects are included (see the discussion of this for the HTML output above, as well as in Section 7.2.2).

  • The PDF generated in the first pass does not have thumbnails yet. Generate thumbnails now (do not confuse the script THUMB_PDF (thumbpdf) with the environment variable THUMBPDF, which passes additional options to the THUMB_PDF script):

    $THUMB_PDF $1
    
  • Generate PDF again (2nd pass), to incorporate the thumbnails. The following two commands are equivalent to

    $SGMLTOOLS -b pdf -s sgmltools-pdf -j "-i output.print.pdf" $1.sgml
    

    (up to the use of the stylesheet, that is), but we have to run them separately, because otherwise we cannot process the file produced by thumbpdf: sgmltools will always want a filename of the form @jobname.tpt, where jobname is its PID (process ID), which is difficult to guess. thumbpdf, on the other hand, will produce $1.tpt. So we have to "simulate" sgmltools with the following two commands:

    • This will produce a tex file from the SGML source:

      $OPENJADE -t tex -d $PRINT_PDF_DSL -o $1.tex -i "output.print.pdf" $1.sgml
      
    • This will produce a PDF file from the tex file. This PDF file will have thumbnails!

      $PDFJADETEX $1.tex
      
    • We must call pdfjadetex a second time, because the first time there was no .aux file and the bookmarks were not created:

      $PDFJADETEX $1.tex
      
    • A third pass of pdfjadetex is needed, in order to get the page numbers in the Table of Contents computed. See the PRINT_PDF_DSL used above for the parameters that control printing and placement of the ToC:

      $PDFJADETEX $1.tex
      

Our PDF document, with all its bells and whistles, is now ready!
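The repeated pdfjadetex calls above can also be written as a small loop that stops as soon as one pass fails. This is only a sketch; run_passes is a hypothetical helper, not something lyxtox provides.

```shell
#!/bin/sh
# Sketch: run the same command a fixed number of times.
# Usage: run_passes COUNT COMMAND [ARGS...]
run_passes() {
  n=$1
  shift
  i=1
  while [ "$i" -le "$n" ]; do
    "$@" || return 1   # abort if a pass fails
    i=$((i + 1))
  done
}

# Possible use in the script: run_passes 3 $PDFJADETEX "$1.tex"
```

Unlike the three plain calls, this variant does not continue with a second or third pass after a failed one.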


7.1.4.8. Document creation: RTF and TXT

The RTF and TXT outputs are easy. For the RTF, we just have to do:

$OPENJADE -t rtf -d $PRINT_RTF_DSL  -i "output.print.bmp" $1.sgml

and for the TXT:

$LYNX -dump -nolist $1.html > $1.txt

i.e. we use the Lynx text browser with the -dump option to create a text version from the one, big HTML file we created previously.


7.1.4.9. Document creation: PS

For the PS output, we have a little more work: we have to set the printer to "cmz", so that dvips (which will be called either through sgmltools, as in older versions of the script, or directly, as in newer ones) will search for the file config.cm (located in /var/lib/texmf/dvips/config/config.cm on my system), which contains the mappings for the "Computer-Modern" fonts. We use "cmz" instead of "cm" in order to embed the font in the PS file, thus making it portable (there is also a file config.cmz):

PRINTER="cmz"
export PRINTER
$OPENJADE -t tex -d $PRINT_PS_DSL -o $1.tex -i "output.print.eps" $1.sgml
# Compress PS
$GZIP $1.ps

As with PDFJADETEX in Section 7.1.4.7, again 3 passes are necessary:

$JADETEX $1.tex
$JADETEX $1.tex
$JADETEX $1.tex

An equivalent command would be

$SGMLTOOLS -b ps -s sgmltools-ps -j "-i output.print.eps" $1.sgml

but you would have to use the right stylesheet (in this invocation, the sgmltools-ps stylesheet is used, which is mapped, through the /etc/sgml/aliases file to "-//SGMLtools//DOCUMENT Docbook Style Sheet for Print//EN"#print.ps, which in turn is mapped, through the sgmltools catalog file /usr/share/sgml/stylesheets/sgmltools/sgmltools.cat, to print.dsl#print.ps, i.e. the print.ps id of the print.dsl file in the same directory where sgmltools.cat is also located - phew...).


7.1.4.10. Housekeeping and special processing

Our documents are now all ready. What follows is just general housekeeping: move all documents into the HTML directory and remove the .tpt file. Remove all .pdf, .eps, .gif and .jpg images from ./images under $1 (myTemplate in our example; ./images in the current directory is not affected!). From what has been created, leave only $1.sgml and index.sgml in the current (working) directory:

mv $1.txt $1.rtf $1.pdf $1.ps.gz $1/
mv $1.html $1/
rm $1.tpt $1.log $1.aux $1.out $1.tex
rm $1/images/*.pdf $1/images/*.eps $1/images/*.gif $1/images/*.jpg
rm $1/images/*/*.pdf $1/images/*/*.eps $1/images/*/*.gif $1/images/*/*.jpg
cp $1.sgml $1/

You may continue with the processing, calling tar to create various archives, or sed to further tweak the HTML code of the files. I leave these steps as an example for the interested reader. If you don't need this special processing, you should comment it out!
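The two rm commands for the images print errors when a pattern matches nothing and only reach one level of subdirectories. A possible alternative with find is sketched here; clean_print_images is a hypothetical name, and unlike the original commands it removes matching files at any depth below images.

```shell
#!/bin/sh
# Sketch: remove print-only image formats below DOCDIR/images.
# Usage: clean_print_images DOCDIR
clean_print_images() {
  [ -d "$1/images" ] || return 0   # nothing to clean
  find "$1/images" -type f \
    \( -name '*.pdf' -o -name '*.eps' -o -name '*.gif' -o -name '*.jpg' \) \
    -exec rm -f {} +
}

# Possible use in the script: clean_print_images "$1"
```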


7.1.5. DSSSL stylesheets

In Section 4.2 we copied these .dsl files to the current working directory:

Every change you wish to make should normally go in one of these three files, and often in more than one, e.g. when your change affects the HTML output for both one and many files (“chunks”). You should leave the original files that came with the Norman Walsh stylesheets unchanged and put your changes in the above files instead. Technically, this amounts to writing your own “driver” file. This is the easiest way to incorporate your preferences and remain flexible during upgrades.

Note TXT and HOWTO print targets
 

Currently, the lyxtox-print-txt.dsl and lyxtox-print-howto.dsl are not used, but are included here for your convenience and for the sake of completeness. You can thus see the common pattern used in all print stylesheets and use it for your own purposes.

For the TXT format I currently use the “onehtml” target: after the one, big HTML file has been rendered, Lynx (the text browser) is called with the -dump and -nolist options to create a text version directly from HTML, without any stylesheet for text. Note that, for a very complex document like the PHP-Nuke HOWTO (more than 500 pages, 1000 links to external URLs, 1500 cross-references, 2500 Index entries, 400 files in the “chunked” HTML version), the call to sgmltools with the “-b txt” option breaks with an error (caused probably by too many warnings about cross-references), although it works with smaller and less complex documents.

The “howto” print target is not used at all, but is included following the example of the print.dsl file that comes with the original sgmltools package.

The basic driver file looks like this (see Customizing the Stylesheets):

<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY dbstyle SYSTEM "docbook.dsl" CDATA DSSSL>
]>
<style-sheet>
<style-specification use="docbook">
<style-specification-body>
;; your stuff goes here...
</style-specification-body>
</style-specification>
<external-specification id="docbook" document="dbstyle">
</style-sheet>

Make sure that you specify, in the system identifier, the full path to the docbook.dsl file that you want to customize; for example, /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/docbook.dsl (this is already done in the stylesheets provided, at least as long as your other settings, like Catalogs (Section 4.5) are correct).

You can add your own definitions, or redefinitions, of stylesheet rules and parameters where

;; your stuff goes here...

occurs in the example above.

A lot of work, research and experimentation has been invested in the above three stylesheet files, so let's explain some of the most important changes they introduce:

  • To generate a Table of Contents for articles, %generate-article-toc% has been set to true:

    (define %generate-article-toc%
      ;; Should a Table of Contents be produced for Articles?
      #t)
    
  • To get a title page for articles, %generate-article-titlepage% has been set to true:

    (define %generate-article-titlepage%
      ;; Should an article title page be produced?
      #t)
    
  • The CSS has been set to “ck-style.css”:

    (define %stylesheet%
      ;; Name of the stylesheet to use
      ;; #f)
      "ck-style.css")
    

    Note that this setting will be overwritten by the technique described in Chapter 8. You will have to enter the correct CSS in part2 too. The CSS file used here, ck-style.css, is a CSS for DocBook, especially crafted for the classes that appear in the automatically generated HTML code of the DocBook stylesheets. It incorporates some other fine features as well, like an elaborate font-controlling mechanism that passes the accessibility tests (see Chapter 9) and is still browser-independent, or a user-specified icon as a list bullet - up to a rotating globe that appears besides links that do not belong to a user-specifiable “local” domain (i.e. links that point “outside” the document collection currently viewed). The CSS file is discussed in detail in Section 7.1.8.

  • The width of the table that contains verbatim environments (like program listings) has been set to max. 95%. This is not the same as the value used in the CSS (see Section 7.1.8) for the PRE.SCREEN and PRE.PROGRAMLISTING environments, which should be 100%.

    (define ($table-width$)
      ;; REFENTRY table-width
      ;; PURP Calculate table width
      ;; DESC
      ;; This function is called to calculate the width of tables that should
      ;; theoretically be "100%" wide. Unfortunately, in HTML, a 100% width 
      ;; table in a list hangs off the right side of the browser window.  (Who's
      ;; mistake was that!).  So this function provides a way to massage
      ;; the width appropriately.
      ;;
      ;; This version is fairly dumb.
      ;; /DESC
      ;; AUTHOR N/A
      ;; /REFENTRY
      (if (has-ancestor-member? (current-node) '("LISTITEM"))
          "90%"
          "95%"))
    
  • “PDF” has been added to the list of mediaobject notations. Otherwise, no images will be displayed in PDF (see also Section 7.2.2):

    (define preferred-mediaobject-notations
      (list "PDF" "EPS" "PS" "JPG" "JPEG" "PNG" "linespecific"))
    (define preferred-mediaobject-extensions
      (list "pdf" "eps" "ps" "jpg" "jpeg" "png"))
    
  • The default graphic extension has been changed to “png”:

    (define %graphic-default-extension%
      ;; REFENTRY graphic-default-extension
      ;; PURP Default extension for graphic FILEREFs
      ;; DESC
      ;; The '%graphic-default-extension%' will be
      ;; added to the end of all 'fileref' filenames on
      ;; 'Graphic's if they do not end in one of the
      ;; '%graphic-extensions%'.  Set this to '#f'
      ;; to turn off this feature.
      ;; /DESC
      ;; AUTHOR N/A
      ;; /REFENTRY
      ;; #f)
      "png")
    
  • “pdf” and “tex” have been added to the list of graphic extensions - otherwise you will get errors (see Chapter 6):

    (define %graphic-extensions%
      ;; REFENTRY graphic-extensions
      ;; PURP List of graphic filename extensions
      ;; DESC
      ;; The list of extensions which may appear on a 'fileref'
      ;; on a 'Graphic' which are indicative of graphic formats.
      ;;
      ;; Filenames that end in one of these extensions will not have
      ;; the '%graphic-default-extension%' added to them.
      ;; /DESC
      ;; AUTHOR N/A
      ;; /REFENTRY
      '("gif" "jpg" "jpeg" "png" "tif" "tiff" "eps" "epsf" "pdf" "tex"))
    
  • Use graphics for admonitions (see Section 4.7) and callouts (see Section 4.8):

    (define %admon-graphics%
      ;; Use graphics in admonitions?
      #t)
    (define %callout-graphics%
      ;; If true, callouts are presented with graphics (e.g., reverse-video
      ;; circled numbers instead of "(1)", "(2)", etc.).
      ;; Default graphics are provided in the distribution.
      #t)
    
  • The admonition graphics path has been set to “./images”:

    (define %admon-graphics-path%
      ;; Path to admonition graphics
      ;; Sets the path, probably relative to the directory
      ;; where the HTML files are created, to the admonition
      ;; graphics.
      ;;
      ;; This needs to be "./images/" for tar distributed articles
      ;; This needs to be "../images/" for tar distributed Newbiedoc book
      ;; This needs to be "../images/" for individual articles on our website
      "./images/")
    
  • The callouts graphics path has been set to "./images/callouts/":

    (define %callout-graphics-path%
      ;; Sets the path, probably relative to the directory where the HTML
      ;; files are created, to the callout graphics.
      "./images/callouts/")
    
  • The callout graphics extension has been set to “png”:

    (define %callout-graphics-extension%
      ;; REFENTRY callout-graphics-extension
      ;; PURP Extension for callout graphics
      ;; DESC
      ;; Sets the extension to use on callout graphics.
      ;; /DESC
      ;; AUTHOR N/A
      ;; /REFENTRY
      ".png")
    
  • The code that produces the full name of admonitions has been changed:

    (define ($admon-graphic$ #!optional (nd (current-node)))
      ;; Admonition graphic file
      ;; Given an admonition node, returns the name of the
      ;; graphic that should be used for that admonition.
      (cond ((equal? (gi nd) (normalize "tip"))
             (string-append %admon-graphics-path% "tip."
                            %graphic-default-extension%))
            ((equal? (gi nd) (normalize "note"))
             (string-append %admon-graphics-path% "note."
                            %graphic-default-extension%))
            ((equal? (gi nd) (normalize "important"))
             (string-append %admon-graphics-path% "important."
                            %graphic-default-extension%))
            ((equal? (gi nd) (normalize "caution"))
             (string-append %admon-graphics-path% "caution."
                            %graphic-default-extension%))
            ((equal? (gi nd) (normalize "warning"))
             (string-append %admon-graphics-path% "warning."
                            %graphic-default-extension%))
            (else (error (string-append (gi nd) " is not an admonition.")))))
    
  • In lyxtox-print.dsl (which uses elements taken from Mandrake's manual-print.dsl), in order to be able to have URLs in PDF:

    ;;  Inserted in order to be able to get URLs in PDF documents.
    ;;  Adapted from manual-print.dsl of Mandrake.
    ;; Include the flow object class "formatting-instruction" : ONLY for Jade
    (declare-flow-object-class formatting-instruction
           "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")
    ;; *** URLs ***
    ;; Original : dblink.dsl
    (element ulink
     (sosofo-append
      ;; If you allow process-children here, you will get the text printed once more!
      ;; (process-children)             ;; Write the text with its format (anchor in HTML)
      (make formatting-instruction      ;; Write : " \href{" + theUrl + "}{" + theText + "}"
        data: (string-append " \\href{" (attribute-string (normalize "url")) "}{" (data-of (current-node)) "}")
      )
     )
    )
    ;; These three elements are from "dbindex.dsl".
    ;; Must be placed here because of the redefinition of "ulink".
    ;; Otherwise the Index entries will point to HTML files,
    ;; instead of page numbers.
    (element (primaryie ulink)
      (indexentry-link (current-node)))
    (element (secondaryie ulink)
      (indexentry-link (current-node)))
    (element (tertiaryie ulink)
      (indexentry-link (current-node)))
    
  • To be able to use Computer Modern fonts:

    ;;  Inserted in order to be able to use Computer Modern fonts  in PS and PDF documents.
    ;;  The font names _must_ be written exactly as follows ("Computer-Modern", 
    ;;  not "Computer Modern") for pdfjadetex to recognize them and use the T1
    ;;  fonts instead of the PK (Type 3) ones (which will look ugly on screen)!
        ;;
        ;;  Gnuishly correct fonts...
        ;;
        (define %body-font-family% "Computer-Modern")
        (define %mono-font-family% "Computer-Modern-Typewriter")
        (define %title-font-family% "Computer-Modern-Sans")
        (define %admon-font-family% "Computer-Modern-Sans")
        (define %guilabel-font-family% "Computer-Modern-Sans")
    
  • To use chapter/section/subsection labels as filenames for the respective HTML files:

    (define %use-id-as-filename%
      ;; Use ID attributes as name for component HTML files?
      #t)
    
  • In lyxtox-html.dsl, to be able to control what section levels get put into separate HTML files (chunks) in the chunked version, the chunk-section-depth has been included (taken from the html/dbchunk.dsl file, which on my system is /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/dbchunk.dsl with full name):

    (define (chunk-section-depth)
      2)
    

    A value of 1 for chunk-section-depth means that the chunks are the individual SECT1 elements, i.e. the Sections. A value of 2 means that the chunks are the individual SECT2 elements, i.e. the SubSections and so on. SECT1, SECT2 and SECT3 are the SGML tags used to denote a Section, SubSection and SubSubSection respectively. There is no point in using chunk-section-depth with values higher than 3, at least not with LyX, since SubSubSection is the deepest level you can use in LyX (at least with the default settings).

  • However, as painful experimentation has shown, setting a higher value for chunk-section-depth will have no effect, if “sect2”, “sect3” etc. are not contained in the chunk-element-list list. The list

    (define (chunk-element-list)
      (list (normalize "preface")
            (normalize "chapter")
            (normalize "appendix")
            (normalize "article")
            (normalize "glossary")
            (normalize "bibliography")
            (normalize "index")
            (normalize "colophon")
            (normalize "setindex")
            (normalize "reference")
            (normalize "refentry")
            (normalize "part")
            (normalize "sect1")
            (normalize "sect2")
            (normalize "section")
            (normalize "book") ;; just in case nothing else matches...
            (normalize "set")  ;; sets are definitely chunks...
            ))
    

    contains the elements that constitute chunks. If you take an element out of the list, the chunk-section-depth above will not have any effect. Thus, to have chunks on sect2, sect3 etc., you must have them included in this list. chunk-element-list is originally defined in the html/dbchunk.dsl file (just like chunk-section-depth), which on my system is /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/dbchunk.dsl with full name. See also Naming of generated html code from sgml documents. Note that if you include “sect3” (SubSubSections) in the chunk-element-list list, you will get chunks on SubSubSections, no matter if your chunk-section-depth is only 2 and not 3!

  • To force the table of contents to be on a separate HTML page and not contain any section:

    (define (chunk-skip-first-element-list)
      ;; forces the Table of Contents on separate page
      '())
    

    The reason that Norm added support for merging the first sect1 into the file with its parent is because the alternative is really ugly in the case that you have a chapter (title) immediately followed by a section (title) (see Use of <Para> within <ListItem> mangles list items).

    Tip Tip
     

    That's why you should try to put at least some introductory text after a chapter title and before the first section of the chapter!

    FIXME: LyX inserts an extra </listitem> here, if I don't have this line...

  • To get a list of figures at the start of the document, the generate-book-lot-list must contain “figure”, as in the following example:

    (define ($generate-book-lot-list$)
      ;; REFENTRY generate-book-lot-list
      ;; PURP Which Lists of Titles should be produced for Books?
      ;; DESC
      ;; This parameter should be a list (possibly empty) of the elements
      ;; for which Lists of Titles should be produced for each 'Book'.
      ;;
      ;; It is meaningless to put elements that do not have titles in this
      ;; list.  If elements with optional titles are placed in this list, only
      ;; the instances of those elements that do have titles will appear in
      ;; the LOT.
      ;;
      ;; /DESC
      ;; AUTHOR N/A
      ;; /REFENTRY
    (list (normalize "table")
    (normalize "figure")
    (normalize "example")
    (normalize "equation")))
    

    The same must be the case if you want a list of tables (“table”), examples (“example”) or equations (“equation”). Normally, you will not have to touch this (i.e. your customization layer need not contain this code), as the above setting is the default one (for books).

  • To be able to display “other credit”, “release info” and “publisher” information on a book's title page, these elements must be added to the book-titlepage-recto-elements list:

    (define (book-titlepage-recto-elements)
      ;; elements on a book's titlepage
      ;; note: added revhistory to the default list
      ;; note: added othercredit to the default list
      ;; note: added releaseinfo to the default list
      ;; note: added publisher to the default list
      (list (normalize "title")
            (normalize "subtitle")
            (normalize "graphic")
            (normalize "mediaobject")
            (normalize "corpauthor")
            (normalize "authorgroup")
            (normalize "author")
            (normalize "othercredit")
            (normalize "releaseinfo")
            (normalize "publisher")
            (normalize "editor")
            (normalize "copyright")
            (normalize "pubdate")
            (normalize "revhistory")
            (normalize "abstract")
            (normalize "legalnotice")))
    
  • and the same must be done for the article-titlepage-recto-elements list, if we want to display them in articles' title pages too:

    (define (article-titlepage-recto-elements)
      ;; elements on an article's titlepage
      ;; note: added othercredit to the default list
      (list (normalize "title")
            (normalize "subtitle")
            (normalize "authorgroup")
            (normalize "author")
            (normalize "othercredit")
            (normalize "releaseinfo")
            (normalize "copyright")
            (normalize "pubdate")
            (normalize "revhistory")
            (normalize "abstract")
            (normalize "legalnotice")))
    
  • and the following code must be added:

    (define (process-contrib #!optional (sosofo (process-children)))
      ;; print out with othercredit information; for translators, etc.
      (make sequence
        (make element gi: "SPAN"
              attributes: (list (list "CLASS" (gi)))
              (process-children))))
    (define (process-othercredit #!optional (sosofo (process-children)))
      ;; print out othercredit information; for translators, etc.
      (let ((author-name  (author-string))
            (author-contrib (select-elements (children (current-node))
                                              (normalize "contrib"))))
        (make element gi: "P"
             attributes: (list (list "CLASS" (gi)))
             (make element gi: "B"
                  (literal author-name)
                  (literal " - "))
             (process-node-list author-contrib))))
    (mode article-titlepage-recto-mode
      (element contrib (process-contrib))
      (element othercredit (process-othercredit))
    )
    (mode book-titlepage-recto-mode
      (element contrib (process-contrib))
      (element othercredit (process-othercredit))
    )
    (define (article-title nd)
      (let* ((artchild  (children nd))
             (artheader (select-elements artchild (normalize "artheader")))
             (artinfo   (select-elements artchild (normalize "articleinfo")))
             (ahdr (if (node-list-empty? artheader)
                       artinfo
                       artheader))
             (ahtitles  (select-elements (children ahdr)
                                         (normalize "title")))
             (artitles  (select-elements artchild (normalize "title")))
             (titles    (if (node-list-empty? artitles)
                            ahtitles
                            artitles)))
        (if (node-list-empty? titles)
            ""
            (node-list-first titles))))
    
  • ...and many other details. You are encouraged to read the source of the stylesheets (lyxtox-html.dsl, lyxtox-onehtml.dsl, lyxtox-print.dsl, lyxtox-print-pdf.dsl, lyxtox-print-ps.dsl, lyxtox-print-rtf.dsl, lyxtox-print-txt.dsl, lyxtox-print-howto.dsl)!

Tip Important DocBook files
 

FYI, all changes presented here refer to variables that were originally defined in one of the following files:

  • /usr/share/sgml/stylesheets/sgmltools/print.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/dbparam.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/dbparam.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/db31.dsl

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/print/dbchunk.dsl

As said above, you should not change these files directly, because you will run into a lot of work when you upgrade them.

For a list of all parameters, with references to the files that use them, see the

  • /usr/share/sgml/docbook/docbook-dsssl-stylesheets-1.72/html/XREF

file. If you are looking for a parameter and you can't find the file it's in, the XREF file is your friend.


7.1.6. Inline graphics

There is no way to include inline graphics with the method described in this document - here's why:

In theory, you could put something like

/^.*[^<]*<inlinegraphic/{
s/<inlinegraphic fileref="\([^"]*\)"[^>]*>/\
   <inlinemediaobject>\
      <\!\[ \%output\.print\.png; \[\
      <imageobject>\
         <imagedata fileref="\.\/images\/\1.png" format="PNG">\
      <\/imageobject>\
      \]\]>\
      <\!\[ \%output\.print\.pdf; \[\
      <imageobject>\
         <imagedata fileref="\1.pdf" format="PDF" scale="65">\
      <\/imageobject>\
      \]\]>\
      <\!\[ \%output\.print\.eps; \[\
      <imageobject>\
         <imagedata fileref="\1.eps" format="EPS">\
      <\/imageobject>\
       \]\]>\
      <\!\[ \%output\.print\.bmp; \[\
      <imageobject>\
         <imagedata fileref="\1.bmp" format="BMP">\
      <\/imageobject>\
       \]\]>\
      <textobject>\
         <phrase>Inline graphic<\/phrase>\
      <\/textobject>\
   <\/inlinemediaobject>/g
}

in sedscr and thus substitute the <inlinegraphic> element (which is produced by LyX whenever you choose Insert-->Include File, then “Verbatim”) with an <inlinemediaobject> along the lines of Section 7.1.4.1. But LyX also uses <inlinegraphic> for verbatim inclusions of files. We make use of this feature in Section 5.14, where we see how to include potentially problematic SGML code in program listings. Unfortunately, there is no way to tell which purpose an <inlinegraphic> element serves in the SGML file that is exported from LyX: a file inclusion of program code, or an inline image. The code above would make the substitution in both cases, which would definitely mess things up.

Tip Tip
 

Nevertheless, you can try it, if you are sure that your code does not include any other files, only images. There is a caveat though: you should create files with the same basename (i.e. without the extension) in the working directory, in addition to the ones you create in ./images. Example: if you want to include an icon named wink.png inline, then you have to create wink.png, wink.eps, wink.pdf and wink.bmp in ./images (as shown in Section 4.9) and a file named "wink" in ./ (it may be empty, it doesn't really matter). This is because, when you choose Insert-->Include File, LyX will not allow you to enter the name of a nonexistent file in the filename field and, on the other hand, you must enter the filename without the extension and without the directory path, in order for the above code to work in sedscr.
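
The file layout that this workaround needs can be prepared with a few shell commands. The following is only a sketch under stated assumptions: the function name prepare_inline_icon is hypothetical and not part of lyxtox, and the four image variants are assumed to exist already in ./images.

```shell
# Hypothetical helper, not part of lyxtox: check that the four image
# variants of an inline icon exist in ./images, then create the (empty)
# placeholder file that LyX needs in the working directory.
prepare_inline_icon() {
  name=$1
  missing=0
  for ext in png eps pdf bmp; do
    if [ ! -f "./images/$name.$ext" ]; then
      echo "missing: ./images/$name.$ext" >&2
      missing=1
    fi
  done
  # Refuse to create the placeholder if any image variant is absent.
  [ "$missing" -eq 0 ] || return 1
  # The placeholder only has to exist; its content does not matter.
  touch "./$name"
}
```

Called as "prepare_inline_icon wink" from the working directory, it should leave a file named wink next to the LyX document, which can then be chosen in the Insert-->Include File dialog.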


7.1.7. Catalogs

A catalogue is a text file containing the rules that translate public identifiers to files on the system.

The identifier systems used by SGML and by some tools are based on catalogues that perform the translation of these identifiers to files that hold the necessary definitions. For tools to be able to find the necessary catalogue(s), the environment variable SGML_CATALOG_FILES should be set, as explained in Section 7.1.3.

On my system, the sgmltools-lite package installed the /etc/sgml/catalog file. Its content can be used to set SGML_CATALOG_FILES as follows:

SGML_CATALOG_FILES="/usr/share/sgml/CATALOG.iso_ent"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook_3"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook_4"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/openjade/catalog"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/stylesheets/sgmltools/sgmltools.cat"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/dtd/sgmltools/catalog"
export SGML_CATALOG_FILES

However, as recent versions of lyxtox do not use sgmltools, I use the relevant (and only those!) lines of my “master” catalog file in /etc/sgml/catalog to define SGML_CATALOG_FILES as:

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.iso_ent"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook-dsssl-stylesheets"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.mathml-2.0"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.svg-1.1"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/CATALOG.docbook_4"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/sgml/openjade/catalog"
SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/refdb/refdb.cat"
export SGML_CATALOG_FILES

Generally, you'll need to set the SGML_CATALOG_FILES environment variable to all the catalogs that you have under the directory in which you installed the DocBook stylesheets (probably something like /usr/share/sgml or /usr/local/sgml; see Section 3.2 and Section 3.3 for the packages that install stylesheets).
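
If you do not want to type the list by hand, a small shell function can collect the catalogs for you. This is only a sketch: build_catalog_list is a hypothetical name, and the function assumes that the catalog files are named CATALOG.something or catalog and lie directly in the given directory. Catalogs in subdirectories, like /usr/share/sgml/openjade/catalog above, would still have to be added separately.

```shell
# Hypothetical helper, not part of lyxtox: join every catalog file found
# directly under the directory $1 into a colon-separated list.
build_catalog_list() {
  list=""
  for f in "$1"/CATALOG.* "$1"/catalog; do
    # Skip unmatched glob patterns and anything that is not a regular file.
    [ -f "$f" ] || continue
    # Prepend a colon only if the list already has an entry.
    list="${list:+$list:}$f"
  done
  printf '%s\n' "$list"
}

# Possible use (the directory is an assumption about your installation):
#   SGML_CATALOG_FILES=$(build_catalog_list /usr/share/sgml)
#   export SGML_CATALOG_FILES
```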

If you want to learn more on catalogues and the way they are constructed, see Creating and modifying catalogues.


7.1.8. CSS

See Section 4.14 and Chapter 8 for the details of using a CSS for the HTML output. The proposed ck-style.css is a CSS tailored to the HTML produced by DocBook. It also takes into account some modern approaches to accessibility (see Chapter 9 for more on accessibility of DocBook generated HTML pages).

For example, you can control the navigation header style with

DIV.NAVHEADER {
        color: #000000;
        background-color: #EFEFF8;
        padding: 5px;
        margin-bottom: 10px;
        width: 100%;
        border: thin solid #a0a0d0;
}

and the navigation footer style with

DIV.NAVFOOTER {
        color: #000000;
        background-color: #EFEFF8;
        padding: 5px;
        margin-top: 10px;
        width: 100%;
        border: thin solid #a0a0d0;
}

The concepts of margin, border and padding follow a page model that is described in W3C's working draft CSS3 Paged Media Module, version of Dec. 18th 2003. Figure 7-1, taken from this document (Copyright © 2003 W3C (MIT, ERCIM, Keio), All Rights Reserved) illustrates the various geometric notions of this page model. Note that the XSL area model is deliberately very similar to the CSS one.

Figure 7-1. CSS page area model.


If you output admonitions as tables, rather than graphics (see Section 4.7), then you can control their style with code like the following (shown here for the "Important" admonition):

TABLE.IMPORTANT
{
        font-style:italic;
        border: solid 2px #ff0000;
        width: 70%;
        margin-left: 15%;
}

Screen output (code) is controlled by

PRE.SCREEN
{
        font-family:monospace;
        white-space: pre;
        width: 100%;
        background-color: #ffffcc;
        border:solid;
        color: #000000;
        border-color: #009999;
        border-left: solid #009999 2px;
        border-right: solid #009999 2px;
        border-top: solid #009999 2px;
        border-bottom: solid #009999 2px;
        padding-left: 15pt;
}

while examples and the Table of Contents by

DIV.EXAMPLE,DIV.TOC {
        border: thin dotted #70AAE5;
        padding-left: 10px;
        padding-right: 10px;
        color: #000000;
        background-color: #EFF8F8;
}
DIV.TOC {
        margin-left: 20px;
        margin-right: 20px;
        width: 95%;
}

An “external link” icon is added to absolute links (i.e. links starting with “http:”) through

/* Add an external-link icon to absolute links */
a[href^="http:"] {
        background: url(images/remote.gif) right center no-repeat;
        padding-right: 12px;
}
a[href^="http:"]:hover {
        background: url(images/remote_a.gif) right center no-repeat;
}

However, this alone would put the icon on every link with an absolute URL, including links pointing to the local domain. This is corrected by

/* ...but not to absolute links in this domain... */
a[href^="http://www.karakas-online.de"] {
        background: transparent;
        padding-right: 0px;
}
a[href^="http://www.karakas-online.de"]:hover {
        background: transparent;
}

“External link” icons tell you what a link will do before you click on it. There are icons specifically designed for this purpose, like QBullets. QBullets are a collection of elegant, animated icons that attach to hypertext links to indicate their function. You can download QBullets for free from matterform media.

To use this idea for footnotes, the href attribute is used as a selector:

/* Add a note icon to footnote links */
a[href^="#FTN"] {
        background: url(images/qbullet-note.gif) right center no-repeat;
        padding-right: 12px;
}
a[href^="#FTN"]:hover {
        background: url(images/qbullet-note_a.gif) right center no-repeat;
}

This will select all links whose href attribute starts with "#FTN" and append a note icon to them, which will even show an animated page curl when the mouse passes over it (hover). The links whose href attribute starts with “#FTN” look like

<a name="AEN1175" href="#FTN.AEN1175">[1]</a>

and point to a footnote. The footnote itself also contains a link - which points back to the referring text. That link will not be affected by the above selection, since its href attribute does not start with “#FTN”:

<a name="FTN.AEN1175" href="explain-runsed-sed-sedscr.html#AEN1175">[1]</a>

To display a back icon beside those links, a selector on the name attribute is used:

/* ...and a back icon to the backlinks in the footnotes themselves */
a[name^="FTN"] {
        background: url(images/scrollup.gif) right center no-repeat;
        padding-right: 12px;
}
a[name^="FTN"]:hover {
        background: url(images/scrollup_a.gif) right center no-repeat;
}

Note String matching on attributes is a CSS3 feature!
 

To be able to use string matching on attributes, as we have done in the QBullets examples above, the user must view our document with a browser that supports this CSS3 feature. If you are wondering whether your browser belongs to this cutting edge category (Mozilla 1.5 does, hint, hint), you can run the W3C browser test on CSS selectors and test for "Substring matching attribute selector".

To get an icon in place of the usual bullet in itemized lists, the list-style property is used for the UL tag:

UL {
        margin-bottom: 10px;
        list-style: url(images/tux-bullet.png) square;
}

Last but not least, a cross-browser relative font setting can be achieved with

P {
        font-size: 12px;
}
/*/*/A{}
BODY P {
        font-size: x-small;
        voice-family: "\"}\"";
        voice-family: inherit;
        font-size: small;
}
HTML>BODY P {
        font-size: small;
}
/* */

which is indeed too complicated to explain here in all its depth. See Dive into Accessibility, day 26 for this.


7.1.9. Appendix

As already noted in Section 5.18, we can't include the Appendix in the main document, even if we mark it as an appendix with LyX's appropriate check box in the Layout -> “Start Appendix here” menu item. We must include an extra LyX document of DocBook article type that is marked as an Appendix itself and contains our Appendix.

If such a file with the name appendix.lyx exists, lyxtox will take a series of actions to incorporate it into the final SGML document automatically:

  • The appendix.lyx file is exported to SGML:

    # Export the Appendix to DocBook SGML.
    if test -e appendix.lyx; then
      $LYX -e docbook appendix.lyx
    fi
    
  • A series of SGML code corrections takes place through successive runs of three sed scripts:

    if test -e appendix.lyx; then
      $RUNSED $SEDSCRABI  $1.sgml
      $RUNSED $SEDSCR appendix.sgml
      $RUNSED $SEDSCRAPP appendix.sgml
    fi
    
    1. With the sedscr_abi sed script, the ending tags </book> or </article> are substituted with SGML code that inserts the SGML entities of the Appendix, Bibliography and Index as defined in the Preamble (see Section 4.6). For example, the following sed commands from sedscr_abi:

      /<\/book>/s/<\/book>/\
      \&appendix;\
      \&bibliography;\
      \&index;\
      <\/book>/
      

      will substitute

      </book>

      with

      &appendix;
      &bibliography;
      &index;
      </book>
      

      Note that the inserted lines before the closing </book> tag contain SGML entities that were defined as SYSTEM files with the names appendix.sgml, bibliography.sgml and index.sgml respectively in the Preamble (see Section 4.6). Similar sed commands in sedscr_abi will do the same for the closing </article> tag.

    2. The sedscr sed script will correct the SGML code of appendix.sgml, as it will do for the main document. See Section 7.1.4.1 for an explanation of its inner workings.

    3. Finally, the sedscr_app sed script will replace the first line of appendix.sgml and its subsequent 3 lines with the correct SGML incantation for an Appendix:

      <appendix label="A"><title>Appendix</title>
      

      It also replaces the closing </article> tag (remember that we created the appendix.lyx file as a document of type "DocBook article (SGML)", see Section 5.18) with the right </appendix> one.

  • If Mathematics processing is ON (with the variable $process_math set to 1 at the start of lyxtox), the appendix.sgml file undergoes the same transformations as the main document using the awkscr_math awk script (explained in Section 10.3):

      $AWK -f $AWKSCRMATH appendix.sgml > appendix.awk.sgml
      mv appendix.awk.sgml appendix.sgml
    
  • Openjade (Section 3.4), called by the lyxtox script, will process both the main document and appendix.sgml and will insert the contents of the latter in place of the appendix SGML entity that was inserted by sedscr_abi above.
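
The effect of the sedscr_abi rule quoted in the first sed step above can be checked in isolation on a tiny document. This is a sketch only: the function name insert_abi_entities and the three-line sample document are made up for the illustration, while the sed command itself is the one shown above.

```shell
# Sketch: apply the </book> rule quoted from sedscr_abi to standard input.
# The function name is hypothetical; lyxtox runs the rule through runsed.
insert_abi_entities() {
  sed '/<\/book>/s/<\/book>/\
\&appendix;\
\&bibliography;\
\&index;\
<\/book>/'
}

# A made-up minimal document, just to see the substitution at work.
printf '%s\n' '<book>' '<title>Demo</title>' '</book>' | insert_abi_entities
```

Since the replacement text starts with a line break, the closing </book> tag should now be preceded by an empty line and the three entity references.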


7.1.10. Bibliography

To create a bibliography section, LyX offers two possibilities: one using the citation keys of your free choice and one using keys of bibliographic entries in a BibTeX database. Neither of them produces SGML code when the document is exported to SGML, so we have to search for other methods. Before I explain the RefDB method, which suits our purposes perfectly, I will first describe both standard methods to give you an idea of the classic bibliography creation process in a TeX/LaTeX system with LyX. You can read more about them in the LyX User's Guide and Extended Features Guide respectively, both accessible from LyX's Help menu.


7.1.10.1. Standard Bibliography methods in LyX

Both standard methods have in common that they make use of the Bibliography environment (Section 5.1) to list references. When you first open a Bibliography environment, LyX adds a large vertical space, followed by the heading “Bibliography” or “References,” depending on the document class. The heading is in a large boldface font. Each paragraph of the Bibliography environment is a bibliography entry.

At the beginning of the first line of each paragraph, you will see a grayish box showing a number. If you click on it, you will get a dialog in which you can set a key and a label. The key is the symbolic name by which you will refer to this bibliography entry. For example, suppose your first entry in the bibliography was a book about LaTeX. We could choose the key “latexguide” for that entry. You can also give a label, which will be displayed in the gray inset box.

The key field isn't useless. You can refer to your bibliography entries using the Insert->Citation Reference command. Just choose the key in the Bibliography keys panel, then transfer it to the Inset Keys panel with the left arrow. Multiple references can be placed by selecting more than one key. This is the first standard way of creating a bibliography in LyX.

The second standard way of creating a bibliography with LyX uses BibTeX. BibTeX is a system for creating a large database of your most used journal references. For all future articles you write, you only need to include this standard database and cite the appropriate key for each reference.

To use BibTeX for inserting references in your document, proceed as follows: at the very end of your document, select Insert->Lists & TOC->BibTeX Reference. In the resulting dialog, fill out the dialog boxes as follows:

Database:

enter the name of your .bib file *without* the .bib extension. For searching multiple .bib files, just enter them in the desired order, separated by commas.

Style:

enter the name of your BibTeX style file *without* the .bst extension. The default style is plain (which should be included in your LaTeX distribution, so you don't have to worry about creating it).

For each citation, assuming that the source is in the .bib file, just call Insert->Citation Reference at the correct location in the text, and enter the appropriate reference key.


7.1.10.2. The RefDB method

The problem with both methods is that they don't “speak” SGML. Of course, you can use BibTeX to create a DVI document containing citations and bibliography and from there take the route to PS and PDF, or even to HTML through a tool like LaTeX2HTML, but this is not the route we want to take. We want to take the route through SGML - and that's where both methods fail to follow us.

The general principle of the RefDB method is straightforward: each citation that you want to be treated as a RefDB citation needs to have a role attribute with the value "REFDB". Each citation defines at least one xref element. The value of the linkend attribute encodes the ID of the required reference in the database (if you need references in several databases, this attribute can additionally specify the database). RefDB uses this information to generate a DocBook bibliography element. This contains an entry for each requested reference. These entries are labelled with ID attributes that match the xref linkend attributes in the text. Each RefDB-generated reference entry defines an xreflabel attribute which holds the text that is to be displayed at the position of the corresponding xref elements.

This is all it takes for single and unique citations, i.e. with one xref element per citation element and only one occurrence throughout the text. Both multiple occurrences of the same citation in the text and multiple citations (more than one xref element per citation element) make things a bit more difficult.

Some output formats require a different formatting for the first citation of a publication in the text and all subsequent citations of the same publication. The first citation is identical with the above mentioned default case. All following citations of the same publication need an additional xref endterm attribute which points to an additional bibliomset element which in turn contains the text to be displayed for subsequent citations. The endterm attribute has the same value as the linkend attribute except that the letter "S" (as in subsequent) is appended to the attribute. See Processing expectations for the refdb DocBook bibliography output.

Compare the way you would use BibTeX with LaTeX (and LyX) to the way you use RefDB (10) to produce a Bibliography (see RefDB and BibTeX comparison):

With the BibTeX method, you proceed as follows:

  • You enter your LaTeX references into a flat-file database in the BibTeX format.

  • You use style files in the powerful but somewhat cryptic BibTeX format for your LaTeX documents.

  • In a LaTeX document, you specify with the \bibliography command which external bibliography file to use. You specify with \cite commands which references you want to cite (and appear in your bibliography). LyX does this automatically for you.

  • You run latex on your LaTeX document. This will create an .aux file which contains (among other stuff) a list of all citations.

  • Then you run bibtex on your LaTeX document. This will use the bibliography style you specified in the document and will create a cooked bibliography in a .bbl file.

  • Finally you run latex once or twice again to finalize your document.

With the RefDB method, you proceed as follows:

  • You enter your references for an SGML or XML document into a SQL database. Input can be either RIS, DocBook, or BibTeX. As RefDB is Unix-style, you can write other import filters in any language that can send output to either stdout or to a file. Using a SQL database means better scalability for large collections and added benefit if you share your data with colleagues (think workgroups, departments, access control...).

  • You specify the bibliography and citation styles for SGML and XML documents in XML files which are essentially templates for the sequence and appearance of bibliography and citation elements. These are also loaded into a database. This means they are pre-parsed, which speeds up the formatting of bibliographies.

  • In an SGML or XML document, you specify an external entity with the name of the SGML or XML file that will contain your bibliography. In DocBook documents you specify with <citation><xref..></citation> constructs which references you want to cite. Parenthetical and textual citations are supported.

  • RefDB uses a DSSSL script to extract a list of all citations from SGML or XML documents into an XML document (which you can edit to add other references that are not cited).

  • Then you run a RefDB tool on your SGML or XML document. This will use the bibliography style you specified on the command line and will create a cooked bibliography in an SGML or XML file. It will also create a small stylesheet driver file specific for your bibliography style.

  • Finally you run Jade or an XSLT processor on your document to transform it to the final output. This step uses the RefDB-created driver files to format the RefDB bibliographies (leaving alone potential other bibliographies) and the RefDB citations (leaving alone potential other citations). The stylesheet driver files essentially take care of character properties like font weight, posture etc. for various parts of the citation or bibliography. The citations are neatly hyperlinked with the references in the bibliography in all output formats that support this.

Note Your sources remain intact!
 

Please note that neither BibTeX nor RefDB does any "search-and-replace"-style mangling of your sources. The cooked bibliography is kept in an external file in both cases. This way it is easy to reformat your document for a different bibliography style without touching the document source. And the whole thing works for DocBook, TEI, and any other reasonable DTD (with a little stylesheet tweaking, that is).

Let's see how lyxtox handles all this complexity in the background for you:

When sedscr is run through runsed from lyxtox:

$RUNSED $SEDSCR $1.sgml

it makes a lot of changes to the SGML file $1.sgml that was previously exported by LyX (see Section 7.1.4.1 for an in-depth description of them). The following code in sedscr takes care of citations that were entered as described in Section 5.19.2:

s/<xref linkend="cit:\([^"]*\)">/\
<citation role="REFDB">\1<\/citation>\
/g

This sed code will substitute every cross-reference to a label of the form “cit:IDsome” with a citation of the form:

<citation role="REFDB">IDsome</citation>

in the exported SGML file. Then, lyxtox will call refdbxp:

$REFDBXP -t db31 < $1.sgml > $1.full.sgml
mv $1.full.sgml $1.sgml

to transform all citations of the above form to one of the form:

<citation role="REFDB"><xref linkend="IDsome"></citation>
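
The first of these two transformations, the sed substitution, can be tried on a single line without LyX or RefDB. This is a sketch: the function name mark_refdb_citations and the citation ID Miller1999 are made up for the illustration, while the sed command is the one quoted from sedscr above.

```shell
# Sketch: the citation rule quoted from sedscr, applied to standard input.
# The function name is hypothetical; lyxtox runs the rule through runsed.
mark_refdb_citations() {
  sed 's/<xref linkend="cit:\([^"]*\)">/\
<citation role="REFDB">\1<\/citation>\
/g'
}

# Miller1999 is an invented reference ID, used only as sample input.
echo 'See <xref linkend="cit:Miller1999"> for details.' | mark_refdb_citations
```

The citation element should end up on a line of its own, because the replacement text begins and ends with a line break. Cross-references whose label does not start with "cit:" are left untouched.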

But how are the citations and references going to be formatted? We need special DSSSL instructions for those parts of our document! Fortunately, RefDB provides us with the appropriate stylesheet. It is generated dynamically each time, using as input the style name we entered in lyxtox as the value of the REFDB_style variable (see Section 4.13), with a call to runbib from lyxtox:

$RUNBIB -d $RefDB_db -S $REFDB_style -t db31 $1.sgml

This will produce the stylesheet ${REFDB_style}dsl (notice the absence of a dot before the dsl ending). For example, if REFDB_style is “J.Biol.Chem.”, then the above call to runbib will create J.Biol.Chem.dsl.
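
The braces in ${REFDB_style}dsl matter. A minimal sketch of how the shell builds the file name, using the style name from the example above (the variable name stylesheet is introduced here only for the illustration):

```shell
# The style name already ends with a dot, so no extra dot is added.
REFDB_style="J.Biol.Chem."
# The braces delimit the variable name; without them the shell would look
# for a (most likely unset) variable named REFDB_styledsl.
stylesheet="${REFDB_style}dsl"
echo "$stylesheet"
```

This prints J.Biol.Chem.dsl, the name of the file that runbib creates in our example.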

J.Biol.Chem.dsl looks like this:

<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY docbook-refdb-html.dsl PUBLIC "-//Markus Hoenicka//DOCUMENT 
RefDB DocBook html stylesheet//EN" CDATA DSSSL>
<!ENTITY docbook-refdb-print.dsl PUBLIC "-//Markus Hoenicka//DOCUMENT 
RefDB DocBook print stylesheet//EN" CDATA DSSSL>
]>
<style-sheet>
<style-specification id="html" use="docbook-refdb-html">
<style-specification-body>
(define ABSTJOURNALNAMESTYLE "ITALIC")
(define ABSTVOLUMESTYLE "BOLD")
(define CHAPBOOKTITLESTYLE "ITALIC")
(define BOOKBOOKTITLESTYLE "ITALIC")
(define INPRJOURNALNAMESTYLE "ITALIC")
(define JOURJOURNALNAMESTYLE "ITALIC")
(define JOURVOLUMESTYLE "BOLD")
(define NEWSJOURNALNAMESTYLE "ITALIC")
(define NEWSVOLUMESTYLE "BOLD")
</style-specification-body>
</style-specification>
<style-specification id="print" use="docbook-refdb-print">
<style-specification-body>
(define ABSTJOURNALNAMESTYLE "ITALIC")
(define ABSTVOLUMESTYLE "BOLD")
(define CHAPBOOKTITLESTYLE "ITALIC")
(define BOOKBOOKTITLESTYLE "ITALIC")
(define INPRJOURNALNAMESTYLE "ITALIC")
(define JOURJOURNALNAMESTYLE "ITALIC")
(define JOURVOLUMESTYLE "BOLD")
(define NEWSJOURNALNAMESTYLE "ITALIC")
(define NEWSVOLUMESTYLE "BOLD")
</style-specification-body>
</style-specification>
<external-specification id="docbook-refdb-html" document="docbook-refdb-html.dsl">
<external-specification id="docbook-refdb-print" document="docbook-refdb-print.dsl">
</style-sheet>

This means the following:

For online output (id="html"), take those define's into account and then proceed to use the part of the stylesheets with ID “docbook-refdb-html” (use="docbook-refdb-html"). Openjade will look at the external specification with id “docbook-refdb-html” at the end:

<external-specification id="docbook-refdb-html" document="docbook-refdb-html.dsl">

and will see that it is the document whose name is the "docbook-refdb-html.dsl" SGML entity (document="docbook-refdb-html.dsl"). It will then consult the entity declarations at the start of the stylesheet:

<!ENTITY docbook-refdb-html.dsl PUBLIC "-//Markus Hoenicka//DOCUMENT 
RefDB DocBook html stylesheet//EN" CDATA DSSSL>

and find out that the docbook-refdb-html.dsl entity is the one with the public identifier "-//Markus Hoenicka//DOCUMENT RefDB DocBook html stylesheet//EN". It will then search the catalog files (see Section 4.5) and, since the lyxtox script stores the RefDB catalog file in its SGML_CATALOG_FILES variable:

SGML_CATALOG_FILES="$SGML_CATALOG_FILES:/usr/share/refdb/refdb.cat"

openjade will finally be able to locate the document containing the rest of the DSSSL instructions necessary for processing the Bibliography. It turns out that, on my system, this is /usr/share/refdb/dsssl/html/docbook-refdb.dsl, but this depends on how you compiled RefDB, more precisely on the value of the --prefix option. The same procedure is followed for the print output (the id="print" part).
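
If openjade complains that it cannot find the stylesheet, it helps to check by hand which catalog file, if any, declares the public identifier. The following is a rough sketch and not openjade's actual resolution algorithm, which knows more entry types than PUBLIC. The function name lookup_public_id is hypothetical, and the sketch assumes that the catalog entry is written with exactly one space between the PUBLIC keyword and the quoted identifier.

```shell
# Hypothetical helper: print every file listed in SGML_CATALOG_FILES that
# contains a PUBLIC entry for the public identifier given as $1.
lookup_public_id() {
  pubid=$1
  old_ifs=$IFS
  IFS=:                      # split the variable at the colons
  for cat_file in $SGML_CATALOG_FILES; do
    # Skip empty fields and catalogs that do not exist on this system.
    [ -f "$cat_file" ] || continue
    # Fixed-string search; assumes a single space after PUBLIC.
    if grep -F -- "PUBLIC \"$pubid\"" "$cat_file" >/dev/null; then
      echo "$cat_file"
    fi
  done
  IFS=$old_ifs
}

# Possible use:
#   lookup_public_id "-//Markus Hoenicka//DOCUMENT RefDB DocBook html stylesheet//EN"
```

With the SGML_CATALOG_FILES setting shown above, I would expect this to report the RefDB catalog file, but that depends on what your refdb.cat actually contains.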

But wait a minute! That /usr/share/refdb/dsssl/html/docbook-refdb.dsl, does it contain any special code for Mathematics, like the one in Section 10.3.2.1? No. Does it contain any of the other special DSSSL processing as described in Section 7.1.5? Again, no. It is a “driver” file containing only the RefDB-specific DSSSL instructions - for the rest, it jumps directly to the standard DocBook stylesheets of Norman Walsh, through the use="docbook" attribute, just as our J.Biol.Chem.dsl jumps to /usr/share/refdb/dsssl/html/docbook-refdb.dsl through the use="docbook-refdb-html" attribute.

Of course, you can tweak the /usr/share/refdb/dsssl/html/docbook-refdb.dsl to include both the Mathematics (DBTeXMath) and all the other specific code, but this is not acceptable. Clearly, we need an automatic solution.

Fortunately, there is one: use the ${REFDB_style}dsl stylesheet that RefDB created automatically for us to create two new DSSSL driver files, one for HTML and one for print output. These stylesheets will take care to jump first to our other DSSSL files - the latter ones will then jump to the standard DocBook stylesheets.

To implement this solution, two new scripts are necessary, awkscr_refdb_html and awkscr_refdb_print. The lyxtox script calls them as follows:

  $AWKSCR_REFDB_HTML  ${REFDB_style}dsl > $HTML_DSL
  $AWKSCR_REFDB_PRINT ${REFDB_style}dsl > $PRINT_DSL

Both take as input the newly created ${REFDB_style}dsl stylesheet (J.Biol.Chem.dsl in our example) and produce a new HTML and print DSSSL stylesheet respectively. Here is the awkscr_refdb_html script:

#! /bin/sh
AWK="/usr/bin/awk"
cat <<-EOF 
<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY docbook.dsl PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML 
Stylesheet//EN" CDATA DSSSL>
<!ENTITY lyxtox-html.dsl SYSTEM "lyxtox-html.dsl" CDATA DSSSL>
<!ENTITY lyxtox-onehtml.dsl SYSTEM "lyxtox-onehtml.dsl" CDATA DSSSL>
]>
<style-sheet>
EOF
for id in "html" "onehtml"; do
  echo "<style-specification id=\"$id\" use=\"lyxtox-"$id"\">"
  echo "<style-specification-body>"
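  # The opening brace must stay on the same line as the range pattern,
  # otherwise awk treats the pattern and the action as two separate rules.
  $AWK '$1 ~ /<style-specification/ && $2 ~ "id=\"html\"",$1 ~ /<\/style-specification/ {
    if ($1 !~ /^</) {print}}' "id=$id" $1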
  echo "</style-specification-body>"
  echo "</style-specification>"
done
cat <<-EOF 
<external-specification id="lyxtox-html" document="lyxtox-html.dsl">
<external-specification id="lyxtox-onehtml" document="lyxtox-onehtml.dsl">
</style-sheet>
EOF

This script

  1. Prints the necessary entity declarations that have to come at the top of the new DSSSL stylesheet.

  2. For every HTML id (there are currently two of them, “html” and “onehtml”), prints a style specification containing all lines that do NOT start with “<” in the style specification for the “html” id in the given file, i.e. the newly created ${REFDB_style}dsl stylesheet. In our example, it prints every line not starting with “<” in the following part of J.Biol.Chem.dsl, which comprises only the style specification for the id="html" part:

    <style-specification id="html" use="docbook-refdb-html">
    <style-specification-body>
    (define ABSTJOURNALNAMESTYLE "ITALIC")
    (define ABSTVOLUMESTYLE "BOLD")
    (define CHAPBOOKTITLESTYLE "ITALIC")
    (define BOOKBOOKTITLESTYLE "ITALIC")
    (define INPRJOURNALNAMESTYLE "ITALIC")
    (define JOURJOURNALNAMESTYLE "ITALIC")
    (define JOURVOLUMESTYLE "BOLD")
    (define NEWSJOURNALNAMESTYLE "ITALIC")
    (define NEWSVOLUMESTYLE "BOLD")
    </style-specification-body>
    </style-specification>
    

    Obviously, the lines that do NOT start with a “<” are those define's - and those are exactly the lines we are interested in, everything else will be replaced by our own code.

  3. Prints appropriate external specifications that point to our own stylesheets that contain the rest of our customization.

The output looks like this (see also refdb-html.dsl):

<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY docbook.dsl PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML 
Stylesheet//EN" CDATA DSSSL>
<!ENTITY lyxtox-html.dsl SYSTEM "lyxtox-html.dsl" CDATA DSSSL>
<!ENTITY lyxtox-onehtml.dsl SYSTEM "lyxtox-onehtml.dsl" CDATA DSSSL>
]>
<style-sheet>
<style-specification id="html" use="lyxtox-html">
<style-specification-body>
(define ABSTJOURNALNAMESTYLE "ITALIC")
(define ABSTVOLUMESTYLE "BOLD")
(define CHAPBOOKTITLESTYLE "ITALIC")
(define BOOKBOOKTITLESTYLE "ITALIC")
(define INPRJOURNALNAMESTYLE "ITALIC")
(define JOURJOURNALNAMESTYLE "ITALIC")
(define JOURVOLUMESTYLE "BOLD")
(define NEWSJOURNALNAMESTYLE "ITALIC")
(define NEWSVOLUMESTYLE "BOLD")
</style-specification-body>
</style-specification>
<style-specification id="onehtml" use="lyxtox-onehtml">
<style-specification-body>
(define ABSTJOURNALNAMESTYLE "ITALIC")
(define ABSTVOLUMESTYLE "BOLD")
(define CHAPBOOKTITLESTYLE "ITALIC")
(define BOOKBOOKTITLESTYLE "ITALIC")
(define INPRJOURNALNAMESTYLE "ITALIC")
(define JOURJOURNALNAMESTYLE "ITALIC")
(define JOURVOLUMESTYLE "BOLD")
(define NEWSJOURNALNAMESTYLE "ITALIC")
(define NEWSVOLUMESTYLE "BOLD")
</style-specification-body>
</style-specification>
<external-specification id="lyxtox-html" document="lyxtox-html.dsl">
<external-specification id="lyxtox-onehtml" document="lyxtox-onehtml.dsl">
</style-sheet>

We then process our SGML file with this stylesheet for HTML output!
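
The extraction described in step 2 can be tried in isolation. The following is a minimal sketch, not the script shipped with lyxtox: the file contents produced by make_sample, the "PLAIN" value, and the function names make_sample and extract_defines are made up for illustration. The awk action is kept on the same line as its range pattern, because awk treats a pattern that is followed by a newline as a complete rule of its own.

```shell
#!/bin/sh
# Sketch only: make_sample and extract_defines are hypothetical names.

# Write a tiny stand-in for a RefDB-generated stylesheet to the file "$1".
make_sample() {
  cat > "$1" <<'EOF'
<style-sheet>
<style-specification id="html" use="docbook-refdb-html">
<style-specification-body>
(define JOURJOURNALNAMESTYLE "ITALIC")
(define JOURVOLUMESTYLE "BOLD")
</style-specification-body>
</style-specification>
<style-specification id="print" use="docbook-refdb-print">
<style-specification-body>
(define JOURVOLUMESTYLE "PLAIN")
</style-specification-body>
</style-specification>
</style-sheet>
EOF
}

# Print every line of the id="html" specification that does not start with "<".
extract_defines() {
  awk '$1 ~ /<style-specification/ && $2 ~ /id="html"/, $1 ~ /<\/style-specification/ { if ($1 !~ /^</) print }' "$1"
}
```

Calling make_sample sample.dsl and then extract_defines sample.dsl should print only the two define lines of the “html” specification; the define in the “print” specification is skipped because its id does not match.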

Let's examine for a moment what the new stylesheet does:

  1. lyxtox calls openjade and tells it to use the “html” id of this stylesheet. This is because, in case RefDB processing is on, the stylesheet for HTML chunked output is defined by:

      HTML_DSL="refdb-html.dsl"
      HTML_CHUNKS_DSL="$HTML_DSL#html"
    

    and HTML_DSL is the name of the file created by awkscr_refdb_html above:

    $AWKSCR_REFDB_HTML  ${REFDB_style}dsl > $HTML_DSL
    

    That “#html” is a special Jade/Openjade construct that tells it to start processing at id=”html” of the given stylesheet.

The new stylesheet (which will bear the name refdb-html.dsl, if HTML_DSL is left to its default value) jumps to id “lyxtox-html” which, by the same reasoning as we saw for the original J.Biol.Chem.dsl file previously, will be found in the lyxtox-html.dsl file that contains all our non-RefDB-specific customization. lyxtox-html.dsl, on the other hand, is constructed in such a way that it will call the standard DocBook stylesheets with a use=”docbook” attribute - and the processing chain is closed!

Processing for print output is done similarly: awkscr_refdb_print creates a new print stylesheet from ${REFDB_style}dsl, refdb-print.dsl, then this new stylesheet is called from lyxtox with an id of “print-pdf”, “print-ps”, “print-rtf” or “print-txt” respectively for each one of the print formats:

  PRINT_DSL="refdb-print.dsl"
  PRINT_PDF_DSL="$PRINT_DSL#print-pdf"
  PRINT_PS_DSL="$PRINT_DSL#print-ps"
  PRINT_RTF_DSL="$PRINT_DSL#print-rtf"
  PRINT_TXT_DSL="$PRINT_DSL#print-txt"

Just like its HTML counterpart above, refdb-print.dsl is constructed in such a way that it will call our lyxtox-print-pdf.dsl, lyxtox-print-ps.dsl etc. stylesheets with use attributes like use="lyxtox-print-pdf", use="lyxtox-print-ps" etc. respectively. Those stylesheets, in turn, will jump to the standard DocBook stylesheets with a use=”docbook” attribute.
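
To make the “file#id” convention more tangible, here is a small sketch that splits such a value into its two parts with plain shell parameter expansion. The helper names dsl_file and dsl_id are hypothetical and are not part of lyxtox; only the two variable assignments are taken from the text above.

```shell
#!/bin/sh
# Sketch only: dsl_file and dsl_id are hypothetical helper names.
HTML_DSL="refdb-html.dsl"
HTML_CHUNKS_DSL="$HTML_DSL#html"

# Everything before the first "#" is the stylesheet file.
dsl_file() {
  printf '%s\n' "${1%%#*}"
}

# Everything after the first "#" is the id; empty when no "#" is present.
dsl_id() {
  case $1 in
    *'#'*) printf '%s\n' "${1#*#}" ;;
    *)     printf '\n' ;;
  esac
}
```

With these definitions, dsl_file "$HTML_CHUNKS_DSL" yields the file name and dsl_id "$HTML_CHUNKS_DSL" yields the id at which processing starts. The same split applies to the four PRINT_*_DSL values.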

What does a typical RefDB bibliography look like? lyxtox renames the $1.bib.sgml file (where $1 is the argument passed to it) to the fixed name bibliography.sgml:

mv $1.bib.sgml bibliography.sgml

The &bibliography; entity that sedscr_abi appends to our SGML file is mapped automatically to bibliography.sgml through the line

<!entity bibliography SYSTEM "bibliography.sgml">

in the preamble of our document (see Section 4.6). Thus, you only have to look at bibliography.sgml. A typical entry there looks like:

<bibliomixed id="IDKARAKAS1992" role="BOOK">
<bibliomset role="intext" id="IDKARAKAS1992X">
    (<abbrev>11</abbrev>)
</bibliomset>
<bibliomset role="intextsq" id="IDKARAKAS1992S">
    (<abbrev>11</abbrev>)
</bibliomset>
<bibliomset role="authoronly" id="IDKARAKAS1992A">
    <bibliomset relation="author">
        <surname>Karakas</surname>
    </bibliomset>
</bibliomset>
<bibliomset role="authoronlysq" id="IDKARAKAS1992Q">
    <bibliomset relation="author">
        <surname>Karakas</surname>
    </bibliomset>
</bibliomset>
<bibliomset role="yearonly" id="IDKARAKAS1992Y">
    (<abbrev>11</abbrev>)
</bibliomset>
<bibliomset role="bibliography" id="IDKARAKAS1992B">
    <abbrev>11. </abbrev>
    <bibliomset relation="book">
        <bibliomset relation="author">
            <surname>Karakas</surname> 
            <firstname>C.</firstname>
        </bibliomset> 
    </bibliomset>
    <bibliomset relation="book">
        (<pubdate role="primary">1999</pubdate>) 
    </bibliomset>
    <bibliomset relation="book">
        <publishername>BoD GmbH, Norderstedt</publishername>  
    </bibliomset>
</bibliomset>
</bibliomixed>

As you can see, it is based on bibliomixed and bibliomset elements. The content of a bibliomixed element includes all necessary punctuation for formatting - we say that bibliomixed entries are “cooked”. The creator of RefDB, Markus Hoenicka, says the following about the role of bibliomixed elements in RefDB (see Formatting DocBook bibliographies):

RefDB does use bibliomixed on purpose. I think the choice between raw and cooked is a compromise between philosophy and ease of implementation, speed of execution etc. I see the main purpose of auto-generating bibliographies not in creating beautiful and philosophically correct *source* documents, but to help users create correct *formatted* output. The intermediate bibliography element is a means to achieve this. The DocBook DTD explicitly defines the bibliomixed element to create bibliographic output that would be too tedious or complicated to create on the stylesheet level alone.
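
For a quick overview of which “cooked” renderings an entry in bibliography.sgml provides, the role attributes of its bibliomset elements can be listed. This is only a sketch: list_roles is a hypothetical helper name, and the sed expression assumes that role is the first attribute of the element, as it is in the entry shown above.

```shell
#!/bin/sh
# Sketch only: list_roles is a hypothetical helper name.

# Print the role of every bibliomset whose first attribute is role="...".
# Nested bibliomset elements that only carry relation="..." are skipped.
list_roles() {
  sed -n 's/.*<bibliomset role="\([^"]*\)".*/\1/p' "$1"
}
```

Run against the entry shown above, list_roles bibliography.sgml would list intext, intextsq, authoronly, authoronlysq, yearonly and bibliography, one per line.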


7.1.11. Index

Before you can process your document, you must make sure that index.sgml exists. This is a chicken and egg problem, but it can be solved with the collateindex.pl command:

perl collateindex.pl -N -o index.sgml 

or, as in lyxtox:

$PERL $COLLATEINDEX -N -o index.sgml

The -N option creates a new index; -o identifies the name of the output file. This name must be the same as the name you specified in the preamble (see Section 4.6). The collateindex.pl script is part of the docbook-dsssl-stylesheets package (see Section 3.2). There are a multitude of options to collateindex.pl; see the reference page for more information.

Creating an index is a multi-step, two-pass process (see Automatic Indexing with the DocBook DSSSL Stylesheets):

  • In order to create an index, you must first generate the raw index data. This is done with the HTML stylesheet (even if you want print output). That's why in lyxtox we use the same copy of HTML.index, which was created with the “no-chunks” HTML stylesheet:

    $PERL $COLLATEINDEX -p -g -o index.sgml HTML.index
    
  • After you have created the HTML.index file, you can generate your final document as usual using whichever stylesheet is appropriate. The generated document will contain the index:

    • Using sgmltools, as in older versions of lyxtox:

      • For one big HTML file:

        $SGMLTOOLS -b onehtml -s $HTML_NOCHUNKS_DSL -j "-i output.print.png" $1.sgml
        

        (notice the nochunks option we pass to openjade through sgmltools)

      • For many HTML files (one per chapter/section):

        $SGMLTOOLS -b html -s $HTML_DSL -j "-i output.print.png" $1.sgml
        
      • and for PDF:

        $SGMLTOOLS -b pdf -s sgmltools-pdf -j "-i output.print.pdf" $1.sgml
        
    • And using openjade and pdfjadetex as in newer versions:

      • For one big HTML file:

        ${OPENJADE} -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks $1.sgml > $1.html
        
      • For many HTML files (one per chapter/section):

        $OPENJADE -t sgml -d $HTML_CHUNKS_DSL -i output.print.png $1.sgml
        
      • and for PDF:

        ${PDFJADETEX} $1.tex
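
The order of the index-related steps can be summarized in a dry-run sketch that only prints the commands and runs nothing. The function name index_plan is hypothetical, and the default value of HTML_NOCHUNKS_DSL is an assumption made by analogy with the “onehtml” id of refdb-html.dsl; check your own lyxtox for the real settings.

```shell
#!/bin/sh
# Dry-run sketch: index_plan prints commands, it does not execute them.
# index_plan is a hypothetical name; the HTML_NOCHUNKS_DSL default is an assumption.
PERL=${PERL:-perl}
COLLATEINDEX=${COLLATEINDEX:-collateindex.pl}
OPENJADE=${OPENJADE:-openjade}
HTML_NOCHUNKS_DSL=${HTML_NOCHUNKS_DSL:-"refdb-html.dsl#onehtml"}

index_plan() {
  doc=$1
  # 1. Create an empty index.sgml so that the document can be parsed at all.
  echo "$PERL $COLLATEINDEX -N -o index.sgml"
  # 2. First pass with the HTML stylesheet writes the raw data to HTML.index.
  echo "$OPENJADE -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks $doc.sgml > $doc.html"
  # 3. Collate the raw data into the real index.sgml.
  echo "$PERL $COLLATEINDEX -p -g -o index.sgml HTML.index"
  # 4. Second pass: the generated document now contains the index.
  echo "$OPENJADE -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks $doc.sgml > $doc.html"
}
```

Calling index_plan mydoc prints the four commands in the order in which they would have to run for a document named mydoc.sgml.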
        
Tip
 

Whether an index is created or not can be controlled by setting html-index to "#t" in the stylesheets (see Section 4.2 and Section 7.1.5) as follows (original code is in dbparam.dsl, but it is better not to touch it):

(define html-index
  ;; REFENTRY html-index
  ;; PURP HTML indexing?
  ;; DESC
  ;; Turns on HTML indexing.  If true, then index data will be written
  ;; to the file defined by 'html-index-filename'.  This data can be
  ;; collated and turned into a DocBook index with bin/collateindex.pl.
  ;; /DESC
  ;; AUTHOR N/A
  ;; /REFENTRY
  #t)
Default is false ("#f"). I preferred to leave the default untouched and insert the external entity "index" at the end of the document using runsed and sedscr, see Section 7.1.4.1. If you decide to set html-index, you will have to comment this out in sedscr.
Tip
 

You can change the name of the file to which index data will be written by setting html-index-filename in the stylesheets (see Section 4.2 and Section 7.1.5) as follows (original code is in dbparam.dsl, but it is better not to touch it):

(define html-index-filename
  ;; REFENTRY html-index-filename
  ;; PURP Name of HTML index file
  ;; DESC
  ;; The name of the file to which index data will be written if
  ;; 'html-index' is not '#f'.
  ;; /DESC
  ;; AUTHOR N/A
  ;; /REFENTRY
  "HTML.index")
Default is HTML.index. I preferred to leave the default untouched. If you decide to set html-index-filename, you will have to adapt lyxtox to reflect the name change.

7.2. Optimal PDF

HTML is quite limited when it comes to advanced formatting capabilities (although this has somewhat changed with the advent of CSS, see Section 4.14 and Section 7.1.8). On the other hand, the layout of a PDF document remains unchanged, regardless of the output medium, be it monitor or printer. It retains all its typographic finesse and is not (at least not easily) modifiable. These properties, together with the availability of free PDF readers, like Acrobat® Reader or xpdf, have rendered PDF a very popular format.

But PDF is not a simple print format. It incorporates features that bear similarities to HTML: you can insert hypertext links to point either to some other place in the same document (a cross-reference), to other PDF documents, or even to WWW pages. You can have the Table of Contents as a link tree to the left (“bookmarks”), extended Document Information (author, keywords, creator, embedded fonts etc.), or thumbnails (the details of thumbnail generation are discussed in Section 7.2.11), which are small pictograms of the document's pages to aid visual navigation.

In this section I will discuss the details of incorporating all these advanced features in the PDF document generated by DocBook through LyX.


7.2.1. From .lyx to .pdf

The classic way to transform a .lyx document to PDF format is to follow a three-pass procedure: first, the .lyx document is exported to DVI, then to PS through dvi2ps, then the PS version is transformed to PDF through software like the commercial Acrobat® Distiller®[22], or the freely available Ghostscript (via ps2pdf).

This classic three-pass procedure is not only complicated, it also loses some information through the intermediate DVI format. The results are often not acceptable: the most frequent problem is bad presentation of the character glyphs that make up the document (see Quality of PDF from PostScript):

  • Wrong type of fonts used, which is the commonest cause of fuzzy text.

  • ghostscript too old, which can also result in fuzzy text.

  • Switching to font encoding T1, which is yet another possible cause of fuzzy text.

  • Another problem - missing characters - arises from an aged version of Acrobat® Distiller®.

  • Finally, there's the common confusion that arises from using the dvips configuration file -Ppdf, the weird characters.

It would be much better to produce the PDF version directly from the TeX (i.e. LyX) output. The pdftex package (see Section 3.5) was created with this objective in mind. pdftex (and pdfjadetex) creates the PDF document in one pass from the TeX format. A disadvantage of pdftex and pdfjadetex used to be the complexity of the preliminary steps (see Chapter 4, especially Section 4.9) needed to get a LaTeX document converted to PDF, especially when it contained images. Not anymore: the method described here automates the format transformations for the images and hides the complexity of the commands involved in three files (sedscr, jadetex.cfg, lyxtox). See Section 7.2.2.


7.2.2. Figures

This is a serious problem on which most people fail in the first place. LaTeX (and LyX) expects the images to be in encapsulated PostScript® (.eps) format (see Section 5.7). On the other hand, pdfjadetex (see Section 3.4) is not capable of dealing with EPS (only with PDF, JPEG, PNG or MetaPost), so we have to convert the images to encapsulated PDF (.epdf) format, while still carrying the .pdf ending! This is best done by the addd script, which in turn uses convert (from the ImageMagick package) and eps2pdf. The script works as follows (FIXME: The script has been simplified. I didn't test it extensively though. The following describes the old script):

Some variables are set first:

CONVERT="/usr/bin/convert" (1)
DENSITY=133 (2)
(1)
The location of the convert utility. It is part of the ImageMagick package, so you must have ImageMagick installed.
(2)
The "dots-per-inch" of the device where the image was made. If you plan to use the addd script to add density to screenshots, then this is the DPI value of the monitor where the screenshots were made.
  • The script echoes some information about the file it processes and the calculated constants:

    echo ""
    echo "Processing file $1.png"
    echo ""
    echo "DENSITY=$DENSITY"
    
  • We use ImageMagick to convert the PNG image to encapsulated PDF (EPDF, a format different from encapsulated PostScript®, EPS), adding density and antialiasing (so that texts are more readable). This is necessary in order for the bitmapped image to behave correctly in the PS document and is a point often and easily missed[23]:

    ${CONVERT} -antialias -density ${DENSITY} $1.png $1.epdf
    
  • We then use ImageMagick again to convert the PNG image to encapsulated PostScript®[24] format, adding antialiasing again:

    ${CONVERT} -antialias -density ${DENSITY} $1.png $1.eps
    
  • But we are still not done! pdfjadetex will accept encapsulated PDF images (.epdf), but only if they end in .pdf! We thus have to rename the EPDF image to a PDF one:

    # Rename the file so that it ends in pdf - pdftex wants this!
    mv $1.epdf $1.pdf
    

This procedure will create pdf and png files with the right resolution (density) information. Unfortunately the eps file that is also created as a by-product, will display somewhat smaller. FIXME: This may be the result of ghostview not using the right DPI when it displays the image, so it may be a problem of my system configuration.
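
Put together, the steps of the old addd script described above might look like the following sketch. It is not the script from the package: the names addd_sketch, run and DRY_RUN are hypothetical, and the dry-run switch was added here only so that the sequence can be inspected without ImageMagick installed.

```shell
#!/bin/sh
# Sketch only: addd_sketch, run and DRY_RUN are hypothetical names.
CONVERT=${CONVERT:-/usr/bin/convert}
DENSITY=${DENSITY:-133}

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 is the image name without the .png ending.
addd_sketch() {
  base=$1
  echo "Processing file $base.png"
  echo "DENSITY=$DENSITY"
  # PNG to encapsulated PDF, with density and antialiasing.
  run "$CONVERT" -antialias -density "$DENSITY" "$base.png" "$base.epdf" || return 1
  # PNG to encapsulated PostScript, again with density and antialiasing.
  run "$CONVERT" -antialias -density "$DENSITY" "$base.png" "$base.eps" || return 1
  # pdfjadetex only accepts the EPDF image if it ends in .pdf.
  run mv "$base.epdf" "$base.pdf"
}
```

Setting DRY_RUN=1 before calling addd_sketch screenshot prints the two convert commands and the final mv instead of executing them.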

You can automate the “add density” procedure using the adddscr script, which is included in the packages that you will find in Section 1.2:

#! /bin/sh
for x in `ls *.$1`; do
y=`basename $x .$1`
convert $y.$1 $y.png
addd $y
convert $y.png $y.bmp
done

The adddscr script accepts one parameter: the format from which you want to start the conversion. The idea is the following: suppose you have a set of GIFs, but no PNGs or any other format. Then you can change to the directory of the images, type

adddscr gif

and get your GIFs converted to PNGs, then get EPDF (renamed as PDF) and EPS versions with added density. If you only have, say, BMP versions, just type

adddscr bmp

and the script will convert your BMPs to PNGs first, then to all other formats, adding density on the way.
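
Before running adddscr on a directory full of images, it can help to see what it would do. The following sketch prints the three commands per image instead of running them. The function name adddscr_plan is hypothetical; unlike adddscr, it loops over a shell glob rather than over the output of ls, which is a design choice made here to cope with an empty match.

```shell
#!/bin/sh
# Sketch only: adddscr_plan is a hypothetical name; it prints, it does not convert.

# $1 is the starting format (file ending), e.g. gif or bmp.
adddscr_plan() {
  ext=$1
  for x in *."$ext"; do
    [ -e "$x" ] || continue          # the glob matched nothing
    y=${x%."$ext"}                   # file name without its ending
    printf 'convert %s %s.png\n' "$x" "$y"
    printf 'addd %s\n' "$y"
    printf 'convert %s.png %s.bmp\n' "$y" "$y"
  done
}
```

In a directory that contains note.gif and tip.gif, adddscr_plan gif prints six lines, three for each image; in a directory without any matching files it prints nothing.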

For example, you can use the adddscr script the very first time you install the docbook-dsssl-stylesheets RPM. The RPM package offers GIF, EPS and PDF versions of admonitions (see Section 4.7) and callouts (see Section 4.8). The EPS and PDF versions surely come with the density of the system (read: monitor) they were created on, so it may not be wise to add density to them once again (if you do it, it may make the admonition and callout icons display smaller or larger than they should in PDF and PS documents)[25]. However, while you might want to leave the EPS and PDF versions untouched, you definitely need PNG and BMP versions of the images. Here's where the adddscr script comes in handy:

Comment out the line that adds density in adddscr:

#! /bin/sh
for x in `ls *.$1`; do
y=`basename $x .$1`
convert $y.$1 $y.png
# addd $y
convert $y.png $y.bmp
done

change to the directory of the admonitions and callouts:

cd /usr/share/sgml/docbook/docbook-dsssl-stylesheets/images

then type

adddscr gif

Do the same for the callouts:

cd /usr/share/sgml/docbook/docbook-dsssl-stylesheets/images/callouts
adddscr gif

Now you have PNG and BMP versions of all your admonition and callout icons![26]

If you wondered why your images don't get included in the PDF, although you meticulously prepared everything “right”, now you know why! But there's more to it - read on!

Note But what is this "density" anyway?
 

From problem downsampling:

You should think of a digital image as a rectangular pixel grid. Suppose you have an image of 1500 by 2000 pixels. The image could be printed (on paper) or viewed (on screen). For that purpose, you have to tell the printer software or the "display" utility program how large an image pixel should be printed as.

Suppose you want a single image pixel for a single printer dot, at 300 dots per inch (dpi). That means that the image will come out of the printer with a size of about 5 inch by 6.6 inch (or 127 mm by 169 mm).

But the image file (normally) only knows about its pixel size (1500x2000), not its size at which it should be printed (since you could as well print it at double or half of the size; that's up to the printer software).

However, some image formats (e.g. TIFF, EPS, EPDF) allow to store the "density", i.e., the real physical size of a pixel, as extra (header) information. This is what we have to do with our EPS and PDF images too.
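
The arithmetic behind the quoted example is simply pixels divided by density. As a minimal sketch (print_size is a hypothetical helper name; awk is used only because plain sh has no floating point arithmetic):

```shell
#!/bin/sh
# Sketch only: print_size is a hypothetical helper name.

# $1 = width in pixels, $2 = height in pixels, $3 = density in dots per inch.
print_size() {
  awk -v w="$1" -v h="$2" -v dpi="$3" 'BEGIN {
    # size in inches = pixels / dpi; one inch is 25.4 mm
    printf "%.2f x %.2f in (%.0f x %.0f mm)\n", w / dpi, h / dpi, w / dpi * 25.4, h / dpi * 25.4
  }'
}

print_size 1500 2000 300   # prints: 5.00 x 6.67 in (127 x 169 mm)
```

The same image at the DENSITY of 133 used in addd would come out much larger, which is why the density stored in the file matters.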

Having converted the images to all possible formats and having used the right parameters for each format, still does not mean we are done! If we are not very careful about the way we will use them, we will end up in a real mess, even though all seems to be right according to the packages, the SGML specification, the Stylesheets etc.

The problem is that when you generate "print" output, the stylesheets don't have any means to know which print format you mean, EPS, PDF or RTF -- there's no way to provide them with that information yet. Given the choice of PNG, BMP or EPS they'll choose EPS every time. As we've said, pdfjadetex doesn't handle .eps files.

The solution is to use parameter entities (if that's an unfamiliar term, read the FreeBSD Documentation Project Primer for New Contributors: Entities, there's a section in there which explains them, and gives examples of using them to conditionally include certain parts of your document).

In a nutshell, start your document like this:

<!DOCTYPE book  PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
<!ENTITY % output.print.png "IGNORE">
<!ENTITY % output.print.pdf "IGNORE">
<!ENTITY % output.print.eps "IGNORE">
<!ENTITY % output.print.bmp "IGNORE">
]>
<!-- rest of the document as normal -->

Adjust that as necessary, depending on which version of the DTD you're using. The key bits are the ENTITY lines.

Then, when you want to include an image, do this (see the explanation of the sedscr file in Section 7.1.4.1):

<mediaobject>
      <![ %output.print.png; [
      <imageobject>
         <imagedata fileref="./images/imagename.png" format="PNG">
      </imageobject>
      ]]>
      <![ %output.print.pdf; [
      <imageobject>
         <imagedata fileref="imagename.pdf" format="PDF" scale="65">
      </imageobject>
      ]]>
      <![ %output.print.eps; [
      <imageobject>
         <imagedata fileref="imagename.eps" format="EPS">
      </imageobject>
       ]]>
      <![ %output.print.bmp; [
      <imageobject>
         <imagedata fileref="imagename.bmp" format="BMP">
      </imageobject>
       ]]>
</mediaobject>

Now, when you process your document with "openjade ...", the "%output.print.xxx;" is replaced by the word "IGNORE". This tells Jade to ignore that section of the document. The overall effect is that no image will be included, neither PNG, nor PDF, nor EPS, nor BMP.

In order to get one (and only one) image included, you have to tell Jade that one or other of the %output.print.xxx; entities must contain "INCLUDE" rather than "IGNORE". You can do this on the command line with the "-i" flag. So if you're producing a HTML file, you would do (see lyxtox and Section 7.1.4.6):

$SGMLTOOLS -b html -s $HTML_DSL -j "-i output.print.png" $1.sgml

If you're producing a PDF file, you would do (see lyxtox and Section 7.1.4.7):

$SGMLTOOLS -b pdf -s sgmltools-pdf -j "-i output.print.pdf" $1.sgml

and so on. With the -j option to sgmltools you can pass options to Jade - we thus pass the appropriate "-i output.print.xxx" for each format.

Using openjade and pdfjadetex, these commands are equivalent to:

${OPENJADE} -t sgml -d $HTML_NOCHUNKS_DSL -i output.print.png -V nochunks $1.sgml > $1.html

and:

${OPENJADE} -t tex -d ${PRINT_PDF_DSL} -o $1.tex -i "output.print.pdf" $1.sgml
${PDFJADETEX} $1.tex

respectively.
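
In a wrapper script, the choice of the parameter entity can be kept in one place. The sketch below maps an output format to the entity that has to be switched to INCLUDE with "-i". The function name print_entity is hypothetical. The html and pdf mappings follow the commands shown above; the ps and rtf mappings are assumptions based on the EPS and BMP image versions and should be checked against your own lyxtox.

```shell
#!/bin/sh
# Sketch only: print_entity is a hypothetical helper name.

# $1 is the output format; prints the entity name to pass to openjade -i.
print_entity() {
  case $1 in
    html|onehtml) echo "output.print.png" ;;
    pdf)          echo "output.print.pdf" ;;
    ps)           echo "output.print.eps" ;;   # assumption: PS output uses the EPS images
    rtf)          echo "output.print.bmp" ;;   # assumption: RTF output uses the BMP images
    *)            echo "unknown format: $1" >&2; return 1 ;;
  esac
}
```

A call such as ${OPENJADE} -t tex -d ${PRINT_PDF_DSL} -o $1.tex -i "$(print_entity pdf)" $1.sgml would then be equivalent to the PDF command shown above.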

Yes, it's a kludge. But once incorporated in a script (like sedscr, see Section 7.1.4.1), that doesn't have to bother us any more. I owe it to Nik Clayton, who posted it to the docbook-apps mailing list on June 8th, 2000. It works great! Thanks Nik.


7.2.3. Using Type 1 Fonts

If you want a PDF document that not only excels when printed, but also when displayed on the screen, it is advisable to embed Type 1 fonts, even though the document may increase in size. The reason is that by default TeX/LaTeX uses bitmapped fonts instead of Type1 or TrueType ones. The resolution of these bitmapped fonts matches that of the printer on the system where you create the document. This is rarely the same as the resolution of the monitor or printer the reader will use. This change in resolution results in terrible quality when displaying these fonts on a screen, or when printing them on a printer whose resolution does not match that of the bitmapped font.

The solution to the problem is to force TeX/LaTeX to use Type1 fonts which are scalable and thus resolution-independent. There are two ways to achieve this:

  • Use of the standard PostScript® fonts

  • Use of Type1 versions of Computer Modern (CM) fonts.

I chose the latter method. I like the look of the TeX documents that use Computer Modern fonts.

I am not alone with this predilection: most people use Computer Modern to start with, and even those relative sophisticates who use something as exotic as Sabon often find themselves using odd characters from CM without really intending to do so (see The wrong type of fonts in PDF).

Fortunately, rather good versions of the CM fonts are available from the AMS (who have them courtesy of Blue Sky Research and Y&Y, see Blue Sky Research and Computer Modern fonts for some historical background) and most modern systems have the fonts installed ready to use (if yours doesn't, go get them from the Comprehensive TeX Archive Network archives: Blue Sky CM Type 1 fonts, or any other CTAN mirror).

There are six DSSSL variables for defining parameters for changing fonts.

%title-font-family%

the font used for all titles. Example: titles, glossary entries.

%admon-font-family%

the font used for admonitions. Example: note.

%guilabel-font-family%

the font used for GUI text. Example: guimenuitem.

%mono-font-family%

the font used for elements needing typewriter or monospace text. Example: file names, commands and URLs.

%refentry-font-family%

the font used for references.

%body-font-family%

the font used for body text.

By default, these variables may take the following values:

  • Helvetica

  • Palatino

  • Bookman

  • Courier

  • Wingdings

  • Avant-Garde

  • New-Century-Schoolbook

  • Times-Roman

  • Zapf-Dingbats

  • Computer-Modern-Typewriter

  • Computer-Modern-Sans

  • Computer-Modern

  • Computer-Modern-Caps-And-Small-Caps

Other font names may be used, but they correspond to one of the fonts in the list above. For example:

  • Arial = iso-sanserif = Helvetica

  • Courier-New = Courier

  • Times-New-Roman = iso-serif = Times-Roman

  • WingDings = Wingdings

To use other fonts, they must be T1 fonts, the coding used by TeX. These are listed in the file called t1***.fd, where *** is the font code. The first letter represents the font provider; the next two letters represent the font name. In addition, we must ensure that all required files are present in our installation, i.e. the .tfm, .afm, .vf, .pfm and .pfb files for each .fd file.

We must ensure that Jadetex can associate the font name, e.g. Utopia, with the code name of the font (for Utopia, "put"). This is done by adding lines such as the following to the jadetex.cfg file:

\def\Family@font_name{***}

where *** again represents the three-letter code and "font_name" is the... font name. For example, to use the Utopia font, the following line would be added to the jadetex.cfg file:

\def\Family@Utopia{put}

Following Customizing Document Production, we can include these lines in the jadetex.cfg file (currently commented, since they make sense only if you have those fonts installed - uncomment accordingly):

\makeatletter
\def\Family@Utopia{put}
\def\Family@ZapfChancery{pzc}
\def\Family@Fibonacci{cmfib}
\def\Family@Funny{cmfr}
\def\Family@Dunhill{cmdh}
\def\Family@Concrete{ccr}
\def\Family@Charter{bch}
\def\Family@Fontpxr{pxr}
\def\Family@Fontaer{aer}
\def\Family@Fontaess{aess}
\def\Family@Fontaett{aett}
\def\Family@Fontlcmss{lcmss}
\def\Family@Fontlcmtt{lcmtt}
\def\Family@Fontcmvtt{cmvtt}
\def\Family@Fontcmbr{cmbr}
\def\Family@Fontcmtl{cmtl}
\def\Family@Fontpxss{pxss}
\def\Family@Fonttxss{txss}
\def\Family@Fonttxr{txr}
\makeatother

The font declarations are preceded by \makeatletter and followed by \makeatother to properly escape the "@" symbol.
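
If you maintain more than a handful of such mappings, the lines for jadetex.cfg can be generated from a simple list of “name code” pairs. This is only a sketch; family_def and family_block are hypothetical helper names, and the output still has to be pasted into jadetex.cfg by hand.

```shell
#!/bin/sh
# Sketch only: family_def and family_block are hypothetical helper names.

# $1 = font name as used in the DSSSL stylesheet, $2 = TeX font family code.
family_def() {
  printf '\\def\\Family@%s{%s}\n' "$1" "$2"
}

# Read "name code" pairs from standard input and wrap the resulting
# definitions in \makeatletter ... \makeatother.
family_block() {
  printf '%s\n' '\makeatletter'
  while read -r name code; do
    if [ -n "$name" ]; then
      family_def "$name" "$code"
    fi
  done
  printf '%s\n' '\makeatother'
}
```

For example, printf 'Utopia put\nCharter bch\n' | family_block prints a block with the two definitions for Utopia and Charter.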

Now, if we wish titles to be formatted using Fonttxr and body text with Concrete, all that is necessary is to add the following lines to lyxtox-print-pdf.dsl:

(define %title-font-family% "Fonttxr")
(define %body-font-family% "Concrete")

And in order to use Computer Modern fonts in the PDF document, we write the following in the print stylesheet, lyxtox-print-pdf.dsl (see Section 7.1.5):

    ;;  Gnuishly correct fonts...
    ;;
    (define %body-font-family% "Computer-Modern")
    (define %mono-font-family% "Computer-Modern-Typewriter")
    (define %title-font-family% "Computer-Modern")
    (define %admon-font-family% "Computer-Modern-Sans")
    (define %guilabel-font-family% "Computer-Modern-Sans")

Now, if the T1 font encoding is used, i.e. if the jadetex.cfg file contains the line

\usepackage[T1]{fontenc}

the EC fonts will be used instead of the CM ones. Since there are no Type1 EC fonts available, pdftex will use bitmapped ones, yielding poor quality PDF files. Fortunately, there is a workaround: the ae package provides virtual EC fonts based on the CM ones, so that the Type1 CM fonts can be used in the output file. However, not all EC characters are available in this way. The aecompl package defines the missing characters as bitmapped fonts. To use them you should have the following in jadetex.cfg:

\usepackage{ae} 
\usepackage{aecompl} 

or just

\usepackage{ae,aecompl} 

In this way the output file will use CM fonts for all except some rarely used characters. pdfjadetex will end saying something like:

<8r.enc><cmex10.pfb><cmti9.pfb><cmmi6.pfb><cmr7.pfb><cmmi7.pfb> </var/cache/f
onts/pk/ljfour/jknappen/tc/tcrm0700.600pk><cmr6.pfb><cmsy10.pfb><cmitt10.pfb><c
mssi9.pfb><cmmi9.pfb><cmr9.pfb><cmmi10.pfb><cmtt8.pfb><cmtt9.pfb><cmss9.pfb> </
var/cache/fonts/pk/ljfour/jknappen/tc/tcrm0800.600pk><cmti10.pfb><cmbx10.pfb> <
/var/cache/fonts/pk/ljfour/jknappen/tc/tcrm1000.600pk><cmr10.pfb><cmssbx10.pfb>

meaning that it has embedded all fonts shown in angle brackets into the PDF file. See High quality PDF output from LaTeX and TeX for more details.
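
Since every bitmapped font shows up in this list as a path ending in "pk", the list can be scanned for them automatically. The following sketch reads such a log fragment from standard input; list_pk_fonts is a hypothetical helper name, and the sketch assumes that the log was wrapped in the middle of file names, as in the output shown above.

```shell
#!/bin/sh
# Sketch only: list_pk_fonts is a hypothetical helper name.

# Read the end of a pdfjadetex log from standard input and print the
# bitmapped (pk) fonts that were embedded, one path per line.
list_pk_fonts() {
  tr -d '\n' |                      # undo the line wrapping inside file names
    tr '<' '\n' |                   # one "<...>" entry per line
    sed -n 's/^\([^>]*pk\)>.*/\1/p' # keep only entries that end in "pk"
}
```

If the function prints nothing, all embedded fonts were scalable; every line it does print is a bitmapped font, such as the tcrm fonts in the example above.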

Tip:
 

You can check the fonts used in the PDF file by choosing File-->Document Info-->Fonts-->List all fonts in Acrobat® Reader. You will see that the fonts are embedded (not totally, but as a subset, which is the legally correct way) in the PDF document, see Figure 7-2.

Figure 7-2. Document Info: Fonts.


However, today other solutions to the font quality problem exist as well: instead of using virtual Type 1 fonts (which is what you do when you use the ae, aecompl and aeguill packages), you may choose to use “true” Type 1[27] fonts by installing one of the new CM-super, CM-LGC or Latin Modern fonts. From Finding 8-bit Type 1 fonts:

CM-super

is an auto-traced set which encompasses all of the T1 and TS1 encodings as well as the T2* series (the family of encodings that cover languages based on Cyrillic alphabets). These fonts are pretty easy to install (the installation instructions are clear), but they are huge: don't try to install them if you're short of disc space.

CM-LGC

is a similar "super-font" set, but of much more modest size; it covers T1, TS1 and T2A encodings (as does CM-super), and also covers the LGR encoding (for typesetting Greek, based on Claudio Beccari's Metafont sources). CM-LGC manages to be small by going to the opposite extreme from CM-super, which includes fonts at all the sizes supported by the original EC (a huge range); CM-LGC has one font per font shape, getting other sizes by scaling. There is an inevitable loss of quality inherent in this approach, but for the disc-space-challenged machine, CM-LGC is an obvious choice.

Latin Modern

is produced using a program MetaType1 which, as its name implies, brings the power of the Metafont paradigm to the production of Type 1 fonts. The Latin Modern set comes with T1, TS1 and LY1 encoded variants (as well as a variant using the Polish QX encoding); for the glyph set it covers, its outlines seem rather cleaner than those of CM-super. Latin Modern is more modest in its disc space demands than is CM-super, while not being nearly as stark in its range of design sizes as is CM-LGC - Latin Modern's fonts are offered in the same set of sizes as the original CM fonts. It's hard to argue with the choice: Knuth's range of sizes has stood the test of time, and is one of the bases on which the excellence of the TeX system rests.


7.2.4. Choosing the right font encoding

If you think of a font as being arranged in a table, then the font encoding is nothing else than the way the font's symbols (the “glyphs”) are arranged in the table. If you think of the table as being fixed, then different font encodings arrange the same or different glyphs in different ways in the table's cells. If you mentally number all table cells sequentially, then for each table cell you have a number and a glyph. The number is the font's internal representation (the “encoding”) of the glyph.

Fonts are always encoded in some encoding - that's the nature of a font, being just a table of glyphs. Thus, in order to use a font that contains the glyphs (letters, symbols,...) you need, you must tell pdfjadetex and jadetex which encoding to use. For example, to use the T1 font encoding, the jadetex.cfg file must contain the line

\usepackage[T1]{fontenc}

To use the OT1 encoding, you must have:

\usepackage[OT1]{fontenc}

There are some factors that affect the choice of font encoding:

  • Your language. More precisely, the glyphs you want to present in your document (if you find the word “glyphs” confusing, just read “letters” or “symbols”). If your language is English, then you can use either the T1 or the OT1 font encoding. If your language is French, then you have to use the T1 encoding. The same is true for many European languages. FIXME: encodings for other languages.

  • Mathematics. If you don't display any Mathematics, you have a wider choice of fonts and font encodings. But if you use Mathematics, your document will look better if you choose a font family that contains mathematical symbols as well. A font family that contains excellent mathematical fonts is Computer Modern. Computer Modern originally came only in the OT1 encoding. This is fine, as long as you use only English. For other European languages, you have to use the T1 encoding. Now, if you have Mathematics and write in some European language (e.g. French, where you need accented letters and the like), then your choice becomes narrower: you need a good Mathematics font, say Computer Modern, but you also need the T1 encoding. This leads to the use of virtual EC fonts with the ae and aecompl packages, as discussed in Section 7.2.3.

  • The symbols you want to use. Some symbols are available in one encoding, but not in another. For example, is missing from the OT1 encoded Computer Modern fonts, but you can still get it in the PDF and PS if you enter math mode in LyX and type the two there. However, if your purpose is to get the French quotes (“guillemets”), then you might just as well choose the T1 encoding and the aeguill package:

    \usepackage[T1]{fontenc}
    \usepackage{aeguill}
    
  • Quality. You might not want to use the aeguill package, because the fonts it defines are not as perfect as the original Computer Modern fonts, leading to (maybe imperceptible, but nonetheless existent) inaccuracies and inconveniences in the resulting PDF. So you might decide to use the OT1 font encoding and the original Computer Modern fonts, entering your and always in math mode in LyX. Note that the HTML versions of your document will then contain small images in place of those symbols, since lyxtox will treat them as mathematical “inline equations” (see Chapter 10).

Note that today other solutions to the font quality problem exist as well: instead of using virtual Type 1 fonts (which is what you do when you use the ae, aecompl and aeguill packages), you may choose to use “true” Type 1 fonts by installing one of the new CM-super, CM-LGC or Latin Modern fonts, see Section 7.2.3.


7.2.5. Using True Type fonts

FIXME: To be done.

For the moment, see Using TrueType fonts with TeX via Postscript Type1 format.

The idea is:

  • Transform the TT font to Type1 (see Section 7.2.3) using ttf2pt1. Take care of naming conventions for the new font.

  • Integrate the newly created Type 1 font in your TeX installation. If you observed naming conventions, then this step might be done automatically.

  • In one of the last output lines of ttf2pt1, the font name was printed, for example:

    FontName VAGRoundedBT_Regular
    

    Use that name in the lyxtox-print-pdf.dsl file (see Section 7.1.5):

        (define %body-font-family% "VAGRoundedBT_Regular")
        (define %mono-font-family% "Computer-Modern-Typewriter")
        (define %title-font-family% "VAGRoundedBT_Regular")
        (define %admon-font-family% "Computer-Modern-Sans")
        (define %guilabel-font-family% "Computer-Modern-Sans")
    

    Of course, since this is a T1 font, the T1 font encoding has to be used, i.e. the jadetex.cfg file must contain the line

    \usepackage[T1]{fontenc}
    

7.2.6. The hyperref package

The hyperref package by Sebastian Rahtz and Heiko Oberdiek expands the cross-referencing capabilities of LaTeX by introducing \special commands that can be interpreted by a driver (like pdfjadetex) to produce hypertext links to places in the same document (cross-references), other PDF documents, or even WWW pages.

We pass options to the hyperref package in the jadetex.cfg configuration file. Either the classic \usepackage or the new \hypersetup command can be used for this purpose. I use the latter. If you use the \usepackage method, you should always specify the driver, as in

\usepackage[pdftex]{hyperref}

In addition to the base URL, author, title, subject and keywords (which you have already set up correctly in Section 4.4), there are a lot of other options that can be set in jadetex.cfg (see Erstellung von pdf-Dokumenten mit LaTeX (in German)):

  • open settings:

    • pdfpagemode: Determines how the document will open in Acrobat®. If no mode is explicitly chosen, but the bookmarks option is set, UseOutlines is used.

      • None

      • UseThumbs: show thumbnails.

      • UseOutlines: show bookmarks.

      • FullScreen

    • pdfstartpage: Determines on which page the PDF document is opened.

    • pdfstartview: FitB or FitH: Set the startup page view.

  • paper size settings: The keywords for paper size may appear directly in the hypersetup command, since they are boolean variables (draft=true is equivalent to draft). An overview of the possible settings is presented in Table 7-1.

    Table 7-1. Paper sizes with hyperref

    Paper size option   Meaning
    draft               all hypertext options are turned off
    debug               extra diagnostic messages are printed in the log file
    a4paper             210mm x 297mm
    a5paper             148mm x 210mm
    b5paper             176mm x 250mm
    letterpaper         8.5in x 11in
    legalpaper          8.5in x 14in
    executivepaper      7.25in x 10.5in

The breaklinks option enables breaking of long hypertext links across lines. The linktocpage option has the effect that, in the table of contents, only the page number (and not the chapter/section text) links to the relevant chapter or section. All possible link colour options are shown in Table 7-2. The frenchlinks option differentiates links from the rest of the text not through colours, but through small caps instead.

Table 7-2. Link colours with hyperref

Option        Standard colour   Meaning
linkcolor     red               internal links
anchorcolor   black             anchors
citecolor     green             citations
filecolor     magenta           links to files
menucolor     red               links to Acrobat® menus
pagecolor     red               links to page numbers
urlcolor      cyan              links to WWW pages

See Erstellung von pdf-Dokumenten mit LaTeX for more PDF options.
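
As a hedged illustration of how the options in this section fit together, the following jadetex.cfg fragment sets a few of them with \hypersetup. The option names come from the lists and tables above, except colorlinks, which is a standard hyperref option not discussed in this document; it is assumed here because hyperref normally marks links with coloured frames rather than coloured text unless it is switched on. The values are examples only, and whether every option still takes effect at the point where jadetex.cfg is read may depend on your hyperref version.

```latex
% jadetex.cfg (fragment) - a sketch, not the configuration shipped with lyxtox
\hypersetup{%
  pdfpagemode=UseOutlines,% open with the bookmarks panel visible
  pdfstartpage=1,%          open at the first page
  pdfstartview=FitH,%       fit the page width to the window
  breaklinks=true,%         allow long links to break across lines
  colorlinks=true,%         assumption: colour the link text itself
  linkcolor=red,%           internal links (the default from Table 7-2)
  urlcolor=cyan%            links to WWW pages (the default from Table 7-2)
}
```

Every line inside the braces ends in a comment character, so that no blank line or stray space ends up in the option list.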


7.2.7. Hyphenation

If pdfjadetex encounters a word it does not know how to hyphenate, the word is skipped! In the following example from Customizing Document Production, the French words savoir and évolution were problematic.

The error log file contained the following lines:

Overfull \hbox (3.64668pt too wide) in paragraph at lines 249--256
\T1/ptm/m/n/11 plusiers resources liées á linux-mandrake. Si vous souhaitez en savoir

This message tells us that pdfjadetex does not know how to hyphenate savoir and consequently skips it. For pdfjadetex to format lines correctly, we must ensure that hyphenation is explicitly activated using the command \def\Hyphenate{1} in jadetex.cfg and that the text is justified (\def\Quadding{justify}). In addition, we must explicitly set the language in order to activate the correct hyphenation module (with \def\Language{UK}).

Internet addresses can become quite long, even longer than a single line. jadetex doesn't hyphenate them properly. Actually, no hyphenation rule prevents it from inserting a hyphen after the double slash. The url package was developed by Donald Arseneau to solve these problems. To use it, you must tell jadetex to load it. You do this by inserting the following in the jadetex.cfg file (see Section 4.4 and Section 7.2.12 for more on jadetex.cfg):

\usepackage{url}

We must also ensure that openjade properly transforms the <ulink> and <filename> elements into jadetex commands. See Section 7.2.10 for more on this.
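
The hyphenation-related settings named in this section could be collected in jadetex.cfg roughly as follows. This is only a sketch: the three \def lines are the ones quoted above, and the language value UK is the one given in the text; the value needed for a French document is not stated there and is not guessed here.

```latex
% jadetex.cfg (fragment) - hyphenation settings from this section
\def\Hyphenate{1}%        switch hyphenation on explicitly
\def\Quadding{justify}%   justify the text
\def\Language{UK}%        select the hyphenation module (value as quoted above)
\usepackage{url}%         let long internet addresses break sensibly
```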


7.2.8. Bookmarks

Bookmarks are a navigation aid for the Acrobat® Reader. They are a tree-like structure that reflects the chapter/section structure of the document. The nodes of the tree are the chapter/section texts, which link directly to the respective chapter/section. The behaviour of bookmarks can be customised with the following options (since they are boolean variables, they can appear directly in the hypersetup command; to switch one off explicitly, use e.g. bookmarksopen=false):

  • bookmarks: bookmarks are created even if the \tableofcontents command is not present.

  • bookmarksopen: Expand all bookmarks.

  • bookmarksnumbered: Include section numbers in bookmarks.

  • bookmarksopenlevel: The maximal tree depth of bookmarks to open.
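
A minimal sketch of the bookmark options in jadetex.cfg, assuming that \hypersetup is used as elsewhere in this chapter. The values are examples, not the settings used by lyxtox. The bookmarks option itself is not repeated here; the sketch assumes that bookmarks are already being generated.

```latex
% jadetex.cfg (fragment) - bookmark behaviour, example values only
\hypersetup{%
  bookmarksopen=true,%      show the bookmark tree expanded
  bookmarksopenlevel=2,%    but only down to the second level
  bookmarksnumbered=true%   prefix bookmarks with section numbers
}
```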


7.2.9. PDF view options

The most important PDF view options are shown in Table 7-3. They are set in jadetex.cfg with the hypersetup command.

Table 7-3. PDF view options

PDF view option   Meaning                                                                 Values
pdfcenterwindow   Center the window of the document                                       false, true
pdffitwindow      Fit document window to the first document page                          false, true
pdfhighlight      How to highlight a link button when pressed                             /I (Inversion), /N (No effect), /O (Outline), /P (Pressed)
pdfmenubar        Show menu bar                                                           false, true
pdfnewwindow      Open the document in a new window                                       false, true
pdfpagelabels     Show logical labels, instead of page numbers (use only in \usepackage)  false, true
pdfpagelayout     Controls the page layout upon opening of the document                   SinglePage (default), OneColumn, TwoColumnLeft, TwoColumnRight
pdfpagemode       Specifies how to open the document                                      None (default), UseThumbs, UseOutlines, FullScreen
pdfstartpage      Specifies the opening page of the document                              1 (default), other page number
pdfstartview      Specifies opening size                                                  FitH, FitB
pdftoolbar        Show viewer's toolbar                                                   false, true
plainpages        Page anchors as arabic numbers                                          false, true

See Erstellung von pdf-Dokumenten mit LaTeX for more PDF options.
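
To make the table more concrete, here is a hedged sketch of a jadetex.cfg fragment that sets several of the view options from Table 7-3. The option names and the admissible values are taken from the table; the particular combination is only an example and not the one used by lyxtox.

```latex
% jadetex.cfg (fragment) - viewer window options from Table 7-3
\hypersetup{%
  pdfpagelayout=OneColumn,% continuous scrolling, one page wide
  pdffitwindow=true,%       resize the window to the first page
  pdfcenterwindow=true,%    centre the window on the screen
  pdftoolbar=true,%         keep the viewer's toolbar visible
  pdfmenubar=true,%         keep the menu bar visible
  pdfhighlight=/O%          outline a link while it is pressed
}
```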


7.2.10. Links to internet sites

The url package enables the use of URL links in the PDF document and takes care of their hyphenation. To use it, you must tell jadetex to load it. You do this by inserting the following in the jadetex.cfg file (see also Section 4.4):

\usepackage{url}

We must also ensure that openjade properly transforms the <ulink> and <filename> elements into jadetex commands. For the <ulink> element, we want openjade to copy the text found between the two tags and within parentheses, i.e. the address given in the url attribute.

In order to do this, we use the customization as described by Pascal Lo Ré in Customizing Document Production: we create a sosofo of type formatting-instruction. We added a new flow-object to those already defined in the DSSSL stylesheets:

;;  Inserted in order to be able to get URLs in PDF documents.
;;  Adapted from manual-print.dsl of <productname>Mandrake</productname>.
;; Include the flow object class "formatting-instruction" : ONLY for Jade
(declare-flow-object-class formatting-instruction
       "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")

This addition allows us to insert arbitrary, non-formatted text into the output file. Then we can insert suitable TeX instructions into the intermediate output. This new flow-object is called formatting-instruction. The usage syntax of formatting-instruction is as follows:

(make formatting-instruction data:
"text-to-be-output")

The value of the data variable is inserted into the output file. Note that the “\” character must be escaped with an additional “\”. For example, in order to insert the TeX command \penalty, you would use the following:

(make formatting-instruction data:
"\\penalty")

Now, in order to get hypertext links in PDF, the ulink element is redefined in lyxtox-print.dsl with the help of formatting-instruction (see Using Jade for SGML transformations for the class "formatting-instruction" and Section 4.2 and Section 7.1.5 for the lyxtox-print.dsl file):

;; *** URLs ***
;; Original : dblink.dsl
(element ulink
 (sosofo-append
  ;; If you allow process-children here, you will get the text printed once more!
  ;; (process-children)             ;; Write the text with its format (anchor in HTML)
  (make formatting-instruction      ;; Write : " \href{" + theUrl + "}{" + theText + "}"
    data: (string-append " \\href{" (attribute-string (normalize "url")) "}{" (data-of (current-node)) "}")
  )
 )
)

The link text is accessed by (data-of (current-node)) and the address is returned by (attribute-string (normalize "url")). To construct the string, we concatenate the pieces with string-append and assign the result to data in the sosofo. In the Mandrake original, (process-children) was also called in order to write the text with its own formatting; it is commented out here, because otherwise the text would be printed twice. This nicely produces hypertext links in PDF.

However, the <ulink> element is not only used for URLs, but also for the index generation (see Section 7.1.11). Since we redefined it above, it is going to be used for the index too (remember, the changes you make in the stylesheets, as discussed in Section 7.1.5, will take precedence over the defaults). We thus have to copy the code from dbindex.dsl for <ulink> elements that are children of <primaryie>, <secondaryie> or <tertiaryie>, which is the case for the index:

;; These three elements are from "dbindex.dsl".
;; Must be placed here because of the redefinition of "ulink".
;; Otherwise the Index entries will point to HTML files,
;; instead of page numbers.
(element (primaryie ulink)
  (indexentry-link (current-node)))
(element (secondaryie ulink)
  (indexentry-link (current-node)))
(element (tertiaryie ulink)
  (indexentry-link (current-node)))

7.2.11. Thumbnails

The thumbpdf package by Heiko Oberdiek installs the Perl program thumbpdf on your system. With the help of thumbpdf and Ghostscript (which should also be installed), you can create thumbnails for the PDF document (to be seen when you click on the “thumbnails” tab in Acrobat® Reader). Thumbnails are embedded images of the document's pages, drawn in small size and resolution. Their purpose is to facilitate navigation through the document (of course only if the PDF viewer supports them).

You need at least Ghostscript 5.50 in order to be able to use thumbpdf. You need to declare its use in the jadetex.cfg file (see Section 4.4 and Section 7.2.12) as follows:

\usepackage[pdftex]{thumbpdf}

The PDF document with thumbnails is created in a three-pass process:

  1. The PDF file without thumbnails is created first.

  2. thumbpdf is called and given as argument the basename of the PDF file. It creates the thumbnails in a file with the same basename and the ending .tpt.

  3. The PDF file is created again. Through the \usepackage instruction above, pdfjadetex searches for a .tpt file in the current directory and uses it to embed the thumbnails in the PDF file.

If the parameter use_coolthumbs is set to 1 in lyxtox (that's currently the default), the thumbnails will be generated using the coolthumbs script (see Section 4.15), which in turn will call The GIMP to create smooth, anti-aliased thumbnails. See the Linux LaTeX-PDF HOW-TO for the details of the inner workings of coolthumbs.

See also Section 7.1.4.7 for the PDF document creation process.


7.2.12. Configuring pdfjadetex

The file jadetex.cfg is used whenever you want to override jadetex's or pdfjadetex's default behavior. This file just goes in the current working directory (i.e. where jadetex is being run from). It seems that pretty much any TeX code can go in there, but here are some common things.

  • The hyperref package expands the cross-referencing capabilities of LaTeX by introducing \special commands that can be interpreted by a driver (like pdfjadetex) to produce hypertext links to places in the same document (cross-references), other PDF documents, or even WWW pages. You declare its use in jadetex.cfg with

    \hypersetup{
    

    inserting various options after the curly bracket. See Section 7.2.6.

  • Two-Sided Pages: sometimes, you will want jadetex to start chapters on the recto side, and try to keep the total count of pages even. This is no longer the default behaviour, so if you want to revert to it, put the following in jadetex.cfg:

    \def\PageTwoSide{1} \def\TwoSideStartOnRight{1}
    
  • PDF Outlines (bookmarks): PDF outlines are a nested, tree-like list of the hierarchy of chapters, sections, etc., each tree node being a link to the relevant chapter or section. These are displayed on the left side of Acrobat® Reader. To enable the listing of bookmarks, put this inside the \hypersetup command in jadetex.cfg:

    pdfpagemode=UseOutlines
    
  • If you happen to have a "Citation Reference" or a "Cross Reference" inside of a "Section" (for example "5 The Proof [somebook]"), then you will need to give the linktocpage option to hyperref, otherwise you will get an error when generating the Table of Contents saying

    "pdfTeX error (ext4): link annotations can't be nested"
    

    Put the following in jadetex.cfg:

    \usepackage[pdftex,linktocpage]{hyperref}
    
  • Computer Modern (CM) fonts: The ae package provides virtual EC fonts based on the CM ones, so that the Type1 CM fonts can be used in the output file. However, not all EC characters are available in this way. The aecompl package defines the missing characters as bitmapped fonts. To use them you should have the following in jadetex.cfg:

    \usepackage[T1]{fontenc} 
    \usepackage{ae} 
    \usepackage{aecompl} 
    

    In this way the output will use CM fonts for all except some rarely used characters
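
Put together, the fragments from this section might form a jadetex.cfg like the one below. This is a sketch under stated assumptions, not the file distributed with lyxtox: the order of the lines is my own choice, the \hypersetup call contains only the single option shown above, and the \usepackage[pdftex,linktocpage]{hyperref} line is left out because it is only needed in the nested-link case described above.

```latex
% jadetex.cfg - a sketch assembled from the fragments in this section
\usepackage[T1]{fontenc}%          T1 font encoding
\usepackage{ae}%                   virtual EC fonts built on the Type1 CM fonts
\usepackage{aecompl}%              bitmapped fallbacks for the missing characters
\def\PageTwoSide{1}%               two-sided layout
\def\TwoSideStartOnRight{1}%       chapters start on the recto side
\hypersetup{%
  pdfpagemode=UseOutlines%         show the bookmarks when the PDF opens
}
```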


7.2.13. Further enhancements

The PDF format offers some possibilities that further enhance its use:

  • Optimization: The document can be optimized for browsing, so that it needs only be partially downloaded to be viewed.

  • Passwords: Using programs like pdlin [28] you can set passwords and permissions regarding printing, changing, selecting and adding text.

These enhancements are not used in my scripts, but they certainly could be incorporated if needed.

For more information on the creation of powerful PDF documents see Erstellung von pdf-Dokumenten mit LaTeX.

The manual-print.dsl file, used for the print versions of the Mandrake Linux distribution, contains examples of customizations that are beyond the scope of this document, but may very well be useful in a heavy production environment. These include:

  • Definition of language

  • Foreign text or abbreviations

  • Long titles

  • Glossary entries

  • Inline graphics

  • Character fonts

and other goodies that make reading of this document strongly recommended for the interested reader.


7.3. Optimal PS

We discuss some PS-specific topics here - currently only how to embed Computer Modern fonts in a PS document.


7.3.1. Embedding Computer Modern fonts

The reason for embedding the Type1 Computer Modern fonts in PostScript® (PS) documents is the same as for PDF ones (see Section 7.2.3): the resolution of the bitmapped fonts that TeX/LaTeX uses by default matches that of the printer on the system on which you create the document. This is rarely the same as the resolution of the monitor or printer the reader will use. This change in resolution results in terrible quality when displaying these fonts on a screen, or when printing them on a printer whose resolution does not match that of the bitmapped font.

Tip:
 

In recent TeX/LaTeX distributions, freely available Type1 versions of the CM fonts are provided. These appear under the collective name of bluesky. The bluesky fonts can be obtained from CTAN, if not already installed on your system.

You can embed the CM fonts in the PS document by selecting an appropriate dvips “printer” configuration, either with the -P option or, as here, through the PRINTER environment variable:

PRINTER="cmz"
export PRINTER

I use this comfortable method in the lyxtox script. It makes dvips (which is called directly in the current lyxtox script, or indirectly by sgmltools in older versions of lyxtox, see Section 7.1.4.9) read the configuration file config.cmz (under /var/lib/texmf/dvips/config/config.cmz on my system). On my system, this file contains only one line:

p +psfonts.cmz

and looking at the psfonts.cmz file (located in /usr/share/texmf/dvips/bluesky/psfonts.cmz, notice the name “bluesky” in the directory path), I can see that all the necessary mappings from short names to full ones are there. Here are some lines of psfonts.cmz:

cmb10           CMB10           <cmb10.pfb
cmbsy10         CMBSY10         <cmbsy10.pfb
cmbx10          CMBX10          <cmbx10.pfb
cmbx12          CMBX12          <cmbx12.pfb
cmbx5           CMBX5           <cmbx5.pfb
cmbx6           CMBX6           <cmbx6.pfb
cmbx7           CMBX7           <cmbx7.pfb
cmbx8           CMBX8           <cmbx8.pfb
cmbx9           CMBX9           <cmbx9.pfb
cmbxsl10        CMBXSL10        <cmbxsl10.pfb

If no map file like the above can be found, something is probably missing from your distribution, unless you installed TeX/LaTeX by hand, in which case you know what you are doing and might try the following (see LaTeX to PDF Guide):

Search for a file pdftex.map (see Section 4.3) in your TeX distribution (located in /var/lib/texmf/dvips/config/pdftex.map on my system). Run a Perl script as follows:

perl -ne '($font, @rest) = split(/ /);   # first field is the short font name
          $uc = $font;
          $uc =~ tr/a-z/A-Z/;            # full name = short name in upper case
          print "$font $uc @rest";
         ' PATH-TO-PDFTEX-MAP/pdftex.map > cm.map

The created cm.map file instructs dvips to embed the Type1 outline fonts for the Computer Modern fonts into the resulting PostScript® file.

Add the following line to your personal .dvipsrc:

p +PATH-TO-CM-MAP/cm.map 

This tells dvips to use the font map you just created in addition to the system-wide configuration files.

The LyX User's Guide (available through the menu: Help-->User's Guide) contains a whole chapter devoted to the configuration of dvips and Ghostscript for LyX (Section 2.6, as of LyX 1.2.0), which is definitely recommended reading.


Chapter 8. HTML validation

The W3C maintains an HTML validator service at The W3C Validator. If you try it on various pages on the Internet that were created using the methods described in this document (i.e. DocBook SGML, openjade etc.), you may be surprised that so many of them return the following error when validated:

I was not able to extract a character encoding labeling from 
any of the valid sources for such information. Without encoding 
information it is impossible to validate the document.

This is because their authors did not take the trouble to examine the created HTML documents and enhance them to conform to the HTML standards. There is, in fact, some amount of work involved, if you want your HTML documents to obey the standards set by the W3C - but this work too is automated in the scripts presented here! Let's have a look how this is done:

The above error from the W3C HTML Validator comes from the fact that the documents, as produced by openjade and the configuration settings discussed so far, do not include something like

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

But even if they did, the all-important DOCTYPE statement

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

would also be missing, making validation against an HTML DTD impossible. This may be a deliberate “feature” of the tools involved in the document creation chain. But it may also have its root in an option that has gone unnoticed by me all this time! If you happen to know of such an option (perhaps in the HTML stylesheet?), please don't hesitate to contact me.

The way I decided to close this gap is an idea I borrowed from Hugo van der Kooij while reading his document on how to set up your own DocBook processing: after the HTML document has been created, two parts are extracted from it, the title part and the body part, and stored in separate temporary files called title.tmp and body.tmp respectively. This splitting is done by an awk script called htmlsplit.awk. You should also have created three text files:

  1. your replacement HTML code up until the <title> tag - this is what we will call part1,

  2. the code that follows the title, from the closing </title> tag up to the opening <body> tag (it could also contain some navigational menu structure specific to your website, but we will not pursue this further here, see Hugo's original document for this) - this is what we will call part2,

  3. the footer part - which we will call part3.

Part1 looks like

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>

so that is where the DOCTYPE statement goes! Part2 contains:

</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF">

so that is where the encoding information goes! I have also set the background colour to fit my site's design. Part3 contains

<table width="100%">
<tr align="center">
<td valign="middle">
<a href="http://validator.w3.org/check?ss=1&amp;sp=1&amp;uri=http%3A%2F%2F_DOMAIN_%2F_DIRNAME_%2F_FILENAME_">
<img border="0" src="images/valid-html401.png" alt="Valid HTML 4.01!" height="31" width="88"></a>
</td>
<td valign="middle">
<a href="http://counter.li.org">
<img border="0" src="images/linux_user_314103.png" alt="Linux User #314103"></a>
</td>
<td valign="middle">
<a href="http://www.anybrowser.org/campaign/">
<img border="0" src="images/w3c_ab.png" alt="Best viewed with ANY browser!"></a>
</td>
</tr>
</table>
</body>
</html>

and, as you can easily see, is a customized footer. Now we proceed to intermix all the above files in the following order into one HTML document:

  1. part1

  2. title

  3. part2

  4. body

  5. part3

At the end we get an HTML document that is customized to the design of our site and can easily be checked for compliance with the HTML standards!

Tip HTML Parameters and Chunking
 

You can achieve a similar result by setting the html-header-tags parameter accordingly in the HTML DSSSL stylesheet (Section 4.2, Section 7.1.5). The html-header-tags parameter should contain a list of the HTML HEAD tags that should be generated. The format is a list of lists; each interior list consists of a tag name and a set of attribute/value pairs: '(("META" ("NAME" "name") ("CONTENT" "content"))).

(define %html-header-tags%
 ;; What additional HEAD tags should be generated?
 '())

Of course, you would have to change the html-header-tags parameter in the .dsl file each time before processing a new document. You would thus need some kind of placeholders that could be identified and changed with sed to the actually needed values. However, this amounts to the same effort that we are currently investing with our method, which also substitutes various placeholders in parts 1-3. Obviously, such a flexibility must come at some cost. Feel free to experiment with the other HTML Parameters for Chunking too!

The footer file, part3, deserves some extra attention, since it illustrates the kind of customization and control over your HTML output that you can achieve with this method: it contains the HTML code that prints three icons in a row - a W3C HTML validation icon, a Linux Counter icon and an icon from the "any browser" campaign. There are three placeholders in the link for the HTML validation icon: _DOMAIN_, _DIRNAME_ and _FILENAME_. These are substituted on-the-fly (using sed one-line commands) with the domain, directory and filename respectively of the file whose footer we are currently processing:

$SED -e "s/_DOMAIN_/$DOMAIN/g" ${DATADIR}/part3 > part3_1.tmp
$SED -e "s/_DIRNAME_/$1/g" part3_1.tmp > part3_2.tmp
$SED -e "s/_FILENAME_/${BASENAME}/g" part3_2.tmp > part3.tmp

The result is an icon that, when clicked, will automatically pass the URI of the current file to The W3C Validator for HTML validation!

Note Please note:
 

A file containing graphical callouts (see Section 4.8 and Section 5.9) will NOT be validated! You will get an error saying

document type does not allow element "IMG" here

In fact, any HTML standard after 2.0 explicitly forbids <img> from occurring anywhere inside the parse tree of a <pre> element - and that is exactly where the graphical callouts most often occur! For example, the HTML 3.2 reference specification states: "PRE has the same content model as paragraphs, excluding images and elements that produce changes in font size, e.g. IMG, BIG, SMALL, SUB, SUP and FONT". If you want your document to be fully compliant, you will have to suppress the use of graphics for the callouts by setting %callout-graphics% to false in your stylesheet:

(define %callout-graphics%
;; Use graphics in callouts?
#f)

Also note that admonitions (see Section 4.7 and Section 5.8) will be positioned using a valign value that the DTD does not allow, instead of the correct valign="middle", which again makes validation fail. But this is easily corrected by the sed script sedscr_val that is run near the end of the lyxtox script.


Chapter 9. Accessibility

A broad definition of accessibility covers people operating under situational limitations as well as functional limitations:

Functional limitations pertain to disabilities, such as blindness or limited use of the hands. Functional limitations can be visual, auditory, physical, or cognitive (which includes language and learning disabilities).

Situational limitations relate to the prevailing circumstances, environment, or device. These limitations can affect anybody, not just people with disabilities. Examples include mobile devices and device limitations, such as having no mouse, or constraining circumstances, such as interacting with a web site through a computer integrated into a car's dashboard, where the use of the hands and eyes is limited.

Shawn Lawton Henry, Constructing Accessible Web Sites.

Something is accessible if it can be used by persons with disabilities. In the context of computing, this generally means that the software or device should be compatible with access aids, and should be able to transform itself into a needed format (see the Glossary of Bobby). The efficiency with which information can be accessed by people with various abilities and disabilities ultimately determines the degree of accessibility - it is clear that this is a highly subjective matter. Nevertheless, various criteria have been developed to help determine how accessible a web page is. I will follow the prioritisation of accessibility errors as suggested by Bobby in http://bobby.watchfire.com:80/bobby/html/en/readreport.jsp:

Priority 1 Accessibility problems

seriously affect a page's usability by people with disabilities. A Bobby Approved rating can only be granted to a site in which none of the pages contain accessibility errors. Bobby Approved status is equivalent to Conformance Level A for the Web Content Guidelines.

Priority 2 Accessibility problems

are those which you should try to fix. Although not as vital as Priority 1 access errors, the items in this section are considered important for access. If you can pass all items in this section in addition to the Priority 1 section, including relevant User Checks, your page meets Conformance Level AA for the Web Content Guidelines. This is the preferred minimum conformance level for an accessible site.

Priority 3 Accessibility problems

are third-tier access problems which you should also consider. If you can pass all items in this section in addition to the Priority 1 and 2 sections, including relevant User Checks, your page meets Conformance Level AAA for the Web Content Guidelines.

If you think this is “fine print” that does not have to bother you, you are wrong: no lesser than the Sydney Organizing Committee for the Olympic Games (SOCOG) thought the same and was fined to pay A$20000 in a case that was brought to an australian court - see the Reader's guide to Sydney Olympics accessibility complaint for the whole story and an explanation of the court's decision, as well as Olympic Failure: A Case for Making the Web Accessible for the web designer's point of view.

You also cannot argue that this has happened “too far away” from you, perhaps on another continent. The world is growing together and all western nations have passed legislation that is more or less similar on this point, based on the same legal principles of unequal treatment (“discrimination”; “unfavourable” treatment) and unjustifiable hardship (“undue” hardship or “burden”). It is a matter of time until similar cases appear before the courts. I hope you understand by now the following

Important Important fact:
 

Accessibility is NOT optional!

So what can you do to improve the HTML pages created by the tools I presented, from the accessibility point of view? You can pass any of your generated HTML pages to Bobby for an accessibility test[29]. You will be presented with a list of errors, sorted according to priority as above. I will discuss them for a typical page that was generated with the tools I presented in the previous chapters.

Tip Mean tip:
 

Not all of the accessibility errors presented in the following sections can be reproduced when testing a page that was generated with the method presented here. To get the full idea, pass the Linux Accessibility HOWTO pages to Bobby for a test.

For more information on accessibility, see the Bobby Accessibility FAQ and the Linux Accessibility HOWTO.


9.1. Priority 1 accessibility errors

Bobby reports no priority 1 accessibility errors for a page that was generated with lyxtox. This is good news, as it means that the pages generated this way do not pose serious obstacles to people with disabilities. Provided that we didn't miss anything, we can say that the pages have “Bobby Approved status”, or that they conform to Level A of the Web Content Guidelines.


9.2. Priority 2 accessibility errors

The explanation texts are from Bobby. Clicking on any of the problems that Bobby reports will produce a more detailed description of how to fix the problem. In addition to items that Bobby can examine automatically, a number of items that require manual examination are presented in the User Checks section of the Bobby report.

Nest headings properly.

This comes from incorrect nesting of the H elements in the authors environment. I could not correct this, as it seems to me to be a stylesheet/Jade problem. Of course, you could correct it manually, or with a script, but I think it is better to correct it before it happens.

Important Why it is bad:
 

Some users skim through a document by navigating its headings (the H1, H2, H3, H4, H5, and H6 elements). Some access aids extract the headings to create an outline of the page, allowing users to get an overview and jump quickly to a desired part. Incorrect nesting of headings will result in an incorrect outline structure which may disorient users. Screen readers rely on these tags to interpret the structure of your pages. See also "Using real headers" by Mark Pilgrim.
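
To make the notion of "incorrect nesting" concrete, here is a minimal Python sketch that lists skipped heading levels in an HTML page. It is not part of Bobby or of lyxtox; the names HeadingCollector and skipped_headings are made up for this illustration, and the rule it applies (a heading may be at most one level deeper than the heading before it) is only an assumption about what "proper nesting" means:

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels of H1..H6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_headings(html):
    """Return (previous, current) pairs where a heading level is skipped.

    Assumed rule: a heading may be at most one level deeper than the
    heading that precedes it (the document start counts as level 0).
    """
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    previous = 0
    for level in collector.levels:
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems
```

Called on the text of a generated page, skipped_headings returns an empty list when every heading is at most one level deeper than its predecessor.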

Use a public text identifier in a DOCTYPE statement.

We have corrected this already in Chapter 8.

Important Why it is bad:
 

Include a document type declaration at the beginning of a document that refers to a published DTD (e.g., the HTML 4.01 transitional DTD). The document type declaration should be appropriate to the markup language you are using.


9.3. Priority 3 accessibility errors

Provide a summary for tables.

This refers to the revision history table. Again, it seems to be a stylesheet/Jade problem.

Identify the language of the text.

This helps the computer or assistive device present information in a way that is appropriate to the language and also helps automatic translation software that translates text from one language into another. This should already have been fixed with the techniques of Chapter 8.


Chapter 10. Mathematics

Since LyX is a frontend for LaTeX, it is no secret that it has exceptional capabilities when it comes to typesetting mathematics. However, when it comes to transforming the .lyx document to a different format, the possibilities are rather limited[30]:

The above is not a satisfactory situation. Even if we can get mathematics in some formats, we would have to abandon the SGML framework we have been using so far. It would be nice if we could continue to write a DocBook SGML document in LyX, using all the excellent math typesetting capabilities of TeX/LaTeX and then export to whichever of the above formats through our usual scripts and tools like openjade, pdfjadetex etc. In fact, as the following example already shows, we can:

Equation 10-1. (eq2)

Let's see what is going on, from the SGML point of view, when you type an equation like Equation 10-1[31] in LyX: when exported to SGML, this equation yields the following MathML code:

 <equation>
  <alt>\begin{equation}
{\displaystyle }\int _{a}^{b}x^{2}dx=\frac{1}{3}(b-a)^{3}\label{eq2}\end{equation}

  </alt>
  <math>
   <mtable>
    <mtr>
     <mrow>[par][displaystyle [par]]<mo> &int; </mo>
      <mrow>
       <msup><mi> x </mi><mn> 2 </mn>
       </msup>
      </mrow><mo> &InvisibleTimes; </mo>
      <mrow><mo> &DifferentialD; </mo><mi> x </mi>
      </mrow><mn> = </mn>
      <mfrac><mn> 1 </mn><mn> 3 </mn>
      </mfrac>
      <msup><fenced open="(" close=")">
       <mrow><mi> b </mi><mn> - </mn><mi> a </mi>
       </mrow></fenced><mn> 3 </mn>
      </msup>
     </mrow>
    </mtr>
   </mtable>
  </math>
 </equation>

Some remarks are due here:

As you see, LyX is already capable of producing the (excessively verbose) MathML version of our equation. However, the technologies for delivering Math on the Web through MathML are in constant flux (see a MathML status report). Furthermore, my openjade will spit out dozens of errors, one for each MathML tag. Even if I were able to eliminate these through the use of some extra module, it seems that the quality of the results, be it online or printed, is going to be rather disappointing, as the following quote by Allin Cottrell[32] clearly indicates:

A dsssl engine such as openjade can turn the MathML into TeX for you but the results are likely to be disappointing, particularly if you are used to typesetting mathematics using TeX itself. TeX's native mathematical typesetting is near-perfect, only occasionally requiring manual tweaking to achieve optimal results; it is also rather comprehensive, with the aid of the AMS (American Mathematical Society) extensions if need be. But if you take the route of MathML to TeX via dsssl and jade, the specifics of the math typesetting must be handled by the dsssl stylesheet. David Carlisle put some work into this a few years back (for which we can be grateful), but he didn't finish the job and nobody else has done so since. Thus if you send MathML through jade to TeX you are likely to find (a) that those elements that are recognized by the stylesheets are typeset less adeptly than by TeX itself (with clumsy-looking spacing), while (b) various important elements may not be recognized at all. For example in my field of statistics the overbar (denoting the arithmetic mean) is a common modifier, but it is simply ignored. Other formulations common in statistics are also ignored, or are not dealt with properly, so this route is not really usable for me.

Allin Cottrell has introduced the DBTeXMath method, which I have slightly modified and incorporated in the method I presented so far. In the following sections I will first describe the necessary steps and software, then explain the details.


10.1. DBTeXMath

You probably already have almost all the necessary software installed on your system for the DBTeXMath method to work on it, but let's have a look at it anyway:

  • Perl: The method is heavily based on some post-processing done by Perl scripts, so Perl is a must. Any modern version should do, so just install the Perl package and the Perl modules of your distribution.

  • The three .dsl files (stylesheets) as discussed in Section 4.2. They already contain all necessary changes to process Mathematics, so you will just need to copy them to the appropriate locations - but you have done that already in Section 4.2.

  • TeX, LaTeX (see Section 3.5), dvips and convert (see Section 3.6) are also used heavily, but you have already installed those, haven't you?

So far, you might not have to install anything additional, but the following is definitely new:


10.2. Writing Mathematics in LyX

You write Mathematics in LyX just as you would write it anyway, i.e. as if it were going to be processed by TeX and not Jade. The whole combined power of TeX/LaTeX/LyX lies at your fingertips! So let's try some math here! This is a numbered displayed equation:

Equation 10-2. (eq3)

An example of an unnumbered equation:

Equation 10-3. (eq4)

And here is an example of an inline formula: [inline formula], or again [inline formula].

This is a partially numbered displayed equation:

Equation 10-4. (eq5)

while in this one all equations are numbered:

Equation 10-5. (eq6)

Now, let's do some more advanced examples:

Equation 10-6. (eq7)

How about these: a matrix

Equation 10-7. (eq8)

a partially filled matrix

Equation 10-8. (eq9)

a continued fraction

Equation 10-9. (eq10)

a limit

Equation 10-10. (eq11)

and some inequalities

Equation 10-11. (eq12)

Equation 10-12. (eq13)

Equation 10-13. (eq14)

Here is an inline inequality, with a proof of non-negativity of relative entropy ([inline formula]): show that [inline formula], for [inline formula]. Then observe that (here comes an inequality array):

Equation 10-14. (eq15)

If you are looking at the PDF or PS version of this document, everything will look perfect, because it was typeset directly by TeX. But if you are reading the HTML or RTF version, then you might have noticed that the equation numbering is not continuous. Rather it starts all over from (1) in each multiline equation. This is a bug of the method we use: each equation, be it inline or displayed, one line, or multi-line, will be processed by TeX as a separate document (only for HTML and RTF), starting the numbering from 1 over and over again. Till some TeX Guru (anyone reading?) out there tells me how to work around this, the simplest solution for equation numbering is to follow the rules below:

  1. Don't put more than one label in a multi-line equation (an equation array in LaTeX jargon). If you absolutely need to, split the equation in two, each part containing only one label.

  2. Start each equation label with “eq”, followed by a number, like “eq1”, “eq2”, “eq78” etc. My scripts number each displayed equation (i.e. not the inline ones) consecutively through the whole document, automatically assigning an id and a title to them in the SGML code. These ids and titles are always of the form “eqxxx”, where xxx is the number of the equation, and refer to the whole equation, not some line of it. On the other hand, LyX knows only of equation labels assigned to some line of some equation. If you followed the previous rule and if you name the one and only label of your equation with the “eqxxx” label displayed in its title, then you can refer to it from LyX like any other cross-reference and everything will work perfectly!

  3. The best way to implement the previous rule is to just process your document once without any labels in equations. You will see that all versions will have concise and continuous numbering in the equation titles. You will see titles like “Equation 11-1. (eq2)”. The “11-1” comes from Jade and means “the first equation of chapter 11”. The “(eq2)” is the title, assigned automatically by my scripts. Now, if you want to label this equation in LyX, label only one line of it (first rule) and label it “eq2” (second rule[34]).


10.3. The magic behind the math

It's not the math behind the magic - so you don't have to be afraid of drowning in formulas! Just read on to delve into the details of the math document processing that goes on behind the scenes while you are drinking that (hopefully not cold) cup of coffee.

The idea behind the DBTeXMath method used here relies on the fact that, as we saw in the code example in Chapter 10 above, all the TeX code for typesetting the equation is contained between the <alt> tags inside the <equation> tags. The <alt> tag is somewhat “misused” for this purpose, but we will not bother about this for the moment (see the discussion in Section 10.4). DBTeXMath tweaks the stylesheets in order to let the TeX code

  • be processed and converted into bitmapped images, in PNG format for HTML and in BMP format for RTF, or

  • be processed and typeset directly for PDF and PS.

The code between the <math> tags is completely ignored - in fact, it is deleted. The steps in more detail:


10.3.1. SGML math code correction

SGML math code as exported by LyX is, once again, not perfect and the first step consists in correcting it, just as we did in Section 7.1.4.1. We will change the SGML math code to fit our needs using again runsed and sedscr. But this time, we are going to need an awk script too, awkscr_math. We'll see in Section 10.3.1.2 why.


10.3.1.1. Problems

Regarding Mathematics code, there are four distinct problems in LyX' SGML:

  1. LyX produces only an <alt>/</alt> tag pair (with the TeX code in it) and a <math>/</math> tag pair with the MathML representation of the equation. This is not enough. The <graphic fileref="equationimagefile"> is missing. Openjade will complain that the end tag for <equation> was reached, but the element was “not complete”. And of course it is right, since the <alt> tag was meant as a textual representation of the image (see Section 10.4), while the <graphic> tag is expected to hold the visual representation of it in the form of some image file. Clearly, we will have to create the <graphic fileref=...> tags from scratch. That's the first problem.

  2. The second problem is that LyX exports <equation> even for inline equations! The right tag would be <inlineequation> for inline equations (see inlineequation), <equation> for an equation with title (see equation) and <informalequation> for one without (see informalequation). This is very unfortunate, because it makes all equations “displayed”, i.e. drawn on a separate line. That's the second problem.

  3. The third problem is that the MathML code it produces cannot be dealt with by openjade (at least not with the standard modules I have on my system) and thus produces parse errors. Furthermore, we are not going to need it, since the TeX code inside the <alt> tags will suffice completely for us. We will thus have to delete everything between the <math>/</math> tags. That's the third problem.

  4. The fourth problem regards inequalities in Math Mode. Just writing

    Equation 10-15. (eq16)

    will produce an error

    E: element "B" undefined
    

    because the parser will see the angle brackets around b and think that it is an SGML element. That's the fourth problem.


10.3.1.2. Solution

We will start with the second problem from Section 10.3.1.1 above, since it is the only one that needs both sed and awk to be solved. It is also a nice example of a problem that cannot be solved with only one of them[35]. In order to solve it, an observation was crucial: whenever a displayed equation occurs, LyX ends the preceding line with </para><para>. The idea is therefore to change the <equation> tag that follows a line ending in </para><para> to something other than <equation>, so that we can safely say that the remaining <equation> tags actually denote inline equations and change them accordingly.

The following code in sedscr checks if the line ends in </para><para> and if so, gets the next line in the pattern space (N command). It then changes the <equation> to <informalequation>[36], prints the line and deletes the pattern space completely:

/<\/para><para>$/{
N
s/[ \t\n\r]*<equation>/\
<informalequation>\
/
p
d
}

The rest is accomplished in the awk script awkscr_math. We know by now that whatever <equation> tags may have remained, they denote the start of inline equations. We thus change <equation> to <inlineequation> and </equation> to </inlineequation>, but only between <equation>/</equation> tags. This is done by the following code in awkscr_math:

/<equation>/,/<\/equation>/{
    gsub("<equation>","<inlineequation>")
    gsub("<\/equation>","<\/inlineequation>")
}

Finally, we can transform whatever <informalequation> tags remain back to <equation>:

/<informalequation>/,/<\/equation>/{
    gsub("<informalequation>","<equation id=\"eq" ++num_eq "\"> <title>(eq" num_eq ")<\/title>")
}

By the way, we also did something that is impossible to do in sed: we added a dynamic id and title, both composed of the string “eq” and a dynamically increased counter. The id is the same as the “label” in LyX, the title is what will be displayed in the equation title. By letting the title be equal to the id, we are able to see what id an equation has in the SGML code (because it will be displayed in the title) and set the LyX label for that equation to be the same (see Section 10.2), thus making cross-references to equations possible[37].
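
The three passes above (mark displayed equations, rename the rest to inline, number the displayed ones) can be summarized in a single Python sketch. This is only an illustration of the logic, not a replacement for sedscr and awkscr_math: the function name classify_equations is hypothetical, the temporary <informalequation> detour is folded away, and the sketch assumes that each tag sits on a line of its own or at the end of a line, as in LyX's SGML export:

```python
def classify_equations(lines):
    """Mimic the sed/awk passes on a list of SGML lines (sketch only)."""
    out = []
    num_eq = 0               # running counter for displayed equations
    displayed_next = False   # previous line ended in </para><para>
    inline_open = False      # inside an equation that was renamed to inline
    for line in lines:
        if displayed_next and "<equation>" in line:
            # Displayed equation: give it a running id and a matching title.
            num_eq += 1
            line = line.replace(
                "<equation>",
                '<equation id="eq%d"> <title>(eq%d)</title>' % (num_eq, num_eq),
                1)
        elif "<equation>" in line:
            # Any other <equation> is taken to be an inline one.
            line = line.replace("<equation>", "<inlineequation>")
            inline_open = True
        if inline_open and "</equation>" in line:
            line = line.replace("</equation>", "</inlineequation>")
            inline_open = False
        displayed_next = line.rstrip().endswith("</para><para>")
        out.append(line)
    return out
```

The closing </equation> of a displayed equation is left untouched, which matches the awk code: only the opening tag receives the id and the title.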

The first problem, creating the <graphic fileref=...> tags, requires a decision: which filename to take? A first idea, to use the label from the TeX code between the <alt> tags, as long as there is one, is not viable: what if the TeX code describes more than one equation (an eqnarray), each one labeled with its own label? I decided to use random filenames in all situations, rather than running into such problems. Due to the random part, such a substitution calls for awk, rather than sed.

The random numbers, which will be the filenames to use, are currently drawn between 10000 and 20000. If you want to change these limits, you can do it in the BEGIN part of awkscr_math:

BEGIN {
num_min = 10000
num_max = 20000
num_ran = 0
num_dif = num_max - num_min
num_eq = 0
srand()
}

But sometimes, this randomness is not needed:

Tip How to get identical filenames for the equations from run to run
 

If you want the same sequence of random numbers each time you run the script (thus producing the same filenames from run to run), you should comment out the call to the seed function srand().
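
The effect of seeding can be illustrated outside awk as well. The following Python sketch is not taken from awkscr_math; the function name equation_numbers and the scaling formula are assumptions chosen to mirror the num_min and num_max limits of the BEGIN block. A fixed seed plays the role of the commented-out srand():

```python
import random

NUM_MIN = 10000   # same limits as num_min and num_max in the BEGIN block
NUM_MAX = 20000


def equation_numbers(count, seed=None):
    """Draw `count` file numbers from NUM_MIN up to (not including) NUM_MAX.

    With seed=None the generator is seeded by the system, so every run
    differs; with a fixed seed the same sequence comes back each time.
    """
    rng = random.Random(seed)
    return [NUM_MIN + int(rng.random() * (NUM_MAX - NUM_MIN))
            for _ in range(count)]
```

Two calls such as equation_numbers(5, seed=0) return identical lists, whereas two calls without a seed will in general differ.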

The following code creates the <graphic fileref=...> tag with a random filename in the fileref attribute after the closing </alt> tag:

 gsub("<\/alt>","<\/alt>\n
<\!\[ \%output\.print\.png; \[\n
<graphic fileref=\"images\/math\/" num_ran "\.png\">\n
\]\]>\n
<\!\[ \%output\.print\.pdf; \[\n
<graphic fileref=\"images\/math\/" num_ran "\.png\">\n
\]\]>\n
<\!\[ \%output\.print\.eps; \[\n
<graphic fileref=\"images\/math\/" num_ran "\.png\">\n
\]\]>\n
<\!\[ \%output\.print\.bmp; \[\n
<graphic fileref=\"images\/math\/" num_ran "\.bmp\">\n
\]\]>\n
")

More precisely, it substitutes </alt> with something like

 </alt>
<![ %output.print.png; [
<graphic fileref="images/math/10404.png">
]]>
<![ %output.print.pdf; [
<graphic fileref="images/math/10404.png">
]]>
<![ %output.print.eps; [
<graphic fileref="images/math/10404.png">
]]>
<![ %output.print.bmp; [
<graphic fileref="images/math/10404.bmp">
]]>

Some remarks:

  • The number used for the filename (10404) is randomly generated (num_ran, in the previous code example).

  • The directory for the images of the equations is images/math. If you need to change it, you will have to do so everywhere in awkscr_math.

  • We again make use of the output.print.xxx entities to denote code that has to be IGNOREd or INCLUDEd, depending on the format we are rendering, see Section 7.2.2 and Section 7.1.4.1.

  • We specify a PNG file even for PDF and PS processing. That's not important. These formats will not take into account the <graphic> element when processed (the stylesheets will take care of this). Nevertheless, we must put a <graphic> tag with some filename there, otherwise openjade will complain.

  • The important information is that the file 10404.png shall be used to display the equation when the output.print.png entity is included (i.e. only in HTML) and that 10404.bmp shall be used when the output.print.bmp entity is included (i.e. only in RTF).
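
As a cross-check of the substitution discussed in these remarks, here is a small Python sketch that builds the same replacement text for a given number. It is an illustration only: graphic_block and its image_dir parameter are hypothetical names, and the real work is done by the gsub call in awkscr_math:

```python
def graphic_block(num_ran, image_dir="images/math"):
    """Return the text that replaces </alt> for one equation."""
    parts = ["</alt>"]
    # The PDF and PS runs ignore the graphic, but openjade still wants one,
    # so they point at the PNG file just like the HTML run does.
    for entity, ext in (("png", "png"), ("pdf", "png"),
                        ("eps", "png"), ("bmp", "bmp")):
        parts.append("<![ %%output.print.%s; [" % entity)
        parts.append('<graphic fileref="%s/%d.%s">' % (image_dir, num_ran, ext))
        parts.append("]]>")
    return "\n".join(parts)
```

Keeping the directory in one parameter is a deliberate choice of the sketch; in awkscr_math the directory has to be changed in every place where it occurs.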

The above substitution takes place in the following 4 situations (which cover all math situations in LyX' SGML) in awkscr_math:

  • Between “<alt>\[” and “</alt>”.

  • Between “<alt>$” and “</alt>”.

  • Between “<alt>\begin{equation}” and “</alt>”.

  • Between “<alt>\begin{eqnarray}” and “</alt>”.

This solves our first problem.

While we are at it, we substitute the “<” and “>” symbols (that appear in inequalities between the alt tags) with their SGML entities[38]:

    if ( $1 != "<alt>\\\[" && $1 != "<\/alt>" ) {
        gsub("<"," \\&lt; ")
        gsub(">"," \\&gt; ")
    }

When texmath2pngbmp.pl is executed, it will find those entities in equation-list.sgml, written there as numeric character references (&#38;, &#62;, &#60;), and will substitute them back with the literal characters:

sub unescape {
    $eqn =~ s/&#38;/&/g;
    $eqn =~ s/&#62;/\>/g;
    $eqn =~ s/&#60;/\</g;
}

This solves the fourth problem.
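
The two halves of this fix can be tried out in isolation. The Python sketch below is not part of the scripts; escape_alt and unescape_alt are hypothetical names for the awk substitution and the Perl unescape routine respectively. Judging from the equation-list.sgml listing in Section 10.3.2.1, openjade sits between the two steps and writes the entities out as numeric character references, which is why the second function looks for &#60; rather than &lt;:

```python
def escape_alt(tex):
    """Replace < and > in TeX code so that an SGML parser sees no tags."""
    return tex.replace("<", " &lt; ").replace(">", " &gt; ")


def unescape_alt(tex):
    """Turn numeric character references back into plain characters.

    The order of the substitutions follows the Perl routine above.
    """
    return (tex.replace("&#38;", "&")
               .replace("&#62;", ">")
               .replace("&#60;", "<"))
```

For instance, escape_alt applied to the TeX code a<b>c no longer contains anything that looks like a tag named b.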

The third problem is easier to solve: the following code in awkscr_math will substitute everything between <math> and </math> with the empty string (thus creating an empty line):

/<math>/,/<\/math>/{
    gsub(".*","")
}

That empty lines do not get printed is easily seen from the last code block in awkscr_math:

!/^$/{
    print
}

which will print every line that is not empty.
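
The combined effect of the two awk blocks (blank out the <math> range, then skip empty lines) can be written as one loop. This Python sketch is only meant to show the behaviour of the awk range pattern; strip_math is a hypothetical name and the function works on a list of lines without trailing newlines:

```python
def strip_math(lines):
    """Drop everything from <math> to </math> and all empty lines."""
    kept = []
    in_math = False
    for line in lines:
        if "<math>" in line:
            in_math = True
        if in_math:
            # Like an awk range pattern, the range includes the closing line.
            if "</math>" in line:
                in_math = False
            continue
        if line != "":
            kept.append(line)
    return kept
```

Note that, as in awk, a line that is only partly MathML is removed completely, so the approach relies on <math> and </math> standing on lines of their own.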


10.3.2. HTML and RTF

The HTML and RTF document math processing is done partially in the stylesheets[39] (Section 4.2) and partially in the texmath2pngbmp.pl script.


10.3.2.1. Math processing in the HTML stylesheets

The following code in the HTML stylesheets

(root
 (make sequence
;   (literal
;    (debug (node-property 'gi
;                         (node-property 'document-element (current-node)))))
;(define (docelem node)
;  (node-propety 'document-element 
;    (node-property 'grove-root node)))
   (process-children)
   (process-math)
   (with-mode manifest
     (process-children))
   (if html-index
       (with-mode htmlindex
         (process-children))
       (empty-sosofo))))

(found in the docbook.dsl file of the original DocBook stylesheet package) initiates exactly the same processing as in the standard stylesheets, with one addition: the (process-math) instruction will start an additional processing step, after the standard one. The process-math routine is further specified in the code as follows:

;; Write equation info to equation-list.sgml
(define (process-math)
  (make entity
    system-id: "equation-list.sgml"
    (make element gi: "equation-set"
          attributes: (list
                       (list "latexopt" $latexopt$)
                       (list "density" $density$)
                       (list "usepackage" $usepackage$))
          (with-mode htmlmath (process-children)))))

This will create a new SGML file, equation-list.sgml, in the current directory, that will contain an element of type "equation-set". That's simply a container of equations and some LaTeX options that may be passed to it from the stylesheet: it contains the LaTeX options in “latexopt”, “density” and “usepackage”, as well as the TeX equation code, enclosed between <texequation>/</texequation> tags[40]. Here's what the equation-list.sgml file may look like[41]:

<equation-set
latexopt="12pt"
density="96x96"
usepackage=""
><texequation
fileref="images/math/11074.png"
>\[
\sum _{n=1}^{\infty }\frac{x^{n}}{n}=\ln \left(\frac{1}{1-x}\right)\]
  </texequation
><texequation
fileref="images/math/15280.png"
>\begin{equation}
f(x)=\left\{ \begin{array}{cc}
 \log _{8}x &#38; x&#62;0\\
 0 &#38; x=0\\
 \sum _{i=1}^{5}\alpha _{i}+\sqrt{-\frac{1}{x}} &#38; x&#60;0\end{array}
\right.\label{eq3}\end{equation}
  </texequation
></equation-set
>

How is the TeX code extracted from the <alt> elements and inserted in equation-list.sgml? This is the core work and is done by the following code in the HTML stylesheets:

;; How to write out an equation into the equation listing file
(define (write-eqn nd)
  (let ((texmath (select-elements (children (current-node))
                                  (normalize "alt")))
        (graphic (select-elements (children (current-node))
                                  (normalize "graphic"))))
    (make element gi: "texequation"
          attributes:
          (list
           (list "fileref" (attribute-string (normalize "fileref") graphic)))
          (literal (data texmath)))))
;; Special processing mode to extract equations
(mode htmlmath
  (default
    (let ((infeqns (select-elements (descendants (current-node))
                                    (normalize "informalequation")))
          (eqns (select-elements (descendants (current-node))
                                 (normalize "equation")))
          (inleqns (select-elements (descendants (current-node))
                                    (normalize "inlineequation"))))
      (with-mode htmlmath
        (process-node-list
         (node-list infeqns eqns inleqns)))))
  (element equation (write-eqn (current-node)))
  (element informalequation (write-eqn (current-node)))
  (element inlineequation (write-eqn (current-node))))

Basically, what the above code does is the following: It processes only text found in the <alt> element and only code found inside <informalequation>, <equation> or <inlineequation> tags. It puts this code (the TeX code of an equation) between <texequation> tags in equation-list.sgml. It also lists the fileref attribute of the <graphic> element.

This completes the Mathematics processing done by openjade. We will use the equation-list.sgml file to create PNG and BMP images of each equation.


10.3.2.2. Math processing with texmath2pngbmp.pl

The texmath2pngbmp.pl script is called after openjade has processed the SGML file with the stylesheet for one HTML file. It takes one argument, the file to process. It expects a file with the structure of equation-list.sgml (see Section 10.3.2.1). It basically does the following:

  • It parses the equation-list.sgml file. It extracts “latexopt”, “density” and “usepackage”.

  • For every text found between <texequation> tags, it creates a new TeX file. In the preamble, it sets the options found in “latexopt”, “density” and “usepackage”. In the main part, it inserts the text, which is nothing else than the TeX code describing an equation[42].

  • It then calls

    • LaTeX to process the TeX file,

    • dvips to create the PostScript® file from the .dvi file created by LaTeX,

    • convert to convert the PS file to PNG format with the right density (the full name of the PNG file was extracted from equation-list.sgml)

    • convert (once again) to convert the PS file to BMP format (the full name was calculated from the path and basename of the PNG file[43]).

    system ("latex $textmp");
    system ("dvips -o $textmp.eps $textmp -E");
    system ("convert -density $den $textmp.eps $figfile");
    system ("convert $textmp.eps $filepath"."$filebasename.bmp");
    

At the end of the processing, PNG and BMP images are in images/math, while the HTML and RTF documents contain links to them for each equation. We are ready! We can enjoy Mathematics on the Web in TeX quality!

Note Note:
 

We only need to call texmath2pngbmp.pl once. The PNG and BMP equation images for the whole SGML file will be created in the right directory and need not be recreated for each HTML or RTF run, since our trick with the output.print.xxx entities makes sure that each run INCLUDEs the graphic element with the right filename in the fileref attribute (see Section 10.3.1.2, Section 7.1.4.1, Section 7.2.2).


10.3.3. PDF and PS

The PDF and PS document math processing is done partially in the print stylesheet[44] (Section 4.2) and partially in the unescape_math.pl script. It differs substantially from the HTML and RTF math processing (Section 10.3.2): instead of creating images for each equation, the stylesheet instructs openjade to directly output the TeX code in the .tex file it produces. We then unescape some characters in that .tex file and pass it to pdfjadetex or jadetex for the creation of the PDF, resp. DVI file[45].


10.3.3.1. Math processing in the print stylesheet

The code for math processing in the print stylesheet does basically one simple thing: it outputs the text found between the <alt> tags (the TeX code of the equation) in <equation>, <informalequation> and <inlineequation> elements of the input file. It encloses that text between begintexliteral and endtexliteral tags:

(element (equation math) (empty-sosofo))
(element (equation graphic) (empty-sosofo))
(element (equation alt)
 (make display-group
   (literal "BEGINTEXLITERAL")
   (literal (data (current-node)))
   (literal "ENDTEXLITERAL")))
(element (informalequation math) (empty-sosofo))
(element (informalequation graphic) (empty-sosofo))
(element (informalequation alt)
 (make display-group
   (literal "BEGINTEXLITERAL")
   (literal (data (current-node)))
   (literal "ENDTEXLITERAL")))
(element (inlineequation math) (empty-sosofo))
(element (inlineequation graphic) (empty-sosofo))
(element (inlineequation alt)
 (make sequence
   (literal "BEGINTEXLITERAL")
   (literal (data (current-node)))
   (literal "ENDTEXLITERAL")))

It also specifies that we plan to use the TeX backend:

(define tex-backend
  ;; REFENTRY tex-backend
  ;; PURP Are we using the <application>TeX</application> backend?
  ;; DESC
  ;; This parameter exists so that '-V tex-backend' can be used on the
  ;; command line to explicitly select the <application>TeX</application> backend.
  ;; /DESC
  ;; AUTHOR N/A
  ;; /REFENTRY
  #t)

When openjade processes an SGML file containing equations, it will follow the above DSSSL code and produce something like:

{}BEGINTEXLITERAL\char92{}begin\{equation\}
\{\char92{}displaystyle \}\char92{}int \char95{}\{a\}\char94{}
\{b\}x\char94{}\{2\}dx=\char92{}frac\{1\}\{3\}(b-a)\char94
{}\{3\}\char92{}label\{eq2\}\char92{}end\{equation\}
  ENDTEXLITERAL\endSeq{}\endNode{}\Node%

This is the literal TeX code for Equation 10-1, enclosed in begintexliteral-endtexliteral tags. As you can see, openjade "escapes" some characters (for instance, turning a backslash into \char92{} and a caret into \char94{}), so that unless we take special measures we will end up with a typeset version of the source for the equation rather than the typeset equation. It is here that the unescape_math.pl script comes into play.


10.3.3.2. Unescaping TeX equation code

The unescape_math.pl script is a simple Perl script that reads the input file (the .tex file created by openjade through the print stylesheet as in Section 10.3.3.1) line by line and “unescapes” some characters found between begintexliteral and endtexliteral. More precisely, the order of operations for each line is the following:

  1. If the line contains begintexliteral, we are entering TeX math code. Delete begintexliteral.

  2. If we are still in TeX math code, unescape the code. This is basically a substitution of a few characters:

    sub unescape {
        $line =~ s/\\char92{}/\\/g;
        $line =~ s/\\char94{}/^/g;
        $line =~ s/\\char95{}/_/g;
        $line =~ s/\\{/{/g;
        $line =~ s/\\}/}/g;
        $line =~ s/\\&/&/g;
        $line =~ s/\\\$/\$/g;
    }
    
  3. If the line contains endtexliteral, we are leaving TeX math code. Delete endtexliteral.

The code of the above simple algorithm is

while ($line = <MAN>) {
    $begin = 0;
    if ($line =~ /{}BEGINTEXLITERAL/) {
        $line =~ s/BEGINTEXLITERAL//;
        $inmath = 1;
        $begin = 1;
    }
    if ($inmath || $begin) {
        unescape();
    }
    if ($line =~ /ENDTEXLITERAL/) {
        if ($inmath) {
            $line =~ s/ENDTEXLITERAL//;
            $inmath = 0;
        }
    }    
    print TMP "$line";
}

After applying unescape_math.pl to the .tex file, the above “escaped” TeX code for Equation 10-1 becomes:

{}\begin{equation}
{\displaystyle }\int _{a}^{b}x^{2}dx=\frac{1}{3}(b-a)^{3}\label{eq2}\end{equation}
  \endSeq{}\endNode{}\Node%

which will be perfectly typeset by TeX in the subsequent call of (pdf)jadetex.
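
For readers who prefer to experiment outside Perl, the same algorithm can be written as a short Python sketch. It is a transcription for illustration only, with hypothetical function names, and it works on a list of lines instead of file handles. It assumes that each escape sequence such as \char94{} sits on one line, i.e. that the line breaks in the escaped listing above are only there for display:

```python
def unescape(line):
    """Undo openjade's escaping, in the same order as the Perl routine."""
    for escaped, plain in (("\\char92{}", "\\"), ("\\char94{}", "^"),
                           ("\\char95{}", "_"), ("\\{", "{"), ("\\}", "}"),
                           ("\\&", "&"), ("\\$", "$")):
        line = line.replace(escaped, plain)
    return line


def unescape_math(lines):
    """Unescape only the text between the two literal markers."""
    out = []
    in_math = False
    for line in lines:
        if "{}BEGINTEXLITERAL" in line:
            line = line.replace("BEGINTEXLITERAL", "", 1)
            in_math = True
        if in_math:
            # Unescape before looking for the end marker, so that equation
            # code sharing a line with the end marker is handled as well.
            line = unescape(line)
        if in_math and "ENDTEXLITERAL" in line:
            line = line.replace("ENDTEXLITERAL", "", 1)
            in_math = False
        out.append(line)
    return out
```

Lines outside the markers pass through unchanged, so the rest of the .tex file produced by openjade is not affected.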

Note Note:
 

In the original unescape_math.pl script, step 2 of the algorithm was performed as step 3, after the endtexliteral substitution. This is a bug that comes to the surface when we have one line containing both equation code and endtexliteral. In such a case, the original sequence of steps will only delete endtexliteral, leaving the equation code still "escaped".

After this “unescaping”, we get a .tex file that we can process either into a PDF file through pdfjadetex[46]:

$PDFJADETEX $1.tex

or to a PS one through jadetex[47] and dvips:

$JADETEX $1.tex
$DVIPS -o $1.ps $1.dvi

10.4. Problems of the DBTeXMath method

The DBTeXMath method provides us with an easy way to get perfectly typeset Mathematics in all 4 formats: HTML, PDF, PS and RTF. Moreover, it does this in the context of SGML processing: we are not forced to leave our usual procedures or markup, and it integrates nicely into our scripts.

Nevertheless, such an approach to mathematical document processing is not free from problems:

  • The most important objection comes from a theoretical viewpoint: the <alt> tag is what its name suggests - a tag for an alternative description of the equation. This is not exactly what the TeX code is, especially from the accessibility point of view (see Chapter 9). In a message to the docbook-apps mailing list, Michael Smith points out the following:

    The DocBook TC intended the contents of <alt> to be a human-readable text description, using ISO entities for any math symbols that couldn't be represented with normal characters.

    But maybe if you use <alt role="tex">, you could tweak the stylesheets so that to the HTML output they add some generated text like:

    <img alt="TeX version of equation: [TeX stuff]">
    

    That way, to people reading or hearing the alt text in a browser, it'll at least be unambiguous to them that what they're reading/hearing is TeX math -- which, depending on the complexity of the equation, they may or may not find "human-readable"[48].

  • Labels on displayed equations for the HTML and RTF formats will always start at 1, no matter how many equations you have. I briefly discussed this in Section 10.2, along with some measures to alleviate the problem. Until some TeX Guru (anyone reading?) out there tells me how to work around this, you might want to crop every image of a displayed equation to 90% in the x direction with the -crop option to convert:

    convert -crop 90x100% images/math/17202.png
    

    This will eliminate the equation labels from the image. If you follow the rules in Section 10.2, you will still be able to reference equations concisely and correctly through all chapters and in all formats.

  • Your text should not contain begintexliteral or endtexliteral, otherwise the unescape_math.pl script will try to unescape some characters between them (and will also delete them). Well, I can live with that one...


Chapter 11. Localization

So here you are, after 200 pages of information bombardment, your eyes wide open in expectation of the last firework: “will it work with my language?”, you ask. This chapter deals with the answer to this question.

FIXME: Add: What is internationalization, what localization?

Here is “the problem at hand”, as described vividly in Messing about with Unicode, XML, XSL, DSSSL, Tex, Omega, Fop and the rest of the mess:

Let's say you have a source of data that is going to be published. The data comprises text from many, many languages. English. Dutch. Chinese, sure. Malayalam, perhaps. Tibetan, of course. And it contains some pretty weird symbols. Like a shwa -- a topsy-turvy e: ə. In order to subsume all this data in one character encoding, everything is encoded in Unicode: the current standard for multi-lingual, unified text encoding.

You would like to print your text, publish your text on the web, and perhaps also to prepare it for further editing in a word-processing package. And you don't want to lose your Chinese characters, your IPA signs, or your mathematical symbols.

Warning Work in progress!
 

This chapter is work in progress, so the information in it is incomplete - and may even be inaccurate! Many problems may have disappeared, as distributions made the transition to UTF-8. Others may have surfaced. If you have any hints regarding this complex subject, please contact me, or post in my Linux Forum.

One thing that makes this endeavour so difficult is that it comprises many, many steps, each with its own inputs, outputs and tools. To understand the process, we have to dissect it into those steps:

Questions over questions... Now let's put the puzzle pieces together!

Note LyX User Guide
 

The following sections are taken from the LyX User Guide:


11.1. Shell localization

FIXME: What is locale?

FIXME: what is charset? What is encoding?

FIXME: include localization through locale and .bashrc.


11.2. sed localization

From the sed manual page, we see that the following environment variables affect the execution of sed:

LANG

Provide a default value for the internationalisation variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-dependent default locale will be used. If any of the internationalisation variables contains an invalid setting, the utility will behave as if none of the variables had been defined.

LC_ALL

If set to a non-empty string value, override the values of all the other internationalisation variables.

LC_COLLATE

Determine the locale for the behaviour of ranges, equivalence classes and multi-character collating elements within regular expressions.

LC_CTYPE

Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- versus multi-byte characters in arguments and input files), and the behaviour of character classes within regular expressions.

LC_MESSAGES

Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.

NLSPATH

Determine the location of message catalogues for the processing of LC_MESSAGES.
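
The precedence among LC_ALL, the individual LC_* variables and LANG is easy to get wrong, so here is a small sketch of it. The function name effective_locale is hypothetical, the "POSIX" fallback merely stands in for the implementation-dependent default, and the rule about invalid settings is not modelled at all.

```shell
# effective_locale CATEGORY
# Hypothetical helper: prints the locale name that applies to CATEGORY
# (for example LC_CTYPE or LC_COLLATE), following the precedence above:
# LC_ALL first, then the category variable itself, then LANG.
effective_locale() {
    category=$1
    # ${VAR:-} treats "unset" and "null" alike, as the manual page does.
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
        return
    fi
    eval "value=\${$category:-}"
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
        return
    fi
    if [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
        return
    fi
    # Placeholder for the implementation-dependent default locale.
    printf '%s\n' "POSIX"
}

# Which locale decides how sed interprets bytes as characters right now?
effective_locale LC_CTYPE
```

In practice this means that a script which needs byte-wise, reproducible sed behaviour can simply prefix the call with LC_ALL=C, since LC_ALL wins over everything else.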


11.3. awk localization

From the awk manual page, we see that the following environment variables affect the execution of awk:

LANG

Provide a default value for the internationalisation variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-dependent default locale will be used. If any of the internationalisation variables contains an invalid setting, the utility will behave as if none of the variables had been defined.

LC_ALL

If set to a non-empty string value, override the values of all the other internationalisation variables.

LC_COLLATE

Determine the locale for the behaviour of ranges, equivalence classes and multi-character collating elements within regular expressions and in comparisons of string values.

LC_CTYPE

Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- versus multi-byte characters in arguments and input files), the behaviour of character classes within regular expressions, the identification of characters as letters, and the mapping of upper- and lower-case characters for the toupper and tolower functions.

LC_MESSAGES

Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.

LC_NUMERIC

Determine the radix character used when interpreting numeric input, performing conversions between numeric and string values and formatting numeric output. Regardless of locale, the period character (the decimal-point character of the POSIX locale) is the decimal-point character recognised in processing awk programs (including assignments in command-line arguments).

NLSPATH

Determine the location of message catalogues for the processing of LC_MESSAGES.

PATH

Determine the search path when looking for commands executed by system(expr), or input and output pipes. See the XBD specification, Environment Variables.

In addition, all environment variables will be visible via the awk variable ENVIRON.
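
Since the awk scripts of this method should behave the same on every machine, one option is to pin the locale for each awk call. The wrapper below is only a sketch: the name awk_in_c_locale is made up for this example, and whether the "C" locale is the right choice depends on your data (it is not if the scripts must treat multi-byte characters as single characters).

```shell
# awk_in_c_locale: hypothetical wrapper that runs awk with LC_ALL=C, so that
# ranges, character classes and the radix character no longer depend on the
# locale of whoever runs the scripts.
awk_in_c_locale() {
    LC_ALL=C awk "$@"
}

# LC_ALL overrides the other locale variables; awk sees it through ENVIRON.
awk_in_c_locale 'BEGIN { print "LC_ALL as seen by awk:", ENVIRON["LC_ALL"] }'

# With the POSIX locale in effect, numeric output uses a period as radix.
awk_in_c_locale 'BEGIN { printf "%.2f\n", 3 / 2 }'
```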


11.4. Perl localization

FIXME: Get the information from the Perl manual on locale handling


11.5. Keyboard localization

To use LyX properly, you must set X up correctly. This is especially vital if you're using the international support features of LyX and want to use non-English keyboard mappings. Unfortunately, almost nobody bothers to do this, especially those who've installed Linux on a PC. Administrators of large systems can be guilty of this, too, so don't assume that you're safe if you're using a large system. Any user can instruct X how to use his or her keyboard. You needn't rely on your sysadmin for this - in fact, you shouldn't! The following two programs are all you need to set up your keyboard the way you want it.


11.5.1. xmodmap and xkeycaps

First of all read the man pages for these two programs. They are your best friends when you are trying to set up X key mapping correctly. If you don't have them, install them.


11.5.1.1. xmodmap

This document contains no information on how to use xmodmap. There is a sample .Xmodmap file in Customization. To load the new X keyboard mappings, place the command xmodmap .Xmodmap somewhere in your startup scripts [for example, .cshrc, .profile, .login, or .xinitrc are all possibilities].


11.5.1.2. xkeycaps

This program is a dream come true! It brings up a graphical version of your keyboard, allows you to make modifications, and then spits those modifications out to the standard output in a form readable by xmodmap. It is very useful when you're trying to design a new .Xmodmap file, though it will require you to do a bit of cut-and-pasting.


11.5.2. Modifiers and Mode_switch

LyX supports three modifiers: Shift [S-], Control [C-], and Meta [M-]. Moreover, if one of the keys of your keyboard is configured as a Compose key, then you can use it to enter some characters not available on your keyboard. This compose key can be used either as a modifier (like Shift or Control) or as a prefix key. Here are some examples of what you can do with a Compose key:

  • Compose+e+' é

  • Compose+O+R ®

  • Compose+1+2 ½

  • Compose+<+< «

This input method is particularly handy when you use accented characters only from time to time. It works by default for latin1 characters, but other input methods will be used if you set up your locale correctly.


11.5.3. Helpful Hints and Tips

First, open up two xterminals. Use one to edit a new .Xmodmap file and run xkeycaps from the other. Using xkeycaps, remap your keyboard the way you want it. There's a button in xkeycaps to output the new keymap. Once you hit it, xkeycaps will spit a bunch of stuff on the xterm you executed it from. Just copy and paste all of that into your .Xmodmap file, and you're done.[49]

Also, there are some things you can do to help you get oriented. Try executing the command xmodmap -v -pm. This will show you all of the currently active modifiers. Also try xmodmap -v -pke | more to see which keycode numbers are mapped to which symbolic names. It will also give you some idea of the syntax of the .Xmodmap file.

There's one thing you'll need to check. Make sure that your Delete and BackSpace keys are not defined as the same key symbol by X! Note that giving these two keys unique symbol names will not necessarily alter the behavior of your programs. Some programs bind Delete and BackSpace to the same operation. Emacs is one. Other programs, however, use Delete and BackSpace for different operations. LyX is one of these programs, and if you have Delete and BackSpace labeled with the same key symbol name, you'll have trouble using LyX.
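
The Delete/BackSpace check can be partly automated. The following is a minimal sketch, assuming the usual "keycode N = keysym ..." lines that xmodmap -pke prints; the function name check_delete_backspace is hypothetical, and it only looks at the first keysym of each keycode.

```shell
# check_delete_backspace: hypothetical helper that reads "xmodmap -pke"
# output on stdin and reports whether both the BackSpace and the Delete
# keysym are assigned to some keycode.
check_delete_backspace() {
    awk '
        # Lines look like: keycode  22 = BackSpace BackSpace
        $1 == "keycode" && $4 == "BackSpace" { bs++ }
        $1 == "keycode" && $4 == "Delete"    { del++ }
        END {
            if (bs > 0 && del > 0) {
                print "ok: BackSpace and Delete are both mapped"
                exit 0
            }
            print "warning: BackSpace or Delete keysym is not mapped"
            exit 1
        }'
}

# Usage (needs a running X server):
#   xmodmap -pke | check_delete_backspace
```

If the warning appears, both physical keys probably produce the same key symbol, and that is the situation in which LyX gets confused.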


11.6. LyX localization

This section describes how to use LyX with any language you want. LyX comes with a default configuration which supports the English language on a U.S.-style keyboard, with a standard U.S. paper size and the spell checker set to U.S. English. You can change any or all of these settings as desired, and you can make the changes apply to the current session only, or use them as your new default configuration.

If you have a keyboard suited to the language you are using (for example, a German keyboard for writing in German), and you have correctly configured your X environment, all you need to do for LyX is tell it your language, the character encoding, and desired paper size. Refer to Section 11.6.1 for more information.

If, however, you have a U.S.-style keyboard and want to write in a different language than English, you can use an alternate keymap. For example, if you have a U.S.-style keyboard but want to write in Italian, you should configure LyX to use an Italian keymap. Refer to Section 11.5 for details.

Finally, you may just want to change a few key mappings or create an entirely different keymap (for Vulcan, for instance). You may, for example, normally write in Italian on a U.S. keyboard but want to include an occasional quotation in German. In such a case, you can write your own keyboard mapping or modify an existing one to support the characters you want.

The details of how to customize LyX to your own language are way beyond the scope of this manual. You can not only alter the keyboard layout, you can also change the names of the menus, buttons, etc., to reflect your language. If you want to learn more about writing keymap files and tailoring LyX to your native tongue, please see the Customization manual for details.


11.6.1. Layout Language Options

The Document Layout dialog lets you set the language and character encoding for your language. Access this dialog by selecting Layout->Document.

Choose your language by clicking on the arrow in the Language combobox of the Document Layout dialog. The default is U.S. English. Scroll to find the language you want and then click on your choice. The language name appears in the window.[50]

The Encoding box lets you choose the character encoding map you want to use. The default is the ASCII encoding which is typically sufficient for U.S. English. A superset of the ASCII encoding is the Latin1 encoding, which includes the characters required by the various Western European languages. The third choice, Latin2, is for support of Eastern European languages. Click on the dialog and then click on the encoding you want to use, and it appears in the window. (Refer to Section 11.6.2 for the character encodings.)

To change the paper size, select Layout->Paper. Then use the Papersize combobox to select a paper size. The default is the U.S. Letter paper size.

To use any language, papersize, or encoding change you made, click on the OK button. Your new configuration will now be used as long as you are in the current LyX session.


11.6.2. Keyboard mapping configuration

The preferences dialog allows you to choose up to two keyboard mappings. This allows you to choose the keymap of your choice for your U.S.-style keyboard. You can choose primary and secondary keyboard languages and then select which one you want to use.

Click on the down arrow for the Primary combobox and choose the language keymap you want by clicking on it. The name then appears in the Primary window. Do the same in the Secondary window for a secondary language if you want one. You can then select either your primary or secondary keymap in the Mapping section of the menu, or select No key mapping if you do not want to use an alternate keymap.

The Character set window allows you to use different character sets if your language uses more than one. Greek, for example, uses two, and a Greek user can enter iso-8859-7 in this window and the appropriate character map (a .cdef file), if available, is loaded.

Note that one of the choices for both primary and secondary keymaps is Other. You can use this to select a custom keymap which you've created yourself. For example, current distributions of LyX provide an american-2 keymap file in the $LYX_DIR/kbd directory. This keymapping supports some accented characters for other languages in addition to the U.S. keymap. To use the american-2 keymap, select the Options->Keyboard menu, select other in the primary selection box, enter the name of the keymap (american-2) and click on OK. You should now be able to enter accent characters using the new keymap.


11.6.3. Character Tables

Here is a table with all the characters in the Latin1 character set. You should be able to print all these characters directly from the keyboard without using too many modifier keys (if your keyboard is set up correctly, that is). Note that you must set your font encoding (in the Encoding combobox of the Layout->Document dialog) to latin1 to use this keyset, and to latin2 to use the Latin2 keyset.

Table 11-1. latin1 character set

    00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
00           0  @  P  '  p           °  À  Ð  à  ð
01        !  1  A  Q  a  q        ¡  ±  Á  Ñ  á  ñ
02           2  B  R  b  r        ¢  ²  Â  Ò  â  ò
03        #  3  C  S  c  s        £  ³  Ã  Ó  ã  ó
04        $  4  D  T  d  t        ¤  ´  Ä  Ô  ä  ô
05        %  5  E  U  e  u        ¥  µ  Å  Õ  å  õ
06        &  6  F  V  f  v        ¦     Æ  Ö  æ  ö
07        `  7  G  W  g  w        §  ·  Ç  ×  ç  ÷
08        (  8  H  X  h  x        ¨  ¸  È  Ø  è  ø
09        )  9  I  Y  i  y        ©  ¹  É  Ù  é  ù
0A        *  :  J  Z  j  z        ª  º  Ê  Ú  ê  ú
0B        +  ;  K  [  k  {        «  »  Ë  Û  ë  û
0C        ,  <  L  \  l  |        ¬  ¼  Ì  Ü  ì  ü
0D        -  =  M  ]  m  }        ­  ½  Í  Ý  í  ý
0E        .  >  N  ^  n  ~        ®  ¾  Î  Þ  î  þ
0F        /  ?  O  _  o           ¯  ¿  Ï  ß  ï  ÿ

There are a few things you need to know about Table 11-1. This manual is set up (by hand, mind you) to print all of these characters. That ain't the default. Nowhere near, in fact. Here are some of the details you'll need to bear in mind when using characters from the Latin1 character set:

  • The characters at entries A2, A4, A5, A6 and AD (the cent, the generic currency symbol, the yen, the broken vertical bar, and the short dash) are just plain missing in the default encodings. We don't know where they are or why this is the case.

  • Even if you've selected latin1 in the Document Layout dialog, users who have only the OT1-fonts for LaTeX [or who have the T1-fonts but aren't using them] will still be missing a few characters: D0, F0, DE, FE, AB, and BB (the uppercase and lowercase eth and thorn, and the French quotes) won't show up[51].

  • Users of OT1-fonts can, however, get the French quotes [characters AB and BB] if they include either the package umlaute.sty or german.sty in their documents.[52]

  • If you use OT1 font encoding, i.e. if the jadetex.cfg file contains the line

    \usepackage[OT1]{fontenc}
    

    (in which case it wouldn't make sense to use any of the ae, aecompl, aeguill, umlaute or german packages, which are for the T1 encoding, i.e. you would have commented out the lines:

    \usepackage{ae} 
    \usepackage{aecompl} 
    \usepackage{ae,aecompl} 
    \usepackage{german} 
    \usepackage{umlaute}
    

    or those lines would simply not be there), then you must input the following characters in math mode (see also the table “How to Typeset Special Characters” in What Is TeX?):

    • «: two “less than” signs, one after another[53]. Enter math mode and type the two signs.

    • »: two “greater than” signs, one after another[54]. Enter math mode and type the two signs.

    • <: “less than” sign. Enter math mode and type the sign.

    • >: “greater than” sign. Enter math mode and type the sign.

    • \: backslash. Enter math mode and type “\backslash”.

    • _: underscore. Enter math mode and type “\textunderscore”. From How to use the underscore character:

    The underscore character _ is ordinarily used in TeX to indicate a subscript in maths mode; if you type _ in the course of ordinary text, TeX will complain. If you're writing a document which will contain a large number of underscore characters, the prospect of typing \_ (or, worse, \textunderscore) for every one of them will daunt most ordinary people.

    Moderately skilled macro programmers can readily generate a quick hack to permit typing _ to mean 'text underscore'. However, the code is somewhat tricky, and more importantly there are significant points where it's easy to get it wrong. There is therefore a package underscore which provides a general solution to this requirement.

    There is a problem, though: OT1 text fonts don't contain an underscore character, unless they're in the typewriter version of the encoding (used by fixed-width fonts such as cmtt). So either you must ensure that your underscore characters only occur in text set in a typewriter font, or you must use a fuller encoding, such as T1, which has an underscore character in every font.

    • |: vertical bar. Enter math mode and type the sign.

The following is a full list of all of the accented characters LyX can display directly. It includes not only the accented characters from the previous table, but also the characters from ISO8859-2 through 4.

  • From ISO8859-1:

    ¨ Ä Ë Ï Ö Ü ä ë ï ö ü ÿ (diaeresis)

    ^ Â Ê Î Ô Û â ê î ô û (circumflex)

    ` À È Ì Ò Ù à è ì ò ù (grave)

    ´ Á É Í Ó Ú Ý á é í ó ú ý (acute)

    ~ Ã Ñ Õ ã ñ õ (tilde)

    ¸ Ç ç (cedilla)

    ¯ (macron)[55]

  • From ISO8859-2 through 4:

    \^{H} \^{J} \^{h} \^{\j} \^{C} \^{G} \^{S} \^{c} \^{g} \^{s} (circumflex)

    \'{S} \'{Z} \'{s} \'{z} \'{R} \'{L} \'{C} \'{N} \'{r} \'{l} \'{c} \'{n} (acute)

    \~{I} \~{\i} \~{U} \~{u} (tilde)

    \c{S} \c{s} \c{T} \c{t} \c{R} \c{L} \c{G} \c{r} \c{l} \c{g} \c{N} \c{K} \c{n} \c{k} (cedilla)[56]

    \={E} \={e} \={A} \={I} \={O} \={U} \={a} \={\i} \={o} \={u} (macron)

    \H{O} \H{U} \H{o} \H{u} (hungarian umlaut)

All the characters above are actively supported by TeX fonts. In addition, TeX allows diacritical marks on almost all characters. Also make sure you're using the T1 font-encoding and have the package umlaute.sty with the definition file iso.def installed.


11.6.4. International Spellcheck Support

LyX uses the ispell spelling checker. You should configure ispell to work with your system if it does not already. To get the appropriate language dictionary, refer to the Where file included with the ispell package. Note that some dictionaries do not support the Latin1 encoding. If you have selected the Latin1 encoding (in the Document Layout dialog) with one of these dictionaries, the spellchecker will not work for some people. Refer to the “Spellchecking” section of the LyX User's Guide for more details about international spellchecking.


11.7. Openjade localization

Openjade only supports a single pre-defined character repertoire. A character name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is recognized as referring to the Unicode character with that code. For many characters, it is also possible to use the ISO/IEC 10646 name in lower-case with words separated by hyphens.

Some common SDATA entity names from the ISO entity sets are recognized and mapped to characters. In addition an SDATA entity name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is mapped to the Unicode character with that code.

OpenJade now supports the standard-chars, map-sdata-entity, add-name-chars, add-separator-chars and char-repertoire declaration element forms, allowing a style-sheet to define additional character names, sdata entity mappings, name characters (i.e. characters allowed in identifiers) and separator characters. Currently the only recognized character repertoire is the built-in repertoire. It has the public identifier "UNREGISTERED::OpenJade//Character Repertoire::OpenJade".
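
If you need such U-XXXX names often, a one-line helper saves some typing. This is only a convenience sketch; the function name openjade_char_name is made up, and code points above FFFF come out with more than four digits, which no longer matches the form described above.

```shell
# openjade_char_name: hypothetical helper that prints the U-XXXX character
# name for a code point given as hexadecimal digits.
openjade_char_name() {
    # printf accepts a 0x prefix for numeric arguments; %04X pads the value
    # to four upper-case hexadecimal digits.
    printf 'U-%04X\n' "0x$1"
}

openjade_char_name e9     # e with acute accent
openjade_char_name 0259   # the schwa
```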


11.8. dvips localization

The only localization of dvips available seems to be the redefinition of paper size, depending on the user's (or system's) locale. For example, to change the paper size to letter format (used in North America), one would do

texconfig dvips paper lettersize

In SuSE Linux, this is done automatically by the SuSEconfig.tetex script (located in /sbin/conf.d/) by taking into account the LC_PAPER locale:

function get_paper () {
    ( 
        . /etc/sysconfig/language &> /dev/null
        read h w r < <(LANG=$RC_LANG locale -k LC_PAPER)
        case "$h" in
            height=297) echo a4     ;;
            *)          echo letter ;;
        esac
    )
}
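
The function above relies on bash process substitution and on /etc/sysconfig/language, so it is tied to SuSE. For other systems, the same decision can be sketched in plain sh. This variant is an assumption of mine, not part of SuSEconfig.tetex: the name paper_from_lc_paper is hypothetical, it expects the height=... and width=... lines that locale -k LC_PAPER prints on glibc systems, and unlike the original it looks at every input line, not just the first.

```shell
# paper_from_lc_paper: hypothetical, portable variant of get_paper.
# Reads "locale -k LC_PAPER" style lines (height=297, width=210, ...) on
# stdin and prints the paper name.
paper_from_lc_paper() {
    paper=letter                 # same fallback as in the script above
    while IFS= read -r entry; do
        case $entry in
            height=297) paper=a4 ;;
        esac
    done
    printf '%s\n' "$paper"
}

# On a glibc system one would run: locale -k LC_PAPER | paper_from_lc_paper
# Here the input is simulated:
printf '%s\n' 'height=297' 'width=210' | paper_from_lc_paper
```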

11.9. DSSSL stylesheet localization

As distributed, the stylesheets use English for all generated text, but several other localization files are also provided.

Switching localizations is achieved by writing a customization layer that references the proper localization file. The customization should look like this:

<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
<!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook Print Stylesheet//EN" CDATA DSSSL>
<!-- The path to the l10n file must be correct, of course, and should -->
<!-- point to the print/html directory as appropriate for the preceding -->
<!-- dbstyle -->
<!ENTITY l10n    SYSTEM "docbook/print/dbl1dege.dsl" CDATA DSSSL>
]>
<style-sheet>
<style-specification use="l10n docbook">
<style-specification-body>
;; Additional customization here, if desired
</style-specification-body>
</style-specification>
<external-specification id="docbook" document="dbstyle">
<external-specification id="l10n" document="l10n">
</style-sheet>

Where dbl1dege.dsl is the name of the localization file you wish to use (German in this example). See Customizing the Stylesheets.


11.10. TeX localization

Install the package latex-ucs (in SuSE Linux it is the RPM package latex-ucs-20030605-18.noarch.rpm in the noarch directory of the distribution CDs), or get it from CTAN from the directory macros/latex/contrib/unicode, for example from CTAN latex-ucs package.


11.11. lynx localization

From the lynx manual page:

-display_charset=MIMEname

set the charset for the terminal output.

NATIVE LANGUAGE SUPPORT

If configured and installed with Native Language Support, Lynx will display status and other messages in your local language. See the file ABOUT_NLS in the source distribution, or at your local GNU site, for more information about internationalization.

The following environment variables may be used to alter default settings:

LANG

This variable, if set, will override the default message language. It is an ISO 639 two-letter code identifying the language. Language codes are NOT the same as the country codes given in ISO 3166.

LANGUAGE

This variable, if set, will override the default message language. This is a GNU extension that has higher priority for setting the message catalog than LANG or LC_ALL.

LC_ALL and LC_MESSAGES

These variables, if set, specify the notion of native language formatting style. They are POSIXly correct.

LINGUAS

This variable, if set prior to configuration, limits the installed languages to specific values. It is a space-separated list of two-letter codes. Currently, it is hard-coded to a wish list.

NLSPATH

This variable, if set, is used as the path prefix for message catalogs.

From the Unicode HOWTO:

Lynx-2.8 has an options screen (key 'O') which permits to set the display character set. When running in an xterm or Linux console in UTF-8 mode, set this to "UNICODE UTF-8". Note that for this setting to take effect in the current browser session, you have to confirm on the "Accept Changes" field, and for this setting to take effect in future browser sessions, you have to enable the "Save options to disk" field and then confirm it on the "Accept Changes" field.

Now, again, all a document needs is the following line between the <head> and </head> tags:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

When you are viewing text files in UTF-8 encoding, you also need to pass the command-line option "-assume_local_charset=UTF-8" (affects only file:/... URLs) or "-assume_charset=UTF-8" (affects all URLs). In lynx-2.8.2 you can alternatively, in the options screen (key 'O'), change the assumed document character set to "utf-8".


Chapter 12. Shortcomings and bugs


Chapter 13. Other methods

Various other methods exist that are related to the one presented here. This method is unique in that it uses LyX and a lot of script “glue code” to hide the complexity of writing in SGML from the user. However, if for some reason you prefer not to use LyX, the shell scripts, the stylesheets or some other part of this method, you might want to have a look at the following alternatives:


Chapter 14. Bibliography

Note
 

This is not the Bibliography produced by RefDB (Section 3.11), but just a chapter with this misleading title, containing some links to other sources. The RefDB Bibliography bears the name “Reference List” or “References” and is located even further down, near the end of the document.


Appendix A. Appendix

A.1. The GNU Free Documentation License

This is an exact copy of the GNU Free Documentation License Version 1.2, November 2002:

Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.


A.1.1. PREAMBLE

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.


A.1.2. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.


A.1.3. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in Section A.1.4.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.


A.1.4. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.


A.1.5. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of Section A.1.3 and Section A.1.4 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

  • A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

  • B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

  • C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

  • D. Preserve all the copyright notices of the Document.

  • E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

  • F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

  • G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice.

  • H. Include an unaltered copy of this License.

  • I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

  • J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

  • K. For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.

  • L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.

  • M. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.

  • N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.

  • O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.


A.1.6. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in Section A.1.5 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".


A.1.7. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.


A.1.8. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of Section A.1.4 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.


A.1.9. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of Section A.1.5. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (Section A.1.5) to Preserve its Title (Section A.1.2) will typically require changing the actual title.


A.1.10. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.


A.1.11. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.


A.1.12. ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with...Texts." line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.


Reference List

1. AnyBrowser

2. GNU

3. LyX

4. de_Groot C...

5. Walsh N. and Muellner L. (2002)

6. CTAN

7. Komarinski M.F. and Godoy J.

8. Karakas C. (1991) Comput. Aided.Des. 23, 684-691

9. Karakas C. (1999) BoD GmbH, Norderstedt,

10. Hoenicka M.


Index

Symbols

%admon-font-family%, Using Type 1 Fonts
%admon-graphics-path%, Openjade errors
%body-font-family%, Using Type 1 Fonts
%generate-article-titlepage%, DSSSL stylesheets
%generate-article-toc%, DSSSL stylesheets
%graphic-default-extension%, Openjade errors
%graphic-extensions%, Openjade errors
%guilabel-font-family%, Using Type 1 Fonts
%mono-font-family%, Using Type 1 Fonts
%stylesheet%, Use a CSS for DocBook
%title-font-family%, Using Type 1 Fonts
0.9.4-pre5, Refdb
12pt, Writing in LyX, thinking in SGML
65%, Runsed, sed and sedscr
_DIRNAME_, HTML validation
_DOMAIN_, HTML validation
_FILENAME_, HTML validation

A

A1, Set up your bibliographic database
A2, Character Tables
AA, Accessibility
abbreviations, Further enhancements
abstract, LyX environments, Authors, Credits, Roles, Keywords, Revision history
accessibility, DSSSL stylesheets, CSS, Accessibility, Priority 1 accessibility errors, Problems of the DBTeXMath method
Acrobat®, thumbpdf, Optimal PDF, From .lyx to .pdf, The hyperref package, Bookmarks, Thumbnails, Configuring pdfjadetex, Mathematics
acronyms, Acronyms, product names, applications
activated, Hyphenation
adddscr, Admonitions, Callouts, Add density to images, Figures
admonitions, Using Type 1 Fonts
admonition, Credits, Admonitions, Callouts, Admonitions, SGML code in program listings, Openjade errors, sed segmentation fault, DSSSL stylesheets
Admonitions, Conventions, Admonitions, Admonitions, Callouts, Examples, DSSSL stylesheets, CSS, Figures, Mathematics
ae, Unprintable characters, Using Type 1 Fonts, Choosing the right font encoding, Configuring pdfjadetex, Character Tables
aecompl, Unprintable characters, Using Type 1 Fonts, Choosing the right font encoding, Configuring pdfjadetex, Character Tables
aeguill, Unprintable characters, Using Type 1 Fonts, Choosing the right font encoding, Character Tables
affiliations, Authors, Credits, Roles
algorithm, Automatic Index generation, Unescaping TeX equation code
alt attribute, Alt attributes for images
ALT-F4, Key combinations
ampersands, Openjade errors
anchor, Openjade errors, Shortcomings and bugs
ancient, Shortcomings and bugs
Apache, Refdb
append, Run sed and awk scripts, CSS
appendix, Appendix, Index, The final step: invoking lyxtox, Appendix
application-independent, Abbreviations
applications, Acronyms, product names, applications
approach, LyX, Bibliography without RefDB, TeX errors, Using Type 1 Fonts, Problems of the DBTeXMath method
archives, Housekeeping and special processing
article-titlepage-recto-elements, DSSSL stylesheets
articleinfo, Keywords not present in HTML
ASCII, Abbreviations, Set environment variables
attribute, Adapt the DocBook DSSSL stylesheets, Run sed and awk scripts, Bibliography without RefDB, Bibliography with RefDB, Openjade errors, CSS, The RefDB method, Links to internet sites, Localization
attribute-string, Links to internet sites
attributes, The RefDB method
AU, Set up your bibliographic database
auditory, Accessibility
auto-generating, The RefDB method
auto-traced, Using Type 1 Fonts
automatically, Required software, thumbpdf, Set up your bibliographic database, Images, Tables, List of figures, tables and equations, Bibliography without RefDB, Bibliography with RefDB, Index, Automatic Index generation, Openjade errors, Missing $ inserted, Set program locations, DSSSL stylesheets, Appendix, The RefDB method, Using True Type fonts, HTML validation, Priority 2 accessibility errors, Writing Mathematics in LyX, dvips localization
availability, Required software, Optimal PDF
avoid, LyX environments, Mathematics
awk, Sed and awk, Run sed and awk scripts, Bibliography with RefDB, Automatic Index generation, DBTeXMath, SGML math code correction, Solution, Localization, awk localization
awkscr_cit, Bibliography with RefDB
awkscr_math, Appendix
awkscr_refdb_html, The RefDB method
awkscr_refdb_print, The RefDB method

B

background, LyX, Reconfigure LyX, Admonitions, Mathematics, Bibliography with RefDB, The RefDB method, HTML validation
basename, Set up your start and end scripts, Images, Inline graphics, The final step: invoking lyxtox, Thumbnails, Math processing with texmath2pngbmp.pl
behaviour, Cool labels don't change!, Missing $ inserted, Bookmarks, Configuring pdfjadetex, sed localization, awk localization
bells, Credits, Document creation: PDF
benchmarks, Fatal format file error; I'm stymied
best-of-breed, LyX
bibliographic, Refdb, Set up your bibliographic database, Bibliography without RefDB, Bibliography with RefDB, Bibliography
bibliographies, Required software, Refdb, The RefDB method
bibliography, Credits, Required software, Refdb, Run sed and awk scripts, LyX environments, Bibliography without RefDB, Bibliography with RefDB, The final step: invoking lyxtox, LaTeX errors, Appendix, Bibliography, Standard Bibliography methods in LyX, The RefDB method
bibliography.lyx, The final step: invoking lyxtox
bibliomixed, The RefDB method
bibliomset, The RefDB method
BibTeX, Refdb, Bibliography, Standard Bibliography methods in LyX, The RefDB method
bindings, Automatic Index generation
bitmapped, Figures, Using Type 1 Fonts, Configuring pdfjadetex, Embedding Computer Modern fonts, The magic behind the math
blindness, Accessibility
blue-widgets, Cool labels don't change!
BMP, Openjade errors, Figures, The magic behind the math, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl
Bobby, Accessibility, Priority 1 accessibility errors, Priority 2 accessibility errors
body, Sed and awk, Use a CSS for DocBook, Openjade errors, Document creation: HTML, Using Type 1 Fonts, HTML validation
body.tmp, Document creation: HTML
book-titlepage-recto-elements, DSSSL stylesheets
bookmarks, Use coolthumbs, Cool labels don't change!, Document creation: PDF, The hyperref package, Bookmarks, Configuring pdfjadetex
bookmarksnumbered, Bookmarks
boxes, Unprintable characters, Standard Bibliography methods in LyX
browser, Required software, Document creation: RTF and TXT, Problems of the DBTeXMath method, lynx localization
Buffer, Bibliography with RefDB
bugs, Refdb
bullet, DSSSL stylesheets, CSS

C

callouts, Callouts, Callouts, Examples, DSSSL stylesheets, Figures, Mathematics, Shortcomings and bugs
capabilities, Optimal PDF, Mathematics
caption, Images, Runsed, sed and sedscr, Shortcomings and bugs
captions, Images, Tables, Runsed, sed and sedscr, Shortcomings and bugs
Carlisle, Mathematics
catalog, Check paths of catalog files, Openjade errors, Catalogs, The RefDB method, lynx localization
catalogue, Catalogs
catalogues, Catalogs, sed localization, awk localization
CD-ROM, Other methods
CDATA, Tidying up the SGML code
cgi-bin, Refdb
CGM-CHAR, Openjade errors
Chapter, Run sed and awk scripts, Bibliography without RefDB
character-cell, Lynx
chunk, Cool labels don't change!
chunk-element-list, DSSSL stylesheets
chunk-section-depth, DSSSL stylesheets
chunked, Cool labels don't change!, DSSSL stylesheets, The RefDB method
chunks, Cool labels don't change!, DSSSL stylesheets
citation, Refdb, Run sed and awk scripts, Set up your bibliographic database, Bibliography, Bibliography without RefDB, Bibliography with RefDB, Bibliography, Standard Bibliography methods in LyX, The RefDB method
citations, Refdb, Set up your bibliographic database, Bibliography with RefDB, Acronyms, product names, applications, The RefDB method
cite, Bibliography without RefDB, Bibliography with RefDB, The RefDB method
citlabels.lyx, Bibliography with RefDB
ck-style.css, Use a CSS for DocBook, CSS
ck_refdb, Refdb
class, LyX errors, Standard Bibliography methods in LyX, Links to internet sites
Clayton, Figures, Other methods
clumsy-looking, Mathematics
cluttered, Automatic Index generation
CM-LGC, Using Type 1 Fonts, Choosing the right font encoding
CM-super, Using Type 1 Fonts, Choosing the right font encoding
cm.map, Embedding Computer Modern fonts
cmr9, Adapt pdftex.cfg
cmz, Document creation: PS, Embedding Computer Modern fonts
coffee, Line of attack
cognitive, Accessibility
collateindex, Required software, Index, Automatic Index generation, Document creation: PDF, Index
collateindex.pl, Index
collections, The RefDB method
colour, The hyperref package, HTML validation
combinations, key, Key combinations
compatible, Accessibility
compilation, Refdb
complexity, sgmltools, The RefDB method, From .lyx to .pdf, Problems of the DBTeXMath method, Other methods
Computer-Modern, Document creation: PS
computing, Automatic Index generation, Accessibility
config.cmz, Embedding Computer Modern fonts
conformance, Accessibility
consistent, Abbreviations
constant, Mathematics
constants, Figures
constitution, TeX errors
container, LyX environments, Math processing in the HTML stylesheets
ConTeXt, Fatal format file error; I'm stymied
convert, Required software, DocBook, Admonitions, Callouts, Add density to images, Set up your bibliographic database, Figures, DBTeXMath, Math processing with texmath2pngbmp.pl, Problems of the DBTeXMath method, Other methods
Copyright, License, Credits, Use coolthumbs, Runsed, sed and sedscr
correlated, Cool labels don't change!
corrupted, Fatal format file error; I'm stymied
Cottrell, Mathematics
Courier, Using Type 1 Fonts
courtesy, Using Type 1 Fonts
CPAN, Refdb
criteria, Accessibility
cross-browser, CSS
cross-reference, Cross references, Mass insertion of cross-references in LyX, Images, Bibliography with RefDB, Openjade errors, Runsed, sed and sedscr, Writing Mathematics in LyX, Shortcomings and bugs
cross-referencing, Cross references, Openjade errors, The hyperref package, Configuring pdfjadetex
crucial, Solution
CSS, Use a CSS for DocBook, DSSSL stylesheets, CSS, Optimal PDF
CTAN, Using Type 1 Fonts, TeX localization
CTRL-ALT-DEL, Key combinations
CTRL-X, Key combinations
CTRL-X-Y, Key combinations
curly, Configuring pdfjadetex
cursor, Images, Tables, Appendix
cursor-addressable, Lynx
customization, Openjade errors, DSSSL stylesheets, The RefDB method, Links to internet sites, HTML validation, DSSSL stylesheet localization
customizations, Further enhancements
Cyrillic, Using Type 1 Fonts

D

dashboard, Accessibility
DATA, Set up your bibliographic database
database, Refdb, Set up your bibliographic database, Bibliography with RefDB, Standard Bibliography methods in LyX, The RefDB method
dbcommon.dsl, Keywords not present in HTML
dbindex.dsl, Links to internet sites
dbparam.dsl, Openjade errors
DBTeXMath, LyX, Adapt the DocBook DSSSL stylesheets, The magic behind the math, Problems of the DBTeXMath method, Shortcomings and bugs
density, Admonitions, Callouts, Add density to images, Inline graphics, Cool labels don't change!, Math processing with texmath2pngbmp.pl
diagnostic, sed localization, awk localization
directive, Openjade errors
disabilities, HTML tidy, Accessibility, Priority 1 accessibility errors
Distiller®, From .lyx to .pdf, Mathematics
distribution, LyX, TeX and LaTeX, Sed and awk, Adapt pdftex.cfg, Check paths of catalog files, Openjade errors, TeX errors, Standard Bibliography methods in LyX, Further enhancements, Embedding Computer Modern fonts, DBTeXMath, TeX localization, lynx localization
DITROFF, Openjade errors
DocBook, Abbreviations, DocBook, Refdb, Adapt the DocBook DSSSL stylesheets, Set up your bibliographic database, Use a CSS for DocBook, LyX environments, Admonitions, List of figures, tables and equations, Filenames, Examples, Appendix, Bibliography without RefDB, Openjade errors, Main part, Document creation: PS, DSSSL stylesheets, Catalogs, CSS, Appendix, The RefDB method, Optimal PDF, Mathematics, Math processing in the HTML stylesheets, Problems of the DBTeXMath method, Localization, Other methods, Bibliography
docbook-dsssl-style, Keywords not present in HTML
docbook-dsssl-stylesheets, Index, Index
docbook-dsssl-stylesheets-1.72-34, DocBook, Admonitions
docbook-refdb-html, The RefDB method
docbook.dsl, DSSSL stylesheets, Math processing in the HTML stylesheets
docbook_3-3.1-98, DocBook
docbook_4-4.1-97, DocBook
DOCTYPE, HTML validation
document, Disclaimer, Formats, License, Availability of sources and support, Credits, Aknowledgements, Conventions, Abbreviations, Introduction, The general idea, Required software, LyX, DocBook, sgmltools, Dvips, Ghostscript and ImageMagik, thumbpdf, Sed and awk, Lynx, Adapt the preample, Add density to images, Run sed and awk scripts, Set up your bibliographic database, Writing in LyX, thinking in SGML, LyX environments, Authors, Credits, Roles, Images, Inline graphics, Admonitions, Table of contents, List of figures, tables and equations, Epigraphs, Cool labels don't change!, Examples, Appendix, Bibliography with RefDB, Index, Automatic Index generation, The final step: invoking lyxtox, Openjade errors, TeX errors, LaTeX errors, TeX capacity exceeded, Fatal format file error; I'm stymied, Unprintable characters, thumbpdf fails, Acrobat Reader 5 does not show thumbnails in Linux, Set environment variables, Main part, Runsed, sed and sedscr, Tidying up the SGML code, Document creation: HTML, DSSSL stylesheets, Inline graphics, CSS, Appendix, Bibliography, Standard Bibliography methods in LyX, The RefDB method, Index, Optimal PDF, From .lyx to .pdf, Figures, Using Type 1 Fonts, Choosing the right font encoding, The hyperref package, Bookmarks, Thumbnails, Configuring pdfjadetex, Further enhancements, Embedding Computer Modern fonts, HTML validation, Mathematics, Writing Mathematics in LyX, Localization, xmodmap, Character Tables, lynx localization, Shortcomings and bugs, Other methods, Bibliography
domain-specific, Abbreviations
doxygen, LyX
DPI, Figures
drinking, Line of attack
drop-down, Bibliography with RefDB
DSSSL, Credits, Abbreviations, DocBook, Openjade, pdfTeX and JadeTeX, Reconfigure LyX, Adapt the DocBook DSSSL stylesheets, List of figures, tables and equations, Cool labels don't change!, Openjade errors, Corrupted NFSS tables, Keywords not present in HTML, The RefDB method, Using Type 1 Fonts, Links to internet sites, Mathematics, Math processing in the print stylesheet, Localization, Other methods, Bibliography
DTD, Abbreviations, DocBook, Openjade, pdfTeX and JadeTeX, Admonitions, Appendix, Openjade errors, The RefDB method, Figures, HTML validation
DVI, Refdb, Openjade errors, The RefDB method, From .lyx to .pdf, PDF and PS, Localization
dvi-file, Dvips, Ghostscript and ImageMagik
dvi2ps, From .lyx to .pdf
dvips, Dvips, Ghostscript and ImageMagik, Document creation: PS, From .lyx to .pdf, Embedding Computer Modern fonts, Mathematics, DBTeXMath, Math processing with texmath2pngbmp.pl, Unescaping TeX equation code, Localization, dvips localization

E

EC, Using Type 1 Fonts, Choosing the right font encoding, Configuring pdfjadetex
efficiency, Accessibility
egg, Index
ELEC, Set up your bibliographic database
element, LyX , DocBook , LyX environments, Admonitions , Tables, Epigraphs, Openjade errors, Runsed, sed and sedscr, Tidying up the SGML code, DSSSL stylesheets, Inline graphics, The RefDB method, Links to internet sites , Problems, Solution, Math processing in the HTML stylesheets, Localization, Openjade localization, Shortcomings and bugs
eliminate, Run sed and awk scripts, Tables, Cool labels don't change!, Automatic Index generation, thumbpdf fails , Runsed, sed and sedscr, Mathematics, Problems of the DBTeXMath method
embed, Document creation: PS, Using Type 1 Fonts , Thumbnails , Optimal PS, Embedding Computer Modern fonts
embedded, Introduction, thumbpdf, Add density to images, Optimal PDF , Using Type 1 Fonts , Thumbnails , Localization
embedding, DocBook , Embedding Computer Modern fonts
emphasize, Conventions
emulators, Lynx
encapsulated, Figures
encoding, Unprintable characters, Set environment variables, From .lyx to .pdf , Using Type 1 Fonts , Choosing the right font encoding, Using True Type fonts, HTML validation, Localization, Shell localization, LyX localization, Layout Language Options, Character Tables, International Spellcheck Support, lynx localization
endings, Add density to images, Set up your start and end scripts, Set up your bibliographic database
endorsements, Disclaimer
entities, DocBook , Runsed, sed and sedscr, Document creation: HTML, Document creation: PDF, Appendix, Figures , Solution, Problems of the DBTeXMath method
entity, Index , Openjade errors, Document creation: HTML, The RefDB method, Solution, Openjade localization
environment, The general idea , TeX and LaTeX , thumbpdf, LyX environments, Authors, Credits, Roles, Keywords, Revision history, Paragraphs, Images , Inline graphics, Admonitions , Callouts , Tables, Epigraphs, SGML code in program listings, Filenames, Examples, Bibliography without RefDB, Bibliography with RefDB, Automatic Index generation, Openjade errors, TeX capacity exceeded, sed segmentation fault, Set environment variables, Runsed, sed and sedscr, Document creation: PDF, Catalogs , Standard Bibliography methods in LyX, Accessibility, Priority 2 accessibility errors, Localization, sed localization, awk localization, LyX localization, lynx localization, Shortcomings and bugs
environments, LyX environments, Epigraphs, Openjade errors, DSSSL stylesheets
EPDF, Figures
epigraph, Epigraphs
EPS, Inline graphics, Openjade errors, Figures
EQN, Openjade errors
equation-set, Math processing in the HTML stylesheets
equations, Run sed and awk scripts, List of figures, tables and equations , DSSSL stylesheets, Choosing the right font encoding, Writing Mathematics in LyX, Problems, Solution, Math processing in the HTML stylesheets, Math processing in the print stylesheet, Problems of the DBTeXMath method, Shortcomings and bugs
ER, Set up your bibliographic database
errorcontextlines, TeX errors
escaped, Automatic Index generation, Links to internet sites
estimate, Automatic Index generation
euc-jp, Set environment variables
euc-kr, Set environment variables
Eur.J.Pharmacol, Refdb
Eur.J.Pharmacol.xml, Refdb
european, Unprintable characters, Choosing the right font encoding
exposure, Cool labels don't change!
expressions, Automatic Index generation, sed localization, awk localization
extension, DSSSL stylesheets, Standard Bibliography methods in LyX, lynx localization
extensions, Adapt the DocBook DSSSL stylesheets , Openjade errors, DSSSL stylesheets, Mathematics

F

FAQ, Credits, TeX errors
FAT, Set environment variables
FAX, Openjade errors
figures, LyX , Adapt the preample , Mass insertion of cross-references in LyX, List of figures, tables and equations , DSSSL stylesheets, Shortcomings and bugs
FILE, Set up your bibliographic database
fileref, Solution, Math processing in the HTML stylesheets
filters, Set up your bibliographic database, The RefDB method
fingertips, Writing Mathematics in LyX
FitB, The hyperref package
FitH, The hyperref package
FIXME, Openjade errors, Unprintable characters, Choosing the right font encoding, Shortcomings and bugs , Bibliography
flat-file, The RefDB method
floats, Images , Runsed, sed and sedscr
flow-object, Links to internet sites
fmtutil, Fatal format file error; I'm stymied
fonts, Introduction, DocBook , Adapt pdftex.cfg , Openjade errors, LaTeX errors, Unprintable characters, Document creation: PS, DSSSL stylesheets, From .lyx to .pdf , Using Type 1 Fonts , Choosing the right font encoding, Configuring pdfjadetex , Further enhancements , Optimal PS, Embedding Computer Modern fonts, Localization, Character Tables
footnote, CSS
footnotes, CSS
formats, Figures
formatter, LyX environments
formatting, Abbreviations, Introduction, The general idea , DocBook , LyX environments, Filenames, Bibliography, The RefDB method, Optimal PDF , Links to internet sites , awk localization, lynx localization, Shortcomings and bugs
formatting-instruction, Links to internet sites
Foundation, License
Free Software Foundation, License
frenchlinks, The hyperref package
Front-Cover, License
frontend, Writing in LyX, thinking in SGML , Mathematics
fully-featured, Lynx

H

headache, Automatic Index generation
Helvetica, Using Type 1 Fonts
hierarchy, Configuring pdfjadetex
Holmes, TeX errors
honest, LyX environments
housekeeping, Housekeeping and special processing
HTML, Formats, Abbreviations, Introduction, Required software , DocBook , Openjade, pdfTeX and JadeTeX, Lynx, HTML tidy, Refdb, Adapt the DocBook DSSSL stylesheets , Admonitions , Callouts , Run sed and awk scripts, Use a CSS for DocBook , LyX environments, Labels as filenames, Cool labels don't change!, Mathematics, The final step: invoking lyxtox , Keywords not present in HTML , URLs with underscore display '&lowbar;' instead of '_', Document creation: HTML, Document creation: PDF, Document creation: RTF and TXT, Housekeeping and special processing, DSSSL stylesheets, CSS , The RefDB method, Index , Optimal PDF , Figures , Choosing the right font encoding, HTML validation, Accessibility, Mathematics, Writing Mathematics in LyX, The magic behind the math, HTML and RTF, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl, PDF and PS, Problems of the DBTeXMath method, Localization, Shortcomings and bugs , Other methods , Bibliography
html-index, Document creation: HTML
htmltidy, HTML tidy
HTML_DSL, Set program locations, The RefDB method
HTTP, Lynx
human-readable, Problems of the DBTeXMath method
hundreds, Mass insertion of cross-references in LyX, Cool labels don't change!, Automatic Index generation, Fatal format file error; I'm stymied
hyperlinked, LyX , The RefDB method
hyperref, The hyperref package
hypersetup, The hyperref package , Bookmarks , PDF view options
hypertext, Lynx, CSS , Optimal PDF , The hyperref package , Links to internet sites , Configuring pdfjadetex
hyphenation, Hyphenation , Links to internet sites

I

icon, LyX environments, DSSSL stylesheets, CSS , HTML validation
icons, Use coolthumbs, CSS , HTML validation
ID, Set up your bibliographic database, Bibliography with RefDB, Openjade errors, The RefDB method
identation, Openjade errors
identification, Openjade errors, awk localization
identifier, DSSSL stylesheets, Catalogs , The RefDB method, Openjade localization
identifiers, Catalogs
ImageMagik, Required software , Dvips, Ghostscript and ImageMagik, Admonitions , Callouts , Figures
implementation, LyX environments, Automatic Index generation, The RefDB method
includegraphics, Set environment variables
indenting, LyX environments
indexing, Cool labels don't change!, Automatic Index generation
indexitems, Automatic Index generation
industrial, LyX
info-element, Keywords not present in HTML
information, Figures
inline, LyX environments, Inline graphics, SGML code in program listings, Openjade errors, Tidying up the SGML code, Inline graphics, Choosing the right font encoding, Further enhancements , Writing Mathematics in LyX, Problems, Solution
innocent, sed segmentation fault
inspiration, Aknowledgements
interface, Refdb, Set up your bibliographic database
interpreted, Automatic Index generation, Openjade errors, The hyperref package , Configuring pdfjadetex
Invariant, License
invocation, Automatic Index generation, The final step: invoking lyxtox , Openjade errors, Document creation: PS
invocations, sgmltools , Automatic Index generation
ISO, DocBook , Set environment variables, Problems of the DBTeXMath method, Openjade localization, lynx localization
iso-10646-ucs-2, Set environment variables
itemize, Openjade errors
itemized, Writing in LyX, thinking in SGML , Admonitions , CSS

L

label, Run sed and awk scripts, Cross references , Mass insertion of cross-references in LyX, Images , Tables, Labels as filenames, Cool labels don't change!, Bibliography with RefDB, Openjade errors, Standard Bibliography methods in LyX, The RefDB method, Writing Mathematics in LyX, Solution
label-to-filename, Cool labels don't change!
LastFoot, Tables
LaTeX, Credits, Required software , LyX , Openjade, pdfTeX and JadeTeX, TeX and LaTeX , Dvips, Ghostscript and ImageMagik, Writing in LyX, thinking in SGML , Tables, Openjade errors, TeX errors, The structure of TeX errors, LaTeX errors, Fatal format file error; I'm stymied, Standard Bibliography methods in LyX, The RefDB method, From .lyx to .pdf , Figures , The hyperref package , Configuring pdfjadetex , Mathematics, DBTeXMath, Writing Mathematics in LyX, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl, Character Tables
LaTeX2HTML, The RefDB method, Mathematics
Latin Modern, Using Type 1 Fonts , Choosing the right font encoding
latin1, Modifiers and Mode_switch
layout, LyX , Bibliography without RefDB, Automatic Index generation, Optimal PDF , LyX localization
LDP, Openjade, pdfTeX and JadeTeX
ldp.dsl, Bibliography
legislation, Accessibility
liability, Disclaimer
libdbi, Refdb
libdbi-0.7.2.tar.gz, Refdb
libdbi-drivers, Refdb
libdbi-drivers-0.7.1.tar.gz, Refdb
License, License, Refdb
lightweight, Refdb
linktocpage, Configuring pdfjadetex
linktopage, The hyperref package
list-style, CSS
listings, Inline graphics
liststyle, Refdb
localization, Localization, Shell localization, dvips localization, DSSSL stylesheet localization
localizations, DSSSL stylesheet localization
lynx, Credits, Required software , Lynx, Document creation: RTF and TXT, Localization, lynx localization
LyX, Credits, Introduction, The general idea , Line of attack , Required software , LyX , TeX and LaTeX , Refdb, Reconfigure LyX , Adapt the DocBook DSSSL stylesheets , Adapt the preample , Admonitions , Run sed and awk scripts, Set up your start and end scripts, Set up your bibliographic database, Writing in LyX, thinking in SGML , LyX environments, Authors, Credits, Roles, Keywords, Paragraphs, Cross references , Mass insertion of cross-references in LyX, Images , Inline graphics, Admonitions , Callouts , Tables, Table of contents , Epigraphs, SGML code in program listings, Filenames, Labels as filenames, Cool labels don't change!, Examples, Mathematics, Appendix, Bibliography without RefDB, Bibliography with RefDB, Index , Automatic Index generation, The final step: invoking lyxtox , LyX errors, Openjade errors, TeX errors, LaTeX errors, sed segmentation fault, URLs with underscore display '&lowbar;' instead of '_', Set environment variables, Main part, Runsed, sed and sedscr, DSSSL stylesheets, Inline graphics, Appendix, Bibliography, Standard Bibliography methods in LyX, The RefDB method, Optimal PDF , From .lyx to .pdf , Choosing the right font encoding, Embedding Computer Modern fonts, Mathematics, Writing Mathematics in LyX, SGML math code correction, Problems, Solution, Localization, Keyboard localization, Modifiers and Mode_switch, Helpful Hints and Tips, LyX localization, Layout Language Options, Keyboard mapping configuration, Character Tables, International Spellcheck Support, Shortcomings and bugs , Other methods , Bibliography
lyx-1.2.0-91, LyX
LyXese, Shortcomings and bugs
lyxrefs, Mass insertion of cross-references in LyX
lyxtox, sgmltools , Run sed and awk scripts, Use coolthumbs, The final step: invoking lyxtox , Document processing, Set environment variables, Catalogs , Index , Thumbnails , Embedding Computer Modern fonts, Priority 1 accessibility errors

M

machinery, Required software , TeX and LaTeX
Macintoshes, Lynx
macro, TeX errors, The structure of TeX errors, Character Tables
macroprocessor, TeX errors
macroprocessors, TeX errors
magic, Adapt the preample
makefile, Other methods
management, Sed and awk
Mandrake, Adapt the DocBook DSSSL stylesheets , Bibliography
manpages, Set environment variables
manual, Credits, Automatic Index generation, LaTeX errors, Set environment variables, Priority 2 accessibility errors, Mathematics, LyX localization, Character Tables, lynx localization
manual-print.dsl, Credits
MARC::Charset, Refdb
MARC::Record, Refdb
mark, Disclaimer, Filenames, Appendix, Set environment variables, Appendix
Markup, Abbreviations, DocBook , Lynx, HTML tidy, Writing in LyX, thinking in SGML , LyX environments, Automatic Index generation, Openjade errors, Tidying up the SGML code, Acronyms, product names, applications, Mathematics, Problems of the DBTeXMath method
math, LyX , Mathematics, The final step: invoking lyxtox , Missing $ inserted, Unprintable characters, Choosing the right font encoding, Mathematics, Writing Mathematics in LyX, Problems, Solution, HTML and RTF, PDF and PS, Math processing in the print stylesheet, Problems of the DBTeXMath method, Character Tables
mathematical, Choosing the right font encoding, Mathematics, Problems of the DBTeXMath method, Localization
Mathematics, Credits, Run sed and awk scripts, Mathematics, Missing $ inserted, Acronyms, product names, applications, Appendix, The RefDB method, Choosing the right font encoding, Mathematics, DBTeXMath, Writing Mathematics in LyX, Problems, Math processing in the HTML stylesheets, Problems of the DBTeXMath method, Shortcomings and bugs
MathML, Mathematics, Problems, Bibliography
Matterform, Credits, CSS
mediaobject, Runsed, sed and sedscr, DSSSL stylesheets
mediaobjects, Document creation: HTML, Document creation: PDF
medium, Optimal PDF
merging, DSSSL stylesheets
metainformation, Keywords
MIME, Set environment variables
misconception, LyX environments
misspelled, Openjade errors
modifiable, Availability of sources and support, Optimal PDF
modifier, Mathematics, Modifiers and Mode_switch, Character Tables
monitor, Figures
monospace, Using Type 1 Fonts
ms-dos, Set environment variables
multi-line, Writing Mathematics in LyX
multi-step, Index
multipage-table, Tables
MySQL, Refdb
myTemplate, Callouts , Document processing, Document creation: PDF

P

packages, Credits, Required software , LyX , DocBook , Reconfigure LyX , The structure of TeX errors, LaTeX errors, Unprintable characters, Catalogs , Figures , Choosing the right font encoding, Localization, Character Tables
padding, CSS
pagebreaks, Introduction, Tables
paper, Required software , Use coolthumbs, The hyperref package , LyX localization, Layout Language Options, dvips localization
papersize, Layout Language Options
paragraph, DocBook , Writing in LyX, thinking in SGML , LyX environments, Paragraphs, Bibliography without RefDB, Openjade errors, sed segmentation fault, Standard Bibliography methods in LyX
parameter, Add density to images, Use coolthumbs, The final step: invoking lyxtox , Openjade errors, Document processing, Check number of parameters, Set program locations, Figures , Thumbnails
parser, The general idea , Openjade errors, Tidying up the SGML code, Problems
Part2, HTML validation
Part3, HTML validation
password, Refdb
PATH, thumbpdf, HTML tidy, Adapt the DocBook DSSSL stylesheets , Callouts , Inline graphics, Openjade errors, Set environment variables, Runsed, sed and sedscr, DSSSL stylesheets, Math processing with texmath2pngbmp.pl, awk localization, lynx localization
paths, Adapt the DocBook DSSSL stylesheets , Check paths of catalog files , Run sed and awk scripts, Images , Set environment variables, DBTeXMath
patterns, Sed and awk
PCX, Openjade errors
PDF, Abbreviations, Introduction, TeX and LaTeX , Dvips, Ghostscript and ImageMagik, thumbpdf, Refdb, Adapt the DocBook DSSSL stylesheets , Adapt pdftex.cfg , Adapt jadetex.cfg, Admonitions , Add density to images, Use coolthumbs, Inline graphics, Cool labels don't change!, Mathematics, Bibliography, The final step: invoking lyxtox , Openjade errors, Fatal format file error; I'm stymied, Unprintable characters, Acrobat Reader 5 does not show thumbnails in Linux, Set environment variables, Runsed, sed and sedscr, Document creation: PDF, DSSSL stylesheets, The RefDB method, Index , Optimal PDF , From .lyx to .pdf , Figures , Using Type 1 Fonts , Choosing the right font encoding, The hyperref package , PDF view options , Links to internet sites , Thumbnails , Configuring pdfjadetex , Further enhancements , Embedding Computer Modern fonts, Mathematics, Writing Mathematics in LyX, The magic behind the math, Solution, PDF and PS, Unescaping TeX equation code, Problems of the DBTeXMath method, Localization
pdfjadetex, Required software , TeX and LaTeX , Adapt jadetex.cfg, Admonitions , TeX capacity exceeded, Document creation: PDF, Document creation: PS, Index , From .lyx to .pdf , Figures , Using Type 1 Fonts , Choosing the right font encoding, Hyphenation , Thumbnails , Mathematics, PDF and PS, Unescaping TeX equation code, Localization
pdflatex, Document creation: PDF
pdfTeX, Openjade, pdfTeX and JadeTeX, Adapt pdftex.cfg , Fatal format file error; I'm stymied, Set environment variables, From .lyx to .pdf , Using Type 1 Fonts , Configuring pdfjadetex , Embedding Computer Modern fonts
pedantic, LyX environments
perl, thumbpdf
Perl5, thumbpdf
Perlmod, Refdb
pgsql, Refdb
philosophy, LyX environments, The RefDB method
pictograms, Optimal PDF
pipes, Explaining the magic: the details , awk localization
placeholders, Document creation: HTML, HTML validation
platforms, LyX , Refdb
PNG, Admonitions , Callouts , Add density to images, Openjade errors, Runsed, sed and sedscr, Figures , The magic behind the math, Solution, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl
portability, Required software
porting, Required software
post-processing, DBTeXMath
PostgreSQL, Refdb
PostScript, Refdb
PostScript®, Dvips, Ghostscript and ImageMagik, Adapt pdftex.cfg , Using Type 1 Fonts , Embedding Computer Modern fonts, Mathematics, Math processing with texmath2pngbmp.pl
pre-parsed, The RefDB method
PRE.SCREEN, DSSSL stylesheets
preamble, TeX errors, LaTeX errors
preample, Adapt the preample , Run sed and awk scripts, Bibliography without RefDB, Index , Document creation: HTML, Appendix, The RefDB method, Index , Math processing with texmath2pngbmp.pl
predilection, Using Type 1 Fonts
preferences, DSSSL stylesheets, Keyboard mapping configuration
principles, Accessibility
print.dsl, sgmltools , Adapt the DocBook DSSSL stylesheets , Document creation: PS
PRINT_PDF_DSL, Document creation: PDF
procedure, Refdb, Bibliography with RefDB, Automatic Index generation, Missing $ inserted, Explaining the magic: the details , Document creation: HTML, The RefDB method, From .lyx to .pdf , Figures
process-math, Math processing in the HTML stylesheets
processing, Credits, Sed and awk, Run sed and awk scripts, Set up your start and end scripts, Set up your bibliographic database, Use a CSS for DocBook , Writing in LyX, thinking in SGML , Tables, SGML code in program listings, Mathematics, LaTeX errors, Fatal format file error; I'm stymied, Unprintable characters, thumbpdf fails , Set environment variables, Housekeeping and special processing, Appendix, The RefDB method, HTML validation, Solution, HTML and RTF, Math processing in the HTML stylesheets, PDF and PS, Math processing in the print stylesheet, Problems of the DBTeXMath method, sed localization, awk localization, Bibliography
Process_RefDB, Bibliography with RefDB
product names, Acronyms, product names, applications
production, Using Type 1 Fonts , Further enhancements , Other methods , Bibliography
professional, LyX
Proof, Configuring pdfjadetex
PS, Introduction, Admonitions , Add density to images, Cool labels don't change!, Mathematics, Bibliography, The final step: invoking lyxtox , Openjade errors, Document creation: PS, The RefDB method, From .lyx to .pdf , Figures , Choosing the right font encoding, Optimal PS, Embedding Computer Modern fonts, Mathematics, Writing Mathematics in LyX, The magic behind the math, Solution, Math processing with texmath2pngbmp.pl, PDF and PS, Unescaping TeX equation code, Problems of the DBTeXMath method, Localization
PS-specific, Optimal PS
ps2pdf, thumbpdf, Mathematics
publication, The RefDB method, Other methods
publishing, Other methods
Pubmed, Refdb, Set up your bibliographic database
punctuation, Automatic Index generation, The RefDB method

Q

QBullets, Credits, CSS
Quotation, Automatic Index generation

R

Rahtz, TeX errors, The hyperref package
randomness, Solution
ranking, Cool labels don't change!
reconfigure, LyX , Reconfigure LyX , LyX errors, LaTeX errors
recto, Configuring pdfjadetex
RefDB, Refdb, Adapt the DocBook DSSSL stylesheets , Check paths of catalog files , Run sed and awk scripts, Set up your bibliographic database, Bibliography without RefDB, Bibliography with RefDB, Corrupted NFSS tables, Set program locations, Bibliography, The RefDB method
RefDB-created, The RefDB method
refdb-html.dsl, The RefDB method
RefDB-perlmod, Refdb
RefDB-perlmod-0.3.tar.gz, Refdb
RefDB-perlmod:, Refdb
RefDB-specific, The RefDB method
refdba, Refdb
refdbd, Refdb
refdbxp, The RefDB method
RefDB_db, Set up your bibliographic database, Bibliography with RefDB
REFDB_style, Bibliography with RefDB, The RefDB method
refs.lyx, Mass insertion of cross-references in LyX
regrouping, Cool labels don't change!
remote, Lynx
rendering, Abbreviations, DocBook , Solution
reorganization, Cool labels don't change!
reorganized, Cool labels don't change!
repository, Refdb
representation, The structure of TeX errors, Choosing the right font encoding, Problems
research, Automatic Index generation, DSSSL stylesheets
resolution, thumbpdf, Figures , Using Type 1 Fonts , Thumbnails , Embedding Computer Modern fonts
reverse-video, Callouts
RIS, Run sed and awk scripts, Set up your bibliographic database, The RefDB method
roundtrip, Mathematics
RPM, LyX , DocBook , TeX localization
RTF, Abbreviations, Introduction, DocBook , Openjade, pdfTeX and JadeTeX, Refdb, Admonitions , Add density to images, Document creation: RTF and TXT, Figures , Mathematics, Writing Mathematics in LyX, The magic behind the math, HTML and RTF, PDF and PS, Problems of the DBTeXMath method, Localization
rules, Abbreviations, DocBook , Cool labels don't change!, Openjade errors, Set environment variables, DSSSL stylesheets, Catalogs , Writing Mathematics in LyX, Problems of the DBTeXMath method
runbib, The RefDB method
runsed, Run sed and awk scripts, Runsed, sed and sedscr, Document creation: HTML

S

savoir, Hyphenation
scalability, The RefDB method
scalable, Using Type 1 Fonts
scaling, Using Type 1 Fonts
scheme-based, Abbreviations
scientific, Set up your bibliographic database
scope, Further enhancements , LyX localization
scripting, Required software
SECT1, DSSSL stylesheets
SECT2, DSSSL stylesheets
SECT3, DSSSL stylesheets
sed, Run sed and awk scripts
sedscr, Run sed and awk scripts, Bibliography with RefDB, Runsed, sed and sedscr, Shortcomings and bugs
sedscript, Runsed, sed and sedscr
sedscr_abi, Appendix
segmentation, sed segmentation fault
selector, CSS
Self-Published, Bibliography
Semantics, Abbreviations, DocBook
semi-automatic, Automatic Index generation
SEO, Cool labels don't change!
separation, Filenames
SERPS, Cool labels don't change!
SGML, Abbreviations, Introduction, Line of attack , DocBook , sgmltools , Openjade, pdfTeX and JadeTeX, Refdb, Add density to images, Run sed and awk scripts, Set up your bibliographic database, Writing in LyX, thinking in SGML , LyX environments, Authors, Credits, Roles, Paragraphs, Cross references , Images , Inline graphics, Admonitions , Callouts , Tables, List of figures, tables and equations , Epigraphs, SGML code in program listings, Filenames, Cool labels don't change!, Examples, Appendix, Bibliography without RefDB, Bibliography with RefDB, Index , Openjade errors, sed segmentation fault, Set environment variables, Main part, Runsed, sed and sedscr, Tidying up the SGML code, Acronyms, product names, applications, Document creation: PDF, DSSSL stylesheets, Inline graphics, Catalogs , Appendix, Bibliography, The RefDB method, Figures , Mathematics, Writing Mathematics in LyX, SGML math code correction, Problems, Solution, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl, Math processing in the print stylesheet, Problems of the DBTeXMath method, Localization, Shortcomings and bugs , Other methods , Bibliography
sgml-tools, LyX environments
sgmltools, sgmltools , Document creation: HTML, Document creation: PDF, Document creation: PS, Catalogs , Index , Figures , Embedding Computer Modern fonts
SGMLtools-lite, DocBook , sgmltools , Adapt the DocBook DSSSL stylesheets , Catalogs
sgmltools-lite-3.0.2-164, sgmltools
sgmltools-ps, Document creation: PS
SGML_CATALOG_FILES, Check paths of catalog files , Catalogs , The RefDB method
SGML_SEARCH_PATH, Set environment variables
shift_jis, Set environment variables
shortcomings, Main part
simulate, Document creation: PDF
single-character, The structure of TeX errors
skills, TeX errors
sloppy, HTML tidy
SMGL, The RefDB method
Solaris, Refdb
some-LyX-file.lyx, Mass insertion of cross-references in LyX
sosofo, Links to internet sites
sources, Availability of sources and support, Credits, Refdb
specification, Abbreviations, DocBook , The RefDB method, awk localization
Specifications, Abbreviations
splits, Tables
splitting, HTML validation
SP_CHARSET_FIXED, Set environment variables
SP_ENCODING, Set environment variables
SP_SYSTEM_CHARSET, Set environment variables
SQL, The RefDB method
standards, Run sed and awk scripts, LyX environments, HTML validation, Shortcomings and bugs
strategy, Cool labels don't change!
stream, Abbreviations, Sed and awk
style-sheet, Openjade localization
stylesheet, Abbreviations, Table of contents , List of figures, tables and equations , Set program locations, Document creation: PDF, Document creation: PS, DSSSL stylesheets, The RefDB method, Index , Using Type 1 Fonts , HTML validation, Mathematics, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl, PDF and PS, Math processing in the print stylesheet, Unescaping TeX equation code, Shortcomings and bugs , Bibliography
stylesheets, Alt attributes for images
subdirectory, Admonitions , Callouts
subexpression, Shortcomings and bugs
subsection, LyX environments, Cool labels don't change!, Openjade errors, Main part, DSSSL stylesheets
subsections, Cross references , DSSSL stylesheets
substitution, Inline graphics, Solution, Unescaping TeX equation code
subsubsection, Cool labels don't change!, Openjade errors, DSSSL stylesheets
SubSubSubsections, DSSSL stylesheets
subtopic, LyX environments
superset, Abbreviations, Layout Language Options
SuSEconfig, LyX , dvips localization
symbolic, Standard Bibliography methods in LyX, Helpful Hints and Tips
symbols, Automatic Index generation, Unprintable characters, Choosing the right font encoding, Solution, Problems of the DBTeXMath method, Localization
system, Figures
system-wide, Embedding Computer Modern fonts

T

T1, Unprintable characters, From .lyx to .pdf , Using Type 1 Fonts , Choosing the right font encoding, Using True Type fonts, Character Tables
tableofcontents, Bookmarks
tag, alt, Alt attributes for images
tag, title, Alt attributes for images
tagged, Set up your bibliographic database, Other methods
tarball, Formats, Refdb
tea, Line of attack
technology, Cool labels don't change!
TEI, Refdb
terminals, Lynx
terms, License, Cool labels don't change!, Automatic Index generation, Openjade errors
teTeX-based, Fatal format file error; I'm stymied
TeX, Credits, Required software , Openjade, pdfTeX and JadeTeX, TeX and LaTeX , Writing in LyX, thinking in SGML , Tables, Mathematics, TeX errors, The structure of TeX errors, LaTeX errors, TeX capacity exceeded, Fatal format file error; I'm stymied, Missing $ inserted, Set environment variables, From .lyx to .pdf , Using Type 1 Fonts , Using True Type fonts, Links to internet sites , Configuring pdfjadetex , Embedding Computer Modern fonts, Mathematics, Writing Mathematics in LyX, The magic behind the math, Problems, Solution, Math processing in the HTML stylesheets, Math processing with texmath2pngbmp.pl, PDF and PS, Math processing in the print stylesheet, Unescaping TeX equation code, Problems of the DBTeXMath method, Localization, Character Tables, Shortcomings and bugs
TeXbook, TeX errors
TEXINPUTS, Set environment variables
texmf.cnf, TeX capacity exceeded, Set environment variables
TEXMFCNF, TeX capacity exceeded, Set environment variables
TEXPSHEADERS, Set environment variables
thesis, LyX
third-tier, Accessibility
three-pass, From .lyx to .pdf , Thumbnails
thumbnails, Introduction, Dvips, Ghostscript and ImageMagik, thumbpdf, Use coolthumbs, Acrobat Reader 5 does not show thumbnails in Linux, Document creation: PDF, Optimal PDF , The hyperref package , Thumbnails
thumbpdf, thumbpdf, Use coolthumbs, Openjade errors, thumbpdf fails , sed segmentation fault, Acrobat Reader 5 does not show thumbnails in Linux, Set environment variables, Document creation: PDF, Thumbnails
thumbpdf.pl, thumbpdf
thumbpdf.tex, Acrobat Reader 5 does not show thumbnails in Linux
THUMB_PDF, Document creation: PDF
tidy, HTML tidy, Tidying up the SGML code, Document creation: HTML
TIFF, Openjade errors
Times-Roman, Using Type 1 Fonts
token, Openjade errors
tools, Abbreviations, Introduction, Line of attack , HTML tidy, Add density to images, Use a CSS for DocBook , The final step: invoking lyxtox , Explaining the magic: the details , Set environment variables, Catalogs , HTML validation, Accessibility, Mathematics, Localization, Bibliography
trace, TeX errors
traceon, TeX errors
tracing, TeX errors
track, Openjade errors
trademark, Disclaimer
Transitional, Shortcomings and bugs
transormations, From .lyx to .pdf
transparently, Bibliography with RefDB
tree-like, Bookmarks , Configuring pdfjadetex
trivial, Writing in LyX, thinking in SGML , Automatic Index generation
TrueType, Using Type 1 Fonts
tweak, Housekeeping and special processing, The RefDB method, Problems of the DBTeXMath method
two-letter, Set up your bibliographic database, lynx localization
TXT, Introduction, Required software , Cool labels don't change!, The final step: invoking lyxtox , Document creation: RTF and TXT, Localization, Shortcomings and bugs
TY, Set up your bibliographic database
Type1, Using Type 1 Fonts , Using True Type fonts, Configuring pdfjadetex , Embedding Computer Modern fonts
typewriter, Dvips, Ghostscript and ImageMagik, Tables, LaTeX errors, Standard Bibliography methods in LyX, Using Type 1 Fonts , xmodmap, xkeycaps, Helpful Hints and Tips, Layout Language Options, Keyboard mapping configuration, Character Tables, International Spellcheck Support
typing, The structure of TeX errors, Character Tables
typographic, Optimal PDF

Y

YaST, LyX

Notes

[1]

The bibliography of this document is generated by RefDB directly in SGML, so no LyX file is available for it.

[2]

The current version of the lyxtox script does not use sgmltools. Intermediate results must be processed to integrate Mathematics, Bibliography etc., and there were processing problems that turned out to be very difficult to debug; see runbib not working from lyxtox.

[3]

PostScript® is a registered trademark of Adobe Systems Incorporated, and is the main page description language in the UN*X world.

[4]

part of the tetex package on my SuSE system, see Section 3.5.

[5]

part of the tetex package on my SuSE system, see Section 3.5.

[6]

Acrobat® is a registered trademark of Adobe Systems Incorporated.

[7]

See also How to correctly invoke the lyxtox script.

[8]

Note that this may be specific to SuSE 9.0. Other distributions may have corrected it.

[9]

Note that this may be specific to SuSE 9.0. Other distributions may have corrected it.

[10]

The example is from the processing of the LyX file for the PHP-Nuke HOWTO. Visit the Homepage of the PHP-Nuke HOWTO to see the result. ;-)

[11]

There are various reasons why you may want to use the Old TeX (OT1) font encoding: you want to use the original Computer Modern fonts (which are available only in OT1) and/or have Mathematics (for which the original CM fonts are still a good choice), or you don't have other fonts, or you just find CM irresistible. ;-)

[12]

of SGML text, original text was certainly less than that.

[13]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[14]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[15]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[16]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[17]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[18]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[19]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[20]

I have inserted some blanks in the code snippet in order to prevent my own scripts (sedscr!) from matching and changing code that was meant to be an example ;-)

[21]

Yup, we use a sed script to change another sed script...

[22]

Distiller® is a registered trademark of Adobe Systems Incorporated.

[23]

A PostScript® and an encapsulated PostScript® file differ only in the bounding box statement. The preamble of the PostScript® file contains, for example

%%BoundingBox: 65 242 547 550 

while the preamble of the encapsulated PostScript® file contains

%%BoundingBox: 0 0 482 308 

Thus, the PostScript® file specifies an absolute position for the image, while the encapsulated PostScript® file does not. The encapsulated PostScript® file will be offset by some amount, to be determined by the program that includes it. Knowing this, you can easily convert from one format to the other manually, just by editing the BoundingBox statement.

(O.K., there is another small difference: the PostScript® file contains a showpage command that instructs the printer to print the page after rendering it.)
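
The relation between the two statements is plain arithmetic: the encapsulated box keeps the width and height of the original one (547 - 65 = 482 and 550 - 242 = 308) and moves its lower-left corner to the origin. A minimal sketch in shell and awk, assuming a hypothetical helper named shift_bbox (it is not part of the lyxtox package) that rewrites only the BoundingBox comment and leaves everything else in the file alone:

```shell
# shift_bbox: hypothetical helper, not part of the lyxtox package.
# Filter: rewrites a "%%BoundingBox: llx lly urx ury" line so that the box
# starts at the origin; every other line is passed through unchanged.
shift_bbox() {
    awk '
        /^%%BoundingBox:/ {
            # width = urx - llx, height = ury - lly
            printf "%%%%BoundingBox: 0 0 %d %d\n", $4 - $2, $5 - $3
            next
        }
        { print }
    '
}

# The values from the example above:
printf '%s\n' '%%BoundingBox: 65 242 547 550' | shift_bbox
# prints: %%BoundingBox: 0 0 482 308
```

Since all other lines pass through untouched, the same filter could be fed a whole file, e.g. shift_bbox < figure.ps > figure.eps (file names are placeholders).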

[24]

A PostScript® and an encapsulated PostScript® file differ only in the bounding box statement. The preamble of the PostScript® file contains, for example

%%BoundingBox: 65 242 547 550 

while the preamble of the encapsulated PostScript® file contains

%%BoundingBox: 0 0 482 308 

Thus, the PostScript® file specifies an absolute position for the image, while the encapsulated PostScript® file does not. The encapsulated PostScript® file will be offset by some amount, to be determined by the program that includes it. Knowing this, you can easily convert from one format to the other manually, just by editing the BoundingBox statement.

(O.K., there is another small difference: the PostScript® file contains a showpage command that instructs the printer to print the page after rendering it.)

[25]

I must say, though, that I had to add the right density to the EPS and PDF versions too, after all: the original ones appeared too large in the PDF and PS documents.

[26]

Depending on your stylesheets, you may need to copy them to wherever they expect the icons to be, e.g. /usr/share/sgml/docbkdsl/images or somewhere else. YMMV.

[27]

Not to be confused with “True Type” fonts.

[28]

Download the freely available version (for Windows, Linux and Solaris) from http://www.pdf-tools.com/en/products_evaluation.html; free use is limited to evaluation purposes only, though, so you will need a license for production work.

[29]

Note, however, that Bobby itself is just an attempt at automating an accessibility test and has not evaded criticism. In Accessible shopping, for example, we read:

Bobby is too primitive and unreliable and has a tendency to give false-negative results, flunking absolutely everything but the simplest text-only sites.

[30]

Even if the starting point is not a LyX document, the task of presenting Mathematics on the Web is not a trivial one, although (or exactly because) there is a bunch of solutions to choose from, see Math Typesetting for the Internet.

[31]

as you can see already, cross-references to equations (and with them also equation labels and titles) do work for all formats too with the method I will describe.

[32]

the creator of the DBTeXMath method that we are going to use here

[33]

The original script was called texmath2png. I added “bmp” to the name, because it now converts to BMP format too.

[34]

of course, you take away the quotes

[35]

at least not easily: we would probably need two consecutive invocations of sed, or employ some complicated branching. Contrary to my usual predilection, I chose to make it as simple as possible, rather than complex and wonderful. ;-) The next LyX release may render it obsolete anyway.

[36]

It doesn't matter what you change it to, as long as it is different from <equation>.

[37]

You can cross-reference an equation only if you have previously set a label on it, but the LyX label cannot (and will not) be exported to SGML, since it refers to a line in a possibly multi-line equation: what id should then be exported to SGML for an equation with three lines, all carrying a label in LyX?

[38]

The IF statement in the code avoids the substitution for the brackets that surround the alt tags themselves.

[39]

Both HTML stylesheets contain the same code for Mathematics.

[40]

More precisely, the equation-list.sgml file is itself an SGML document, which should validate against the following DTD:

<!DOCTYPE equation-set [
<!ELEMENT equation-set - - (texequation+)>
<!ATTLIST equation-set
latexopt CDATA #IMPLIED
density CDATA #IMPLIED
usepackage CDATA #IMPLIED>
<!ELEMENT texequation - - (#PCDATA)>
<!ATTLIST texequation fileref CDATA #REQUIRED>
]>

[41]

You can see the equation-list.sgml file for this document here: equation-list.sgml.

[42]

It also unescapes some characters in the TeX code:

sub unescape {
    $eqn =~ s/&#38;/&/g;
    $eqn =~ s/&#62;/\>/g;
    $eqn =~ s/&#60;/\</g;
}

[43]

This is new here. The original texmath2png.pl file did not compute BMP versions of the equation images.

[44]

The print stylesheets lyxtox-print-pdf.dsl and lyxtox-print-ps.dsl both contain the same code for Mathematics. This can probably be modularized further in a future version of the scripts.

[45]

The PS version is then only one step away, with the aid of dvips.

[46]

Actually, pdfjadetex has to be called up to three times consecutively to produce table of contents, cross-references etc. correctly. lyxtox already does this for you.
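
One common way to decide how many runs are needed is to repeat the run until the auxiliary file stops changing. A minimal sketch of that idea, which is not the actual code from lyxtox: the helper name run_until_stable and the file names are hypothetical, and it assumes that pdfjadetex, being LaTeX-based, keeps its cross-reference data in an .aux file next to the input:

```shell
# run_until_stable: hypothetical helper, not taken from lyxtox.
# Usage: run_until_stable AUXFILE COMMAND [ARGS...]
# Runs COMMAND at most three times and stops early as soon as AUXFILE is
# identical after two consecutive runs. The number of runs is left in $runs.
run_until_stable() {
    aux=$1
    shift
    runs=0
    previous=none
    while [ "$runs" -lt 3 ]; do
        "$@" || return 1            # give up if the command itself fails
        runs=$((runs + 1))
        current=$(cksum < "$aux")   # checksum of the auxiliary file
        if [ "$current" = "$previous" ]; then
            break                   # nothing changed: references are settled
        fi
        previous=$current
    done
}

# Assumed invocation (file names are placeholders):
# run_until_stable example.aux pdfjadetex example.tex
```

If the auxiliary file is already identical after the second run, the third run is skipped; otherwise the loop stops at three runs, matching the upper bound given above.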

[47]

Actually, jadetex has to be called up to three times consecutively to produce table of contents, cross-references etc. correctly. lyxtox already does this for you.

[48]

Adding “role=tex” to the <alt> tag can be done very easily in the awkscr_math script.

[49]

You could also save yourself some typing by executing xkeycaps > .Xmodmap. This will create a usable map file. Of course, if you hit the “output keymap” button in xkeycaps more than once, the resulting map file will be a mess. As with all things, xkeycaps is a tool, and only as intelligent as the person on the other end.

[50]

In LaTeX terms, selecting a language other than default adds Babel support. If you do not have Babel installed, refer to the different LaTeX distributions for it.

[51]

This is also true if you do use the T1 encoding, but instruct Openjade to use the Computer Modern family of fonts (in the T1 encoding this time, of course) through the setting of the %body-font-family%, %mono-font-family%, %title-font-family%, %admon-font-family% and %guilabel-font-family% DSSSL variables in the lyxtox-print-pdf.dsl stylesheet (see Section 7.1.5) - as you can see very well in the PDF version of this document.

[52]

This only holds when you want to input these quotes yourself. The automatic quote feature described in the “Quotes” Subsection of the LyX User's Guide (in the Section “A Few Words about Typography”) will automatically generate LaTeX code adapted to the available fonts and packages.

[53]

Not to be confused with the French quotes symbol (the “guillemet”).

[54]

Not to be confused with the French quotes symbol (the “guillemet”).

[55]

The dead macron is usually not needed, as you will use a non-dead key for this instead. For example, S-M-minus or, if .Xmodmap is correct, S-M-macron.

[56]

These characters might not look very nice on screen, but they will be just fine when run through LaTeX and printed.

[57]

You will have to scroll to the bottom (perhaps after setting the “focus” with a click somewhere in the text with the mouse, since it is a framed design) to see the List of Figures.

Last updated Mon Sep 24 01:19:25 CEST 2007 Permalink: http://www.karakas-online.de/mySGML/mySGML.html All contents © 2002-2007 Chris Karakas