Copy the following files into your working directory:
sedscr (a sed script that corrects LyX's exported SGML file),
sedscr_top (a sed script you can use to eliminate the “_top” target attribute of links whose link text contains a given regular expression string of your choice),
sedscr_val (a sed script that makes all the changes necessary for the HTML document to validate against the HTML standards; see Chapter 8 on this subject),
sedscr_ris (a sed script that can create full RIS datasets out of a file containing URLs; it is included only for your convenience and is not strictly necessary for our method, see more details in Section 3.11, Section 5.19 and Section 7.1.10),
sedscr_abi (a sed script that will append the SGML entities (as defined in the Preamble, see Section 4.6) for the Appendix, the Bibliography and the Index at the end of the corrected SGML file, see Section 7.1.9),
sedscr_app (a sed script that will insert a label and title in the Appendix, as well as change the end tag from </article> to </appendix>),
sedscr_cit (a sed script that will create a LyX file containing citation labels, to be used in citations),
sedscr_bib (a sed script that corrects the Appendix code for the case where a bibliography is inserted after it, see Section 7.1.9),
awkscr_math (an awk script that prepares the Mathematics parts, like equations, for further processing, see Chapter 10, Section 10.1, Section 10.3),
awkscr_refdb_html and awkscr_refdb_print, used to create the necessary stylesheets if you are using RefDB (see Section 3.11, Section 5.19 and Section 7.1.10),
sedscr_tidy, a very rudimentary sed script that tries to reduce the line length of the SGML file by inserting newlines after <para> and </para> tags, and sedscr_tidy2, another sed script that corrects the output of the first. You would run these two as follows:
# Tidy up the SGML file.
# ${RUNSED} ${SEDSCRTIDY} $1.sgml
# ${RUNSED} ${SEDSCRTIDY2} $1.sgml
However, they don't produce correct results, so the calls are commented out in the lyxtox script.
sedscr_ima, a sed script that is used to produce another sed script, sedscr_img (sedscr_img is not included, as it is produced dynamically from the SGML file of the document and the sed script sedscr_ima).
sedscr_apa, a sed script that is used to erase <acronym>, <productname> and <application> tags from the alt and title texts in the dynamically created sed script sedscr_img.
lyxtox, the main script that creates all documents using the above scripts and the rest of the required software (Chapter 3).
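As a rough illustration of what sedscr_tidy is described as doing above (inserting a newline after each <para> and </para> tag), a sed invocation along the following lines could be used. This is only a sketch and not the contents of the actual sedscr_tidy; the function name tidy_para_sketch is hypothetical, and the sketch makes no attempt to protect verbatim content such as program listings.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the actual sedscr_tidy script.
# Writes FILE to standard output with a newline inserted after every
# <para> and </para> tag. The backslash-newline inside the replacement
# is the portable (POSIX) way to emit a newline from sed.
tidy_para_sketch() {
    sed -e 's/<para>/&\
/g' -e 's/<\/para>/&\
/g' "$1"
}
```

It could be tried on a copy of the document, for example `tidy_para_sketch mydoc.sgml > mydoc_tidy.sgml` (both file names are placeholders).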
Copy runsed somewhere like /usr/local/bin. lyxtox and runsed should be executable.
runsed is a simple script that I modified from the original runsed script found in O'Reilly's Unix Power Tools, Chapter 34, Section 3 “Testing and Using a sed Script: checksed, runsed”. It simply takes two filenames as arguments and then runs sed on the second file, using the first file as a sed script:
runsed sedscript file
A sed script is a script that tells sed what to do. sed, in turn, is a powerful stream editor suitable for batch processing (see Section 3.8). You don't have to worry about runsed, sedscr and lyxtox. You may want to have a look at lyxtox, though, to ensure that all paths are set correctly and to get some idea of what it does. It is very well commented. The gory details are in Section 7.1.
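To make the behaviour of such a wrapper more concrete, here is a minimal sketch of a runsed-style helper. It is not the actual runsed script distributed with this document: the function name runsed_sketch is hypothetical, and the sketch assumes that the target file is meant to be overwritten in place, and only if sed succeeds and produces non-empty output.

```shell
#!/bin/sh
# Hypothetical sketch of a runsed-style wrapper, NOT the author's script.
# Usage: runsed_sketch SEDSCRIPT FILE
# Assumption: FILE is rewritten in place with the result of the sed script.
runsed_sketch() {
    script=$1
    file=$2
    tmp=$(mktemp) || return 1

    # Run the sed script into a temporary file first, so that a failing
    # script or an empty result never clobbers the original file.
    if sed -f "$script" "$file" > "$tmp" && [ -s "$tmp" ]; then
        cat "$tmp" > "$file"    # overwrite the contents, keep permissions
        rm -f "$tmp"
    else
        echo "runsed_sketch: sed failed or produced no output; $file unchanged" >&2
        rm -f "$tmp"
        return 1
    fi
}
```

With this in place, `runsed_sketch sedscr mydoc.sgml` would apply sedscr to mydoc.sgml (a placeholder file name). Writing to a temporary file first is a deliberate choice: redirecting sed's output straight back into its own input file would truncate the file before sed reads it.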
Last updated Mon Sep 24 01:19:25 CEST 2007 | Permalink: http://www.karakas-online.de/mySGML/run-sed-awk-scripts.html | All contents © 2002-2007 Chris Karakas