Applied Abstraction: LaTeX


Wednesday, October 15, 2008

Certainly More Interesting than Anything I've Ever Done with LaTeX

That is just insane. But in a good way!

Saturday, April 12, 2008

SEEing LaTeX 29: Listening to LaTeX

The SubEthaEdit mode I've developed in this series of posts is generally rather quiet. It doesn't announce successful production of a PDF from a LaTeX file; it just creates it and opens it. It also doesn't tell you when an error occurs.

You can have the errors reported by piping the output from the call to pdflatex into some command line tool. Exactly what tool to use is less clear. The natural choice for SubEthaEdit would be the see command line tool, but it's not very convenient, and has some real problems. Beyond that, calling out to see repeatedly will produce a new window each time, which is just annoying.

The Python mode has a nicer behavior. It has a Check Syntax command that always writes to a window named Python Check Syntax, opening it if need be or overwriting the contents otherwise. The Lua mode has an analogous Check Syntax command with similar behavior.

Let's make a shell script that does something similar. First, I encapsulate the relevant bits of AppleScript from the Python and Lua modes into some simple handlers. Second, I embed those into a shell script using the approach recently discussed here. The resulting script I call seeless, as it is usable as something like the less pager. I'll defer the text of the script until the end, first describing the usage.

Basic usage is just to pipe in some text:
echo Hello World! | seeless
This writes the text Hello World into a SubEthaEdit window titled seeless message, opening a new window if necessary or replacing the text in an existing window. Multiple windows with the same title are a bit problematic, with no guarantee that the window you want will be the one written to. Don't do that.

The title of the window can be specified:
echo Hello World! | seeless -t"A message for you, direct from the shell"
Appropriate choice of title allows a SubEthaEdit mode to establish its own reporting window.

The window is normally brought to the front when it is written to. Call seeless as
echo Hello World! | seeless -b
to leave the window in the background. This allows, e.g., a reporting window to be kept out of the way as a tab and only checked when something seems to be wrong.

There are two different modes, insert and append, for writing to the window. For insert mode, the window is cleared before writing the text from stdin, while append mode just appends to any existing text. To set append mode, use:
echo Hello World! | seeless -a
Insert mode is the default, but a flag exists for it, too:
echo Hello World! | seeless -i
Multiple flags for the insert and append mode can be given, with the final one determining the behavior.

In append mode, the text from a second call to seeless follows immediately after the text from the first. To give some visual space, specify a separator:
echo Hello World! | seeless -s"----------"
When a separator is given with the -s flag, append mode is automatically set. It is also possible to use a separator in insert mode, giving a form of header. For example,
echo Hello World! | seeless -s"$(date)" -i
shows the time when seeless was called. Another possibility would be to have a cluster of related programs writing to the same window, and using the separator to specify the program that wrote the latest text.
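
To make the interplay of insert mode, append mode, and the separator concrete, here is a minimal sketch in which a plain file stands in for the SubEthaEdit window. The write_report function and its arguments are hypothetical and are not part of seeless; the sketch only mimics the order in which things get written.

```sh
#!/bin/sh
# Hypothetical stand-in for seeless: a plain file plays the role of the window.
# write_report TARGET MODE [SEPARATOR]; MODE is "insert" or "append".
write_report() {
    target=$1
    mode=$2
    separator=${3-}
    if [ "$mode" = insert ]; then
        : > "$target"                             # insert mode: clear existing contents
    fi
    if [ -n "$separator" ]; then
        printf '%s\n' "$separator" >> "$target"   # separator precedes the new text
    fi
    cat >> "$target"                              # text from stdin goes last
}

report=$(mktemp)
echo "first run"  | write_report "$report" insert
echo "second run" | write_report "$report" append "----------"
cat "$report"
echo "third run"  | write_report "$report" insert "header"
cat "$report"
rm -f "$report"
```

After the second call the file holds both runs with the rule between them; after the third call only the header and the latest text remain.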

Let's take a look at how I put those options into use with the LaTeX mode. I set SEE_BIBTEX to:

'bibtex "${FILE%.tex}" | "$HOME/Library/bin/seeless" -t"LaTeX Messages" -s"bibtex ran at $(date)\n" -b -i &> /dev/null &'

SEE_LATEX_CLEANUP to:

'latexmk -C "$FILE" | "$HOME/Library/bin/seeless" -t"LaTeX Messages" -s"latexmk -C ran at $(date)\n" -b -i &> /dev/null &'

and SEE_LATEX_COMPILER to:

'latexmk -pdf -quiet "$FILE" | "$HOME/Library/bin/seeless" -t"LaTeX Messages" -s"latexmk ran at $(date)\n" -b -i &> /dev/null &'

Note that I have redirected the output of each call to seeless to /dev/null and made the calls asynchronous with &—SubEthaEdit hangs without doing this, requiring a force quit.
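
My understanding, which I have not verified against SubEthaEdit itself, is that the hang comes from the caller waiting for as long as any process still holds its output pipe. The sketch below shows the same effect with an ordinary command substitution. The run_detached name is made up for illustration, and it spells the redirection as > /dev/null 2>&1, which is the portable form of the bash shorthand &>.

```sh
#!/bin/sh
# run_detached CMD [ARGS...]: start CMD in the background with stdout and
# stderr sent to /dev/null, so the caller is not left waiting on its output.
run_detached() {
    "$@" > /dev/null 2>&1 &
}

# The command substitution returns at once, because the background job
# no longer holds the pipe that $(...) reads from.
captured=$(run_detached sh -c 'sleep 2; echo done')
echo "captured: [$captured]"
```

Dropping the redirection from run_detached makes the substitution wait the full two seconds, since the background job then keeps the pipe open.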

With the above settings, the LaTeX mode will cause SubEthaEdit to open up a report window titled LaTeX Messages whenever a document is typeset, bibtex is run, or the auxiliary files are cleaned up. I can put it out of the way, either as a tab or background document; because I've used the -b flag for each call, the report window will stay out of the way until I want it brought to the front. I've used a separator to show which feature was most recently used and at what time it was called. I've set insert mode with an -i flag, so I only see the latest call; by eliminating this flag, I'd have a chronological record of all the calls made (in the current editing session, anyway).

I haven't been using this very long, so there may be some bugs. However, it seems quite solid, and is definitely useful already. Download it here.

As promised above, I'll give the text of the script here, too. I've formatted the script as a shell script, to better show how the shell variables are used to adapt the behavior of the embedded AppleScript.

#!/bin/sh
 
 # Writes stdin to a SubEthaEdit document, modifying the contents if the
 # document already exists. Document is selected by title, with a default
 # title of 'seeless message'. Title can be specified with a '-t' flag. If there 
 # are multiple documents with the given title, one is chosen arbitrarily.
 #
 # Writing to the document occurs in two modes, insert and append. With
 # insert mode, the document is cleared before any text is written to 
 # it. In append mode, any existing text is maintained, with the new
 # text appended at the end. By giving an '-a' flag, append mode is set.
 # Giving an '-i' flag sets insert mode; insert mode is the default.
 #
 # A separator can be specified. The separator is written to the 
 # document before the text from stdin. Specifying a separator also sets 
 # append mode (but this can be turned off again if desired with an '-i' 
 # flag). The separator is specified with an '-s' flag; the argument 
 # following the flag is used as the separator.
 # 
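 # Giving a '-b' flag leaves the document in the background instead of
 # bringing it to the front. Giving an '-h' flag shows a usage summary.
 # An unrecognized flag is reported on stderr with the usage summary.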
 #
 
 #$Id: seeless.sh,v 1.9 2008/04/12 20:33:56 mjb Exp $
 
 # Copyright (c) 2008, Michael J. Barber
 # 
 # Permission is hereby granted, free of charge, to any person obtaining
 # a copy of this software and associated documentation files (the
 # "Software"), to deal in the Software without restriction, including
 # without limitation the rights to use, copy, modify, merge, publish,
 # distribute, sublicense, and/or sell copies of the Software, and to
 # permit persons to whom the Software is furnished to do so, subject
 # to the following conditions:
 # 
 # The above copyright notice and this permission notice shall be
 # included in all copies or substantial portions of the Software.
 # 
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 # IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
 # ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
 # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 
 
 PROGRAM=$(basename "$0")
 
 usage()
 {
     echo "Usage: $PROGRAM [-abhi] [-t title] [-s separator]"
 }
 
 title="$PROGRAM message"
 shouldClear=true
 shouldSeparate=false
 shouldBringToFront=true
 separator="----------"
 while getopts :t:ias:bh opt
 do
     case $opt in
     t)      title="$OPTARG"
             ;;
     i)      shouldClear=true
             ;;
     a)      shouldClear=false
             ;;
     s)      separator="$OPTARG"
             shouldSeparate=true
             shouldClear=false
             ;;
     b)      shouldBringToFront=false
             ;;
     h)      usage
             exit 0
             ;;
     '?')    echo "$PROGRAM: invalid option -$OPTARG" >&2
             usage >&2
             exit 1
             ;;
     :)      echo "$PROGRAM: option -$OPTARG requires an argument" >&2
             usage >&2
             exit 1
             ;;
     esac
 done
 
 shift $((OPTIND - 1))
 
 /usr/bin/osascript > /dev/null <<ASCPT
     set newContents to "$(cat | sed -e 's/\\/\\\\/g' -e 's/\"/\\\"/g')"
     set seeDoc to (ensureSEEDocumentExists for "$title")
     if $shouldClear then
         replaceContents for the seeDoc by ""
     end if
     if $shouldSeparate then
         extendContents for the seeDoc by the "$separator"
     end if
     extendContents for the seeDoc by the newContents
     if $shouldBringToFront then
         tell application "SubEthaEdit" to show the seeDoc
     end if
     
     to ensureSEEDocumentExists for doctitle
         tell application "SubEthaEdit"
             if exists document named doctitle then
                 document named doctitle
             else
                 make new document with properties {name:doctitle}
             end if
         end tell
     end ensureSEEDocumentExists
     
     to replaceContents for seeDoc by newContents
         tell application "SubEthaEdit"
             set the contents of seeDoc to newContents
             clear change marks of seeDoc
             try
                 set modified of seeDoc to false
             end try
         end tell
     end replaceContents
     
     to extendContents for seeDoc by moreContents
         tell application "SubEthaEdit"
             if "" is not equal to the contents of the last paragraph of seeDoc then
                 set the contents of the last insertion point of the last paragraph of seeDoc to return
             end if
             set the contents of the last insertion point of the last paragraph of seeDoc to moreContents
             clear change marks of seeDoc
             try
                 set modified of seeDoc to false
             end try
         end tell
     end extendContents
 ASCPT
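
The sed call in the newContents line is what makes arbitrary text safe to paste between double quotes in the embedded AppleScript. Here is that step on its own, as a small sketch; the function name escape_for_applescript is mine, not something the script defines.

```sh
#!/bin/sh
# escape_for_applescript: filter stdin so that it can sit between double
# quotes in AppleScript source. Backslashes are doubled first, then double
# quotes are escaped, so the second step does not re-escape the first.
escape_for_applescript() {
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

printf '%s\n' 'He said "use \textbf here"' | escape_for_applescript
```

Note that the title and separator are interpolated into the AppleScript without this escaping. That is why a \n in a separator comes out as a line break, and also why a double quote in a title or separator would break the script.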
 

Monday, January 7, 2008

SEEing LaTeX 25: Putting Things in Order

As noted below, the entries in the LaTeX mode menu for SubEthaEdit need to be put into a sensible order. After some thinking, I decided that there are really three groups of scripts: scripts for interacting with the LaTeX system, scripts for simplifying typing, and a script for interacting with the LaTeX mode itself.

Ideally, we'd put the three into groups divided by a horizontal rule. Unfortunately, SEE does not (yet) allow dividing lines to be inserted into the menu. Regardless, let's organize the scripts into the three groups and, with one exception, just alphabetize within the groups. The exception is for the "Typeset and View" menu item; it strikes me as natural to put that first in the list.

End result is a menu ordered as:

Typeset and View
Clean Up Auxiliary Files
Run BibTeX
View

Complete Citation
Inline Math
Insert Environment...
Un/Comment Selected Lines

Customize Mode...

Although I've broken the groups apart with spacing, they'll just run together in the menu. Perhaps a later version of SEE will allow some more structure to be added.

One final change is that I've renamed the "Mode Environment..." menu item to "Customize Mode...". The two distinct meanings of "environment" struck me as confusing, and "Insert Environment..." is simply too appropriate for LaTeX to change.

I'll add another entry to the menu as soon as possible. It will go into the third group, with a title something like "Mode Help". I still have to write the mode help, first.

Sunday, January 6, 2008

SEEing LaTeX 24: For Completeness, BibTeX

I guess the right solution is to add a menu item for running bibtex, and that's it. Since I don't use makeindex myself, I'll leave that aside, unless someone wants to contribute scripts or just examples of use. I'd guess that the scripts would be easy enough: just adapt the ones I'll present for BibTeX.

Here's the AppleScript:

checkSaveStatus without updating
set bibScript to join of {modeEnvironment(), quotedForm for "$SEE_MODE_RESOURCES/bin/runbibtex.sh", quotedForm for documentPath()} by space
do shell script bibScript

on seescriptsettings()
return {displayName:"Run BibTeX"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

And here's the shell script that it calls:

#!/bin/sh
 
 #$Id: runbibtex.sh,v 1.1 2008/01/06 19:09:49 mjb Exp $
 
 PATH="$PATH:/usr/texbin:/usr/local/bin"
 export PATH
 
 BIBTEX=${SEE_BIBTEX:-'bibtex "$(basename "$FILE" .tex)"'}
 FILE="$(basename "$1")"
 DIRNAME="$(dirname "$1")"
 
 cd "$DIRNAME" || exit 1
 eval "$BIBTEX"
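
The default-with-override pattern in this script is worth seeing in isolation. In the sketch below, echo stands in for bibtex so that it runs without a TeX installation, and run_tool is a hypothetical name. The point is that the single-quoted default mentions $FILE before FILE is assigned, and eval expands it only at the end.

```sh
#!/bin/sh
# run_tool SOURCE_PATH: use $SEE_BIBTEX if set and non-empty, otherwise a
# default. `echo` stands in for bibtex so the sketch runs without TeX.
run_tool() {
    # Single quotes keep $FILE unexpanded here; eval expands it later.
    tool=${SEE_BIBTEX:-'echo bibtex "${FILE%.tex}"'}
    FILE=$(basename "$1")
    cd "$(dirname "$1")" || return 1
    eval "$tool"
}

# Subshells keep the cd and the variable settings from leaking out.
( unset SEE_BIBTEX; run_tool "/tmp/my paper.tex" )
( SEE_BIBTEX='echo custom "$FILE"'; run_tool "/tmp/my paper.tex" )
```

The first call prints bibtex my paper and the second prints custom my paper.tex, so a user setting replaces the whole command rather than just one argument.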
 

SEEing LaTeX 23: Am I Done?

Last July, I listed some desirable features for the LaTeX mode for SubEthaEdit. The mentioned features were integrating with a PDF viewer using pdfsync, enabling insertion of citation keys in bibtex format, allowing typesetting by calling pdflatex from within SEE, cleaning up auxiliary files, and commenting out selected lines. I also mentioned that it might be nice, if inessential, to be able to insert environments and formatting.

I've accomplished all of that. As well, I have introduced a mechanism for customizing the shell script environment for the mode and have developed a set of AppleScript handlers useful for scripting SEE. I'm quite pleased with how all of that has worked out. Not only do I now have a LaTeX mode that covers my main needs, but I've got a solid foundation on which I - and hopefully others! - can build support scripts for other modes.

That said, the LaTeX mode is not quite finished. It's clear that I should write some documentation, and add a "Mode Help" menu item. I should also better document SubEthaEditTools. Beyond that, there are two additional tasks.

First, the scripts in the LaTeX mode menu should be put into some sort of reasonable order, rather than the haphazard order that currently is there. The items appear in order based on the names of the scripts, which can differ from the entry in the menu. What is the right order to use?

Second, I've omitted some important elements of a LaTeX system, because latexmk handles them for me. For example, I have not provided a way to run bibtex. Should additional elements be added? If so, which? bibtex? makeindex? Something else?

Saturday, January 5, 2008

SEEing LaTeX 22: Inline Math

Environments aren't the only common LaTeX constructs that are awkward to type. The delimiters for inline math are pretty awkward, too. Let's add those as well:

set mathText to selectionText()
set wrappedText to "\$ " & mathText & " \$"
setSelectionText to wrappedText
if (length of mathText) equals 0 then
    set {startChar, nextChar} to selectionRange without extendingFront or extendingEnd
    setSelectionRange to (startChar + length of mathText + 3)
end if

on seescriptsettings()
    {displayName:"Inline Math", keyboardShortcut:"@~^m", inContextMenu:"yes"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

SEEing LaTeX 21: Inserting Environments

Environments again? Yes, but this time we're going to look at environments in LaTeX, not the shell environment. Environments are a bit of a pain to type, but the repetitive structure makes them suitable for automation: we just get the name of the environment, and then have a more-or-less standard form:

\begin{environmentname}
body
\end{environmentname}

Indentation of the body can be a little unclear. For example, I usually indent equation environments, but would never dream of indenting the document environment.
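
The standard form, with optional indentation, can be sketched as a shell filter outside the editor. This is only an illustration of the shape of the output, not how the mode does it; wrap_environment and its arguments are hypothetical.

```sh
#!/bin/sh
# wrap_environment NAME [INDENT]: read a body on stdin and print it wrapped in
# \begin{NAME} ... \end{NAME}. INDENT (default: one tab) prefixes body lines;
# pass an empty string for environments such as `document`.
wrap_environment() {
    name=$1
    indent=${2-$(printf '\t')}
    printf '\\begin{%s}\n' "$name"
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s%s\n' "$indent" "$line"
    done
    printf '\\end{%s}\n' "$name"
}

printf 'E = mc^2\n' | wrap_environment equation
printf 'Hello.\n'   | wrap_environment document ''
```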

To add environment insertion into the LaTeX mode for SubEthaEdit, we begin by getting the environment name using display dialog:

try
display dialog "Enter environment name:" with title "Insert Environment" default answer "equation"
on error number -128 -- user canceled
return
end try
set envName to text returned of result

The try block handles the case where the user cancels instead of entering an environment name. Somewhat arbitrarily, I set the default answer to be "equation", since my guess is that equations are probably the most common environment.

After that, there is just some fiddly work getting the formatting of the environment correct. It needs to intelligently insert newlines and tabs to keep the document readable. As well, something needs to be done with the selection text. There are two possibilities as I see it: (1) treat the selection as the name of the environment and (2) treat the selection as the body of the environment. I went with the latter. Finally, it would be nice to place the insertion point somewhere reasonable; I think it works nicely at the end of the body, especially since the insertion point is positioned to start typing immediately if the body is empty. Putting it all together, we have:

set {startSelected, nextSelected} to selectionRange without extendingFront or extendingEnd
set {startExtended, nextExtended} to selectionRange with extendingFront and extendingEnd

set prefix to selectByComparing(startSelected, startExtended, "", "\n")
set suffix to selectByComparing(nextSelected, nextExtended, "", "\n")
set indent to selectByComparing(startSelected, nextSelected, "\t", "")

set beforeInsertion to (join of {prefix, "\\begin{", envName, "}\n", indent, selectionText()} by "")
set afterInsertion to (join of {suffix, "\\end{", envName, "}\n" } by "")

setSelectionText to (beforeInsertion & afterInsertion)
setSelectionRange to startSelected + (count of beforeInsertion) - 1 + (count of suffix)

Note that I've made repeated use of a convenience function:

to selectByComparing(val1, val2, sameVal, diffVal)
    if val1 equals val2 then
        sameVal
    else
        diffVal
    end if
end selectByComparing

The selectByComparing handler is not part of SubEthaEditTools - maybe it should be?

That's all there is to it, apart from the boilerplate:

on seescriptsettings()
{displayName:"Insert Environment...", keyboardShortcut:"@^e", inContextMenu:"yes"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

Update: There is an interesting possibility for adjusting the default environment. I had used "equation" as the default, but it was pretty arbitrary. Another approach would be to just repeat whichever environment was last given. We just add a property to the script that holds the default environment, use its value in making the dialog, and update the property based on the dialog result:

property defaultEnvironment: "equation"

try
display dialog "Enter environment name:" with title "Insert Environment" default answer defaultEnvironment
on error number -128 -- user canceled
return
end try
set envName to text returned of result
set defaultEnvironment to envName

Is this actually a good idea? Or will it just be annoying? Hard to say without using it, so I guess I'll try it out for a while.

Friday, January 4, 2008

SEEing LaTeX 20: Getting Citations Right

I started enhancing the LaTeX mode for SubEthaEdit by looking at citations. The original approach, based on just using the input manager supplied with BibDesk, proved to be unsatisfactory. Let's fix that now.

The general approach seems clear enough. We need to determine a search term based on the cursor position in the LaTeX document, pass that to BibDesk to get matching documents, let the user pick which documents are relevant, format the selected documents, and insert the result into the document. Using SubEthaEditTools, our SEE scripting abstraction layer, it proves to be fairly straightforward.

The first step, determining the search term, is probably the trickiest. What I envision is that the user can enter a partial citation and have it finished by the script. Thus, we'll need to examine the text immediately before the insertion point and determine a partial citation key. We should not cross an open brace "{", as we don't want to include the macro. Additionally, we shouldn't cross a comma, since we might be looking at multiple citations within one macro. Let's not check the calling macro; this allows the completion to be invoked at inappropriate points, but also allows the completion to be invoked for user-defined macros.

One final issue is what we do when there is a range of text selected. Since the default behavior should be to have no text selected, we should treat a non-empty selection as meaningful. Let's take it to mean that the search should be constrained to only include the selected text. Therefore, we determine a search term based either on the selected text or all the text on the line preceding the insertion point.

Putting all that together, I came up with:

set {startChar, nextChar} to selectionRange without extendingFront or extendingEnd
if startChar equals nextChar then
    -- empty selection, try the whole line
    set selectionContents to extendedSelectionText with extendingFront without extendingEnd
else
    set selectionContents to selectionText()
end if
set macroArgument to the last item of the (tokens of the selectionContents between "{")
set searchTerm to the last item of the (tokens of the macroArgument between ",")

I've used a couple of the new handlers from SubEthaEditTools. These are hopefully self-explanatory, but check the implementation in case they are not.
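
For readers who would rather not dig through the handlers, the two splitting steps can be mimicked with shell parameter expansion. This sketch assumes that tokens ... between simply splits on the delimiter; partial_cite_key is a hypothetical name.

```sh
#!/bin/sh
# partial_cite_key TEXT: print the partial citation key at the end of TEXT,
# i.e. whatever follows the last "{" and then the last ",".
partial_cite_key() {
    text=$1
    argument=${text##*\{}               # drop everything up to the last open brace
    printf '%s\n' "${argument##*,}"     # drop everything up to the last comma
}

partial_cite_key 'as shown in \cite{knuth84,lam'
```

This prints lam. Text without a brace or comma comes back unchanged, which matches taking the last item of a one-item token list.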

Using the two tokens calls, we obtain a partial citation key. We pass that to BibDesk:

tell application "BibDesk"
set citeMatches to search for searchTerm with for completion
end tell

The ungrammatical with for completion will give us a list of cite keys, not just document titles. Each completion is given as a string containing both the cite key and some document information, separated by a " % " string.

We next present the list of completions to the user, in order to narrow the list down to just the appropriate citations. To present a list, we use choose from list from the AppleScript StandardAdditions. There are three cases worth considering. First, if there is only one publication, we can select it by default, so the user just confirms whether it is correct. Second, if there are multiple possible publications, we just show the list, declining to guess. Finally, if there aren't any matches, we just inform the user with display alert; in this case, we'll set the list of publications to the empty list. If no publications come back, then either the user canceled or there was nothing to choose from, so we should just exit and leave the document unchanged. Putting it all together, we arrive at:

if (count of citeMatches) equals 1 then
    choose from list citeMatches with title "Citation Matches" with prompt "One matching publication:" default items citeMatches
    set pubs to result
else if (count of citeMatches) > 1 then
    choose from list citeMatches with title "Citation Matches" with prompt "Please select publications:" with multiple selections allowed
    set pubs to result
else
    display alert "No matches found for partial citation \"" & searchTerm & "\""
    set pubs to {}
end if

if (count of pubs) equals 0 then
    -- user canceled, do nothing
    return
end if

We now have a non-empty list of completions. We need to split those apart to get the cite keys, then join the cite keys with commas. I used awk:

set citation to shellTransform of (join of pubs by "\n") for "" through "awk -F' % ' 'NR == 1 { printf(\"%s\", $1) } NR > 1 { printf(\",%s\", $1) }'" without alteringLineEndings

Pretty ugly! Let's reformat the core of the awk program to make it clearer:

NR == 1 { printf("%s", $1) } 
 NR > 1 { printf(",%s", $1) }
 

In this form, it's clear enough, keeping in mind that we use " % " as the field separator: we just print out the first field, corresponding to the cite key, with the first record treated specially to get the number of commas right.
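
The awk program is easy to try on its own from a terminal. The sample keys and document details below are invented for illustration, and format_citation is just a wrapper name for this sketch.

```sh
#!/bin/sh
# format_citation: read completion strings ("key % details") on stdin and
# print the cite keys joined by commas, with no trailing newline.
format_citation() {
    awk -F' % ' 'NR == 1 { printf("%s", $1) } NR > 1 { printf(",%s", $1) }'
}

# Sample input; these keys and details are invented for illustration.
printf '%s\n' \
    'knuth84 % Knuth, The TeXbook' \
    'lamport94 % Lamport, LaTeX: A Document Preparation System' |
    format_citation
echo
```

This prints knuth84,lamport94, ready to drop into a citation macro.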

Finally, we insert the formatted citation back into the document. I also move the insertion point to the end of the formatted citation, as a typing convenience. I had considered closing the brace for the citation macro, but that would make multi-citation lists more awkward, so decided against it. This is none too difficult:

setSelectionRange to {nextChar - (length of searchTerm), nextChar - 1}
setSelectionText to citation
setSelectionRange to (nextChar - (length of searchTerm) + (length of citation))

Again, I've used new handlers from the SubEthaEditTools library.

Putting it all together, and adding a seescriptsettings handler to integrate it into SEE, we get:

-- $Id: BibDeskCompletions.applescript,v 1.2 2008/01/04 18:40:16 mjb Exp mjb $

(*
Need to figure out the search term. Treat a selection as meaning to constrain the search
term to lie within the selection, and an empty selection as meaning to get the search term
from the preceding text on the line. We don't cross an opening brace, so that the search term
comes from a call to a macro. However, we don't check to see if the macro is one of the
standard citation macros, since we do want to allow user macros.
*)
set {startChar, nextChar} to selectionRange without extendingFront or extendingEnd
if startChar equals nextChar then
    -- empty selection, try the whole line
    set selectionContents to extendedSelectionText with extendingFront without extendingEnd
else
    set selectionContents to selectionText()
end if
set macroArgument to the last item of the (tokens of the selectionContents between "{")
set searchTerm to the last item of the (tokens of the macroArgument between ",")

tell application "BibDesk"
    set citeMatches to search for searchTerm with for completion
end tell

-- get list of publications, customizing user interaction based on number of matches
if (count of citeMatches) equals 1 then
    choose from list citeMatches with title "Citation Matches" with prompt "One matching publication:" default items citeMatches
    set pubs to result
else if (count of citeMatches) > 1 then
    choose from list citeMatches with title "Citation Matches" with prompt "Please select publications:" with multiple selections allowed
    set pubs to result
else
    display alert "No matches found for partial citation \"" & searchTerm & "\""
    set pubs to {}
end if

if (count of pubs) equals 0 then
    -- user canceled, do nothing
    return
end if

(*
At this point, there is a non-empty list of matches, which replaces the search term. By
construction, the search term always immediately precedes the end of the selection.
Call out to the shell to format the publication list into a LaTeX citation, insert the citation,
and then move the insertion point just after the citation.
*)
set citation to shellTransform of (join of pubs by "\n") for "" through "awk -F' % ' 'NR == 1 { printf(\"%s\", $1) } NR > 1 { printf(\",%s\", $1) }'" without alteringLineEndings
setSelectionRange to {nextChar - (length of searchTerm), nextChar - 1}
setSelectionText to citation
setSelectionRange to (nextChar - (length of searchTerm) + (length of citation))

on seescriptsettings()
    {displayName:"Complete Citation", shortDisplayName:"Citation", keyboardShortcut:"@^j", toolbarIcon:"ToolbarBibDesk.png", inDefaultToolbar:"yes", toolbarTooltip:"Complete citation using BibDesk", inContextMenu:"yes"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

Wednesday, December 26, 2007

SEEing LaTeX 18: Cutting SubEthaEdit out of the Picture

Let's take further steps along the path begun last time. Recall that we moved the portion of the AppleScript that dealt with the SubEthaEdit mode into the handler that produces the environment for the shell scripts. Let's introduce a few more handlers:

-- Manipulation of document properties

to checkSaveStatus given updating:shouldSave
    tell application "SubEthaEdit"
        if not (exists path of front document) then
            error "You have to save the document first"
        end if
        if shouldSave and (modified of front document) then
            try
                save front document
            end try
        end if
    end tell
end checkSaveStatus

on documentPath()
    tell application "SubEthaEdit" to get the path of the front document
end documentPath

on documentLine()
    tell application "SubEthaEdit" to get the startLineNumber of selection of front document
end documentLine

The checkSaveStatus handler is the most complex of the three, and the only one that warrants any discussion. It never returns a meaningful value, and thus is only to be called for its side effects. There are two possible side effects. First, if the front document in SEE has never been saved, an error is raised. Second, if requested and necessary, the handler will update the file on disk by saving the current document.

That may seem a bit obscure, so let's examine an example. The AppleScript for typesetting previously featured a rather complicated nesting of tell, try, and if blocks. Using the new handlers, the only thing outside the handlers in a rewritten typesetting script is:

checkSaveStatus with updating
set buildScript to join of {modeEnvironment(), quotedForm for "$SEE_MODE_RESOURCES/bin/buildlatex.sh", quotedForm for documentPath(), documentLine()} by space
do shell script buildScript

on seescriptsettings()
return {displayName:"Typeset and View", shortDisplayName:"Typeset", keyboardShortcut:"@b", toolbarIcon:"ToolbarIconBuildAndRun", inDefaultToolbar:"yes", toolbarTooltip:"Typeset and view the current document", inContextMenu:"no"}
end seescriptsettings

Just three statements, plus the seescriptsettings handler to connect it to SEE.

The various handlers introduced thus far have been to support several actions in the LaTeX mode: typesetting, viewing the product PDF in an external viewer, cleaning up auxiliary files, and commenting out lines. All of these actions can be rewritten using just the handlers, without directly addressing SubEthaEdit at all.

The handlers thus provide a useful abstraction layer for working with several distinct types of actions. They definitely won't cover every case of interest, but should still be useful patterns for a lot of common actions in many different modes.

SEEing LaTeX 17: A Bit More on Environments

To make the comments scripts a little more flexible, it turns out that we need to make some adjustments to how we handle the shell environment. The changes needed are minor, but, as we'll see, they lead to some new possibilities.

First, let's take a look at why the change is needed. I'd like to have more flexibility in how I call the comment.sh shell script. Significantly, I'd like to be able to call something else first, perhaps tr to change line endings to the newlines ("\n") that the shell requires. Alternatively, the shell script could be replaced with another tool for applying comments, much like how the typesetting scripts call pdflatex by default, but can be replaced by another tool like latexmk.

Doing that is quite straightforward. We just write a little shell script to handle the defaults, just like we did with other actions for the mode. It's just two lines:

#!/bin/sh -
 
 COMMENT=${SEE_LATEX_COMMENT:-'"$SEE_MODE_RESOURCES/bin/comment.sh" %'}
 
 eval "$COMMENT"

Notice that the default call needs to know the path to the resources directory for the mode, indicated here as the shell variable SEE_MODE_RESOURCES. Also, I've abandoned the original meaning of the SEE_LATEX_COMMENT variable; now, it defines the complete behavior for the comment command, not just the string to use.

Perhaps the simplest way to make the path available is just to define SEE_MODE_RESOURCES with the appropriate value. To do that, we rewrite the modeEnvironment handler to inject the appropriate value. I came up with this:

on modeEnvironment()
    tell application "SubEthaEdit" to set {modeName, modeResources} to {name, resource path} of the mode of the front document
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & modeName & "_environment.plist"
    join of {"export SEE_MODE_RESOURCES=", quotedForm for modeResources, "; ", readEnvironment out of envFilePath} by ""
end modeEnvironment

I've also eliminated the parameter to the handler, which had specified the language mode for SubEthaEdit. The mode details were only used in defining the environment, so it seemed sensible.

With the new modeEnvironment handler, the logic for the CommentLines AppleScript becomes quite simple:

set env to modeEnvironment()
set pipeline to quotedForm for "$SEE_MODE_RESOURCES/bin/commentlines.sh"

completeSelectedLines()
set outText to shellTransform of selectionText() for env through pipeline without alteringLineEndings
setSelectionText to outText

There is now no direct interaction with SEE; the program is written, in effect, in a domain-specific language for scripting SEE, abstracting away from the details of SEE's scripting implementation.

As a final point, I'd like to emphasize that the resource path for the mode is now available to the user. For example, I prefer to have a space after the percent sign for LaTeX comments. Thus, I just define SEE_LATEX_COMMENT to be '"$SEE_MODE_RESOURCES"/bin/comment.sh "% "' in the mode environment.

Sunday, December 23, 2007

SEEing LaTeX 16: Comments Continued

In installment 15 of this series, I discussed difficulties that I'd encountered while exploring adding comments to the SubEthaEdit LaTeX mode. In so doing, I presented a shell script and several AppleScript handlers to address the difficulties. With that infrastructure, it is not so hard to incorporate comments into the mode.

First, there is an additional, minor change. Earlier, I'd described a prependEnvironment AppleScript handler. This proves to be too specific. A slightly simpler modeEnvironment handler seems to be a better fit. It is defined as:

on modeEnvironment for seeMode
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & (name of seeMode) & "_environment.plist"
    readEnvironment out of envFilePath
end modeEnvironment

Hopefully, that is straightforward enough.

With the infrastructure set up, the logic for the AppleScript becomes pretty straightforward. We first get some needed information from the SEE document. We use that information to read out the environment using modeEnvironment and to build a pipeline string calling our comment.sh shell script. Then, we adjust the text selection to select complete lines, transform that text by running it through pipeline, and set the selection to the transformed text. Additionally, we define seescriptsettings to integrate our AppleScript into SEE. The settings are patterned after those of the Objective C mode. Here is the relevant fragment of the AppleScript:

tell application "SubEthaEdit"
        set activeMode to mode of front document
        set modeResources to resource path of activeMode
end tell

set env to modeEnvironment for activeMode

completeSelectedLines()
set inText to selectionText()
set pipeline to join of {quotedForm for (modeResources & "/bin/comment.sh"), quotedForm for "${SEE_LATEX_COMMENT:-%}"} by space

set outText to shellTransform of inText for env through pipeline without alteringLineEndings
setSelectionText to outText

-- SubEthaEdit settings

on seescriptsettings()
    {displayName:"Un/Comment Selected Lines", keyboardShortcut:"@/", inContextMenu:"yes"}
end seescriptsettings

Note that the comment string can be set with the SEE_LATEX_COMMENT environment variable, defaulting to a "%".

The result is fairly nice, but not without shortcomings. There is a brief, but noticeable, hesitation between activating the script and seeing the results. It's reasonably clear that building up the environment string is the problem; I'm not sure whether it's worth fixing or not. More significantly, there is a subtle logic problem. Implicitly, I've assumed that the LaTeX document is using Unix-style ("\n") line endings. That's a pretty safe assumption, most of the time, but can be wrong; we'll fix that next time.
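
To see why the line-ending assumption matters, here is a small sketch that is not part of the mode. It assumes classic Mac (carriage return only) line endings, and the helper names count_lines and cr_to_lf are invented for the illustration.

```shell
#!/bin/sh
# count_lines: report how many lines awk sees on stdin.
count_lines() {
    awk 'END { print NR }'
}

# cr_to_lf: turn classic Mac carriage returns into Unix newlines.
cr_to_lf() {
    tr '\r' '\n'
}

printf 'one\rtwo\rthree\r' | count_lines             # awk sees a single line
printf 'one\rtwo\rthree\r' | cr_to_lf | count_lines  # awk sees three lines
```

Without the conversion, a line-oriented tool like awk treats the whole selection as one line, so only one comment string would be added. Files with Windows-style endings would need a different treatment, such as deleting the carriage returns with tr -d '\r'.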

Friday, December 21, 2007

SEEing LaTeX 15: Comments on Comments

I really thought that getting comments working in the SubEthaEdit LaTeX mode would be easy. It wasn't.

My expectation was that I could just adapt the Un/Comment Selected Lines script from the Objective C mode. However, I didn't really find the script to be satisfactory. Aesthetically, it's unappealing. Instead of using all the selected lines to determine whether to comment or uncomment the lines, just the first line is used. You can see the comments being applied one line at a time, because the AppleScript implementation loops over the lines, with each iteration in the loop sending its own, slow AppleEvent to SEE.

More importantly, the script functions in a manner that I consider to simply be incorrect. If lines are already commented, the objc mode script leaves them unaltered. Now consider working with a block of lines, only some of which are commented. If you invoke the Un/Comment Selected Lines script twice, the first invocation will comment, and the second will uncomment. Since the first invocation leaves the originally commented lines unchanged, there is no distinction between them and the originally uncommented lines. The second invocation thus removes the comments from all the lines, failing to restore the original state, changing the meaning of the program. In Objective C, that probably leads to an error at compile time. In LaTeX, you've just altered your document in a way that is quite likely to still be valid. It's also quite likely to be wrong, wrong, wrong!

In short, I decided I needed to rewrite the Un/Comment Selected Lines script to be (1) more aesthetically pleasing and (2) its own inverse. Working directly in AppleScript was not so easy. To eliminate the loop that causes the aesthetic issues, you really need to use a where clause that addresses all the lines at once. I couldn't get that to work. Maybe someone who's better with AppleScript could. But, really, why put in the effort, when the shell is considerably more powerful for text manipulation?

Once again, I'd figured it would be easy. Just a little grep to detect whether I should comment or uncomment and a little sed to make the actual change. After actually trying it, I quickly realized that there were some real challenges in trying to dynamically build up the regular expressions needed for grep and sed. As I often do when scripting, I found awk to be the solution to my problem, producing comment.sh:

#!/bin/sh
 
 # Toggle comment status for text lines. Text lines are read from 
 # stdin and un/commented text lines are written to stdout.
 #
 # Comments are defined by the line starting with a text string given
 # by the first argument. The lines will be uncommented if all lines
 # are commented, and commented if any or all of the lines are
 # uncommented. The script is its own inverse, i.e., piping the text 
 # through the script twice writes the original text to stdout. 
 
 #$Id: comment.sh,v 1.3 2007/12/18 21:40:47 mjb Exp $
 
 tmp=$(mktemp /tmp/comments.XXXXXXXXXXXXXXXXXXXX)
 
 clen=$(printf "%s" "$1" | wc -c)
 
 tee "$tmp" | 
     awk -v clen="$clen" '{ print substr($0, 1, clen) }' | 
         grep -F -q -v "$1"
 
 if [ $? -ne 0 ]
 then
     # uncomment
     cat "$tmp" | awk -v lnbeg=$((clen+1)) '{ print substr($0, lnbeg) }'
 else
     # comment
     cat "$tmp" | awk -v comment="$1" '{ print comment $0 }'
 fi
 
 trap "rm -f $tmp; exit" EXIT HUP INT TERM
 

The script reads lines from stdin, writing either commented or uncommented lines to stdout. The comment string is given by the first argument to the script.

In comment.sh, I first make a temporary file, since I'll need to go through the lines twice; using tee, I can make a copy of the lines in the temp file. I determine the length of the comment string. I then use awk to cut away just the first few characters of each line, comparing them to the comment string with grep -F (-F for fixed strings, no regular expressions!). That determines whether or not all lines start with the comment string.

When all the lines start with the comment string, I uncomment by chopping off the initial characters using awk. Otherwise, I comment by printing both the comment string and the lines, again using awk. Again, note that I've avoided using regular expressions.

The last line ensures that the temp file is removed. There's not much else to say about it.
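
The round trip can be checked with a condensed sketch of the same logic. It is a variant written for illustration, not a replacement for comment.sh: the function name toggle_comment is made up here, and the temp file is removed explicitly at the end rather than through a trap.

```shell
#!/bin/sh
# toggle_comment PREFIX: read lines on stdin, write toggled lines to stdout.
# A condensed sketch of the comment.sh logic for experimenting.
toggle_comment() {
    prefix=$1
    clen=${#prefix}
    tmp=$(mktemp "${TMPDIR:-/tmp}/toggle.XXXXXX") || return 1
    cat > "$tmp"

    # grep -v succeeds if at least one line does NOT start with the prefix.
    if awk -v clen="$clen" '{ print substr($0, 1, clen) }' "$tmp" |
        grep -F -q -v -e "$prefix"
    then
        # Some line is uncommented: add the prefix to every line.
        awk -v comment="$prefix" '{ print comment $0 }' "$tmp"
    else
        # Every line is commented: strip the prefix from every line.
        awk -v lnbeg=$((clen + 1)) '{ print substr($0, lnbeg) }' "$tmp"
    fi
    rm -f "$tmp"
}

# Round trip over a partially commented block.
printf '%%already a comment\nplain line\n' | toggle_comment % | toggle_comment %
```

The first pass comments both lines, including the one that already was a comment, and the second pass strips one percent sign from each, so this partially commented block comes back unchanged.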

With the comment.sh script available, it now remains to get the necessary text from the LaTeX document and send it to the script. I follow the basic strategy shown on the Coding Monkeys website, consisting of copying the text to the clipboard and using pbpaste to pipe it into a desired shell script. I wrapped all that up into an AppleScript handler:

on shellTransform of inText for envString through pipeline given alteringLineEndings:altEnds
    set shellscript to envString & " export __CF_USER_TEXT_ENCODING=0x1F5:0x8000100:0x8000100; pbpaste | " & pipeline
    set the oldClipboard to the clipboard
    set the clipboard to the inText
    try
        set shellresponse to do shell script shellscript altering line endings altEnds
    on error errMsg number errNum from badObject
        set the clipboard to the oldClipboard
        error errMsg number errNum from badObject
    end try
    set the clipboard to the oldClipboard
    shellresponse
end shellTransform

Note the use of the try block to restore the clipboard in case of error, followed by another statement restoring the clipboard. It should then be that the clipboard is always restored to its original state. This construction is a little awkward, so the handler is a natural abstraction to hide the mess. The environment variable __CF_USER_TEXT_ENCODING follows the Coding Monkeys site example exactly - it doesn't seem to hurt when I omit it, but I'll just trust that it is correct.

What remains is to specify exactly what the text is. At first glance, it seems like we should just take the selected text. This has a serious drawback: you need to completely select all the lines you're interested in, or you'll add comments to the middle of a line. As an important special case, you'd be unable to just press the keyboard shortcut to comment out the current line with no selected text. So, I decided to extend the selection to complete the first and last lines of the selection; the special case is handled cleanly in this way, too. I defined another handler to manage the selection:

to completeSelectedLines()
    tell the front document of application "SubEthaEdit"
        set {startChar, nextChar} to {startCharacterIndex of paragraph (startLineNumber of selection), nextCharacterIndex of paragraph (endLineNumber of selection)}
        set selection to {startChar, nextChar - 1}
    end tell
end completeSelectedLines

The handler is a little complicated to understand because of how it is written. An equivalent form is:

to completeSelectedLines2()
    tell the front document of application "SubEthaEdit"
        set startLineNum to startLineNumber of the selection
        set endLineNum to endLineNumber of the selection
        set startChar to startCharacterIndex of paragraph startLineNum
        set nextChar to nextCharacterIndex of paragraph endLineNum
        set selection to {startChar, nextChar - 1}
    end tell
end completeSelectedLines2

The second form is a lot slower, though, because it sends individual AppleEvents to handle things that are done in just one with the first set statement of the first form.

With that, all the infrastructure needed to put together the desired Un/Comment Selected Lines script is at hand. I'll do that in the next installment.

Update: A better shell script for handling the comments is available.

Sunday, November 4, 2007

SEEing LaTeX 14: General Use of Environments

We've now seen how to define and update an environment for SubEthaEdit. The approach is modeled on how Mac OS X applications store their preferences; effectively, I used application preferences as a design pattern. To demonstrate the general applicability of the approach, let's apply it to some additional scripts; specifically, let's revisit viewing the PDF compiled from a LaTeX document and cleaning up the auxiliary files that LaTeX produces. The short version is that the approach works smoothly in both cases, with minimal differences in the AppleScripts used to add behavior to SEE. The long version follows, including the scripts to actually implement it.

First, let's look at viewing the compilation product. Previously, I'd just used the LaTeX file name to define the PDF file name, and sent it to PDFView using AppleScript. With the new approach, I need a shell script defining the default behavior, and an AppleScript invoking the shell script from SEE. The shell script is:

PATH="$PATH:/usr/texbin:/usr/local/bin"
 export PATH
 
 VIEWER=${SEE_LATEX_VIEWER:-'open "$PRODUCT"'}
 PRODUCT_TYPE="${SEE_LATEX_PRODUCT_TYPE:-pdf}"
 
 FILE="$(basename "$1")"
 DIRNAME="$(dirname "$1")"
 LINE="$2"
 PRODUCT="$(basename "$1" .tex).$PRODUCT_TYPE"
 
 cd "$DIRNAME"
 if [ -s "$PRODUCT" ]
 then
     eval $VIEWER 
 fi
 

Note that the new definition nowhere assumes that we will produce a PDF file as output; it could be used with latex to produce a DVI, for instance.
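
How the product name follows from the source file and the product type can be sketched as below. The function name product_name is invented for the illustration; the two expansions are the ones used in the script above.

```shell
#!/bin/sh
# product_name FILE: print the name of the compiled product for FILE,
# using SEE_LATEX_PRODUCT_TYPE if set and falling back to pdf.
product_name() {
    PRODUCT_TYPE="${SEE_LATEX_PRODUCT_TYPE:-pdf}"
    printf '%s\n' "$(basename "$1" .tex).$PRODUCT_TYPE"
}

product_name "/Users/me/Documents/my paper.tex"
```

With SEE_LATEX_PRODUCT_TYPE unset, the product is a PDF next to the source file; setting the variable to dvi switches the extension without touching the script. The path in the call is just a made-up example.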

The AppleScript is:

tell application "SubEthaEdit"
    if exists path of front document then
        set filePath to path of front document
        set lineNumber to startLineNumber of selection of front document
        set activeMode to mode of front document
        set modeResources to resource path of activeMode
    else
        error "You have to save the document first"
    end if
end tell

set viewScript to prependEnvironment for activeMode onto (join of {quotedForm for (modeResources & "/bin/viewproduct.sh"), quotedForm for filePath, lineNumber} by space)

do shell script viewScript

-- SubEthaEdit settings

on seescriptsettings()
    {displayName:"View", shortDisplayName:"View", keyboardShortcut:"^~@o", toolbarIcon:"ToolbarIconRun", inDefaultToolbar:"yes", toolbarTooltip:"View current document in external viewer", inContextMenu:"no"}
end seescriptsettings

on join of tokenList by delimiter
    set oldTIDs to text item delimiters of AppleScript
    set text item delimiters of AppleScript to delimiter
    set joinedString to tokenList as string
    set text item delimiters of AppleScript to oldTIDs
    return joinedString
end join

on quotedForm for baseString
    quote & baseString & quote
end quotedForm

to prependEnvironment for seeMode onto scriptString
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & (name of seeMode) & "_environment.plist"
    (readEnvironment out of envFilePath) & scriptString
end prependEnvironment

to readEnvironment out of plist
    readListPair out of plist
    environmentString from result
end readEnvironment

to readListPair out of plist
    tell application "System Events"
        if exists file plist then
            tell property list file plist
                get {name, value} of every property list item
            end tell
        else
            {{}, {}}
        end if
    end tell
end readListPair

on environmentString from keyValueListPair
    set {plistKeys, plistValues} to keyValueListPair
    set accumulator to {}
    set oldTIDs to text item delimiters of AppleScript
    set text item delimiters of AppleScript to ""
    repeat with i from 1 to number of items in plistKeys
        set tokens to {"export ", item i of plistKeys, "=", item i of plistValues, ";"}
        copy (tokens as string) to the end of the accumulator
    end repeat
    set AppleScript's text item delimiters to space
    set envString to accumulator as string
    set AppleScript's text item delimiters to oldTIDs
    envString
end environmentString

The shell script uses the same SEE_LATEX_VIEWER environment variable as the compilation script. I'll adapt the compilation script slightly so that the two cases can have separate viewing behavior, with both falling back to the contents of SEE_LATEX_VIEWER by default. Essentially, this consists of changing just one line, replacing

VIEWER=${SEE_LATEX_VIEWER:-'open "$PRODUCT"'}
 

with

VIEWER=${SEE_LATEX_COMPILEVIEWER:-${SEE_LATEX_VIEWER:-'open "$PRODUCT"'}}
 

Note that the AppleScript uses the same code to read from the same plist of environment settings as the compilation script--no changes were needed to accommodate the new settings.
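
The nested fallback can be exercised on its own. In this sketch, pick_viewer is a name invented for the illustration; it only prints the command that would be chosen instead of running it, and the viewer commands assigned in the usage lines are placeholders.

```shell
#!/bin/sh
# pick_viewer: print the viewer command the compile step would use.
# Order of preference: SEE_LATEX_COMPILEVIEWER, then SEE_LATEX_VIEWER,
# then the built-in default. The single quotes keep $PRODUCT unexpanded
# until the command is eventually passed to eval.
pick_viewer() {
    VIEWER=${SEE_LATEX_COMPILEVIEWER:-${SEE_LATEX_VIEWER:-'open "$PRODUCT"'}}
    printf '%s\n' "$VIEWER"
}

pick_viewer
SEE_LATEX_VIEWER='open -a Preview "$PRODUCT"'    # placeholder command
pick_viewer
SEE_LATEX_COMPILEVIEWER='true'                   # placeholder: do nothing
pick_viewer
```

Each assignment takes precedence over the previous choice, so a user who sets only SEE_LATEX_VIEWER gets the same viewer for both commands, while setting SEE_LATEX_COMPILEVIEWER changes the behavior after compiling only.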

Second, let's examine cleaning up the auxiliary files. The shell script is:

PATH="$PATH:/usr/texbin:/usr/local/bin"
 export PATH
 
 CLEANUP=${SEE_LATEX_CLEANUP:-'rm -f $(basename "$FILE" .tex).{aux,bbl,blg,dvi,log,out,ps,pdf,pdfsync,toc}'}
 
 FILE="$(basename "$1")"
 DIRNAME="$(dirname "$1")"
 PRODUCT_TYPE="${SEE_LATEX_PRODUCT_TYPE:-pdf}"
 PRODUCT="$(basename "$1" .tex).$PRODUCT_TYPE"
 
 cd "$DIRNAME"
 eval $CLEANUP

The cleanup behavior can be defined in a SEE_LATEX_CLEANUP variable, which sets the CLEANUP variable. By default, CLEANUP removes files with the same base name as the LaTeX file but with different filename extensions. The list of extensions (aux, bbl, blg, dvi, log, out, ps, pdf, pdfsync, toc) is fairly arbitrary; it is essentially the set of files that my own documents produced.
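
Before letting a cleanup command loose, it can help to see what it would touch. The sketch below is a hypothetical dry-run helper, not part of the mode: cleanup_targets is an invented name, and it lists the matching files instead of removing them. It uses a plain loop over the extensions because brace expansion is a bash feature rather than part of POSIX sh.

```shell
#!/bin/sh
# cleanup_targets FILE.tex: list the auxiliary files that exist next to
# FILE.tex, one per line, without deleting anything.
cleanup_targets() (
    cd "$(dirname "$1")" || exit 1
    base=$(basename "$1" .tex)
    for ext in aux bbl blg dvi log out ps pdf pdfsync toc
    do
        if [ -e "$base.$ext" ]
        then
            printf '%s\n' "$base.$ext"
        fi
    done
)
```

Called as cleanup_targets "/path/to/paper.tex", it prints names such as paper.aux and paper.log if they exist. The body runs in a subshell, so the cd does not affect the caller.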

The associated AppleScript is:

tell application "SubEthaEdit"
    if exists path of front document then
        set filePath to path of front document
        set activeMode to mode of front document
        set modeResources to resource path of activeMode
    else
        --Unsaved document, so LaTeX not run on it and can just return
        return
    end if
end tell

set cleanupScript to prependEnvironment for activeMode onto (join of {quotedForm for (modeResources & "/bin/cleanupaux.sh"), quotedForm for filePath} by space)

do shell script cleanupScript

on seescriptsettings()
    return {displayName:"Clean Up Auxiliary Files"}
end seescriptsettings

on join of tokenList by delimiter
    set oldTIDs to text item delimiters of AppleScript
    set text item delimiters of AppleScript to delimiter
    set joinedString to tokenList as string
    set text item delimiters of AppleScript to oldTIDs
    return joinedString
end join

on quotedForm for baseString
    quote & baseString & quote
end quotedForm

to prependEnvironment for seeMode onto scriptString
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & (name of seeMode) & "_environment.plist"
    (readEnvironment out of envFilePath) & scriptString
end prependEnvironment

to readEnvironment out of plist
    readListPair out of plist
    environmentString from result
end readEnvironment

to readListPair out of plist
    tell application "System Events"
        if exists file plist then
            tell property list file plist
                get {name, value} of every property list item
            end tell
        else
            {{}, {}}
        end if
    end tell
end readListPair

on environmentString from keyValueListPair
    set {plistKeys, plistValues} to keyValueListPair
    set accumulator to {}
    set oldTIDs to text item delimiters of AppleScript
    set text item delimiters of AppleScript to ""
    repeat with i from 1 to number of items in plistKeys
        set tokens to {"export ", item i of plistKeys, "=", item i of plistValues, ";"}
        copy (tokens as string) to the end of the accumulator
    end repeat
    set AppleScript's text item delimiters to space
    set envString to accumulator as string
    set AppleScript's text item delimiters to oldTIDs
    envString
end environmentString

Again, the bulk of the script is unchanged, with no changes at all to the portions handling the environment settings.
