I really thought that getting comments working in the SubEthaEdit LaTeX mode would be easy. It wasn't.
My expectation was that I could just adapt the Un/Comment Selected Lines script from the Objective C mode. However, I didn't find that script satisfactory. Aesthetically, it's unappealing: instead of using all the selected lines to decide whether to comment or uncomment, it looks only at the first line. You can also watch the comments being applied one line at a time, because the AppleScript implementation loops over the lines, and each iteration sends its own slow AppleEvent to SEE.
More importantly, the script behaves in a way that I consider simply incorrect. If lines are already commented, the objc mode script leaves them unaltered. Now consider working with a block of lines, only some of which are commented. If you invoke the Un/Comment Selected Lines script twice, the first invocation will comment and the second will uncomment. Since the first invocation leaves the originally commented lines unchanged, there is no longer any distinction between them and the originally uncommented lines. The second invocation thus removes the comments from all the lines, failing to restore the original state and changing the meaning of the program. In Objective C, that probably leads to an error at compile time. In LaTeX, you've just altered your document in a way that is quite likely to still be valid. It's also quite likely to be wrong, wrong, wrong!
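To make the failure concrete, here is a small sketch in shell. It is not the objc mode script; naive_toggle is a hypothetical stand-in that only mimics the behaviour described above (decide from the first line, and leave already-commented lines alone when commenting). How the real script treats uncommented lines while uncommenting is an assumption on my part.

```shell
# naive_toggle: hypothetical stand-in, NOT the actual objc mode script.
# $1 is the comment string; lines are read from stdin.
naive_toggle() {
    awk -v c="$1" '
        # Decide comment vs. uncomment from the first line only.
        NR == 1 { uncomment = (substr($0, 1, length(c)) == c) }
        uncomment {
            # Assumed behaviour: strip the comment string where present.
            if (substr($0, 1, length(c)) == c)
                print substr($0, length(c) + 1)
            else
                print
            next
        }
        # Commenting: lines that are already commented are left unaltered.
        substr($0, 1, length(c)) == c { print; next }
        { print c $0 }
    '
}

# A block in which only the second line starts out commented.
printf '%s\n' 'first line' '%second line' 'third line' |
    naive_toggle '%' | naive_toggle '%'
# Prints:
#   first line
#   second line
#   third line
```

After two invocations the second line has lost its comment, so the round trip does not restore the original text.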
In short, I decided I needed to rewrite the Un/Comment Selected Lines script to be (1) more aesthetically pleasing and (2) its own inverse. Working directly in AppleScript was not so easy. To eliminate the loop that causes the aesthetic issues, you really need to use a where clause that addresses all the lines at once. I couldn't get that to work. Maybe someone who's better with AppleScript could. But, really, why put in the effort, when the shell is considerably more powerful for text manipulation?
Once again, I'd figured it would be easy: just a little grep to detect whether I should comment or uncomment and a little sed to make the actual change. After actually trying it, I quickly realized that there were some real challenges in dynamically building up the regular expressions needed for grep and sed. As I often do when scripting, I found awk to be the solution to my problem, producing comment.sh:
    #!/bin/sh
    # Toggle comment status for text lines. Text lines are read from
    # stdin and un/commented text lines are written to stdout.
    #
    # Comments are defined by the line starting with a text string given
    # by the first argument. The lines will be uncommented if all lines
    # are commented, and commented if any or all of the lines are
    # uncommented. The script is its own inverse, i.e., piping the text
    # through the script twice writes the original text to stdout.

    #$Id: comment.sh,v 1.3 2007/12/18 21:40:47 mjb Exp $

    tmp=$(mktemp /tmp/comments.XXXXXXXXXXXXXXXXXXXX)
    clen=$(printf "%s" "$1" | wc -c)

    # Reduce each line to its first clen characters and look for any
    # prefix that is not the comment string. The -e keeps a comment
    # string that begins with "-" from being read as an option.
    tee "$tmp" |
    awk -v clen="$clen" '{ print substr($0, 1, clen) }' |
    grep -F -q -v -e "$1"

    # grep exits nonzero when every prefix matched, i.e. all lines are
    # commented. A plain test is used here because (( )) is not POSIX sh.
    if [ $? -ne 0 ]
    then
        # uncomment
        cat "$tmp" | awk -v lnbeg=$((clen+1)) '{ print substr($0, lnbeg) }'
    else
        # comment
        cat "$tmp" | awk -v comment="$1" '{ print comment $0 }'
    fi

    trap "rm -f $tmp; exit" EXIT HUP INT TERM
The script reads lines from stdin, writing either commented or uncommented lines to stdout. The comment string is given by the first argument to the script.
In comment.sh, I first make a temporary file, since I'll need to go through the lines twice; using tee, I can make a copy of the lines in the temp file. I determine the length of the comment string. I then use awk to extract just the first few characters of each line, comparing them to the comment string with grep -F (-F for fixed strings, no regular expressions!). That determines whether or not all lines start with the comment string.
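The detection step can be tried on its own. The sketch below wraps the same awk and grep -F test in a helper; the name all_commented and the use of -e are my additions for illustration, not part of comment.sh.

```shell
# all_commented: hypothetical helper name. Succeeds (exit status 0) only
# when every line on stdin starts with the comment string given in $1.
all_commented() {
    clen=$(printf '%s' "$1" | wc -c)
    awk -v clen="$clen" '{ print substr($0, 1, clen) }' |
        grep -F -q -v -e "$1"
    # grep -v exits 0 when it finds a prefix that is NOT the comment
    # string, so the sense of its exit status has to be inverted.
    [ $? -ne 0 ]
}

if printf '%s\n' '% one' '% two' | all_commented '%'; then
    echo "all commented"
fi
if ! printf '%s\n' '% one' 'two' | all_commented '%'; then
    echo "mixed"
fi
# Prints:
#   all commented
#   mixed
```

Note that an empty line counts as uncommented here, since its prefix is empty and so cannot contain the comment string.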
When all the lines start with the comment string, I uncomment by chopping off the initial characters using awk. Otherwise, I comment by printing both the comment string and the lines, again using awk. Again, note that I've avoided using regular expressions.
The last line ensures that the temp file is removed. There's not much else to say about it.
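The "its own inverse" property is easy to check from the shell. The sketch below is not comment.sh itself. It is a hypothetical variant, toggle_comment, that applies the same rule (uncomment only if every line is commented) but buffers the lines in an awk array instead of a temp file, which assumes the input fits comfortably in memory.

```shell
# toggle_comment: hypothetical name, same rule as described above.
# $1 is the comment string; lines are read from stdin.
toggle_comment() {
    awk -v c="$1" '
        BEGIN { all = 1 }
        {
            line[NR] = $0
            # One uncommented line is enough to force "comment" mode.
            if (substr($0, 1, length(c)) != c) all = 0
        }
        END {
            for (i = 1; i <= NR; i++) {
                if (all) {
                    print substr(line[i], length(c) + 1)
                } else {
                    print c line[i]
                }
            }
        }
    '
}

original=$(printf '%s\n' 'alpha' '%beta' 'gamma')
once=$(printf '%s\n' "$original" | toggle_comment '%')
twice=$(printf '%s\n' "$once" | toggle_comment '%')

printf '%s\n' "$once"
# Prints:
#   %alpha
#   %%beta
#   %gamma
[ "$twice" = "$original" ] && echo "round trip ok"
# Prints: round trip ok
```

Because the already-commented line picks up a second comment string on the first pass, the second pass can strip exactly one from every line and land back on the original text.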
With the comment.sh script available, it now remains to get the necessary text from the LaTeX document and send it to the script. I follow the basic strategy shown on the Coding Monkeys website, which consists of copying the text to the clipboard and using pbpaste to pipe it into the desired shell script. I wrapped all that up into an AppleScript handler:
    on shellTransform of inText for envString through pipeline given alteringLineEndings:altEnds
        set shellscript to envString & " export __CF_USER_TEXT_ENCODING=0x1F5:0x8000100:0x8000100; pbpaste | " & pipeline
        set the oldClipboard to the clipboard
        set the clipboard to the inText
        try
            set shellresponse to do shell script shellscript altering line endings altEnds
        on error errMsg number errNum from badObject
            set the clipboard to the oldClipboard
            error errMsg number errNum from badObject
        end try
        set the clipboard to the oldClipboard
        shellresponse
    end shellTransform
Note the use of the try block to restore the clipboard in case of error, followed by another statement restoring the clipboard. That way, the clipboard should always be restored to its original state. This construction is a little awkward, so the handler is a natural abstraction to hide the mess. The environment variable __CF_USER_TEXT_ENCODING follows the Coding Monkeys site example exactly; it doesn't seem to hurt when I omit it, but I'll just trust that it is correct.
What remains is to specify exactly what the text is. At first glance, it seems like we should just take the selected text. This has a serious drawback: you need to completely select all the lines you're interested in, or you'll add comments to the middle of a line. As an important special case, you'd be unable to just press the keyboard shortcut to comment out the current line with no selected text. So, I decided to extend the selection to complete the first and last lines of the selection; the special case is handled cleanly in this way, too. I defined another handler to manage the selection:
    to completeSelectedLines()
        tell the front document of application "SubEthaEdit"
            set {startChar, nextChar} to {startCharacterIndex of paragraph (startLineNumber of selection), nextCharacterIndex of paragraph (endLineNumber of selection)}
            set selection to {startChar, nextChar - 1}
        end tell
    end completeSelectedLines
The handler is a little complicated to understand because of how it is written. An equivalent form is:
    to completeSelectedLines2()
        tell the front document of application "SubEthaEdit"
            set startLineNum to startLineNumber of the selection
            set endLineNum to endLineNumber of the selection
            set startChar to startCharacterIndex of paragraph startLineNum
            set nextChar to nextCharacterIndex of paragraph endLineNum
            set selection to {startChar, nextChar - 1}
        end tell
    end completeSelectedLines2
The second form is a lot slower, though, because it sends several individual AppleEvents for work that the first set statement of the first form does in just one.
With that, all the infrastructure needed to put together the desired Un/Comment Selected Lines script is at hand. I'll do that in the next installment.
Update: A better shell script for handling the comments is available.