Wednesday, December 26, 2007

SEEing LaTeX 18: Cutting SubEthaEdit out of the Picture

Let's take further steps along the path begun last time. Recall that we moved the portion of the AppleScript that dealt with the SubEthaEdit mode into the handler that produces the environment for the shell scripts. Let's introduce a few more handlers:
-- Manipulation of document properties

to checkSaveStatus given updating:shouldSave
    tell application "SubEthaEdit"
        if not (exists path of front document) then
            error "You have to save the document first"
        end if
        if shouldSave and (modified of front document) then
            try
                save front document
            end try
        end if
    end tell
end checkSaveStatus

on documentPath()
    tell application "SubEthaEdit" to get the path of the front document
end documentPath

on documentLine()
    tell application "SubEthaEdit" to get the startLineNumber of selection of front document
end documentLine


The checkSaveStatus handler is the most complex of the three, and the only one that warrants any discussion. It never returns a meaningful value, and thus is only to be called for its side effects. There are two possible side effects. First, if the front document in SEE has never been saved, an error is raised. Second, if requested and necessary, the handler will update the file on disk by saving the current document.

That may seem a bit obscure, so let's examine an example. The AppleScript for typesetting previously featured a rather complicated nesting of tell, try, and if blocks. Using the new handlers, the only thing outside the handlers in a rewritten typesetting script is:
checkSaveStatus with updating
set buildScript to join of {modeEnvironment(), quotedForm for "$SEE_MODE_RESOURCES/bin/buildlatex.sh", quotedForm for documentPath(), documentLine()} by space
do shell script buildScript

on seescriptsettings()
    return {displayName:"Typeset and View", shortDisplayName:"Typeset", keyboardShortcut:"@b", toolbarIcon:"ToolbarIconBuildAndRun", inDefaultToolbar:"yes", toolbarTooltip:"Typeset and view the current document", inContextMenu:"no"}
end seescriptsettings

Just three statements, plus the seescriptsettings handler to connect it to SEE.

The handlers introduced thus far support several actions in the LaTeX mode: typesetting, viewing the product PDF in an external viewer, cleaning up auxiliary files, and commenting out lines. All of these actions can be rewritten using just the handlers, without directly addressing SubEthaEdit at all.

The handlers thus provide a useful abstraction layer for working with several distinct types of actions. They definitely won't cover every case of interest, but should still be useful patterns for a lot of common actions in many different modes.

SEEing LaTeX 17: A Bit More on Environments

To make the comments scripts a little more flexible, it turns out that we need to make some adjustments to how we handle the shell environment. The changes needed are minor, but, as we'll see, they lead to some new possibilities.

First, let's take a look at why the change is needed. I'd like to have more flexibility in how I call the comment.sh shell script. Significantly, I'd like to be able to call something else first, perhaps tr to change line endings to the newlines ("\n") that the shell requires. Alternatively, the shell script could be replaced with another tool for applying comments, much like how the typesetting scripts call pdflatex by default, but can be replaced by another tool like latexmk.
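As a rough sketch of the first idea, a line-oriented filter can be wrapped between two tr calls. This assumes a document that uses bare carriage returns ("\r") as line endings; both function names are hypothetical, and comment_filter is only a stand-in for comment.sh.

```shell
#!/bin/sh
# Sketch only: wrap a line-oriented filter so it can process text that uses
# carriage returns ("\r") as line endings. Both function names are
# hypothetical; comment_filter merely stands in for comment.sh.

comment_filter() {
    # Prefix every line with the comment string given as the first argument.
    awk -v comment="$1" '{ print comment $0 }'
}

with_cr_endings() {
    # Convert "\r" to "\n", run the given command, then convert back.
    tr '\r' '\n' | "$@" | tr '\n' '\r'
}

# Show the result with od so the carriage returns are visible.
printf 'first\rsecond\r' | with_cr_endings comment_filter '%' | od -c
```

Documents with "\r\n" line endings would need different handling; this sketch does not cover them.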

Doing that is quite straightforward. We just write a little shell script to handle the defaults, just like we did with other actions for the mode. It's just two lines:
#!/bin/sh -

COMMENT=${SEE_LATEX_COMMENT:-'"$SEE_MODE_RESOURCES/bin/comment.sh" %'}

eval "$COMMENT"

Notice that the default call needs to know the path to the resources directory for the mode, indicated here as the shell variable SEE_MODE_RESOURCES. Also, I've abandoned the original meaning of the SEE_LATEX_COMMENT variable; now, it defines the complete behavior for the comment command, not just the string to use.
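To see how the default and an override interact with eval, here is a small sketch. It assumes nothing about the real comment.sh: fake_comment and run_comment are hypothetical names, and fake_comment just reports the argument it receives.

```shell
#!/bin/sh
# Sketch only: how the ${VAR:-default} expansion and eval interact.
# fake_comment is a hypothetical stand-in for comment.sh; it reports the
# argument it received instead of filtering text.

fake_comment() {
    printf 'comment string: [%s]\n' "$1"
}

run_comment() {
    # Use SEE_LATEX_COMMENT when it is set and non-empty, otherwise the default.
    COMMENT=${SEE_LATEX_COMMENT:-'fake_comment %'}
    # eval re-parses the string, so quotes stored inside it take effect.
    eval "$COMMENT"
}

# Without an override, the default command runs.
unset SEE_LATEX_COMMENT
run_comment

# With an override, the quoted "% " arrives as a single argument.
SEE_LATEX_COMMENT='fake_comment "% "'
run_comment
```

Because eval parses the stored string a second time, the variable can hold a whole command line, quoting included, rather than only a comment string.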

Perhaps the simplest way to make the path available is just to define SEE_MODE_RESOURCES with the appropriate value. To do that, we rewrite the modeEnvironment handler to inject the appropriate value. I came up with this:
on modeEnvironment()
    tell application "SubEthaEdit" to set {modeName, modeResources} to {name, resource path} of the mode of the front document
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & modeName & "_environment.plist"
    join of {"export SEE_MODE_RESOURCES=", quotedForm for modeResources, "; ", readEnvironment out of envFilePath} by ""
end modeEnvironment


I've also eliminated the parameter to the handler, which had specified the language mode for SubEthaEdit. The mode details were only used in defining the environment, so it seemed sensible.
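The effect of the prefix can be tried out directly in a shell. The following is only a sketch: the resource path is invented for illustration, echo stands in for the real script, and sh -c plays the part of do shell script, which passes its command string to /bin/sh.

```shell
#!/bin/sh
# Sketch only: the kind of command string that results when an environment
# prefix is joined to a script call. The resource path is invented for
# illustration, and echo stands in for the real script.

prefix="export SEE_MODE_RESOURCES='/tmp/My Modes/LaTeX.mode/Resources'; "
call='echo "$SEE_MODE_RESOURCES/bin/comment.sh"'

# The variable is expanded only when the inner shell runs the joined string.
sh -c "$prefix$call"
```

Since the variable is expanded at run time, anything that runs after the prefix can refer to the resource directory by name, without knowing the actual path.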

With the new modeEnvironment handler, the logic for the CommentLines AppleScript becomes quite simple:
set env to modeEnvironment()
set pipeline to quotedForm for "$SEE_MODE_RESOURCES/bin/commentlines.sh"

completeSelectedLines()
set outText to shellTransform of selectionText() for env through pipeline without alteringLineEndings
setSelectionText to outText

There is now no direct interaction with SEE; the program is written, in effect, in a domain-specific language for scripting SEE, abstracting away from the details of SEE's scripting implementation.

As a final point, I'd like to emphasize that the resource path for the mode is now available to the user. For example, I prefer to have a space after the percent sign for LaTeX comments. Thus, I just define SEE_LATEX_COMMENT to be '"$SEE_MODE_RESOURCES"/bin/comment.sh "% "' in the mode environment.

Sunday, December 23, 2007

SEEing LaTeX 16: Comments Continued

In installment 15 of this series, I discussed difficulties that I'd encountered while exploring adding comments to the SubEthaEdit LaTeX mode. In so doing, I presented a shell script and several AppleScript handlers to address the difficulties. With that infrastructure, it is not so hard to incorporate comments into the mode.

First, there is an additional, minor change. Earlier, I'd described a prependEnvironment AppleScript handler. This proves to be too specific. A slightly simpler modeEnvironment handler seems to be a better fit. It is defined as:
on modeEnvironment for seeMode
    set envFilePath to (path to preferences from user domain as string) & "de.codingmonkeys.SubEthaEdit." & (name of seeMode) & "_environment.plist"
    readEnvironment out of envFilePath
end modeEnvironment

Hopefully, that is straightforward enough.

With the infrastructure set up, the logic for the AppleScript becomes pretty straightforward. We first get some needed information from the SEE document. We use that information to read out the environment using modeEnvironment and to build a pipeline string calling our comment.sh shell script. Then, we adjust the text selection to select complete lines, transform that text by running it through pipeline, and set the selection to the transformed text. Additionally, we define seescriptsettings to integrate our AppleScript into SEE. The settings are patterned after those of the Objective C mode. Here is the relevant fragment of the AppleScript:
tell application "SubEthaEdit"
        set activeMode to mode of front document
        set modeResources to resource path of activeMode
end tell

set env to modeEnvironment for activeMode

completeSelectedLines()
set inText to selectionText()
set pipeline to join of {quotedForm for (modeResources & "/bin/comment.sh"), quotedForm for "${SEE_LATEX_COMMENT:-%}"} by space  

set outText to shellTransform of inText for env through pipeline without alteringLineEndings
setSelectionText to outText

-- SubEthaEdit settings

on seescriptsettings()
    {displayName:"Un/Comment Selected Lines", keyboardShortcut:"@/", inContextMenu:"yes"}
end seescriptsettings

Note that the comment string can be set with the SEE_LATEX_COMMENT environment variable, defaulting to a "%".

The result is fairly nice, but not without shortcomings. There is a brief, but noticeable, hesitation between activating the script and seeing the results. It's reasonably clear that building up the environment string is the problem; I'm not sure whether it's worth fixing or not. More significantly, there is a subtle logic problem. Implicitly, I've assumed that the LaTeX document is using Unix-style ("\n") line endings. That's a pretty safe assumption, most of the time, but can be wrong; we'll fix that next time.
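The line-ending problem is easy to reproduce. In this sketch, count_lines is a hypothetical helper that reports how many lines awk sees; with carriage-return line endings, the whole selection looks like a single line, so a per-line comment filter would touch only its very beginning.

```shell
#!/bin/sh
# Sketch only: awk splits records at "\n", so text with carriage-return
# line endings looks like one long line. count_lines is a hypothetical helper.

count_lines() {
    awk 'END { print NR }'
}

printf 'one\ntwo\nthree\n' | count_lines   # Unix line endings
printf 'one\rtwo\rthree\r' | count_lines   # carriage-return line endings
```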

Friday, December 21, 2007

SEEing LaTeX 15: Comments on Comments

I really thought that getting comments working in the SubEthaEdit LaTeX mode would be easy. It wasn't.

My expectation was that I could just adapt the Un/Comment Selected Lines script from the Objective C mode. However, I didn't really find the script to be satisfactory. Aesthetically, it's unappealing. Instead of using all the selected lines to determine whether to comment or uncomment the lines, just the first line is used. You can see the comments being applied one line at a time, because the AppleScript implementation loops over the lines, with each iteration in the loop sending its own, slow AppleEvent to SEE.

More importantly, the script functions in a manner that I consider to simply be incorrect. If lines are already commented, the objc mode script leaves them unaltered. Now consider working with a block of lines, only some of which are commented. If you invoke the Un/Comment Selected Lines script twice, the first invocation will comment, and the second will uncomment. Since the first invocation leaves the originally commented lines unchanged, there is no distinction between them and the originally uncommented lines. The second invocation thus removes the comments from all the lines, failing to restore the original state, changing the meaning of the program. In Objective C, that probably leads to an error at compile time. In LaTeX, you've just altered your document in a way that is quite likely to still be valid. It's also quite likely to be wrong, wrong, wrong!
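The failure can be imitated in a few lines of shell. This is a sketch of the behavior described above, not the objc mode's actual script: naive_toggle is a hypothetical name, it decides from the first line only, and it leaves already commented lines alone when commenting.

```shell
#!/bin/sh
# Sketch only: a toggle that decides from the first line and leaves already
# commented lines alone when commenting. naive_toggle is a hypothetical
# imitation of the behavior described above, not the objc mode's script.

naive_toggle() {
    awk '
        { line[NR] = $0 }
        END {
            # Decide from the first line only.
            uncomment = (substr(line[1], 1, 1) == "%")
            for (i = 1; i <= NR; i++) {
                commented = (substr(line[i], 1, 1) == "%")
                if (uncomment && commented)
                    print substr(line[i], 2)
                else if (!uncomment && !commented)
                    print "%" line[i]
                else
                    print line[i]
            }
        }'
}

# A block in which only the second line starts out commented.
printf 'text\n%%old draft\nmore text\n' | naive_toggle | naive_toggle
```

After the two invocations, the second line has lost its comment marker, so the text no longer matches the input.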

In short, I decided I needed to rewrite the Un/Comment Selected Lines script to be (1) more aesthetically pleasing and (2) its own inverse. Working directly in AppleScript was not so easy. To eliminate the loop that causes the aesthetic issues, you really need to use a where clause that addresses all the lines at once. I couldn't get that to work. Maybe someone who's better with AppleScript could. But, really, why put in the effort, when the shell is considerably more powerful for text manipulation?

Once again, I'd figured it would be easy. Just a little grep to detect whether I should comment or uncomment and a little sed to make the actual change. After actually trying it, I quickly realized that there were some real challenges in trying to dynamically build up the regular expressions needed for grep and sed. As I often do when scripting, I found awk to be the solution to my problem, producing comment.sh:
#!/bin/sh

# Toggle comment status for text lines. Text lines are read from
# stdin and un/commented text lines are written to stdout.
#
# Comments are defined by the line starting with a text string given
# by the first argument. The lines will be uncommented if all lines
# are commented, and commented if any or all of the lines are
# uncommented. The script is its own inverse, i.e., piping the text
# through the script twice writes the original text to stdout.

#$Id: comment.sh,v 1.3 2007/12/18 21:40:47 mjb Exp $

tmp=$(mktemp /tmp/comments.XXXXXXXXXXXXXXXXXXXX)

clen=$(printf "%s" "$1" | wc -c)

tee "$tmp" |
    awk -v clen="$clen" '{ print substr($0, 1, clen) }' |
        grep -F -q -v "$1"

if [ $? -ne 0 ]
then
    # uncomment
    cat "$tmp" | awk -v lnbeg=$((clen+1)) '{ print substr($0, lnbeg) }'
else
    # comment
    cat "$tmp" | awk -v comment="$1" '{ print comment $0 }'
fi

trap "rm -f $tmp; exit" EXIT HUP INT TERM

The script reads lines from stdin, writing either commented or uncommented lines to stdout. The comment string is given by the first argument to the script.

In comment.sh, I first make a temporary file, since I'll need to go through the lines twice; using tee, I can make a copy of the lines in the temp file. I determine the length of the comment string. I then use awk to cut away just the first few characters of each line, comparing them to the comment string with grep -F (-F for fixed strings, no regular expressions!). That determines whether or not all lines start with the comment string.

When all the lines start with the comment string, I uncomment by chopping off the initial characters using awk. Otherwise, I comment by printing both the comment string and the lines, again using awk. Again, note that I've avoided using regular expressions.

The last line ensures that the temp file is removed. There's not much else to say about it.
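The toggling rule can also be tried in isolation. The sketch below is not comment.sh itself: toggle_comment is a hypothetical name, the whole thing is folded into a single awk program that keeps the lines in memory instead of a temp file, and plain string comparison takes the place of grep -F. The rule is the same: uncomment only when every line starts with the comment string, otherwise comment every line.

```shell
#!/bin/sh
# Sketch only: the same idea as comment.sh, folded into one awk program.
# toggle_comment is a hypothetical name; plain string comparison replaces
# grep -F, and awk holds the lines in memory instead of a temp file.

toggle_comment() {
    awk -v comment="$1" '
        BEGIN { clen = length(comment) }
        {
            line[NR] = $0
            # Remember whether any line lacks the comment prefix.
            if (substr($0, 1, clen) != comment) uncommented = 1
        }
        END {
            for (i = 1; i <= NR; i++) {
                if (uncommented)
                    print comment line[i]            # comment every line
                else
                    print substr(line[i], clen + 1)  # uncomment every line
            }
        }'
}

# One pass comments every line, including the one that was already commented.
printf 'text\n%%old draft\nmore text\n' | toggle_comment '%'

# A second pass strips exactly one comment string per line again.
printf 'text\n%%old draft\nmore text\n' | toggle_comment '%' | toggle_comment '%'
```

Because an already commented line gains a second comment string on the first pass, the second pass restores it to its original, still commented state.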

With the comment.sh script available, it now remains to get the necessary text from the LaTeX document and send it to the script. I follow the basic strategy shown on the Coding Monkeys website, consisting of copying the text to the clipboard and using pbpaste to pipe it into a desired shell script. I wrapped all that up into an AppleScript handler:
on shellTransform of inText for envString through pipeline given alteringLineEndings:altEnds
    set shellscript to envString & " export __CF_USER_TEXT_ENCODING=0x1F5:0x8000100:0x8000100; pbpaste | " & pipeline
    set the oldClipboard to the clipboard
    set the clipboard to the inText
    try
        set shellresponse to do shell script shellscript altering line endings altEnds
    on error errMsg number errNum from badObject
        set the clipboard to the oldClipboard
        error errMsg number errNum from badObject
    end try
    set the clipboard to the oldClipboard
    shellresponse
end shellTransform

Note the use of the try block to restore the clipboard in case of error, followed by another statement restoring the clipboard. It should then be that the clipboard is always restored to its original state. This construction is a little awkward, so the handler is a natural abstraction to hide the mess. The environment variable __CF_USER_TEXT_ENCODING follows the Coding Monkeys site example exactly - it doesn't seem to hurt when I omit it, but I'll just trust that it is correct.

What remains is to specify exactly what the text is. At first glance, it seems like we should just take the selected text. This has a serious drawback: you need to completely select all the lines you're interested in, or you'll add comments to the middle of a line. As an important special case, you'd be unable to just press the keyboard shortcut to comment out the current line with no selected text. So, I decided to extend the selection to complete the first and last lines of the selection; the special case is handled cleanly in this way, too. I defined another handler to manage the selection:
to completeSelectedLines()
    tell the front document of application "SubEthaEdit"
        set {startChar, nextChar} to {startCharacterIndex of paragraph (startLineNumber of selection), nextCharacterIndex of paragraph (endLineNumber of selection)}
        set selection to {startChar, nextChar - 1}
    end tell
end completeSelectedLines

The handler is a little complicated to understand because of how it is written. An equivalent form is:
to completeSelectedLines2()
    tell the front document of application "SubEthaEdit"
        set startLineNum to startLineNumber of the selection
        set endLineNum to endLineNumber of the selection
        set startChar to startCharacterIndex of paragraph startLineNum
        set nextChar to nextCharacterIndex of paragraph endLineNum
        set selection to {startChar, nextChar - 1}
    end tell
end completeSelectedLines2


The second form is a lot slower, though, because it sends several individual AppleEvents to do what the first set statement of the first form does with just one.

With that, all the infrastructure needed to put together the desired Un/Comment Selected Lines script is at hand. I'll do that in the next installment.

Update: A better shell script for handling the comments is available.

Monday, December 17, 2007

If I Were Better With bash

There's an interesting question posed on the O'Reilly FYI blog: what would you do if you were better at bash? There is a prize for the best response: the deadline is tomorrow, so hurry if you're interested!

I was a little surprised at the responses. The responses fit into two groups. The first group of responses is, roughly, doing specific projects on the job. The second group is about rounding out skill sets, without being overly specific about applications of that improved skill set; while I definitely support self-improvement and continued learning, the responses in the second group don't really seem to answer the question. What surprised me was that no one took the opportunity to dream a little and suggest something out of the ordinary. Is bash really so prosaic?

Here's what I submitted:
If I were better at bash, I'd write a book.

Every year, many graduate students in the sciences are confronted with the fact that they have to use their conceptual knowledge of their field to conduct original research. It's not easy. One needless difficulty is that they need to write programs to support their research, but without having learned about the tools available. Tools like revision control systems, make, and shells like bash.

Their classmates and advisors generally don't know about those tools, either -- there is a cultural mismatch, and so there is no support. I'd write a book to provide that support and introduce those tools.

Basically, I'd write the book from which I could have so much benefitted when I was a physics graduate student about fifteen years ago.


What would you do?

Tuesday, December 4, 2007

I Guess That Answers It

Back in July, I posted twice about a promotion by MacUpdate. I first obliquely noted their curious incentive structure at the beginning of the promotion, questioning the basic idea. Towards the end of their promotion, I was considerably more direct and more critical.

To summarize, their promotion had a system of "unlocks," where additional -- and higher quality -- applications would be "unlocked" with sufficient sales and added to the promotion. This is, in short, a classically bad idea, because there must be early sales to add the extra applications and attract more customers, but, as a customer, you're better off waiting to make sure the extras are unlocked. Thus, the promotion either needs to be sufficiently attractive without the extra applications, or you need a bunch of (let's be positive) optimists to buy under the assumption that the best applications would in fact be reached.

The approach seemed like a bad idea to me at the start of the promotion, and the way the promotion played out only strengthened that. MacUpdate didn't actually reach their goals. They changed the targets and extended the promotion, so that all the applications were in fact provided. They really had to -- just imagine how poorly it would have reflected on both MacUpdate and the "premium" applications had they failed to unlock everything!

So, MacUpdate has another promotion. They've again got 10 applications, with 7 available at first and 3 more to be unlocked with sufficient sales. The target numbers for the unlocks are much more modest, which seems prudent. However, instead of unlocking the most valuable application last, they've gone and reversed it! That's right, the $300 Xmind Pro application is unlocked at 1000 sales, while the final unlock at 5000 sales gives the $45 PulpMotion application. Does this make sense? No, of course not. Really, what should we conclude from the structure? That PulpMotion is the crown jewel of the promotion? Isn't that tantamount to saying that Xmind is overpriced by an order of magnitude? (N.B., I have not used either Xmind or PulpMotion.)

It is obviously an attempt to ensure success, but I think that MacUpdate has drawn entirely the wrong conclusion. Wouldn't a simpler approach just have been to offer all ten applications from the beginning? If nothing else, it would have avoided the absurdly contrived bonus structure that they've produced.

My key point in the posts from July was put in the form of a question: is this really a good idea in the long run? I think MacUpdate has, unintentionally, provided a clear answer: no.

Sales, Sales, Sales!

There sure seem to be a lot of sales on Mac software this month. I don't recall quite so many last year. I've already mentioned MacSanta. There are also Give Good Food to Your Mac and a MacUpdate promotion. Are there others?

Of the three, Give Good Food to Your Mac is definitely the most appealing for me. It allows you to decide on your own bundle of applications, with a discount based on the number of applications. Their approach does lead to some oddities, though. For example, you should never buy between 7 and 9 applications, because it is always cheaper to add inexpensive applications and get 10. Even if you'd never use them, you still pay less, and, hey, I would use, e.g., CoverScout, even if I'd never buy it on its own merits.

MacSanta is more traditional in its structure. Unfortunately. New applications are announced every day, with a 20% discount possible for that day. Later in the month, you can still get the applications, but with only a 10% discount. It's a very annoying set-up. I really can't picture buying anything from it at the 10% level, but the 20% level seems only relevant for something that I'd decided to get anyway. Of course, if I've decided to get an application, I've probably already bought it! That said, I have already bought ShutterBug and may well get MarsEdit, too, so they are definitely doing something right. I guess it works to push me off the fence…

The MacUpdate promotion looks to be a good value, if it is offering applications you want. Which it is not, for me.

Update: There is one other point about Give Good Food to Your Mac that I'd wanted to mention. In their FAQ, there is a clear and unambiguous statement that all licenses purchased are normal licenses, with no restrictions. This is the way it should be done, in my opinion.

In contrast, MacUpdate has "full licenses with normal upgrade paths" except for "Swift Publisher, which will be a paid upgrade." Seriously, why?

MacSanta is structured differently, so the issue doesn't really come up there, as far as I can see. You just get a discount code that you use directly on the developers' sites, rather than a bundle through a third party.

Trying Something New…

I see on the MacSanta site that MarsEdit is on sale for 20% off today. Now, while I'm not above complaining, I haven't really commented on my general dissatisfaction with the editing tools for Blogger. In short, I don't care for the tools: even if they worked properly in Safari (and they don't), they'd still be a typically disappointing web application.

So, the next few posts will be done with MarsEdit—let's hope that is a good thing. They will also be quite a bit lighter in content than my usual posting style—and that's certainly a good thing!

Update: Well, that was a nice start. Composing and uploading went just fine. This update is essentially just a way to test out editing an existing post.

Update 2: Editing an existing post went fine. Editing again to try adding some tags. As a whole, this seems to be working very well, even to the point where one could just work in MarsEdit and assume that it gets to Blogger correctly.