Applied Abstraction: 2010

Saturday, March 6, 2010

Modeless Scripts for SubEthaEdit

SubEthaEdit (SEE) supports mode-dependent extensions to its functionality. The mechanism for this is the embedding of AppleScripts into the mode. This lets, for example, Python documents have a Check Syntax command that is implemented differently from the identically named Check Syntax command for Lua documents.

All well and good, but there are tasks that are of interest across most or all modes. A prime example for a programmer's editor like SEE is commenting out lines. The basics are the same regardless of language: the line needs to begin with a specific string to indicate a comment. But the specifics do matter: Python needs hashes # for comments, Lua needs a double hyphen --, and so on.

We'd thus like to have scripts that are modeless, present in any mode, but that are customizable, appropriate to any mode. Such an AppleScript needs to go into the global scripts menu for SEE, but still allow each mode to define how the behavior of the script should be implemented.

Here's how to do it. We use an AppleScript to capture the basics of a given pattern, such as determining which lines should be commented out. The AppleScript then calls a shell script specified for the current mode.

All the components for this task have been presented previously on this blog, with code indentation exemplifying the approach. I'll use SubEthaEditTools to implement the AppleScripts. The mode-specific customization is done using a plist of environment variables. Particular tasks are done by writing an AppleScript that calls a shell command stored in an environment variable in the plist; the use of SubEthaEditTools makes these scripts quite brief.

Opening the plist is done with this script:

if not documentIsAvailable() then
	return
end

openEnvironmentSettings()

on seescriptsettings()
	return {displayName:"Customize Mode...", shortDisplayName:"Environment", inContextMenu:"no"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

The include command is not part of AppleScript; it is an m4 macro used to modularize the scripts. The logic is simple: make sure there is a document available, then use its mode to open the mode-specific environment settings.

For a particular function we include the common features and call out to the shell to do the rest. For adding or removing line-oriented comments, I used:

if not documentIsAvailable() then
	return
end

if (modeSetting for "COMMENTER") is missing value then
	display alert "Commenting not available." message "You need to define COMMENTER for the mode."
	return
end

completeSelectedLines()
set outText to shellTransform of selectionText() for modeEnvironment() through "eval $COMMENTER" without alteringLineEndings
setSelectionText to outText

-- SubEthaEdit settings

on seescriptsettings()
    {displayName:"Un/Comment Selected Lines", keyboardShortcut:"@/", inContextMenu:"yes"}
end seescriptsettings

include(`SubEthaEditTools.applescript')

Again, the logic is simple: get the COMMENTER shell command from the mode-specific environment for the front document, and use it to transform the selected lines.
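As a rough, shell-only illustration of that dispatch (this is not the AppleScript itself), the sketch below keeps one command string per mode and filters the selected lines through eval. The commenter_for_mode function and its two sed commands are assumptions made up for this example; in the real setup the value comes from the COMMENTER key in the mode's environment plist.

```shell
#!/bin/sh
# Hypothetical lookup table standing in for the per-mode environment plist.
commenter_for_mode() {
    case "$1" in
        python) printf '%s\n' "sed -e 's/^/# /'" ;;
        lua)    printf '%s\n' "sed -e 's/^/-- /'" ;;
        *)      return 1 ;;   # no COMMENTER defined for this mode
    esac
}

# Filter standard input through the command stored for the given mode,
# mirroring the "eval $COMMENTER" step in the AppleScript.
transform_selection() {
    COMMENTER=$(commenter_for_mode "$1") || {
        echo "You need to define COMMENTER for the mode." >&2
        return 1
    }
    eval "$COMMENTER"
}

printf 'x = 1\ny = 2\n' | transform_selection python
```

The same modeless entry point produces different results per mode only because the stored command differs, which is the whole point of the design.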

So what should the shell command be? One possibility is the comment script presented in the preceding post. Some examples are given in that post: use those as the value in the environment variable plist, with COMMENTER as the key. There is no default comment method, so it's necessary to provide values for each of the modes you use. In practice, this isn't bad, since you can often just copy and paste between modes with minimal or no changes.

It's just as easy to define a script for block comments, such as those used in C or SML. I'll omit the details.

The AppleScripts and some supporting scripts are available for download.

Sunday, February 28, 2010

A Better Comment Script

A couple of years ago, I presented a shell script for commenting out lines, for use in a LaTeX mode for SubEthaEdit. The script is an improvement over the AppleScript approach used in other SubEthaEdit modes, but does something I don't really like: it always inserts or removes comments at the beginning of the lines, rather than at an appropriate indentation level.

Below, I present line-comment, an awk script that handles line-oriented comments, taking indentation into account. Lines to comment or uncomment are read from Standard Input, and the processed lines are written to Standard Output. The script uses the current commenting of the lines to determine whether to comment or uncomment.

There are two options that can be set. First, there is the TabWidth, which defines an indentation level; this defaults to the (basically useless) Unix standard of an eight-space tab. You'll almost always want to set this, even if you're using tabs, not spaces, for indentation. Second, there is the CommentString, whose meaning should be obvious; this defaults to the hash character # common to many programming languages.

As an example, a nice choice for Python could be line-comment TabWidth=4, while for Scala line-comment TabWidth=2 CommentString="//" would be more suitable.

Update: If the script doesn't seem to work, try setting the environment variable COMMAND_MODE=unix2003. This is relevant if you want to call it from SubEthaEdit: SEE uses COMMAND_MODE=legacy, which can cause the regular expressions to fail to match.

The script is unchanged:

#! /usr/bin/awk -f

function trimIndent(text, indRE,   n, tokens) {
  # Returns the text with the indentation removed. Sets
  # global variable IndentLevel to show how many
  # indentation levels were removed.
  n = split(text, tokens, indRE)
  if (n > 1) {
    IndentLevel = n-1
    rest = tokens[n]
  } else {
    IndentLevel = 0
    rest = text
  }
  return rest
}

function commentIndex(txt, commtxt,      n) {
  n = index(txt, commtxt)
  if (n > 0 && substr(txt, 1, n-1) !~ /^[ \t]*$/) {
    n = 0
  }
  return n
}

function noncommentPrefix(txt) {
  return match(txt, /^[ \t]*/) ? substr(txt, 1, RLENGTH) : ""
}

function multiString(str, mult,     n, mstr) {
  mstr = ""
  for (n=0; n<mult; n++) {
    mstr = mstr str
  }
  return mstr
}

function offsetString(offset) {
  return multiString(" ", offset)
}

BEGIN {
  TabWidth=8
  CommentString="#"
}

NR == 1 {
  # Establish regex based on tab settings. This comes after
  # the BEGIN to allow the TabWidth to be overridden.
  indentRegex = "( {0,"(TabWidth-1)"}\t| {"TabWidth"})"
  # indentRegex = "( {0,3}\t|"offsetString(TabWidth)")"
}

{
  # Common processing for all lines. Divide the line into a
  # prefix of whitespace, followed immediately by the
  # comment string, if present, or a non-tab, non-space
  # character if not. The prefix consists of indentation
  # steps followed by an offset, defined as a number of spaces
  # insufficient to constitute an indentation step.
  Line[NR] = $0
  commInd = commentIndex($0, CommentString)
  if (commInd > 0) {
    prefix = substr($0, 1, commInd-1)
    CommentPosition[NR] = commInd
  } else {
    prefix = noncommentPrefix($0)
  }
  offset = length(trimIndent(prefix, indentRegex))
}

NR == 1 {
  BaseOffset = offset
  BaseIndent = IndentLevel
  MinOffset = BaseOffset
  MinIndent = BaseIndent
}

NR > 1 {
  if (IndentLevel < MinIndent ) {
    MinIndent = IndentLevel
    MinOffset = 0
  } else if (offset < MinOffset) {
    MinOffset = offset
  }
}

END {
  commLen = length(CommentString)
  if (length(CommentPosition) == length(Line) && MinIndent == BaseIndent && MinOffset == BaseOffset) {
    for (n=1; n<=NR; n++) {
      commPos = CommentPosition[n]
      print substr(Line[n], 1, commPos-1) substr(Line[n], commLen+commPos)
    }
  } else {
    indentPart = MinIndent ? indentRegex"{"MinIndent"}" : ""
    # indentPart = multiString(indentRegex, MinIndent)
    offsetPart = offsetString(MinOffset)
    # offsetPart = offsetString(MinOffset)
    commRegex = "^" indentPart offsetPart
    for (n=1; n<=NR; n++) {
      match(Line[n], commRegex)
      print substr(Line[n], 1, RLENGTH) CommentString substr(Line[n], 1+RLENGTH)
    }
  }
}

Saturday, February 27, 2010

TaskPaper Mode for SubEthaEdit

TaskPaper is an application for managing simple to-do lists. It is somewhere between a text editor and an outline processor, focused on lists of to-do items that can be checked off. The lists can be organized into projects and marked up with tags, enabling search and selection by tag.

TaskPaper saves its documents as plain text. They can be opened, modified, and created by any application that can work with text files. In fact, it is fair to say that TaskPaper is both an application and a lightweight text-markup system specifically for to-do lists. It is pretty easy to support the TaskPaper file format, and it has been done for several text editors.

Some months back, I created a ToDo mode for SubEthaEdit that supports the TaskPaper format. Today, I finally got around to making it available for download. The mode supports syntax highlighting and has scripts to automate creating new tasks and projects, marking tasks as done, and archiving completed tasks. Tags are detected and highlighted, but there is unfortunately no way to do the outline-processor-style hoisting of particular tags. To aid in managing multiple tasks, project names appear in the function popup menu.

It is also possible to modify how the mode handles marking tasks as completed and archiving them. This requires an additional script to open a plist of environment settings; install this script in the scripts folder for SubEthaEdit (if you're not sure where that is, use Open Scripts Folder under the scripts menu in SEE). Two relevant keys can be set, SEE_TODO_MARK_DONE and SEE_TODO_ARCHIVE_DONE. The values should be set to shell commands that implement the desired behavior for marking tasks as completed and for moving completed tasks to the Archive pseudo-project.

One possibility is to pass different command-line options to the scripts that implement the default behavior for the mode. For example, you could set the value of SEE_TODO_MARK_DONE to '"$SEE_MODE_RESOURCES"/bin/markdone.sh -c -t' and the value of SEE_TODO_ARCHIVE_DONE to '"$SEE_MODE_RESOURCES/bin/archivecompleted.awk" -v Mode=c' (including the quotes in the values). With these flags, tasks are no longer marked complete with a @done tag, but instead the leading hyphen is turned into a plus sign, giving a sort of check-off effect instead of a tagging effect. Be aware that this breaks compatibility with the TaskPaper application.

Update: The ToDo mode is available on the Coding Monkeys website.

Saturday, February 20, 2010

Exploring Ctags: Summary

To facilitate learning about Ctags, I've written two AppleScripts and several supporting shell scripts. These scripts were not written by an expert on Ctags, so there may be some sub-optimal, or outright wrong, choices in how they were implemented. Please let me know of any bugs found or suggestions for possible improvements.

The AppleScripts use Ctags to add a couple of features to SubEthaEdit (SEE). First, there is the text completion AppleScript, which looks up a string in the tag file and identifies possible matches. SEE already does text completion, but only in open files; by using Ctags as a basis for completions, matching symbols can be found across all the files in a large project. The second AppleScript finds definitions of selected symbols, again facilitating working with a large number of files.

The interactions with the tag file are handled using shell scripts. These are written to handle tag files created by invoking Exuberant Ctags with a variety of different options, notably including either absolute or relative paths and either numeric or ex pattern references for the location in the files. The shell scripts need to be placed somewhere on the paths defined in the AppleScripts; if in doubt, ~/Library/Application Support/SubEthaEdit/bin/ will work.

A zip archive with the scripts is available for download.

The scripts are described in a series of blog posts:

Exploring Ctags: Motivations

Find That Tags File!

Tag Matching

Ctags in SubEthaEdit

Ctags from SubEthaEdit to the Shell

Text Completions with Ctags in SubEthaEdit

Finding Definitions with Ctags in SubEthaEdit

Update: I've added another AppleScript and accompanying shell script for creating or updating a tag file for the front document in SEE. These are now in the zip archive, available at the same download link given above.

Finding Definitions with Ctags in SubEthaEdit

As with using Ctags for text completion, finding definitions for symbols can be expressed largely in terms of the shell scripts and AppleScript handlers already presented. Another handler, openTaggedSources, is needed, which will open files to the location of the selected tag or tags.

The resulting AppleScript is again quite concise:

on seescriptsettings()
  {displayName:"Find Definition using Ctags", shortDisplayName:"Ctags Definition", keyboardShortcut:"@^f", inContextMenu:"yes"}
end seescriptsettings

try
  requireValidDocumentForCtags()
  set tagfilepath to findTagFile()
  set searchTerm to determineSearchTerm with userIntervention
  set taglist to (pipeMatches of searchTerm out of tagfilepath thru "")
  set tagsToOpen to (pickTags from taglist with multipleSelectionsAllowed)
  openTaggedSources for tagsToOpen from tagfilepath
on error errMsg number errNum
  if errNum is equal to 901 then
    return
  else if errNum is equal to 902 then
    beep
    return
  else
    error errMsg number errNum
  end if
end try

The structure directly parallels that used for the text completion script.

Let's take a look inside the openTaggedSources handler. My approach is to dump all the selected tags back to the shell, where the shell script open-tag-files will finish the job. Here's the handler:

to openTaggedSources for tags from tagfile
  --pass tags to external script that opens them in SEE
  set exportTagsFile to "export TAGDIR=\"$(dirname " & (quoted form of tagfile) & ")\";"
  --use an explicit printf format so that % and \ in ex patterns pass through literally
  set openTagFilesPipeline to join of {"printf '%s\\n' " & quoted form of tags, "open-tag-files RelTo=\"$TAGDIR\""} by "|"
  set openTagFilesScript to join of {UnixPath, exportTagsFile, openTagFilesPipeline, "&> /dev/null &"} by space
  do shell script openTagFilesScript
end openTaggedSources

I pass the location of the tag file to the script, so that either absolute or relative paths can be used in the tag files. Otherwise, it's just passing the selected tags out as stdin to open-tag-files in a straightforward way.

So let's look at open-tag-files:

#! /usr/bin/awk -f

BEGIN {
    FS="\t"
}

{
    # Treat relative filenames as relative to RelTo
    if ($2 ~ /^\//) {
        filePath = $2
    } else {
        filePath = RelTo "/" $2
    }
    # Handle both numeric and regex patterns
    if ($3 ~ /^[[:digit:]]+(;\")?$/) {
        match($3, /^[[:digit:]]+/)
        gotoLine = "-g " substr($3, RSTART, RLENGTH)
    } else {
        patternPlusExtras = substr($0, index($0, $3))
        numTokens = split(patternPlusExtras, token, "/")
        if (length(token[1])) {
            # Pattern looks invalid, so can't specify the line
            gotoLine = ""
        } else {
            exQuery = ""
            for (n=2; n<=numTokens; n++) {
                exQuery = exQuery "/" token[n]
                if (token[n] !~ /[^\\](\\\\)*\\$/) {
                    break
                }
            }
            exQuery = exQuery "/"
            command = "cat '"filePath"' | sed -e '"exQuery" q' | wc -l"
            command | getline lineCount
            close(command)
            gotoLine = "-g "lineCount
        }
    }
    #printf("see %s \"%s\" &\n", gotoLine, filePath)
    system("altsee "gotoLine" \""filePath"\" &")
}

This is an awk script which mostly consists of handling different ways that the tag file can be structured. Since the point is to provide a platform for experimenting with Ctags, it seems premature to commit to specific choices of absolute or relative paths, numeric line references or ex patterns, extended fields from Exuberant Ctags or just vanilla Ctags lines. For what it is worth, I'm invoking Exuberant Ctags as ctags -n --fields=+a+m+n+S -R (but there may well be better choices).
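The conversion from an ex pattern to a line number is the least obvious step, so here it is in isolation. This is a minimal sketch of the same sed and wc combination; the line_of_pattern function name and the sample file contents are made up for the illustration.

```shell
#!/bin/sh
# Print the line number of the first line matching an ex-style /pattern/.
line_of_pattern() {
    pattern=$1
    file=$2
    # sed prints every line up to and including the first match, then
    # quits; wc -l counts those lines. The arithmetic expansion strips
    # any padding that wc adds around the number.
    echo $(( $(sed -e "$pattern q" "$file" | wc -l) ))
}

sample=$(mktemp)
printf 'import sys\n\ndef main():\n    pass\n' > "$sample"
line_of_pattern '/^def main():$/' "$sample"
rm -f "$sample"
```

One caveat of this approach: if the pattern never matches, sed prints the whole file and the result is simply the total line count, so a stale tag file sends you to the last line rather than reporting a failure.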

At the end of open-tag-files, I use altsee to open the source files. This is a replacement for the see command line tool that comes with SubEthaEdit. I find that see is a bit of a hassle for this sort of use, so I gave up on it here (if you can get open-tag-files to work cleanly with multiple selected files, I'd love to hear about how!).

All the scripts and handlers need to be assembled into a compiled AppleScript in ~/Library/Application Support/SubEthaEdit/Scripts/ with the shell scripts set to be executable and on the path defined in the AppleScripts. If you're not sure where to put the shell scripts, I'd suggest creating a ~/Library/Application Support/SubEthaEdit/bin/ directory for SubEthaEdit-related shell scripts, and putting the scripts there. A compiled script with the needed shell script support is available for download.

Text Completions with Ctags in SubEthaEdit

With the infrastructure set up in the last few posts, it is now relatively easy to add Ctags-based text completions to SubEthaEdit (SEE). We use the shell scripts and AppleScript handlers to locate the tag file, determine a search term, get a list of tags matching the search term, and put up a dialog to have the user pick a tag. The only thing we're missing is a handler to actually insert the selected tag.

Here's a handler that does the job:

to insertCompletion of baseText by completionText
  -- assumes that the baseText is what was determined from the selection
  set {startChar, nextChar} to selectionRange without extendingFront and extendingEnd
  if the completionText does not start with the baseText then
    error "Invalid completion"
  end if
  if length of baseText is equal to length of completionText then
    -- completion is the same as the existing text, just position the insertion point
    setSelectionRange to nextChar
  else if startChar is equal to nextChar then
    -- empty selection, search term was inferred and only the difference needs to be included
    set completion to characters (1 + (length of baseText)) through (length of completionText) of completionText as text
    setSelectionText to completion
    setSelectionRange to nextChar + (length of completion)
  else
    --text selected, just replace it
    setSelectionText to completionText
    setSelectionRange to startChar + (length of completionText)
  end if
end insertCompletion

The handler has parameters corresponding to the base text searched for in the tag file and to the selected tag. These two strings are used, along with the length of the selection in SEE, to determine exactly how much text to insert. It would have been possible to just use the SEE selection, without passing in the base text, but it would have required essentially repeating the entire process of determining the search term; I think the design could be improved here, but I can live with this for now.
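The string arithmetic in the empty-selection case can be tried outside the editor. The following is only a shell sketch of that one step, with a made-up function name, completion_suffix; it rejects a completion that does not start with the base text and otherwise prints just the part that still needs to be typed.

```shell
#!/bin/sh
# Print the part of the completion that follows the base text, or fail
# if the completion does not start with the base text.
completion_suffix() {
    base=$1
    completion=$2
    case $completion in
        "$base"*) printf '%s\n' "${completion#"$base"}" ;;
        *)        echo "Invalid completion" >&2
                  return 1 ;;
    esac
}

completion_suffix openT openTaggedSources
```

With the base text openT and the tag openTaggedSources, only aggedSources is left to insert at the insertion point.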

Using all these handlers, the logic for the text completion script is now expressible in a compact form:

try
  requireValidDocumentForCtags()
  set tagfilepath to findTagFile()
  set searchTerm to determineSearchTerm without userIntervention
  --set taglist to (pipeMatches of searchTerm out of tagfilepath thru "awk -F\"\\t\" '{ print $1 }' | sort -u")
  set taglist to (pipeMatches of searchTerm out of tagfilepath thru "cut -f1 | sort -u")
  set selectedTag to (pickTags from taglist without multipleSelectionsAllowed)
  insertCompletion of searchTerm by selectedTag
on error errMsg number errNum
  if errNum is equal to 901 then
    return
  else if errNum is equal to 902 then
    beep
    return
  else
    error errMsg number errNum
  end if
end try

The try block catches the errors we defined, letting any others go through for SEE to inform us about.

The last component needed is a seescriptsettings handler. I used this:

on seescriptsettings()
  {displayName:"Complete using Ctags", shortDisplayName:"Ctags Completion", keyboardShortcut:"@^t", inContextMenu:"yes"}
end seescriptsettings

All this needs to be assembled into a script, which is saved as a compiled script in ~/Library/Application Support/SubEthaEdit/Scripts/. A compiled script is available for download.

Thursday, February 18, 2010

Ctags from SubEthaEdit to the Shell

In the last few posts on Ctags, I've presented shell scripts for locating a tag file and looking up a tag in it, and AppleScripts for identifying what tag file should be used and what tag to search for in it. In this post, I'll present AppleScript handlers that bridge between these two scripting systems. As in the previous post, I'll use my SubEthaEditTools to simplify the process.

Essentially, the handler will need to construct a shell command that invokes look to find a tag in the tag file. Beyond that, I'll include the option to post-process the matching lines, which I'll use for text completion. For finding the definition of a tag, no post-processing is needed, so the handler checks for an empty pipeline and handles it cleanly.

The handler is:

to pipeMatches of tag out of tagfile thru pipeline
  ignoring white space
    if "" is equal to pipeline then
      set postProcess to ""
    else
      set postProcess to "| " & pipeline
    end if
  end ignoring
  set lookupScript to (join of {UnixPath, "look", quoted form of tag, quoted form of tagfile, postProcess} by space)
  try
    do shell script lookupScript
  on error
    error "Pipeline failed to process tag matches" number 902
  end try
  paragraphs of the result
end pipeMatches

Note that the handler ends by taking the paragraphs of the shell script result. This converts the lines selected by look (and any post-processing) into a list of matches.

With the two use cases in mind, the user will need to pick a relevant tag or tags from the list of matches. With text completion, only one selection makes sense, but more than one might be OK for finding definitions. Here's a handler for the two cases:

to pickTags from taglist given multipleSelectionsAllowed:allowMultiple
  try
    if allowMultiple then
      choose from list taglist with title "Matching tags" with prompt "Select tag:" default items (first item of taglist) with multiple selections allowed
      join of result by "\n"
    else
      choose from list taglist with title "Matching tags" with prompt "Select tag:" default items (first item of taglist)
      first item of the result
    end if
  on error
    -- user canceled, do nothing
    error number 901
  end try
end pickTags

We're nearly done. What remains is to assemble all these handlers into AppleScripts for the two use cases, adding whatever specifics are needed for the two tasks.

Wednesday, February 17, 2010

Ctags in SubEthaEdit

We've now looked at how to locate the right tags file and match a tag against it by working in the shell. But our goal is to connect Ctags to an editor, SubEthaEdit (SEE) in this case. We thus will need to switch from the world of the shell to the world of AppleScript. In this post, I'll just focus on getting the path to the tags file and a tag for which to search from SEE.

I'll not be working directly with SubEthaEdit's AppleScript dictionary, instead using my SubEthaEditTools handlers as a basis. Should anyone be interested in connecting Ctags to another Mac OS X editor that supports AppleScript, it would probably be better to port the SubEthaEditTools handlers to work with the editor and directly use the scripts I'll present here.

As a general design strategy, I'll identify two AppleScript error numbers with expected behaviors. First, I'll use number 901 to indicate that tag processing should be abandoned. Second, I'll use number 902 to indicate that an error of known type has occurred. This lets me handle a broad class of troubles by either quietly exiting, or beeping then exiting. Any other errors will just be unhandled, causing SubEthaEdit to show a sheet with details of the error.

Additionally, I'll need to define a search path for shell tools. Rather than using a customizable environment as I've done before, I'll just define one as an AppleScript property:

property UnixPath : "export PATH=\"$HOME/Library/Application Support/SubEthaEdit/bin:/Library/Application Support/SubEthaEdit/bin:$HOME/Library/bin:/usr/local/bin:/opt/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/local/sbin:/usr/sbin:/sbin\";"

To find the tags file, I first need to make sure a document is available to use as the starting point for the search. Second, I just need to call out to the shell with an appropriate command. Encapsulating these in handlers, I define:

on requireValidDocumentForCtags()
  if not documentIsAvailable() then
    error "No document open" number 902
  end if
  checkSaveStatus without updating
end requireValidDocumentForCtags

to findTagFile()
  set findTagfileScript to (join of {UnixPath, "climb", "-b \"$(dirname", quoted form of documentPath(), ")\"", "tags"} by space)
  try
    do shell script findTagfileScript
  on error
    error "Unable to locate tags file"
  end try
end findTagFile

Getting the candidate tag is harder than getting the path to the tag file, mostly because it is not as well-defined a task. Since Ctags can index lots of different languages, it won't be easy to get a solution that is right for every language. Instead, I'll define a handler that works reasonably for a lot of languages, and maintains the possibility for the user to specify the candidate precisely. This latter case is straightforward: if there is text selected in SEE, we'll search for that tag.

When no text is selected, we need to get a candidate tag in some other way. To me, it makes sense that finding symbol definitions should let the user give a term in a dialog, and that text completion should work by using the text preceding the cursor. But how much text should be used? I don't think that the longest possible tag makes sense, as that would mean, e.g., a method invocation in Python of the form obj.method would use the whole thing, even though that full term is unlikely to be indexed in the tag file. Instead, it would be better to just use method as the candidate tag. A reasonable choice for many languages would then be to take the longest string of alphanumeric characters and underscores, right to left from the insertion point. Those choices lead to the handler:

to determineSearchTerm given userIntervention:shouldAsk
  set {startChar, nextChar} to selectionRange without extendingFront and extendingEnd
  if startChar is equal to nextChar then
    -- empty selection
    if shouldAsk then
      try
        display dialog "Enter search term:" default answer "" with title "Find Definition"
      on error number -128
        error "User canceled" number 901
      end try
      text returned of result
    else
      -- try the whole line
      set selectionContents to extendedSelectionText with extendingFront without extendingEnd
      get shellTransform of the selectionContents for "" thru "sed -E -e 's/.*([[:<:]][[:alnum:]_]+)$/\\1/'" without alteringLineEndings
      -- sed returns lines that are terminated with linefeeds, so get text before the final linefeed
      paragraph -2 of the result
    end if
  else
    -- just use the selection; there is too much variation in what could be a tag to guess
    selectionText()
  end if
end determineSearchTerm
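The sed expression in the handler relies on the [[:<:]] word-boundary syntax of the BSD regex library that ships with Mac OS X. For experimenting elsewhere, here is a portable stand-in written with awk. It is an approximation rather than an equivalent: it only recognizes ASCII letters, digits, and underscores, and it prints an empty line when the text does not end in an identifier. The name trailing_identifier is made up for this sketch.

```shell
#!/bin/sh
# For each input line, print the run of identifier characters that ends
# the line (the text just before the insertion point).
trailing_identifier() {
    awk '{
        if (match($0, /[A-Za-z0-9_]+$/))
            print substr($0, RSTART)
        else
            print ""
    }'
}

printf '%s\n' '    result = obj.meth' | trailing_identifier
```

For the sample line this prints meth, not obj.meth, which is the behavior argued for above.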

The handlers presented in this post are enough to get the path to the tag file and a (partial) tag to search for. Next time, I'll connect these values from SubEthaEdit to the shell scripts handling the lookup.

Sunday, February 14, 2010

Tag Matching

Our goal remains to add support for Ctags to an application. We know how to locate the relevant tags file, but what do we do with it? Fundamentally, we use the tag file to match identifiers against tags indexed by Ctags; let's make that specific, restricting ourselves for the moment to just working in the shell.

The tag file is structured as sorted lines of tab-separated records. The first field in the line is the tag; other fields identify the position of the tag in a particular source file. With this, we can check a candidate tag $TAG against the tag file $TAGFILE using look:


look "$TAG" "$TAGFILE"

Easy and fast.

To use tags to find the definition of a symbol, we'll want to hang onto all the information about each matching tag; the above use of look is all we need. For use in text completion, we'll want a longer pipeline eliminating extraneous information:


look "$TAG" "$TAGFILE" | cut -f1 | sort -u

The pipeline drops all fields but the first, the tag field, using cut and eliminates duplicates with sort -u (I suspect that uniq should work here, but look is curiously unspecific about whether it always produces its output in sorted order).
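To see both forms side by side without needing a real project, the sketch below builds a tiny tag file and runs the two lookups against it. Two things here are assumptions of the example: the three sample tag lines are invented, and look is replaced by a small awk stand-in, prefix_lines, so that the example does not depend on look being installed.

```shell
#!/bin/sh
# Stand-in for look: print the lines of file $2 that begin with prefix $1.
prefix_lines() {
    awk -v p="$1" 'index($0, p) == 1' "$2"
}

tagfile=$(mktemp)
# Invented records: tag <TAB> source file <TAB> line reference
printf 'openFile\tio.py\t12\nopenTag\ttags.py\t3\nopenTag\tutil.py\t40\n' > "$tagfile"

# Full records, as needed for finding definitions
prefix_lines open "$tagfile"

# Unique tag names only, as needed for text completion
prefix_lines open "$tagfile" | cut -f1 | sort -u

rm -f "$tagfile"
```

The first lookup prints all three records; the second collapses them to the two distinct names, openFile and openTag.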

And that's it for matching tags. The file format was clearly set up with just this sort of use in mind. More details on the file format are available elsewhere.

Find That Tags File!

Our first challenge in incorporating Ctags into an editor is locating the tags file. A first attempt might be to look for a file named tags in the same directory as the document in the frontmost editor window. But this isn't quite good enough. Ctags can create a tags file by recursively descending into subdirectories, so a useful tags file might be located somewhere higher in the directory tree.

It seems like there should be a standard shell command to search upward in the directory tree, but I couldn't find it. The task isn't really that hard, so I wrote a shell script climb to do it instead of spending more time fruitlessly searching. Usage is patterned after which. To look for a tags file that recursively indexed the present directory, just do climb tags. Options are available to set where the search starts and stops.

Here's my script:

#!/bin/sh
#
# climb -- locate a file by ascending the directory tree
#
# climb [-b bottomdir] [-t topdir] filename
#
# Climb directory tree looking for a file named filename. The search
# starts by checking in the bottom directory (defaults to the current
# directory), with each parent directory checked until either the
# file is found or the top directory (defaults to root) is reached.
#


# Options allow setting the search range. Defaults are starting the
# search in the current directory and ending at root.
upTo="/"
upFrom="$PWD"

while getopts b:t: opt
do
    case $opt in
    b)  upFrom="$OPTARG"
        if ! [ -d "$upFrom" ]
        then
            echo "$0: $upFrom: No such directory" >&2
            exit 2
        else
            # standardize the lowermost directory path
            upFrom="$(cd "$upFrom" && pwd -P)"
        fi
        ;;
    t)  upTo="$OPTARG"
        if ! [ -d "$upTo" ]
        then
            echo "$0: $upTo: No such directory" >&2
            exit 2
        else
            # standardize the uppermost directory path
            upTo="$(cd "$upTo" && pwd -P)"
        fi
        ;;
    esac
done
shift $((OPTIND - 1))

targetFile="$1"

# To ensure termination, require that the uppermost directory is
# an ancestor of the directory where the search begins.
indx=$(awk -v d1="$upTo" -v d2="$upFrom" 'BEGIN { print index(d2, d1) }')
if ! [ "$indx" -eq 1 ]
then
    echo "$0: $upFrom is not a descendant of $upTo" >&2
    exit 2
fi

# Check each directory for the target file, moving up the directory tree
# until either the target is found or the uppermost directory has been
# searched. Both the lowermost directory and the uppermost directory
# are checked for the file.
while true
do
    if [ -f "$upFrom/$targetFile" ]
    then
        break
    fi
    if [ "X$upTo" = "X$upFrom" ] || [ -z "$upFrom" ] || [ "X$upFrom" = "X/" ]
    then
        exit 1
    else
        upFrom=$(dirname "$upFrom")
    fi
done

echo "$upFrom/$targetFile"

Most of the script deals with establishing the starting and ending points of the search, which I referred to in the script as the bottommost and topmost directories, respectively. They're put into a standardized format and tested for consistency, then used to define the search. The search is simple, amounting to nothing more than successively chopping off the last element of the directory path and seeing if the target file is in the resulting directory. The search stops when the topmost directory is reached, or when root is reached, just in case.
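The search loop is easy to exercise on a throwaway directory tree. The sketch below is a reduced form of the loop only, with no options and a made-up name, climb_sketch; it always searches from the given directory up to root.

```shell
#!/bin/sh
# Reduced form of the search: look for file $2 in directory $1 and then
# in each parent directory, stopping at root.
climb_sketch() {
    dir=$(cd "$1" && pwd -P) || return 2
    while :
    do
        if [ -f "$dir/$2" ]
        then
            # ${dir%/} avoids a doubled slash when the file is in root
            printf '%s\n' "${dir%/}/$2"
            return 0
        fi
        if [ "$dir" = "/" ]
        then
            return 1
        fi
        dir=$(dirname "$dir")
    done
}

root=$(mktemp -d)
mkdir -p "$root/project/src/deep"
: > "$root/project/tags"
climb_sketch "$root/project/src/deep" tags
rm -rf "$root"
```

Starting two levels below the project directory, the search still reports the tags file at the project's top level, which is the case a recursively built tags file requires.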

The script is general purpose, suitable for finding more than just tags files. I have mostly just called climb from AppleScripts in SubEthaEdit, with a pretty well-behaved file name and start directory. It may well be that more complex use would reveal bugs, so use with caution.

Saturday, February 13, 2010

Exploring Ctags: Motivations

I've been vaguely aware of Ctags for years, but only in the last few months have I gotten a handle on how it would benefit me. Part of the problem is that most mentions of Ctags seem to assume you already know the benefits: the Wikipedia entry does this, as does the Exuberant Ctags site. Worse, many discussions make it seem that it is just an auxiliary for vi-family editors, so perhaps not even relevant to those who, like me, haven't seriously used a vi derivative in years.

After seeing an explanation in the context of BBEdit, I have a much better idea of what Ctags provides. Essentially, it generates an index called a tags file that allows for easier code navigation across multiple files, in particular providing text completions and navigating to the definition of functions or other symbols. Within BBEdit, tags also are used to improve syntax highlighting.

I must admit that I find some of the praise for it to be overblown, but maybe I just need to try it. Of course, I don't use BBEdit, either. In fact, no editor that I regularly use supports Ctags. Let's do something about that. I'll work in the context of SubEthaEdit (SEE), since I have a fair amount of experience with scripting it, and of Exuberant Ctags, since it supports more languages than the Ctags built into Mac OS X.

I'll add two features to SEE, text completion and finding definitions. To some extent, these are redundant, in that SEE has text completions and a function pop-up, but they don't extend across multiple files in the same way as Ctags. I won't be able to do anything with syntax highlighting, as in BBEdit, but it should still be enough to try out Ctags.

Both features will be structured as AppleScripts invoking shell scripts to do most of the work. The AppleScripts both have a similar structure, consisting of:

locating the tags file

determining a search term to match against the tags file

identifying and processing matching tags

letting the user select from the matching tags

doing something with the selection

I'll break these stages out into several posts.
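As a rough preview of the "identifying matching tags" stage, here is one way the shell side might look. This is only a sketch under assumptions: it relies on the standard tab-separated tags file layout (tag name, file, search address), and the function name matching_tags and the sample entries are made up for illustration.

```shell
# List the unique tag names in a tags file that begin with a given prefix.
# matching_tags is a hypothetical helper; it assumes tab-separated fields
# with the tag name in the first field.
matching_tags() {
    prefix="$1"
    tagsFile="$2"
    awk -F '\t' -v p="$prefix" '
        /^!_TAG_/ { next }              # skip pseudo-tag header lines
        index($1, p) == 1 { print $1 }  # keep names starting with the prefix
    ' "$tagsFile" | sort -u
}

# Build a small sample tags file (invented entries) and query it.
demo=$(mktemp)
printf '%s\t%s\t%s\n' \
    '!_TAG_FILE_SORTED' '1' '/0=unsorted, 1=sorted/' \
    'parse_args' 'main.c' '/^parse_args(/;"' \
    'parse_file' 'io.c' '/^parse_file(/;"' \
    'print_usage' 'main.c' '/^print_usage(/;"' > "$demo"

matching_tags parse "$demo"
# prints:
# parse_args
# parse_file

rm -f "$demo"
```

An AppleScript could run something like this through do shell script and present the resulting lines to the user as a choice list.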

Friday, February 12, 2010

DWM AppleScripts

I've been experimenting with new time management systems from Mark Forster, first trying Autofocus v. 4 and now DWM. I've found AF4 to be quite nice over the last few weeks, and like what I've seen of DWM over the last few days. In each case, I've used iCal to manage the tasks in the system.

With DWM, I keep my at-home tasks as iCal todos on a separate calendar (my work tasks are still in AF4, but will be switched over soon). Each todo has a due date; the due date here doesn't mean "do on this date," but instead means "do by this date." I keep the tasks sorted by due date. Tasks that really must be done on a particular date go on a different calendar, and they'll appear at the top of the list on that due date. This works well, but it is a little annoying to regularly set the due dates by hand.

The todos are set with a regular pattern, to either the next week or the next month. This is scriptable. Here is the next-week script, which I saved as "To Do Within 7 Days" under the iCal application scripts:

setDueDate of (7*days) for selectedToDo()

to setDueDate of timeFrame for task
	set newDate to (current date) + timeFrame
	tell application "iCal" to set due date of task to newDate
end setDueDate

on selectedToDo()
	set referenceText to iCalSelectionText at 1
	tell application "iCal"
		repeat with cal in calendars
			set matches to (todos of cal where summary is equal to referenceText)
			if (count of matches) > 0 then
				exit repeat
			end if
		end repeat
		if (count of matches) is equal to 0 then
			error "No matching to-do item found."
		end if
		first item of matches
	end tell
end selectedToDo

on iCalSelectionText at timeDelay
	set the oldClipboard to the clipboard
	try
		copyICalSelection at timeDelay
		set selectionText to the clipboard
	on error errText number errNum
		set the clipboard to the oldClipboard
		error errText number errNum
	end try
	set the clipboard to the oldClipboard
	selectionText
end iCalSelectionText

on copyICalSelection at timeDelay
	tell application "iCal" to activate
	tell application "System Events"
		tell process "iCal"
			keystroke return
			keystroke "c" using {command down}
			keystroke return
		end tell
	end tell
	delay timeDelay
end copyICalSelection

The next-month script is similar; just replace 7*days with 30*days.

The bulk of the script, and the only thing tricky about it, is getting a selected to-do item; the iCal scripting dictionary provides no way to do this! The handlers selectedToDo, iCalSelectionText, and copyICalSelection are a workaround: they copy the selected item's text to the clipboard by way of simulated keystrokes, then search the calendars for a to-do whose summary matches that text. I didn't come up with this approach; it comes from a Mac OS X Hints contributor.

Overall, I'm liking DWM a lot, but I doubt I'd like it without the scripts. Because of the nature of the system, I'll make no recommendation either for or against using DWM until a month has passed, but I already do think it is quite interesting and worth taking a look at.

Update: You can download compiled scripts here.
