Monday, July 23, 2007

Creating a Unique URL

SEO these days requires nicely-dashed-urls instead of numeric ids. While there are a few examples of how to generate these on the web, I found all of them lacking in flexibility and reuse. So here's my solution. The approach decomposes the problem into three subproblems:
  1. Eliminating unsafe characters from a string.
  2. Determining whether a string is unique.
  3. Generating endings to construct uniqueness candidates.
You can then put them all together according to your needs. For example:
# Generate a unique name by adding integers to the end
uniquify(url_safe(name.downcase)) do |candidate| 
  Article.find_by_url_name(candidate).blank?
end
The code is packaged as a Ruby module that can be mixed into your classes as needed. I hope you find it useful!
require 'generator' # Generator comes from the Ruby 1.8 standard library

module UrlUtils
  # Makes a string safe for urls
  # Options:
  #   :replacements - a Hash of a replacement string to a regex that should match for replacement
  #   :char - when :replacements is not provided, this is the string that will be used to replace unsafe characters. defaults to '-'
  #   :collapse - set to false if multiple, consecutive unsafe characters should not be replaced with only a single instance of :char. defaults to true.
  def url_safe(s, options = {})
    default_regex = options.fetch(:collapse, true) ? /[^a-zA-Z0-9-]+/ : /[^a-zA-Z0-9-]/
    replacements = options.fetch(:replacements, { options.fetch(:char,"-") => default_regex })
    replacements.each do |replacement, regex|
      s = s.gsub(regex,replacement)
    end
    return s
  end
  
  # Generates an integer sequence
  # Options:
  #   :start - the integer to start with. defaults to 1
  #   :end - the last integer to generate. when nil the sequence is infinite. defaults to nil
  #   :increment - the amount to add on each iteration. defaults to 1
  def int_generator(options = {})
    start = options.fetch(:start,1)
    last = options[:end]
    increment = options.fetch(:increment, 1)
    raise ArgumentError if increment == 0
    raise ArgumentError if last && ((increment > 0 && start > last) || (increment < 0 && start < last))
    Generator.new do |g|
      i = start
      loop do
        g.yield i
        break if !last.nil? && ((increment > 0 && i >= last) || (increment < 0 && i <= last))
        i = i + increment
      end
    end
  end
  
  # Accepts a block that will be passed a candidate string and should return true if it is unique.
  # Options:
  #   :separator - a string that will be injected between the base string and the uniquifier. defaults to '-'
  #   :endings - a Generator that provides endings to be placed at the end of the base.
  #              defaults to the set of positive integers.
  def uniquify(base, options = {})
    sep = options.fetch(:separator, "-")
    endings = options[:endings] || int_generator
    return base if yield base
    while endings.next? do
      candidate = base+sep+endings.next.to_s
      return candidate if yield candidate
    end
    raise ArgumentError.new("No unique construction found for \"#{base}\"")
  end
end
If you prefer you can get the code as a pastie.
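
Here are a couple of quick examples of how the options combine. The taken array below is just a stand-in for whatever uniqueness check you would really run (a finder on one of your models, for instance):

include UrlUtils

url_safe("My First Post")                 # => "My-First-Post"
url_safe("My First Post", :char => "_")   # => "My_First_Post"

# Bound the search: raises ArgumentError if "draft" through "draft-25" are all taken
taken = ["draft", "draft-1", "draft-2"]
uniquify("draft", :endings => int_generator(:end => 25)) do |candidate|
  !taken.include?(candidate)
end
# => "draft-3"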

Sunday, June 17, 2007

DRYing up Unit Test Preconditions and Postconditions

It's good practice in unit testing to assert any preconditions (aka assumptions) relevant to the test and the expected postconditions (aka side-effects). I had a unit test that went something like this:
class TaggableTest < Test::Unit::TestCase
  fixtures :tags, :items
  def test_tagging
    one = items(:one)
    one.tag_list = "tag1, tag2, tag3, tag4"
    assert_equal 0, one.tags.size
    assert_equal 3, Tag.count
    assert_equal 0, Tagging.count
    one.save
    assert_not_nil one.id
    assert_equal 4, Tagging.count
    assert_equal 4, Tag.count
    assert_equal 4, one.tags.size
  end
end
I didn't care for this code because of the redundancy of all those assert_equal calls. So I added some new asserts to TestCase in test_helper.rb and was able to convert it into the following:
  def test_tagging
    one = items(:one)
    one.tag_list = "tag1, tag2, tag3, tag4"
    assert_changed(lambda {one.tags.size},
                   lambda {Tag.count},
                   lambda {Tagging.count},
                   :from => [0, 3, 0],
                   :to => [4,4,4]) { one.save } 
  end
assert_changed will execute the code blocks passed in prior to the execution of the primary block and again afterwards. It will then compare those values to the from and to values (optionally) passed in. If you don't pass in both :from and :to, assert_changed merely validates that the values changed.
  private
  # Unwraps single-element arrays so callers can pass either a single value or an array.
  def singleton_maybe(s)
    return s[0] if s.is_a?(Array) and s.size == 1
    return s
  end
  
  public
  # Like assert_equal, but accepts either single values or arrays of values on both sides.
  def assert_array_or_svo_equal(expected, actual)
    expected = singleton_maybe(expected)
    actual = singleton_maybe(actual)
    assert_equal expected, actual
  end

  # Asserts that the values produced by the given lambdas change across the yielded block.
  # Pass :from and/or :to arrays to check the exact before and after values.
  def assert_changed(*expressions)
    options = {}
    options.update(expressions.pop) if expressions.last.is_a?(Hash)
    initial = expressions.map {|e| e.call}
    assert_array_or_svo_equal options[:from], initial if options.has_key?(:from)
    yield
    subsequent = expressions.map {|e| e.call}
    initial.each_with_index { |i, n| assert_not_equal i, subsequent[n] } if not (options.has_key?(:to) and options.has_key?(:from))
    assert_array_or_svo_equal options[:to], subsequent if options.has_key?(:to)
  end
  # Asserts that the values produced by the given lambdas are the same before and after the yielded block.
  def assert_unchanged(*expressions)
    initial = expressions.map {|e| e.call}
    yield
    subsequent = expressions.map {|e| e.call}
    initial.each_with_index { |i, n| assert_equal i, subsequent[n] }
  end
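
assert_unchanged works the same way, minus the :from/:to options. Here's a quick sketch of how it might be used; the Comment model here is purely illustrative, not something from our codebase:

  def test_invalid_comment_is_not_saved
    comment = Comment.new   # missing its required fields, so save should fail
    assert_unchanged(lambda { Comment.count }) { comment.save }
  end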

Tuesday, May 22, 2007

Continuous Build Integration

If you don't know what Continuous Build Integration is, you should learn about it, because it makes the development process run much more smoothly.

Here at UnnamedStartup® we're using Mercurial (a.k.a. Hg) as our revision control system and we wanted to use buildbot to manage our continuous integration according to the following diagram:

[diagram: developer -> Mercurial (Hg) -> Buildbot -> email notification -> developer]

I've just gotten the closed loop from developer to Hg to Buildbot to email working. It could have gone more smoothly. Here are a few of the challenges I faced, and how I solved them.


  1. Sending Check-in Notifications from Mercurial to Buildbot - There is a user contribution to buildbot for doing this. Download hg_buildbot.py and make sure it is executable by the user(s) committing to your Mercurial repository. Follow the instructions in the comments of that file to have Mercurial call the script. Note that I hit a bug when pushing changes via https and ended up switching to pushing over ssh to work around it (for what it's worth, ssh is faster than https anyway).
  2. Configure Buildbot Sources - The hg_buildbot.py script expects buildbot to be accepting changes from a PBChangeSource. Configure buildbot like so:

    from buildbot.changes.pb import PBChangeSource
    c['sources'].append(PBChangeSource())

  3. Unified email recipients list - I've configured Mercurial to send email notifications like so:

    .hg/hgrc

    ...
    [hooks]
    #callback to the notifier extension when changegroups are constructed
    changegroup.notify = python:hgext.notify.hook

    [notify]
    #only send out emails if a changegroup is pushed to the master repository
    sources = serve
    # set this to True when you need to do testing
    test = False
    config = /usr/local/share/hg/my_email_notifications
    template = Subject: Changes in repository: {desc|firstline|strip}\nFrom: {author}\n\ndetails: {baseurl}/rev/{node|short}\nchangeset: {rev}:{node|short}\nuser: {author}\ndate: {date|date}\ndescription:\n{desc}\n
    ...


    /usr/local/share/hg/my_email_notifications

    [reposubs]
    * = "Developer 1"<dev1@unnamedstartup.com>, "Developer 2"<dev2@unnamedstartup.com>


    So we now want to use the same list when telling our developers that the build failed. Since the buildbot configuration file is just Python, we can embed this parsing code directly in our configuration file:

    emailcfg = open("/usr/local/share/hg/my_email_notifications")
    emailcfg.readline()   # skip the [reposubs] section header
    import re
    emailparser = re.compile("<(.+@.+)>")
    emails = map(lambda s: emailparser.search(s).group(1),
                 emailcfg.readline().split("=")[1].split(","))
    emailcfg.close()

    Granted, this isn't going to handle changes to the my_email_notifications file very well, but you should get the idea. The important thing is that we aren't maintaining two lists of emails.
  4. Sending email to an authenticating SMTP server - Out of the box, buildbot can only send email to an open SMTP server... I'm not sure who's dumb enough to leave their email server open like that, but we don't. A little reading through the Twisted libraries showed that Twisted sort of supports ESMTP, so I wrapped that up in a buildbot notifier. Here's the code for that:

    from buildbot.status.mail import MailNotifier

    class ESMTPMailNotifier(MailNotifier):
        def __init__(self, username=None, password=None, port=25, *args, **kwargs):
            MailNotifier.__init__(self, *args, **kwargs)
            self._username = username
            self._password = password
            self._port = port

        def sendMessage(self, m, recipients):
            from twisted.internet.ssl import ClientContextFactory
            from twisted.internet import reactor, defer
            from twisted.mail.smtp import ESMTPSenderFactory
            from StringIO import StringIO
            s = m.as_string()
            ds = []
            for recip in recipients:
                d = defer.Deferred()
                # each sender factory consumes its message file, so hand each one a fresh copy
                factory = ESMTPSenderFactory(self._username, self._password,
                                             self.fromaddr, recip, StringIO(s), d,
                                             contextFactory=ClientContextFactory())
                reactor.connectTCP(self.relayhost, self._port, factory)
                ds.append(d)
            return defer.DeferredList(ds)

    c['status'].append(ESMTPMailNotifier(username="yourusername",
                                         password="yourpassword",
                                         fromaddr="you@yourcompany.net",
                                         relayhost="smtp.gmail.com",
                                         mode="all",
                                         extraRecipients=emails,
                                         sendToInterestedUsers=False))



Granted, this isn't a complete guide to setting up Mercurial and Buildbot, but I hope it helps you get over some of the minor hurdles I had to clear.

Monday, May 21, 2007

Paranoid Versions

As I mentioned in my previous post, we have an Item object that is both versioned and remains in the database after it's deleted. Fortunately, there are two Rails plugins that do just that: acts_as_versioned and acts_as_paranoid. Unfortunately, these plugins don't work together perfectly. Even more unfortunately, this fix doesn't work as it should because acts_as_versioned tries to version the deleted_at column as well. So here's the module we're using to combine them:
module ActiveRecord
  module Acts
    module Versioned
      module ClassMethods
        def acts_as_paranoid_versioned
          acts_as_paranoid
          acts_as_versioned
          self.non_versioned_columns << 'deleted_at'
          # protect the versioned model
          self.versioned_class.class_eval do
            def self.delete_all(conditions = nil); return; end
          end
        end
      end
    end
  end
end
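
With that in place, a model opts in with a single declaration. A minimal example, assuming the items table has a deleted_at column and the item_versions table that the two plugins expect is in place (the :title column is just illustrative):

class Item < ActiveRecord::Base
  acts_as_paranoid_versioned
end

item = Item.create(:title => "hello")   # :title is an illustrative column
item.destroy                            # sets deleted_at rather than removing the row
item.versions.size                      # the version history is still there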

Wednesday, May 16, 2007

All Aboard!

Today at 1pm I started writing our first Ruby model object. It is the Item base class, and of course it's non-trivial as far as model objects go because we have lots of non-standard behavior for it. But I don't like to start with the easy stuff, especially when I'm just learning a new framework and all... At 11pm I had the following working:
  • Item model and schema complete with data migrations so that anyone can start with an empty database
  • CRUD support for Item
  • output as html and xml
  • Items are versioned. The complete edit history is stored in the database (each version is a complete record)
  • Items are "paranoid". This means that when they are deleted, they are not really deleted -- instead the deleted_at timestamp is set.
  • Concurrent editing detection; the first submitter wins (error handling still needed; see the sketch after these lists)
  • textile formatting support
  • listing of items using the xoxo microformat
What we don't have that we should:
  • Version viewing/comparing/reverting from the interface
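
Rails' built-in optimistic locking (an integer lock_version column on the items table) is the obvious way to get the first-submitter-wins behavior. The error handling that's still missing would look roughly like this; an illustrative sketch, not code we've actually written:

class ItemsController < ApplicationController
  def update
    @item = Item.find(params[:id])
    @item.update_attributes(params[:item])
    redirect_to :action => 'show', :id => @item
  rescue ActiveRecord::StaleObjectError
    # someone else saved while we were editing, so the first submitter wins
    flash[:error] = "This item was changed while you were editing it."
    render :action => 'edit'
  end
end
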
I took a few breaks for dinner and whatnot. I'd say I spent a sum total of 7 hours "programming", but in actuality it was about 1 hour of programming and 6 hours of looking things up on websites and researching how to use these Rails plugins. This included fixing an incompatibility between two of the plugins I wanted to use (acts_as_versioned and acts_as_paranoid). It's hard to do much more with Items until we make more models. My goal was to get enough underway that we could check in our initial project and set up continuous build, testing, and demo environments. And that, my friends, is why Rails is neat. An experienced Rails developer would have finished all this before leaving the office and then done something more productive with his/her evening ;-)

Tuesday, May 15, 2007

Getting Rolling

As chief architect of an angel-funded web startup that shall remain nameless for now, I've decided to use Ruby on Rails as our site platform. My early experiences with Rails have been extremely positive and have so far made me very productive -- in large part due to the great work done by the Rails community. I'll post my experiences and any useful information I learn here, in an attempt to give back to the community.