programming

You are currently browsing articles tagged programming.

Ruby is a very flexible and expressive language. A recent question posted by a Ruby newbie got me looking through my IRC logs for a discussion about the performance of various dynamic method invocation approaches, so I thought I’d share some performance results.

Ruby is currently my primary programming language, and I like it a lot; however, the conciseness (except for blocks) and performance of its higher order function capabilities could be improved. Scheme and Haskell beat it in this regard.

Consider three versions of a higher order function, caller, and associated ways of providing information to it to allow invoking a method dynamically given a pre-existing object and argument:

# 1. lambda
def caller fn
  fn.call(obj, arg)
end

caller(lambda {|obj,arg| obj.meth(arg) })

# 2. send
def caller sym
  obj.send(sym, arg)
end

caller(:meth)

# 3. bind/call
def caller meth
  meth.bind(obj).call(arg)
end

caller(MyClass.instance_method(:meth))

A simple benchmark of the three approaches with results follows. The benchmark uses an elide(n) method that has been added to String:

lam = lambda {|s,n| s.elide(n) }
sen = :elide
met = String.instance_method(:elide)

bm(5) do |x|
  x.report('lambda') { 100_000.times { lam.call('abcdef',4) } }
  x.report('send') { 100_000.times { 'abcdef'.send(sen,4) } }
  x.report('meth') { 100_000.times { met.bind('abc').call(4) } }
end

           user     system      total        real
lambda 1.460000   0.510000   1.970000 (  1.982085)
send   0.810000   0.260000   1.070000 (  1.075201)
meth   0.660000   0.250000   0.910000 (  0.913455)

# results with the 3 preparation lines w/in their respective loops

           user     system      total        real
lambda 1.900000   0.590000   2.490000 (  2.498358)
send   0.800000   0.250000   1.050000 (  1.068035)
meth   0.760000   0.260000   1.020000 (  1.021275)

I personally like the lambda approach, but Ruby does impose a penalty for using it.

Tags: , ,

Peter Norvig wrote a simple spelling corrector in 20 lines of Python 2.5,
so I thought I’d see what it looks like in Ruby. Here are some areas I’m not pleased with:

  1. List comprehensions in Python made the edits1 function more elegant IMO.
  2. The boolean expression in the correct function evaluates empty sets/arrays as false in Python but not in Ruby, so I had to add the “result.empty? ? nil : result” expression to several functions. I expect there’s a better way to handle this also.

Otherwise, the translation was pretty straightforward.

Here’s a link to Norvig’s page:
http://www.norvig.com/spell-correct.html

That page includes a link to a text file that I saved locally as
holmes.txt: http://www.norvig.com/holmes.txt

def words text
  text.downcase.scan(/[a-z]+/)
end

def train features
  model = Hash.new(1)
  features.each {|f| model[f] += 1 }
  return model
end

NWORDS = train(words(File.new('holmes.txt').read))
LETTERS = ("a".."z").to_a.join

def edits1 word
  n = word.length
  deletion = (0...n).collect {|i| word[0...i]+word[i+1..-1] }
  transposition = (0...n-1).collect {|i| word[0...i]+word[i+1,1]+word[i,1]+word[i+2..-1] }
  alteration = []
  n.times {|i| LETTERS.each_byte {|l| alteration << word[0...i]+l.chr+word[i+1..-1] } }
  insertion = []
  (n+1).times {|i| LETTERS.each_byte {|l| insertion << word[0...i]+l.chr+word[i..-1] } }
  result = deletion + transposition + alteration + insertion
  result.empty? ? nil : result
end

def known_edits2 word
  result = []
  edits1(word).each {|e1| edits1(e1).each {|e2| result << e2 if NWORDS.has_key?(e2) }}
  result.empty? ? nil : result
end

def known words
  result = words.find_all {|w| NWORDS.has_key?(w) }
  result.empty? ? nil : result
end

def correct word
  (known([word]) or known(edits1(word)) or known_edits2(word) or
    [word]).max {|a,b| NWORDS[a] <=> NWORDS[b] }
end

After you’ve saved the holmes.txt file, load the code into irb and call the correct function with a string as follows:

badkins:~/sync/code/ruby$ irb
irb(main):001:0> require 'spelling_corrector.rb'
=> true
irb(main):002:0> correct "whree"
=> "where"

Tags: ,

Now that I’ve switched to a Macbook Pro with OSX Leopard as my primary desktop, I’ve located my Ubuntu machine in another part of the house to be accessible to my children. Not wanting to walk to the room where it’s located just to flip the power switch, I researched how to get “wake on LAN” working, so I could power it up remotely.

1. Enable the appropriate setting in your BIOS. Mine had something to do with wake on PCI device.

2. Install ethtool if you don’t already have it.

sudo apt-get install ethtool
cd /etc/init.d
sudo vim wakeonlanconfig

Add the following lines to that file:

#!/bin/bash
ethtool -s eth0 wol g

Install the script:

sudo update-rc.d -f wakeonlanconfig defaults

Run the script:

sudo /etc/init.d/wakeonlanconfig

3. Keep the network interface alive after shut down.

sudo vim /etc/init.d/halt

Change the following line:

halt -d -f -i $poweroff $hddown

to the following line (i.e. remove the -i)

halt -d -f $poweroff $hddown

4. Get the MAC address

ifconfig | grep HW

5. Send the magic packet via the following Ruby program:

require 'socket'
mac_addr = "x21x53x39xB3x90x42"
s = UDPSocket.new
s.setsockopt(Socket::SOL_SOCKET, Socket::SO_BROADCAST, 1)
s.send("xff"*6 + mac_addr*16, Socket::SO_BROADCAST, '10.0.0.255', 7)

Tags: , , , , ,

This has been a long time in coming. Paul Graham and Robert Morris have released an initial version of the Arc programming language.

Announcement
Language Web Site
Software
Tutorial
Forum

They recommend using version 352 of MzScheme because the latest version apparently breaks Arc. I already had 360 installed and was in a hurry, so I tried it, and most of the tutorial seemed to work fine except for the web server which failed. I’ll try later with 352 and see how it goes.

The language is still quite volatile, so I’m not sure if anyone is too interested in investing a lot of time creating libraries yet, but when the language settles down, I’m very curious about the acceptance level of Arc.

It seems to have quite a bit of Lispy goodness, and I’ve agreed with Paul’s language philosophy from what I’ve read about what he wants Arc to become. Hopefully it will live up to those ideas. On the one hand, I can see benefits in having a standard such as the one for Common Lisp, but on the other hand, Ruby & Python have done extremely well with the BDFL model with Matz & Guido, and I think Paul Graham could pull off that role if he wants to.

A problem with a “standards” approach is the proliferation of implementations dividing the community; whereas, the single implementation languages seem to have a more unified community.

If Arc can retain the best of Lisp, add some niceties from other languages and attract an active developer community, I think it may become very interesting.

rlwrap

Probably one of the best things I’ve gotten out of the Arc release so far was a tip from a guy on the forum on how to add readline support to the Arc REPL using rlwrap. I’d never heard of rlwrap before, and it’s awesome! I can now get readline support for logo and arc without needing to rebuild them with native support.

sudo apt-get install rlwrap
rlwrap logo

What a great idea :)

Tags: , ,

I came across a web site recently that prompted me to enter two names before I could view the site. So I wondered how difficult it would be to write a Ruby program that could pass the challenge. As you’d expect, it’s trivial in Ruby. If you encounter such a site, please don’t use the code in part two unless it’s permitted by the terms of service of the site in question.

Part One – Obtain a file of names

I found a government site that provided lists of the 1,000 most popular names for each decade, so I wrote a Ruby program to screen scrape the names into a list in order of popularity.

Make the open-uri library available.

require 'open-uri'

Create a simple Struct to contain the url for each decade along with an array for boy names and girl names (the site provides an ordered list for each gender). Then create an array of Decade instances for 80’s, 90’s and 00’s.

Decade = Struct.new(:url, :boy_names, :girl_names)
decades = [
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names1980s.html', [], []),
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names1990s.html', [], []),
  Decade.new('http://www.ssa.gov/OACT/babynames/decades/names2000s.html', [], [])
]

Iterate over each decade and screen scrape the list of boy names and girl names.

decades.each do |decade|
  str = open(decade.url).read
  str.scan(/<tr><td align="right">.*?<td width="15%">(w+)</td>.*?<td width="15%">(w+)</td>/m) do
    decade.boy_names << $1
    decade.girl_names << $2
  end
end

Now we have a pair of lists (boys & girls) for each decade with each list in order of popularity – 6 lists in total. We want to form a single list by selecting the most popular name from each of the 6 lists, followed by the second most popular name, etc. The zip function comes to mind, so we’ll first create an array of the 6 lists.

name_lists = decades.inject([]) do |result, decade|
  result << decade.boy_names
  result << decade.girl_names
  result
end

Now we have a list containing 6 lists. Here is where the "everything is an object" aspect of ruby is a bit of a pain. What I want to do is simply zip the 6 lists together, for example:

x = zip(one, two, three, four, five, six)

but I'm forced to pick one of the lists to be the instance for the zip method, for example:

x = one.zip(two, three, four, five, six)

I suppose my dissatisfaction is due in part to my recent research in functional programming.

The line below will do the following (I really like Ruby :) ):

  1. pop one of the lists off of the main list (leaving 5) to be the instance for zip
  2. explode the list of 5 remaining lists into 5 individual arguments via the * operator
  3. zip the 6 lists together
  4. flatten the resulting list of lists produced by zip into one list
  5. remove duplicate entries
  6. write each name to standard output
(name_lists.shift).zip(*name_lists).flatten.uniq.each {|name| puts name }

Run the program as follows to create a file named names.txt which will be used in part two.

ruby scrape_names.rb > names.txt

Part Two - Brute force form submission

Now that we have a list of over 2,000 names, it's a simple matter of repeatedly trying pairs of names in order of popularity. Here's the program in its entirety with comments following:

 1  require 'net/http'

 2  names = ARGF.inject([]) {|result, line| result << line.chomp; result }

 3  Net::HTTP.start('www....com') do |query|
 4    headers = { 'Content-Type' => 'application/x-www-form-urlencoded' }
 5    (1...names.length).each do |i|
 6      (0..i).each do |j|
 7        response = query.post('/guess/',
 8          "name1=#{names[i]}&name2=#{names[j]}", headers)
 9        if !response.body.include?('Failed')
10          puts "Answer: #{names[i]}, #{names[j]}"
11          exit
12        end
13      end
14    end
15  end

Line 1: makes the HTTP library accessible.

Line 2: reads a list of names from standard input into an array.

Line 3: opens a TCP connection to the specified web server (site withheld for privacy).

Line 4: through trial and error I discovered that contrary to the example in the Pick Axe p. 700, I need to add this header to get the form submission to work.

Lines 5-6: setup the iteration to compare each name (beginning with the second) with each of the previous names up to and including the current one (in case the answer is two identical names).

Lines 7-8: perform an HTTP post with the pair of names.

Lines 9-11: if the retrieved page doesn't contain a failure string, print the solution to standard output and exit.

Run the program as follows:

ruby guess_names.rb < names.txt

There you have it - a pair of Ruby programs to pass a "guess two names to enter" challenge. The amount of time to find a solution will depend on the uncommonness of the names used. This is an O(n^2) algorithm, so if the names in the solution are near the bottom of the popularity scale, expect a long run time. The name list I produced had 2,697 names, so the worst case scenario is (2697 * 2698) /2 -1 = 3,638,252 requests. If only the first 100 names need to be compared there would be 5,049 requests.

This algorithm is trivial to parallelize - just change line 5 above from (1...names.length) to multiple non-overlapping ranges (one per process), and let her fly.

Ruby has many benefits and is currently my most productive programming language, but I think it particularly shines in this type of ad-hoc, web-enabled task because of the expressiveness of the language and the availability of useful libraries. I highly recommend putting it high on your list of languages to learn if you don't already use it.

Tags: ,

Someone posted a question on comp.lang.ruby recently asking for help with solving anagrams. The poster originally asked about ways of generating permutations and several people pointed him to the facets library which has some permutation utility functions. As it turns out, I benchmarked the following naive permutation generator as 3 times faster than the facets library code:

def words chars, result='', &b
  if chars.empty?
    yield result
  else
    chars.length.times do |i|
      c = (x = chars.clone).slice!(i)
      words(x, result + c, &b)
    end
  end
end

However, generating all possible combinations of letters in an anagram is a very bad approach which doesn’t take into consideration key knowledge about the problem. Rather than generating all permutations of an anagram, which is infeasible for longer words, we can precompute a hash from our dictionary by using the sorted letters of words as keys. Then to solve an anagram, we simply sort the letters in the anagram and look it up in the hash.

The first program creates the hash and serializes it to disk for future anagram solving:

words = Hash.new([])

File.open("/usr/share/dict/words", "r") do |file|
  while line = file.gets
    word = line.chomp
    words[word.split('').sort!.join('')] += [word]
  end
end

File.open("word_hash", "w") do |file|
  Marshal.dump(words, file)
end

The second program loads the serialized hash from disk to solve anagrams that we type at the console:

words = nil

File.open("word_hash", "r") do |file|
  words = Marshal.load(file)
end

while true
  print "Enter word: "
  anagram = gets.chomp
  sorted_anagram = anagram.split('').sort!.join('')
  p  words[sorted_anagram]
end

This is really fast, but there are probably better methods. If you have one, feel free to add a comment with the code.

Tags: , ,

I was reading an article about adding code to JavaScript to make it more functional, and one of the blog commenters mentioned some built-in features that were added to JavaScript 1.6 & 1.7 on Firefox, so I checked out the links (see below) – very cool stuff.

  • Array methods

    • indexOf
    • lastIndexOf
    • every
    • filter
    • forEach
    • map
    • some
  • Array & String generics
  • Generators & Iterators
  • Array Comprehensions
  • Block Scope w/ let
  • Destructuring Assignment
  • etc.

They won’t help if you have to target IE also, but it should be possible to conditionally include your own code to implement the ones that don’t require syntactic changes for pages loaded from IE. That would reduce network load for customers using Firefox.

New in JavaScript 1.6 (Firefox 1.5)

New in JavaScript 1.7 (Firefox 2.0)

Hopefully IE will catch up someday, but if not, I can see taking advantage of Firefox specific JavaScript enhancements for niche applications. Firefox is so easy to install, that it should be easy to convince customers to use it for certain custom applications.

Tags: , , , ,

Did you know you can write C code directly within a Ruby program? I learned that at the Ruby Hoedown here in Raleigh and thought I’d share it for those who aren’t aware of it.

First install the rubyinline gem.

sudo gem install rubyinline

Then code :)

require 'rubygems'
require 'inline'

class MyTest
  inline do |builder|
    builder.c "
      long factorial(int max) {
        long result = 1;
        while (max >= 2) {
          result *= max--;
        }
        return result;
      }"
  end
end
t = MyTest.new
puts t.factorial(10)

The first time you run this it will be slower than subsequent runs because the C code will be compiled and cached. Typically you’ll find the generated C code in ~/.ruby_inline

Enjoy,
Brian

Tags: , ,

« Older entries § Newer entries »