Books & Tools Techniques

Comprehensive coverage of Ruby 1.8 and 1.9

"The New Most Important Ruby Book"
Peter Cooper,
rubyinside.com

Completely updated for Ajax and Web 2.0

"A must-have reference"
Brendan Eich,
creator of JavaScript

Jude

Jude is my Java documentation browser. It combines Sun's definitive javadocs with the easy-to-use format of Java in a Nutshell, and tops it off with easy keyboard-based navigation and full-text searching.

Jude is available for free evaluation.

See the user's guide for more info

Java in a Nutshell

The 5th edition is now out, with complete coverage of Java 5.0!

It includes a fast-paced tutorial on the language, and a compact quick-reference for the core Java API.

Java Examples in a Nutshell

The 3rd edition, updated for Java 1.4

This edition has all-new coverage of the NIO and JavaSound APIs, completely rewritten Servlets and XML chapters, and coverage of new Java 1.4 features (assertions, logging, preferences, SSL, etc.) added throughout. A great book for those who like to learn by example. 193 working examples: 21,900 lines of carefully commented code to learn from.

Java 1.5 Tiger: A Developer's Notebook

Amazon incorrectly credits me as the main author on this book. I'm actually the second author: really more of a consultant. This is a good book about all the language changes in the latest version of Java.

Effective Java

I didn't write this excellent book, but I wish I had.

Author Josh Bloch is probably best known for the collections classes in the java.util package. His experience and wisdom are apparent in this book. I learned from it and recommend it highly.

August 31, 2007

Favorite Unicode Codepoints

I stumbled across this fun page today. I think my favorite codepoint name of those listed is U+2764 "Heavy Black Heart"...

It doesn't look like that page is actively maintained any more, so I don't think you can submit your own favorites.

August 22, 2007

Fibonacci numbers with Ruby 1.9 Fibers

Here's how to use the new Fiber class (warning: class name may change) to generate an infinite sequence of Fibonacci numbers. I use "generate" in the sense of Python's generators. Ruby's new fibers are "semi-coroutines".

fib = Fiber.new do 
  x, y = 0, 1
  loop do 
    Fiber.yield y
    x,y = y,x+y
  end
end

20.times { puts fib.resume }

Remember that the Fiber class was added to Ruby 1.9 yesterday, so you need a very recent snapshot to make this work.

Here's a slightly higher-level way to do the Fibonacci numbers. We hide the Fiber class inside of a Generator class:

class Generator
  def initialize(&block)
    @f = Fiber.new(&block)
  end

  def next?
    @f.alive?
  end

  def next(*args)
    @f.resume(*args)
  end
end

fib2 = Generator.new do 
  x, y = 0, 1
  loop do 
    Fiber.yield y
    x,y = y,x+y
  end
end
20.times { puts fib2.next }
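
One wrinkle with next? is worth sketching. The snippet below repeats the Generator class and assumes a Ruby where Fiber#alive? is available (older releases keep it in the 'fiber' extension, hence the guarded require). The countdown generator is a made-up example. Because a fiber stays alive until its block actually returns, a finite generator hands back the block's return value as one extra, final item:

```ruby
# Older Rubies define Fiber#alive? in the 'fiber' extension; newer ones have it in core.
require 'fiber' unless Fiber.method_defined?(:alive?)

class Generator
  def initialize(&block)
    @f = Fiber.new(&block)
  end

  def next?
    @f.alive?
  end

  def next(*args)
    @f.resume(*args)
  end
end

# Hypothetical finite generator: yields 3, 2, 1 and then returns :done
countdown = Generator.new do
  3.downto(1) { |n| Fiber.yield n }
  :done   # the block's return value is what the final resume returns
end

values = []
values << countdown.next while countdown.next?
p values
```

So next? here means "the block has not finished yet" rather than "another yielded value is waiting".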

Coroutines (via fibers) in Ruby 1.9

As a followup to the post below, here is some example code using fibers to implement co-routines. This works in today's snapshot build, but it uses brand-new stuff, so it probably won't work if your build is more than a day old. Similarly, the API may change again, so it might not work if your snapshot is too new, either. svn.ruby-lang.org seems to be down (for me at least) right now, so I got my snapshot here: http://www.mirrorservice.org/sites/ftp.ruby-lang.org/pub/ruby/snapshot.tar.gz The code works for me on Linux, but I haven't tested it elsewhere:

f = g = nil

f = Fiber::Core.new { |x|
  puts "F1: #{x}"        
  x = g.transfer(x+1)    
  puts "F2: #{x}"
  x = g.transfer(x+1)
  puts "F3: #{x}"
}

g = Fiber::Core.new { |x|   
  puts "G1: #{x}"           
  x = f.transfer(x+1)
  puts "G2: #{x}"
  x = f.transfer(x+1)
}

f.transfer(100)

This code prints the following

F1: 100
G1: 101
F2: 102
G2: 103
F3: 104

Notice that the code uses the class Fiber::Core, and its instance method transfer. There is also a Fiber class, with a slightly higher-level API involving the instance method resume and the class method yield. I'm not sure exactly how these are intended to be used, but I was able to produce the same output as above using Fiber (instead of Fiber::Core) as follows:

g = Fiber.new { |x|
  puts "G1: #{x}"
  x = Fiber.yield(x+1)   # Can't use resume here: double resume error
  puts "G2: #{x}"
  x = Fiber.yield(x+1)
}

f = Fiber.new { |x|
  puts "F1: #{x}"
  x = g.resume(x+1)
  puts "F2: #{x}"
  x = g.resume(x+1)
  puts "F3: #{x}"
}

f.resume(100)

Update: Koichi SASADA notes on ruby-core:

Fiber::Core and Fiber::Core#transfer is black magic. So I'm planning to rename this class and method to Fiber::DangerousCore::__unsafe_transfer__I_cant_promise_your_program_run_correctly__!

He also says that the Fiber class is a "semi-coroutine" like Python's generator and that the method names Fiber#resume and Fiber.yield are taken from the language Lua. Time to go read up on Lua, I suppose.
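
To make the resume/yield handshake concrete, here is a minimal sketch (the running-total fiber is my own made-up example, not something from ruby-core). The argument to the first resume becomes the block parameter, each later resume argument becomes the return value of the pending Fiber.yield, and whatever is passed to Fiber.yield comes back as the return value of resume:

```ruby
# Hypothetical running-total fiber, used only to show how values flow
acc = Fiber.new do |first|
  total = first                    # argument of the first resume
  loop do
    # Hand the current total to the caller, then wait for the next amount
    total += Fiber.yield(total)
  end
end

a = acc.resume(10)   # starts the fiber, which yields 10
b = acc.resume(5)    # 5 is returned by Fiber.yield inside the fiber
c = acc.resume(7)
p [a, b, c]
```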

External Iterators in Ruby 1.9 Core!

Ruby 1.9 has, for a while now, supported Enumerators. These are Enumerable objects returned by iterator methods when the methods are invoked with no blocks. Until recently, Enumerators were still internal, Ruby-style iterators that you invoke with each.

But that has changed. I haven't seen this discussed anywhere in English, but recent builds of Ruby 1.9 (my build is from August 17th) also define methods next and next? on Enumerator. These are Python-style (and Java-style) external iterators:

Update: 8/24/07: Matz has just removed the next? method. So you now have to be ready to catch StopIteration when using Enumerators. This means that the twine method below doesn't work anymore.

r = 1..3   # An enumerable object
e = r.each  # An enumerator object
e.next   # => 1
e.next # => 2
e.next # => 3
e.next # StopIteration exception raised
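
A small sketch of living without next?, assuming a released Ruby in which Kernel#loop rescues StopIteration and Enumerator#rewind is available:

```ruby
e = (1..3).each
seen = []
loop { seen << e.next }   # loop rescues StopIteration and simply exits

# Once exhausted, the enumerator keeps raising StopIteration
exhausted = begin
  e.next
  false
rescue StopIteration
  true
end

e.rewind                  # start over from the beginning
first_again = e.next
p seen, exhausted, first_again
```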

True external iterators make the age-old problem of parallel iteration in Ruby trivial to solve. Here are some examples:

def twine(*enumerables)
  enumerators = enumerables.map { |x| x.each }
  while not enumerators.empty?
    e = enumerators.shift
    if e.next?
      yield e.next
      enumerators << e
    end
  end
end

def braid(*enumerables)
  enumerators = enumerables.map { |x| x.each }
  begin
    loop do
      values = enumerators.map {|x| x.next }
      yield *values
    end
  rescue StopIteration
    # This is the normal termination condition
  end
end

a = [1,2,3]
b = [4,5,6]
twine(a,b,'a'..'b') { |x| puts x }
braid(a,b,7..10) { |x,y,z| print "#{x},#{y},#{z}\n" }
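
Since next? is gone, here is one way twine could be rewritten to rely on StopIteration instead. This is a sketch of my own rather than tested-in-1.9-snapshot code, and it assumes a block is always given:

```ruby
def twine(*enumerables)
  enumerators = enumerables.map { |x| x.each }
  until enumerators.empty?
    e = enumerators.shift
    begin
      value = e.next
    rescue StopIteration
      next                # this enumerator is exhausted, so drop it
    end
    yield value
    enumerators << e      # still alive: put it at the back of the line
  end
end

result = []
twine([1, 2, 3], [4, 5, 6], 'a'..'b') { |x| result << x }
p result
```

Rescuing only around e.next keeps a StopIteration raised inside the caller's block from being swallowed by accident.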

It appears that these new external iterators are not built on the old Generator library, which used continuations (I think). Instead, they use a new Fiber class (a kind of micro-thread). It is not documented yet, but it's in there in recent 1.9 builds. If you know how to use Fiber, feel free to explain in the comments.

August 14, 2007

Changes between Ruby 1.8 and Ruby 1.9

This is a repost. The original from August 2nd started getting comment spam, and I accidentally deleted the post while deleting the spammy comments. The original, valid comments are gone as well, which is unfortunate, because Tom Klaasen and Dmitry Kim made helpful points that resulted in corrections to the post below.

Someone recently emailed the ruby-core mailing list asking "Is there some list of 'bullet points' on the major differences between the syntax of Ruby 1.8 and Ruby 1.9 available somewhere?" The response, of course, was a link to the definitive list of changes in Ruby 1.9.

But that list is exhaustive rather than a summary of the major changes. So, the following is my somewhat more digested list of the important changes, as I understand them. Additions, corrections, clarifications, and so forth are welcome in the comments.

Text

  • Characters are represented by single-character strings, rather than integers:
    • ?A returns "A" instead of 65
    • "HELLO"[1] returns "E" instead of 69. s[x] is now the same as s[x,1]
    • Use ord method of String to get character encoding. It returns the encoding of the first character of the string
  • Strings are no longer Enumerable, and the each method has been removed. Use each_line and each_byte to iterate lines and bytes. Both of these methods can return enumerators (see below), which are Enumerable.
  • Ruby 1.9 adopts the Oniguruma regexp engine, which adds advanced new features for regular expression wizards.
  • Additional changes are expected to Ruby's Unicode and multi-byte string support, but Matz has not unveiled them yet. Strings may have an encoding method for querying or setting their encoding.
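
A quick sketch of the text changes listed above, as they behave in a released Ruby 1.9 or later (the variable names are just for illustration):

```ruby
char   = ?A              # a one-character string, not 65
second = "HELLO"[1]      # "E"
same   = "HELLO"[1, 1]   # s[x] is the same as s[x,1]
code   = "A".ord         # the integer encoding of the first character

enumerable = String.include?(Enumerable)   # Strings are no longer Enumerable

# each_line and each_byte return enumerators when called without a block
lines = "one\ntwo\n".each_line.to_a
bytes = "hi".each_byte.to_a
p char, second, same, code, enumerable, lines, bytes
```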

Ranges

  • member? and include? work differently if the endpoints of a range are not numbers: they actually iterate with succ to test membership in that case.
  • The new method covers? does what member? and include? did in 1.8
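
A sketch of the difference using a range of strings. Note one assumption: released versions of Ruby spell the new method cover?, so that is the name used here.

```ruby
letters = "a".."z"

by_iteration_m  = letters.include?("m")    # "m" is one of the elements
by_iteration_bb = letters.include?("bb")   # "bb" is never produced by succ
by_comparison   = letters.cover?("bb")     # "a" <= "bb" <= "z" holds
p by_iteration_m, by_iteration_bb, by_comparison
```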

Hashes

  • New hash syntax. When the keys of a hash are symbols, you can move the colon from the beginning of the symbol to the end (no space allowed) and omit the =>. So this hash {:a=>1,:b=>2} turns into {a:1,b:2}. The Ruby 1.8 syntax is still supported, of course.

Parallel Assignment

  • Any number of splat operators may appear on the right-hand side of a parallel assignment in Ruby 1.9. Previously only the last rvalue could have a splat.
  • The left-hand side of a parallel assignment may have only one splat operator, as always, but it is no longer required to be on the last lvalue. In Ruby 1.9 a splat may appear before any one lvalue.
  • Remember that the rules of parallel assignment apply to block invocation as well: the arguments to yield are rvalues, and the block parameters are lvalues. So these splat changes apply to blocks as well.
  • In Ruby 1.8, the value of a parallel assignment expression was an array of the lvalues. For efficiency, Ruby 1.9 evaluates all parallel assignments to true.
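
The splat changes above can be sketched like this (variable names are arbitrary):

```ruby
# More than one splat on the right-hand side
a, b, c, d = *[1, 2], *[3, 4]

# A single splat on the left-hand side, no longer required to be last
first, *middle, last = 1, 2, 3, 4
*init, tail = [1, 2, 3]

p [a, b, c, d], first, middle, last, init, tail
```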

Enumerators

  • The iterator methods of core classes and modules like String, Fixnum, Array, Hash and Enumerable now return an enumerator object when invoked with no block. An enumerator is an Enumerable object. The enumerator returned by each of these iterator methods uses that underlying iterator in place of the each method normally used by the Enumerable mixin. So we can write things like:
    
    counter = (1..10).each  # returns an enumerator 
    counter.each_with_index { |n,i| puts n,i }
    
  • Ruby makes Enumerable::Enumerator core, so you no longer have to require "enumerator" to get methods like enum_for

Blocks

  • Block parameters are always local to their blocks, even when the enclosing scope includes a variable by the same name. Use -w to get a warning when this will change the behavior of your code.
  • Block parameters must be local variables in Ruby 1.9. No more assigning to instance variables or global variables as a side-effect of block invocation
  • You can declare block-local variables in Ruby 1.9. Just follow the ordinary list of block parameters with a semi-colon and follow it with a comma-separated list of variable names:
    
    hash.each { |k,v; x,y,z| ... }
    
    With this block declaration, x, y, and z will be local to the block, even if they are already defined in the enclosing scope.
  • As per the parallel-assignment changes described above, a block parameter list may have a splat operator before any one parameter. It is no longer required to be the last one.
  • The last block parameter may be prefixed with an ampersand to make it receive a block, just as you can do with methods. This is typically only useful when the block is being turned into a proc or lambda
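
Here is a short sketch of the block changes above. The names are made up for illustration:

```ruby
x = :outer
[1, 2, 3].each { |x| }                    # the parameter x is local to the block

total = :untouched
[1, 2, 3].each { |n; total| total = n }   # total after the semicolon is block-local

# A splat may appear before any one block parameter
parts = [[1, 2, 3, 4]].map { |head, *rest, tail| [head, rest, tail] }

# The last parameter may capture a block, here in a lambda
wrap = lambda { |value, &blk| blk.call(value) }
doubled = wrap.call(21) { |v| v * 2 }

p x, total, parts, doubled
```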

Procs and Lambdas

  • Kernel.proc is now a synonym for Proc.new: proc now creates a proc and lambda creates a lambda. (Both procs and lambdas are still instances of Proc, of course.)
  • The Symbol.to_proc method is now built-in to Ruby.
  • Ruby 1.9 supports a (strange at first) new syntax for defining lambdas:
    
    ->(x,y) { x + y }  # same as lambda {|x,y| x + y}
    
    • Parentheses are optional in this new syntax:
      
      ->x,y { x + y }  # same as lambda {|x,y| x + y}
      
    • The new lambda syntax supports block-local variable declarations following a semicolon just as the regular block syntax does.
    • The new lambda syntax allows argument defaults using the same syntax as method declarations:
      
      sale_price = ->(price,discount=0.25) { (1.0-discount)*price }
      
  • Procs and lambdas can be invoked with parentheses preceded by a period. Given a lambda sum, the following three lines are synonyms:
    
    sum.call(1,2)
    sum[1,2]
    sum.(1,2)
    
  • Procs now have a yield method that is an alternative to call. yield uses yield semantics rather than method calling semantics to invoke the proc. This means that it is more relaxed about arity mis-matches and behaves like parallel assignment when the argument is a single array.
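
A sketch that pulls several of these items together. The default discount is written 0.25 because Ruby requires a digit before the decimal point:

```ruby
sum = ->(x, y) { x + y }
results = [sum.call(1, 2), sum[1, 2], sum.(1, 2)]   # three ways to invoke

sale_price = ->(price, discount = 0.25) { (1.0 - discount) * price }
default_sale = sale_price.(100)
half_off     = sale_price.(100, 0.5)

upcased = %w[a b].map(&:upcase)            # Symbol#to_proc is built in

kinds = [proc { }.lambda?, lambda { }.lambda?]   # proc makes a proc, lambda a lambda
p results, default_sale, half_off, upcased, kinds
```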

Bindings

  • Binding objects have an eval method to evaluate in that binding. This is an alternative to passing the binding as the second argument to Kernel.eval.
  • Proc.binding is now a private method. It is not clear if this is a bug or if that method will no longer be available.
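
A minimal sketch of Binding#eval next to the older Kernel.eval form. The make_binding helper and its secret variable are hypothetical:

```ruby
# Hypothetical helper that captures a binding containing a local variable
def make_binding
  secret = 42
  binding
end

b = make_binding
via_method = b.eval("secret * 2")      # new in 1.9: eval as a method of Binding
via_kernel = eval("secret * 2", b)     # the 1.8 way still works
p via_method, via_kernel
```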

Continuations

  • Continuations are not supported in Ruby 1.9

Private Methods

  • The method name resolution algorithm has changed or may be changing to alter the way private methods are looked up. The details are still unclear (to me, at least).

Class Variables

  • Class variables are no longer shared by a class and its subclasses. A subclass can read the values of class variables defined by its superclass. But if it sets the value of such a variable, it simply creates its own local copy of the variable, and no longer alters the value seen by the superclass.

Math

  • Math.log2 computes base-2 log
  • Math.log(x,y) computes the log base-y of x
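
For example (results are floats, so compare with a tolerance rather than exact equality):

```ruby
base2  = Math.log2(8)        # base-2 log of 8
base2b = Math.log(8, 2)      # the same value via the two-argument form
base10 = Math.log(100, 10)   # log base-10 of 100
p base2, base2b, base10
```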

August 13, 2007

Nifty Ruby Unicode codepoints utility

I recently learned about the Module.const_missing method for lazily computed constants, and came up with this Unicode utility module to try it out.

#
# This module lazily defines constants of the form Uxxxx for all Unicode
# codepoints from U0000 to U10FFFF. The value of each constant is the
# UTF-8 string for the codepoint.
# Examples:
#   copyright = Unicode::U00A9
#   euro = Unicode::U20AC
#   infinity = Unicode::U221E
#
module Unicode
  def self.const_missing(name)  
    # Check that the constant name is of the right form: U0000 to U10FFFF
    if name.to_s =~ /^U([0-9a-fA-F]{4,5}|10[0-9a-fA-F]{4})$/
      # Convert the codepoint to an immutable UTF-8 string,
      # define a real constant for that value and return the value
      const_set(name, [$1.to_i(16)].pack("U").freeze)
    else  # Raise an error for constants that are not Unicode.
      raise NameError, "Uninitialized constant: Unicode::#{name}"
    end
  end
end
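
Here is a sketch of the module in use. It repeats the module without comments so that the snippet runs on its own; the variable names and the Bogus constant are made up for the demonstration:

```ruby
module Unicode
  def self.const_missing(name)
    if name.to_s =~ /^U([0-9a-fA-F]{4,5}|10[0-9a-fA-F]{4})$/
      const_set(name, [$1.to_i(16)].pack("U").freeze)
    else
      raise NameError, "Uninitialized constant: Unicode::#{name}"
    end
  end
end

before = Unicode.const_defined?(:U20AC, false)   # not defined until first use
euro   = Unicode::U20AC                          # triggers const_missing
after  = Unicode.const_defined?(:U20AC, false)   # now a real constant

rejected = begin
  Unicode::Bogus                                 # not of the form Uxxxx
  false
rescue NameError
  true
end
p before, euro, after, rejected
```

The second lookup of Unicode::U20AC never reaches const_missing, because const_set has already defined the constant.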

August 10, 2007

Concurrent Exchanger class in Ruby

I've always thought that java.util.concurrent.Exchanger was a nifty little class. The javadoc I've linked to describes it like this:

A synchronization point at which two threads can exchange objects. Each thread presents some object on entry to the exchange method, and receives the object presented by the other thread on return.

The javadoc also includes a nifty little example demonstrating why you might want an Exchanger object.

I decided to try to implement this in Ruby, to test my understanding of Mutex and ConditionVariable. The code is below. I'm not sure how useful it actually is, but it was fun to write. And if you can understand what it does, then you've got a good working knowledge of threads in Ruby!

Finally, if you've got the JDK installed, unzip the src.zip file and take a look at the source code in java/util/concurrent/Exchanger.java. It is much more complicated than this trivial example. It was eye-opening to see how carefully written classes like this one are for high performance on multi-core or multi-CPU systems.

require 'thread'

class Exchanger
  def initialize
    # These variables will hold the two values to be exchanged
    @first_value = @second_value = nil
    # This Mutex protects access to the exchange method
    @lock = Mutex.new
    # This Mutex allows us to determine whether we're the first or
    # second thread to call exchange
    @first = Mutex.new
    # This ConditionVariable allows the first thread to wait for
    # the arrival of the second thread
    @second = ConditionVariable.new
  end

  # Exchange this value for the value passed by the other thread
  def exchange(value)
    @lock.synchronize do      # Only one thread can call this method at a time
      if (@first.try_lock)    # We are the first thread
        @first_value = value  # Store the first thread's argument
        # Now wait until the second thread arrives.
        # This temporarily unlocks the Mutex while we wait, so 
        # that the second thread can call this method, too
        @second.wait(@lock)   # Wait for second thread 
        @first.unlock         # Get ready for the next exchange
        @second_value         # Return the second thread's value
      else                    # Otherwise, we're the second thread
        @second_value = value # Store the second value
        @second.signal()      # Tell the first thread we're here
        @first_value          # Return the first thread's value
      end
    end
  end
end

# This is some test code

e = Exchanger.new

5.times do
  t1 = Thread.new {
    sleep(rand)
    puts "t1 exchanges 1 for #{e.exchange(1)}"
  }
  t2 = Thread.new {
    sleep(rand)
    puts "t2 exchanges 2 for #{e.exchange(2)}"
  }

  t1.join; t2.join
end
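
The step in exchange that is easiest to misread is the call to wait: ConditionVariable#wait releases the Mutex while the thread sleeps and reacquires it before returning. Here is a stripped-down sketch of just that handshake, with made-up names, in which one thread waits for a message from another. It waits in a while loop so that it behaves correctly whichever thread gets the lock first:

```ruby
require 'thread'   # needed on older Rubies, harmless on newer ones

lock    = Mutex.new
ready   = ConditionVariable.new
message = nil

consumer = Thread.new do
  lock.synchronize do
    # wait unlocks the mutex while sleeping and relocks it on wakeup;
    # the loop rechecks the condition in case we were woken early
    ready.wait(lock) while message.nil?
    message
  end
end

lock.synchronize do
  message = "hello"
  ready.signal       # wake the consumer if it is already waiting
end

received = consumer.value   # join the thread and fetch its result
puts received
```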

August 09, 2007

New William Gibson book: Spook Country

William Gibson's latest novel, Spook Country is out in hardback!

I'm not done yet, but I'm really enjoying it so far. An engaging story, plus he revisits/reconsiders some ideas from Neuromancer and Virtual Light.

New Java 7 Column on java.net

The first article of my new column on Java 7 is now up on java.net.

In the article I discuss the jdk7 and OpenJDK projects at java.net, mention some projects that are being developed openly and might become part of Java 7, and speculate that, because of schedule constraints, Java 7 might include a lot less than we were told to expect 10 months ago.

Future installments of the column will, I hope, go into more depth with specific APIs and will actually include code examples.

August 06, 2007

Stupid Paypal Tricks

Some guy named John Richmond, apparently of Milford, Michigan, hacked the HTML form on the Paypal payment page for my Jude software, so that he could send me a payment of two cents instead of the $48 that I actually charge for the software. I validate the payment amount on the server side, of course, so my scripts never sent him a software license. Paypal took the two cents as a transaction fee, I got nothing, and Mr. Richmond was, one assumes, two cents wiser.

But no, he had the gall to initiate a Paypal dispute demanding that I send him a license or refund his two cents. As part of the dispute process he wrote "I paid what the software was worth". He doesn't like my software so he uses Paypal to send an insult and then uses the dispute process to yank my chain!

What a jerk!

In addition to griping, the point of this post, I guess, is to get something into Google, so that anyone considering doing business with John Richmond can find out about his jerkiness. Richmond uses the domain name clientsg.com, which is owned by his defunct Michigan corporation "Client Services Group, Inc.". Google shows listings for "Salepoint Inc." at the same address and phone number as "Client Services Group". Salepoint is not a Michigan corporation. Perhaps Richmond consults for salepoint.com, or perhaps he just uses that name for some other business.

Update: Paypal sent the guy his two cents back. So I guess the Paypal user agreement allows strangers to send you money for no reason and then demand that you send it back.

Update 2: I'm getting tired of updating this post, but since I've been bashing Mr. Richmond, I should acknowledge a useful and unintentionally funny response I finally got from him. He writes:

Paypal allows you to use encrypted data in your HTML to accept payments.

You need to do this.

Contact paypal for the information on how to send your data as an encrypted block via a web page to paypal so idiots viewing the source code cant see it and modify it.

At least you check the payment amount. You would be amazed at how many sites there are out there that are automated and simply send out the licence when any payment amount is received.

The useful part: the info about encryption. Though since this is the only person who has attempted this in three years, I don't think I'll run out and patch my scripts now. The funny part: he acknowledges that he's an "idiot" :-)

Also, the timing of the comment from "me" strongly suggests that it is from Mr. Richmond himself. He seems to call himself "jobu", and he acknowledges that he exploits the weaknesses in sites' Paypal scripts.
