Generating Quality Code – Proof of Concept

I put together a quick and dirty example of what I’ve been talking about in the last two posts.

This example shows how you can brute-force your way to working code: keep generating random attempts until one passes a set of tests you've written.

It also demonstrates how to use a basic grammar to randomly pick a little more intelligently, so there’s less chance of picking a completely drunk string of tokens.

I’m going to show it here with comments embedded so you can see how it works, and explain how it can be used to make simple arithmetic methods:

# generate.rb

# token_hash is a list of tokens and what 
# valid tokens can follow. a[0] and a[1] 
# are the arguments that are fed into the 
# generated method. this is so you have a 
# much higher chance of having working code
# instead of randomly throwing tokens 
# into a file to see what works. 
token_hash = { 
  "return " => ["a[0]", "a[1]"], 
  "+" => ["a[0]", "a[1]"], 
  "-" => ["a[0]", "a[1]"], 
  "*" => ["a[0]", "a[1]"], 
  "**" => ["a[0]", "a[1]"], 
  "a[0]" => ["+", "-", "*", "**", ""], 
  "a[1]" => ["+", "-", "*", "**", ""]
}

# here, we'll be making a method called "add". 
# the generator doesn't know what that means, 
# the name is for our benefit.
action = "add"

# first step is to delete the generated file,
# so we can start with a clean slate. 
# (File.delete works on any OS, unlike shelling out to "del".)
File.delete("#{action}.rb") if File.exist?("#{action}.rb")

# our file's going to be based off the 
# action, but with ".rb" at the end so 
# it's obvious it's a ruby script.
method_file = action + ".rb"

# we'll be generating a file and then 
# running it. the file itself will contain 
# some tests that will pass if the code 
# was generated properly. so as soon as 
# running the code results in a "true", 
# we'll break out of this loop. 
while system("ruby #{method_file}") == false

  # here's where we start writing out the file.
  File.open("#{method_file}", 'w') { |file|  

    # first we write out the method definition for what we're 
    # wanting to generate. the parameters will be a variable number
    # of arguments. could be 2, could be 20. it's a little more flexible
    # than it needed to be for this example, but it works, so i left it in.
    file.write("def #{action}(*a)\n") 

    # this is the string that we'll be building on as we find tokens
    # to stick in there. once we get a full string, we'll write it to 
    # the file. 
    string = "return " 

    # the first token we'll start with is "return ". 
    token = "return " 

    # the token_hash above has a few empty items in there. 
    # these are used so we know when to stop generating a longer string. 
    # this loop exits when the current token has no entry in token_hash
    # (the empty string), because the lookup returns nil.
    while token_hash[token] != nil do  

      # to pick a random token, we sample one of the possible ones.
      new_token = token_hash[token].sample 

      # then we append this choice to the end of the string that we'll
      # print at the end. 
      string += new_token 

      # finally we set the current token to the new one, and repeat
      # the loop
      token = new_token 
    end

    # when we exit the loop, we write the generated string out, followed
    # by an "end" statement. this closes out the method that we just
    # generated. 
    file.write("#{string}\n")
    file.write("end\n")

    # finally, we write out a couple tests. -we- know that we want a 
    # program that adds two numbers together, but the generator doesn't
    # have a concept of that. so we make some tests that will fail if
    # the generated code isn't correct. whatever code gets generated, 
    # 3 and 5 should return 8, and 2 and 2 should return 4. 
    file.write("fail if #{action}(3, 5) != 8\n") 
    file.write("fail if #{action}(2, 2) != 4\n")
  }
end

What Got Created?

def add(*a)
return a[0]+a[1]
end
fail if add(3, 5) != 8
fail if add(2, 2) != 4
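
The "return" line in that file comes from walking token_hash. As a rough sketch, here is that walk pulled out into a function of its own, so you can print a handful of candidates without writing any files. The names TOKEN_HASH and random_statement are mine for illustration and aren't part of generate.rb:

```ruby
# Same grammar as generate.rb, copied here so the sketch is self-contained.
TOKEN_HASH = {
  "return " => ["a[0]", "a[1]"],
  "+"    => ["a[0]", "a[1]"],
  "-"    => ["a[0]", "a[1]"],
  "*"    => ["a[0]", "a[1]"],
  "**"   => ["a[0]", "a[1]"],
  "a[0]" => ["+", "-", "*", "**", ""],
  "a[1]" => ["+", "-", "*", "**", ""]
}.freeze

# Hypothetical helper: starts at "return " and keeps picking a random
# follower until it lands on a token with no entry in the grammar
# (the empty string), then returns the statement it built.
def random_statement(grammar, rng = Random.new)
  statement = "return "
  token = "return "
  while grammar[token]
    token = grammar[token].sample(random: rng)
    statement += token
  end
  statement
end

# Print a few candidates. Output differs on every run.
5.times { puts random_statement(TOKEN_HASH) }
```

Every statement it produces alternates operand, operator, operand, which is why the generator never writes out something like "return ++*".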

Let’s Run It

ruby generate.rb
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
add.rb:4:in `<main>': unhandled exception
[Finished in 2.3s]

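Each line that mentions an unhandled exception comes from a failing test, specifically the one on line 4 of add.rb, where we expect 3 and 5 to give us 8. A bare "fail" raises a RuntimeError with that message, which is why there's nothing more descriptive.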

Don’t know what it tried giving us, but it wasn’t 8 so FAIL. Until it doesn’t.
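
If you do want to know what it tried giving us, one option is to have the generated tests raise with a message instead of a bare "fail". This is only a sketch: expect_equal is a hypothetical helper that generate.rb doesn't have, and the add method below is just one wrong candidate the generator could plausibly produce:

```ruby
# Hypothetical helper, not part of the original generate.rb.
# Raises a RuntimeError that names both values when they differ.
def expect_equal(actual, expected)
  return if actual == expected
  raise "expected #{expected.inspect}, got #{actual.inspect}"
end

# A wrong candidate: subtraction instead of addition.
def add(*a)
  return a[0]-a[1]
end

begin
  expect_equal(add(3, 5), 8)
rescue RuntimeError => e
  puts e.message   # prints: expected 8, got -2
end
```

In the generated file you wouldn't rescue the error, so the script would still exit with a failure and the outer loop would keep going, just with a more useful line in the output.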

Next Steps

  • Tweak it so generation runs across multiple threads that all stop once one instance succeeds. This could run in the cloud or on a Raspberry Pi cluster.
  • Have the generator read in previously generated methods at runtime and use those as possible tokens too.
  • Although we can do arithmetic, figuring out formulas can be really tricky. If we wanted a method that needs to add a constant (say 2 and 2 should return 7, so we expect a[0]+a[1]+3), the constant could be anything, which makes for an infinite number of possibilities. Something like WolframAlpha might be good for that.
  • Specialized generators. Having ALL possible tokens means a lot more combinations to weed through. Maybe a generator for webservices, one for databases, one for string/file manipulation, etc.
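
To get a feel for the constants problem, here is a rough sketch that sidesteps the infinity by only allowing the constants 1 through 5. It also departs from generate.rb in two ways, both my own assumptions: candidates are checked in memory with eval instead of being written to a file, and "**" is left out so evaluation stays cheap. OPERANDS, OPERATORS, TESTS, random_expression and search are all made-up names:

```ruby
# Hypothetical extension of the grammar: small integer constants are
# allowed as operands alongside the two arguments.
OPERANDS  = ["a[0]", "a[1]", "1", "2", "3", "4", "5"].freeze
OPERATORS = ["+", "-", "*", ""].freeze   # "" ends the expression

# Builds one random expression: operand, then (operator, operand) pairs
# until the empty operator is drawn.
def random_expression(rng)
  expr = OPERANDS.sample(random: rng)
  loop do
    op = OPERATORS.sample(random: rng)
    return expr if op.empty?
    expr += op + OPERANDS.sample(random: rng)
  end
end

# Arguments mapped to the result we want: 2, 2 gives 7 and 3, 5 gives 11.
TESTS = { [2, 2] => 7, [3, 5] => 11 }.freeze

# Tries random expressions until one passes every test, giving up
# after max_attempts. Returns the passing expression or nil.
def search(tests, max_attempts: 200_000, rng: Random.new)
  max_attempts.times do
    expr = random_expression(rng)
    candidate = eval("->(*a) { #{expr} }")
    return expr if tests.all? { |args, expected| candidate.call(*args) == expected }
  end
  nil
end

found = search(TESTS)
puts found ? "passing candidate: #{found}" : "no candidate found"
```

Whatever it prints only has to agree with the two tests, so it may not be a[0]+a[1]+3 at all. More tests narrow that down, and a wider range of constants makes the search slower, which is the tricky part.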


