Joe finds a reference to Conway's Game of Life whilst clearing out his desk, and is suddenly gripped with nostalgia. It wasn't just flares, mullets and disco, but simple computer games in interpreted basic. Somehow, Conways Game of Life was too intriguing to be abandoned in the attic. Can it be implemented in SQL? Joe sets up a challenge.
Every few years, you need to go to the garage and pull out the old stuff. By “old stuff” I mean the physical things – garages attract many strange objects like broken toys, old magazines and unused tools. A few years ago, a friend of mine took his kids to help clean out grandma's attic when his mother died.
His daughter grabbed the trunk with her aunt's old clothes. Some things had aged like fine wine into “vintage clothing” since then 1970's, some had become Halloween costumes. His son found a box of 45 RPM vinyl records.
- “Hey, Dad, what are these things?”
- “Of what?”
- “Music. These are stone age CDs. They are probably worth a few bucks at the used bookstore.”
- “How many songs on this thing?”
- “One on each side.”
- “No, really! How did you live like that?”
The hardware might be dead now, but the music is still good. It is the same way with software. I pulled out some of my old articles, clippings and books from the garage this month. In particular, I found some things on cellular automate and Conway's Game of Life. John Horton Conway is a name you should know if you deal with recreational math and computer science.
Conway's Game of Life
This was invented in 1970, but became a geek fad in the 1980's. It was easy to program in BASIC or Pascal and it looked pretty on a small screen.
For us databases guys, we know that this is where Dr. Codd started before he invented the Relational Model.
If you want to read it, you can download it athttp://ebookee.org/E-F-Codd-Cellular-Automata_2164601.html.
Imagine a planar grid of cells in which strange creatures I called Merkles live. These are simple creatures that only know about the eight cells in their immediate neighborhood.
The grid starts at time zero with Merkles sprinkled in it. Every cycle, a Merkle looks at his neighborhood and follows these rules for the next cycle:
- If the neighborhood has zero or one neighbor, the poor Merkle dies of loneliness.
- If the neighborhood has four or more neighbors, the poor Merkle dies of overcrowding.
- If the neighborhood has two or three neighbors, the Merkle lives to see the next cycle.
- Any empty cell with exactly three live neighbors grows a Merkle.
If you want more details and to see some simple animations, look up Conway's Game of Life (Wikipadia).
When you gave the Game of Life as a programming assignment, the bad programmers would use bit flags (0 for empty cells and 1 for a Merkle) because it is the most direct and obvious approach. But that decision leads to two grid arrays, one of the odd numbered cycles and oen for the even numbered cycles. A simple flag in the program keeps track of which of ht two arrays is current. A simple loop on the current grid looks at each cell's neighborhood and marks the corresponding cells in the next generation gird.
But there is a better way, discovered by Ward Christensen. Create an array of tiny integers and use ten (hot one) for a Merkle and zero for an empty cell. In the first pass, scan they array for cells with a ten; when you find one, add one to each of its neighbors. Now make the second pass. When you find a three (newborn Merkle) or a 12 or 13 (survivor Merkle), set it to ten. Zero out any other values.
The real performance improvement in the Christensen algorithm came from only scanning the occupied cells. Most of the grid will be empty, even if the starting configuration was packed. Over population leads to a high initial death rate. That does not mean that the grid will necessarily stabilize at some point in time.
Generalizing the Lessons of Life
Changing the rules for the Merkles will give us slightly different worlds. There is more information this topic than you really want to know at http://mathworld.wolfram.com/topics/CellularAutomata.html. This site has a list of rules and demonstrations of the outputs,
One surprise is that there are configurations which can only be initial configurations (aka “Garden of Eden”). There are proofs that certain cellular automata have to have a “Garden of Eden” if they meet certain criteria, but the proof does not show them to you. The first “Garden of Eden” was not found until 1971, and at least three are now known. It is not, however, known if a pattern exists which has a father pattern, but no grandfather pattern.
Another family of cellular automata has two (or more) types of creatures in the grid and the rules involve how they interact. My favorite example
In this imaginary world, the grid is populated with Red Merkles and Blue Merkles. Every cell is occupied and then we start the game. A Merkle is unhappy if its neighborhood has less than (n) neighbors of its own color. Two unhappy Merkles can switch places if one or both of them increase their happiness.
In a few cycles, the grid becomes segregated neighborhoods. , Economist Thomas C. Schelling created this simple model to study segregated neighborhoods in the US in the 1970's. While racism might account for some of the U-shaped distribution of race in neighborhoods, the stronger factor is the sum of individual preferences.
You can see some QuickTime animations of Merkleville at these sites:
- For (n = 2)
- For (n = 1)
Well, Celko has been blathering. Let's get to some conclusions and general principles that help us think about data.
- Individuals in a social environment interact, but only with their neighbors. How many people outside your Google circles, Facebook or email contact lists?
- The members of social systems work in parallel. Sudden changes are not magic or random, but they can be surprising.
- Complex patterns in data arise from simple underlying rules.
- Simple models can reveal those rules, or at least a simplified version of them.
Hey, I have insights on social networks and how things can go viral! I feel smarter. I hope that you do, too.
Obviously SQL does not have arrays, so you need to think a way to model a grid. The classic way in SQL has been this skeleton schema.
CREATE TABLE Grid
(i INTEGER NOT NULL,
j INTEGER NOT NULL,
PRIMARY KEY (i,j),
merkle <data type> NOT NULL);
The two coordinate columns might also have a BETWEEN constraint to mimic the range declaration in procedural languages that have arrays. You might also add a constraint to the Merkle's values.
Build a 10 by 10 Schelling model in SQL. That means the Grid might be declared as:
CREATE TABLE Grid
(i INTEGER NOT NULL
CHECK (i BETWEEN 1 AND 10),
j INTEGER NOT NULL
CHECK (j BETWEEN 1 AND 10),,
PRIMARY KEY (i,j),
merkle CHAR(1) NOT NULL
CHECK (merkle IN ('R', 'B')));
You will need a procedure to load the initial grid and a procedure to swap unhappy Merkles. You can be creative here.
Write a stored procedure that will play Conway's Game of Life. You will need to load the grid with a starting position, then one to do each cycle.
- Hint: do you really need to have complete grid to do this or can you use a sparse model?
- Hint: Spend some time on a “neighborhood query”
- Hint: The MERGE statement can do inserts, updates and deletes in parallel which is how Life is supposed to work.