The SQL of Scrabble and Rapping

In which Phil decides to use a table consisting of all the common words in English to explore ways of cheating at Scrabble and writing doggerel using SQL Server. He then issues a SQL challenge.

Preamble

Sometimes, when you tackle a different problem in SQL, one can hit on techniques that come in handy in all sorts of other contexts.  This is the theme of this article, which uses a bank of all common words of the English language. 

Cheating at Scrabble

 I was playing Scrabble the other day and faced, as usual with an impossible hand. Scrabble is, as you probably know, a game where you pick tiles from a sack, and each tile has a letter written on it. You are allowed up to seven tiles at a time, and you have to place these, made up into words, on a game board marked with a 15-by-15 grid. The words are formed across and down in crossword fashion, and must be real words in common use. Each letter is scored individually, but the score is boosted by special squares on the board that give you double, or triple, word, or letter, scores.

It occurred to me, as it must have done to many others, that one could cheat at this game with a surreptitious iPhone or iPod and a simple word bank.

If you based this application on SQL Server, using a simple HTML interface, it would be easy to find all the words that could be made up from your seven tiles. Because you will need to link in with another word, that will come to eight letters. In some denser games, even more than eight-letter words are made as more existing words are crossed.

The first exercise will be to find all the permutation of the characters in your hand. Actually, if you are being subtle, you can restrict yourself to a subset, since only a small number of the combination of vowels and consonants are actually allowed in English. We won’t be subtle here: we’ll use the brute-force attack to the problem. The simplest way is to do a series of joins to a table of letters, but I’ll try a more flexible approach that uses a variation of the card-sharper’s shuffle instead. To do this, you will firstly need a number table. This number table is used frequently in SQL and you may already have one in your test database.  If you haven’t, this stored procedure will deal with the task of creating and populating the number table.

With these basics out of the way, we can now create the Table-Valued Function that returns the permutations in sequence of the characters you give it. This will work with any ASCII character and can be altered to deal with unicode, of course. All we are doing here is creating an empty table of the right size and then filling it with all the permutations of the characters you supply to the function. Normally, permutations will be done with a series of self-joins to a table with all the characters, one per row, but here we want something that will be useful when you do not know the length of the string in advance. (Permutations are great for doing many of the ‘graph’ problems, such as finding the shortest route, network routing, or  time-tabling)

Now we have this working, we’ll need all the common words of English. We first create the table of words and stock it from our word bank that I provide in the downloads at the bottom of the article.

You would have to alter the path to the word list before using this, of course.  I usually find that my seven tiles consist mostly of vowels so I couldn’t resist extracting lists of handy words with at least four consecutive vowels, just to make sure that everything was loaded properly

Though my favourite has five consecutive vowels

Good grief, if you are short of vowels, there is plenty you can do, as there are words with seven or more consonants in a row (if you include Y as a consonant) …

Now we have a function that will provide a table with all the permutation of between two and eight letters, we can use it to find those common words that can be built with your list of letters. Here we have an effective way of cheating at Scrabble. You’d have to use something like iWebKit to knock together a little application.

Rapping and doggerel.

Rhyming dictionaries aren’t new. they are simply dictionaries that are ordered by the word written backwards. It starts with Baa and ends in Fuzz. The most famous one is probably Walkers Rhyming Dictionary. Every poet has one. Whilst they are useful, they are a bit hit and miss to use. We’ll

 be slightly more ambitious and try to give to a better rhyme. We’ll extract up to two of the final syllables of the word and match them to all  other words with the same two syllables. We are actually not getting syllables as such but the sonorant/coda combinations. (syllables usually have an initial consonant) This seems to get a better set of rhymes

So, for Phil Factor, we have the rhymes …

actor,benefactor,chiropractor,contractor,detractor,extractor,malefactor,protractor,subcontractor,tractor

… We could soon be rapping with this lot

So as to get a quick response, and keep the code manageable, we’ll create a special table for our rhyming dictionary.

Now, we are in the position to create a function that takes any word and gives you the rhymes to it. You need to beware, because I have not yet programmed in automatic translation of homophones. English spelling is so inconsistent that rhyming dictionaries will never tell you all the rhymes. You have to use some artistry to get the best out of a Rhyming dictionary. ‘Rhyme’, rhymes with ‘chime’, even though the word endings are spelt differently. You need to search all the alternative spellings of the final syllable to get all the rhymes.

No subtlety could, for a moment, attract her,
Until she succumbed to the charms of Phil Factor

You see? No easy programming would have given you that rhyme.

Rap up

So here we are with a word-bank that allows you to cheat at Scrabble and rap, or make up doggerel. More to the point, it has illustrated, in the ‘permutation’ function, how to use a numbers table to create a table, and the ‘quirky update’ method of filling a table with permutation, or any other data you need. We’ve also illustrated some techniques of using the built-in, and rather primitive, character pattern-matching techniques of SQL Server.

A Parting Competition

To end up with: here is a simple competition, that I will award a Christmas prize of a $50 Amazon voucher for.

Given that Scrabble is scored to the following table:

  • How fast can you score all common words according to their Scrabble scores, so as to list them in order.?
  • Find all the words that can’t ever be used in Scrabble? (Blank tiles can mean any letter).   Scrabble contains…
    •     2 blank tiles (scoring 0 points) that can represent any letter
    •     1 point: E ×12, A ×9, I ×9, O ×8, N ×6, R ×6, T ×6, L ×4, S ×4, U ×4
    •     2 points: D ×4, G ×3
    •     3 points: B ×2, C ×2, M ×2, P ×2
    •     4 points: F ×2, H ×2, V ×2, W ×2, Y ×2
    •     5 points: K ×1
    •     8 points: J ×1, X ×1
    •     10 points: Q ×1, Z ×1

Downloads

Tags: , ,

  • 19784 views

  • Rate
    [Total: 0    Average: 0/5]
  • Peso

    Scrabble

    This returns all word in wordlist file in less than 2 seconds.

    CREATE TABLE #Scrabble
      
    (
        
    Letter CHAR(1) PRIMARY KEY CLUSTERED,
        
    Items TINYINT NOT NULL,
        
    Points SMALLINT NOT NULL
       )

    INSERT #Scrabble
    (
      
    Letter,
      
    Items,
      
    Points
    )
    VALUES (‘%’, 2, 0), (‘A’, 9, 1), (‘B’, 2, 3), (‘C’, 2, 3), (‘D’, 4, 2), (‘E’, 12, 1), (‘F’, 2, 4), (‘G’, 3, 2), (‘H’, 2, 4),
    (
    ‘I’, 9, 1), (‘J’, 1, 8), (‘K’, 1, 5), (‘L’, 4, 1), (‘M’, 2, 3), (‘N’, 6, 1), (‘O’, 8, 1), (‘P’, 2, 3), (‘Q’, 1, 10),
    (
    ‘R’, 6, 1), (‘S’, 4, 1), (‘T’, 6, 1), (‘U’, 4, 1), (‘V’, 2, 4), (‘W’, 2, 4), (‘X’, 1, 8), (‘Y’, 2, 4), (‘Z’, 1, 10)

    ;WITH cteYak(Word, Score, Pooled, Invalid)
    AS (
    SELECT    y.Word,
        
    SUM(s.Points * CASE WHEN y.Items <= s.Items THEN y.Items ELSE s.Items END) AS Score,
        
    SUM(CASE WHEN y.Items > s.Items THEN y.Items s.Items ELSE 0 END) AS Pooled,
        
    SUM(CASE WHEN s.Letter IS NULL THEN 1 ELSE 0 END) AS Invalid
    FROM    (
          
    SELECT    w.Word,
              
    SUBSTRING(w.Word, n.Number, 1) AS Letter,
              
    COUNT(*) AS Items
          
    FROM    dbo.WordList AS w
          
    INNER JOIN dbo.Numbers AS n ON n.Number BETWEEN 1 AND LEN(w.Word)
        
    GROUP BY w.Word,
              
    SUBSTRING(w.Word, n.Number, 1)
         )
    AS y
    LEFT JOIN   #Scrabble AS s ON s.Letter = y.Letter
    GROUP BY    y.Word
    )
    SELECT   Word,
      
    Score,
      
    CASE
        
    WHEN Pooled > 2 OR Invalid > 0 THEN ‘Not a valid Scrabble word.’
        
    ELSE ‘Valid Scrabble word’
      
    END AS Valid
    FROM   cteYak
    ORDER BY  Score DESC,
      
    Word

    DROP TABLE  #Scrabble

  • Anonymous


    This is mutant smart…

  • jenniebee

    fun
    –My Numbers table is called Common_Increment. The eureka moment for me was realizing that I didn’t have to actually find the letter placement for the blanks, I could just go with sums.
    IF OBJECT_ID(‘ScrabbleScore’) IS NOT NULL DROP TABLE ScrabbleScore

    CREATE TABLE ScrabbleScore
    (
    Ltr  CHAR(1) PRIMARY KEY,
    Score INT,
    Quant INT)

    GO

    INSERT INTO ScrabbleScore
      
    SELECT  ‘ ‘, 0, 2
    UNION  SELECT  ‘E’, 1, 12
    UNION  SELECT  ‘A’, 1, 9
    UNION  SELECT  ‘I’, 1, 9
    UNION  SELECT  ‘O’, 1, 8
    UNION  SELECT  ‘N’, 1, 6
    UNION  SELECT  ‘R’, 1, 6
    UNION  SELECT  ‘T’, 1, 6
    UNION  SELECT  ‘L’, 1, 4
    UNION  SELECT  ‘S’, 1, 4
    UNION  SELECT  ‘U’, 1, 4
    UNION  SELECT  ‘D’, 2, 4
    UNION  SELECT  ‘G’, 2, 3
    UNION  SELECT  ‘B’, 3, 2
    UNION  SELECT  ‘C’, 3, 2
    UNION  SELECT  ‘M’, 3, 2
    UNION  SELECT  ‘P’, 3, 2
    UNION  SELECT  ‘F’, 4, 2
    UNION  SELECT  ‘H’, 4, 2
    UNION  SELECT  ‘V’, 4, 2
    UNION  SELECT  ‘W’, 4, 2
    UNION  SELECT  ‘Y’, 4, 2
    UNION  SELECT  ‘K’, 5, 1
    UNION  SELECT  ‘J’, 8, 1
    UNION  SELECT  ‘X’, 8, 1
    UNION  SELECT  ‘Q’, 10, 1
    UNION  SELECT  ‘Z’, 10, 1

    ;WITH SelectWord (String, Ltr, Ord)
    AS (SELECT String, SUBSTRING(String, i, 1) , n.i
      
    FROM Allwords
      
    JOIN Common_increment n
        
    ON n.i <= LEN(String)),
    SubBlanks (String, Ltr, Blanks, SubtractScore)
    AS (SELECT w.String, w.Ltr, CASE WHEN w.Occ < s.Quant THEN 0 ELSE w.Occ s.Quant END Blanks,
      
    CASE WHEN w.Occ < s.Quant THEN 0 ELSE (w.Occ s.Quant) * s.Score END SubtractScore
    FROM (SELECT String, Ltr, COUNT(1) Occ FROM SelectWord GROUP BY String, Ltr) w
    JOIN ScrabbleScore s
      
    ON w.Ltr = s.Ltr),
    GetScore (String, Score)
    AS (
    SELECT w.String, SUM(s.Score)
    FROM SelectWord w
    JOIN ScrabbleScore s
    ON w.Ltr= s.ltr
    GROUP BY w.String)
    SELECT gs.String, gs.Score sb.SubtractScore Score, np.AllBlanks, CASE WHEN np.AllBlanks < 3 THEN ELSE ‘invalid’ END valid
    FROM GetScore gs
    JOIN (SELECT string, SUM(SubtractScore) SubtractScore FROM SubBlanks GROUP BY string) sb
    ON gs.String = sb.String
    JOIN (SELECT String, SUM(Blanks) AllBlanks
      
    FROM SubBlanks
      
    GROUP BY String) np
    ON gs.String = np.String
    ORDER BY gs.Score sb.SubtractScore DESC

  • jenniebee

    oops
    hit paste twice. WTB edit button… (We fixed this : Ed)

  • Anonymous

    ouch
    my brain just exploded.

  • Anonymous

    The permutation function seems pretty complex why not use this one?
    IF  EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[Strings]’) AND TYPE IN (N’U’))
    DROP TABLE [dbo].[Strings]
    GO

    CREATE TABLE [dbo].[Strings](
    [String] [varchar](50) NOT NULL,
    CONSTRAINT [PK_Strings] PRIMARY KEY CLUSTERED
    (
    [String] ASC
    )
    )
    ON [PRIMARY]

    GO

    –Will put everything into strings table
    CREATE PROCEDURE dbo.spPermutationsOf (
    @Prefix AS VARCHAR(MAX), –recursive parameter, enter ” for first instance
    @Suffix AS VARCHAR(MAX), –enter the origninal string to be permutated,
    @AllChar AS bit
    )
    AS
    BEGIN
    –Can run up to limit of recursion, 32 character in my case but 32 characters equals 32! permutations = be patient….
    DECLARE @NewString AS VARCHAR(MAX)
    DECLARE @Index AS INT
    DECLARE
    @NewPrefix AS VARCHAR(MAX)
    DECLARE @NewSuffix AS VARCHAR(MAX)
    DECLARE @SuffixLength AS INT
    SET
    @SuffixLength=LEN(@Suffix)
    IF @SuffixLength=1
      
    BEGIN
         SET
    @NewString=@Prefix+RIGHT(@Suffix,1)
        
    IF NOT EXISTS (SELECT 1 FROM Strings WHERE String=@NewString) –remove this check to get same letter permutations, will also run faster
          
    BEGIN
             INSERT INTO
    Strings (String) VALUES (@NewString)
          
    END
       END
    ELSE
       BEGIN
        
    –Test:Lou
        
    SET @Index=1
        
    SET @NewSuffix=SUBSTRING(@Suffix,2,@SuffixLength) –1 run: ou, 2nd run: u
        
    SET @NewPrefix=@Prefix+LEFT(@Suffix,1) — L, Lo
        
    WHILE @Index<=@SuffixLength
          
    BEGIN
             EXEC
    dbo.spPermutationsOf @NewPrefix,@NewSuffix,@AllChar
            
    IF @AllChar=0 EXEC dbo.spPermutationsOf ,@NewSuffix,0 –to also put substring
            
    SET @Suffix=SUBSTRING(@Suffix,2,@SuffixLength)+LEFT(@Suffix,1) –ouL
            
    SET @NewSuffix=SUBSTRING(@Suffix,2,@SuffixLength) –uL
            
    SET @NewPrefix=@Prefix+LEFT(@Suffix,1) –o
            
    SET @Index=@Index+1
          
    END
       END
    END

  • Phil Factor

    Re: The permutation function seems pretty complex why not use this one?
    The reason that I chose the approach that I illustrate is that I was aiming for speed rather than simplicity. For eight letters, I timed yours at 58 seconds. Mine goes in at 763 Ms. (less than a second) However, I certainly agree that yours is easier to understand!

  • Peso

    PermutationsOf

    Here is another approach for the function

    DECLARE  @Word VARCHAR(10) = ‘Peter’

    ;WITH cteYak(Word, Letters)
    AS (
    SELECT  CAST(SUBSTRING(@Word, Number, 1) AS VARCHAR(MAX)) AS Word,
      
    STUFF(@Word, Number, 1, ) AS Letters
    FROM  dbo.TallyNumbers
    WHERE Number BETWEEN 1 AND LEN(@Word)

    UNION ALL

    SELECT    Word + SUBSTRING(y.Letters, n.Number, 1) AS Word,
        
    STUFF(y.Letters, n.Number, 1, ) AS Letters
    FROM    cteYak AS y
    INNER JOIN dbo.TallyNumbers AS n ON n.Number BETWEEN 1 AND LEN(y.Letters)
    )
    SELECT DISTINCT Word
    FROM cteYak
    WHERE  LEN(Word) = LEN(@Word)

  • Peso

    Not valid Scabble words
    These are the words that are not valid Scrabble words in the supplied wordlist

    razzamatazz
    razzmatazz
    knickknacks
    knickknack
    pizzazz

  • Carol

    Scrabble
    Thank you, Phil. My husband and I enjoy frequent Scrabble challenges, and while I’m a better speller he’s more crafty. I’ll have to show him this code – he will think it’s cool!

  • Joe Celko

    Remember Moby Words?
    Does anyone else rember teh episode of THE SIMPSONS where Homer is playing Scrabble with Lisa and the word “oxidize’ is sitting in front of him?

    A million years ago in the CP/M days long ago, The Austin Codeworks published a huge file of English words under the name “Moby Words” for just such puzzles. It is still out there in .zip files for download. Hooray!

    Now we need a procedure to delete all the non-Scrabble-able words in the list. That is uglier than it looks … so I wave my hands for now.

    I also have Fike’s algorithm for permutations around here somewhere, but try this approach:

    CREATE TABLE Wordlist
    (word VARCHAR(25) NOT NULL PRIMARY KEY,
    scrabble_score INTEGER NOT NULL
    CHECK (scrabble_score > 0),
    word_length INTEGER NOT NULL
    CHECK (word_length > 0),
    sorted_word VARCHAR(25) NOT NULL);

    word is the original word. The scrabble_score is the point value of the word and word length explains itself. The sorted_word column is a new “word” made by sorting the letters of the original word.

    Input the letters in your tile rack. Query up all subsets of 7, 6, 5, 4 and 3 letters. Sort each subset, which will give us (1 + 7 + 21 + 35+ 35) = 99 strings to match to (word_length, sorted_word) in the word list table.

    I will try to play with this if I can get my head above water over the holidays.

  • Anonymous

    Scrabble
    Just buy the Franklin scrabble computer for $50.00. It also accounts for the blanks. http://www.franklin.com/estore/dictionary/SCR-228/

  • Anonymous

    Franklin device
    But that’s like saying you could buy your rogue nuclear device on the black market and avoid the glowing fun of building your own.

  • Dan Newton

    Great minds think alike!
    I did the same thing for myself, when I play my wife on Facebook’s Scrabble. But she always knows when I’m cheating because I can’t resist the urge to use some ridiculous word that neither of us has ever heard of.

  • Phil Factor

    Cheating at Scrabble
    Various members of my family used to believe that it was fair and polite to play the game with a copy of a large dictionary on their lap. It always slowed the game to a crawl and meant the use of extraordinarily silly words and spellings. No, best play without props, and to a time limit.

  • Marc

    FROM clause
    In running the code, the Inserts that use the syntax “FROM ( VALUES (1), (2), (3), … ) as X(i)” gives me an error: “Incorrect syntax near the keyword ‘values’.” Is this sample code for SQL Server 2008?

  • Peso

    2008 compi
    Yes. Yoy have to rewrite that part to
    SELECT 1 AS i UNION ALL SELECT 2 UNION ALL…

  • Phil Factor

    Re: FROM Clause
    You can easily change this into a more ‘classic’ syntax by using a UNION clause

    Declare @Digits table (i int)
    Insert into @Digits(i)
    Select 1 union select 2 union select 3
    union select 4 union select 5
    union select 6 union select 7
    union select 8 union select 9 union select 0
    INSERT INTO numbers(number)
    SELECT (D5.i*100000 + D4.i*10000 + D3.i * 1000 + D2.i * 100
    + D1.i * 10 + D0.i + 1) AS seq
    FROM @Digits AS D0,@Digits AS D1,@Digits AS D2,@Digits AS D3,
    @Digits AS D4,@Digits AS D5

  • Phil Factor

    Peso and I sang a duet
    Oops. Sorry, Peso, we answered that together!

  • Peso

    Duet
    I don’t mind at all!