Programming challenge: Which words have the most anagrams?

September 18th, 2013 | Categories: Problem of the week, programming | Tags:

While on the train to work this morning, I wondered which English words have the most anagrams that are also valid English words. So, I knocked up few lines of Mathematica code and came up with 4 sets of 7:

{{"ates", "east", "eats", "etas", "sate", "seat", "teas"},
{"pares", "parse", "pears", "rapes", "reaps", "spare", "spear"}, 
{"capers", "crapes", "pacers", "parsec", "recaps", "scrape", "spacer"},
{"carets", "caster", "caters", "crates", "reacts", "recast", "traces"}}

So, according to my program (and Mathematica’s dictionary), the best you can do is 7.  I’m not going to post the code until later because I don’t want to influence the challenge which is ‘Write a program in your language of choice that queries a dictionary to find which English words have the most anagrams that are also valid English words.’

  1. September 18th, 2013 at 16:29
    Reply | Quote | #1

    Here are a couple of ways of doing it in Python: https://gist.github.com/christianp/6610845.

    Because my wordlist has loads of not-really-words in it, I get:
    ['apers', 'apres', 'asper', 'pares', 'parse', 'pears', 'prase', 'presa', 'rapes', 'reaps', 'spare', 'spear']

  2. Thales Fernandes
    September 18th, 2013 at 17:14
    Reply | Quote | #2

    You forgot {“least”, “slate”, “Stael”, “stale”, “steal”, “tales”, “teals”, “Tesla”}
    Or names don’t count? If not we get your solution, but a 1 set of 8 words exist. :)

    Code Bellow:

    char = CharacterRange["a", "z"];
    char = char ~ Join ~ ToUpperCase@char;
    n = Length@char;
    WordToNumber[str_String] := Position[char, #] & /@ StringSplit[str, ""] // Flatten // Sort // FromDigits[#, n] &
    SetAttributes[WordToNumber, {Listable}];

    words = ToLowerCase /@ DictionaryLookup[{"English", All}] // DeleteDuplicates;
    wordnumb = WordToNumber@words;

    tally = Transpose@Tally@wordnumb;
    count = Last@tally;
    vals = First@tally;

    pos = Extract[vals, Position[count, Max@count]];
    wordPos = (Position[wordnumb, #] & /@ pos);
    wordAnagram = Extract[words, #] & /@ wordPos;
    wordAnagram /. str_String :> First@DictionaryLookup[str, IgnoreCase -> True]

    * Bonus Function (Slow as hell for big words, 1st function try, LOL)
    Anagram[str_String] := Select[StringJoin /@ Permutations@StringSplit[str, ""], Length@DictionaryLookup@# > 0 &]

  3. Thales Fernandes
    September 18th, 2013 at 17:30
    Reply | Quote | #3

    @Thales Fernandes
    Improved version for non-english words. In Portuguese (my language) we have an anagram of 15 words!

    {“aporeis”, “aproeis”, “aprosei”, “arpoeis”, “esporai”, “operais”, “opiares”, “poeiras”, “Poiares”, “posarei”, “proseai”, “proseia”, “raposei”, “repoisa”, “sopeira”}

    {“arrastei”, “arrestai”, “arretais”, “arriaste”, “aterrais”, “atirares”, “iteraras”, “rasteira”, “rastreai”, “rastreia”, “ratareis”, “restaria”, “retraias”, “tarareis”, “traseira”}

    {“arroteis”, “estoirar”, “ostreira”, “rasteiro”, “rastreio”, “reitoras”, “retroais”, “rotareis”, “soterrai”, “tesoirar”, “torareis”, “torreais”, “torreias”, “traseiro”, “troareis”}

    Code:
    Clear[words, lang, char, n, WordToNumber, wordnumb, tally, count, vals, pos, wordPos, wordAnagram, Anagram]

    lang = “Portuguese”;

    Anagram[str_String] := Select[StringJoin /@ Permutations@StringSplit[str, ""], Length@DictionaryLookup@# > 0 &]

    words = ToLowerCase /@ DictionaryLookup[{lang, All}] // DeleteDuplicates;
    Print["Number of words to test (", lang, "): ", Length@words];

    char = Union@StringSplit[StringJoin@words, ""];
    n = Length@char;
    Print["Character used(",n,"): ", StringJoin@char];

    WordToNumber[str_String] := Position[char, #] & /@ StringSplit[str, ""] // Flatten // Sort // FromDigits[#, n] &
    SetAttributes[WordToNumber, {Listable}];

    wordnumb = WordToNumber@words;

    tally = Transpose@Tally@wordnumb;
    count = Last@tally;
    vals = First@tally;
    Print["Word with maximum number of anagrams: ", Max@count];

    pos = Extract[vals, Position[count, Max@count]];
    wordPos = (Position[wordnumb, #] & /@ pos);
    wordAnagram = Extract[words, #] & /@ wordPos;
    wordAnagram /. str_String :> First@DictionaryLookup[{lang, str}, IgnoreCase -> True]

  4. Szabolcs
    September 18th, 2013 at 18:59
    Reply | Quote | #4

    Mathematica, without much thinking:

    SortBy[GatherBy[DictionaryLookup["*"],
    Sort@Characters@ToLowerCase[#] &], Length][[-10 ;;]]

    It gives the 10 longest lists. Should be a good exercise for other languages, which I’ll try later.

  5. Mike Croucher
    September 18th, 2013 at 19:09
    Reply | Quote | #5

    OK, so its not quite a week but here’s mine. Much longer than @Szabolcs!

    (*Given a word, this function finds all permutations that are also \
    words*)
    wordperms = Flatten[
    Map[DictionaryLookup, Map[StringJoin, Permutations[Characters[#]]]]
    ] &;
    words = DictionaryLookup[]; (*Get all words from the dictionary*)
    sorted = Sort[
    Map[Sort,
    Map[Characters,
    words]]]; (*Sort letters of each word into alphabetical order \
    and then sort the resulting words*)
    found = Part[Select[Tally@sorted, Part[#, 2] > 6 &], All,
    1]; (*Select all lists that are repeated more than 6 times. More \
    than 7 gives an empty list*)
    Map[wordperms, found]

    I missed {“least”, “slate”, “Stael”, “stale”, “steal”, “tales”, “teals”, “Tesla”} because I didn’t consider case.

  6. September 18th, 2013 at 21:54
    Reply | Quote | #6

    words = {#, Sort[Characters[ToLowerCase[#]]]} & /@ DictionaryLookup[];
    words = SortBy[GatherBy[words, Last], Length][[All, All, 1]];
    words[[-20 ;;]] // Grid

    Same idea as Szabolcs

  7. Szabolcs
    September 19th, 2013 at 17:17
    Reply | Quote | #7

    So far there are five Mathematica solutions and a single Python one. I would like to see solutions in other languages, in particular R. People on Unix based platforms can use /usr/share/dict/words as the database.

  8. September 20th, 2013 at 16:37
    Reply | Quote | #8

    organ has 9 anagrams: rogan, ronga, orang, nagor, groan, grano, goran, argon, angor
    reset has 8 anagrams: reest, estre, ester, stree, stere, steer, terse, tsere
    trace has 8 anagrams: react, recta, crate, creta, creat, cater, carte, caret
    plate has 7 anagrams: pleat, palet, patel, pelta, petal, leapt, tepal

    This is using Ruby with words of length 7 or smaller.

    WORDS = File.readlines(‘/usr/share/dict/words’).select{|w| w.length < 7 }
    puts "There are #{WORDS.count} WORDS in the dictionary."
    WORDS_HASH = Hash[WORDS.map{|w| [w.strip,w.strip]}]

    results = []

    class Result
    include Comparable

    attr_reader :original_word, :valid_anagrams, :size

    def initialize(original_word, valid_anagrams)
    @original_word = original_word
    @valid_anagrams = valid_anagrams
    end

    def other
    self.size other.size
    end

    def size
    valid_anagrams.count
    end

    end

    n = 0

    for word in WORDS
    word.strip!
    word.downcase!
    word_chars = word.chars
    word_chars_a = word_chars.to_a
    permutations = word_chars_a.permutation(word.length)
    permutations = permutations.map(&:join)
    valid_anagrams = []
    permutations.each do |perm|
    valid_anagrams < 0
    results <= 25
    #break
    end
    permutations = nil
    word = nil
    valid_anagrams = nil
    word_chars = nil
    word_chars_a = nil
    end

    results.sort.reverse.take(30).each do |result|
    puts “#{result.original_word} has #{result.size} anagrams: #{result.valid_anagrams.join(‘, ‘)}”
    end

  9. September 20th, 2013 at 16:39
    Reply | Quote | #9

    Looks like some characters and formatting are lost. There is a gist here: https://gist.github.com/seanhandley/48d4e42c90b533f6d966

    Still needs some work with regard to efficiency!

  10. Globules
    September 22nd, 2013 at 07:56

    Here’s a Haskell version. (I don’t consider case.)

    import Control.Arrow
    import Data.Function
    import Data.List
    import Data.Ord
    import System.Environment
    import qualified Data.Map as M
    import qualified Data.Set as S

    anagrams :: [String] -> [[String]]
    anagrams = map (S.toList . snd) . M.toList
    . M.fromListWith S.union
    . map (sort &&& S.singleton)

    rank :: [[a]] -> [[a]]
    rank = sortBy (compare `on` (Down . length))

    mostPopular :: [[a]] -> [[a]]
    mostPopular = concat . take 1 . groupBy ((==) `on` length)

    mostAnagrams :: [String] -> [[String]]
    mostAnagrams = mostPopular . rank . anagrams

    main :: IO ()
    main = do
    args readFile path >>= mapM_ print . mostAnagrams . words
    _ -> error “Supply one argument giving the path to a dictionary.”

    When run on MacOS I get:

    $ ./anawords /usr/share/dict/words
    ["caret","carte","cater","crate","creat","creta","react","recta","trace"]
    ["angor","argon","goran","grano","groan","nagor","orang","organ","rogan"]
    ["ester","estre","reest","reset","steer","stere","stree","terse","tsere"]
    $

  11. September 23rd, 2013 at 06:22

    listIta = {Sort[Characters[#]], #} & /@ DictionaryLookup[{"Italian", All}];
    Reverse[Sort[Length /@ GatherBy[listIta, First]]][[1 ;; 20]]
    Reverse[Sort[GatherBy[listIta, First]]][[1 ;; 3, All, 2]]

    Result
    {{“aperti”, “aprite”, “pareti”, “patire”, “patrie”, “perita”,
    “pietra”, “rapite”, “ripeta”}, {“avresti”, “restavi”, “servita”,
    “stivare”, “svitare”, “versati”, “viraste”, “vistare”}, {“parasti”,
    “piastra”, “rapasti”, “raspati”, “satrapi”, “sparati”, “sparita”,
    “spirata”}}

  12. September 27th, 2013 at 15:04

    Here’s how I did it in MATLAB, which gave me the same 10 “apers”:

    wordList = urlread(‘http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt’);
    words = regexp(wordList, ‘\s+’, ‘split’);
    map = containers.Map(‘KeyType’, ‘char’, ‘ValueType’, ‘any’);
    for i = 1:numel(words)
    key = sort(lower(words{i}));
    if isKey(map, key)
    map(key) = [map(key) words(i)];
    else
    map(key) = words(i);
    end
    end
    vals = values(map);
    counts = cellfun(@numel, vals);
    maxCount = max(counts);
    vals{counts == maxCount}

  13. October 10th, 2013 at 17:11

    A bash one-liner (the line got a bit longer than I intended) It prints out any sets of anagrams with 10 or more words.

    bash-4.2$ c=0; while read l; do echo `echo $l | fold -1 | sort ` @ $l;done < /usr/share/dict/words |sort | while read x; do oldk=$k;k=${x%%@*}; if [ "$k" = "$oldk" ] ; then c=$(($c + 1)); w="$w ${x#*@}"; else if [ $c -gt 9 ]; then echo $c $w; fi; c=1; w=${x#*@}; fi; done

    14 anestri antsier asterin eranist nastier ratines resiant restain retains retinas retsina stainer starnie stearin
    11 alerts alters artels estral laster lastre rastle ratels relast resalt salter
    12 astel least salet setal slate stale steal stela taels tales teals tesla
    10 aster astre rates reast resat serta stare strae tares tarse
    10 estrous oestrus ousters sestuor sourest souters stoures toruses trousse tussore

    so /usr/share/dict/words claims 14 (somewhat dubious:-) anagrams for nastier

  14. Brendan Babb
    November 19th, 2013 at 11:22

    This problem reminds me of playing Boggle: http://en.wikipedia.org/wiki/Boggle. Where if you can spell A R T from a triangle of letters, then you can spell R A T and T A R. I am not sure how you would do it, but could you modify the programs to look for a long word that would give you other smaller words in Boggle. I know it would really depend on the layout of the Boggle dice, and you would have less anagrams than without the Boggle constraint.