|
From: Honest John on 31 May 2006 03:09 I am trying to filter every combination of ten letters through a spell checker, and so find valid english words. I have a problem in that ten factorial, is over three point-six million words to filter, and results in the computer crashing. (The computer is an old 2.66GHz P4 and just not up to the task). Any suggestions how I can save computer resources? #--------%<--------%<----------------------------------------------------------------------------- use Algorithm::Combinatorics qw(permutations); use Text::Aspell; my $speller = Text::Aspell->new; die unless $speller; # Set some options $speller->set_option('lang','en_US'); $speller->set_option('sug-mode','fast'); # check a word my @data = qw(s f e e r t v c o h); my @words = grep $speller->check ( join '', @$_ ), permutations(\@data); print "\nSolution\n"; print map "@$_\n", @words;
From: Christian Winter on 31 May 2006 04:10 Honest John schrieb: > I am trying to filter every combination of ten letters through a spell > checker, and so find valid english words. I have a problem in that ten > factorial, is over three point-six million words to filter, and > results in the computer crashing. (The computer is an old 2.66GHz P4 > and just not up to the task). > > Any suggestions how I can save computer resources? > > #--------%<--------%<----------------------------------------------------------------------------- > use Algorithm::Combinatorics qw(permutations); Maybe using Math::Combinatorics instead could help. If I recall correctly the permutations are calculated with each call to next_permutation. my $inp = Math::Combinatorics->new( [qw(s f e e r t v c o h)] ); while( my @permset = $inp->next_permutation ) { my $searchword = join '', @permset; push @words, $searchword if( $speller->check( $searchword ) ); } HTH -Chris
From: corff on 1 Jun 2006 12:45 Honest John <Hon.JW.Foulfellow(a)gmail.com> wrote: : I am trying to filter every combination of ten letters through a spell : checker, and so find valid english words. I have a problem in that ten : factorial, is over three point-six million words to filter, and : results in the computer crashing. (The computer is an old 2.66GHz P4 : and just not up to the task). : Any suggestions how I can save computer resources? Yes, but a decidedly non-Perlish one: Since most spell-checkers are not rule-based but consult actual dictionaries (try ispell(1) for instance), why don't you simply do a pattern match on the english dictionary of ispell(1) with the search expression /[a-z]{10}/ ? If you put this in a while loop reading your dictionary line by line then it will not make your machine crash. Just my two cents. Oliver. -- Dr. Oliver Corff e-mail: corff(a)zedat.fu-berlin.de
From: Bob Walton on 1 Jun 2006 20:37 Honest John wrote: > I am trying to filter every combination of ten letters through a spell > checker, and so find valid english words. I have a problem in that ten > factorial, is over three point-six million words to filter, and > results in the computer crashing. (The computer is an old 2.66GHz P4 > and just not up to the task). > > Any suggestions how I can save computer resources? > .... John, if I understand what you want to do correctly, you could approach the problem from an entirely different direction: Take a list of valid English words and generate a hash with keys being the sorted letters of the word, and the value being a reference to an array containing the word numbers of all the words which sorted to that key. Then for the word you want to look up (or unscramble), sort the letters and look up in the hash. Here is an example program: use strict; use warnings; print "loading...\n"; my $file='words.txt'; open IN,$file or die "Oops, couldn't open $file, $!"; my $i=0; my @orig; my %sorted; while(<IN>){ chomp; push @orig,$_; push @{$sorted{join '',sort split '',lc $_}},$i++; } close IN; while(1){ print "Enter a scramble: "; my $scr=<>; chomp $scr; last unless $scr; my $sor=join '',sort split '',lc $scr; if(exists($sorted{$sor})){ my @list=@{$sorted{$sor}}; for(@list){ print "$scr -> $orig[$_]\n"; } } } This will build the hash from a list of 234992 words in about 11 seconds on my crappy old 1 GHz computer with 100 MHz FSB, and will then provide "instant" lookup of scrambled words. Sample output: D:\junk\wordlist>perl unscramble.pl loading... Enter a scramble: senator senator -> noreast senator -> rosetan senator -> seatron senator -> senator senator -> treason Enter a scramble: D:\junk\wordlist> Your sample scramble "sfeertvcoh" isn't in my dictionary under any permutation. -- Bob Walton Email: http://bwalton.com/cgi-bin/emailbob.pl
From: J?rgen Exner on 1 Jun 2006 22:57 Honest John wrote: > I am trying to filter every combination of ten letters through a spell > checker, and so find valid english words. I have a problem in that ten > factorial, is over three point-six million words to filter, and > results in the computer crashing. (The computer is an old 2.66GHz P4 > and just not up to the task). > > Any suggestions how I can save computer resources? [...] > my @words = grep $speller->check ( join '', @$_ ), > permutations(\@data); Quite simple, actually. Your algorithm first generates the whole set of permutations, then filters the desired results. Instead just check each candidate as you create it. This way you avoid having to store the whole 10 factorial elements. jue
|
Pages: 1 Prev: xml::twig - writing utf-8 Next: FAQ 4.39 What is the difference between a list and an array? |