|
Prev: Publicly flog David Filmer for writing multi-post flagging'bot
Next: Cygwin error regarding profile.global
From: cms-team on 28 Aug 2006 06:31 Hy! I've got the following code and would like to ask, if there is any possibilty for more performance. The script inserts about 240 recordsets per second into a MySQL database. The file (input.txt) contains about 365.000 lines/recordsets. ==================================================== # Grundwerte initialisieren require "/home/vip/my.properties"; $IMPDAT="$HOME_HOST/input.txt"; if (!open (FP1,"<$IMPDAT")) { print ("$IMPDAT nicht vorhanden!\n"); exit(3); } #Mit der DB verbinden $rc=&connect_to_db(); $linecount = 0; # Alte Tabelleninhalte vorher löschen $statment="DROP TABLE IF EXISTS `testtabelle`"; $rv=$dbh->do($statment); $statment="CREATE TABLE `testtabelle` (`apl` varchar(2) NOT NULL default '',`vsnr` varchar(11) NOT NULL default '',`pvsnr` varchar(8) NOT NULL default '',`art` char(1) NOT NULL default '',`aktiv` char(1) NOT NULL default '',`vmnr` varchar(4) NOT NULL default '',`vbeginn` varchar(8) NOT NULL default '',`vende` varchar(8) default NULL,`zw` int(1) NOT NULL default '0',`ikm` char(1) NOT NULL default '',`ktoidx` char(1) default NULL,`nettobeitrag` decimal(9,2) NOT NULL default '0.00',`bruttoratenbeitrag` decimal(9,2) NOT NULL default '0.00',`produkttext` varchar(150) NOT NULL default '',`rohbau` varchar(6) default NULL,PRIMARY KEY (`vsnr`),KEY `pvsnr` (`pvsnr`),KEY `vm` (`vmnr`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;"; $rv=$dbh->do($statment); # Nimmt die Startzeit der Anwendung $startzeit = time; # Zeilenweise Auslesen der Datei while (<FP1>) { $line = $_; chop ($line); $linecount += 1; $line=~s/\"//g; ($apl,$vsnr1,$vsnr2,$pvsnr,$art,$aktiv,$vmnr,$vbeginn,$vende,$zw,$ikm,$ktoidx,$nettobeitrag,$bruttoratenbeitrag,$produkttext,$rohbau) = split(/;/,$line); $vende=~s/ *$//g; $vende=~s/,/\./g; $ktoidx=~s/ *$//g; $ktoidx=~s/,/\./g; $nettobeitrag=~s/ *$//g; $nettobeitrag=~s/,/\./g; $bruttoratenbeitrag=~s/ *$//g; $bruttoratenbeitrag=~s/,/\./g; $produkttext=~s/ *$//g; $produkttext=~s/,/\./g; $rohbau=~s/ *$//g; $rohbau=~s/,/\./g; $statment="INSERT into testtabelle (apl,vsnr,pvsnr,art,aktiv,vmnr,vbeginn,vende,zw,ikm,ktoidx,nettobeitrag,bruttoratenbeitrag,produkttext,rohbau) value ('$apl','$vsnr1$vsnr2','$pvsnr','$art','$aktiv','$vmnr','$vbeginn',".($vende eq "" ? "NULL" : "'".$vende."'").",$zw,'$ikm',".($ktoidx eq "" ? "NULL" : "'".$ktoidx."'").",$nettobeitrag,$bruttoratenbeitrag,'$produkttext',".($rohbau eq "" ? "NULL" : "'".$rohbau."'").")"; $rv=$dbh->do($statment); } $rc = $dbh->disconnect; close(FP1); # Zeit ausgeben &Dauer; $est = $est == 0 ? 1 : $est; $dsek = int($linecount/$est); print "Das Script 'laden_all_daten.pl' hat fuer ".punktuiere($linecount)." Datensaetze insgesamt ".$dauer." Minuten gebraucht (".$dsek." Datensaetze pro Sekunde)\n"; unlink($TEMPDAT); exit(0); ==================================================== Where can I make this thing run a little faster? Thx in advance! Greetings Marcus Dau
From: Marc Espie on 28 Aug 2006 07:09 In article <1156761108.773369.258590(a)m79g2000cwm.googlegroups.com>, <cms-team(a)ivi.de> wrote: >Hy! > >I've got the following code and would like to ask, if there is any >possibilty for more performance. The script inserts about 240 >recordsets per second into a MySQL database. > >The file (input.txt) contains about 365.000 lines/recordsets. > >Where can I make this thing run a little faster? > Turn autocommit off, and commit just once every n transactions. Not sure how good it will be with mysql, but sqlite loves it quite a bit. Otherwise, use some other database engine. mysql is simple to set up, but definitely not the fastest.
From: cms-team on 28 Aug 2006 08:16 Thx a lot. This does the trick. Performance went up from 50 minutes runtime for 30 scripts (tables) to 10 minutes!!! Greetings Marcus
From: goho on 28 Aug 2006 09:54 cms-team(a)ivi.de wrote: > Thx a lot. This does the trick. > > Performance went up from 50 minutes runtime for 30 scripts (tables) to > 10 minutes!!! > > Greetings > Marcus Your problem doesn't have a lot to do with perl; I suggest you look towards the mysql group, however, that said: I've not used MySQL but from experience with other DBMS you may get significant improvement by creating the pvsnr and vm keys (indexes) AFTER loading the data. In some databases it helps to offline pre-sort in order of the primary key - try it. Without knowing the internal structure of InnoDB I cant say whether it will improve things or be a disaster but you wont know until you try. Be aware of ROLLBACK issues if you dont regularly commit; not a subject for discussion here, I'll leave that for you to follow up. If you are looking for a book, Michael Kofler's 'The Definitive Guide to MySQL' looks pretty reasonable Good luck.
From: Paul Lalli on 28 Aug 2006 10:18 cms-team(a)ivi.de wrote: > I've got the following code and would like to ask, if there is any > possibilty for more performance. The script inserts about 240 > recordsets per second into a MySQL database. <snip> > while (<FP1>) > { > $line = $_; Why make two assignments? while (my $line = <FP1>) { > chop ($line); You want chomp(), not chop(). There is a difference. Look at perldoc -f chomp and perldoc -f chop > $linecount += 1; No need for this variable. Perl maintains the linecount in the $. variable for you. > > $line=~s/\"//g; > ($apl,$vsnr1,$vsnr2,$pvsnr,$art,$aktiv,$vmnr,$vbeginn,$vende,$zw,$ikm,$ktoidx,$nettobeitrag,$bruttoratenbeitrag,$produkttext,$rohbau) > = split(/;/,$line); > > $vende=~s/ *$//g; > $vende=~s/,/\./g; > > $ktoidx=~s/ *$//g; > $ktoidx=~s/,/\./g; > > $nettobeitrag=~s/ *$//g; > $nettobeitrag=~s/,/\./g; > > $bruttoratenbeitrag=~s/ *$//g; > $bruttoratenbeitrag=~s/,/\./g; > > $produkttext=~s/ *$//g; > $produkttext=~s/,/\./g; > > $rohbau=~s/ *$//g; > $rohbau=~s/,/\./g; For readability sake, make this one for loop that iterates over your six variables. For performance sake, change the * to a + and change the second s///g to a tr/// for ($vende, $ktoidx, $nettobeitrag, $bruttoratenbeitrag, $producttext, $rohbau) { s/ +$//g; tr/,/./; } > > $statment="INSERT into testtabelle > (apl,vsnr,pvsnr,art,aktiv,vmnr,vbeginn,vende,zw,ikm,ktoidx,nettobeitrag,bruttoratenbeitrag,produkttext,rohbau) > value > ('$apl','$vsnr1$vsnr2','$pvsnr','$art','$aktiv','$vmnr','$vbeginn',".($vende > eq "" ? "NULL" : "'".$vende."'").",$zw,'$ikm',".($ktoidx eq "" ? "NULL" > : > "'".$ktoidx."'").",$nettobeitrag,$bruttoratenbeitrag,'$produkttext',".($rohbau > eq "" ? "NULL" : "'".$rohbau."'").")"; > > $rv=$dbh->do($statment); Here's a large performance hit. Instead of compiling this SQL each time through, prepare this SQL *once*, outside the loop. Then execute it many times in the loop. #outside the loop: my $statement = <<"EO_SQL"; INSERT INTO testabelle (apl,vsnr,pvsnr,art,aktiv,vmnr,vbeginn,vende,zw,ikm,ktoidx,nettobeitrag,bruttoratenbeitrag,produkttext,rohbau) VALUE (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?); EO_SQL my $sth = $dbh->prepare($statement); #... #later, inside the loop: $sth->execute($apl, $vsnr1 . $vnsr2, $pvsnr, $art, $aktiv, $vmnr, $vbeginn, ($vende eq q{} ? 'NULL' : $vende), $zw, $ikm, ($ktoidx eq q{} ? 'NULL' : $ktoidx), $nettobeitrag, $bruttoratenbeitrag, $produkttext, ($rohbau eq q{} ? 'NULL' : $rohbau)); You'll note that this also saves you the annoyance of having to figure out all those quotes in your original SQL. The previous recommendations given to you are good - committing only after all the inserts will definately save you time - but this should help as well. Paul Lalli
|
Next
|
Last
Pages: 1 2 Prev: Publicly flog David Filmer for writing multi-post flagging'bot Next: Cygwin error regarding profile.global |