From: Shurik on
Hi,

I have a ksh script that runs grep many times (in a loop) on the
same file (a big file, ~7K lines).

Can I speed up the grep command? For example, by loading the file into
memory?

The grep command looks like this:

grep "$FILE_NAME" myfile | read A B

Thanks
From: Eric on
On 2010-04-25, Shurik <shurikgefter(a)gmail.com> wrote:
> Hi
>
> I have ksh script that execute many times grep command ( in loop ) on
> the same file ( big file ~ 7K lines )
>
> Can I improve the grep command ? Like try to load the file to memory .
>
> The grep command like below:
>
> grep "$FILE_NAME" myfile | read A B
>
> Thanks

fgrep (equivalent to grep -F) is faster, but you'd still be running it a
lot of times. Probably better to think of what your loop is really
trying to achieve - I wouldn't be too surprised if it was a case for
awk. Perl may also be a reasonable idea (it can keep the file in memory)
but it's out of scope for this newsgroup.

If you want any useful help from here I think you need to give us the
whole loop.
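
To make the -F suggestion concrete, here is a small sketch on a made-up
three-line list (the demo_list file and its contents are invented for
illustration, not taken from the real data). It compares a regex grep, a
fixed-string grep, and an exact comparison on the third field in awk. Note
that -F treats the dot literally but cannot express the end-of-line anchor,
so it may still pick up longer names.

```shell
#!/bin/sh
# Hypothetical sample data; not part of the real 7K-line list.
demo_list=$(mktemp)
cat > "$demo_list" <<'EOF'
-rw-r--r--|100|./a.c
-rw-r--r--|200|./abc
-rw-r--r--|300|./a.c.bak
EOF

# Regex match: each "." matches any character, so ./abc matches too.
regex_out=$(grep '|./a.c$' "$demo_list")

# Fixed-string match: the dots are literal, but nothing anchors the end
# of the line, so ./a.c.bak matches as well.
fixed_out=$(grep -F '|./a.c' "$demo_list")

# Exact comparison on the third field: only ./a.c itself matches.
exact_out=$(awk -F'|' -v name='./a.c' '$3 == name' "$demo_list")

printf 'regex:\n%s\n\nfixed:\n%s\n\nexact:\n%s\n' \
    "$regex_out" "$fixed_out" "$exact_out"

rm -f "$demo_list"
```

All three forms still scan the file once per lookup, so this only changes
what matches, not how often the file is read.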

E.
From: Shurik on
On Apr 25, 2:50 pm, Eric <e...(a)deptj.eu> wrote:
> [...]


The script is:
#!/bin/ksh
SourceFile=$1
TargetFile=$2
TargetHost=HP1
SourceHost=SUN2

exec 3<${SourceFile}
while read -u3 Line
do
    OLD_IFS="${IFS}"
    IFS="|"
    echo "$Line" | read PERM SIZE File

    grep "|${File}$" ${TargetFile} | read TargetPerm TargetSize
    XXXX
    IFS="${OLD_IFS}"

    if [ "${TargetSize}" = "" ]
    then
        echo "${File} MISSING on ${TargetHost}"
    elif [ ${SIZE} -ne ${TargetSize} ]
    then
        echo "${File} SIZE is different on ${TargetHost}"
    elif [ "${TargetPerm}" != "${PERM}" ]
    then
        echo "${File} PERMISSION is different on ${TargetHost}"
    fi

done

The target and source files contain 7K lines:

-rw-r--r--|214890729|./ACEXML/apps/svcconf/ACEXML_XML_Svc_Conf_Parser.pc.in
-rw-r--r--|1370781355|./ACEXML/apps/svcconf/ACEXML_XML_Svc_Conf_Parser.bor
-rw-r--r--|3618598382|./ACEXML/apps/svcconf/ACEXML_XML_Svc_Conf_Parser_Static.vcproj
-rw-r--r--|983012176|./ACEXML/apps/svcconf/ACEXML_XML_Svc_Conf_Parser.vcproj
From: Stachu 'Dozzie' K. on
On 2010-04-25, Shurik <shurikgefter(a)gmail.com> wrote:
>> > I have ksh script that execute many times grep command ( in loop ) on
>> > the same file ( big file ~ 7K lines )
>>
>> > Can I improve the grep command ? Like try to load the file to memory .
[...]
>> If you want any useful help from here I think you need to give us the
>> whole loop.

> exec 3<${SourceFile}
> while read -u3 Line
> do
> OLD_IFS="${IFS}"
> IFS="|"
> echo "$Line" | read PERM SIZE File
>
> grep "|${File}$" ${TargetFile} | read TargetPerm TargetSize
> XXXX

For each line from $SourceFile you're running grep just to read
$TargetPerm and $TargetSize. Just give up and write it in a language
that has some kind of hashmap: AWK, or possibly Perl.
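
As a rough sketch of the hashmap idea, the function below loads the target
list into awk arrays once and then checks each source line against them,
keeping the MISSING / SIZE / PERMISSION order of the if/elif chain in the
posted loop. The function name compare_lists and its argument order are
made up for this example; the input format is assumed to be the
perm|size|path layout shown in the thread.

```shell
#!/bin/sh
# compare_lists SOURCE TARGET HOST   (hypothetical helper, for illustration)
# Reads TARGET into memory once, then reports on every line of SOURCE.
compare_lists() {
    awk -F'|' -v host="$3" '
        # The first file on the command line is the target list:
        # index its permission and size by file name.
        NR == FNR { tperm[$3] = $1; tsize[$3] = $2; next }

        # Every later line comes from the source list.
        !($3 in tsize)  { print $3, "MISSING on", host; next }
        $2 != tsize[$3] { print $3, "SIZE is different on", host; next }
        $1 != tperm[$3] { print $3, "PERMISSION is different on", host }
    ' "$2" "$1"
}
```

Inside the script it could be called as
compare_lists "$SourceFile" "$TargetFile" "$TargetHost". One known
limitation of the NR == FNR idiom: if the target list is empty, awk treats
the source list as the first file and prints nothing.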

--
Secunia non olet.
Stanislaw Klekot
From: Ed Morton on
On 4/25/2010 8:16 AM, Shurik wrote:
> [...]

Try this (untested):

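awk -v targetHost="$TargetHost" -F'|' '
# Source list (first file): remember permission and size per file name.
NR==FNR { perm[$3]=$1; size[$3]=$2; next }
# Target list: ignore names that are not in the source list.
!($3 in perm) { next }
$1 != perm[$3] { print $3,"PERMISSION is different on",targetHost }
$2 != size[$3] { print $3,"SIZE is different on",targetHost }
{ delete perm[$3] }
# Names still in perm were never seen in the target list.
END { for (file in perm) print file,"MISSING on",targetHost }
' "$SourceFile" "$TargetFile"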

Regards,

Ed.