From: Maxim Heijndijk on
Is there a simpler way to exclude unwanted Windows files? I'm using
this, but I get a headache from typing it:

--exclude [Hh][Ii][Bb][Ee][Rr][Ff][Ii][Ll][.][Ss][Yy][Ss] \
--exclude [Pp][Aa][Gg][Ee][Ff][Ii][Ll][Ee][.][Ss][Yy][Ss] \
--exclude [Dd][Oo][Cc][Uu][Mm][Ee][Nn][Tt][Ss]?[Aa][Nn][Dd]?[Ss][Ee][Tt][Tt][Ii][Nn][Gg][Ss]/*/[Ll][Oo][Cc][Aa][Ll]?[Ss][Ee][Tt][Tt][Ii][Nn][Gg][Ss]/[Tt][Ee][Mm][Pp]*/***


etc etc

TIA Max
From: Kaz Kylheku on
Sure, just match their case as is. For instance, you don't have to
match every possible way of assigning case to the characters of
"Documents and Settings".

Put the excludes into a file and use --exclude-from.
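
A minimal sketch of the --exclude-from approach. The temporary file, the three patterns, and the sync_with_excludes wrapper are illustrative assumptions, not part of anyone's actual script:

```shell
# One pattern per line. rsync reads the file itself, so the shell never
# sees the patterns and no quoting is needed inside the file.
exclude_file=$(mktemp)
cat > "$exclude_file" <<'EOF'
[Pp]agefile.sys
[Hh]iberfil.sys
[Tt]emp/***
EOF

# Hypothetical wrapper: $1 is the source, $2 is the target.
sync_with_excludes() {
    rsync -a --exclude-from="$exclude_file" "$1" "$2"
}
```

Blank lines and lines starting with # or ; in such a file are ignored by rsync, so the list can be commented.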

Another idea would be to explore the case-insensitive matching
abilities of some other program to locate the matching files and list
their names. Then feed that as the exclude list into rsync.

For instance, GNU find has the "-iname" predicate.
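
As a rough sketch of the find idea, assuming GNU find (needed for both -iname and -printf). The helper name list_matches and the paths in the comments are made up for illustration:

```shell
# list_matches DIR NAME
# Print every path under DIR whose base name matches NAME
# case-insensitively. Paths are printed relative to DIR with a leading
# "/", which rsync anchors at the root of the transfer when DIR/ is the
# source argument.
list_matches() {
    ( cd "$1" && find . -iname "$2" -printf '/%P\n' )
}

# Possible use (placeholder paths):
#   list_matches /mnt/windows pagefile.sys > /tmp/excludes.txt
#   rsync -a --exclude-from=/tmp/excludes.txt /mnt/windows/ /mnt/usb/backup/
```

One caveat: rsync still treats each listed name as a pattern, so a file name that itself contains *, ? or [ would be interpreted as a wildcard rather than literally.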

GNU bash has the "nocaseglob" setting also which affects pathname
expansion; that could be useful in some way:

shopt -s nocaseglob

Note that when this is turned on, if a pathname component contains
globbing characters, /all/ characters in that component (but not in the
rest of the pathname) are matched case-insensitively. For instance:

$ echo /BIN/LS* # no match
/BIN/LS*
$ echo /BIN*/LS* # aha!
/bin/ls

So for instance

pagefile.sy[s]

would be good enough to match PageFile.SYS or PAGEFILE.SYS, etc.
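
To try this without touching real files, here is a small bash sketch. match_nocase is a made-up helper, and the temporary directory stands in for a Windows partition:

```shell
# match_nocase DIR PATTERN
# Expand PATTERN inside DIR with nocaseglob enabled and print the
# matches, one per line. Everything runs in a subshell so the shopt
# settings and the IFS change do not leak into the caller.
match_nocase() {
    (
        cd "$1" || exit 1
        shopt -s nocaseglob nullglob   # nullglob: no match prints nothing
        IFS=                           # no word splitting, globbing still on
        for f in $2; do                # $2 is unquoted on purpose
            printf '%s\n' "$f"
        done
    )
}

demo_dir=$(mktemp -d)
touch "$demo_dir/PageFile.SYS"
match_nocase "$demo_dir" 'pagefile.sy[s]'   # prints PageFile.SYS
```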

From: Maxim Heijndijk on
Thanx. I've tried the nocaseglob idea, but it doesn't work. I have this in
an elaborate script for syncing to USB disks:

shopt -s nocaseglob

EXCLUDES="
--exclude [Hh]iberfil.sys \
--exclude [Nn]tuser.dat \
--exclude [Pp]agefile.sys \
--exclude [Ww]in386.swp \
--exclude [Cc]ache/*** \
--exclude [Cc]ookies/*** \
--exclude [Ll]ogs/*** \
--exclude [Pp]refetch/*** \
--exclude system*Volume*Information/*** \
--exclude [Tt]humbs/*** \
--exclude [Tt]mp/*** \
--exclude [Tt]emp/*** \
--exclude [Tt]emporary/*** \
--exclude _[Rr]estore/*** \
"
nice -20 \
rsync -uz${OPTS} --modify-window=1 \
${RELATIVE} \
--delete \
--progress \
${EXCLUDES} \
"${SOURCE}" "${TARGET}/${SUBDIR}"

shopt -u nocaseglob

From: Kaz Kylheku on
Maxim Heijndijk wrote:
>
> Thanx. I've tried the nocaseglob idea, but it doesn't work.

I wonder why. What is your $BASH_VERSION?

Note that my trick depends on the paths being resolvable relative to
where you are running rsync from.

I don't think you are doing that, otherwise your use of the rsync
pattern *** would not work. For instance, suppose you write:

--exclude some/directory/***

If some/directory exists relative to the current working directory
where you are running rsync, the shell will apply the *** pattern and
expand to the contents of that directory. The *** pattern won't be
passed to rsync, and there will likely be superfluous arguments also.

It's probably a good idea to quote the patterns being passed to rsync
so that they are not interpreted by the shell at all. But then the
trick of using shell globbing certainly won't work without some added
contortions.
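
The difference is easy to see by counting arguments. In this sketch, count_args and the scratch directory are invented for the demonstration:

```shell
count_args() { echo "$#"; }

demo=$(mktemp -d)
mkdir -p "$demo/some/directory"
touch "$demo/some/directory/a" "$demo/some/directory/b"

# Unquoted: the shell replaces the pattern with the two file names.
unquoted=$(cd "$demo" && count_args --exclude some/directory/***)

# Quoted: the pattern reaches the command untouched.
quoted=$(cd "$demo" && count_args --exclude 'some/directory/***')

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

With two files in the directory, the unquoted form hands over three arguments and the quoted form two, which is why rsync never sees the *** in the first case.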

Also note that, for the same reason, your use of the EXCLUDES variable
introduces a difficulty: you cannot quote the names within that string
to protect them from expansion. In order to use EXCLUDES, you have to
expand it unquoted, but the result of that expansion is still subject to
word splitting and pathname expansion. If the expansion produces quote
characters, that won't help, because the shell does not re-tokenize the
result of the expansion: the quotes are passed along as literal characters.
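
One way around this, not raised in the thread so far, is a bash array: each element stays a separate word when expanded as "${excludes[@]}", so the patterns can be quoted once, when the array is built. A sketch with illustrative patterns:

```shell
# Each pattern is quoted once, here, when the array is built.
excludes=(
    --exclude '[Pp]agefile.sys'
    --exclude '[Tt]emp/***'
    --exclude 'System Volume Information/***'
)

# Show the words exactly as a command would receive them.
printf '<%s>\n' "${excludes[@]}"

# In a sync script the call could then look like this, assuming SOURCE,
# TARGET and SUBDIR are set as in the script being discussed:
#   rsync -uz --delete "${excludes[@]}" "${SOURCE}" "${TARGET}/${SUBDIR}"
```

The pattern with spaces arrives as a single argument, and none of the wildcards are expanded by the shell.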

Here is an idea: stick with your original patterns, but make the
machine write them for you, rather than typing them by hand.

# ignore case
ic ()
{
    # note that \u in the replacement is a GNU sed extension
    # "$1" is quoted so the shell neither splits nor globs the pattern
    printf '%s\n' "$1" | sed -e 's/[a-z]/[&\u&]/g'
}

# sample run of ic
$ ic abc
[aA][bB][cC]

rsync \
--exclude "$(ic pagefile.sys)" \
--exclude "$(ic "documents and settings/local?settings/temp/***")" ...
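
If the list grows, the same function can fill a bash array in a loop. This is only a sketch: the names array and its entries are examples, and ic is repeated so the snippet runs on its own. It still needs GNU sed for \u.

```shell
# Turn every lowercase letter into a two-letter bracket expression.
ic() {
    printf '%s\n' "$1" | sed -e 's/[a-z]/[&\u&]/g'
}

# Write the names once, in lowercase; wildcards pass through unchanged.
names=(
    'pagefile.sys'
    'hiberfil.sys'
    'documents and settings/*/local?settings/temp/***'
)

excludes=()
for name in "${names[@]}"; do
    excludes+=( --exclude "$(ic "$name")" )
done

# Inspect the generated arguments, one per line.
printf '%s\n' "${excludes[@]}"
```

The resulting array can be handed to rsync as "${excludes[@]}", which keeps each generated pattern as one argument even when it contains spaces.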

If you want to put the exclusions into a plain string variable like
EXCLUDES, and quote them, then you have to use eval. It becomes ugly, so
at this point it's worthwhile to consider a tool which is better at
preprocessing code to be evaluated by the shell, namely make!

# GNU Makefile

# The * and "*" here are thrown in as an example of how accidental
# globbing is prevented by the quotes here.

EXCLUDES := * "*" "pagefile.sys" "documents and settings/local?settings/temp/***"

EXCLUDES_IC := $(foreach name,$(EXCLUDES),\
$(shell echo '$(name)' | sed -e 's/[a-z]/[&\u&]/g'))

# "do_rsync" is a virtual target, not a real file being updated
.PHONY: do_rsync show_excludes

# remove the echo and complete the rsync
do_rsync:
	@echo rsync $(foreach item,$(EXCLUDES_IC),--exclude $(item))
#^ tab here

When you run make, you can see the built-up command line:

$ make
rsync --exclude files from your current directory --exclude * --exclude
[pP][aA][gG][eE][fF][iI][lL][eE].[sS][yY][sS] --exclude
[dD][oO][cC][uU][mM][eE][nN][tT][sS] --exclude [aA][nN][dD] --exclude
[sS][eE][tT][tT][iI][nN][gG][sS]/[lL][oO][cC][aA][lL]?[sS][eE][tT][tT][iI][nN][gG][sS]/[tT][eE][mM][pP]/***

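Note how the unquoted * is expanded into "files from your current
directory", an unwanted effect. But the quoted "*" came through
verbatim as an argument to echo, without the quotes. Make itself has no
special quoting logic: the double quotes are ordinary characters in the
string data of the make variable. They are simply carried along into
the generated shell command, and it is the shell that interprets and
removes them. Note also that $(foreach) splits its list on whitespace,
so the "documents and settings" entry is broken into three words, each
of which gets its own --exclude; the pieces reach echo as one argument
only because the surrounding quotes survive into the command line. That
argument is not a usable rsync pattern, so an entry containing spaces
needs different handling, for example a ? in place of each space.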

From: Maxim Heijndijk on
Wow! This goes far beyond my knowledge, but I'm gonna try to figure this
out over the next few weeks! Thanx again.