From: Peter Olcott on
I was going to do the thread priority thing very simply.
Only one thread can run at a time, and the purpose of the
multiple threads was to make context switching fast using
thread-local data. As soon as a higher-priority job arrives,
activity in the current lower-priority thread is suspended.

"David Schwartz" <davids(a)webmaster.com> wrote in message
news:23567a1b-a879-4ad2-8f9f-2eccc628914d(a)30g2000yqi.googlegroups.com...
On Apr 5, 6:52 am, "Peter Olcott" <NoS...(a)OCR4Screen.com>
wrote:

> It is a "given" (something I can't change) that the web
> server will have one thread per HTTP request. It is also a
> "given" architectural decision, to provide optimal
> performance, that I will have at least one thread per
> processor core that does the OCR processing. I might have
> two threads per processor core to do OCR processing because
> I will have two levels of OCR processing priority, and the
> first level of priority has absolute priority over the
> second level.

I would not recommend using priorities this way. It usually
doesn't work. If you want some requests to have priority over
others, do those requests first. Attempting to do them with
higher-priority threads tends to lead to extreme pain unless
you have expertise in dealing with thread priorities. (Search
'priority inversion' for just one of the horrible things that
can happen to you.)

DS
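As a concrete illustration of the "do those requests first"
advice, here is a minimal sketch, assuming POSIX threads; the
names (job_t, submit, worker) are illustrative, not taken from
any code in this thread. High-priority jobs are simply dequeued
first by workers that all run at the same ordinary thread
priority, so no priority-inversion hazards arise:

    /* Two-level work queue: level 0 has absolute priority over level 1.
       Error handling and shutdown are omitted for brevity. */
    #include <pthread.h>
    #include <stdlib.h>

    typedef struct job {
        struct job *next;
        void (*run)(void *);
        void *arg;
    } job_t;

    static job_t *queues[2];   /* [0] = high priority, [1] = low priority */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    void submit(job_t *j, int level)
    {
        pthread_mutex_lock(&lock);
        j->next = queues[level];       /* LIFO for brevity; a real queue
                                          would likely be FIFO */
        queues[level] = j;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    void *worker(void *unused)
    {
        (void)unused;
        for (;;) {
            pthread_mutex_lock(&lock);
            while (!queues[0] && !queues[1])
                pthread_cond_wait(&nonempty, &lock);
            int level = queues[0] ? 0 : 1;   /* drain high level first */
            job_t *j = queues[level];
            queues[level] = j->next;
            pthread_mutex_unlock(&lock);
            j->run(j->arg);    /* the work itself runs at normal priority */
            free(j);
        }
        return NULL;
    }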


From: Peter Olcott on

"Scott Lurndal" <scott(a)slp53.sl.home> wrote in message
news:ilsun.317$4K5.204(a)news.usenetserver.com...
> "Peter Olcott" <NoSpam(a)OCR4Screen.com> writes:
>>
>>"Mark Hobley" <markhobley(a)hotpop.donottypethisbit.com>
>>wrote
>>in message
>>news:v4nl87-ula.ln1(a)neptune.markhobley.yi.org...
>>> In comp.unix.programmer Peter Olcott
>>> <NoSpam(a)ocr4screen.com> wrote:
>>>> I will be receiving up to 100 web requests per second
>>>> (that is the maximum capacity) and I want to begin
>>>> processing them as soon as they arrive, hopefully without
>>>> polling for them.
>>>
>>> That will happen. If 100 web requests come down the pipe,
>>> the receiving process will get them.
>>>
>>> It is only when no requests come down the pipe that the
>>> receiving process will have to wait for a request to come
>>> in. This is no big deal.
>>>
>>> Mark.
>>>
>>> --
>>> Mark Hobley
>>> Linux User: #370818 http://markhobley.yi.org/
>>>
>>
>>So it is in an infinite loop eating up all of the CPU time
>>only when there are no requests to process? I don't think
>>that I want this either, because I will have two priorities
>>of requests. If there are no high-priority requests I want
>>it to begin working on the low-priority requests.
>
>
> man poll
>

Thanks, someone else already enlightened me.
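For reference, a minimal sketch of what "man poll" points at,
assuming listen_fd is an already-bound, listening socket (the
name is illustrative). With a timeout of -1, poll() blocks
until a request arrives and consumes no CPU while waiting, so
there is no busy loop:

    #include <poll.h>

    void wait_for_request(int listen_fd)
    {
        struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };

        /* Blocks indefinitely; the process sleeps until data arrives. */
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            /* accept() will not block now; handle the request here. */
        }
    }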


From: Nicolas George on
"Peter Olcott" wrote in message
<ecmdnQAcGvT_oCfWnZ2dnUVZ_rWdnZ2d(a)giganews.com>:
> I am trying to find an algorithm that makes the cache as
> ineffective as possible.

Yes, I perfectly understood that.

> The only thing that I can think of is to make sure that
> each memory access is more than the maximum cache size away
> from the prior one. This should eliminate spatial locality
> of reference.

Absolutely not. If you think that, then you have yet to understand what
cache means and how it works. The principle of cache is to have a certain
quantity of fast memory to automatically and transparently keep a copy of
frequently used data that would normally reside in slower memory.

To achieve that, each time a fragment of data is accessed, the cache system
checks to see if it already has it. If it has, then it stops here. If it
does not, it makes room for it by evicting the least useful item currently
present in the cache and then loads it. "Least useful" is weighted using
various heuristics, such as "least recently used".

If the program repeatedly accesses the same items, then they will quickly
end up staying in the cache, no matter their actual position in the slow
memory.

Consider the disk cache; it is probably easier to understand. You probably
frequently use /bin/sh, /bin/ls and /lib/libc.so.42: all these files will be
in the disk cache almost all of the time, no matter where they are on the
disk.

The only case where the actual place on the slow memory matters is when two
elements are near enough to be seen as a single item by the cache system.
For example, inodes are typically 128 octets, while the disk cache system
always works using whole sectors (512 octets) or pages (4096 octets). If the
inode for /sbin/fsck is in the same sector as the inode for /bin/ls, then
both will stay in the cache, since each time you use ls, that makes activity
for both of them.

Memory cache works mostly the same way, although the atomic elements are
smaller.
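
To make the point concrete, here is a hedged sketch (not from
the thread): two items far more than a whole cache apart are
nevertheless both served from cache, because each one is reused
on every iteration. Distance alone does not defeat the cache;
only the absence of reuse does:

    /* Assumes buf points to at least GAP + 1 readable bytes. */
    #define GAP (64 * 1024 * 1024)   /* chosen to exceed typical cache sizes */

    long sum_two_hot_items(const char *buf, long iterations)
    {
        long sum = 0;
        for (long i = 0; i < iterations; i++)
            sum += buf[0] + buf[GAP];   /* after the first pass, both
                                           accesses hit in the cache */
        return sum;
    }
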
From: Nicolas George on
Moi wrote in message
<455de$4bba44e4$5350c024$9115(a)cache110.multikabel.net>:
> I once did a similar thing for a disk exerciser: grab each
> disk block exactly once,

Grab each block exactly once: you said it. Peter's method does not do that;
it probably only accesses a very small fraction of the blocks.

> but in pseudo-random order, and sufficiently distant to keep
> read-ahead from being effective.

Such things make sense for disks because disks require mechanical movements
and thus are much more efficient for contiguous reads. That is why
filesystem drivers try to avoid fragmentation and implement readahead.

And for the same reason, software and file formats try to make reads as
sequential as possible (for example interleaving audio and video data in
multimedia container formats), which in turn makes readahead useful even
for flash-based devices.

RAM does not have mechanical parts, and there is no bonus for sequential reads.
Furthermore, the read patterns are typically much more random: access a
pointer, then another, read until a 0 byte, access another pointer, etc. The
RAM cache system does not try to do simple readahead for that; instead, it
follows the flow of the program and analyzes and predicts the memory
accesses.

Therefore, if the algorithm does not achieve cache locality at the scale of
a cache line (typically 32 to 128 octets), then the distance between items
is irrelevant.
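
Combining the two observations, a hedged sketch of what an
effective cache-defeating access pattern might look like: touch
each cache line exactly once, so nothing is ever reused, and in
pseudo-random order, so the prefetcher cannot predict the next
access. The 64-byte line size is an assumption (typical for
current x86 parts):

    #include <stdlib.h>

    #define LINE 64   /* assumed cache-line size */

    /* Assumes buf points to nlines * LINE readable bytes;
       error checking is omitted for brevity. */
    long chase_random_lines(const char *buf, long nlines)
    {
        long sum = 0;
        long *order = malloc(nlines * sizeof *order);

        /* Fisher-Yates shuffle of the line indices. */
        for (long i = 0; i < nlines; i++)
            order[i] = i;
        for (long i = nlines - 1; i > 0; i--) {
            long j = rand() % (i + 1);
            long t = order[i]; order[i] = order[j]; order[j] = t;
        }

        for (long i = 0; i < nlines; i++)
            sum += buf[order[i] * LINE];   /* one access per line, no
                                              reuse, unpredictable order */
        free(order);
        return sum;
    }
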
From: Peter Olcott on

"Nicolas George" <nicolas$george(a)salle-s.org> wrote in
message news:4bba60b6$0$9659$426a74cc(a)news.free.fr...
> "Peter Olcott" wrote in message
> <ecmdnQAcGvT_oCfWnZ2dnUVZ_rWdnZ2d(a)giganews.com>:
>> I am trying to find an algorithm that makes the cache as
>> ineffective as possible.
>
> Yes, I perfectly understood that.
>
>> The only thing that I can think of is to make sure that
>> each memory access is more than the maximum cache size away
>> from the prior one. This should eliminate spatial locality
>> of reference.
>
> Absolutely not. If you think that, then you have yet to
> understand what cache means and how it works. [...]

I already made this work.
