Can extra processing threads help in this case? [MFC]

Prev: Improving Pete'r Application Performance
Next: Competitors for Pet'e OCR system

From: Hector Santos on 8 Apr 2010 21:23

Peter Olcott wrote:

>> But you can use:
>>
>> TCP/IP <<--- What we use for IPC
>> UDP <<--- What we use for IPC
>> HTTP <<--- What we use for IPC
>> RPC <<--- What we use for IPC
>> DCOM
>
> But as I understand it these will not automatically grow a
> queue to any arbitrary length.

Your queue is as good and fast as you request it, pipe or otherwise.

>>
>> And a 2003 Dr. Dobbs article on how to handle named pipes
>> correctly, even though it seems so "simple":
>>
>> http://www.drdobbs.com/architecture-and-design/184416624;jsessionid=BVL3ABP0UVUSJQE1GHPSKH4ATMY32JVN
>>
>
> OK so the Unix/Linux people say that it is well known that MS
> named pipes are borked, yet, they have never had any problem
> with Unix/Linux named pipes.

You didn't do a Google search, did you? Figures you would ignore it.

But even then, I can understand why the success. Unix is not
traditionally known to work with threads, and the piping has permanent
storage - your DISK - making it easy to allow for easy recovery.
Simple.

--
HLS

From: Peter Olcott on 8 Apr 2010 21:24

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:4qusr5hrulhbsjgvogtna53ilq52tq4bre(a)4ax.com...
> See below...
> On Thu, 8 Apr 2010 08:54:38 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>
>>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>>message news:6aspr5dr3kb4npe47j9mu26kbl2ib4s28v(a)4ax.com...
>>> On Wed, 7 Apr 2010 10:07:02 -0500, "Peter Olcott"
>>> <NoSpam(a)OCR4Screen.com> wrote:
>>>
>>>>
>>>>Sure so another way to solve this problem is on the rare
>>>>cases when you do lose a customer's money you simply take
>>>>their word for it and provide a refund. This also would hurt
>>>>the reputation though, because this requires the customer to
>>>>find a mistake that should not have occurred.
>>> ****
>>> Incredibly elaborate mechanisms to solve non-problems.
>>> Simple mechanisms (e.g., "resubmit
>>> your request") should suffice. Once your requirements
>>> state what failure modes are
>>
>>You are not paying attention. I am talking about a server
>>crash with loss of data after the customer has added money
>>to their account, but, before this financial transaction has
>>been saved to offsite backup. They add ten bucks to their
>>account and I lose track of it because the server crashed
>>and it was not yet time for my periodic backup.
> ****
> Actually, I AM paying attention; you are not paying
> attention. I suggest creating the
> MINIMUM amount of complexity that guarantees that the
> customer is not charged for a
> failure; you are attempting to create incredibly elaborate
> mechanisms that give you the
> illusion of 100% reliability. I say: fail and don't
> charge, or fail and refund, and
> implement the smallest, simplest system that satisfies
> this design.
> joe

OK what is the simplest possible way to make sure that I
never ever lose the customer's ten bucks, even if the server
crashes before my next backup and this data is lost in the
crash?

> ****
>>
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Joseph M. Newcomer on 8 Apr 2010 21:36

See below...
On Thu, 8 Apr 2010 15:44:37 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:uhFwgp11KHA.140(a)TK2MSFTNGP05.phx.gbl...
>> Peter Olcott wrote:
>>
>>
>>> I think that many of these issues may go away by using
>>> two half-duplex named pipes one in each direction. No one
>>> has yet pointed out any issues with Unix/Linux named
>>> pipes. I like named pipes because they implement the FIFO
>>> intuitively with minimal learning curve.
>>
>>
>> I can only hope that one day you will actually begin your
>> work, so you can see how great it will work.
>>
>> Google: named pipe problems
>>
>> http://www.google.com/search?q=named+pipe+problems&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official
>>
>> When our multi-million dollar server was first under
>> design back in the mid 90s, named pipes were going to be
>> used. We saw almost immediately how unreliable they were
>> for a high-end, high-throughput, highly multi-threaded
>> WAN/LAN network server.
>
>First of all are you talking about named pipes in Windows or
>Unix/Linux?
****
YOU were talking about named pipes on linux. You even suggested something about
full-duplex named pipes, which ONLY exist on Windows. So it is not surprising that there
is some confusion here. As I pointed out, just about the only thing these have in common
is the phrase 'named pipe'. So you have to be precise.
****
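
To make the half-duplex point concrete, here is a minimal Python sketch (not from the thread) that pairs two POSIX FIFOs, one per direction. The file names and the `echo_server` and `round_trip` helpers are hypothetical, and the sketch assumes a Unix-like system where `os.mkfifo` is available.

```python
import os
import tempfile
import threading


def echo_server(req_path, resp_path):
    # Hypothetical worker: read one request line, write one reply line.
    # Each open() blocks until the peer opens the other end of that FIFO.
    with open(req_path, "r") as req, open(resp_path, "w") as resp:
        line = req.readline()
        resp.write("done:" + line)


def round_trip(message):
    # Create two one-way FIFOs in a scratch directory and do one exchange.
    with tempfile.TemporaryDirectory() as scratch:
        req_path = os.path.join(scratch, "requests.fifo")   # client -> server
        resp_path = os.path.join(scratch, "replies.fifo")   # server -> client
        os.mkfifo(req_path)
        os.mkfifo(resp_path)

        worker = threading.Thread(target=echo_server, args=(req_path, resp_path))
        worker.start()

        # Open in the same order as the server so neither side deadlocks.
        with open(req_path, "w") as req:
            req.write(message + "\n")
        with open(resp_path, "r") as resp:
            reply = resp.readline()

        worker.join()
        return reply.strip()
```

Nothing in this arrangement is persistent: the data lives only in kernel pipe buffers, so it disappears if either side dies mid-exchange.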
>
>>
>> Not saying you can't make it work, but you will spend more
>> time on getting that right than anything else and for what?
>> A fifo? When there are so many other more reliable
>> and simpler methods?
>>
>
>What simpler more reliable methods are you referring to that
>can provide event based notification between processes?
****
You must define "reliable". For example, a "reliable" network does NOT guarantee delivery
of anything. All it guarantees is that it will tell you if it has delivered the data, OR
it tells you that it could not, and you are then responsible for dealing with recovery from
that problem.

What I'm saying is that you have failed to comprehend that it DOES NOT MATTER if your
mechanisms fail, as long as you can detect and recover from that failure according to a
specified protocol (e.g., the customer is not charged for failure to deliver; whether this
is done by not charging until delivery or refunding if failure occurs is a trivial and
unimportant implementation detail of the requirement that "the customer is not charged for
failure to deliver"). You keep exhibiting buzzword-lock (e.g., "named pipes"), ascribe to
these mechanisms magical properties they do not possess, and then declare the problem is
solved. As you stack more and more of these magical mechanisms, you get a clumsier and
clumsier system that is no more robust or reliable than a simpler system.

I keep saying "build the simplest possible system that satisfies the requirements" and you
keep proposing more and more elaborate and complex schemes based on hypothesized magical
properties of various technologies which, in your spec, GUARANTEE delivery. No, forget
about guaranteeing delivery; if you fail to deliver, detect it with the SIMPLEST POSSIBLE
mechanism (e.g., a TCP/IP reply such as over stdout of a cgi-style child process). Then, if
you get a warning about delivery failure, you take the conservative approach and either
don't charge or re-credit the account (minor implementation detail). Quit designing these
incredibly complex schemes based on writing code based on inventing new code based on
problematic implementation strategies. For example, forget about fsync, seek, pwrite,
fflush, etc. and assume a transacted database and work from there. That way you can build
something and test it without having to write tons of code that you only barely can
comprehend.
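
The "detect the failure, then don't charge or re-credit" rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Account`, `run_job`, and the `deliver` callback are hypothetical names, balances are kept in memory, and a delivery failure is assumed to surface as an `OSError` (as a broken socket or pipe would in Python).

```python
class Account:
    """Hypothetical in-memory account; a real system would use a transacted database."""

    def __init__(self, balance_cents):
        self.balance_cents = balance_cents


def run_job(account, price_cents, deliver):
    """Charge the account, attempt delivery, and re-credit if delivery fails."""
    if account.balance_cents < price_cents:
        return "insufficient funds"

    account.balance_cents -= price_cents
    try:
        deliver()  # e.g. write the result back to the client connection
    except OSError:
        # Delivery failure was detected: take the conservative path and refund.
        account.balance_cents += price_cents
        return "refunded"
    return "delivered"
```

Charging only after a successful delivery would satisfy the same requirement; the choice between the two is the implementation detail the post refers to.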

I learned an important lesson back in 1977, writing operating system components:
(a) Code should work correctly
(b) If it doesn't work correctly, and nobody notices, this is good enough

Key here is writing good recovery mechanisms so nobody notices.

Do it right and not even power failure will matter. And you don't have to invent magical
properties of existing mechanisms to satisfy your requirements.
joe
*****
>

>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Peter Olcott on 8 Apr 2010 21:40

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:68vsr5dlv7053hcg4es22v0obf1tilc7ee(a)4ax.com...
> See below...
> On Thu, 8 Apr 2010 14:57:19 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>If my understanding is correct fsync() is supposed to handle
>>both.
>> http://linux.die.net/man/2/fsync
>>It might be the case that I must use the low level open()
>>command so that there are no application buffers.
> ****
> fflush() will flush application buffers if you are using
> stdio, and fsync(), if it is implemented, will flush the
> kernel buffers (did you see the section of that SQLite
> discussion that says that it is not always implemented
> correctly?)

No, missed that.
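
The two-step flush described above can be sketched in Python as follows. `durable_append` and the log file are hypothetical names for illustration; `f.flush()` plays the role of `fflush()` and `os.fsync()` wraps the `fsync()` system call. This only asks the operating system to push the data out. As the thread notes, `fsync()` is not always implemented correctly, and a drive's onboard cache may still be holding the data afterward.

```python
import os


def durable_append(path, record):
    """Append bytes to a file and ask the OS to push them to the storage device."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # application buffer -> kernel buffers
        os.fsync(f.fileno())  # kernel buffers -> device, as far as the OS can tell
```

Opening the file with the low-level `os.open()` and writing with `os.write()` would remove the application buffer entirely, leaving only the `os.fsync()` step.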

>
> You may not be able to turn off the onboard disk cache
> buffering. That's part of the
> problem I was referring to. (And yes, it kills hard drive
> performance)
> ****

Solution is new vendor where disk caching can be turned off.
Vendor says that disk caching can be turned off. Vendor rep
may be guessing.

>>
>>Also the experts seem to be saying that the drive's own
>>onboard cache is not much of an issue if there is UPS.
>>There are some ways to force some drives to empty their
>>onboard cache. The only way that is supposed to always work
>>is to turn write buffering off. This can really hurt
>>hard-drive performance.
> ****
> Power failure is not much of an issue if you have a UPS,
> so worrying about what happens
> under power failure is not a really high priority in real
> life.
> ****

But required OS reboots are, right? Still need all writes to
go straight to the platters.

>>
>>>>It helps to have a single point I/O controller, but how
>>>>are you
>>>>planning to use this thread? How will you talk to it?
>>>>IOW, now you
>>>>really need to make sure you have synchronization.
>>> ****
>>> If one thread handles the file, then no "synchronization"
>>> is required because all requests serialize through this
>>> one thread. It is an approach called the "agent pattern".
>>
>>It looks like clarification from the Linux/Unix experts
>>indicate that this would be required for my transaction
>>log.
> ****
> Of course, you still have to flush application buffers and
> flush kernel buffers; putting
> it in a single thread still does not guarantee
> transactional integrity.
>
> You have to decide where your "start transaction" and "end
> transaction" points are.
>
> Oh yes, it really is hard on the disk drive; I killed one
> disk drive by running a large
> number of tests on a transacted database; it just stopped
> seeking. But during the tests,
> it was seeking ferociously as it made sure the directory
> blocks were consistent with the
> file contents.
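
A minimal Python sketch of the single-writer "agent pattern" mentioned above, with the flush steps included. `LogAgent` and its methods are hypothetical names. Only the agent thread ever touches the file, so callers need no file locking of their own (the `queue.Queue` used to hand over requests does its own locking internally).

```python
import os
import queue
import threading


class LogAgent:
    """One thread owns the log file; all other threads submit lines via a queue."""

    def __init__(self, path):
        self._path = path
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def append(self, line):
        # Safe to call from any thread; the request is only enqueued here.
        self._requests.put(line)

    def close(self):
        # A None sentinel tells the agent to stop after draining earlier requests.
        self._requests.put(None)
        self._thread.join()

    def _run(self):
        with open(self._path, "a") as f:
            while True:
                line = self._requests.get()
                if line is None:
                    break
                f.write(line + "\n")
                f.flush()             # application buffer -> kernel
                os.fsync(f.fileno())  # kernel -> device, as far as the OS can tell
```

As the post says, serializing writes through one thread does not by itself give transactional integrity; that still depends on where the transaction boundaries are drawn.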

Good reason for hot swappable RAID, then.

>>> Apparently, he thinks that a database can't be a FIFO queue
>>> because he once read that SQLite doesn't have a record
>>> number, or something else silly like that. He missed the
>>> idea that a FIFO queue is a FIFO queue and ANY
>>> stream-oriented protocol (including TCP/IP to the local
>>> machine!) could be a valid implementation; instead, he
>>> fastened on one
>>
>>And its buffer would automatically grow to any required
>>length and automatically shorten as items are removed?
> ****
> Yep. That's EXACTLY what happens. And only at undefined
> and indeterminate points does
> the file system manage to get these updated blocks out to
> the hard drive (unless you have
> a way to force synchronization of the buffers with the
> magnetic surfaces). So imagine
> that you have deleted records in page 1 and added records
> to page 7. When you delete
> records, the other records are "shuffled down" to fill the
> space. These pages are
> committed to disk in opportunistic order, so what is on
> the platters represents a snapshot
> of the in-memory buffers at random states, and the pages
> on the disk may be inconsistent
> with the pages in memory. So you can end up with
> duplicate records, missing records at
> the end, etc.
> ****

Make sure to flush to disk then.

>>I think that many of these issues may go away by using two
>>half-duplex named pipes one in each direction. No one has
>>yet pointed out any issues with Unix/Linux named pipes. I
>>like named pipes because they implement the FIFO intuitively
>>with minimal learning curve.
> ****
> No, in fact, NONE of them change, at all, whether you are
> using two half-duplex pipes (which is all linux supports,
> even as named pipes) or a full-duplex pipe (as is supported
> in Windows).

Unix/Linux groups say that any issues with named pipes must
be on Windows because Windows named pipes are borked.

>
> If either the server app or the app it spawns fails, the
> contents of the named pipe will be
> lost. Just because nobody bothered to point out the
> obvious does not mean the problem
> does not exist. Low learning curve does not immediately
> map to robust transacted data
> transfer!

I already solved this issue with my very early design. That
is the purpose of my persistent disk file based FIFO queue.
As soon as it gets to this file, then even a server crash
will not prevent the job from getting completed correctly.
We may lose the way to send it back to the user's screen (it
will still be in their account when they log in) but we did
not lose the actual transaction even if the server crashes.

Alternatively if we lose any part of the process before we
get an HTTP acknowledgement that they received their
results, we roll the whole transaction back.
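
One way to sketch a persistent FIFO queue with this rollback behaviour is to use a transacted database, as suggested earlier in the thread. The following Python example uses the standard `sqlite3` module; the `jobs` table, `open_queue`, `enqueue`, and `process_next` are hypothetical names, and the `handle` callback stands in for "do the work and get the acknowledgement".

```python
import sqlite3


def open_queue(path):
    """Open (or create) the queue database. AUTOINCREMENT ids give FIFO order."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    db.commit()
    return db


def enqueue(db, payload):
    # 'with db' commits on success and rolls back if an exception escapes.
    with db:
        db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def process_next(db, handle):
    """Handle the oldest job; it leaves the queue only if handle() succeeds.

    Returns None if the queue is empty, True on success, False on rollback.
    """
    row = db.execute("SELECT id, payload FROM jobs ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    job_id, payload = row
    try:
        with db:
            db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
            handle(payload)  # an exception here undoes the DELETE
    except Exception:
        return False
    return True
```

If the process dies between the delete and the commit, the database's own recovery leaves the job in the queue, so the retry logic does not need to be hand-written on top of `fsync()` and file offsets.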

> joe
> ****
>>
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Peter Olcott on 8 Apr 2010 21:44

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:ec0DFN41KHA.224(a)TK2MSFTNGP06.phx.gbl...
> Peter Olcott wrote:
>
>>> But you can use:
>>>
>>> TCP/IP <<--- What we use for IPC
>>> UDP <<--- What we use for IPC
>>> HTTP <<--- What we use for IPC
>>> RPC <<--- What we use for IPC
>>> DCOM
>>
>> But as I understand it these will not automatically grow
>> a queue to any arbitrary length.
>
>
> Your queue is as good and fast as you request it, pipe or
> otherwise.

Some of the above have fixed queue lengths don't they?

>
>>>
>>> And a 2003 Dr. Dobbs article on how to handle named
>>> pipes correctly, even though it seems so "simple":
>>>
>>> http://www.drdobbs.com/architecture-and-design/184416624;jsessionid=BVL3ABP0UVUSJQE1GHPSKH4ATMY32JVN
>>>
>>
>> OK so the Unix/Linux people say that it is well known that
>> MS named pipes are borked, yet, they have never had any
>> problem with Unix/Linux named pipes.
>
>
> You didn't do a Google search, did you? Figures you would
> ignore it.
>

I did and one of the links says something like there aren't
any problems with named pipes.

> But even then, I can understand why the success. Unix is
> not traditionally known to work with threads, and the
> piping has permanent storage - your DISK - making it easy
> to allow for easy recovery. Simple.

The data is not supposed to ever actually hit the disk. I
started a whole thread on just that one point.
>
> --
> HLS
