From: Jonathan Lewis on
"joel garry" <joel-garry(a)home.com> wrote in message
news:29c017c9-7c3a-40db-b422-1b1f2d861431(a)i9g2000prn.googlegroups.com...
]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
]wrote:
]>
]> The trouble with your requirement is that we really need to do a
]> backwards tablescan - because it's probably the data near the end
]> of the table that is changing while you are "wasting" time reading
]> all the data from the start of the table.
]
]Excellent explanation, but I lost you here. He says the plan is doing
]a range scan, for 1% of the table? (Maybe you hadn't seen the
]subsequent post yet, where he mentions a fetch suddenly exhibiting
]the characteristics you describe.)
]

By the time I'd written this much, I'd forgotten that he'd added the note
about the index - but it doesn't really make any difference (a) to the
explanation or (b) to the concept in the solution - except that you
can put in an "index_desc()" hint and that might be enough to help.
It depends on how the query is written, what index it uses, and
the distribution of the changed data.
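For what it's worth, a minimal sketch of that idea (the table, index,
and bind names here are placeholders, not from the original query):

SELECT /*+ index_desc(a big_table_ct_idx) */
       a.*
  FROM big_table a                      -- placeholder names throughout
 WHERE a.create_time BETWEEN :start_time AND :end_time;

With index_desc() the range scan walks the index from the high end, so
the most recently inserted (and most volatile) rows are fetched first.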

]>
]> Unfortunately there is no such hint - but if it's really critical, you
]> could write
]> some code to scan the table one extent at a time in reverse order.
]
]This reasoning makes perfect sense, but I'm wondering if there is some
]administrative tuning like adjusting undo size or retention or some
]fiddling with initrans? Sounds critical if it's interrupting data
]extraction.

The error is "just" the same as a traditional 1555 problem when it gets
that far so a "large enough" undo retention should stop the 1555 - but
that won't stop the amount of work it takes. Thinking about initrans is
a good idea - but that won't have any effect either because the problem
is the number of backward steps that have to be taken and the value of
initrans only eliminates the first few (i.e. a few relating to the size of
INITRANS).
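
As a footnote to the extent-at-a-time suggestion quoted above, a minimal
sketch of how the rowid ranges might be built (owner and segment names
are placeholders, and a partitioned table would need the data_object_id
of each partition instead):

-- One (start_rowid, end_rowid) pair per extent, last extent first; the
-- driver would then run the query once per pair with
-- "WHERE rowid BETWEEN :lo AND :hi".
SELECT dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                               e.block_id, 0)                    AS start_rowid,
       dbms_rowid.rowid_create(1, o.data_object_id, e.relative_fno,
                               e.block_id + e.blocks - 1, 32767) AS end_rowid
  FROM dba_extents e
  JOIN dba_objects o
    ON  o.owner       = e.owner
   AND  o.object_name = e.segment_name
   AND  o.object_type = 'TABLE'
 WHERE e.owner        = 'APP_OWNER'
   AND e.segment_name = 'BIG_TABLE'
 ORDER BY e.extent_id DESC;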


--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com



From: Steve Howard on
On Jun 25, 12:01 pm, joel garry <joel-ga...(a)home.com> wrote:
> On Jun 24, 7:25 pm, Steve Howard <stevedhow...(a)gmail.com> wrote:
>
>
>
> > On Jun 24, 10:20 pm, Steve Howard <stevedhow...(a)gmail.com> wrote:
>
> > > Hi All,
>
> > > 10.2.0.4 three node cluster EE on SLES 10
>
> > ...I should also mention the explain plan for the query is OK.  It is
> > a range scan for the time period being retrieved, which may be two or
> > three hours, representing 300,000 rows (out of 300 million).
>
> > Also, when I look at "table fetch by rowid" for the querying session,
> > it will periodically just stall.  It is as if the session is reading
> > thousands upon thousands of undo blocks for that next row.  The
> > arraysize is 15 (standard SQL*Plus)
>
> > Sometimes this happens, sometimes it doesn't.
>
> > We did apply patch 7527908
>
> I have no idea, but I speculate you can use Tanel Poder's latchprofx
> and poke around his site about in-memory undo to figure this out.
> Does your plan show lots of recursion?
>
> jg
> --
> @home.com is bogus.
> http://www.businessweek.com/news/2010-06-25/oracle-rises-after-sun-ac...

Hi Joel,

I got a tip from Dion Cho about this, also suggesting dc_rollbacks as
a latching issue, but I could never prove any similarity between his
test case and our situation:

http://dioncho.wordpress.com/2009/04/04/rethinking-on-ora-01555-error/
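
For reference, the stalls I described can be watched while the extract
runs with something like this (a sketch only; :sid stands in for the
extract session's SID):

SELECT sn.name, st.value
  FROM v$sesstat  st
  JOIN v$statname sn ON sn.statistic# = st.statistic#
 WHERE st.sid = :sid
   AND sn.name IN ('table fetch by rowid',
                   'data blocks consistent reads - undo records applied',
                   'transaction tables consistent reads - undo records applied');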

Thanks,

Steve
From: Steve Howard on
On Jun 26, 2:05 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
wrote:
> "joel garry" <joel-ga...(a)home.com> wrote in message
>
> news:29c017c9-7c3a-40db-b422-1b1f2d861431(a)i9g2000prn.googlegroups.com...
> ]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
> ]wrote:
> ]>
> ]> The trouble with your requirement is that we really need to do a
> backwards
> ]> tablescan - because it's probably the data near the end of the table
> that
> ]> is
> ]> changing while you are "wasting" time reading all the data from the
> start
> ]> of
> ]> the table.
> ]
> ]Excellent explanation, but I lost you here.  He says plan says doing a
> ]range scan, for 1% of the table?  (Maybe you hadn't seen subsequent
> ]post yet, where he mentions a fetch suddenly exhibiting the
> ]characteristics you describe.)
> ]
>
> By the time I'd written this much, I'd forgotten that he'd added the note
> about the index - but it doesn't really make any difference (a) to the
> explanation or (b) to the concept in the solution - except that you
> can put in an "index_desc()" hint and that might be enough to help.
> It depends on the how the query is written, what index it uses, and
> the distribution of the changed data.
>
> ]>
> ]> Unfortunately there is no such hint - but if it's really critical, you
> ]> could write
> ]> some code to scan the table one extent at a time in reverse order.
> ]
> ]This cleaning makes perfect sense, but I'm wondering if there is some
> ]administrative tuning like adjusting undo size or retention or some
> ]fiddling with initrans?  Sounds critical if it's interrupting data
> ]extraction.
>
> The error is "just" the same as a traditional 1555 problem when it gets
> that far so a "large enough" undo retention should stop the 1555 - but
> that won't stop the amount of work it takes.  Thinking about initrans is
> a good idea - but that won't have any effect either because the problem
> is the number of backward steps that have to be taken and the value of
> initrans only eliminates the first few (i.e. a few relating to the size of
> INITRANS).
>
> --
> Regards
>
> Jonathan Lewishttp://jonathanlewis.wordpress.com

Thanks much, Jonathan (and Joel).

I found a comment by you from several months ago, similar to this one,
that I always go back to. I can't tell you how many times I've gone for
a cup of coffee with someone, mentioned that link, and said I'd like to
send you a note for clarification - but I just never did.

http://www.orafaq.com/usenet/comp.databases.oracle.server/2008/02/01/0022.htm

I really appreciate you taking the time to respond, as this has been
driving me nutty.

Let me take some time to digest what you posted and then I will
respond.

Thanks,

Steve
From: Steve Howard on
On Jun 26, 2:05 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
wrote:
> "joel garry" <joel-ga...(a)home.com> wrote in message
>
> news:29c017c9-7c3a-40db-b422-1b1f2d861431(a)i9g2000prn.googlegroups.com...
> ]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
> ]wrote:
> ]>
> ]> The trouble with your requirement is that we really need to do a
> backwards
> ]> tablescan - because it's probably the data near the end of the table
> that
> ]> is
> ]> changing while you are "wasting" time reading all the data from the
> start
> ]> of
> ]> the table.
> ]
> ]Excellent explanation, but I lost you here.  He says plan says doing a
> ]range scan, for 1% of the table?  (Maybe you hadn't seen subsequent
> ]post yet, where he mentions a fetch suddenly exhibiting the
> ]characteristics you describe.)
> ]
>
> By the time I'd written this much, I'd forgotten that he'd added the note
> about the index - but it doesn't really make any difference (a) to the
> explanation or (b) to the concept in the solution - except that you
> can put in an "index_desc()" hint and that might be enough to help.
> It depends on the how the query is written, what index it uses, and
> the distribution of the changed data.
>
> ]>
> ]> Unfortunately there is no such hint - but if it's really critical, you
> ]> could write
> ]> some code to scan the table one extent at a time in reverse order.
> ]
> ]This cleaning makes perfect sense, but I'm wondering if there is some
> ]administrative tuning like adjusting undo size or retention or some
> ]fiddling with initrans?  Sounds critical if it's interrupting data
> ]extraction.
>
> The error is "just" the same as a traditional 1555 problem when it gets
> that far so a "large enough" undo retention should stop the 1555 - but
> that won't stop the amount of work it takes.  Thinking about initrans is
> a good idea - but that won't have any effect either because the problem
> is the number of backward steps that have to be taken and the value of
> initrans only eliminates the first few (i.e. a few relating to the size of
> INITRANS).
>
> --
> Regards
>
> Jonathan Lewishttp://jonathanlewis.wordpress.com

What is really odd about this is that several months ago, I started
running a job to “pre-scan” all the rows we would need, before the
“real” job got there. My assumption was this had something to do with
block cleanout, even though none of the cleanout statistics were
incremented the way the “transaction tables consistent reads - undo
records applied” counter was.

This doesn’t seem to help, though. My “pre-scan” job never has an
issue, but I run one hour windows for the range to scan.
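
Conceptually the pre-scan just touches every block in the window ahead
of time; a minimal sketch of the idea (the counted column and the binds
are stand-ins, not our actual code) would be:

-- Counting a non-indexed column forces visits to the table blocks (and
-- any delayed block cleanout), rather than answering from the
-- create_time index alone.
DECLARE
  l_touched NUMBER;
BEGIN
  SELECT COUNT(col5)
    INTO l_touched
    FROM big_table
   WHERE create_time BETWEEN trunc(SYSDATE) + (:b1 / 1440)
                         AND trunc(SYSDATE) + (:b2 / 1440);
END;
/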

A little more background. This is a “transaction history” table of
sorts. It is partitioned by month, and records are only added, never
updated.

SQL> desc big_table
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PK                                        NOT NULL NUMBER
 FK                                        NOT NULL NUMBER
 COL3                                      NOT NULL NUMBER(3)
 CREATE_TIME                                        TIMESTAMP(6)
 COL5                                      NOT NULL VARCHAR2(50)
 COL6                                               VARCHAR2(50)
 COL7                                               XMLTYPE

SQL>

We query as follows:

SELECT concatenated_xml_string_of_columns_from_big_table,
       a.xml_col.getClobVal()
  FROM big_table a
 WHERE create_time BETWEEN trunc(sysdate) + (:1 / 1440)
                       AND trunc(sysdate) + (:2 / 1440)

…where the window is three hours. This does a range scan on the
create_time column, which is good as it is by far the most selective
filter.
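
(For what it's worth, the plan is easy to confirm - a standard check,
nothing specific to this case, with the select list abbreviated:)

EXPLAIN PLAN FOR
SELECT a.pk, a.xml_col.getClobVal()
  FROM big_table a
 WHERE create_time BETWEEN trunc(sysdate) + (:1 / 1440)
                       AND trunc(sysdate) + (:2 / 1440);

SELECT * FROM table(dbms_xplan.display);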

The selected records are retrieved in PL/SQL (no bulk collect), run
through a few more XML tagging operations, and written to a file. They
are then propagated to a mainframe for additional business usage to
which I am not privy.
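
(A bulk-collect version of that fetch loop would look roughly like the
sketch below - the cursor columns and LIMIT size are illustrative, not
our actual code:)

DECLARE
  CURSOR c IS
    SELECT a.pk, a.xml_col.getClobVal() AS xml_doc
      FROM big_table a
     WHERE create_time BETWEEN trunc(SYSDATE) + (:b1 / 1440)
                           AND trunc(SYSDATE) + (:b2 / 1440);
  TYPE t_rows IS TABLE OF c%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_rows LIMIT 500;
    EXIT WHEN l_rows.COUNT = 0;
    FOR i IN 1 .. l_rows.COUNT LOOP
      NULL;  -- XML tagging and the file write would go here
    END LOOP;
  END LOOP;
  CLOSE c;
END;
/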

If the query runs “fast enough” (less than 30 minutes or so), we don’t
see the issue. If it starts to “get slow” for whatever reason, we
start reading tons of undo.

Based on what you wrote, and the fact that I “pre-scan” the rows,
shouldn’t the pre-scan have already paid the price for the cleanout?
Or could it be we *do* have other transactions hitting this table of
which I am not aware? In other words,

* I pre-scan
* A row *is* changed after my query finishes
* They run the “real” query

Thanks,

Steve
From: joel garry on
On Jun 28, 6:37 am, Steve Howard <stevedhow...(a)gmail.com> wrote:
> On Jun 26, 2:05 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
> wrote:

> [snip - table description and query, quoted in full above]
>
> If the query runs “fast enough” (less than 30 minutes or so), we don’t
> see the issue.  If it starts to “get slow” for whatever reason, we
> start reading tons of undo.

Something old but new to me I learned today (from Lob retention not
changing when undo_retention is changed [ID 563470.1]):

"...It is assumed that when UNDO_RETENTION is changed the lobs
connected to that retention are also changed which is not the case .

If a lob is modified from RETENTION to PCTVERSION and back to
RETENTION again then the lob retention is updated. ..."

Of course I have no idea if it is related to your problem, unless you
say something like you've changed your undo retention from 30 minutes
or so and didn't know about this...
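
If that note does apply, the workaround it describes would look
something like this sketch (table and column names assumed from Steve's
desc output; an XMLType column's underlying LOB may need to be
referenced differently):

-- Toggle the LOB out of RETENTION and back so it re-reads the current
-- undo_retention setting, per ID 563470.1.
ALTER TABLE big_table MODIFY LOB (col7) (PCTVERSION 10);
ALTER TABLE big_table MODIFY LOB (col7) (RETENTION);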

A bit more of a reach, maybe Bug 2931779 - False ORA-1555 accessing
"cache read" LOBs in RAC [ID 2931779.8] or related has reanimated in
some form.

I guess you need to start tracing and digging deep to figure this one
out. Those mysterious xml packages may be doing something strange...
(I've run into 3rd party app code at times that does stuff like update
and rollback, unexpectedly).

jg
--
@home.com is bogus.
http://thehill.com/blogs/hillicon-valley/technology/105721-sen-bond-says-dhs-shouldnt-oversee-cybersecurity