Surprising Performance Changes with Oracle 11.2.0.1 (Long Post) [Oracle]

Prev: So is 11.2.0.2 going to be out anytime soon?
Next: Ora-19906 when restoring user managed backup - stumped!

From: Charles Hooper on 5 Sep 2009 11:28

On Sep 5, 2:04 am, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
wrote:
> > "Charles Hooper" <hooperc2...(a)yahoo.com> wrote in message
> >news:1ea99c67-8713-4ece-a0f5-85f66851b016(a)v2g2000vbb.googlegroups.com...
> > I am working on an Oracle performance related project, so I thought
> > that I would compare the performance of Oracle database 11.1.0.6 (on
> > 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
> > Linux).
>
> ...
>
> Charles,
>
> Interesting figures.
> A couple of follow-on questions:
> How much memory in the machine ?
> How many CPUs
> Were you running the client locally - ipc or tcp
> How much CPU usage "outside" Oracle in each test
> Any significant variation in 'prefetch' blocks
> Any other significant variation in stats
>
> --
> Regards
>
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
>
> Author: Cost Based Oracle: Fundamentalshttp://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
>
> The Co-operative Oracle Users' FAQhttp://www.jlcomp.demon.co.uk/faq/ind_faq.html

12GB of memory, 1 CPU (a Core i7 940 at 2.93GHz), local connection
with no listener started.

Robert, good question. Yes, table and index stats were collected with
no histograms.

A short sampling of the waits found during the index scan (near the
start of the trace) with an unset FILESYSTEMIO_OPTIONS (69689 is the
table, 69690 is the index):
FETCH
#8:c=0,e=897,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204301576
WAIT #8: nam='SQL*Net message from client' ela= 103 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252080204301706
WAIT #8: nam='db file parallel read' ela= 173 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204301959
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1350890
blocks=1 obj#=69689 tim=1252080204302009
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252080204302022
WAIT #8: nam='db file parallel read' ela= 192 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204302283
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1356525
blocks=1 obj#=69689 tim=1252080204302373
WAIT #8: nam='db file scattered read' ela= 33 file#=7 block#=2162857
blocks=7 obj#=69690 tim=1252080204302442
WAIT #8: nam='db file parallel read' ela= 88 files=1 blocks=19
requests=19 obj#=69689 tim=1252080204302579
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1361383
blocks=1 obj#=69689 tim=1252080204302666
FETCH
#8:c=1000,e=982,p=107,cr=102,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204302701
WAIT #8: nam='SQL*Net message from client' ela= 105 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252080204302823
WAIT #8: nam='db file parallel read' ela= 178 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204303081
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1364081
blocks=1 obj#=69689 tim=1252080204303130
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252080204303143
WAIT #8: nam='db file parallel read' ela= 174 files=1 blocks=39
requests=39 obj#=69689 tim=1252080204303381
WAIT #8: nam='db file sequential read' ela= 8 file#=7 block#=1369027
blocks=1 obj#=69689 tim=1252080204303468
WAIT #8: nam='db file parallel read' ela= 98 files=1 blocks=19
requests=19 obj#=69689 tim=1252080204303613
WAIT #8: nam='db file sequential read' ela= 7 file#=7 block#=1373988
blocks=1 obj#=69689 tim=1252080204303719
FETCH
#8:c=1000,e=921,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252080204303757

In the above, note the impossibly fast times for the waits - this is
fairly consistent through the trace file. The index blocks were read
by db file scattered read, db file sequential read, but a quick check
of the file did not seem to indicate that db file parallel read was
used for the index blocks. The table blocks were read by db file
scattered read, db file sequential read, and db file parallel read.

A short sampling of the waits found during the index scan (near the
start of the trace) with FILESYSTEMIO_OPTIONS set to SetAll (69689 is
the table, 69690 is the index):
FETCH
#8:c=2000,e=251016,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088662939190
WAIT #8: nam='SQL*Net message from client' ela= 200 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252088662939432
WAIT #8: nam='db file parallel read' ela= 94215 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663033884
WAIT #8: nam='db file sequential read' ela= 5612 file#=7
block#=1397615 blocks=1 obj#=69689 tim=1252088663039645
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252088663039684
WAIT #8: nam='db file parallel read' ela= 74655 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663114551
WAIT #8: nam='db file sequential read' ela= 3377 file#=7
block#=1402576 blocks=1 obj#=69689 tim=1252088663118159
WAIT #8: nam='db file parallel read' ela= 41403 files=1 blocks=19
requests=19 obj#=69689 tim=1252088663159699
WAIT #8: nam='db file sequential read' ela= 8198 file#=7
block#=1407689 blocks=1 obj#=69689 tim=1252088663168086
FETCH
#8:c=2999,e=228700,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088663168161
WAIT #8: nam='SQL*Net message from client' ela= 219 driver
id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252088663168420
WAIT #8: nam='db file scattered read' ela= 7961 file#=7 block#=1346037
blocks=3 obj#=69690 tim=1252088663176493
WAIT #8: nam='db file parallel read' ela= 61848 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663238666
WAIT #8: nam='db file sequential read' ela= 54241 file#=7
block#=1410476 blocks=1 obj#=69689 tim=1252088663293049
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232
#bytes=1 p3=0 obj#=69689 tim=1252088663293115
WAIT #8: nam='db file parallel read' ela= 87347 files=1 blocks=39
requests=39 obj#=69689 tim=1252088663380663
WAIT #8: nam='db file sequential read' ela= 9152 file#=7
block#=1415385 blocks=1 obj#=69689 tim=1252088663390014
WAIT #8: nam='db file parallel read' ela= 30242 files=1 blocks=19
requests=19 obj#=69689 tim=1252088663420387
WAIT #8: nam='db file sequential read' ela= 6660 file#=7
block#=1420835 blocks=1 obj#=69689 tim=1252088663427228
FETCH
#8:c=4000,e=258841,p=103,cr=102,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252088663427311

Elapsed time for the full tablescan with with direct I/O and asynch I/
O and other monitoring enabled: roughly 42 seconds

Top output with direct I/O and asynch I/O enabled during the full
tablescan operation (terminated about 5 seconds into the run):
top - 09:48:21 up 1:40, 4 users, load average: 0.42, 0.23, 0.15
Tasks: 244 total, 1 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.7%us, 1.0%sy, 0.0%ni, 85.5%id, 9.6%wa, 0.2%hi,
0.1%si, 0.0%st
Mem: 12291184k total, 1694080k used, 10597104k free, 93000k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 1123524k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7090 oracle 16 0 8416m 125m 112m S 24.3 1.0 0:10.34
oracle
7082 oracle 18 0 65468 12m 9032 S 9.6 0.1 0:04.15
sqlplus
4966 oracle 15 0 284m 18m 8964 S 2.3 0.2 0:03.53 gnome-
terminal
4848 oracle 15 0 492m 30m 12m S 0.7 0.3 0:18.77
nautilus
718 root 10 -5 0 0 0 S 0.3 0.0 0:05.19 usb-
storage
....

V$OSSTAT delta values (60 second capture) with direct I/O and asynch I/
O enabled during the full tablescan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 5912 1759 29.32
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 864418 46730 778.83
IOWAIT_TIME 4973 3424 57.07
LOAD .319335938 0 0.00
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 1353 383 6.38
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 4120 1266 21.10

V$SYS_TIME_MODEL delta values (60 second capture) with direct I/O and
asynch I/O enabled during the full tablescan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 10574389
176239.82
DB time 40284606
671410.10
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 135017
2250.28
PL/SQL execution elapsed time 799537
13325.62
RMAN cpu time (backup/restore) 0
0.00
background cpu time 262959
4382.65
background elapsed time 6982341
116372.35
connection management call elapsed time 310170
5169.50
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 62740
1045.67
hard parse (sharing criteria) elapsed time 114156
1902.60
hard parse elapsed time 3062463
51041.05
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 3414383
56906.38
repeated bind elapsed time 7238
120.63
sequence load elapsed time 9007
150.12
sql execute elapsed time 39621673
660361.22

------------------------------------

Elapsed time for the full tablescan with other monitoring enabled:
11.97 seconds

Top output with unset filesystemio_options during the full tablescan
operation (terminated about 5 seconds into the run):
top - 10:26:47 up 2:18, 4 users, load average: 0.41, 0.28, 0.20
Tasks: 245 total, 2 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.5%us, 3.6%sy, 0.0%ni, 86.8%id, 0.1%wa, 0.0%hi,
0.0%si, 0.0%st
Mem: 12291184k total, 8364088k used, 3927096k free, 95948k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 7768492k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7936 oracle 18 0 8416m 138m 125m S 74.2 1.2 0:20.07
oracle
4994 oracle 18 0 69240 16m 9392 R 25.6 0.1 0:16.30
sqlplus
4727 root 15 0 173m 43m 23m S 1.7 0.4 1:26.83
Xorg
4848 oracle 15 0 492m 30m 12m S 1.3 0.3 0:25.46
nautilus
7870 oracle 16 0 8401m 69m 67m S 0.7 0.6 0:00.14 oracle
....

V$OSSTAT delta values (60 second capture) with unset
filesystemio_options during the full tablescan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 4843 1464 24.40
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 194252 46582 776.37
IOWAIT_TIME 66 30 0.50
LOAD .559570313 0 0.00
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 1218 380 6.33
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 3561 1067 17.78

V$SYS_TIME_MODEL delta values (60 second capture) with unset
filesystemio_options during the full tablescan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 9641528
160692.13
DB time 10092101
168201.68
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 49928
832.13
PL/SQL execution elapsed time 227048
3784.13
RMAN cpu time (backup/restore) 0
0.00
background cpu time 124977
2082.95
background elapsed time 339846
5664.10
connection management call elapsed time 42929
715.48
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 718
11.97
hard parse (sharing criteria) elapsed time 48453
807.55
hard parse elapsed time 1029104
17151.73
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 1324548
22075.80
repeated bind elapsed time 21478
357.97
sequence load elapsed time 4464
74.40
sql execute elapsed time 9611995
160199.92

------------------------------------
here
Elapsed time for the index access with other monitoring enabled: 23.44
seconds

Top output with unset filesystemio_options during the index scan
operation (terminated about 5 seconds into the run):
top - 10:41:56 up 2:33, 4 users, load average: 0.10, 0.09, 0.12
Tasks: 247 total, 2 running, 245 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.4%us, 5.0%sy, 0.0%ni, 87.1%id, 0.4%wa, 0.1%hi,
0.0%si, 0.0%st
Mem: 12291184k total, 8811144k used, 3480040k free, 97216k
buffers
Swap: 14352376k total, 0k used, 14352376k free, 8205800k
cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
7936 oracle 18 0 8412m 652m 641m R 88.5 5.4 0:32.71
oracle
4994 oracle 18 0 69240 16m 9392 S 9.3 0.1 0:19.61
sqlplus
7924 oracle 15 0 8405m 74m 68m S 0.7 0.6 0:00.43
oracle
6 root 34 19 0 0 0 S 0.3 0.0 0:02.67 ksoftirqd/
1
718 root 10 -5 0 0 0 S 0.3 0.0 0:08.02 usb-
storage
....

V$OSSTAT delta values (60 second capture) with unset
filesystemio_options during the index scan operation:
STAT_NAME END_VALUE DELTA DELTA_SEC
------------------------- ---------- ----------- ------------
BUSY_TIME 9826 2459 40.98
GLOBAL_RECEIVE_SIZE_MAX 4194304 0 0.00
GLOBAL_SEND_SIZE_MAX 1048586 0 0.00
IDLE_TIME 913964 45579 759.65
IOWAIT_TIME 256 123 2.05
LOAD .33984375 0 0.01
NICE_TIME 0 0 0.00
NUM_CPUS 8 0 0.00
NUM_CPU_CORES 4 0 0.00
NUM_CPU_SOCKETS 1 0 0.00
PHYSICAL_MEMORY_BYTES 1.2586E+10 0 0.00
RSRC_MGR_CPU_WAIT_TIME 0 0 0.00
SYS_TIME 2807 962 16.03
TCP_RECEIVE_SIZE_DEFAULT 87380 0 0.00
TCP_RECEIVE_SIZE_MAX 4194304 0 0.00
TCP_RECEIVE_SIZE_MIN 4096 0 0.00
TCP_SEND_SIZE_DEFAULT 16384 0 0.00
TCP_SEND_SIZE_MAX 4194304 0 0.00
TCP_SEND_SIZE_MIN 4096 0 0.00
USER_TIME 6618 1459 24.32

V$SYS_TIME_MODEL delta values (60 second capture) with unset
filesystemio_options during the index scan operation:
STAT_NAME DELTA
DELTA_SEC
-------------------------------------------------- -----------
------------
DB CPU 21543721
359062.02
DB time 22548982
375816.37
Java execution elapsed time 0
0.00
PL/SQL compilation elapsed time 179573
2992.88
PL/SQL execution elapsed time 210949
3515.82
RMAN cpu time (backup/restore) 0
0.00
background cpu time 184972
3082.87
background elapsed time 482752
8045.87
connection management call elapsed time 52331
872.18
failed parse (out of shared memory) elapsed time 0
0.00
failed parse elapsed time 0
0.00
hard parse (bind mismatch) elapsed time 5838
97.30
hard parse (sharing criteria) elapsed time 49516
825.27
hard parse elapsed time 899229
14987.15
inbound PL/SQL rpc elapsed time 0
0.00
parse time elapsed 1097713
18295.22
repeated bind elapsed time 7553
125.88
sequence load elapsed time 5252
87.53
sql execute elapsed time 22255425
370923.75

Test table and index (think of this example as picking up the X or Y
positions of an engine cylinder wall during CMM testing with 32 test
data points per cylinder the buffer cache is flushed for consistency
and to simulate an infrequently accessed table):
CREATE TABLE T1 (
ID NUMBER,
DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
'This is the long description for this number '|| TO_CHAR(CEIL(ABS
(SIN(ROWNUM/9.9999)*10000)))
FROM
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000),
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000);

COMMIT;

CREATE INDEX IND_T1 ON T1(ID);

EXEC DBMS_STATS.GATHER_TABLE_STATS
(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, METHOD_OPT=>'FOR ALL
COLUMNS SIZE 1')

ANALYZE INDEX IND_T1 VALIDATE STRUCTURE;

Test script:
set linesize 150
set pagesize 2000
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;
SET ARRAYSIZE 100

spool /u01/app/oracle/results/BCHRTestShortNoAsync.txt

SET AUTOTRACE TRACEONLY EXPLAIN
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

SET ARRAYSIZE 100
SET AUTOTRACE TRACEONLY STATISTICS

SET TIMING ON

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE100';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE5';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

spool off

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

From: Charles Hooper on 5 Sep 2009 12:14

On Sep 5, 11:17 am, "Matthias Hoys" <a...(a)spam.com> wrote:
> "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk> wrote in message
>
> news:stydnRbjqr7nYjzXnZ2dnUVZ8nSdnZ2d(a)bt.com...
> >> "Charles Hooper" <hooperc2...(a)yahoo.com> wrote in message
> >>news:1ea99c67-8713-4ece-a0f5-85f66851b016(a)v2g2000vbb.googlegroups.com....
> >> I am working on an Oracle performance related project, so I thought
> >> that I would compare the performance of Oracle database 11.1.0.6 (on
> >> 64 bit Linux), 11.1.0.7 (on 64 bit Windows), and 11.2.0.1 (on 64 bit
> >> Linux).
>
> > ...
>
> > Charles,
>
> > Interesting figures.
> > A couple of follow-on questions:
> > How much memory in the machine ?
> > How many CPUs
> > Were you running the client locally - ipc or tcp
> > How much CPU usage "outside" Oracle in each test
> > Any significant variation in 'prefetch' blocks
> > Any other significant variation in stats
>
> > --
> > Regards
>
> > Jonathan Lewis
> >http://jonathanlewis.wordpress.com
>
> > Author: Cost Based Oracle: Fundamentals
> >http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
>
> > The Co-operative Oracle Users' FAQ
> >http://www.jlcomp.demon.co.uk/faq/ind_faq.html
>
> And what disk subsystem did you use? DAS or a shared system?
>
> Matthias

Direct attached, so the only buffering was in Oracle's buffer cache
and the 32MB of cache memory built into each of the drives. The
problem, I believe, is caused by the drives having to constantly
reposition their heads for each random I/O - I believe that the drives
have either a 5ms or 6ms random I/O time rating. The average time for
the db file sequential read wait is 5.942ms, which I believe is
considered very good for a random I/O.

The trace file shows that Oracle keeps jumping between reads of the
index, followed by several usually parallel reads (I believe that each
of these block reads would also be considered a random I/O) of the
table blocks.

On an unrelated note, it may take over an hour to generate the sample
T1 table, so fair warning for anyone wanting to try the test on their
system.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

From: Jonathan Lewis on 5 Sep 2009 12:35

"Charles Hooper" <hooperc2000(a)yahoo.com> wrote in message
news:20921d95-425c-4a2e-8442

CREATE TABLE T1 (
ID NUMBER,
DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
'This is the long description for this number '|| TO_CHAR(CEIL(ABS
(SIN(ROWNUM/9.9999)*10000)))
FROM
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000),
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10000);

COMMIT;

CREATE INDEX IND_T1 ON T1(ID);

EXEC DBMS_STATS.GATHER_TABLE_STATS
(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, METHOD_OPT=>'FOR ALL
COLUMNS SIZE 1')

ANALYZE INDEX IND_T1 VALIDATE STRUCTURE;

Test script:
set linesize 150
set pagesize 2000
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;
SET ARRAYSIZE 100

spool /u01/app/oracle/results/BCHRTestShortNoAsync.txt

SET AUTOTRACE TRACEONLY EXPLAIN
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

SET ARRAYSIZE 100
SET AUTOTRACE TRACEONLY STATISTICS

SET TIMING ON

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE100';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE5';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;

SELECT
ID,
DESCRIPTION
FROM
T1
WHERE
ID BETWEEN 1 AND 400;

spool off

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

Charles,

My numbers may be wrong, but I estimate that your table is
about 6.5GB and your index is about 1.5GB, on which you do
a range scan of 4% for a total 600MB likely to be buffered.

When you finish the range scan versions, how much of the
buffer cache is still free. On one hand we might expect the
entire table and the section of index to be buffered - leaving
about 1GB free - on the other hand Oracle may have been
re-using buffers for the range scan even though the number of
free buffers was huge. (Consider the possibility that you are
reading into the cold half only - this probably shouldn't be
happening after a flush buffer cache, but if it is your buffer
cache is effectively 4GB instead of 8GB).

A possible interpretation of the big difference in figures is as
follows: when doing the range scan you visit many blocks in
the table 3 or 4 times (due to the cyclic but non-uniform nature
of your data). If Oracle is re-using buffers instead of consuming
free buffers all the time then you have to re-read a lot of buffers.

When you enable direct I/O, all those reads come from disk - if
async i/o is also enabled many of those reads might be competing
with each other through different AIO processes.

When you disable direct I/O you have an extra 4GB of file system
buffer backing the Oracle buffer cache, and do far fewer real disk
accesses.

When you do the tablescan, you visit each block only once - when
you do direct I/O you get readahead benefits from the hardware
and don't waste CPU double-buffering through the file system.

When you disable direct IO you use more CPU because of the double
buffering - but because of the long physical reads you don't lose any extra
time on the physical I/O.

You might like to reboot the hardware between runs to eliminate any
filesystem and SAN caching if you want to do a painfully rigorous test,
of course, but I don't think it would affect my guess by much.

--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

From: Mladen Gogala on 5 Sep 2009 13:00

On Fri, 04 Sep 2009 18:04:37 -0700, Charles Hooper wrote:

> Obviously from the above, 11.2.0.1 is significantly faster at the index
> access than is 11.1.0.6 on the same platform. But wait, I forgot
> something. On 11.1.0.6 on Linux I had enabled direct I/O and
> asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to
> SETALL. I did not change that parameter on 11.2.0.1, so it defaulted to
> NONE. What happens when the FILESYSTEMIO_OPTIONS parameter is set to
> SETALL?
> * Oracle 11.2.0.1 now required 42.45 seconds for the full tablescan
> while the index access path required one hour, 16 minutes and 46
> seconds.

Charles, that is to be expected, although this is a bit drastic
difference. Most file systems do read prefetch and even asynchronous
reads of the blocks when using buffer cache. Also, just as Oracle RDBMS,
Unix derivatives also have smart strategies for keeping the hot blocks in
the cache, which results in rather drastic performance gains. If you
start doing direct I/O, lose all the help from the system buffer cache
and OS. You can help things by using readahead* RH EL services and
"blockdev" command to increase read-ahead.

--
http://mgogala.freehostia.com

From: Charles Hooper on 5 Sep 2009 16:26

On Sep 5, 12:35 pm, "Jonathan Lewis" <jonat...(a)jlcomp.demon.co.uk>
wrote:
> "Charles Hooper" <hooperc2...(a)yahoo.com> wrote in message
>
> news:20921d95-425c-4a2e-8442
>
> CREATE TABLE T1 (
> ID NUMBER,
> DESCRIPTION VARCHAR2(80));
>
> INSERT INTO T1
> SELECT
> CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
> 'This is the long description for this number '|| TO_CHAR(CEIL(ABS
> (SIN(ROWNUM/9.9999)*10000)))
> FROM
> (SELECT
> ROWNUM RN
> FROM
> DUAL
> CONNECT BY
> LEVEL<=10000),
> (SELECT
> ROWNUM RN
> FROM
> DUAL
> CONNECT BY
> LEVEL<=10000);
>
> COMMIT;
>
> Charles,
>
> My numbers may be wrong, but I estimate that your table is
> about 6.5GB and your index is about 1.5GB, on which you do
> a range scan of 4% for a total 600MB likely to be buffered.
>
> When you finish the range scan versions, how much of the
> buffer cache is still free. On one hand we might expect the
> entire table and the section of index to be buffered - leaving
> about 1GB free - on the other hand Oracle may have been
> re-using buffers for the range scan even though the number of
> free buffers was huge. (Consider the possibility that you are
> reading into the cold half only - this probably shouldn't be
> happening after a flush buffer cache, but if it is your buffer
> cache is effectively 4GB instead of 8GB).
>
> A possible interpretation of the big difference in figures is as
> follows: when doing the range scan you visit many blocks in
> the table 3 or 4 times (due to the cyclic but non-uniform nature
> of your data). If Oracle is re-using buffers instead of consuming
> free buffers all the time then you have to re-read a lot of buffers.
>
> When you enable direct I/O, all those reads come from disk - if
> async i/o is also enabled many of those reads might be competing
> with each other through different AIO processes.
>
> When you disable direct I/O you have an extra 4GB of file system
> buffer backing the Oracle buffer cache, and do far fewer real disk
> accesses.
>
> When you do the tablescan, you visit each block only once - when
> you do direct I/O you get readahead benefits from the hardware
> and don't waste CPU double-buffering through the file system.
>
> When you disable direct IO you use more CPU because of the double
> buffering - but because of the long physical reads you don't lose any extra
> time on the physical I/O.
>
> You might like to reboot the hardware between runs to eliminate any
> filesystem and SAN caching if you want to do a painfully rigorous test,
> of course, but I don't think it would affect my guess by much.
>
> --
> Regards
>
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
>
> Author: Cost Based Oracle: Fundamentals
> http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
>
> The Co-operative Oracle Users' FAQ
> http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Jonathan,

Those are very good estimates. The table and index reside in an ASSM
autoallocate tablespace. The table and index statistics looked like
this during one of the test runs (either on 64 bit Windows with
11.1.0.7 or 64 bit Windows with 10.2.0.4, but it sometimes changed in
repeated tests):
SEGMENT EXTENTS EXT_SIZE_KB TOTAL_MB
------- -------- ------------ ---------
IND_T1 16 64 1
IND_T1 63 1,024 63
IND_T1 120 8,192 960
IND_T1 10 65,536 640
T1 16 64 1
T1 63 1,024 63
T1 120 8,192 960
T1 1 15,360 15
T1 1 22,528 22
T1 1 63,488 62
T1 82 65,536 5,248

DBA_INDEXES reported the CLUSTERING_FACTOR as 101,149,320.
INDEX_STATS reported the following with an 8KB block size:
HEIGHT BLOCKS LF_BLKS LF_ROWS DISTINCT_KEYS
MOST_REPEATED_KEY PCT_USED
------ -------- -------- ------------ ------------- -----------------
----------
3 212,992 208,854 100,000,000 10,000
900,324 90

212,992 * 8,192 = 1,744,830,464 = 1.625 GB

The table extents add up to about 6.22GB, but of course it is possible
that the last 64MB extent was not fully used (this seems to be
confirmed by the number of physical reads).

Unfortunately, I did not check how much free space was available in
the buffer cache following the runs. The full tablescans primarily
performed direct path reads which I believe would prevent the blocks
from being cached in the buffer cache (this was also a bit of a
surprise as I was expecting db file scattered read waits).
Considering that there were 2,573,633 consistent block reads and
2,508,560 physical block reads during the test with 11.2.0.1 with
direct I/O and asych I/O enabled, and only 838,370 consistent block
reads and 813,120 physical block reads during the full tablescan, it
would seem that not many of the blocks remained in the buffer cache.
On Windows, __DB_CACHE_SIZE had a value of about 1,375,731,712 at the
end of the test. I did not check the value on Linux after the test
run. On Windows I actually performed one other test, which
essentially set the KEEP buffer cache to a very small value. This
allowed the __DB_CACHE_SIZE parameter to increase from roughly 1.3GB
to roughly 6.6GB which decreased the time for the full tablescan to
31.68 seconds, and the index range scan decreased to 32 minutes and
40.27 seconds.

Mladen, thanks for sharing that information.

This thread seems to fit in nicely with a couple recent blog posts on
Jonathan's website: Queue Time, and Real World. However, I must say
that everyone in this thread did not jump to the conclusion that when
selecting 2.55% of a table, that an index should be used rather than a
full tablescan.

There were, of course, several surprises:
* A tablescan of a "large" table would correctly be preferred by
Oracle when selecting 2.55% of the rows in the test table.
* A tablescan is significantly faster in some cases than an index
range scan when a small portion of the data from the table is needed.
* Direct I/O and Asynch I/O, which seem to be frequently recommended
to improve performance, do not always improve performance, and may in
fact drastically affect performance.
* 64 bit Windows faired reasonably well with Linux when Direct I/O and
Asynch I/O were enabled in Linux, when the same hardware is used for
both platforms.
* Linux 11.2.0.1 seemed to be a bit slower than 11.1.0.6 with Direct I/
O and Asynch I/O enabled, but of course 11.2.0.1 does not suffer as
badly from adaptive cursor sharing problems as 11.1.0.6.
* Oracle used direct path reads rather than db file scattered reads
during the tablescan.
* Others?

So, should the OPTIMIZER_INDEX_COST_ADJ parameter be set to the lower
number to (quoting from a posting on the Internet) immediately tune
all of the SQL in your database to favor index scans over full-table
scans? :-)

Please don't spend too much time considering how to fix this test case
(it was intended as a simple set up, which evolved a bit when 11.2.0.1
was released and I obtained *amazing* performance improvements. There
were other tests too, but I will save those for later.

Aman, thanks for the compliment.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

First | Prev | Next | Last
Pages: 1 2 3 4 5
Prev: So is 11.2.0.2 going to be out anytime soon?
Next: Ora-19906 when restoring user managed backup - stumped!