Exadata Storage Index Aging – Part 2B

This is the third blog in a series looking at Storage Cell Storage Index Predicate Offloads. Continued from previous post. And there’s more here: Analysis for Test 2B click here.

For Test 2B – This corrects the columns with very little offload so that they get between 60-100 percent offload from the storage index. See the previous 2 blogs for more background information.

Now we have offload for each test segment and 8 columns offloaded for each In Line (IL) test as expected. We can even see a pattern on when columns are released (aged out) after repeated queries on the 8 columns.

The pattern I see is that after 10 queries columns 1, and 7-21 seemed to keep their offload status without aging out. But what is going on with column 1 that keeps it from aging out for the entire test. And what about 2-6 that seem to age out fairly fast after their query run with a storage index offload.

I see some patterns developing, but may need more eyes info or tests to figure this out. Will continue when I can.

Let me hear your ideas.

Here’s the test sequence again:

Test 2 - Test Plan

Test 2 – Test Plan

Here’s the full output in excel format; this time run to completion:

Exadata Storage Index Aging – Part 2A

Continuing from Storage Index Aging – part 1 the test was re-run. This time the In-Line (IL) test was run for the most recent test column and then proceeding backwards through the most recently run columns, then running the remaining columns in the 21 column set for the table.

Here’s a quick graphic that may be easier to visualize.

Test 2 - Test Plan

Test 2 – Test Plan

This set of tests are closer to what I expected, but those “pesky” columns 6, 7, 13, 14, 20 and 21 are still not offloading the way I’d like, so I’ll fix that in Test 2B (next blog).

On the bright side, we see more of the columns persisting in the IL tests. Looks like 6 out of our expected 8 columns are persisting without aging out. Also, although the columns above are “pesky”, they may also be telling us something about how the aging process works. But for now it’s still just an observation.

Also note the persistence of columns 18 and 19, still the same as in Test 1.

Output Data (Abbreviated):

Image with partial output data:

Partial Image of Data Results - Test 2a

Partial Image of Data Results – Test 2a

Excel file with more output data: dcox_exadata_storage_index_aging_test2

Exadata Storage Index Aging – Part 1

Just took some classes with some of the smart guys at the Accenture Enkitec Group (Enkitec) guys. Disclaimer: I work there too.

In my last class I got inspired by Carlos Sierra’s Exadata Tuning class. Carlos was talking about how little we know about the actual aging process in the Exadata Storage Index arena. He mentioned it might be easy to write a little code to check out this aging process. So I stole was inspired by his idea.

If you are unfamiliar with Exadata Storage Indexes read this Oracle Magazine article from Arup Nanda. Or better yet, get more details in the Expert Oracle Exadata book.

Here’s what I think I know about this process: Up to 8 columns per table can be held in the storage index array.

Target Problem: How do single table single predicate storage index offloads age out.

Environment Specifications:
Target Machine: X3 quarter rack
GI Version:
DB Version:
_cell_storidx_mode EVA
_kcfis_storageidx_diag_mode 0
_kcfis_storageidx_disabled FALSE

Cell Parameters(storage index parameters):
_cell_enable_storage_index_for_loads = TRUE
_cell_enable_storage_index_for_writes = TRUE
_cell_si_max_num_diag_mode_dumps = 20
_cell_storage_index_columns = 8 (default = 0)
_cell_storage_index_partial_reads_threshold_percent = 85
_cell_storage_index_partial_rd_sectors = 512
_cell_enable_storage_index_for_loads = TRUE
_cell_enable_storage_index_for_writes = TRUE
_cell_storage_index_diag_mode = 0
_cell_storage_index_sizing_factor = 2

Test 1:
I’ve identified a table where I can easily get a storage index to run for most columns. I verify this by gathering statistics before and after a query execution and computing the difference. The two statistics used for this test are:
cell physical IO bytes eligible for predicate offload
cell physical IO bytes saved by storage index

From the two statistics above the percentage of storage index predicate offload can be determined (see partial output data below). This is similar to a script we used in class to capture the difference in stats on both sides of a query.

My test table has 2.5+ billion rows. I took lab table from class with 7 columns and copied the same 7 columns over to have 21 columns.  It takes about 1-4 seconds for an offloaded query and 27-30 ish seconds for a non offloaded query.  Remember lots of variables here could change things, but this specific table and data on this specific machine works for my test goals.  I can repeat queries fairly quickly and have enough difference in the numbers to see when I do and don’t get an offload for the storage index.

Test Sequence (for Test 1):
I run a baseline test and query on a value out of each column (data types: number, varchar2 and date). Then I run 10 queries on each column, then move over to the next column and do it again for a total of 10 columns. An “in-line” (IL) or intermediate test is performed on all 21 columns starting at the first column and then the test is run again, but starting in column n+1 for each iteration.

Test 1 - Test Plan

Test 1 – Test Plan

The initial guess is that 8 of the columns, possibly the last 8 would still perform well on the IL test, but the others would not use the storage index offload.

The query each time would look something like this:
select count(column1) from table where column1 = ‘x’;

What I found:

Ok, I didn’t get what I expected, but this is the first try. Only the first column of the IL test showed an offload, and a small bit on the second column. But then stale from there on for the IL test. A strange blip of offloads on columns 18 and 19. So I need to modify my test to get more information. Still a start at least.

It will be easy to add to these test scripts. I’ve added some of the resulting output in an image below and a little more in an excel spreadsheet attached. Here are some basic notes:

  • After running the first 10 columns through the 10 repetitions, there is still a 83.2 pct offload on column 1 when it gets run on the IL test.
  • Curiously columns 18 and 19 get offloads and they haven’t been touched since the baseline run at the start.
  • Columns 6,7,13,14 (and probably 20,21) don’t get much offload at all.

This begs for more testing. The primary interest here is still to figure out more on aging. Perhaps these columns age out quicker than thought originally.

Modifying Test (next blog):
So I’ll test this next (see next blog – TEST 2). After the main set of repetitions, on the IL test I’ll test the columns “just run” first, starting with the most recent, then test the remaining columns.

Limited/No offload on some columns:
A quick glance at Columns 6,7,13,14,20,21 shows the type are all number(10,2). But there must be some difference in the way the data turns out since the offload on column 6 is 8.1% and 0 on column 7. And remember that if the value exists on all of the cells there would be no storage index offload, but there may be other types of cell offload. (Over simplifying here, see the docs on the intro for more information on this.)

After a quick look at the counts for each values one value has 1 billion rows of the 2.5 billion rows. Here are the top 5 counts:
1 109500000
81 85190307
21 85190243
61 85190243
41 84500018
Concentrating on the aging for now, will leave “why” it didn’t offload for another blog.

Note the colors closer to white show more offload. Darker colors show less offload (brown and blue).

Test Case 1 Exadata Storage Aging

Test Case 1 Exadata Storage Aging

Test Case 1 Exadata Storage Aging – Excel Version