Re-Sync issue on Exadata X5-2 after adding Automatic Storage Management Cluster File System (ACFS)

Had the joy of working on a new X5-2 with a 12c database recently. We added an ACFS mount and suddenly the ASM operations view showed a long re-sync process, estimated in days instead of the usual minutes. We asked around and no one had seen this behavior. After a little searching on the Oracle Support pages we found the bug and workaround in this note: “ASM RESYNC TOOK VERY LONG FOR ONE OF THE DISKGROUP IN EXADATA (Doc ID 1957331.1)”. Fixed it!
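
For anyone hitting something similar, the slow resync was visible from the ASM operations view; a query along these lines (run against an ASM instance) shows the estimated minutes remaining:

sql> select inst_id, group_number, operation, state, power, sofar, est_work, est_minutes
       from gv$asm_operation;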

A few notes from the docs: you may want to make sure all of your files and DBs are mounted and active before making this change (see the Oracle Doc: 12c ASM Docs).

We’re not sure if it was related, but before making this change, startup and shutdown times on some empty default databases seemed inappropriately long; that now seems to be cleared up as well (hmmm).

Table Driven Data Pump Applications – Lessons learned

After building a real-world table-driven Data Pump application (see the earlier blog, Table Driven Data Pump), I thought it would be helpful to record a few lessons learned before I forget them.

  • Create job control as part of the overall application.
  • Watch out for Character Set issues.
  • Jobs should be able to re-use generic tasks from other jobs.
  • Jobs should allow grouping of sub-jobs.
  • Progress monitoring queries should be considered in advance (see the sketch after this list).
  • Informational and error logging are very valuable in debugging.
  • Intermingling SQL tasks with Data Pump tasks is awesome.
  • Version control from a remote master for all aspects of the code is a must.
  • Including standard user provisioning is helpful.
  • Consider any encryption issues (current and future) as part of the initial plan.
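
For the progress-monitoring bullet, here is a minimal sketch of the kind of query we keep handy. These are the standard dictionary views, not the application’s own job-control tables:

sql> select owner_name, job_name, operation, job_mode, state, attached_sessions
       from dba_datapump_jobs;

sql> select opname, target_desc, sofar, totalwork,
            round(sofar / totalwork * 100, 1) as pct_done
       from v$session_longops
      where totalwork > 0
        and sofar < totalwork;
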
Exadata Storage Index Aging – Part 3 – Analysis on Test 2B

    Continued from previous post.

Based on the observations below, it looks as if there is more to the offloading mechanism than a hard maximum of 8 columns in the storage index array. It seems the 8-column parameter (_cell_storage_index_columns = 8) is a minimum for retention rather than a maximum. If columns 1-10 have all been offloaded, but a post-test still shows a similar but different set of columns retained (look closely at 1, 19, 20 and 21), then the total number of columns held must have risen above the 8 columns in the storage index array. I’d just attribute this to the fact that the array items are only created for 1 MB sections of each segment storage region, but the same percentage of offload occurs on the In-Line (IL) tests as on an earlier IL test.

This is good news for end users, meaning there is some elastic capacity in the storage index array above the 8 columns specified. It seems that more than these specific columns were offloaded, but once activity died down only 8 remained active. So 8 seems to be a minimum and not a maximum.

But we still need to understand why columns 1, 19, 20, and 21 seem to persist longer. Was it previous activity or something else in the data? Other possibilities?

    —————————————————————————————————————————

    Observations
    Looking at the data from Test 2B here are some summary observations of the changes:

    Baseline Test:
    8 offloaded columns retained from previous actions. Columns (in order of testing): 1, 7, 8, 9, 10, 19, 20, 21

    Test Columns 1-10:
    All ten columns get storage index offloads, but column 1 is offloaded on first try and has not been aged out of the storage index array.

    IL Test – All Columns – After test on columns 1-10
    Offloads not aged out in order of column testing: 10, 9, 8, 7, 1, 21, 20, 19

    Test Columns 2-11:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 2-11
    Offloads not aged out in order of column testing: 11, 10, 9, 8, 1, 21, 20, 19

    Test Columns 3-12:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 3-12
    Offloads not aged out in order of column testing: 12, 11, 10, 9, 1, 21, 20, 19

    Test Columns 4-13:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 4-13
    Offloads not aged out in order of column testing: 13, 12, 11, 10, 1, 21, 20, 19

    Test Columns 5-14:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 5-14
    Offloads not aged out in order of column testing: 14, 13, 12, 11, 1, 21, 20, 19

    Test Columns 6-15:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 6-15
    Offloads not aged out in order of column testing: 15, 14, 13, 12, 1, 21, 20, 19

    Test Columns 7-16:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 7-16
    Offloads not aged out in order of column testing: 16, 15, 14, 13, 1, 21, 20, 19

    Test Columns 8-17:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 8-17
    Offloads not aged out in order of column testing: 17, 16, 15, 14, 1, 21, 20, 19

    Test Columns 9-18:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query)

    IL Test – All Columns – After test on columns 9-18
    Offloads not aged out in order of column testing: 18, 17, 16, 15, 1, 21, 20, 19

    Test Columns 10-19:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query), except 19

    IL Test – All Columns – After test on columns 10-19
    Offloads not aged out in order of column testing: 19, 18, 17, 16, 15, 1, 21, 20

    Test Columns 11-20:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query), except 19, and 20

    IL Test – All Columns – After test on columns 11-20
    Offloads not aged out in order of column testing: 20, 19, 18, 17, 16, 15, 1, 21

    Test Columns 12-21:
    All ten columns get storage index offloads, all aged out (have to offload after 1st query), except 19, 20, and 21

    IL Test – All Columns – After test on columns 12-21
    Offloads not aged out in order of column testing: 21, 20, 19, 18, 17, 16, 15, 1

    Exadata Storage Index Aging – Part 2B

This is the third blog in a series looking at Storage Cell Storage Index predicate offloads, continued from the previous post. There’s more in the Analysis for Test 2B post.

For Test 2B – this corrects the columns with very little offload so that they get between 60 and 100 percent offload from the storage index. See the previous two blogs for more background information.

    Results:
Now we have offload for each test segment and 8 columns offloaded for each In-Line (IL) test, as expected. We can even see a pattern in when columns are released (aged out) after repeated queries on the 8 columns.

The pattern I see is that after 10 queries, columns 1 and 7-21 seemed to keep their offload status without aging out. But what is going on with column 1 that keeps it from aging out for the entire test? And what about columns 2-6, which seem to age out fairly fast after their query run with a storage index offload?

I see some patterns developing, but I may need more eyes, information, or tests to figure this out. I will continue when I can.

    Let me hear your ideas.

    Here’s the test sequence again:

Test 2 – Test Plan

    Here’s the full output in excel format; this time run to completion:
    dcox_exadata_storage_index_aging_test2b

    Exadata Storage Index Aging – Part 2A

Continuing from Storage Index Aging – Part 1, the test was re-run. This time the In-Line (IL) test was run on the most recently tested column first, then proceeded backwards through the most recently run columns, and finally ran the remaining columns in the table’s 21-column set.

    Here’s a quick graphic that may be easier to visualize.

Test 2 – Test Plan

    Results:
This set of tests is closer to what I expected, but those “pesky” columns 6, 7, 13, 14, 20 and 21 are still not offloading the way I’d like, so I’ll fix that in Test 2B (next blog).

On the bright side, we see more of the columns persisting in the IL tests. It looks like 6 of our expected 8 columns are persisting without aging out. Also, although the columns above are “pesky”, they may be telling us something about how the aging process works. But for now it’s still just an observation.

    Also note the persistence of columns 18 and 19, still the same as in Test 1.

    Output Data (Abbreviated):

    Image with partial output data:

Partial Image of Data Results – Test 2a

    Excel file with more output data: dcox_exadata_storage_index_aging_test2

    Exadata Storage Index Aging – Part 1

Just took some classes with some of the smart guys at the Accenture Enkitec Group (Enkitec). Disclaimer: I work there too.

In my last class I got inspired by Carlos Sierra’s Exadata Tuning class. Carlos was talking about how little we know about the actual aging process in the Exadata Storage Index arena. He mentioned it might be easy to write a little code to check out this aging process. So I stole, er, was inspired by his idea.

    If you are unfamiliar with Exadata Storage Indexes read this Oracle Magazine article from Arup Nanda. Or better yet, get more details in the Expert Oracle Exadata book.

    Here’s what I think I know about this process: Up to 8 columns per table can be held in the storage index array.

Target Problem: How do single-table, single-predicate storage index offloads age out?

    Environment Specifications:
    Target Machine: X3 quarter rack
    GI Version: 11.2.0.4.0
    DB Version: 11.2.0.3.0
    Parameters/Values:
    _cell_storidx_mode EVA
    _kcfis_storageidx_diag_mode 0
    _kcfis_storageidx_disabled FALSE

Cell Parameters (storage index parameters):
    _cell_enable_storage_index_for_loads = TRUE
    _cell_enable_storage_index_for_writes = TRUE
    _cell_si_max_num_diag_mode_dumps = 20
    _cell_storage_index_columns = 8 (default = 0)
    _cell_storage_index_partial_reads_threshold_percent = 85
    _cell_storage_index_partial_rd_sectors = 512
    _cell_storage_index_diag_mode = 0
    _cell_storage_index_sizing_factor = 2

    Test 1:
    I’ve identified a table where I can easily get a storage index to run for most columns. I verify this by gathering statistics before and after a query execution and computing the difference. The two statistics used for this test are:
    cell physical IO bytes eligible for predicate offload
    cell physical IO bytes saved by storage index

    From the two statistics above the percentage of storage index predicate offload can be determined (see partial output data below). This is similar to a script we used in class to capture the difference in stats on both sides of a query.
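
As a rough illustration (not the exact class script), the before/after capture looks something like this; run it, execute the test query, run it again, and compute the deltas:

sql> select sn.name, ms.value
       from v$mystat ms, v$statname sn
      where sn.statistic# = ms.statistic#
        and sn.name in ('cell physical IO bytes eligible for predicate offload',
                        'cell physical IO bytes saved by storage index');

The offload percentage is then (delta of bytes saved by storage index) / (delta of bytes eligible for predicate offload) * 100.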

    Data:
My test table has 2.5+ billion rows. I took a lab table from class with 7 columns and copied the same 7 columns over to make 21 columns total.  It takes about 1-4 seconds for an offloaded query and 27-30 seconds for a non-offloaded query.  Remember, lots of variables here could change things, but this specific table and data on this specific machine work for my test goals.  I can repeat queries fairly quickly and have enough difference in the numbers to see when I do and don’t get an offload from the storage index.

    Test Sequence (for Test 1):
I run a baseline test and query on a value from each column (data types: number, varchar2 and date). Then I run 10 queries on each column, move over to the next column, and do it again for a total of 10 columns. An “in-line” (IL) or intermediate test is performed on all 21 columns starting at the first column, and then the test is run again, but starting at column n+1 for each iteration.

Test 1 – Test Plan

    Expectation:
The initial guess is that 8 of the columns, possibly the last 8, would still perform well on the IL test, but the others would not use the storage index offload.

    The query each time would look something like this:
select count(column1) from table where column1 = 'x';
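
A minimal sketch of the repetition loop, with test_table and column1 as placeholder names (in the real runs the stat capture shown earlier wraps each execution):

sql> declare
       l_cnt number;
     begin
       for i in 1 .. 10 loop
         -- same single-predicate count query each time
         select count(column1) into l_cnt
           from test_table
          where column1 = 'x';
         dbms_output.put_line('run ' || i || ': count = ' || l_cnt);
       end loop;
     end;
     /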

    What I found:

OK, I didn’t get what I expected, but this is the first try. Only the first column of the IL test showed an offload, with a small bit on the second column, but then nothing from there on for the IL test. There was a strange blip of offloads on columns 18 and 19. So I need to modify my test to get more information. Still, it’s a start at least.

It will be easy to add to these test scripts. I’ve added some of the resulting output in an image below and a little more in an attached Excel spreadsheet. Here are some basic notes:

• After running the first 10 columns through the 10 repetitions, there is still an 83.2 percent offload on column 1 when it is run in the IL test.
• Curiously, columns 18 and 19 get offloads even though they haven’t been touched since the baseline run at the start.
• Columns 6, 7, 13, and 14 (and probably 20 and 21) don’t get much offload at all.

This begs for more testing. The primary interest here is still to figure out more about aging. Perhaps these columns age out quicker than originally thought.

    Modifying Test (next blog):
    So I’ll test this next (see next blog – TEST 2). After the main set of repetitions, on the IL test I’ll test the columns “just run” first, starting with the most recent, then test the remaining columns.

    Limited/No offload on some columns:
A quick glance at columns 6, 7, 13, 14, 20, and 21 shows the types are all NUMBER(10,2). But there must be some difference in the way the data turns out, since the offload on column 6 is 8.1% and 0 on column 7. And remember that if the value exists in all of the storage regions on the cells there would be no storage index offload, though there may be other types of cell offload. (Oversimplifying here; see the docs in the intro for more information.)

After a quick look at the counts for each value, one value has 1 billion of the 2.5 billion rows. Here are the top 5 counts:

VALUE      COUNT
    1      109500000
   81       85190307
   21       85190243
   61       85190243
   41       84500018
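
Those counts came from a simple aggregate along these lines (column7 is a placeholder for whichever NUMBER(10,2) column is being checked; the ROWNUM wrapper is used because FETCH FIRST isn’t available on 11.2):

sql> select *
       from ( select column7 as value, count(*) as cnt
                from test_table
               group by column7
               order by cnt desc )
      where rownum <= 5;
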
Concentrating on the aging for now; I’ll leave the “why” of the missing offload for another blog.

    Note the colors closer to white show more offload. Darker colors show less offload (brown and blue).

Test Case 1 Exadata Storage Aging

    Test Case 1 Exadata Storage Aging – Excel Version

    Check out this new tool!

    Download this new tool from Carlos Sierra at Carlos Sierra’s Blog via DropBox or EDB360 on Enkitec’s Web Page.

    edb360_teaser

EDB360 – too much output to put on one page, so here’s a teaser of some of it.

It does a great job summarizing the state of a database. From his blog: “The edb360 tool started as a quick and dirty “script” to gather basic information about a database without knowing anything about it before hand. The first rule for edb360 was: it has to install nothing in the database. The second rule became: it has to provide some insight about a database.”

    Download it now and try it out for yourself.

    Change from global names to non-global names

Had some requests from friends on how to change the global names setting. There are lots of other posts on this same topic; here’s another.

    If you started up your database with global_names=true and want to change, here is how to do this.

    Perform the following tasks.

Example – with global names, the database uses “dbname.dbdomainname.com” as the DB_NAME and DB_DOMAIN combination; with non-global names it would use just “dbname”, with no DB_DOMAIN.
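
Before making the change, it’s worth confirming what the instance currently has (output will vary per system):

sql> show parameter db_domain
sql> show parameter global_names
sql> select * from global_name;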

    Open up a command tool and Log in as sysdba or SYS.
    cmd> sqlplus / as sysdba

Create the file initDB_NAMETEMP.ora (where DB_NAME is your database name; in this example the database is named THESHOW).
sql> create pfile='initTHESHOWTEMP.ora' from spfile;

    Shutdown the database.
    sql> shutdown immediate

    Keep the command tool open, you’ll jump back to it later.

Now edit the initDB_NAMETEMP.ora file, e.g. C:\app\oracle\product\11.2.0\server\database\initDB_NAMETEMP.ora on Windows (or $ORACLE_HOME/dbs on Linux/UNIX).

    Remove the line:
    db_domain=’dbdomainname.com’

Change the line with *.global_names to false:
*.global_names=FALSE

    Save the file initDB_NAMETEMP.ora

    Back to the command tool from earlier:

sql> create spfile from pfile='initTHESHOWTEMP.ora';

    Startup the database.
    sql> startup;

    sql> show parameter db_domain
    Check db_domain – should be null.

    sql> show parameter global_name
    Check global_names – should be false.

    sql> show parameter service_name
    Check the service name to be sure it matches the db_name. This is just an extra check.

    You can run the listener status from the command line to verify the database is correctly registered:
cmd> lsnrctl status

    Look for the db_name without the domain name

    Change any tnsnames.ora files to match accordingly.
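
If an alias previously included the domain, a non-global entry might look something like this (host, port, and names are placeholders for your environment):

DBNAME =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dbname)
    )
  )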

    Test with tnsping
    tnsping db_name

    Try to connect, and you’re done.