In this penultimate article of the series on MAA with Oracle RAC 11gR2 I will talk a bit about how automatic failover happens when the primary fails. I will show that you basically do nothing: all of the "dirty" work is done by Oracle itself.
NINTH ARTICLE
This article will show how MAA automatically handles every step of a failover. Of course, for that to work you need everything properly configured and operational. You will need a configured Observer with Fast-Start Failover enabled (as shown here) as well as an operational Broker (see here). If you have read the previous articles you already have all of this in place and do not need to worry about anything else.
ENVIRONMENT
Up to this point you have a primary database (maa) and a standby database (maastb) synchronized in Maximum Availability mode (with real-time redo transport). In addition, Fast-Start Failover is enabled.
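Before breaking anything, it is worth confirming that the Broker and the Observer really see this starting state. A minimal check from DGMGRL, connecting to either database, would look like the sketch below:

DGMGRL> connect sys@maa
Password:
Connected.
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW FAST_START FAILOVER;

The configuration should report SUCCESS and Fast-Start Failover should appear as ENABLED with the Observer started; if not, revisit the previous articles before simulating the failure.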
Fair warning: this article may get long because of the logs; I will try to trim the information that is not needed. Even so, I recommend reading them to understand everything that happens.
AUTOMATIC FAILOVER
In this article you will see that the commands executed are exclusively for monitoring the activity; everything you needed to do has already been done. What happens below is a simulation of what you would see in a real environment, and you do not want this happening in your production environment, right? But if you got this far you already have an MAA environment with Oracle RAC configured, which in itself shows how important the environment is and that you want to be protected against adversity.
Since an automatic failover does not happen "naturally" on demand in real life, let's simulate one with an abort of all the primary instances:
[oracle@rac11pri01 ~]$ srvctl stop database -d maa -o abort
Note that the Observer log has already detected it:
[W000 07/28 21:57:00.35] Observer started.

23:43:15.41 Monday, July 28, 2014
Initiating Fast-Start Failover to database "maastb"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "maastb"
23:45:11.19 Monday, July 28, 2014
As seen above, in roughly three minutes the complete failover of the environment took place and the maastb database started answering as the new primary. Of course this time will depend on the size of your database, but let's see how all of this happened "for real".
In the Broker log (of instance maastb1) we have more interesting information, see below:
07/29/2014 02:23:36
Fast-Start Failover cannot proceed because: "standby has connectivity to the primary"
07/29/2014 02:23:38
FAILOVER TO maastb
07/29/2014 02:23:39
Beginning failover to database maastb
Notifying Oracle Clusterware to teardown database for FAILOVER
07/29/2014 02:24:20
DMON: Old primary "maa" needs reinstatement
07/29/2014 02:25:07
Protection mode set to MAXIMUM AVAILABILITY
07/29/2014 02:25:27
Deferring associated archivelog destinations of sites permanently disabled due to Failover
Notifying Oracle Clusterware to buildup primary database after FAILOVER
Data Guard notifying Oracle Clusterware to start services and other instances change
07/29/2014 02:25:30
Posting DB_DOWN alert ...
... with reason Data Guard Fast-Start Failover - Primary Disconnected
07/29/2014 02:25:32
Command FAILOVER TO maastb completed
07/29/2014 02:25:35
Data Guard Broker Status Summary:
  Type                        Name       Severity  Status
  Configuration               dgrac      Warning   ORA-16608
  Primary Database            maastb     Warning   ORA-16817
  Physical Standby Database   maa        Error     ORA-16661
07/29/2014 02:26:34
Data Guard Broker Status Summary:
  Type                        Name       Severity  Status
  Configuration               dgrac      Warning   ORA-16608
  Primary Database            maastb     Warning   ORA-16817
  Physical Standby Database   maa        Error     ORA-16661
Note in the log that the old primary will need to be reinstated as soon as it is started (since a failover took place), and that there are error messages reporting that no destination is left to maintain the desired availability level.
There are, however, places that give us more detail about what happened (the individual steps), and that is the alert log of the former standby. In the log below I suppressed the TNS errors between the databases; pay attention to the key points.
[root@rac11stb01 ~]# tail -f /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/alert_maastb1.log Tue Jul 29 02:22:36 2014 RFS[15]: Possible network disconnect with primary database Tue Jul 29 02:22:36 2014 RFS[13]: Possible network disconnect with primary database Tue Jul 29 02:22:36 2014 RFS[18]: Possible network disconnect with primary database Tue Jul 29 02:22:37 2014 RFS[19]: Possible network disconnect with primary database Tue Jul 29 02:22:37 2014 RFS[17]: Possible network disconnect with primary database Tue Jul 29 02:23:37 2014 Attempting Fast-Start Failover because the threshold of 60 seconds has elapsed. Tue Jul 29 02:23:39 2014 Data Guard Broker: Beginning failover Tue Jul 29 02:23:44 2014 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Tue Jul 29 02:23:45 2014 MRP0: Background Media Recovery cancelled with status 16037 Errors in file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_pr00_3786.trc: ORA-16037: user requested cancel of managed recovery operation Tue Jul 29 02:23:45 2014 Managed Standby Recovery not using Real Time Apply Recovery interrupted! Recovered data files to a consistent state at change 19932270 Tue Jul 29 02:23:49 2014 MRP0: Background Media Recovery process shutdown (maastb1) Managed Standby Recovery Canceled (maastb1) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE Attempt to do a Terminal Recovery (maastb1) Media Recovery Start: Managed Standby Recovery (maastb1) started logmerger process Tue Jul 29 02:23:55 2014 Managed Standby Recovery not using Real Time Apply Tue Jul 29 02:23:57 2014 Parallel Media Recovery started with 2 slaves Tue Jul 29 02:24:00 2014 Begin: Standby Redo Logfile archival End: Standby Redo Logfile archival Terminal Recovery timestamp is '07/29/2014 02:24:00' Terminal Recovery: applying standby redo logs. Terminal Recovery: thread 1 seq# 1181 redo required Terminal Recovery: Recovery of Online Redo Log: Thread 1 Group 6 Seq 1181 Reading mem 0 Mem# 0: +DG01/maastb/onlinelog/group_6.267.844716079 Mem# 1: +FRA/maastb/onlinelog/group_6.604.844716081 Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0xffff.ffffffff Terminal Recovery: thread 2 seq# 1395 redo required Terminal Recovery: Recovery of Online Redo Log: Thread 2 Group 9 Seq 1395 Reading mem 0 Mem# 0: +DG01/maastb/onlinelog/group_9.260.844716095 Mem# 1: +FRA/maastb/onlinelog/group_9.591.844716097 Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0xffff.ffffffff Incomplete Recovery applied until change 19932272 time 07/29/2014 05:42:13 Media Recovery Complete (maastb1) Terminal Recovery: successful completion Tue Jul 29 02:24:02 2014 ARC0: Archiving not possible: error count exceeded Tue Jul 29 02:24:03 2014 ARC3: Archiving not possible: error count exceeded Forcing ARSCN to IRSCN for TR 0:19932272 Attempt to set limbo arscn 0:19932272 irscn 0:19932272 Resetting standby activation ID 728027883 (0x2b64d2eb) ARCH: Archival stopped, error occurred. 
Will continue retrying ORACLE Instance maastb1 - Archival Error ORA-16014: log 6 sequence# 1181 not archived, no available destinations ORA-00312: online log 6 thread 1: '+DG01/maastb/onlinelog/group_6.267.844716079' ORA-00312: online log 6 thread 1: '+FRA/maastb/onlinelog/group_6.604.844716081' Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN ALTER DATABASE SWITCHOVER TO PRIMARY (maastb1) Maximum wait for role transition is 15 minutes. Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_rsm0_3741.trc Tue Jul 29 02:24:08 2014 Standby terminal recovery start SCN: 19932270 RESETLOGS after incomplete recovery UNTIL CHANGE 19932272 Online log +DG01/maastb/onlinelog/group_1.257.844716051: Thread 1 Group 1 was previously cleared Online log +FRA/maastb/onlinelog/group_1.568.844716053: Thread 1 Group 1 was previously cleared Online log +DG01/maastb/onlinelog/group_2.268.844716057: Thread 1 Group 2 was previously cleared Online log +FRA/maastb/onlinelog/group_2.564.844716059: Thread 1 Group 2 was previously cleared Online log +DG01/maastb/onlinelog/group_3.269.844716063: Thread 2 Group 3 was previously cleared Online log +FRA/maastb/onlinelog/group_3.562.844716065: Thread 2 Group 3 was previously cleared Online log +DG01/maastb/onlinelog/group_4.270.844716067: Thread 2 Group 4 was previously cleared Online log +FRA/maastb/onlinelog/group_4.559.844716071: Thread 2 Group 4 was previously cleared Standby became primary SCN: 19932269 Tue Jul 29 02:24:12 2014 Setting recovery target incarnation to 4 Switchover: Complete - Database mounted as primary Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN Tue Jul 29 02:24:20 2014 ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY Completed: ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY ALTER DATABASE OPEN Data Guard Broker initializing... This instance was first to open Picked broadcast on commit scheme to generate SCNs Tue Jul 29 02:24:21 2014 Assigning activation ID 732752315 (0x2bace9bb) LGWR: Primary database is in MAXIMUM AVAILABILITY mode Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR Thread 1 advanced to log sequence 2 (thread open) Tue Jul 29 02:24:22 2014 ARC3: Becoming the 'no SRL' ARCH Tue Jul 29 02:24:22 2014 ARC0: Becoming the 'no SRL' ARCH Thread 1 opened at log sequence 2 Current log# 2 seq# 2 mem# 0: +DG01/maastb/onlinelog/group_2.268.844716057 Current log# 2 seq# 2 mem# 1: +FRA/maastb/onlinelog/group_2.564.844716059 Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Tue Jul 29 02:24:23 2014 SMON: enabling cache recovery Tue Jul 29 02:24:23 2014 ... ... ... TNS-12564: TNS:connection refused ns secondary err code: 0 nt main err code: 0 nt secondary err code: 0 nt OS err code: 0 Error 12514 received logging on to the standby PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514. Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads Tue Jul 29 02:24:30 2014 FSFP started with pid=40, OS id=9787 Tue Jul 29 02:24:34 2014 minact-scn: Inst 1 is a slave inc#:4 mmon proc-id:3596 status:0x2 minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000 Tue Jul 29 02:24:37 2014 [3741] Successfully onlined Undo Tablespace 2. 
Undo initialization finished serial:0 start:8127994 end:8132374 diff:4380 (43 seconds) Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed Tue Jul 29 02:24:37 2014 SMON: enabling tx recovery Database Characterset is WE8MSWIN1252 Tue Jul 29 02:24:48 2014 No Resource Manager plan active Starting background process GTX0 Tue Jul 29 02:24:57 2014 GTX0 started with pid=41, OS id=9799 Starting background process RCBG Tue Jul 29 02:24:57 2014 RCBG started with pid=45, OS id=9802 replication_dependency_tracking turned off (no async multimaster replication found) Tue Jul 29 02:24:59 2014 Starting background process QMNC Tue Jul 29 02:24:59 2014 QMNC started with pid=46, OS id=9804 Completed: ALTER DATABASE OPEN ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maastb1'; ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maastb1'; Tue Jul 29 02:25:18 2014 ... ... ... TNS-12564: TNS:connection refused ns secondary err code: 0 nt main err code: 0 nt secondary err code: 0 nt OS err code: 0 Error 12514 received logging on to the standby PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514. Tue Jul 29 02:25:20 2014 ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; Tue Jul 29 02:25:20 2014 Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter Tue Jul 29 02:25:20 2014 Starting background process VKRM Tue Jul 29 02:25:20 2014 VKRM started with pid=49, OS id=9823 Tue Jul 29 02:25:22 2014 Starting background process CJQ0 Tue Jul 29 02:25:22 2014 CJQ0 started with pid=58, OS id=9838 ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE; Setting Resource Manager plan SCHEDULER[0x3189]:DEFAULT_MAINTENANCE_PLAN via scheduler window Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter Tue Jul 29 02:25:27 2014 Starting background process SMCO Tue Jul 29 02:25:27 2014 SMCO started with pid=56, OS id=9845 Tue Jul 29 02:25:28 2014 ALTER SYSTEM SET log_archive_dest_state_2='RESET' SCOPE=BOTH; Tue Jul 29 02:25:28 2014 Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 4194304 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query: select total_size,awr_flush_emergency_count from v$ash_info; Failover succeeded. Primary database is now maastb.
Notice that at the beginning of the alert log there are messages indicating that the primary may be unavailable, and after 60 seconds (the threshold I defined when configuring Fast-Start Failover) the automatic failover started. After that, redo apply on the standby was cancelled. A bit further down the database switched roles to primary, and the log shows which SCN was used as the turning point (RESETLOGS after incomplete recovery UNTIL CHANGE 19932272). Finally the database was opened in OPEN mode, and archive destination 2 is reported as unsynchronized (which is normal, since the old primary is down).
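If you also want to confirm the new role from SQL, a couple of simple queries on maastb are enough. A sketch (the UNSYNCHRONIZED state of destination 2 is expected while the old primary is down):

SQL> SELECT database_role, open_mode, protection_mode, protection_level FROM v$database;
SQL> SELECT dest_id, status, synchronization_status FROM v$archive_dest_status WHERE dest_id = 2;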
In the Broker we have the configuration shown below. Notice that the maastb database is already listed as primary and the maa database shows an error. Fast-Start Failover is still enabled, and the old primary must be reinstated when it is started.
[oracle@rac11stb01 ~]$ dgmgrl
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@maastb
Password:
Connected.
DGMGRL> SHOW CONFIGURATION VERBOSE;

Configuration - dgrac

  Protection Mode: MaxAvailability
  Databases:
    maastb - Primary database
      Warning: ORA-16817: unsynchronized fast-start failover configuration
    maa    - (*) Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

  (*) Fast-Start Failover target

  Properties:
    FastStartFailoverThreshold     = '60'
    OperationTimeout               = '30'
    FastStartFailoverLagLimit      = '30'
    CommunicationTimeout           = '180'
    FastStartFailoverAutoReinstate = 'TRUE'
    FastStartFailoverPmyShutdown   = 'TRUE'
    BystandersFollowRoleChange     = 'ALL'

  Fast-Start Failover: ENABLED
    Threshold:        60 seconds
    Target:           maa
    Observer:         sbdobs
    Lag Limit:        30 seconds (not in use)
    Shutdown Primary: TRUE
    Auto-reinstate:   TRUE

Configuration Status:
WARNING

DGMGRL>
DGMGRL> SHOW DATABASE maa;

Database - maa

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Real Time Query: OFF
  Instance(s):
    maa1
    maa2

Database Status:
ORA-16661: the standby database needs to be reinstated

DGMGRL> SHOW DATABASE maastb;

Database - maastb

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    maastb1
    maastb2

  Database Warning(s):
    ORA-16817: unsynchronized fast-start failover configuration

Database Status:
WARNING

DGMGRL>
As in previous articles, CRS needs a manual adjustment so that the new primary always starts in OPEN mode:
[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v
Database unique name: maastb
Database name:
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile:
Domain:
Start options: mount
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: maastb
Database instances: maastb1,maastb2
Disk Groups: DG01,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11stb01 ~]$ srvctl modify database -d maastb -s OPEN
[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v
Database unique name: maastb
Database name:
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: maastb
Database instances: maastb1,maastb2
Disk Groups: DG01,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11stb01 ~]$
RESTARTING THE DATABASE
You may have noticed that the old primary has not been restarted yet. If you start it now, it will be reinstated and will come back as the new standby. In any case, before starting the old primary I recommend investigating the cause of the outage; there is no point in having a standby environment that is not trustworthy. Often it is impossible to bring the old primary back, and in that case you need to remove it from the Broker and add a new standby.
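If the old primary really cannot be recovered, the cleanup in the Broker would look roughly like the sketch below. Fast-Start Failover has to be disabled before its target can be removed, and newstb is a hypothetical replacement standby that you would first build with an RMAN duplicate and register in the TNS of both sites:

DGMGRL> connect sys@maastb
DGMGRL> DISABLE FAST_START FAILOVER;
DGMGRL> REMOVE DATABASE maa;
DGMGRL> ADD DATABASE newstb AS CONNECT IDENTIFIER IS newstb MAINTAINED AS PHYSICAL;
DGMGRL> ENABLE DATABASE newstb;

Before re-enabling Fast-Start Failover you would also point the FastStartFailoverTarget property of the two databases at each other again.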
Assuming you manage to bring the old primary back and keep it stable, you will see something like the alert log below. It is a log with a lot of information: the database will try to open in OPEN mode and Data Guard will report that it is no longer the primary. The database will be flashed back to the SCN detected as the recovery point and converted into a physical standby. Finally the database is restarted, defined as the standby, and the archived logs are synchronized with the primary. All of this can be followed in the log below.
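One detail worth remembering: the automatic reinstate only works because Flashback Database was enabled on the old primary beforehand, which is a Fast-Start Failover requirement. Once the old primary instance is at least mounted, a quick sanity check would be a sketch like this:

SQL> SELECT flashback_on FROM v$database;
SQL> SELECT oldest_flashback_scn, oldest_flashback_time FROM v$flashback_database_log;

If the flashback window no longer covers the SCN at which the standby became primary (19932269 here), the reinstate will fail and the old primary has to be recreated from the new primary.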
[root@rac11pri01 ~]# tail -f /u01/app/oracle/diag/rdbms/maa/maa1/trace/alert_maa Current log# 1 seq# 1181 mem# 1: +FRA/maa/onlinelog/group_1.286.843488555 Tue Jul 29 04:30:00 2014 Archived Log entry 5432 added for thread 1 sequence 1180 ID 0x2b64d2eb dest 1: Tue Jul 29 05:42:12 2014 Shutting down instance (abort) License high water mark = 7 USER (ospid: 3606): terminating the instance Instance terminated by USER, pid = 3606 Tue Jul 29 05:42:17 2014 Instance shutdown complete Tue Jul 29 06:16:58 2014 Starting ORACLE instance (normal) ****************** Large Pages Information ***************** Total Shared Global Region in Large Pages = 0 KB (0%) Large Pages used by this instance: 0 (0 KB) Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB) Large Pages configured system wide = 0 (0 KB) Large Page size = 2048 KB RECOMMENDATION: Total Shared Global Region size is 1026 MB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB) system wide to get 100% of the Shared Global Region allocated with Large pages *********************************************************** LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Private Interface 'eth2:1' configured from GPnP for use as a private interconnect. [name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62] Public Interface 'eth0' configured from GPnP for use as a public interface. [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:2' configured from GPnP for use as a public interface. [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:4' configured from GPnP for use as a public interface. [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:5' configured from GPnP for use as a public interface. [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Picked latch-free SCN scheme 3 Autotune of undo retention is turned on. LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning and Real Application Clusters options. 
ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1 System name: Linux Node name: rac11pri01.tjsc.jus.br Release: 2.6.39-400.17.1.el6uek.x86_64 Version: #1 SMP Fri Feb 22 18:16:18 PST 2013 Machine: x86_64 VM name: VMWare Version: 6 Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora System parameters with non-default values: processes = 150 spfile = "+DATA/maa/spfilemaa.ora" sga_target = 1G control_files = "+DATA/maa/controlfile/current.273.843488553" control_files = "+FRA/maa/controlfile/current.256.843488553" db_file_name_convert = "+DG01/maastb" db_file_name_convert = "+DATA/maa" db_file_name_convert = "+FRA/maastb" db_file_name_convert = "+FRA/maa" log_file_name_convert = "+DG01/maastb" log_file_name_convert = "+DATA/maa" log_file_name_convert = "+FRA/maastb" log_file_name_convert = "+FRA/maa" db_block_size = 8192 compatible = "11.2.0.0.0" log_archive_dest_1 = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa" log_archive_dest_2 = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb" log_archive_dest_state_2 = "ENABLE" log_archive_min_succeed_dest= 1 fal_server = "maastb" log_archive_trace = 0 log_archive_config = "DG_CONFIG=(maa,maastb)" log_archive_format = "arch_%t_%s_%r.arc" log_archive_max_processes= 4 archive_lag_target = 0 cluster_database = TRUE db_create_file_dest = "+DATA" db_recovery_file_dest = "+FRA" db_recovery_file_dest_size= 10G standby_file_management = "AUTO" thread = 1 undo_tablespace = "UNDOTBS1" instance_number = 1 remote_login_passwordfile= "EXCLUSIVE" db_domain = "" remote_listener = "rac11pri-scan.tjsc.jus.br:1521" audit_file_dest = "/u01/app/oracle/admin/maa/adump" audit_trail = "DB" db_name = "maa" db_unique_name = "maa" open_cursors = 300 pga_aggregate_target = 256M dg_broker_start = TRUE dg_broker_config_file1 = "+DATA/maa/dr1maa.dat" dg_broker_config_file2 = "+FRA/maa/dr2maa.dat" diagnostic_dest = "/u01/app/oracle" Cluster communication is configured to use the following interface(s) for this instance 169.254.187.5 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Tue Jul 29 06:17:06 2014 PMON started with pid=2, OS id=7421 Tue Jul 29 06:17:06 2014 PSP0 started with pid=3, OS id=7423 Tue Jul 29 06:17:08 2014 VKTM started with pid=4, OS id=7425 at elevated priority VKTM running at (1)millisec precision with DBRM quantum (100)ms Tue Jul 29 06:17:08 2014 GEN0 started with pid=5, OS id=7430 Tue Jul 29 06:17:08 2014 DIAG started with pid=6, OS id=7432 Tue Jul 29 06:17:08 2014 DBRM started with pid=7, OS id=7434 Tue Jul 29 06:17:08 2014 PING started with pid=8, OS id=7436 Tue Jul 29 06:17:08 2014 ACMS started with pid=9, OS id=7438 Tue Jul 29 06:17:08 2014 DIA0 started with pid=10, OS id=7440 Tue Jul 29 06:17:08 2014 LMON started with pid=11, OS id=7442 Tue Jul 29 06:17:08 2014 LMD0 started with pid=12, OS id=7444 * Load Monitor used for high load check * New Low - High Load Threshold Range = [1920 - 2560] Tue Jul 29 06:17:08 2014 LMS0 started with pid=13, OS id=7446 at elevated priority Tue Jul 29 06:17:08 2014 RMS0 started with pid=14, OS id=7450 Tue Jul 29 06:17:08 2014 LMHB started with pid=15, OS id=7452 Tue Jul 29 06:17:08 2014 MMAN started with pid=16, OS id=7454 Tue Jul 29 06:17:08 2014 DBW0 started with pid=17, OS id=7456 Tue Jul 29 06:17:09 2014 LGWR started with pid=18, OS id=7458 Tue Jul 29 06:17:09 2014 CKPT started with pid=19, OS id=7460 Tue Jul 29 06:17:09 2014 SMON started with pid=20, OS 
id=7462 Tue Jul 29 06:17:09 2014 RECO started with pid=21, OS id=7464 Tue Jul 29 06:17:09 2014 RBAL started with pid=22, OS id=7466 Tue Jul 29 06:17:09 2014 ASMB started with pid=23, OS id=7468 Tue Jul 29 06:17:09 2014 MMON started with pid=24, OS id=7470 Tue Jul 29 06:17:09 2014 MMNL started with pid=25, OS id=7474 NOTE: initiating MARK startup Starting background process MARK Tue Jul 29 06:17:09 2014 MARK started with pid=26, OS id=7476 lmon registered with NM - instance number 1 (internal mem no 0) NOTE: MARK has subscribed Reconfiguration started (old inc 0, new inc 2) List of instances: 1 (myinst: 1) Global Resource Directory frozen * allocate domain 0, invalid = TRUE Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jul 29 06:17:11 2014 LCK0 started with pid=28, OS id=7482 Tue Jul 29 06:17:11 2014 Starting background process RSMN Tue Jul 29 06:17:11 2014 RSMN started with pid=29, OS id=7484 Error: can not register my instance state - 4 ORACLE_BASE not set in environment. It is recommended that ORACLE_BASE be set in the environment Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle Tue Jul 29 06:17:12 2014 DMON started with pid=30, OS id=7486 Reconfiguration started (old inc 2, new inc 4) List of instances: 1 2 (myinst: 1) Global Resource Directory frozen Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jul 29 06:17:13 2014 ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1'; ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */ This instance was first to mount NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so NOTE: Loaded library: System SUCCESS: diskgroup DATA was mounted NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established SUCCESS: diskgroup FRA was mounted NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established Tue Jul 29 06:17:18 2014 NSS2 started with pid=33, OS id=7506 Successful mount of redo thread 1, with mount id 732690857 Allocated 3981120 bytes in shared pool for flashback generation buffer Starting background process RVWR Tue Jul 29 06:17:18 2014 RVWR started with pid=34, OS id=7508 Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE) Lost write protection disabled Completed: ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */ ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */ Data Guard Broker initializing... Data Guard Broker initialization complete Data Guard: verifying database primary role... 
Tue Jul 29 06:17:25 2014 Starting Data Guard Broker (DMON) Tue Jul 29 06:17:25 2014 INSV started with pid=36, OS id=7514 Tue Jul 29 06:17:29 2014 NSV1 started with pid=37, OS id=7516 Tue Jul 29 06:17:37 2014 Data Guard: version check completed Data Guard determines a failover has occurred - this is no longer a primary database ORA-16649 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */... Tue Jul 29 06:17:52 2014 Reconfiguration started (old inc 4, new inc 6) List of instances: 1 (myinst: 1) Global Resource Directory frozen * dead instance detected - domain 0 invalid = TRUE Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out Tue Jul 29 06:17:52 2014 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jul 29 06:17:59 2014 NSV1 started with pid=39, OS id=7566 Tue Jul 29 06:18:07 2014 RSM0 started with pid=40, OS id=7570 Tue Jul 29 06:18:11 2014 Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST RFS[1]: Assigned to RFS process 7573 RFS[1]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857) FLASHBACK DATABASE TO SCN 19932269 Tue Jul 29 06:18:13 2014 RFS[2]: Assigned to RFS process 7577 RFS[2]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857) Flashback Restore Start Flashback Restore Complete Flashback Media Recovery Start started logmerger process Tue Jul 29 06:18:18 2014 Parallel Media Recovery started with 2 slaves Tue Jul 29 06:18:18 2014 Recovery of Online Redo Log: Thread 1 Group 1 Seq 1181 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_1.272.843488553 Mem# 1: +FRA/maa/onlinelog/group_1.286.843488555 Recovery of Online Redo Log: Thread 2 Group 3 Seq 1394 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_3.257.843489101 Mem# 1: +FRA/maa/onlinelog/group_3.284.843489101 Recovery of Online Redo Log: Thread 2 Group 4 Seq 1395 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_4.262.843489103 Mem# 1: +FRA/maa/onlinelog/group_4.283.843489103 Incomplete Recovery applied until change 19932270 time 07/29/2014 05:42:12 Flashback Media Recovery Complete Completed: FLASHBACK DATABASE TO SCN 19932269 alter database convert to physical standby ALTER DATABASE CONVERT TO PHYSICAL STANDBY (maa1) Flush standby redo logfile failed:1649 Clearing standby activation ID 728027883 (0x2b64d2eb) The primary database controlfile was created using the 'MAXLOGFILES 192' clause. There is space for up to 188 standby redo logfiles Use the following SQL commands on the standby database to create standby redo logfiles that match the primary database: ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800; Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. 
Clearing online redo logfile 1 +DATA/maa/onlinelog/group_1.272.843488553 Clearing online log 1 of thread 1 sequence number 1181 Clearing online redo logfile 1 complete Clearing online redo logfile 2 +DATA/maa/onlinelog/group_2.271.843488555 Clearing online log 2 of thread 1 sequence number 1180 Clearing online redo logfile 2 complete Clearing online redo logfile 3 +DATA/maa/onlinelog/group_3.257.843489101 Clearing online log 3 of thread 2 sequence number 1394 Clearing online redo logfile 3 complete Clearing online redo logfile 4 +DATA/maa/onlinelog/group_4.262.843489103 Clearing online log 4 of thread 2 sequence number 1395 Clearing online redo logfile 4 complete Completed: alter database convert to physical standby Tue Jul 29 06:18:29 2014 Shutting down instance (immediate) Shutting down instance: further logons disabled Stopping background process MMNL Stopping background process MMON License high water mark = 6 alter database CLOSE NORMAL ORA-1109 signalled during: alter database CLOSE NORMAL... alter database DISMOUNT Shutting down archive processes Archiving is disabled Tue Jul 29 06:18:33 2014 NOTE: Deferred communication with ASM instance NOTE: deferred map free for map id 5 Tue Jul 29 06:18:33 2014 NOTE: Deferred communication with ASM instance NOTE: deferred map free for map id 150 Tue Jul 29 06:18:34 2014 NOTE: Deferred communication with ASM instance NOTE: deferred map free for map id 2 Completed: alter database DISMOUNT ARCH: Archival disabled due to shutdown: 1089 Shutting down archive processes Archiving is disabled Shutting down Data Guard Broker processes Tue Jul 29 06:18:35 2014 NOTE: Deferred communication with ASM instance Tue Jul 29 06:18:36 2014 Completed: Data Guard Broker shutdown NOTE: force a map free for map id 2 Tue Jul 29 06:18:38 2014 Stopping background process VKTM ARCH: Archival disabled due to shutdown: 1089 Shutting down archive processes Archiving is disabled Tue Jul 29 06:18:38 2014 NOTE: Shutting down MARK background process NOTE: force a map free for map id 192 NOTE: force a map free for map id 190 NOTE: force a map free for map id 188 NOTE: force a map free for map id 186 NOTE: force a map free for map id 184 NOTE: force a map free for map id 182 NOTE: force a map free for map id 143 NOTE: force a map free for map id 191 NOTE: force a map free for map id 189 NOTE: force a map free for map id 187 NOTE: force a map free for map id 185 NOTE: force a map free for map id 183 NOTE: force a map free for map id 181 NOTE: force a map free for map id 142 Tue Jul 29 06:18:41 2014 freeing rdom 0 Tue Jul 29 06:18:43 2014 Instance shutdown complete Tue Jul 29 06:18:44 2014 Starting ORACLE instance (normal) ****************** Large Pages Information ***************** Total Shared Global Region in Large Pages = 0 KB (0%) Large Pages used by this instance: 0 (0 KB) Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB) Large Pages configured system wide = 0 (0 KB) Large Page size = 2048 KB RECOMMENDATION: Total Shared Global Region size is 1026 MB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB) system wide to get 100% of the Shared Global Region allocated with Large pages *********************************************************** LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Private Interface 'eth2:1' configured from GPnP for use as a private interconnect. 
[name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62] Public Interface 'eth0' configured from GPnP for use as a public interface. [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:2' configured from GPnP for use as a public interface. [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:4' configured from GPnP for use as a public interface. [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Public Interface 'eth0:5' configured from GPnP for use as a public interface. [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1] Picked latch-free SCN scheme 3 Autotune of undo retention is turned on. LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning and Real Application Clusters options. ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1 System name: Linux Node name: rac11pri01.tjsc.jus.br Release: 2.6.39-400.17.1.el6uek.x86_64 Version: #1 SMP Fri Feb 22 18:16:18 PST 2013 Machine: x86_64 VM name: VMWare Version: 6 Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora System parameters with non-default values: processes = 150 spfile = "+DATA/maa/spfilemaa.ora" sga_target = 1G control_files = "+DATA/maa/controlfile/current.273.843488553" control_files = "+FRA/maa/controlfile/current.256.843488553" db_file_name_convert = "+DG01/maastb" db_file_name_convert = "+DATA/maa" db_file_name_convert = "+FRA/maastb" db_file_name_convert = "+FRA/maa" log_file_name_convert = "+DG01/maastb" log_file_name_convert = "+DATA/maa" log_file_name_convert = "+FRA/maastb" log_file_name_convert = "+FRA/maa" db_block_size = 8192 compatible = "11.2.0.0.0" log_archive_dest_1 = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa" log_archive_dest_2 = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb" log_archive_dest_state_2 = "ENABLE" log_archive_min_succeed_dest= 1 fal_server = "maastb" log_archive_trace = 0 log_archive_config = "DG_CONFIG=(maa,maastb)" log_archive_format = "arch_%t_%s_%r.arc" log_archive_max_processes= 4 archive_lag_target = 0 cluster_database = TRUE db_create_file_dest = "+DATA" db_recovery_file_dest = "+FRA" db_recovery_file_dest_size= 10G standby_file_management = "AUTO" thread = 1 undo_tablespace = "UNDOTBS1" instance_number = 1 remote_login_passwordfile= "EXCLUSIVE" db_domain = "" remote_listener = "rac11pri-scan.tjsc.jus.br:1521" audit_file_dest = "/u01/app/oracle/admin/maa/adump" audit_trail = "DB" db_name = "maa" db_unique_name = "maa" open_cursors = 300 pga_aggregate_target = 256M dg_broker_start = TRUE dg_broker_config_file1 = "+DATA/maa/dr1maa.dat" dg_broker_config_file2 = "+FRA/maa/dr2maa.dat" diagnostic_dest = "/u01/app/oracle" Cluster communication is configured to use the following interface(s) for this instance 169.254.187.5 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Tue Jul 29 06:18:47 2014 PMON started with pid=2, OS id=7618 Tue Jul 29 06:18:47 2014 PSP0 started with pid=3, OS id=7620 Tue Jul 29 06:18:48 2014 VKTM started with pid=4, OS 
id=7622 at elevated priority VKTM running at (1)millisec precision with DBRM quantum (100)ms Tue Jul 29 06:18:48 2014 GEN0 started with pid=5, OS id=7626 Tue Jul 29 06:18:48 2014 DIAG started with pid=6, OS id=7628 Tue Jul 29 06:18:48 2014 DBRM started with pid=7, OS id=7630 Tue Jul 29 06:18:48 2014 PING started with pid=8, OS id=7632 Tue Jul 29 06:18:48 2014 ACMS started with pid=9, OS id=7634 Tue Jul 29 06:18:49 2014 DIA0 started with pid=10, OS id=7636 Tue Jul 29 06:18:49 2014 LMON started with pid=11, OS id=7638 Tue Jul 29 06:18:49 2014 LMD0 started with pid=12, OS id=7640 Tue Jul 29 06:18:49 2014 LMS0 started with pid=13, OS id=7642 at elevated priority Tue Jul 29 06:18:49 2014 RMS0 started with pid=14, OS id=7646 Tue Jul 29 06:18:49 2014 LMHB started with pid=15, OS id=7648 Tue Jul 29 06:18:49 2014 MMAN started with pid=16, OS id=7650 * Load Monitor used for high load check * New Low - High Load Threshold Range = [1920 - 2560] Tue Jul 29 06:18:49 2014 DBW0 started with pid=17, OS id=7652 Tue Jul 29 06:18:49 2014 LGWR started with pid=18, OS id=7654 Tue Jul 29 06:18:50 2014 CKPT started with pid=19, OS id=7656 Tue Jul 29 06:18:50 2014 SMON started with pid=20, OS id=7658 Tue Jul 29 06:18:50 2014 RECO started with pid=21, OS id=7660 Tue Jul 29 06:18:50 2014 RBAL started with pid=22, OS id=7662 Tue Jul 29 06:18:50 2014 ASMB started with pid=23, OS id=7664 Tue Jul 29 06:18:50 2014 MMON started with pid=24, OS id=7666 Tue Jul 29 06:18:50 2014 MMNL started with pid=25, OS id=7670 lmon registered with NM - instance number 1 (internal mem no 0) NOTE: initiating MARK startup Starting background process MARK Tue Jul 29 06:18:50 2014 MARK started with pid=26, OS id=7672 NOTE: MARK has subscribed Reconfiguration started (old inc 0, new inc 2) List of instances: 1 (myinst: 1) Global Resource Directory frozen * allocate domain 0, invalid = TRUE Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete Tue Jul 29 06:18:50 2014 LCK0 started with pid=28, OS id=7678 Starting background process RSMN Tue Jul 29 06:18:51 2014 RSMN started with pid=29, OS id=7680 ORACLE_BASE from environment = /u01/app/oracle Tue Jul 29 06:18:51 2014 DMON started with pid=30, OS id=7682 Tue Jul 29 06:18:51 2014 alter database mount This instance was first to mount Tue Jul 29 06:18:52 2014 ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1'; NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so NOTE: Loaded library: System SUCCESS: diskgroup DATA was mounted SUCCESS: diskgroup FRA was mounted NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established Tue Jul 29 06:18:58 2014 NSS2 started with pid=33, OS id=7720 ARCH: STARTING ARCH PROCESSES Tue Jul 29 06:18:58 2014 ARC0 started with pid=34, OS id=7722 ARC0: Archival started ARCH: STARTING ARCH PROCESSES COMPLETE ARC0: STARTING ARCH PROCESSES Tue Jul 29 06:18:59 2014 ARC1 started with pid=35, OS id=7724 Successful mount of redo thread 1, with mount id 732706059 Tue Jul 29 
06:18:59 2014 ARC2 started with pid=36, OS id=7726 Allocated 3981120 bytes in shared pool for flashback generation buffer Tue Jul 29 06:18:59 2014 ARC3 started with pid=37, OS id=7728 Starting background process RVWR ARC1: Archival started ARC2: Archival started ARC1: Becoming the 'no FAL' ARCH ARC1: Becoming the 'no SRL' ARCH ARC2: Becoming the heartbeat ARCH Tue Jul 29 06:18:59 2014 RVWR started with pid=38, OS id=7730 Physical Standby Database mounted. Lost write protection disabled ARC2: Becoming the active heartbeat ARCH ARC2: Becoming the active heartbeat ARCH ARC3: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Completed: alter database mount Tue Jul 29 06:19:04 2014 Starting Data Guard Broker (DMON) Tue Jul 29 06:19:05 2014 INSV started with pid=40, OS id=7747 Tue Jul 29 06:19:11 2014 NSV1 started with pid=41, OS id=7752 Tue Jul 29 06:19:16 2014 RSM0 started with pid=43, OS id=7758 Tue Jul 29 06:19:21 2014 Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST Primary database is in MAXIMUM AVAILABILITY mode Changing standby controlfile to RESYNCHRONIZATION level ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maa1'; Standby controlfile consistent with primary ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maa1'; ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; RFS[1]: Assigned to RFS process 7760 RFS[1]: Selected log 5 for thread 1 sequence 4 dbid 722024964 branch 854159044 ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE; ALTER SYSTEM SET fal_server='maastb' SCOPE=BOTH; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE Attempt to start background Managed Standby Recovery process (maa1) Tue Jul 29 06:19:23 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Tue Jul 29 06:19:23 2014 MRP0 started with pid=46, OS id=7764 MRP0: Background Managed Standby Recovery process started (maa1) Standby controlfile consistent with primary RFS[2]: Assigned to RFS process 7762 RFS[2]: Selected log 8 for thread 2 sequence 4 dbid 722024964 branch 854159044 Tue Jul 29 06:19:25 2014 RFS[3]: Assigned to RFS process 7766 RFS[3]: Selected log 6 for thread 1 sequence 3 dbid 722024964 branch 854159044 Tue Jul 29 06:19:25 2014 RFS[4]: Assigned to RFS process 7768 RFS[4]: Selected log 9 for thread 2 sequence 3 dbid 722024964 branch 854159044 RFS[3]: New Archival REDO Branch(resetlogs_id): 854159044 Prior: 847284763 RFS[3]: Archival Activation ID: 0x2bace9bb Current: 0x0 RFS[3]: Effect of primary database OPEN RESETLOGS RFS[3]: Managed Standby Recovery process is active RFS[3]: Incarnation entry added for Branch(resetlogs_id): 854159044 (maa1) Tue Jul 29 06:19:25 2014 Setting recovery target incarnation to 4 Tue Jul 29 06:19:26 2014 Archived Log entry 5433 added for thread 2 sequence 3 ID 0x2bace9bb dest 1: Tue Jul 29 06:19:26 2014 Archived Log entry 5434 added for thread 1 sequence 3 ID 0x2bace9bb dest 1: RFS[4]: Opened log for thread 2 sequence 1395 dbid 722024964 branch 847284763 RFS[3]: Opened log for thread 1 sequence 1181 dbid 722024964 branch 847284763 Archived Log entry 5435 
added for thread 2 sequence 1395 rlc 847284763 ID 0x2b64d2eb dest 2: started logmerger process Archived Log entry 5436 added for thread 1 sequence 1181 rlc 847284763 ID 0x2b64d2eb dest 2: Managed Standby Recovery starting Real Time Apply RFS[4]: Opened log for thread 2 sequence 2 dbid 722024964 branch 854159044 Tue Jul 29 06:19:29 2014 RFS[5]: Assigned to RFS process 7786 RFS[5]: Opened log for thread 2 sequence 1 dbid 722024964 branch 854159044 RFS[3]: Opened log for thread 1 sequence 1 dbid 722024964 branch 854159044 Archived Log entry 5437 added for thread 2 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2: Archived Log entry 5438 added for thread 2 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2: Parallel Media Recovery started with 2 slaves Tue Jul 29 06:19:30 2014 Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 19932272 Archived Log entry 5439 added for thread 1 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2: Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Tue Jul 29 06:19:30 2014 RFS[6]: Assigned to RFS process 7788 RFS[6]: Opened log for thread 1 sequence 2 dbid 722024964 branch 854159044 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1395.2326.854173167 Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0x0.1302470 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1181.2327.854173167 Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0x0.1302470 Archived Log entry 5440 added for thread 1 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2: Resetting standby activation ID 728027883 (0x2b64d2eb) Media Recovery End-Of-Redo indicator encountered Media Recovery Continuing Tue Jul 29 06:19:31 2014 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1.2329.854173169 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1.2330.854173169 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_2.2331.854173171 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_2.2328.854173169 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_3.2325.854173167 Tue Jul 29 06:19:34 2014 Archived Log entry 5441 added for thread 2 sequence 4 ID 0x2bace9bb dest 1: Tue Jul 29 06:19:34 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Standby controlfile consistent with primary RFS[7]: Assigned to RFS process 7806 RFS[7]: Selected log 8 for thread 2 sequence 5 dbid 722024964 branch 854159044 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_3.2324.854173167 Media Recovery Waiting for thread 1 sequence 4 (in transit) Recovery of Online Redo Log: Thread 1 Group 5 Seq 4 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365 Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367 Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_4.2332.854173175 Media Recovery Waiting for thread 2 sequence 5 (in transit) Recovery of Online Redo Log: Thread 2 Group 8 Seq 5 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_8.259.843615383 Mem# 1: +FRA/maa/onlinelog/group_8.703.843615385 Tue Jul 29 06:19:37 2014 Archived Log entry 5442 added for thread 1 sequence 4 ID 0x2bace9bb dest 1: Tue Jul 29 06:19:38 2014 Primary database is in MAXIMUM AVAILABILITY mode Standby controlfile consistent with primary Media Recovery Waiting for thread 1 sequence 5 (in transit) Standby controlfile 
consistent with primary RFS[8]: Assigned to RFS process 7809 RFS[8]: Selected log 5 for thread 1 sequence 5 dbid 722024964 branch 854159044 Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0 Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365 Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367 Tue Jul 29 06:19:46 2014 Reconfiguration started (old inc 2, new inc 4) List of instances: 1 2 (myinst: 1) Global Resource Directory frozen Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out Tue Jul 29 06:19:46 2014 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted Post SMON to start 1st pass IR Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete
In the Broker log you will see a summary of what is in the alert log: the detection that the old primary came back and needs to be reinstated, and after that the maa database becomes the preferred destination again.
07/29/2014 02:58:18
REINSTATE DATABASE maa
Database maa can be reinstated
07/29/2014 02:58:50
Command REINSTATE DATABASE maa completed with warning ORA-16570
Data Guard Broker Status Summary:
  Type                        Name       Severity  Status
  Configuration               dgrac      Warning   ORA-16608
  Primary Database            maastb     Warning   ORA-16817
  Physical Standby Database   maa        Error     ORA-16661
07/29/2014 02:59:31
REINSTATE DATABASE maa
Database maa can be reinstated
07/29/2014 02:59:53
Successfully completed reinstatement of database 0x01001000; removing ReinstateContextArray property
07/29/2014 03:00:03
Command REINSTATE DATABASE maa completed
Data Guard Broker Status Summary:
  Type                        Name       Severity  Status
  Configuration               dgrac      Warning   ORA-16608
  Primary Database            maastb     Warning   ORA-16817
  Physical Standby Database   maa        Warning   ORA-16817
EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1
07/29/2014 03:00:04
Apply Instance for database maa is maa1
Command EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1 completed
If you were following the Observer log you would see the same information (in an even more summarized form). As the log below shows, the maa database does not even get opened during the process because of the reinstate.
Failover succeeded, new primary is "maastb"
23:45:11.19 Monday, July 28, 2014

00:17:56.55 Tuesday, July 29, 2014
Initiating reinstatement for database "maa"...
Reinstating database "maa", please wait...
Operation requires shutdown of instance "maa1" on database "maa"
Shutting down instance "maa1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "maa1" on database "maa"
Starting instance "maa1"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "maa" ...
Reinstatement of database "maa" succeeded
00:19:41.47 Tuesday, July 29, 2014
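Here the Observer reinstated the old primary on its own because FastStartFailoverAutoReinstate is TRUE in this configuration. If that property were FALSE, or if the automatic attempt failed, the equivalent manual step from DGMGRL would be a simple sketch like this:

DGMGRL> connect sys@maastb
DGMGRL> REINSTATE DATABASE maa;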
As always, the necessary CRS adjustment so that the old primary (now a standby) does not open in OPEN mode if the server restarts:
[oracle@rac11pri01 ~]$ srvctl config database -d maa -v
Database unique name: maa
Database name: maa
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile: +DATA/maa/spfilemaa.ora
Domain:
Start options: open
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: maa
Database instances: maa1,maa2
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11pri01 ~]$ srvctl modify database -d maa -s MOUNT
[oracle@rac11pri01 ~]$ srvctl config database -d maa -v
Database unique name: maa
Database name: maa
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile: +DATA/maa/spfilemaa.ora
Domain:
Start options: mount
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: maa
Database instances: maa1,maa2
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11pri01 ~]$
FINAL ENVIRONMENT
We arrive here with an environment in which the two databases operate in Maximum Availability. The primary database is maastb and the standby is maa. The role change happened automatically because Fast-Start Failover is enabled, with the Observer monitoring everything.
Of course your environment is not just this: you will need your applications and TNS configuration prepared to switch connections over to the new primary. That is not the scope here, but it is an important point to verify; see the sketch below.
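As a reference only, one common approach is a client alias that lists both SCANs and connects to a role-based service. The standby-side SCAN (rac11stb-scan.tjsc.jus.br) and the maa_app service used below are hypothetical names, so adjust them to your environment:

# rac11stb-scan.tjsc.jus.br and the maa_app service are hypothetical names
MAA_APP =
  (DESCRIPTION =
    (CONNECT_TIMEOUT = 5)(RETRY_COUNT = 3)
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac11pri-scan.tjsc.jus.br)(PORT = 1521))
    )
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac11stb-scan.tjsc.jus.br)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = maa_app))
  )

The service itself would be created on both databases with something like srvctl add service -d maastb -s maa_app -r "maastb1,maastb2" -l PRIMARY (and the equivalent on maa), so that Clusterware only starts it on the side that currently holds the primary role.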
NEXT ARTICLE
In the next article I will talk about what changes for Data Guard in Oracle 12c. In addition, there will be a summary of the ten articles so that you have a guide to what can be done to configure MAA with Oracle RAC.