Oracle and MAA – Article IX – Automatic Failover

In this penultimate article of the series on MAA with Oracle RAC 11gR2 I will talk a bit about how the automatic failover happens when the primary fails. I will show that you basically do nothing: all the "dirty" work is done by Oracle itself.

NINTH ARTICLE

This article shows how MAA handles every step of a failover automatically. Of course, for that you need everything properly configured and operational. You will need a configured Observer with Fast-Start Failover enabled (as demonstrated here), as well as a working Broker (see here). If you read the previous articles you already have all of this configured and there is nothing else to worry about.
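If you want to double-check those prerequisites before going any further, two DGMGRL commands are enough (a quick sanity check, connecting with SYS as in the earlier articles):

[oracle@rac11pri01 ~]$ dgmgrl sys@maa
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW FAST_START FAILOVER;

SHOW CONFIGURATION should report SUCCESS and SHOW FAST_START FAILOVER should report ENABLED with the Observer started; if either is not the case, the automatic failover described below will never be triggered.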

ENVIRONMENT

So far you have a primary database (maa) and a standby database (maastb) synchronized in Maximum Availability (with real-time redo transport). On top of that, Fast-Start Failover is enabled.
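To confirm that starting point, a quick query as SYS on any instance of each database shows the role and the protection level in effect:

SQL> SELECT database_role, protection_mode, protection_level FROM v$database;

On maa this should return PRIMARY with MAXIMUM AVAILABILITY in both protection columns, and on maastb PHYSICAL STANDBY with the same protection mode.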

Fair warning: this article may get long because of the logs; I will try to trim whatever is not strictly necessary. Even so, I recommend reading them to understand everything that happens.

AUTOMATIC FAILOVER

In this article you will see that the commands executed serve exclusively to monitor what is going on; everything you needed to do has already been done. What happens below is a simulation of what you would see in a real environment, and you certainly do not want this happening in production, right? But if you got this far you already have an environment configured with MAA and Oracle RAC, which by itself shows how important that environment is and that you want it protected against adversity.

Since an automatic failover does not happen "naturally" in real life, we will simulate it with an abort of all instances of the primary:

[oracle@rac11pri01 ~]$ srvctl stop database -d maa -o abort
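
To confirm that both RAC instances really went down you can check with srvctl (I am assuming here that the second node is called rac11pri02):

[oracle@rac11pri01 ~]$ srvctl status database -d maa
Instance maa1 is not running on node rac11pri01
Instance maa2 is not running on node rac11pri02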

Note that the Observer log detected it right away:

[W000 07/28 21:57:00.35] Observer started.

23:43:15.41 Monday, July 28, 2014
Initiating Fast-Start Failover to database "maastb"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "maastb"
23:45:11.19 Monday, July 28, 2014

As seen above, the complete failover of the environment took about 3 minutes and the maastb database started answering as the new primary. Of course this time depends on the size of your database, but let's see how all of this happened "for real".
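A side note: the output above comes from the log file the Observer writes to. In case yours was started differently, something along these lines keeps the Observer running in background with a log (just a sketch; the paths and the password are placeholders, and sbdobs is the Observer host of this environment):

[oracle@sbdobs ~]$ nohup dgmgrl -logfile /home/oracle/observer.log sys/password@maastb "start observer file=/home/oracle/fsfo.dat" &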

In the Broker log (of instance maastb1) we have more interesting information.
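This log is written to a drc<instance>.log file in the same trace directory as the alert log, so on the standby node it can be followed with something like this (path assuming the standard diag layout used throughout this environment):

[root@rac11stb01 ~]# tail -f /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/drcmaastb1.log

This is what it recorded during the failover: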

07/29/2014 02:23:36
Fast-Start Failover cannot proceed because: "standby has connectivity to the primary"
07/29/2014 02:23:38
FAILOVER TO maastb
07/29/2014 02:23:39
Beginning failover to database maastb
Notifying Oracle Clusterware to teardown database for FAILOVER
07/29/2014 02:24:20
DMON: Old primary "maa" needs reinstatement
07/29/2014 02:25:07
Protection mode set to MAXIMUM AVAILABILITY
07/29/2014 02:25:27
Deferring associated archivelog destinations of sites permanently disabled due to Failover
Notifying Oracle Clusterware to buildup primary database after FAILOVER
Data Guard notifying Oracle Clusterware to start services and other instances change
07/29/2014 02:25:30
Posting DB_DOWN alert ...
		... with reason Data Guard Fast-Start Failover - Primary Disconnected
07/29/2014 02:25:32
Command FAILOVER TO maastb completed
07/29/2014 02:25:35
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                                 Error  ORA-16661
07/29/2014 02:26:34
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                                 Error  ORA-16661

A few points deserve attention: the old primary will need to be reinstated as soon as it is started (since a failover took place), and there are error messages stating that there is no destination left to maintain the desired availability.

But there are places that give more detail about what happened (the individual steps), and that is the alert log of the former standby. In the log below I suppressed the TNS errors between the databases, but kept the important points.

[root@rac11stb01 ~]# tail -f /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/alert_maastb1.log
Tue Jul 29 02:22:36 2014
RFS[15]: Possible network disconnect with primary database
Tue Jul 29 02:22:36 2014
RFS[13]: Possible network disconnect with primary database
Tue Jul 29 02:22:36 2014
RFS[18]: Possible network disconnect with primary database
Tue Jul 29 02:22:37 2014
RFS[19]: Possible network disconnect with primary database
Tue Jul 29 02:22:37 2014
RFS[17]: Possible network disconnect with primary database
Tue Jul 29 02:23:37 2014
Attempting Fast-Start Failover because the threshold of 60 seconds has elapsed.
Tue Jul 29 02:23:39 2014
Data Guard Broker: Beginning failover
Tue Jul 29 02:23:44 2014
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Tue Jul 29 02:23:45 2014
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_pr00_3786.trc:
ORA-16037: user requested cancel of managed recovery operation
Tue Jul 29 02:23:45 2014
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 19932270
Tue Jul 29 02:23:49 2014
MRP0: Background Media Recovery process shutdown (maastb1)
Managed Standby Recovery Canceled (maastb1)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
Attempt to do a Terminal Recovery (maastb1)
Media Recovery Start: Managed Standby Recovery (maastb1)
 started logmerger process
Tue Jul 29 02:23:55 2014
Managed Standby Recovery not using Real Time Apply
Tue Jul 29 02:23:57 2014
Parallel Media Recovery started with 2 slaves
Tue Jul 29 02:24:00 2014
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '07/29/2014 02:24:00'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 1181 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 6 Seq 1181 Reading mem 0
  Mem# 0: +DG01/maastb/onlinelog/group_6.267.844716079
  Mem# 1: +FRA/maastb/onlinelog/group_6.604.844716081
Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0xffff.ffffffff
Terminal Recovery: thread 2 seq# 1395 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 2 Group 9 Seq 1395 Reading mem 0
  Mem# 0: +DG01/maastb/onlinelog/group_9.260.844716095
  Mem# 1: +FRA/maastb/onlinelog/group_9.591.844716097
Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 19932272 time 07/29/2014 05:42:13
Media Recovery Complete (maastb1)
Terminal Recovery: successful completion
Tue Jul 29 02:24:02 2014
ARC0: Archiving not possible: error count exceeded
Tue Jul 29 02:24:03 2014
ARC3: Archiving not possible: error count exceeded
Forcing ARSCN to IRSCN for TR 0:19932272
Attempt to set limbo arscn 0:19932272 irscn 0:19932272
Resetting standby activation ID 728027883 (0x2b64d2eb)
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance maastb1 - Archival Error
ORA-16014: log 6 sequence# 1181 not archived, no available destinations
ORA-00312: online log 6 thread 1: '+DG01/maastb/onlinelog/group_6.267.844716079'
ORA-00312: online log 6 thread 1: '+FRA/maastb/onlinelog/group_6.604.844716081'
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (maastb1)
Maximum wait for role transition is 15 minutes.
Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/maastb/maastb1/trace/maastb1_rsm0_3741.trc
Tue Jul 29 02:24:08 2014
Standby terminal recovery start SCN: 19932270
RESETLOGS after incomplete recovery UNTIL CHANGE 19932272
Online log +DG01/maastb/onlinelog/group_1.257.844716051: Thread 1 Group 1 was previously cleared
Online log +FRA/maastb/onlinelog/group_1.568.844716053: Thread 1 Group 1 was previously cleared
Online log +DG01/maastb/onlinelog/group_2.268.844716057: Thread 1 Group 2 was previously cleared
Online log +FRA/maastb/onlinelog/group_2.564.844716059: Thread 1 Group 2 was previously cleared
Online log +DG01/maastb/onlinelog/group_3.269.844716063: Thread 2 Group 3 was previously cleared
Online log +FRA/maastb/onlinelog/group_3.562.844716065: Thread 2 Group 3 was previously cleared
Online log +DG01/maastb/onlinelog/group_4.270.844716067: Thread 2 Group 4 was previously cleared
Online log +FRA/maastb/onlinelog/group_4.559.844716071: Thread 2 Group 4 was previously cleared
Standby became primary SCN: 19932269
Tue Jul 29 02:24:12 2014
Setting recovery target incarnation to 4
Switchover: Complete - Database mounted as primary
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
Tue Jul 29 02:24:20 2014
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY
Completed: ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY
ALTER DATABASE OPEN
Data Guard Broker initializing...
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Tue Jul 29 02:24:21 2014
Assigning activation ID 732752315 (0x2bace9bb)
LGWR: Primary database is in MAXIMUM AVAILABILITY mode
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR
Thread 1 advanced to log sequence 2 (thread open)
Tue Jul 29 02:24:22 2014
ARC3: Becoming the 'no SRL' ARCH
Tue Jul 29 02:24:22 2014
ARC0: Becoming the 'no SRL' ARCH
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: +DG01/maastb/onlinelog/group_2.268.844716057
  Current log# 2 seq# 2 mem# 1: +FRA/maastb/onlinelog/group_2.564.844716059
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Jul 29 02:24:23 2014
SMON: enabling cache recovery
Tue Jul 29 02:24:23 2014
...
...
...
TNS-12564: TNS:connection refused
	ns secondary err code: 0
	nt main err code: 0
	nt secondary err code: 0
	nt OS err code: 0
Error 12514 received logging on to the standby
PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514.
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Tue Jul 29 02:24:30 2014
FSFP started with pid=40, OS id=9787
Tue Jul 29 02:24:34 2014
minact-scn: Inst 1 is a slave inc#:4 mmon proc-id:3596 status:0x2
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Tue Jul 29 02:24:37 2014
[3741] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:8127994 end:8132374 diff:4380 (43 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Tue Jul 29 02:24:37 2014
SMON: enabling tx recovery
Database Characterset is WE8MSWIN1252
Tue Jul 29 02:24:48 2014
No Resource Manager plan active
Starting background process GTX0
Tue Jul 29 02:24:57 2014
GTX0 started with pid=41, OS id=9799
Starting background process RCBG
Tue Jul 29 02:24:57 2014
RCBG started with pid=45, OS id=9802
replication_dependency_tracking turned off (no async multimaster replication found)
Tue Jul 29 02:24:59 2014
Starting background process QMNC
Tue Jul 29 02:24:59 2014
QMNC started with pid=46, OS id=9804
Completed: ALTER DATABASE OPEN
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maastb1';
ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maastb1';
Tue Jul 29 02:25:18 2014
...
...
...
TNS-12564: TNS:connection refused
	ns secondary err code: 0
	nt main err code: 0
	nt secondary err code: 0
	nt OS err code: 0
Error 12514 received logging on to the standby
PING[ARC2]: Heartbeat failed to connect to standby 'maa'. Error is 12514.
Tue Jul 29 02:25:20 2014
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
Tue Jul 29 02:25:20 2014
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Tue Jul 29 02:25:20 2014
Starting background process VKRM
Tue Jul 29 02:25:20 2014
VKRM started with pid=49, OS id=9823
Tue Jul 29 02:25:22 2014
Starting background process CJQ0
Tue Jul 29 02:25:22 2014
CJQ0 started with pid=58, OS id=9838
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='+DATA/maa','+DG01/maastb','+FRA/maa','+FRA/maastb' SCOPE=SPFILE;
Setting Resource Manager plan SCHEDULER[0x3189]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Tue Jul 29 02:25:27 2014
Starting background process SMCO
Tue Jul 29 02:25:27 2014
SMCO started with pid=56, OS id=9845
Tue Jul 29 02:25:28 2014
ALTER SYSTEM SET log_archive_dest_state_2='RESET' SCOPE=BOTH;
Tue Jul 29 02:25:28 2014
Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 4194304 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query:
 select total_size,awr_flush_emergency_count from v$ash_info;
Failover succeeded. Primary database is now maastb.

Note that at the beginning of the alert log there are messages indicating that the primary may be unavailable, and after 60 seconds (the threshold I defined when configuring Fast-Start Failover) the automatic failover started. After that, redo apply was cancelled on the standby. A bit further down the database was switched over to the primary role, along with the SCN used as the turning point (RESETLOGS after incomplete recovery UNTIL CHANGE 19932272). After that the database was opened in OPEN mode and destination LOG_ARCHIVE_DEST_2 shows up as unsynchronized (which is normal, since the old primary is down).
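You can see that same state from SQL on the new primary, since v$database exposes the fast-start failover columns (a quick sanity check as SYS on maastb1):

SQL> SELECT database_role, fs_failover_status, fs_failover_current_target FROM v$database;

At this point the role is already PRIMARY and the fast-start failover status reflects the unsynchronized situation reported by the Broker.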

In the Broker we have the configuration shown below. Note that the maastb database is already listed as primary and the maa database is in error. Besides that, Fast-Start Failover remains enabled and the old primary must be reinstated when it is started.

[oracle@rac11stb01 ~]$ dgmgrl
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@maastb
Password:
Connected.
DGMGRL> SHOW CONFIGURATION VERBOSE;

Configuration - dgrac

  Protection Mode: MaxAvailability
  Databases:
	maastb - Primary database
	  Warning: ORA-16817: unsynchronized fast-start failover configuration

	maa    - (*) Physical standby database (disabled)
	  ORA-16661: the standby database needs to be reinstated

  (*) Fast-Start Failover target

  Properties:
	FastStartFailoverThreshold      = '60'
	OperationTimeout                = '30'
	FastStartFailoverLagLimit       = '30'
	CommunicationTimeout            = '180'
	FastStartFailoverAutoReinstate  = 'TRUE'
	FastStartFailoverPmyShutdown    = 'TRUE'
	BystandersFollowRoleChange      = 'ALL'

Fast-Start Failover: ENABLED

  Threshold:        60 seconds
  Target:           maa
  Observer:         sbdobs
  Lag Limit:        30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configuration Status:
WARNING

DGMGRL>
DGMGRL> SHOW DATABASE maa;

Database - maa

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Real Time Query: OFF
  Instance(s):
	maa1
	maa2

Database Status:
ORA-16661: the standby database needs to be reinstated

DGMGRL> SHOW DATABASE maastb;

Database - maastb

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
	maastb1
	maastb2

  Database Warning(s):
	ORA-16817: unsynchronized fast-start failover configuration

Database Status:
WARNING

DGMGRL>

As in previous articles, CRS needs a manual adjustment so that the new primary always starts in OPEN mode:

[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v
Database unique name: maastb
Database name:
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile:
Domain:
Start options: mount
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: maastb
Database instances: maastb1,maastb2
Disk Groups: DG01,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11stb01 ~]$ srvctl modify database -d maastb -s OPEN
[oracle@rac11stb01 ~]$ srvctl config database -d maastb -v
Database unique name: maastb
Database name:
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: maastb
Database instances: maastb1,maastb2
Disk Groups: DG01,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11stb01 ~]$

RESTARTING THE OLD PRIMARY

You may have noticed above that the old primary has not been restarted yet; if you start it now it will be reinstated and come back as the new standby. In any case, before bringing the old primary back I recommend investigating the cause of the outage; there is no point in having a standby environment you cannot trust. Quite often it is impossible to bring the old primary back at all, and in that case you need to remove it from the Broker and add a new standby.
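If you do end up in that last scenario, the replacement is handled in DGMGRL, roughly like the sketch below. Here "newstb" is a hypothetical db_unique_name for the replacement standby, which you would have to build first (an RMAN duplicate, for example), and you may also need to point the FastStartFailoverTarget property of maastb to it before re-enabling Fast-Start Failover:

DGMGRL> DISABLE FAST_START FAILOVER;
DGMGRL> REMOVE DATABASE maa;
DGMGRL> ADD DATABASE newstb AS CONNECT IDENTIFIER IS newstb MAINTAINED AS PHYSICAL;
DGMGRL> ENABLE DATABASE newstb;
DGMGRL> ENABLE FAST_START FAILOVER;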

Assuming you manage to bring the old primary back and keep it stable, you will see something like the alert log below. It is a log with a lot of information: the database tries to open in OPEN mode and Data Guard reports that it is no longer the primary; the database is then flashed back to the SCN detected as the recovery point and converted to a physical standby; finally the database is restarted as the standby and the archived logs are synchronized with the primary.
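That flashback step is also the reason the reinstatement works at all: Fast-Start Failover can only reinstate the old primary automatically because Flashback Database was left enabled on it. If you want to confirm that before letting it proceed, a one-line check as SYS once the instance is mounted is enough:

SQL> SELECT flashback_on FROM v$database;

It must return YES; otherwise the reinstate fails and you are back in the "remove it from the Broker and create a new standby" scenario.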

[root@rac11pri01 ~]# tail -f /u01/app/oracle/diag/rdbms/maa/maa1/trace/alert_maa
  Current log# 1 seq# 1181 mem# 1: +FRA/maa/onlinelog/group_1.286.843488555
Tue Jul 29 04:30:00 2014
Archived Log entry 5432 added for thread 1 sequence 1180 ID 0x2b64d2eb dest 1:
Tue Jul 29 05:42:12 2014
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 3606): terminating the instance
Instance terminated by USER, pid = 3606
Tue Jul 29 05:42:17 2014
Instance shutdown complete
Tue Jul 29 06:16:58 2014
Starting ORACLE instance (normal)
****************** Large Pages Information *****************

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB

RECOMMENDATION:
  Total Shared Global Region size is 1026 MB. For optimal performance,
  prior to the next instance restart increase the number
  of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB)
  system wide to get 100% of the Shared
  Global Region allocated with Large pages
***********************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.
  [name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'eth0' configured from GPnP for use as a public interface.
  [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:2' configured from GPnP for use as a public interface.
  [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:4' configured from GPnP for use as a public interface.
  [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:5' configured from GPnP for use as a public interface.
  [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning and Real Application Clusters options.
ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1
System name:    Linux
Node name:      rac11pri01.tjsc.jus.br
Release:        2.6.39-400.17.1.el6uek.x86_64
Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013
Machine:        x86_64
VM name:        VMWare Version: 6
Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora
System parameters with non-default values:
  processes                = 150
  spfile                   = "+DATA/maa/spfilemaa.ora"
  sga_target               = 1G
  control_files            = "+DATA/maa/controlfile/current.273.843488553"
  control_files            = "+FRA/maa/controlfile/current.256.843488553"
  db_file_name_convert     = "+DG01/maastb"
  db_file_name_convert     = "+DATA/maa"
  db_file_name_convert     = "+FRA/maastb"
  db_file_name_convert     = "+FRA/maa"
  log_file_name_convert    = "+DG01/maastb"
  log_file_name_convert    = "+DATA/maa"
  log_file_name_convert    = "+FRA/maastb"
  log_file_name_convert    = "+FRA/maa"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  log_archive_dest_1       = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa"
  log_archive_dest_2       = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb"
  log_archive_dest_state_2 = "ENABLE"
  log_archive_min_succeed_dest= 1
  fal_server               = "maastb"
  log_archive_trace        = 0
  log_archive_config       = "DG_CONFIG=(maa,maastb)"
  log_archive_format       = "arch_%t_%s_%r.arc"
  log_archive_max_processes= 4
  archive_lag_target       = 0
  cluster_database         = TRUE
  db_create_file_dest      = "+DATA"
  db_recovery_file_dest    = "+FRA"
  db_recovery_file_dest_size= 10G
  standby_file_management  = "AUTO"
  thread                   = 1
  undo_tablespace          = "UNDOTBS1"
  instance_number          = 1
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  remote_listener          = "rac11pri-scan.tjsc.jus.br:1521"
  audit_file_dest          = "/u01/app/oracle/admin/maa/adump"
  audit_trail              = "DB"
  db_name                  = "maa"
  db_unique_name           = "maa"
  open_cursors             = 300
  pga_aggregate_target     = 256M
  dg_broker_start          = TRUE
  dg_broker_config_file1   = "+DATA/maa/dr1maa.dat"
  dg_broker_config_file2   = "+FRA/maa/dr2maa.dat"
  diagnostic_dest          = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
  169.254.187.5
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Jul 29 06:17:06 2014
PMON started with pid=2, OS id=7421
Tue Jul 29 06:17:06 2014
PSP0 started with pid=3, OS id=7423
Tue Jul 29 06:17:08 2014
VKTM started with pid=4, OS id=7425 at elevated priority
VKTM running at (1)millisec precision with DBRM quantum (100)ms
Tue Jul 29 06:17:08 2014
GEN0 started with pid=5, OS id=7430
Tue Jul 29 06:17:08 2014
DIAG started with pid=6, OS id=7432
Tue Jul 29 06:17:08 2014
DBRM started with pid=7, OS id=7434
Tue Jul 29 06:17:08 2014
PING started with pid=8, OS id=7436
Tue Jul 29 06:17:08 2014
ACMS started with pid=9, OS id=7438
Tue Jul 29 06:17:08 2014
DIA0 started with pid=10, OS id=7440
Tue Jul 29 06:17:08 2014
LMON started with pid=11, OS id=7442
Tue Jul 29 06:17:08 2014
LMD0 started with pid=12, OS id=7444
* Load Monitor used for high load check
* New Low - High Load Threshold Range = [1920 - 2560]
Tue Jul 29 06:17:08 2014
LMS0 started with pid=13, OS id=7446 at elevated priority
Tue Jul 29 06:17:08 2014
RMS0 started with pid=14, OS id=7450
Tue Jul 29 06:17:08 2014
LMHB started with pid=15, OS id=7452
Tue Jul 29 06:17:08 2014
MMAN started with pid=16, OS id=7454
Tue Jul 29 06:17:08 2014
DBW0 started with pid=17, OS id=7456
Tue Jul 29 06:17:09 2014
LGWR started with pid=18, OS id=7458
Tue Jul 29 06:17:09 2014
CKPT started with pid=19, OS id=7460
Tue Jul 29 06:17:09 2014
SMON started with pid=20, OS id=7462
Tue Jul 29 06:17:09 2014
RECO started with pid=21, OS id=7464
Tue Jul 29 06:17:09 2014
RBAL started with pid=22, OS id=7466
Tue Jul 29 06:17:09 2014
ASMB started with pid=23, OS id=7468
Tue Jul 29 06:17:09 2014
MMON started with pid=24, OS id=7470
Tue Jul 29 06:17:09 2014
MMNL started with pid=25, OS id=7474
NOTE: initiating MARK startup
Starting background process MARK
Tue Jul 29 06:17:09 2014
MARK started with pid=26, OS id=7476
lmon registered with NM - instance number 1 (internal mem no 0)
NOTE: MARK has subscribed
Reconfiguration started (old inc 0, new inc 2)
List of instances:
 1 (myinst: 1)
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Tue Jul 29 06:17:11 2014
LCK0 started with pid=28, OS id=7482
Tue Jul 29 06:17:11 2014
Starting background process RSMN
Tue Jul 29 06:17:11 2014
RSMN started with pid=29, OS id=7484
Error: can not register my instance state - 4
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle
Tue Jul 29 06:17:12 2014
DMON started with pid=30, OS id=7486
Reconfiguration started (old inc 2, new inc 4)
List of instances:
 1 2 (myinst: 1)
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Tue Jul 29 06:17:13 2014
ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1';
ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */
This instance was first to mount
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established
SUCCESS: diskgroup FRA was mounted
NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established
Tue Jul 29 06:17:18 2014
NSS2 started with pid=33, OS id=7506
Successful mount of redo thread 1, with mount id 732690857
Allocated 3981120 bytes in shared pool for flashback generation buffer
Starting background process RVWR
Tue Jul 29 06:17:18 2014
RVWR started with pid=34, OS id=7508
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {1:18282:220} */
ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */
Data Guard Broker initializing...
Data Guard Broker initialization complete
Data Guard: verifying database primary role...
Tue Jul 29 06:17:25 2014
Starting Data Guard Broker (DMON)
Tue Jul 29 06:17:25 2014
INSV started with pid=36, OS id=7514
Tue Jul 29 06:17:29 2014
NSV1 started with pid=37, OS id=7516
Tue Jul 29 06:17:37 2014
Data Guard: version check completed
Data Guard determines a failover has occurred - this is no longer a primary database
ORA-16649 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:18282:220} */...
Tue Jul 29 06:17:52 2014
Reconfiguration started (old inc 4, new inc 6)
List of instances:
 1 (myinst: 1)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Tue Jul 29 06:17:52 2014
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Tue Jul 29 06:17:59 2014
NSV1 started with pid=39, OS id=7566
Tue Jul 29 06:18:07 2014
RSM0 started with pid=40, OS id=7570
Tue Jul 29 06:18:11 2014
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
RFS[1]: Assigned to RFS process 7573
RFS[1]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857)
FLASHBACK DATABASE TO SCN 19932269
Tue Jul 29 06:18:13 2014
RFS[2]: Assigned to RFS process 7577
RFS[2]: Database mount ID mismatch [0x2bace9bb:0x2babf9a9] (732752315:732690857)
Flashback Restore Start
Flashback Restore Complete
Flashback Media Recovery Start
 started logmerger process
Tue Jul 29 06:18:18 2014
Parallel Media Recovery started with 2 slaves
Tue Jul 29 06:18:18 2014
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1181 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_1.272.843488553
  Mem# 1: +FRA/maa/onlinelog/group_1.286.843488555
Recovery of Online Redo Log: Thread 2 Group 3 Seq 1394 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_3.257.843489101
  Mem# 1: +FRA/maa/onlinelog/group_3.284.843489101
Recovery of Online Redo Log: Thread 2 Group 4 Seq 1395 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_4.262.843489103
  Mem# 1: +FRA/maa/onlinelog/group_4.283.843489103
Incomplete Recovery applied until change 19932270 time 07/29/2014 05:42:12
Flashback Media Recovery Complete
Completed: FLASHBACK DATABASE TO SCN 19932269
alter database convert to physical standby
ALTER DATABASE CONVERT TO PHYSICAL STANDBY (maa1)
Flush standby redo logfile failed:1649
Clearing standby activation ID 728027883 (0x2b64d2eb)
The primary database controlfile was created using the
'MAXLOGFILES 192' clause.
There is space for up to 188 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800;
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Clearing online redo logfile 1 +DATA/maa/onlinelog/group_1.272.843488553
Clearing online log 1 of thread 1 sequence number 1181
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 +DATA/maa/onlinelog/group_2.271.843488555
Clearing online log 2 of thread 1 sequence number 1180
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 +DATA/maa/onlinelog/group_3.257.843489101
Clearing online log 3 of thread 2 sequence number 1394
Clearing online redo logfile 3 complete
Clearing online redo logfile 4 +DATA/maa/onlinelog/group_4.262.843489103
Clearing online log 4 of thread 2 sequence number 1395
Clearing online redo logfile 4 complete
Completed: alter database convert to physical standby
Tue Jul 29 06:18:29 2014
Shutting down instance (immediate)
Shutting down instance: further logons disabled
Stopping background process MMNL
Stopping background process MMON
License high water mark = 6
alter database CLOSE NORMAL
ORA-1109 signalled during: alter database CLOSE NORMAL...
alter database DISMOUNT
Shutting down archive processes
Archiving is disabled
Tue Jul 29 06:18:33 2014
NOTE: Deferred communication with ASM instance
NOTE: deferred map free for map id 5
Tue Jul 29 06:18:33 2014
NOTE: Deferred communication with ASM instance
NOTE: deferred map free for map id 150
Tue Jul 29 06:18:34 2014
NOTE: Deferred communication with ASM instance
NOTE: deferred map free for map id 2
Completed: alter database DISMOUNT
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Shutting down Data Guard Broker processes
Tue Jul 29 06:18:35 2014
NOTE: Deferred communication with ASM instance
Tue Jul 29 06:18:36 2014
Completed: Data Guard Broker shutdown
NOTE: force a map free for map id 2
Tue Jul 29 06:18:38 2014
Stopping background process VKTM
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Tue Jul 29 06:18:38 2014
NOTE: Shutting down MARK background process
NOTE: force a map free for map id 192
NOTE: force a map free for map id 190
NOTE: force a map free for map id 188
NOTE: force a map free for map id 186
NOTE: force a map free for map id 184
NOTE: force a map free for map id 182
NOTE: force a map free for map id 143
NOTE: force a map free for map id 191
NOTE: force a map free for map id 189
NOTE: force a map free for map id 187
NOTE: force a map free for map id 185
NOTE: force a map free for map id 183
NOTE: force a map free for map id 181
NOTE: force a map free for map id 142
Tue Jul 29 06:18:41 2014
freeing rdom 0
Tue Jul 29 06:18:43 2014
Instance shutdown complete
Tue Jul 29 06:18:44 2014
Starting ORACLE instance (normal)
****************** Large Pages Information *****************

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB

RECOMMENDATION:
  Total Shared Global Region size is 1026 MB. For optimal performance,
  prior to the next instance restart increase the number
  of unused Large Pages by atleast 513 2048 KB Large Pages (1026 MB)
  system wide to get 100% of the Shared
  Global Region allocated with Large pages
***********************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.
  [name='eth2:1', type=1, ip=169.254.187.5, mac=00-0c-29-8c-91-79, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'eth0' configured from GPnP for use as a public interface.
  [name='eth0', type=1, ip=10.17.42.10, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:2' configured from GPnP for use as a public interface.
  [name='eth0:2', type=1, ip=10.17.42.22, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:4' configured from GPnP for use as a public interface.
  [name='eth0:4', type=1, ip=10.17.42.17, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Public Interface 'eth0:5' configured from GPnP for use as a public interface.
  [name='eth0:5', type=1, ip=10.17.42.14, mac=00-0c-29-8c-91-65, net=10.17.42.0/26, mask=255.255.255.192, use=public/1]
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning and Real Application Clusters options.
ORACLE_HOME = /u01/app/oracle/product/11.2.0.3/db_1
System name:    Linux
Node name:      rac11pri01.tjsc.jus.br
Release:        2.6.39-400.17.1.el6uek.x86_64
Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013
Machine:        x86_64
VM name:        VMWare Version: 6
Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0.3/db_1/dbs/initmaa1.ora
System parameters with non-default values:
  processes                = 150
  spfile                   = "+DATA/maa/spfilemaa.ora"
  sga_target               = 1G
  control_files            = "+DATA/maa/controlfile/current.273.843488553"
  control_files            = "+FRA/maa/controlfile/current.256.843488553"
  db_file_name_convert     = "+DG01/maastb"
  db_file_name_convert     = "+DATA/maa"
  db_file_name_convert     = "+FRA/maastb"
  db_file_name_convert     = "+FRA/maa"
  log_file_name_convert    = "+DG01/maastb"
  log_file_name_convert    = "+DATA/maa"
  log_file_name_convert    = "+FRA/maastb"
  log_file_name_convert    = "+FRA/maa"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  log_archive_dest_1       = "LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=maa"
  log_archive_dest_2       = "SERVICE=maastb SYNC AFFIRM LGWR VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=maastb"
  log_archive_dest_state_2 = "ENABLE"
  log_archive_min_succeed_dest= 1
  fal_server               = "maastb"
  log_archive_trace        = 0
  log_archive_config       = "DG_CONFIG=(maa,maastb)"
  log_archive_format       = "arch_%t_%s_%r.arc"
  log_archive_max_processes= 4
  archive_lag_target       = 0
  cluster_database         = TRUE
  db_create_file_dest      = "+DATA"
  db_recovery_file_dest    = "+FRA"
  db_recovery_file_dest_size= 10G
  standby_file_management  = "AUTO"
  thread                   = 1
  undo_tablespace          = "UNDOTBS1"
  instance_number          = 1
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  remote_listener          = "rac11pri-scan.tjsc.jus.br:1521"
  audit_file_dest          = "/u01/app/oracle/admin/maa/adump"
  audit_trail              = "DB"
  db_name                  = "maa"
  db_unique_name           = "maa"
  open_cursors             = 300
  pga_aggregate_target     = 256M
  dg_broker_start          = TRUE
  dg_broker_config_file1   = "+DATA/maa/dr1maa.dat"
  dg_broker_config_file2   = "+FRA/maa/dr2maa.dat"
  diagnostic_dest          = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
  169.254.187.5
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Jul 29 06:18:47 2014
PMON started with pid=2, OS id=7618
Tue Jul 29 06:18:47 2014
PSP0 started with pid=3, OS id=7620
Tue Jul 29 06:18:48 2014
VKTM started with pid=4, OS id=7622 at elevated priority
VKTM running at (1)millisec precision with DBRM quantum (100)ms
Tue Jul 29 06:18:48 2014
GEN0 started with pid=5, OS id=7626
Tue Jul 29 06:18:48 2014
DIAG started with pid=6, OS id=7628
Tue Jul 29 06:18:48 2014
DBRM started with pid=7, OS id=7630
Tue Jul 29 06:18:48 2014
PING started with pid=8, OS id=7632
Tue Jul 29 06:18:48 2014
ACMS started with pid=9, OS id=7634
Tue Jul 29 06:18:49 2014
DIA0 started with pid=10, OS id=7636
Tue Jul 29 06:18:49 2014
LMON started with pid=11, OS id=7638
Tue Jul 29 06:18:49 2014
LMD0 started with pid=12, OS id=7640
Tue Jul 29 06:18:49 2014
LMS0 started with pid=13, OS id=7642 at elevated priority
Tue Jul 29 06:18:49 2014
RMS0 started with pid=14, OS id=7646
Tue Jul 29 06:18:49 2014
LMHB started with pid=15, OS id=7648
Tue Jul 29 06:18:49 2014
MMAN started with pid=16, OS id=7650
* Load Monitor used for high load check
* New Low - High Load Threshold Range = [1920 - 2560]
Tue Jul 29 06:18:49 2014
DBW0 started with pid=17, OS id=7652
Tue Jul 29 06:18:49 2014
LGWR started with pid=18, OS id=7654
Tue Jul 29 06:18:50 2014
CKPT started with pid=19, OS id=7656
Tue Jul 29 06:18:50 2014
SMON started with pid=20, OS id=7658
Tue Jul 29 06:18:50 2014
RECO started with pid=21, OS id=7660
Tue Jul 29 06:18:50 2014
RBAL started with pid=22, OS id=7662
Tue Jul 29 06:18:50 2014
ASMB started with pid=23, OS id=7664
Tue Jul 29 06:18:50 2014
MMON started with pid=24, OS id=7666
Tue Jul 29 06:18:50 2014
MMNL started with pid=25, OS id=7670
lmon registered with NM - instance number 1 (internal mem no 0)
NOTE: initiating MARK startup
Starting background process MARK
Tue Jul 29 06:18:50 2014
MARK started with pid=26, OS id=7672
NOTE: MARK has subscribed
Reconfiguration started (old inc 0, new inc 2)
List of instances:
 1 (myinst: 1)
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Tue Jul 29 06:18:50 2014
LCK0 started with pid=28, OS id=7678
Starting background process RSMN
Tue Jul 29 06:18:51 2014
RSMN started with pid=29, OS id=7680
ORACLE_BASE from environment = /u01/app/oracle
Tue Jul 29 06:18:51 2014
DMON started with pid=30, OS id=7682
Tue Jul 29 06:18:51 2014
alter database  mount
This instance was first to mount
Tue Jul 29 06:18:52 2014
ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.17.42.14)(PORT=1521))))' SCOPE=MEMORY SID='maa1';
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
SUCCESS: diskgroup FRA was mounted
NOTE: dependency between database maa and diskgroup resource ora.DATA.dg is established
NOTE: dependency between database maa and diskgroup resource ora.FRA.dg is established
Tue Jul 29 06:18:58 2014
NSS2 started with pid=33, OS id=7720
ARCH: STARTING ARCH PROCESSES
Tue Jul 29 06:18:58 2014
ARC0 started with pid=34, OS id=7722
ARC0: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Tue Jul 29 06:18:59 2014
ARC1 started with pid=35, OS id=7724
Successful mount of redo thread 1, with mount id 732706059
Tue Jul 29 06:18:59 2014
ARC2 started with pid=36, OS id=7726
Allocated 3981120 bytes in shared pool for flashback generation buffer
Tue Jul 29 06:18:59 2014
ARC3 started with pid=37, OS id=7728
Starting background process RVWR
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Tue Jul 29 06:18:59 2014
RVWR started with pid=38, OS id=7730
Physical Standby Database mounted.
Lost write protection disabled
ARC2: Becoming the active heartbeat ARCH
ARC2: Becoming the active heartbeat ARCH
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Completed: alter database  mount
Tue Jul 29 06:19:04 2014
Starting Data Guard Broker (DMON)
Tue Jul 29 06:19:05 2014
INSV started with pid=40, OS id=7747
Tue Jul 29 06:19:11 2014
NSV1 started with pid=41, OS id=7752
Tue Jul 29 06:19:16 2014
RSM0 started with pid=43, OS id=7758
Tue Jul 29 06:19:21 2014
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='maa1';
Standby controlfile consistent with primary
ALTER SYSTEM SET log_archive_format='arch_%t_%s_%r.arc' SCOPE=SPFILE SID='maa1';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
RFS[1]: Assigned to RFS process 7760
RFS[1]: Selected log 5 for thread 1 sequence 4 dbid 722024964 branch 854159044
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM SET db_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE;
ALTER SYSTEM SET log_file_name_convert='+DG01/maastb','+DATA/maa','+FRA/maastb','+FRA/maa' SCOPE=SPFILE;
ALTER SYSTEM SET fal_server='maastb' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Attempt to start background Managed Standby Recovery process (maa1)
Tue Jul 29 06:19:23 2014
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Tue Jul 29 06:19:23 2014
MRP0 started with pid=46, OS id=7764
MRP0: Background Managed Standby Recovery process started (maa1)
Standby controlfile consistent with primary
RFS[2]: Assigned to RFS process 7762
RFS[2]: Selected log 8 for thread 2 sequence 4 dbid 722024964 branch 854159044
Tue Jul 29 06:19:25 2014
RFS[3]: Assigned to RFS process 7766
RFS[3]: Selected log 6 for thread 1 sequence 3 dbid 722024964 branch 854159044
Tue Jul 29 06:19:25 2014
RFS[4]: Assigned to RFS process 7768
RFS[4]: Selected log 9 for thread 2 sequence 3 dbid 722024964 branch 854159044
RFS[3]: New Archival REDO Branch(resetlogs_id): 854159044  Prior: 847284763
RFS[3]: Archival Activation ID: 0x2bace9bb Current: 0x0
RFS[3]: Effect of primary database OPEN RESETLOGS
RFS[3]: Managed Standby Recovery process is active
RFS[3]: Incarnation entry added for Branch(resetlogs_id): 854159044 (maa1)
Tue Jul 29 06:19:25 2014
Setting recovery target incarnation to 4
Tue Jul 29 06:19:26 2014
Archived Log entry 5433 added for thread 2 sequence 3 ID 0x2bace9bb dest 1:
Tue Jul 29 06:19:26 2014
Archived Log entry 5434 added for thread 1 sequence 3 ID 0x2bace9bb dest 1:
RFS[4]: Opened log for thread 2 sequence 1395 dbid 722024964 branch 847284763
RFS[3]: Opened log for thread 1 sequence 1181 dbid 722024964 branch 847284763
Archived Log entry 5435 added for thread 2 sequence 1395 rlc 847284763 ID 0x2b64d2eb dest 2:
 started logmerger process
Archived Log entry 5436 added for thread 1 sequence 1181 rlc 847284763 ID 0x2b64d2eb dest 2:
Managed Standby Recovery starting Real Time Apply
RFS[4]: Opened log for thread 2 sequence 2 dbid 722024964 branch 854159044
Tue Jul 29 06:19:29 2014
RFS[5]: Assigned to RFS process 7786
RFS[5]: Opened log for thread 2 sequence 1 dbid 722024964 branch 854159044
RFS[3]: Opened log for thread 1 sequence 1 dbid 722024964 branch 854159044
Archived Log entry 5437 added for thread 2 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2:
Archived Log entry 5438 added for thread 2 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2:
Parallel Media Recovery started with 2 slaves
Tue Jul 29 06:19:30 2014
Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 19932272
Archived Log entry 5439 added for thread 1 sequence 1 rlc 854159044 ID 0x2bace9bb dest 2:
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Tue Jul 29 06:19:30 2014
RFS[6]: Assigned to RFS process 7788
RFS[6]: Opened log for thread 1 sequence 2 dbid 722024964 branch 854159044
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1395.2326.854173167
Identified End-Of-Redo (failover) for thread 2 sequence 1395 at SCN 0x0.1302470
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1181.2327.854173167
Identified End-Of-Redo (failover) for thread 1 sequence 1181 at SCN 0x0.1302470
Archived Log entry 5440 added for thread 1 sequence 2 rlc 854159044 ID 0x2bace9bb dest 2:
Resetting standby activation ID 728027883 (0x2b64d2eb)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Tue Jul 29 06:19:31 2014
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_1.2329.854173169
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_1.2330.854173169
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_2.2331.854173171
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_2.2328.854173169
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_3.2325.854173167
Tue Jul 29 06:19:34 2014
Archived Log entry 5441 added for thread 2 sequence 4 ID 0x2bace9bb dest 1:
Tue Jul 29 06:19:34 2014
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Standby controlfile consistent with primary
RFS[7]: Assigned to RFS process 7806
RFS[7]: Selected log 8 for thread 2 sequence 5 dbid 722024964 branch 854159044
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_1_seq_3.2324.854173167
Media Recovery Waiting for thread 1 sequence 4 (in transit)
Recovery of Online Redo Log: Thread 1 Group 5 Seq 4 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365
  Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367
Media Recovery Log +FRA/maa/archivelog/2014_07_29/thread_2_seq_4.2332.854173175
Media Recovery Waiting for thread 2 sequence 5 (in transit)
Recovery of Online Redo Log: Thread 2 Group 8 Seq 5 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_8.259.843615383
  Mem# 1: +FRA/maa/onlinelog/group_8.703.843615385
Tue Jul 29 06:19:37 2014
Archived Log entry 5442 added for thread 1 sequence 4 ID 0x2bace9bb dest 1:
Tue Jul 29 06:19:38 2014
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
Media Recovery Waiting for thread 1 sequence 5 (in transit)
Standby controlfile consistent with primary
RFS[8]: Assigned to RFS process 7809
RFS[8]: Selected log 5 for thread 1 sequence 5 dbid 722024964 branch 854159044
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: +DATA/maa/onlinelog/group_5.263.843615365
  Mem# 1: +FRA/maa/onlinelog/group_5.289.843615367
Tue Jul 29 06:19:46 2014
Reconfiguration started (old inc 2, new inc 4)
List of instances:
 1 2 (myinst: 1)
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Tue Jul 29 06:19:46 2014
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete

In the Broker log you will see a summary of what is in the alert log: the detection that the old primary came back and needs reinstatement, after which the maa database becomes the preferred destination again.

07/29/2014 02:58:18
REINSTATE DATABASE maa
Database maa can be reinstated
07/29/2014 02:58:50
Command REINSTATE DATABASE maa completed with warning ORA-16570
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                                 Error  ORA-16661
07/29/2014 02:59:31
REINSTATE DATABASE maa
Database maa can be reinstated
07/29/2014 02:59:53
Successfully completed reinstatement of database 0x01001000; removing ReinstateContextArray property
07/29/2014 03:00:03
Command REINSTATE DATABASE maa completed
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               dgrac                             Warning  ORA-16608
  Primary Database            maastb                            Warning  ORA-16817
  Physical Standby Database   maa                               Warning  ORA-16817
EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1
07/29/2014 03:00:04
Apply Instance for database maa is maa1
Command EDIT DATABASE maa SET PROPERTY ActualApplyInstance = maa1 completed

If you were following the Observer log you would see the same information (in an even more condensed form). As the log below shows, the maa database never actually opens during the process because of the reinstate.

Failover succeeded, new primary is "maastb"
23:45:11.19  Monday, July 28, 2014
00:17:56.55  Tuesday, July 29, 2014
Initiating reinstatement for database "maa"...
Reinstating database "maa", please wait...
Operation requires shutdown of instance "maa1" on database "maa"
Shutting down instance "maa1"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "maa1" on database "maa"
Starting instance "maa1"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "maa" ...
Reinstatement of database "maa" succeeded
00:19:41.47  Tuesday, July 29, 2014

As always, the CRS adjustment needed so that the old primary does not start in OPEN mode if the server is restarted:

[oracle@rac11pri01 ~]$ srvctl config database -d maa -v
Database unique name: maa
Database name: maa
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile: +DATA/maa/spfilemaa.ora
Domain:
Start options: open
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: maa
Database instances: maa1,maa2
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11pri01 ~]$ srvctl modify database -d maa -s MOUNT
[oracle@rac11pri01 ~]$ srvctl config database -d maa -v
Database unique name: maa
Database name: maa
Oracle home: /u01/app/oracle/product/11.2.0.3/db_1
Oracle user: oracle
Spfile: +DATA/maa/spfilemaa.ora
Domain:
Start options: mount
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: maa
Database instances: maa1,maa2
Disk Groups: DATA,FRA
Mount point paths:
Services:
Type: RAC
Database is administrator managed
[oracle@rac11pri01 ~]$

FINAL ENVIRONMENT

We arrive here with an environment where both databases operate in Maximum Availability. The primary database is maastb and the standby is maa. The role change happened automatically because Fast-Start Failover is enabled, with the Observer watching everything.
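To confirm that the pair really is back in sync after all of this, v$dataguard_stats on the new standby (maa) gives a quick view of the transport and apply lag:

SQL> SELECT name, value FROM v$dataguard_stats;

With the configuration healthy, both the transport lag and the apply lag rows should show values at or very close to zero.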

Of course your environment is not just this; you will need your applications and TNS entries prepared to switch connections to the new primary. That is not the scope here, but it is an important point to verify.
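Just as a hint of what that involves, a client TNS entry can list the SCANs of both clusters so the connection follows whichever side is currently able to serve it. This is only a sketch: rac11pri-scan.tjsc.jus.br comes from this environment, rac11stb-scan.tjsc.jus.br is my assumption for the standby cluster SCAN, and "maaprod" would be a role-based service you create yourself (for example with srvctl add service and the -l PRIMARY option on both sides, so it only starts where the database holds the primary role):

MAAPROD =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac11pri-scan.tjsc.jus.br)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac11stb-scan.tjsc.jus.br)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = maaprod)
    )
  )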

NEXT ARTICLE

In the next article I will cover what changes for Data Guard in Oracle 12c, plus a summary of the ten articles so you have a guide to what can be done to configure MAA with Oracle RAC.
