Tag Archives: Data Guard

ZDLRA, Multi-site protection – ZERO RPO for Primary and Standby

ZDLRA can be used from a small single database environment to big environments where you need protection in more than one site at the same time. At every level, you can use different features of ZDLRA to provide desirable protection. Here I will show how to reach zero RPO for both primary and standby databases. All the steps, doc, and tech parts are covered.

You can check the examples the reference for every scenario int these two papers from the Oracle MAA team: MAA Overview On-Premises and Oracle MAA Reference Architectures. They provide good information on how to prepare to reduce RPO and improve RTO. In resume, the focus is the same, reduce the downtime and data loss in case of a catastrophe (zero RPO, and zero RPO).

Multi-site protection

If you looked both papers before, you saw that to provide good protection is desirable to have an additional site to, at least, send the backups. And if you go higher, for GOLD and PLATINUM environments, you start to have multiple sites synced with data guard. These Critical/Mission-critical environments need to be protected for every kind of catastrophic failure, from disk until complete site outage (some need to follow specific law’s requirements, bank as an example).

And the focus of this post is these big environments. I will show you how to use ZDLRA to protect both sites, reaching zero RPO even for standby databases. And doing that, you can survive for a catastrophic outage (like entire datacenter failure) and still have zero RPO. Going further, you can even have zero RPO if you lose completely on site when using real-time redo for ZDLRA, and this is not written in the docs by the way.

Click here to read more…

ZDLRA, Real-Time Redo and Zero RPO

The idea for Real-Time Redo is to reach zero RPO for every kind of database and this includes ones with and without DG. As you can see in my last post, where I showed how to configure Real-Time Redo for one database, some little steps need to be executed and they are pretty similar than a remote destination for archivelog for DG.

But if you noticed, the configuration for the remote destination was defined as ASYNC, and hinted like that at ZDLRA docs (“Protection of Ongoing Transactions” or at “How Real-Time Redo Transport Works”). In the same post, I suggested as “controversial” because the ASYNC does not guarantee the RPO zero. 

You can see more in the DataGuard docs at (Oracle Data Guard Protection Modes and Oracle Data Guard Concepts and Administration), but the resume it is:

  • ASYNC: The primary database does not wait for the response from a remote destination.
  • SYNC/NOAFIRM: The Primary database holds commit until the remote destination report that received the redo data. It does not wait until the remote site report that wrote the data in the disc.
  • SYNC/AFFIRM: The primary database holds commit until the remote destination report that received redo data and wrote it at the disk.

You can read with more details the difference here: Best Practices for Synchronous Redo Transport and Best Practices for Asynchronous Redo Transport.

The idea is simple, if you use ASYNC, there is no guarantee for zero data loss between the primary database and the remote destination.

Click here to read more…

ZDLRA, Real-Time Redo

Real-time redo transport is the feature that allows you to reduce to zero the RPO (Recovery Point Objective) for your database. Check how to configure real-time redo, the steps, parameters, and other details that need to be modified to enable it.

The idea behind real-time redo transport it is easy, basically the ZDLRA it is a remote destination for your redo log buffers/archivelogs of your database. It is really, really, similar to what occurs for data guard configurations (but here you don’t need to set all datafiles as an example). It is not the same too because ZDLRA can detect if the database stops/crash and will generate the archivelog (at ZDLRA side) with all the received redo and this can be used to restore to, at least zero/sub-seconds, of data loss.

Using real-time redo it is the only way to reach RPO zero. With other features of ZDLRA, you can have a better backup window time (but just that) using incremental backups. Just using real-time redo you reach zero RPO and this impacts directly how to configure for MAA compliance. There are a lot of options and level of protection for MAA that you can check at “Maximum Availability Architecture (MAA) – On-Premises HA Reference Architectures 2019”, “Maximum Availability Architecture Best Practices for Oracle Cloud”, “Oracle MAA Reference Architectures”, “Maximum Availability Architecture – Best Practices for Oracle Database 19c”.

This post starts from one environment that you already enrolled in the database at ZDLRA. I already wrote about how to do that, you can check here in my previous post. This is the first post about real-time redo, here you will see how to configure and verify it is working.

Click here to read more…

Observer, Quorum

This article closes the series for DG and Fast-Start Failover that I covered with more details the case of isolation can leverage the shutdown of your healthy/running primary database. The “ORA-16830: primary isolated from fast-start failover partners”.

In the first article, I wrote about how one simple detail that impacts dramatically the reliability of your MAA environment. Where you put your Observer in DG environment (when Fast-Start Failover is in use) have a core figure in case of outages, and you can face Primary isolation and shutdown. Besides that, there is no clear documentation to base yourself about “pros and cons” to define the correct place for Observer. You read more in my article here.

In the second article, I wrote about one new feature that can help to have more protected and cover more scenarios for Fast-Start Failover/DG. Using Multiple Observers you can remove the single point of failure and allow you to put one Observer in each side of your environment (primary, standby and a third one). You can read more in my article here.

In this last article I discuss how, even using all the features, there is no perfect solution. Another point is discussing here is how (maybe) Oracle can improve that. Below I will show more details that even multiple observers continue to shutdown a healthy primary database. Unfortunately, it is a lot of tech info and is a log thread output. But you can jump directly to the end to see the discussion about how this can be improved.

More…

Observer, More Than One

Recently I made a post about a little issue that I got with Oracle Data Guard. In that scenario, because of outage in the standby datacenter, healthy primary database shutdown with error “ORA-16830: primary isolated…”. Just to remember that the database was running with Maximum Availability, Fail-Start Failover enabled and (the most important detail) the Observer was running in the standby datacenter too.

The point from my previous post tried to show that does not exists one doc that provides full details about “pros” and “cons” where put your observer. Whatever the place, on the primary datacenter or in standby, it has little details to check. Even the best (ideal) scenario, with a third datacenter, can be tough to sustain.

Here I will try to show one option that can help you and improve the reliability of your MAA/DG environment. At least, you will have more options to decide how to protect your database. Bellow, I show some details about how to configure and use multiple observers, but if you want to jump and see a little concern you can directly to the end of the post.

More…

ZDLRA, since 2014

In Oracle Open World 2014 the Zero Data Loss Recovery Appliance (ZDLRA) was released and it changed MAA in many ways, but two principals: protection and backup. I watched the ZDLRA presentation and saw that matched with the needs that I had that time.

After OOW in 2014 I started the project (all phases, from conception, requirements until deployments and usage) that become (in 2015) the first ZDLRA installation in Brazil, and one of the first of the world too that use replicated ZDLRA to protect both sites (primary and standby) and many levels of databases (PRO, TST, DEV). The Oracle MAA at its finest was amazing: ZDLRA + Exadata + DG; everything integrated to protect both sites.

Because of the high design level of the project it was chosen to be one of the main presentation in Oracle Open World 2015 about ZDLRA, you can find the link of the presentation that I made together with ZDLRA dev team here. As told before, in this project was integrated two ZDLRA, two Exadata and DG to reach ZERO Recover Point Objective (RPO) and Recovery Time Objective (RTO) and beside that, reduce backup time. You can see the presentation to check the scope and other details.

More…

DML over Standby for Active Data Guard in 19c

With the new 19c version the Data Guard received some attention and now we can do DML over the standby and it will be redirect to primary database. It is not hard to implement, but unfortunately there is no much information about that in the docs about that.

As training exercise I tested this new feature and want to share some information about that. First, the environment that I used (and the requirements too):

  • Primary and Standby databases running 19c.
  • Data Guard in Maximum Availability .
  • Active Data Guard enabled.

Remember that the idea of DML over the standby it is to use in some cases where your reporting application need to update some tables and few records (like audit logins) while processing the data in the standby. The volume of DML is (and will be) low. At this point there is no effort to allow, or create, a multiple active-active datacenters/sites for your database. If you start to execute a lot of DML in the standby side you can impact the primary database and you adding the fact that you can maximize the problems for locks and concurrency.

More…

Observer, Where?

Some months ago I got one error with Oracle Data Guard and now I had time to review it again and write this article. Just to be clear since the begin, the discussion here it is not about the error itself, but about the circumstances that generated it.

The environment described here follow, at least, the most common best practices for DG by Oracle. Have 1 dedicated server for each one: Primary Database, Physical Standby Database and Observer. The primary and standby resides in different datacenters in different cities, dedicated network for interconnect between sites, protection mode was Maximum Availability and runs with Fast-Start Failover enabled (with 30 seconds for threshold). The version here is 12.2, but will be the same for 19c. So, nothing so bad in the environment, basic DG configuration trying to follow the best practices.

More…