GoldenGate Architecture
GoldenGate enables us to extract and replicate data across a variety of topologies, as shown in the diagram below, as well as to exchange and manipulate data at the transactional level between a variety of database platforms such as Oracle, DB2, SQL Server, Ingres and MySQL.
It can support a number of different business requirements, such as:

- Business Continuity and High Availability
- Data migrations and upgrades
- Decision Support Systems and Data Warehousing
- Data integration and consolidation
Manager
The Manager process must be running on both the source and the target system before the Extract or Replicat processes can be started. It performs a number of functions, including monitoring and starting other GoldenGate processes, managing the trail files, and reporting.
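
To make this concrete, below is a minimal sketch of a Manager parameter file (mgr.prm); the port numbers and trail location are illustrative placeholders rather than values from any particular installation:

    -- mgr.prm: minimal Manager configuration (illustrative values)
    PORT 7809
    -- Ports the Manager may hand out to dynamic Collector processes
    DYNAMICPORTLIST 7810-7820
    -- Purge trail files only once all processes have checkpointed past them
    PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS

The Manager is then started from the GGSCI command interface with START MANAGER.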
Extract
The Extract process runs on the source system and is the data capture mechanism of GoldenGate. It can be configured both for the initial load of the source data and to synchronize the changed data on the source with the target. It can also be configured to propagate any DDL changes on those databases where DDL support is available.
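
As an illustration, a change-capture Extract could be created and configured along the following lines; the group name ext1, the ggs_owner credentials, the scott.emp table and the ./dirdat/lt trail are all hypothetical:

    -- From GGSCI: register the Extract group against the transaction logs
    ADD EXTRACT ext1, TRANLOG, BEGIN NOW
    ADD EXTTRAIL ./dirdat/lt, EXTRACT ext1

    -- ext1.prm: parameter file for the Extract group
    EXTRACT ext1
    USERID ggs_owner, PASSWORD ggs_owner
    EXTTRAIL ./dirdat/lt
    -- On databases with DDL support, a DDL INCLUDE clause could be added here
    TABLE scott.emp;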
Replicat
The Replicat process runs on the target system, reads the transactional data changes as well as DDL changes from the trail, and replicates them to the target database. Like the Extract process, the Replicat process can be configured for Initial Load as well as Change Synchronization.
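
A matching Replicat could, as a sketch, look like the following; the group name rep1, the checkpoint table and the ./dirdat/rt remote trail are again hypothetical:

    -- From GGSCI: create a checkpoint table, then the Replicat group
    ADD CHECKPOINTTABLE ggs_owner.chkpt
    ADD REPLICAT rep1, EXTTRAIL ./dirdat/rt, CHECKPOINTTABLE ggs_owner.chkpt

    -- rep1.prm: parameter file for the Replicat group
    REPLICAT rep1
    USERID ggs_owner, PASSWORD ggs_owner
    -- Source and target table structures are assumed to be identical here
    ASSUMETARGETDEFS
    MAP scott.emp, TARGET scott.emp;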
Collector
The Collector is a background process which runs on the target system. It is started automatically by the Manager (Dynamic Collector), or it can be configured to start manually (Static Collector). It receives the extracted data changes that are sent via TCP/IP and writes them to the trail files, from where they are processed by the Replicat process.
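
For the static case, the Collector is the server program shipped with GoldenGate and can be started manually on the target with an explicit listening port; the sketch below uses an arbitrary port number:

    server -p 7819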
Trails
Trails are series of files that GoldenGate temporarily stores on disk; they are written to and read from by the Extract and Replicat processes, as the case may be. Depending on the configuration chosen, trail files can exist on the source as well as on the target system. A trail on the local (source) system is known as an Extract Trail, while one on the target system is known as a Remote Trail.
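
Both kinds of trail are registered from GGSCI; as an illustration (the two-character trail prefixes lt and rt and the 50 MB file size are arbitrary choices):

    -- Local (Extract) trail on the source, written by the primary Extract
    ADD EXTTRAIL ./dirdat/lt, EXTRACT ext1, MEGABYTES 50
    -- Remote trail on the target, written on behalf of the Data Pump pmp1
    ADD RMTTRAIL ./dirdat/rt, EXTRACT pmp1, MEGABYTES 50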
Data Pumps
Data Pumps are secondary Extract mechanisms which exist in the source configuration. This component is optional: if a Data Pump is not used, Extract sends data via TCP/IP directly to the remote trail on the target. When a Data Pump is configured, the primary Extract process writes to the Local Trail, the Data Pump reads this trail, and the data is sent over the network to the Remote Trail on the target system.
In the absence of a Data Pump, the data that the Extract process captures resides in memory alone; it is not stored anywhere on the source system, so in the case of network or target failures the primary Extract process can abend. A Data Pump is also useful where we are doing complex filtering and transformation of data, as well as when we are consolidating data from many sources to a central target.
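
A Data Pump is itself added as an Extract that reads the local trail rather than the database logs; the sketch below reuses the hypothetical names from the earlier examples (pmp1, targetserver, ./dirdat/lt and ./dirdat/rt):

    -- From GGSCI: the pump reads the local trail, not the transaction logs
    ADD EXTRACT pmp1, EXTTRAILSOURCE ./dirdat/lt
    ADD RMTTRAIL ./dirdat/rt, EXTRACT pmp1

    -- pmp1.prm: parameter file for the Data Pump
    EXTRACT pmp1
    -- PASSTHRU: no filtering or transformation, so no database login is needed
    PASSTHRU
    RMTHOST targetserver, MGRPORT 7809
    RMTTRAIL ./dirdat/rt
    TABLE scott.emp;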
Data source
When processing transactional data changes, the Extract process can obtain the data directly from the database transaction logs (Oracle, DB2, SQL Server, MySQL etc.) or from a GoldenGate Vendor Access Module (VAM), where the database vendor (for example Teradata) provides the required components that Extract uses to capture the data changes.
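
The capture mechanism is chosen when the Extract group is added. As an illustration, log-based capture and VAM-based capture (for example on Teradata) differ in the ADD EXTRACT clause; the group names here are hypothetical:

    -- Capture directly from the database transaction logs
    ADD EXTRACT extlog, TRANLOG, BEGIN NOW
    -- Capture through a Vendor Access Module supplied by the database vendor
    ADD EXTRACT extvam, VAM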
Groups
To differentiate between the number of different Extract and Replicat processes which can potentially co-exist on a system, we can define processing groups. For instance, if we want to replicate different sets of data in parallel, we can create two Replicat groups, as sketched below.
A processing group consists of a process (either an Extract or a Replicat process), a corresponding parameter file, a checkpoint file or checkpoint table (for Replicat), and any other files associated with that process.
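
For example, the parallel replication mentioned above could be sketched as two Replicat groups reading the same remote trail, each with its own parameter file and checkpoint (all names hypothetical):

    -- rep1.prm and rep2.prm would each MAP a different set of tables
    ADD REPLICAT rep1, EXTTRAIL ./dirdat/rt, CHECKPOINTTABLE ggs_owner.chkpt
    ADD REPLICAT rep2, EXTTRAIL ./dirdat/rt, CHECKPOINTTABLE ggs_owner.chkpt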