Project

General

Profile

How To Configure Warm Standby with PITRTools » History » Version 6

Joshua Drake, 12/04/2012 10:23 AM

1 1 Ivan Lezhnjov
{{toc}}
2
3
h1. How To Configure Warm Standby with PITRTools
4
5
This how-to demonstrates how to configure a warm standby using PostgreSQL 8.4 and PITRtools.
6
7 6 Joshua Drake
h1. *What Is PITRtools and Why Would You Want To Use It?*
8 1 Ivan Lezhnjov
9
Essentially, PITRtools is a wrapper around standard tools, such as rsync and PostgreSQL's internal functionality, that makes, among other things, creating and managing standby configurations and subsequent failover to a standby a snap.
10
11
With the help of PITRtools you could do more, namely:
12
* secure shipping of log files to configured standby server over SSL protected link;
13
* streaming replication;
14
* enable/disable archiving without the need to restart PostgreSQL;
15
* stay informed by generating alerts based on various levels of severity of events happening during the process on both ends of configuration;
16
* automatically take a base backup, including table spaces, restore archives and purge old ones (if PostgreSQL >8.3);
17
* failover to the latest restore point and point-in-time recovery (restore using timestamps);
18
* etc.
19
20
Obviously, PITRtools attempts to make things simpler, more secure and easier to manage.
21
22
h1. Naming Conventions
23
24
Master server can be referred to also as archiver.
25
Slave is often called standby.
26
27
These names are used interchangeably.
28
29
h1. The Test Setup
30
31
It's a good idea to first setup PITRtools and play with them before you actually go ahead and change configuration of your production servers. To show how PITRtools is configured and work, I'll describe the entire process using a 2 hosts test setup as an example.
32
33
So, there are 2 hosts named @bitarena@ and @bitarena-clone@, both being virtualized instances of Debian Squeeze. I assume you're experienced enough to install Debian yourself and know how to find your way around the system.
34
35
@bitarena@ is designated role of master server aka archiver, @bitarena-clone@ is a slave aka standby server. PITRtools is installed on both hosts, but each host uses different tools from the package.
36
37
38
h1. The Process
39
40
These are major steps in actual order of execution that one would have to follow to get PITRtools-enabled setup running:
41
42
On master server
43
44
		* Turn on archiving
45
		* @cmd_archiver -C $CONFIG -I@
46
47
On slave server
48
49
		* @cmd_standby -C $CONFIG -I@
50
		* @cmd_standby -C $CONFIG -B@
51
		* @cmd_standby -C $CONFIG -S@
52
53
h1. Installing and Configuring PITRtools
54
55
PITRtools project page can be found at https://public.commandprompt.com/projects/pitrtools. Of particular interest is a Wiki page https://public.commandprompt.com/projects/pitrtools/wiki where you can find information on how to obtain PITRtools, other useful notes and links.
56
57
You should use GIT version, because currently tarball offers an outdated version of PITRtools. I was told this is going to be fixed soon, but until then you should rely on the GIT repository. 
58
59
Note that essentially you need to have contents of this repository on both master and slave server. For the sake of brevity, I'm going to show the layout I use only on master server. You would then need to repeat the steps for slave server, because all that is going to change is the host where you clone PITRtools repository to.
60
61
62
<pre>root@bitarena ~/GIT# git clone git://github.com/commandprompt/PITRTools.git
63
Cloning into 'PITRTools'...
64
remote: Counting objects: 284, done.
65
remote: Compressing objects: 100% (183/183), done.
66
remote: Total 284 (delta 182), reused 203 (delta 101)
67
Receiving objects: 100% (284/284), 68.51 KiB, done.
68
Resolving deltas: 100% (182/182), done.
69
root@bitarena ~/GIT# ls
70
PITRTools
71
root@bitarena ~/GIT# cd PITRTools/
72
root@bitarena ~/GIT/PITRTools# ls
73
cmd_archiver             cmd_archiver.README  cmd_standby.ini.sample  cmd_standby.sql
74
cmd_archiver.ini.sample  cmd_standby          cmd_standby.README      cmd_worker.py
75
root@bitarena ~/GIT/PITRTools# mkdir -p /var/lib/postgresql/PITRtools/bin
76
root@bitarena ~/GIT/PITRTools# cp cmd_archiver cmd_standby cmd_worker.py /var/lib/postgresql/PITRtools/bin/
77
root@bitarena ~/GIT/PITRTools# cp *.ini.sample /var/lib/postgresql/PITRtools/
78
root@bitarena ~/GIT/PITRTools# cd /var/lib/postgresql/PITRtools/
79
root@bitarena /var/lib/postgresql/PITRtools# mv cmd_archiver.ini.sample cmd_archiver.ini
80
root@bitarena /var/lib/postgresql/PITRtools# mv cmd_standby.ini.sample cmd_standby.ini
81
root@bitarena /var/lib/postgresql/PITRtools# cd ~/GIT/PITRtools/
82
root@bitarena ~/GIT/PITRTools# chown -R postgres.postgres /var/lib/postgresql/PITRtools
83
root@bitarena:~#</pre>
84
85
h1. SSH Key-based Login for PITRtools
86
87
PITRtools relies on rsync and SSH heavily to do its work, e.g. making a base backup and shipping WAL log files from master to slave server -- all that happens over an SSL-protected communication channel. This is the area where PITRtools makes one's life easier, because otherwise you'd have to find a way to copy log files to a slave server somehow (most likely it would be some sort of networking mount point).
88
89 5 Ivan Lezhnjov
Therefore one of the prerequisites is to configure SSH key-based logins between the two hosts for @postgres@ system user that don't require a password or a passphrase.
90 1 Ivan Lezhnjov
91
*Master*
92
<pre>root@bitarena:~# su - postgres
93
94
postgres@bitarena:~$ ssh-keygen -t rsa
95
... 
96
postgres@bitarena:~$ ls -la /var/lib/postgresql/.ssh/
97
total 16
98
drwx------ 2 postgres postgres 4096 Sep 28 09:17 .
99
drwxr-xr-x 5 postgres postgres 4096 Sep 28 09:17 ..
100
-rw------- 1 postgres postgres 1675 Sep 28 09:17 id_rsa
101
-rw-r--r-- 1 postgres postgres  399 Sep 28 09:17 id_rsa.pub
102
postgres@bitarena:~#
103
</pre>
104
105
*Slave*
106
<pre>root@bitarena-clone:~# su - postgres
107
postgres@bitarena-clone:~$ ssh-keygen -t rsa
108
... </pre>
109
110
Answer all the questions and ssh-keygen will create both private and public (*.pub file) RSA keys. For now just create the key pair, one on on each host (master and slave). When done, proceed to exchanging public keys between the hosts like this:
111
112
*Master*
113
<pre>postgres@bitarena:~$ exit
114
root@bitarena:~# scp /var/lib/postgresql/.ssh/id_rsa.pub root@bitarena-clone:/var/lib/postgresql/.ssh/authorized_keys2
115
root@bitarena-clone's password: 
116
id_rsa.pub                                                                               100%  399     0.4KB/s   00:00
117
root@bitarena:~#</pre>
118
119 5 Ivan Lezhnjov
This will copy @postgres@ system user's public SSH key on master to standby into a file @authorized_keys2@, which will allow @postgres@ on master to login to slave host without a password and also in a secure fashion.
120 1 Ivan Lezhnjov
121 5 Ivan Lezhnjov
Here we check that @authorized_keys2@ has indeed been created on slave host, and we also set correct ownership information for the file.
122 1 Ivan Lezhnjov
123
*Slave*
124
<pre>root@bitarena-clone:~# ls -la /var/lib/postgresql/.ssh/
125
total 20
126
drwx------ 2 postgres postgres 4096 Sep 28 09:30 .
127
drwxr-xr-x 5 postgres postgres 4096 Sep 28 09:23 ..
128
-rw-r--r-- 1 root     root      399 Sep 28 09:30 authorized_keys2
129
-rw------- 1 postgres postgres 1675 Sep 28 09:23 id_rsa
130
-rw-r--r-- 1 postgres postgres  405 Sep 28 09:23 id_rsa.pub
131
root@bitarena-clone:~# chown postgres.postgres /var/lib/postgresql/.ssh/authorized_keys2
132
root@bitarena:~#</pre>
133
134 5 Ivan Lezhnjov
Now, to check if it actually works we go back to master server, become @postgres@ user and try to log into slave. As you can see RSA key is used this time and no password is prompted for. This is exactly what we need to make sure is configured on both hosts for PITRtools to work properly.
135 1 Ivan Lezhnjov
136
*Master*
137
<pre>root@bitarena:~# su - postgres
138
139
postgres@bitarena:~$ ssh bitarena-clone
140
The authenticity of host 'bitarena-clone (10.0.3.3)' can't be established.
141
RSA key fingerprint is dd:78:8e:fb:d6:c4:25:88:d9:57:b4:fb:49:22:de:de.
142
Are you sure you want to continue connecting (yes/no)? yes
143
Warning: Permanently added 'bitarena-clone,10.0.3.3' (RSA) to the list of known hosts.
144
Linux bitarena 2.6.32-5-686 #1 SMP Sun May 6 04:01:19 UTC 2012 i686
145
146
The programs included with the Debian GNU/Linux system are free software;
147
the exact distribution terms for each program are described in the
148
individual files in /usr/share/doc/*/copyright.
149
150
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
151
permitted by applicable law.
152
postgres@bitarena-clone:~$ exit
153
logout
154
Connection to bitarena-clone closed.
155
root@bitarena:~#</pre>
156
157
158
Remember that this has been done on master server only. In a similar manner, you should take care of the slave host. The steps are identical, you just copy *.pub file to a master host and do the rest as described above.
159
160
It's worth noting that PITRtools runs some actions remotely. For example, when a base backup action is run on a slave host, cmd_standby script establishes SSH session to a master host and runs various psql commands to deal with checkpoints, copy log files, etc. This also requires PostgreSQL user password to access the database, and it is often needed to be entered 4 and more times for a base backup action to complete. 
161
162 5 Ivan Lezhnjov
Once you played around with PITRtools enough to get hang of things, you could avoid having to enter password manually each time by either using .pgpass file (which is a "standard feature of PostgreSQL":http://www.postgresql.org/docs/8.4/static/libpq-pgpass.html) or make sure there's a "trust relationship configured for localhost in @pg_hba.conf@ file":http://www.postgresql.org/docs/8.4/static/auth-methods.html.
163 1 Ivan Lezhnjov
164
In the console output examples you're going to see below I entered password manually, but this, of course, is really inconvenient in production environment.
165
166
167
h1. Master Configuration File
168
169
Sample configuration files are pretty good as they are with the defaults they come. Chances are you're not going to change a lot in there. Make sure you read all *.README files, though, because they contain helpful extra information about PITRtools and configuration parameters that will help you decide how to best configure your servers.
170
171
We start with the master host by editing PITRtools configuration file for master/archiver.
172
173
*Master*
174
<pre>postgres@bitarena:~$ vim PITRtools/cmd_archiver.ini
175
	
176
	[DEFAULT]
177
	; online or offline
178
	state: online					
179
	
180
	; The base database directory
181
	
182
	pgdata: /var/lib/postgresql/8.4/main
183
	
184
	; where to remotely copy archives
185
	r_archivedir: /var/lib/postgresql/archive
186
	
187
	; where to locally copy archives
188
	l_archivedir: /var/lib/postgresql/archive
189
	
190
	; where is rsync				
191
	rsync_bin: /usr/bin/rsync			
192
	
193
	; extra rsync flags
194
	rsync_flags: -z
195
	
196
	; option 2 or 3, if running RHEL5 or similar it is likely 2
197
	; if you are running something that ships remotely modern software
198
	; it will be 3
199
	
200
	rsync_version = 3
201
	
202
	; IP of slave					
203
	slaves: bitarena-clone
204
	
205
	; the user that will be using scp				
206
	user: postgres				
207
	
208
	; if scp can't connect in 10 seconds error
209
	ssh_timeout: 10
210
	
211
	; command to process in ok					
212
	notify_ok: echo OK
213
	
214
	; command to process in warning
215
	notify_warning:  echo WARNING
216
	
217
	; command to process in critical
218
	notify_critical: echo CRITICAL
219
	
220
	; if you want to debug on/off only		 	
221
	debug: on
222
	
223
	; if you want ssh debug (warning noisy)
224
	ssh_debug: off
225
</pre>
226
227
228
Note that you can use domain names instead of IP addresses for slaves: parameter, it works either way just fine. You might also want to turn debugging on while you're learning PITRtools. It helps to see what software does, if you try to understand better how it works.
229
230
231
h1. Turn On and Configure Archiving
232
233
When your master configuration file has been created, we need to turn on archiving functionality in PostgreSQL. This is purely a PostgreSQL feature, but we can also take advantage of using PITRtools thanks to flexible design of PostgreSQL.
234 2 Ivan Lezhnjov
235
<pre>postgres@bitarena:~$ vim /etc/postgresql/8.4/main/postgresql.conf
236
237
...
238
archive_mode=on
239
archive_command = '/var/lib/postgresql/PITRtools/bin/cmd_archiver -C /var/lib/postgresql/PITRtools/cmd_archiver.ini -F %p'
240
...
241
</pre>
242
243
244
Here we explicitly turn archiving mode on and tell PostgreSQL to use @cmd_archiver@, part of PITRtools, as the archiving command. 
245
It must be provided with a path to a configuration file (@-C@ switch), and a path to a log file (@-F@ switch). @%p@ is substituted by PostgreSQL with the actual log file location in the file system. To learn more about how to use @cmd_archiver@ run
246
247
<pre>postgres@bitarena:~$ PITRtools/bin/cmd_archiver --help</pre>
248
249
250 5 Ivan Lezhnjov
Please, keep in mind that PITRtools isn't supposed to be run by root user. Technically, it can be run by any system user other than root, and unless you have a customized configuration your default PostgreSQL user will be @postgres@, so PITRtools are expected to be run as that system user.
251 2 Ivan Lezhnjov
252
Restart PostgreSQL for changes to take effect.
253
254
<pre>postgres@bitarena:~$ /etc/init.d/postgresql restart
255
Restarting PostgreSQL 8.4 database server: main.
256
postgres@bitarena:~#
257
</pre>
258
259
260
h1. Install Helper Scripts
261
262 5 Ivan Lezhnjov
Apply cmd_standby.sql to the database of PITRtools user (usually @postgres@). This is required for master server only.
263 2 Ivan Lezhnjov
264
<pre>root@bitarena:~# psql -U postgres < ~/GIT/PITRTools/cmd_standby.sql
265
CREATE FUNCTION
266
COMMENT
267
CREATE FUNCTION
268
CREATE FUNCTION
269
CREATE FUNCTION
270
COMMENT
271
CREATE FUNCTION
272
COMMENT
273
root@bitarena:~#</pre>
274
275
276
277
h1. Initialize Master Environment
278
279
At this point we're pretty much done configuring the master server. All that is left to do is initialize master environment by running
280
281
<pre>postgres@bitarena:~$ PITRtools/bin/cmd_archiver -C PITRtools/cmd_archiver.ini -I
282
We are initializing queues, one moment.
283
284
NOTICE: init_env_func()
285
NOTICE: generate_slave_list_func()
286
NOTICE: Your slaves are: ['bitarena-clone']
287
postgres@bitarena:~$ ls -lah
288
total 40K
289
drwxr-xr-x  7 postgres postgres 4.0K Sep 28 10:25 .
290
drwxr-xr-x 34 root     root     4.0K Sep 27 02:32 ..
291
drwxr-xr-x  3 postgres postgres 4.0K Sep 24 15:10 8.4
292
drwx------  2 postgres postgres 4.0K Sep 27 02:23 .aptitude
293
drwxr-xr-x  3 postgres postgres 4.0K Sep 28 10:25 archive
294
-rw-------  1 postgres postgres 1001 Sep 28 10:03 .bash_history
295
drwxr-xr-x  2 postgres postgres 4.0K Sep 28 10:25 PITRtools
296
-rw-------  1 postgres postgres 1.4K Sep 28 05:08 .psql_history
297
drwx------  2 postgres postgres 4.0K Sep 28 09:29 .ssh
298
-rw-------  1 postgres postgres 3.6K Sep 28 10:25 .viminfo
299
postgres@bitarena:~$ ls -lah archive/bitarena-clone/
300
total 8.0K
301
drwxr-xr-x 2 postgres postgres 4.0K Sep 28 10:25 .
302
drwxr-xr-x 3 postgres postgres 4.0K Sep 28 10:25 ..
303
postgres@bitarena:~#</pre>
304
305
306
As you can see @-I@ switch tells @cmd_archiver@ to do a couple of internal actions, as well as prepare file system layout by creating necessary directories. 
307
308
@archive/@ directory has been created automatically by @cmd_archiver@ and contains sub-directory named after IP address or DNS name of a slave, @bitarena-clone@ in this example. This sub-directory is used in those circumstances when master fails to successfully transfer WAL log files to slave and stores them temporarily in this local directory. Once slave is back online, these files should be transferred to slave.
309
310
Effectively, after you've initialized master it starts trying to ship WAL files to slave. However, the slave host isn't configured yet, so the log delivery will fail and the log files should be found in @l_archivedir/slave_FQDNorIP/@ folder on master host. Once we configure slave, these will be shipped as soon as a new WAL segment is created on master.
311
312
313
h1. Slave aka Standby Server
314
315
Now we can prepare the slave host. SSH key-based login for @postgres@ system user should be working in both directions, from master to slave, as well as from slave to master.
316
317
Assuming you've arranged for this as it was suggested before, we can now go on and edit slave configuration file to perform first important step and initialize slave environment.
318
319 5 Ivan Lezhnjov
One of the configuration file parameters, namely @pgdata:@, will ask you to specify PostgreSQL data directory. You could look it up in @postgresql.conf@, or, If your PostgreSQL is already running, you could see this information like this:
320 2 Ivan Lezhnjov
321
<pre>root@bitarena:~# ps axuwf |grep postgre
322
root      4233  0.0  0.0   3304   756 pts/1    S+   10:56   0:00          \_ grep postgre
323
postgres  3352  0.0  0.4  46452  5464 ?        S    09:59   0:03 /usr/lib/postgresql/8.4/bin/postgres 
324
-D /var/lib/postgresql/8.4/main -c config_file=/etc/postgresql/8.4/main/postgresql.conf
325
...</pre>
326
327
Remember, note down the path somewhere or copy it to the clipboard. We'll need it in a minute.
328
329
330
h1. Slave Configuration File
331
332
Now edit slave configuration file. Most defaults are good and safe options, but your setup may be different from what I have in this how-to. So, be careful to understand what you're doing.
333
334
*Slave*
335
<pre>postgres@bitarena-clone:~$ vim PITRtools/cmd_standby.ini
336
	[DEFAULT]
337
	; what major version are we using?
338
	pgversion: 8.4
339
	
340
	; Used for 8.2 (8.1?), should be set to something > than checkpoint_segments on master				
341
	numarchives: 10			
342
	
343
	; Whether or not to use streaming replication.  If this is set to "on"
344
	; PITRTools will configure the standby server to replicate from master
345
	; using streaming replication.
346
	; This can only be used with PostgreSQL 9.0 and up.
347
	use_streaming_replication: off
348
	
349
	; File to touch to end replication when using streaming replication.
350
	trigger_file: /var/lib/postgresql/PITRtools/cmd_end_recovery
351
	
352
	; User to connect to master DB while using streaming replication,
353
	; ignored if not using streaming replication.
354
	repl_db_user: replication
355
	
356
	; Password for the user repl_db_user.
357
	repl_db_password: secret
358
	
359
	; sslmode to use when connecting for streaming replication. 
360
	; Accepted values: the same as libpq: disable, allow, prefer, require, verify-ca and verify-full
361
	; Default: sslmode: prefer
362
	sslmode: prefer
363
	
364
	; Commands needed for execution
365
	; absolute path to ssh
366
	ssh: /usr/bin/ssh		
367
	
368
	; absolute path to rsync
369
	rsync: /usr/bin/rsync
370
	
371
	; extra rsync flags
372
	rsync_flags: -z
373
	
374
	; the path to to the postgres bin		
375
	pg_standby: /usr/lib/postgresql/8.4/bin/pg_standby
376
	pg_ctl: /usr/lib/postgresql/8.4/bin/pg_ctl
377
	
378
	; path to psql on the master
379
	r_psql: /usr/lib/postgresql/8.4/bin/psql	
380
	
381
	; Generalized information
382
	
383
	; the port postgresql runs on (master)
384
	port: 5432
385
	
386
	; ip or name of master server 			
387
	master_public_ip: bitarena
388
	
389
	; the ip address we should use when processing remote shell		
390
	master_local_ip: 127.0.0.1
391
	
392
	; the user performed initdb	
393
	user: postgres
394
	
395
	; on or off			
396
	debug: on
397
	
398
	; on or off
399
	ssh_debug: off
400
	
401
	; the timeout for ssh before we throw an alarm
402
	ssh_timeout: 30			
403
	
404
	; should be the same as r_archivedir for archiver
405
	archivedir: /var/lib/postgresql/archive
406
	
407
	; where you executed initdb -D to	
408
	pgdata: /var/lib/postgresql/8.4/main
409
	
410
	; Confs
411
	
412
	; This is the postgresql.conf to be used when not in standby
413
	postgresql_conf: /etc/postgresql/8.4/main/postgresql.conf		
414
	
415
	; This is the pg_hba.conf to be used when not in standby
416
	pg_hba_conf: /etc/postgresql/8.4/main/pg_hba.conf		
417
	
418
	; By default postgresql.conf and pg_hba.conf will be copied from the
419
	; locations specified above to pgdata directory on failover.
420
	;
421
	; Uncomment the following to make postgres actually use the above conf
422
	; files w/o copying them to pgdata.
423
	;no_copy_conf: true
424
	
425
	; The recovery.conf file to create when starting up
426
	; Defaults to %(pgdata)/recovery.conf
427
	recovery_conf: /var/lib/postgresql/8.4/main/recovery.conf
428
	
429
	; Useful when postgresql.conf doesn't specify log destination
430
	; Will be passed with -l to pg_ctl when starting the server.
431
	;
432
	; If you're worried about having complete logs, either make sure
433
	; postgresql.conf points to a log file, or use the logfile: parameter.
434
	;
435
	; Otherwise postgresql will print on standard stdout and nothing
436
	; will be recorded in the logs
437
	;
438
	;logfile: /var/log/postgresql/postgresql.log
439
	
440
	; Alarms
441
	
442
	notify_critical: echo CRITICAL
443
	notify_warning: echo WARNING
444
	notify_ok: echo OK
445
	
446
	; On failover action
447
	
448
	; Whatever is placed here will be executed on -FS must return 0
449
	
450
	action_failover: /var/lib/postgresql/PITRtools/failover.sh</pre>
451
452
@action_failover:@ script has to exist and have permissions of at least chmod u+x equivalent, it could be just a placeholder script with a simple action of:
453
454
<pre>#!/bin/bash
455
touch /var/lib/postgresql/PITRtools/failover_happened</pre>
456
	
457
but it's meant as a way to let you do certain actions on failover, those would be very specific for each given setup. It's good to know, though, that PITRtools lets you take actions automatically when failover happens. Use this feature to make your setup more sophisticated.
458
459
460
h1. Initialize Slave Environment
461
462
First stop PostgreSQL, then initialize slave environment.
463
464
<pre>postgres@bitarena-clone:~$ /etc/init.d/postgresql stop
465
postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -I
466
NOTICE: check_pgpid_func()
467
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by 'SELECT * FROM cmd_get_data_dirs()'
468
Password for user postgres: postgrespass
469
470
DEBUG: /var/lib/postgresql/8.4/main
471
postgres@bitarena-clone:~$</pre>
472
473
Please, note that if @archivedir: /var/lib/postgresql/archive@ hasn't been created, you should do so manually as @postgres@ system user (or set @postgres@ user and group as the ownership information for this directory). PITRtools should do this automatically for you, but earlier versions were known not to do so. This is important, and the next step in slave configuration, which is base backup, will fail if @archivedir:@ doesn't exist.
474 3 Ivan Lezhnjov
475
h1. Making a Base Backup
476
477
Before you proceed check if @archivedir:@ exists on slave and WAL files are being shipped to it from master host. WAL files are generated and shipped only when new data is stored in the database on master host. To help simulate data flow and check whether archiving and shipping is happening, try this SQL statement on master host:
478
479
<pre>postgres@bitarena:~$ psql
480
psql (8.4.13)
481
Type "help" for help.
482
483
postgres=# create table testPITRtools1 as select * from pg_class, pg_description;
484
postgresq=# \q
485
postgres@bitarena:~$</pre>
486
	
487
You could create a couple of tables like that to generate enough WAL segments. See @archivedir:@ directory on slave to check whether any WAL files have been copied there. If they have, everything works as expected and you can try to make a base backup on slave host:
488
489
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -B
490
NOTICE: check_pgpid_func()
491
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by  'checkpoint' 
492
Password for user postgres: postgrespass
493
494
DEBUG: CHECKPOINT
495
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by  'SELECT cmd_pg_start_backup()' 
496
Password for user postgres: postgrespass
497
498
DEBUG: cmd_pg_start_backup:  1
499
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by 'SELECT * FROM cmd_get_data_dirs()'
500
Password for user postgres: postgrespass
501
502
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by 'SELECT * FROM cmd_get_pgdata() LIMIT 1'
503
Password for user postgres: postgrespass
504
505
receiving incremental file list
506
./
507
backup_label
508
backup_label.old
509
postmaster.opts
510
base/1/
511
base/1/pg_internal.init
512
base/11564/
513
base/11564/pg_internal.init
514
base/16499/
515
base/16499/pg_internal.init
516
base/33069/
517
base/33069/33084
518
base/33069/33268
519
base/33069/33974
520
base/33069/33974_fsm
521
base/33069/33974_vm
522
base/33069/33980
523
base/33069/33993
524
base/33069/33993_fsm
525
base/33069/33993_vm
526
base/33069/33999
527
base/33069/33999_fsm
528
base/33069/pg_internal.init
529
global/
530
global/pg_auth
531
global/pg_control
532
global/pg_database
533
pg_clog/0000
534
pg_multixact/offsets/0000
535
pg_stat_tmp/
536
pg_stat_tmp/pgstat.stat
537
pg_subtrans/0001
538
539
Number of files: 1537
540
Number of files transferred: 25
541
Total file size: 189868759 bytes
542
Total transferred file size: 22162037 bytes
543
Literal data: 920320 bytes
544
Matched data: 21241717 bytes
545
File list size: 20500
546
File list generation time: 0.004 seconds
547
File list transfer time: 0.000 seconds
548
Total bytes sent: 65011
549
Total bytes received: 189901
550
551
sent 65011 bytes  received 189901 bytes  101964.80 bytes/sec
552
total size is 189868759  speedup is 744.84
553
DEBUG: executing query /usr/bin/ssh -o ConnectTimeout=30 -o StrictHostKeyChecking=no  postgres@bitarena  "/usr/lib/postgresql/8.4/bin/psql -A -t -Upostgres -p5432 -dpostgres -h127.0.0.1  by  'SELECT cmd_pg_stop_backup()' 
554
Password for user postgres: postgrespass
555
556
DEBUG: cmd_pg_stop_backup: 
557
postgres@bitarena-clone:~$</pre>
558
559
As you can see what happens is that PITRtools puts master into backup mode, synchronizes data directories (including tablespaces, if any) from master to slave and then exits backup mode on master. If base backup action had failed before it properly finished (say, you had lost connection to slave while rsync was copying files over to it), you'd need to intervene and run manually @-Astop_basebackup:@
560
561
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -Astop_basebackup
562
...</pre>
563
564
After that run base backup action again. Just make sure it finishes its work properly (use console output for successful base backup action above as a reference).
565
566
If you want a cold standby you're done. If you need a warm standby, then run:
567
568
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -S
569
NOTICE: check_pgpid_func()
570
server starting
571
postgres@bitarena-clone:~$ 2012-10-16 02:47:31 PDT LOG:  database system was interrupted; last known up at 2012-10-16 02:44:37 PDT
572
2012-10-16 02:47:31 PDT LOG:  starting archive recovery
573
2012-10-16 02:47:31 PDT LOG:  restore_command = '/usr/lib/postgresql/8.4/bin/pg_standby -s5 -w0 -c -d  /var/lib/postgresql/archive %f %p %r '
574
Trigger file 		: <not set>
575
Waiting for WAL file	: 00000001.history
576
WAL file path		: /var/lib/postgresql/archive/00000001.history
577
Restoring to		: pg_xlog/RECOVERYHISTORY
578
Sleep interval		: 5 seconds
579
Max wait interval	: 0 forever
580
Command for restore	: cp "/var/lib/postgresql/archive/00000001.history" "pg_xlog/RECOVERYHISTORY"
581
Keep archive history	: 000000000000000000000000 and later
582
running restore		:cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
583
cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
584
cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
585
cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
586
not restored
587
history file not found
588
Trigger file 		: <not set>
589
Waiting for WAL file	: 000000010000000100000025.00000020.backup
590
WAL file path		: /var/lib/postgresql/archive/000000010000000100000025.00000020.backup
591
Restoring to		: pg_xlog/RECOVERYHISTORY
592
Sleep interval		: 5 seconds
593
Max wait interval	: 0 forever
594
Command for restore	: cp "/var/lib/postgresql/archive/000000010000000100000025.00000020.backup" "pg_xlog/RECOVERYHISTORY"
595
Keep archive history	: 000000000000000000000000 and later
596
running restore		: OK
597
2012-10-16 02:48:01 PDT LOG:  restored log file "000000010000000100000025.00000020.backup" from archive
598
Trigger file 		: <not set>
599
Waiting for WAL file	: 000000010000000100000025
600
WAL file path		: /var/lib/postgresql/archive/000000010000000100000025
601
Restoring to		: pg_xlog/RECOVERYXLOG
602
Sleep interval		: 5 seconds
603
Max wait interval	: 0 forever
604
Command for restore	: cp "/var/lib/postgresql/archive/000000010000000100000025" "pg_xlog/RECOVERYXLOG"
605
Keep archive history	: 000000000000000000000000 and later
606
running restore		: OK
607
608
2012-10-16 02:48:02 PDT LOG:  restored log file "000000010000000100000025" from archive
609
2012-10-16 02:48:02 PDT LOG:  automatic recovery in progress
610
2012-10-16 02:48:02 PDT LOG:  redo starts at 1/25000020, consistency will be reached at 1/2504FFC4
611
2012-10-16 02:48:03 PDT LOG:  consistent recovery state reached
612
Trigger file 		: <not set>
613
Waiting for WAL file	: 000000010000000100000026
614
WAL file path		: /var/lib/postgresql/archive/000000010000000100000026
615
Restoring to		: pg_xlog/RECOVERYXLOG
616
Sleep interval		: 5 seconds
617
Max wait interval	: 0 forever
618
Command for restore	: cp "/var/lib/postgresql/archive/000000010000000100000026" "pg_xlog/RECOVERYXLOG"
619
Keep archive history	: 000000010000000100000025 and later
620
WAL file not present yet.
621
WAL file not present yet.
622
WAL file not present yet.
623
WAL file not present yet.
624
WAL file not present yet.
625
WAL file not present yet.
626
WAL file not present yet.
627
WAL file not present yet.
628
WAL file not present yet.
629
WAL file not present yet.</pre>
630
631
At this point PostgreSQL armed with PITRtools on master server will be continuously shipping log files to @archivedir:@ on slave. Once shipped, the WAL files will be immediately replayed, because slave in standby mode continuously scans @archivedir:@ for new WAL files and replays them as soon as they become available (this can be seen from example console output above).
632
633
h1. Failing Over
634
635
Now you have a warm standby mirroring changes occurring on the master server. When your master server becomes unavailable due to any reason, you could turn this warm standby server into a production instance by simply running on standby machine a failover action as shown below. For this PostgreSQL on master must not be running, otherwise PITRtools will throw out a warning and refuse to failover.
636
637
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -F999
638
639
NOTICE: check_pgpid_func()
640
2012-10-16 02:56:20 PDT LOG:  received fast shutdown request
641
2012-10-16 02:56:20 PDT LOG:  aborting any active transactions
642
waiting for server to shut down....2012-10-16 02:56:20 PDT LOG:  shutting down
643
2012-10-16 02:56:20 PDT LOG:  database system is shut down
644
 done
645
server stopped
646
server starting
647
NOTICE: Statistics are not replicated in warm standy mode.
648
HINT: Execute ANALYZE on your databases
649
postgres@bitarena-clone:~$ 2012-10-16 02:56:22 PDT LOG:  database system was interrupted while in recovery at log time 2012-10-16 02:54:35 PDT
650
2012-10-16 02:56:22 PDT HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
651
2012-10-16 02:56:22 PDT LOG:  starting archive recovery
652
2012-10-16 02:56:22 PDT LOG:  restore_command = 'cp /var/lib/postgresql/archive/%f "%p"'
653
cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
654
2012-10-16 02:56:23 PDT LOG:  restored log file "000000010000000100000026" from archive
655
2012-10-16 02:56:23 PDT LOG:  automatic recovery in progress
656
2012-10-16 02:56:23 PDT LOG:  redo starts at 1/2635B428, consistency will be reached at 1/27000000
657
cp: cannot stat `/var/lib/postgresql/archive/000000010000000100000027': No such file or directory
658
2012-10-16 02:56:23 PDT LOG:  could not open file "pg_xlog/000000010000000100000027" (log file 1, segment 39): No such file or directory
659
2012-10-16 02:56:23 PDT LOG:  redo done at 1/265AD744
660
2012-10-16 02:56:23 PDT LOG:  last completed transaction was at log time 2012-10-16 02:55:07.387709-07
661
2012-10-16 02:56:24 PDT LOG:  restored log file "000000010000000100000026" from archive
662
cp: cannot stat `/var/lib/postgresql/archive/00000002.history': No such file or directory
663
cp: cannot stat `/var/lib/postgresql/archive/00000003.history': No such file or directory
664
cp: cannot stat `/var/lib/postgresql/archive/00000004.history': No such file or directory
665
cp: cannot stat `/var/lib/postgresql/archive/00000005.history': No such file or directory
666
cp: cannot stat `/var/lib/postgresql/archive/00000006.history': No such file or directory
667
cp: cannot stat `/var/lib/postgresql/archive/00000007.history': No such file or directory
668
cp: cannot stat `/var/lib/postgresql/archive/00000008.history': No such file or directory
669
cp: cannot stat `/var/lib/postgresql/archive/00000009.history': No such file or directory
670
2012-10-16 02:56:24 PDT LOG:  selected new timeline ID: 9
671
cp: cannot stat `/var/lib/postgresql/archive/00000001.history': No such file or directory
672
2012-10-16 02:56:25 PDT LOG:  archive recovery complete
673
2012-10-16 02:56:26 PDT LOG:  database system is ready to accept connections
674
2012-10-16 02:56:26 PDT LOG:  autovacuum launcher started
675
676
postgres@bitarena-clone:~$</pre>
677
678
This will create @recovery.conf@ file under @pg_data:@ directory and restart PostgreSQL to enter production mode of operation.
679
680
After this, you'd be basically running a copy of production master server. Keep in mind that you would also need to change IP addresses and/or load balancing configuration, routing, firewall rules or anything else that might be in the way of establishing a successful connection to this host. This is where @action_failover:@ script could come in handy. 
681
682
Plan in advance, figure this all out to avoid any downtime before you will need to failover.
683
684
685
h1. Other things you could do with PITRtools and some tips
686
687
This how-to is meant to help you get started with PITRtools. PITRtools can do more, though, than just help you configure standby.
688
689
690
h2. Point-In-Time Recovery
691
692
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -F999 -R '2008-05-28 11:00:38.059389'
693
...</pre>
694
695
This is essentially a restore to a specific point in time action. In general, you can only restore to a point in time while using cold standby, because PITR will "stop" recovering at the point in time you've specified, and both warm and hot standby servers have already recovered all the WAL files they can. In this regard, it would be a good idea to have a cold standby around for disaster (at logical level) recovery.
696
697
Once this has been done, you can't choose another timestamp to restore to.
698
699
h2. Entering Standby Mode After Failover On Slave
700
701
Suppose you failed over to your standby slave, which is running now as a replacement of master for your applications. You've fixed problems with the actual master and want this slave host to enter standby mode again. Here's how you'd do it:
702
703
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -Astop
704
...</pre>
705
706
This will stop entire PostgreSQL service. You could also use PostgreSQL init script to achieve the same instead. If you need more fine-grained control use @pg_ctlcluser 8.4 main stop@ (see @man pg_ctlcluster@ for more details). Take a new base backup as before and enter standby mode:
707
708
<pre>postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -B
709
...
710
711
postgres@bitarena-clone:~$ PITRtools/bin/cmd_standby -C PITRtools/cmd_standby.ini -S
712
...</pre>
713
714
Again, if you want a cold standby just don't run -S action after -B.
715 4 Ivan Lezhnjov
716
h2. Alerts
717
718
Alerting is designed to run your custom scripts. You could easily integrate PITRtools alerting to your existing NMS be it Nagios, Zabbix, or anything else, send e-mails or take actions you decide.
719
720
h2. Logging
721
722
It bears repeating -- just In case you overlooked this in standby sample configuration file notes -- if your @postgresql.conf@ doesn't specify a log file to write to and you don't use @logfile:@ parameter in @cmd_standby.ini@ the output will be directed to stdout (your console) and nothing will ever be written to a log file on disk. 
723
724
If you restart PostgreSQL, that'll fix the problem, but you can avoid it in the first place by either specifying log file to write to in @postgresql.conf@ or by using @logfile:@ parameter in @cmd_standby.ini@.
725
726
h2. Troubleshooting
727
728
Set @debug:@ parameter in configuration files to @on@ and scrutinize the information. PostgreSQL log file is also a good place to look at.
729
730
h1. Getting Help
731
732
A very low-traffic mailing list for PITRtools can be found here http://lists.commandprompt.com/mailman/listinfo/pitrtools/
733
734
There is also consulting available from "Command Prompt":https://commandprompt.com/contact/