In the startup log, entries indicate that autovacuum is not working. I queried the pg_stat_user_tables view, and the last_vacuum and last_autovacuum columns are empty, despite the vacuum query I ran just before. Connecting pgAdmin to the database also indicates that vacuum is not working.
I am using PostgreSQL on two Ubuntu Azure VMs. One VM is set up to be the master; the second is the replicated database by means of streaming replication, roughly as described in https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-on-postgresql-on-an-ubuntu-12-04-vps.
All seems to work except for the autovacuum. During startup the following error is logged:
LOG: test message did not get through on socket for statistics collector
LOG: disabling statistics collector for lack of working socket
WARNING: autovacuum not started because of misconfiguration
HINT: Enable the "track_counts" option.
LOG: database system was shut down at 2017-01-19 14:07:13 UTC
DEBUG: checkpoint record is at 38/F6000028
In postgresql.conf I use the following settings:
track_counts = on
autovacuum = on
log_autovacuum_min_duration = 200
autovacuum_max_workers = 1
autovacuum_naptime = 960
autovacuum_vacuum_threshold = 128
autovacuum_analyze_threshold = 256
A query (select * from pg_stat_user_tables) on the database to find the last (auto)vacuum gives empty columns for the last (auto)vacuum instead of a datetime, even though just before I ran VACUUM FULL VERBOSE; and this gave me vacuum results.
If I query for the vacuum settings with:
select *
from pg_settings
where name like 'autovacuum%'
This is the result:
"autovacuum";"on"
"autovacuum_analyze_scale_factor";"0.1"
"autovacuum_analyze_threshold";"256"
"autovacuum_freeze_max_age";"200000000"
"autovacuum_max_workers";"1"
"autovacuum_multixact_freeze_max_age";"400000000"
"autovacuum_naptime";"960"
"autovacuum_vacuum_cost_delay";"20"
"autovacuum_vacuum_cost_limit";"-1"
"autovacuum_vacuum_scale_factor";"0.2"
"autovacuum_vacuum_threshold";"128"
"autovacuum_work_mem";"-1"
These are the 'track_' results:
"track_activities";"on"
"track_activity_query_size";"1024"
"track_commit_timestamp";"off"
"track_counts";"off"
"track_functions";"none"
"track_io_timing";"off"
The pg_hba.conf (without the replication and network/user settings) looks like this:
local all all trust
host all all localhost trust
host all all 10.1.1.5/32 md5
host all all 127.0.0.1/32 md5
host all all 0.0.0.0 0.0.0.0 md5
The /etc/hosts:
127.0.0.1 localhost
127.0.1.1 ubuntu
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
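As a quick sanity check that localhost resolves the way the statistics collector expects, something like this sketch can be run on the server:
# List every address localhost resolves to; 127.0.0.1 should be among them:
getent ahosts localhost
# And confirm loopback actually answers:
ping -c 1 localhost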
This is the result of 'netstat -ant | grep 5432'. It is cleaned up and formatted.
User@Machine:/datadrive/log/postgresql/pg_log$ netstat -ant|grep 5432
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN
tcp 39 0 InternIpMaster:5432 InternIpSlave:36338 ESTABLISHED
tcp 0 0 InternIpMaster:5432 IpJob:63814 TIME_WAIT
tcp 0 0 InternIpMaster:5432 IpJob:22192 TIME_WAIT
tcp 0 0 InternIpMaster:5432 IpJob:47729 TIME_WAIT
tcp 0 0 InternIpMaster:5432 IpJob:55663 TIME_WAIT
tcp6 0 0 :::5432 :::* LISTEN
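Note that netstat -ant only lists TCP sockets, while the statistics collector communicates over a UDP socket on the loopback address. A sketch for checking that socket explicitly:
# Look for PostgreSQL's UDP socket (created by the stats collector):
sudo netstat -anup | grep postgres
# or, with ss:
sudo ss -unap | grep postgres
# On a healthy instance there is a UDP socket on 127.0.0.1 with a random high port;
# here it will most likely be missing, since the test message never got through.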
I don't expect that autovacuum needed to do any work yet anyway. The bigger problem is that during startup track_counts is disabled at runtime.
I have been looking for solutions by changing the iptables rules. Without any iptables rules it will not work. I have connected to localhost as a host. I have changed the firewall settings in Azure. I opened port 5432 to access the VM from all IPs. I am able to access the database from other systems. I have reset the conf to the defaults with only the replication changes. I restarted the service many times.
What am I missing?
You want to fix this:
LOG: test message did not get through on socket for statistics collector
LOG: disabling statistics collector for lack of working socket
The stats collector expects UDP packets from localhost. Given that localhost looks fine in your /etc/hosts (specifically, it does not resolve to IPv6), the next most plausible explanation is that there is a firewall filtering these packets.
Related: Problem in creating UDP sockets, solved with: "Found and resolved the problem in creating UDP sockets. It was because of the OS firewall (iptables) restricting the creation of UDP sockets."
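If a firewall is indeed the problem, explicitly allowing loopback traffic ahead of any restrictive rules is usually enough. A minimal sketch (adjust to your own policy; these are not your exact rules):
# Accept everything on the loopback interface, where the stats collector's UDP packets travel:
sudo iptables -I INPUT 1 -i lo -j ACCEPT
sudo iptables -I OUTPUT 1 -o lo -j ACCEPT
# Then restart PostgreSQL and check whether the startup warnings are gone:
sudo service postgresql restart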
According to your link,
You should now be able to ssh freely between your two servers as the postgres user.
So, you need to set up a trust relationship for the postgres user from the master to the slave and from the slave to the master. You could use
ssh-keygen
to create a key pair with a blank passphrase:
shui@shui:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/shui/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/shui/.ssh/id_rsa.
Your public key has been saved in /home/shui/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:mCyBHNLeEdCH2VqBjhtOC8njVLSXnjU7V9GbufK+hlE shui@shui
The key's randomart image is:
+---[RSA 2048]----+
|..++.*.. ..      |
| o.+B = ..       |
|.o+=.B o . +     |
|o+= *oooo . E    |
|o+.+.o+oS. . .   |
| .+ . o o .      |
|   =             |
|  . o            |
|   oo.           |
+----[SHA256]-----+
For more information, please refer to this link. Also, you need to open port 5432 in the Azure NSG.
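For the key setup itself, a sketch (run as the postgres user on the master, then repeated in the other direction; <slave-ip> is a placeholder for your slave's address):
sudo -i -u postgres                 # become the postgres user
ssh-keygen                          # accept the default path, leave the passphrase empty
ssh-copy-id postgres@<slave-ip>     # copies the public key; asks for the password once
ssh postgres@<slave-ip>             # should now log in without a password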
I want to elaborate on the answer @Daniel gave and the solution to my problem.
I had set up iptables in order to get access to PostgreSQL.
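Roughly along these lines (an illustrative sketch, not the exact listing I had):
# Open SSH and PostgreSQL, drop everything else.
# Note that nothing here permits the local UDP traffic PostgreSQL needs.
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 5432 -j ACCEPT
sudo iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo iptables -P INPUT DROP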
I assumed this was plenty. However, when I used
sudo iptables --flush
and restarted the Postgres server, the error "disabling statistics collector for lack of working socket" was gone. I also used iptraf to investigate the traffic (
sudo apt-get install iptraf
sudo iptraf
). I noticed that traffic originated on the local (subnet) IP address of the server, but on different ports. This is the traffic on the slave machine (without the Azure traffic). I assume that this traffic was blocked by iptables, since it does not go via the loopback. Therefore I cleaned up the iptables rules. This is the result:
I included the subnet. I think this is what makes it work, since the SubnetIpSlave and SubnetIpMaster are in this range. I could probably remove the ESTABLISHED,RELATED rule.
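In spirit, the cleaned-up rules look like this sketch (assuming the master/slave subnet is 10.1.1.0/24; substitute your own range):
sudo iptables -P INPUT ACCEPT                                   # avoid locking ourselves out while rebuilding
sudo iptables -F                                                # start from a clean slate
sudo iptables -A INPUT -i lo -j ACCEPT                          # loopback traffic
sudo iptables -A INPUT -s 10.1.1.0/24 -j ACCEPT                 # the master/slave subnet
sudo iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT              # SSH
sudo iptables -A INPUT -p tcp --dport 5432 -j ACCEPT            # PostgreSQL clients
sudo iptables -P INPUT DROP                                     # drop everything else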
The log now looks like it should.
I am happy ;)