Peter Eisentraut

professional automator

Using GNU GLOBAL with PostgreSQL


When you are coding in a source tree as big as PostgreSQL’s, you will at some point want to look into some kind of source code indexing. It’s often convenient not to bother, since git grep is actually superfast. But when you want to find where a function is defined among all the call sites, some more intelligence is useful.

The traditional tools for this are ctags and etags, which create index files intended for use by vi and Emacs, respectively. The PostgreSQL source tree has some customized support for these in the tools src/tools/make_ctags and src/tools/make_etags. Because these tools operate on a directory level, those wrapper scripts create a single tag file (named tags or TAGS respectively) in the top-level directory and symlink it to all the other directories. This allows you to easily look for entries across the entire source tree. But it’s clearly a hack, and at least Emacs is often somewhat confused by this setup.

But there is something much better that works very similarly: GNU GLOBAL. One main difference is that GNU GLOBAL works on a project basis, not a directory basis, so you don’t need contortions to create and manage tags files all over your source tree. Also, GLOBAL can be used from the command line, so you don’t need to be an editor wizard to get started with it. Plus, it appears to be much faster.

The whole thing is very simple. Install the package, which is usually called global and available in most operating system distributions. To start, run

$ gtags

in the top-level directory. This creates the files GPATH, GRTAGS, and GTAGS.

Then you can use global to search for stuff, like

$ global elog

Or you can look for places a function is called:

$ global -r write_stderr

You can run global in any directory.

Or say you want to look at the code where something is defined:

$ less -t elog

Note that no file name is required. (See the manual for the required setup to make this work with less.)
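For reference, the relevant setup is pointing less at global as its tag backend, which is what the GLOBAL manual describes; a sketch for your shell profile:

```shell
# Tell less(1) to use global(1) instead of ctags for the -t option.
export LESSGLOBALTAGS=global
```

With that set, `less -t elog` jumps straight to the definition.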

Or of course use editor integration. For Emacs, there is ggtags-mode.

Here is some fine-tuning for use with the PostgreSQL source tree. Generally, I don’t want to index generated files. For example, I don’t want to see hits in gram.c, only in gram.y. Plus, you don’t want to index header files under tmp_install. (Super annoying when you use this to jump to a file to edit and later find that your edits have been blown away by make check.) But when you run gtags in a partially built tree, it will index everything it finds. To fix that, I have restricted gtags to only index files that are registered in Git, by first running

git ls-files >gtags.files

in the top-level directory. Then gtags will only consider the listed files.

This will also improve the workings of the Emacs mode, which will at random times call global -u to update the tags. If it finds a gtags.files file, it will respect it and not index random files lying around.

I have a shell alias pgconfigure which calls configure with a bunch of options for easier typing. It’s basically something like

pgconfigure() {
    ./configure --prefix=$(cd .. && pwd)/pg-install --enable-debug --enable-cassert ...
}

At the end I call

git ls-files >gtags.files
gtags -i &

to initialize the source tree for GNU GLOBAL, so it’s always there.

Some git log tweaks


Here are some tweaks to git log that I have found useful. How applicable they are might depend on the workflow of individual projects.

Git stores separate author and committer information for each commit. How these are generated and updated is sometimes mysterious but generally makes sense. For example, if you cherry-pick a commit to a different branch, the author information stays the same but the committer information is updated. git log defaults to showing the author information. But I generally care less about that than the committer information, because I’m usually interested in when the commit arrived in my or the public repository, not when it was initially thought about. So let’s try to change the default git log format to show the committer information instead. Again, depending on the project and the workflow, there can be other preferences.

To create a different default format for git log, first create a new format by setting the Git configuration item pretty.somename. I chose pretty.cmedium because it’s almost the same as the default medium but with the author information replaced by the committer information.

cmedium="format:%C(auto,yellow)commit %H%C(auto,reset)%nCommit:     %cn <%ce>%nCommitDate: %cd%n%n%w(0,4,4)%s%n%+b"

Unfortunately, the default git log formats are not defined in terms of these placeholders but are hardcoded in the source, so this is my best reconstruction using the available means.

You can use this as git log --pretty=cmedium, and you can set it as the default using

git config --global format.pretty cmedium

If you find this useful and you’re the sort of person who is more interested in their own timeline than the author’s history, you might also like two more tweaks.

First, add %cr for relative date, so it looks like

cmedium="format:%C(auto,yellow)commit %H%C(auto,reset)%nCommit:     %cn <%ce>%nCommitDate: %cd (%cr)%n%n%w(0,4,4)%s%n%+b"

This adds a relative designation like “2 days ago” to the commit date.

Second, set

git config --global log.date local

to have all timestamps converted to your local time.

When you put all this together, you turn this

commit e2c117a28f767c9756d2d620929b37651dbe43d1
Author: Paul Eggert <>
Date:   Tue Apr 5 08:16:01 2016 -0700

into this

commit e2c117a28f767c9756d2d620929b37651dbe43d1
Commit:     Paul Eggert <>
CommitDate: Tue Apr 5 11:16:01 2016 (3 days ago)

PS: If this is lame, there is always this:

Check your pg_dump compression levels


I was idly wondering what was taking pg_dump so long and noticed that it always seemed to be pegged at 100% CPU usage on the client. That was surprising because naively one might think that the bottleneck is the server’s or the client’s disk or the network. Profiling quickly revealed that the compression library zlib was taking most of the run time on the client. And indeed, turning compression off caused pg_dump to fly without getting anywhere near 100% CPU.

When using the custom output format in pg_dump (-Fc), the output is automatically compressed, using the same default level that gzip uses. By using the option -Z, one can select a compression level between 0 (off) and 9 (highest). Although it is not documented, the default corresponds to level 6.

Some simple testing has shown that lowering the level from 6 to 1 can speed up the dump run time by a factor of 3 or more while only increasing the output size by 10%. Even the levels in between give significant speed increases with only minimal differences in output size. On the other hand, increasing the compression level to 9 only decreases the output size by about 1% while causing slowdowns by a factor of 2. (In this test, level 1 was about twice as slow as no compression, but the output size was about 40% of the uncompressed size. So using at least some compression was still desirable.)
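These tradeoffs are easy to explore with plain gzip, which exposes the same zlib compression levels that pg_dump’s -Z option selects (the file names here are just for illustration):

```shell
# Generate some compressible sample data and compress it at different levels.
seq 1 100000 > sample.txt

gzip -1 -c sample.txt > sample.1.gz   # fastest, somewhat larger output
gzip -6 -c sample.txt > sample.6.gz   # the gzip (and pg_dump) default
gzip -9 -c sample.txt > sample.9.gz   # slowest, smallest output

ls -l sample.txt sample.?.gz          # compare sizes; wrap in time to compare speed
rm -f sample.txt sample.?.gz
```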

I encourage experimentation with these numbers. I might actually default my scripts to -Z1 in the future.

Have problems with PostgreSQL? Try using Hive!


So I had this PostgreSQL database that was getting a bit too big, and since it was really only for analytics, I figured it would be a good fit for putting in Hadoop+Hive instead.

(For those not completely familiar with this: Hadoop is sort of a job tracker and distributed file system. Hive is an SQL-like layer on top of that. I know the cool kids are now using Spark. Maybe for another day.)

The first thing you need to learn about the Hadoop ecosystem is its idiosyncratically fragmented structure. With PostgreSQL, you basically have the community website, the community mailing lists, the community source code distribution, the community binaries, and a handful of binaries made by Linux distributions. If you search the web for a problem with PostgreSQL, you will normally get hits on one or more of: the documentation, the mailing lists, third-party mirrors of the mailing lists, or Stack Overflow. With Hadoop, you have the resources provided by the Apache Software Foundation, including the source distribution, bug tracker, documentation, and then a bunch of commercial vendors with their parallel universes, including their own mutually incompatible binary distributions, their own copy of the documentation, their own mailing lists, their own bug trackers, etc. When you search for a problem with Hadoop, you will typically get hits from three separate copies of the documentation, about eight mailing lists, fifteen tutorials, and one thousand blog posts. And about 20 unanswered posts on Stack Overflow. Different vendors also favor different technology extensions. So if, say, you read that you should use some storage method, chances are it’s not even supported in a given distribution.

The next thing to know is that any information about Hadoop that is older than about two years is obsolete. Because they keep changing everything from command names to basic architecture. Don’t even bother reading old stuff. Don’t even bother reading anything.

So Hive. The basic setup is actually fairly well documented. You set up a Hadoop cluster, HDFS, create a few directories. Getting the permissions sorted out during these initial steps is not easy, but it seldom is. So you can create a few tables, load some data, run a few queries.

Nevermind that in its default configuration hive spits out about a dozen warnings on every startup about deprecated parameters and jar file conflicts. This is apparently well known. Look around on the internet for Hive examples. They show the same output. Apparently the packaged versions of Hadoop and Hive are not tuned for each other.

Then you learn: In the default configuration, there can only be one Hive session connected at once. It doesn’t tell you this. Instead, when the second session wants to connect, it tells you

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

followed by hundreds of lines of exception traces. This is Hive-speak for: “there is already one session connected”.

You see, Hive needs a, cough, cough, relational database to store its schema. By default, it uses embedded Derby, which allows only one connection at a time. If you want to connect more than one session at once, you need to set up an external “Hive metastore” on a MySQL or PostgreSQL database.

Nevermind that Derby can actually run in server mode. That’s apparently not supported by Hive.

So I had a PostgreSQL database handy and tried to set that up. I installed the PostgreSQL JDBC driver, created an external database, changed the Hive configuration to use an external database.

At this point, it turned out that the PostgreSQL JDBC driver was broken, so I had to downgrade to an older version. (The driver has since been fixed.)

After I got one that was working, Hive kept complaining that it couldn’t find a driver that matches the JDBC URL jdbc:postgresql://somehost/hive_metastore. The PostgreSQL JDBC driver explains in detail how to load the driver, but how do I get that into Hive?

The first suggestion from the internet was to add something like this to .hiverc:

add jar /usr/share/java/postgresql-jdbc.jar;

That doesn’t work. Remember, don’t believe anything you read on the internet.

In between I even tried downloading the MySQL JDBC driver (no, I don’t want to sign in with my Oracle account), but it had the same problem.

hive is actually a shell script which loads another shell script which loads a bunch of other shell scripts, which eventually starts java. After randomly poking around I determined that if I did

export HIVE_AUX_JARS_PATH=/usr/share/java/

it would pick up the jar files in that directory. OK, that worked.

Now I can create tables, load data, run simple queries, from more than one session. So I could do

SELECT * FROM mytable;

But as soon as I ran

SELECT count(*) FROM mytable;

it crapped out again: File does not exist: hdfs://namenode/usr/share/java/jline-0.9.94.jar

So it’s apparently looking for some jar file on HDFS rather than the regular file system. Some totally unrelated jar file, too.

The difference between the two queries is that the first one is answered by just dumping out data locally, whereas the second one generates a distributed map-reduce job. It doesn’t tell you that beforehand, of course. Or even afterwards.

After a while I figured that this must have something to do with the HIVE_AUX_JARS_PATH setting. I changed that to

export HIVE_AUX_JARS_PATH=/usr/share/java/postgresql-jdbc.jar;

so it would look at only one file, and sure enough it now complains File does not exist: hdfs://namenode/usr/share/java/postgresql-jdbc.jar

Apparently, the HIVE_AUX_JARS_PATH facility is for adding jars that contain user-defined functions that you need at run time. As far as I can tell, there is no separate setting for adding jars that you only need locally.

There are workarounds for that on the internet, of varying bizarreness, none of which worked. Remember, don’t believe anything you read on the internet.

In the end, I indulged it and just uploaded that jar file into HDFS. Whatever.

I then put my data loading job into cron, which quickly crapped out because JAVA_HOME is not set in the cron environment. After that was fixed, I let my data loading jobs run for a while.

Later, I wanted to clear out the previous experiments, drop all tables, and start again. Apparently, dropping a table in Hive takes a very long time. Actually, no. When you use PostgreSQL for the Hive metastore, any attempt to drop a table will hang indefinitely.

Someone summarized the issue:

You are the first person I have heard of using postgres. I commend you for not succumbing to the social pressure and just installing mysql. However I would advice succumbing to the social pressure and using either derby or mysql.

The reason I say this is because jpox “has support” for a number of data stores (M$ SQL server) however, people have run into issues with them. Databases other then derby and mysql ‘should work’ but are generally untested.

Not that actually testing it would take much work. It’s not like Hive doesn’t have any tests. Just add some tests.

It’s funny that they didn’t write “You are the first person I have heard of using hive”. Clearly, nobody has ever actually used this.

Anyway, somehow I ended up creating the metastore schema manually by copying and pasting various pieces from the internet and raw files. Shudder.

How about more fun? Here is a run-of-the-mill SQL parse error:

NoViableAltException(26@[221:1: constant : ( Number | dateLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );])
  at org.antlr.runtime.DFA.noViableAlt(
  at org.antlr.runtime.DFA.predict(
  at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(
  at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionVal(
  at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionSpec(
  at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.tableOrPartition(
  at org.apache.hadoop.hive.ql.parse.HiveParser.tableOrPartition(
  at org.apache.hadoop.hive.ql.parse.HiveParser.insertClause(
  at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(
  at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(
  at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(
  at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(
  at org.apache.hadoop.hive.ql.parse.HiveParser.statement(
  at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(
  at org.apache.hadoop.hive.ql.Driver.compile(
  at org.apache.hadoop.hive.ql.Driver.compile(
  at org.apache.hadoop.hive.ql.Driver.runInternal(
  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(
  at org.apache.hadoop.hive.cli.CliDriver.processLine(
  at org.apache.hadoop.hive.cli.CliDriver.processLine(
  at org.apache.hadoop.hive.cli.CliDriver.processReader(
  at org.apache.hadoop.hive.cli.CliDriver.processFile(
  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(
  at org.apache.hadoop.hive.cli.CliDriver.main(
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(
  at java.lang.reflect.Method.invoke(
  at org.apache.hadoop.util.RunJar.main(
FAILED: ParseException line 4:20 cannot recognize input near 'year' '(' 'event_timestamp' in constant

In PostgreSQL, this might say

syntax error at or near "("

with a pointer to the actual query.

I just put a function call somewhere where it didn’t belong. The documentation is very terse and confusing about a lot of these things. And the documentation is kept as a series of wiki pages.

So now I have a really slow distributed version of my PostgreSQL database, which stores its schema in another PostgreSQL database. I forgot why I needed that.

Storing URIs in PostgreSQL


About two months ago, this happened:

And a few hours later:

It took a few more hours and days after this to refine some details, but I have now tagged the first release of this extension. Give it a try and let me know what you think. Bug reports and feature requests are welcome.

(I chose to name the data type uri instead of url, as originally suggested, because that is more correct and matches what the parsing library calls it. One could create a domain if one prefers the other name or if one wants to restrict the values to certain kinds of URIs or URLs.)

(If you are interested in storing email addresses, here is an idea.)

Retrieving PgBouncer statistics via dblink


PgBouncer has a virtual database called pgbouncer. If you connect to that you can run special SQL-like commands, for example

$ psql -p 6432 pgbouncer
=# SHOW pools;
┌─[ RECORD 1 ]───────────┐
│ database   │ pgbouncer │
│ user       │ pgbouncer │
│ cl_active  │ 1         │
│ cl_waiting │ 0         │
│ sv_active  │ 0         │
│ sv_idle    │ 0         │
│ sv_used    │ 0         │
│ sv_tested  │ 0         │
│ sv_login   │ 0         │
│ maxwait    │ 0         │
└────────────┴───────────┘

This is quite nice, but unfortunately, you cannot run full SQL queries against that data. So you couldn’t do something like

SELECT * FROM pgbouncer.pools WHERE maxwait > 0;

Well, here is a way: From a regular PostgreSQL database, connect to PgBouncer using dblink. For each SHOW command provided by PgBouncer, create a view. Then that SQL query actually works.
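As a sketch of what one such view could look like (the connection string and schema name are assumptions; the column list matches the SHOW pools output above, and newer PgBouncer versions add further columns):

```sql
CREATE EXTENSION IF NOT EXISTS dblink;
CREATE SCHEMA pgbouncer;

CREATE VIEW pgbouncer.pools AS
    SELECT * FROM dblink('host=localhost port=6432 dbname=pgbouncer',
                         'SHOW pools')
        AS t(database text, "user" text,
             cl_active int, cl_waiting int,
             sv_active int, sv_idle int, sv_used int,
             sv_tested int, sv_login int, maxwait int);
```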

But before you start doing that, I have already done that here:

Here is another useful example. If you’re tracing back connections from the database server through PgBouncer to the client, try this:

SELECT * FROM pgbouncer.servers LEFT JOIN pgbouncer.clients ON servers.link = clients.ptr;

Unfortunately, different versions of PgBouncer return a different number of columns for some commands. Then you will need different view definitions. I haven’t determined a way to handle that elegantly.

The history of replication in PostgreSQL


2001: PostgreSQL 7.1: write-ahead log

PostgreSQL 7.1 introduced the write-ahead log (WAL). Before that release, all open data files had to be fsynced on every commit, which is very slow. Slow fsyncing is still a problem today, but now we’re only worried about fsyncing the WAL, and fsyncing the data files during the checkpoint process. Back then, we had to fsync everything all the time.

In the original design of university POSTGRES, the lack of a log was intentional, and contrasted with heavily log-based architectures such as Oracle. In Oracle, you need the log to roll back changes. In PostgreSQL, the nonoverwriting storage system takes care of that. But probably nobody thought about implications for fsyncing back then.

Note that the WAL was really just an implementation detail at this point. You couldn’t read or archive it.

2004: Slony

Just for context: Slony-I 1.0 was released in July 2004.

2005: PostgreSQL 8.0: point-in-time recovery

PostgreSQL 8.0 added the possibility to copy the WAL somewhere else, and later play it back, either all the way or to a particular point in time, hence the name point-in-time recovery (PITR) for this feature. This feature was mainly intended to relieve pg_dump as a backup method. Until then, the only backup method was a full dump, which would get impractical as databases grew. Hence this method to take an occasional base backup, which is the expensive part, and then add on parts of the WAL, which is cheaper.

The basic configuration mechanisms that we still use today, for example the recovery.conf file, were introduced as part of this feature.

But still no replication here.

2008: PostgreSQL 8.3: pg_standby

Crafty people eventually figured that if you archived WAL on one server and at the same time “recovered” endlessly on another, you’d have a replication setup. You could probably have set this up with your own scripts as early as 8.0, but PostgreSQL 8.3 added the pg_standby program into contrib, which gave everyone a standard tool. So, arguably, 8.3 is the first release that contained a semblance of a built-in replication solution.

The standby server was in permanent recovery until promotion, so it couldn’t be read from as it was replicating. This is what we’d now call a warm standby.

I think a lot of PostgreSQL 8.3 installations refuse to die, because this is the first version where you could easily have a reasonably up-to-date reserve server without resorting to complicated and sometimes problematic tools like Slony or DRBD.

2010: PostgreSQL 9.0: hot standby, streaming replication

In PostgreSQL 9.0, two important replication features arrived completely independently. First, the possibility to connect to a standby server in read-only mode, making it a so-called hot standby. Whereas before, a standby server was really mainly useful only as a reserve in case the primary server failed, with hot standby you could use secondary servers to spread out read-only loads. Second, instead of relying solely on the WAL archive and recovery functionalities to transport WAL data, a standby server could connect directly to the primary server via the existing libpq protocol and obtain WAL data that way, so-called streaming replication. The primary use in this release was that the standby could be more up to date, possibly within seconds, rather than several minutes with the archive-based approach. For a robust setup, you would still need to set up an archive. But streaming replication was also a forward-looking feature that would eventually make replication setups easier, by reducing the reliance on the old archiving mechanisms.

PostgreSQL 9.0 was the first release where one could claim that PostgreSQL “supports replication” without having to make qualifications or excuses. Although it is scheduled to go EOL later this year, I expect this release will continue to live for a long time.

2011: PostgreSQL 9.1: pg_basebackup, synchronous replication

pg_basebackup was one of the features facilitated by streaming replication that made things easier. Instead of having to use external tools like rsync for base backups, pg_basebackup would use a normal libpq connection to pull down a base backup, thus avoiding complicated connection and authentication setups for external tools. (Some people continue to favor rsync because it is faster for them.)

PostgreSQL 9.1 also added synchronous replication, which ensures that data is replicated to the designated synchronous standby before a COMMIT reports success. This feature is frequently misunderstood by users. While it ensures that your data is on at least two servers at all times, it might actually reduce the availability of your system, because if the standby server goes down, the primary will also go down, unless you have a third server available to take over the synchronous standby duty.

Less widely known, perhaps, is that PostgreSQL 9.1 also added the pg_last_xact_replay_timestamp function for easy monitoring of standby lag.

In my experience, the availability of pg_basebackup and pg_last_xact_replay_timestamp makes PostgreSQL 9.1 the first release where managing replication was reasonably easy. Go back further, and you might feel constrained by the available tools. But in 9.1, it’s not that much different from what is available in the most recent releases.

2012: PostgreSQL 9.2: cascading replication

Not as widely acclaimed, more for the Slony buffs perhaps, PostgreSQL 9.2 allowed standbys to fetch their streaming replication data from other standbys. A particular consequence of that is that pg_basebackup could copy from a standby server, thus taking the load off the primary server for setting up a new standby or standalone copy.

2013: PostgreSQL 9.3: standby can follow timeline switch

This did not even make it into the release note highlights. In PostgreSQL 9.3, when a primary has two standbys, and one of the standbys is promoted, the other standby can just keep following the new primary. In previous releases, the second standby would have had to be rebuilt. This improvement makes dynamic infrastructure changes much simpler. Not only does it eliminate the time, annoyance, and performance impact of setting up a new standby; more importantly, it avoids the situation where, after a promotion, you don’t have any up-to-date standbys at all for a while.

2014: PostgreSQL 9.4: replication slots, logical decoding

Logical decoding got all the press for PostgreSQL 9.4, but I think replication slots are the major feature, possibly the biggest replication feature since PostgreSQL 9.0. Note that while streaming replication has gotten more sophisticated over the years, you still needed a WAL archive for complete robustness. That is because the primary server didn’t actually keep a list of its supposed standby servers, it just streamed whatever WAL happened to be requested if it happened to have it. If the standby server fell behind sufficiently far, streaming replication would fail, and recovery from the archive would kick in. If you didn’t have an archive, the standby would then no longer be able to catch up and would have to be rebuilt. And this archiving mechanism has essentially been unchanged since version 8.0, when it was designed for an entirely different purpose. So a replication setup is actually quite messy: You have to configure an access path from the primary to the standby (for archiving) and an access path from the standby to the primary (for streaming). And if you wanted to do multiple standbys or cascading, maintaining the archive could get really complicated. Moreover, I think a lot of archiving setups have problematic archive_command settings. For example, does your archive_command fsync the file on the receiving side? Probably not.

No more: In PostgreSQL 9.4, you can set up so-called replication slots, which effectively means that you register a standby with the primary, and the primary keeps around the WAL for each standby until the standby has fetched it. With this, you can completely get rid of the archiving, unless you need it as a backup.
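For illustration, a minimal 9.4-era slot setup might look like this (the names are examples, not part of the original post):

```
-- on the primary; also requires max_replication_slots > 0 in postgresql.conf
SELECT pg_create_physical_replication_slot('standby1');
```

and in the standby’s recovery.conf:

```
standby_mode = 'on'
primary_conninfo = 'host=primary user=replication'
primary_slot_name = 'standby1'
```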

2015? PostgreSQL 9.5? pg_rewind?

One of the remaining problems is that promoting a standby leaves the old primary unable to change course and follow the new primary. If you fail over because the old primary died, then that’s not an issue. But if you just want to swap primary and standby, perhaps because the standby has more powerful hardware, then the old primary, now standby, needs to be rebuilt completely from scratch. Transforming an old primary into a new standby without a completely new base backup is a rather intricate problem, but a tool that can do it (currently named pg_rewind) is proposed for inclusion into the next PostgreSQL release.


One of the problems that this evolution of replication has created is that the configuration is rather idiosyncratic, quite complicated to get right, and almost impossible to generalize sufficiently for documentation, tutorials, and so on. Dropping archiving with 9.4 might address some of these points, but configuring even just streaming replication is still weird, even weirder if you don’t know how it got here. You need to change several obscure configuration parameters, some on the primary, some on the standby, some of which require a hard restart of the primary server. And then you need to create a new configuration file recovery.conf, even though you don’t want to recover anything. Making changes in this area is mostly a complex political process, because the existing system has served people well over many years, and coming up with a new system that is obviously better and addresses all existing use cases is cumbersome.

Another issue is that all of this functionality has been bolted on to the write-ahead log mechanism, and that constrains all the uses of the write-ahead log in various ways. For example, there are optimizations that skip WAL logging in certain circumstances, but if you want replication, you can’t use them. Who doesn’t want replication? Also, the write-ahead log covers an entire database system and is all or nothing. You can’t replicate only certain tables, for example, or consolidate logs from two different sources.

How about not bolting all of this on to the WAL? Have two different logs for two different purposes. This was discussed, especially around the time streaming replication was built. But then you’d need two logs that are almost the same. And the WAL is by design a bottleneck, so creating another log would probably create performance problems.

Logical decoding breaks many of these restrictions and will likely be the foundation for the next round of major replication features. Examples include partial replication and multimaster replication, some of which are being worked on right now.

What can we expect from plain WAL logging in the meantime? Easier configuration is certainly a common request. But can we expect major leaps on functionality? Who knows. At one point, something like hot standby was thought to be nearly impossible. So there might be surprises still.

Listing screen sessions on login


There is a lot of helpful information about screen out there, but I haven’t found anything about this. I don’t want to “forget” any screen sessions, so I’d like to be notified when I log into a box and there are screens running for me. Obviously, there is screen -ls, but it needs to be wrapped in a bit of logic so that it doesn’t annoy when there is no screen running or even installed.

After perusing the screen man page a little, I came up with this for .bash_profile or .zprofile:

if which screen >/dev/null; then
    screen -q -ls
    if [ $? -ge 10 ]; then
        screen -ls
    fi
fi

The trick is that -q in conjunction with -ls gives you exit codes about the current status of screen.

Here is an example of how this looks in practice:

~$ ssh host
Last login: Fri Feb 13 11:30:10 2015 from
There is a screen on:
        31572.pts-0.foobar      (2015-02-15 13.03.21)   (Detached)
1 Socket in /var/run/screen/S-peter.


Directing output to multiple files with zsh


Normally, this doesn’t work as one might naively expect:

program > firstfile > secondfile

The second redirection will override the first one. You’d have to use an external tool to make this work, maybe something like:

program | tee firstfile secondfile

But with zsh, this type of thing actually works. It will duplicate the output and write it to multiple files.

This feature also works with a combination of redirections and pipes. For example

ls > foo | grep bar

will write the complete directory listing into file foo and print out files matching bar to the terminal.

That’s great, but this feature pops up in unexpected places.

I have a shell function that checks whether a given command produces any output on stderr:

! myprog "$arg" 2>&1 >/dev/null | grep .

The effect of this is:

  • If no stderr is produced, the exit code is 0.
  • If stderr is produced, the exit code is 1 and the stderr is shown.

(Note the ordering of 2>&1 >/dev/null to redirect stderr to stdout and silence the original stdout, as opposed to the more common incantation of >/dev/null 2>&1, which silences both stderr and stdout.)

The reason for this is that myprog has a bug that causes it to print errors but not produce a proper exit status in some cases.
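With a hypothetical stand-in for myprog, the intended behavior can be checked in bash:

```shell
# Stand-in for the buggy program (hypothetical names): always prints
# to stdout, and prints to stderr when asked to fail.
myprog() {
    echo "normal output"
    if [ "$1" = "fail" ]; then
        echo "oops" >&2
    fi
}

check() {
    ! myprog "$1" 2>&1 >/dev/null | grep .
}

check ok;   echo "ok -> $?"     # ok -> 0
check fail; echo "fail -> $?"   # "oops" is shown, then: fail -> 1
```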

Now how will my little shell function snippet behave under zsh? Well, it’s quite confusing at first, but the following happens. If there is stderr output, then only stderr is printed. If there is no stderr output, then stdout is passed through instead. But that’s not what I wanted.

This can be reproduced simply:

ls --bogus 2>&1 >/dev/null | grep .

prints an error message, as expected, but

ls 2>&1 >/dev/null | grep .

prints a directory listing. That’s because zsh redirects stdout to both /dev/null and the pipe, which makes the redirection to /dev/null pointless.

Note that in bash, the second command prints nothing.

This behavior can be changed by turning off the MULTIOS option (see zshmisc man page), and my first instinct was to do that, but options are not lexically scoped (I think), so this would break again if the option was somehow changed somewhere else. Also, I think I kind of like that option for interactive use.

My workaround is to use a subshell:

! ( myprog "$arg" 2>&1 >/dev/null ) | grep .

The long-term fix will probably be to write an external shell script in bash or plain POSIX shell.

ccache and clang, part 3

| Comments

In part 1 and part 2 I investigated how to use ccache with clang. That was more than three years ago.

I got an email the other day that ccache bug 8118, which I filed while writing part 1, was closed, as ccache 3.2 was released. The release notes of ccache 3.2 contain several items related to clang. So it was time to give this another look.

Basically, the conclusions from part 2 still stand: You cannot use ccache with clang without using CCACHE_CPP2. And it is now becoming clear to me that this is an issue that is not going to go away, and it’s not really even Clang’s fault.


The problem is that clang’s -Wall can cause warnings when compiling the preprocessed version of otherwise harmless C code. This can be illustrated by this piece of C code:

int
main()
{
        int *p, *q;

        p = q = 0;
        p = p;
        if (p == p)
                return 1;
        if ((p == q))
                return 2;
        return 0;
}

When compiled by gcc-4.9 -Wall, this gives no warnings. When compiled by clang-3.5 -Wall, this results in

test.c:7:4: warning: explicitly assigning value of variable of type 'int *' to itself [-Wself-assign]
test.c:8:8: warning: self-comparison always evaluates to true [-Wtautological-compare]
test.c:10:9: warning: equality comparison with extraneous parentheses [-Wparentheses-equality]
test.c:10:9: note: remove extraneous parentheses around the comparison to silence this warning
test.c:10:9: note: use '=' to turn this equality comparison into an assignment

You wouldn’t normally write code like this, but the C preprocessor could create code with self-assignments, self-comparisons, extra parentheses, and so on.
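For example, a harmless-looking macro (EQ and f are hypothetical names, assuming a cc that supports -E) expands into exactly the kind of parenthesized comparison that triggers one of these warnings:

```shell
# A macro like this is idiomatic C, but its expansion carries
# extra parentheses into the if condition.
cat > eq_demo.c <<'EOF'
#define EQ(a, b) ((a) == (b))
int f(int p, int q)
{
        if (EQ(p, q))
                return 2;
        return 0;
}
EOF
cc -E eq_demo.c | grep 'p) *== *(q'
rm -f eq_demo.c
```

The preprocessed output contains something like `if (((p) == (q)))`, which is just what -Wparentheses-equality complains about.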

This example represents the issues I saw when trying to compile PostgreSQL 9.4 with ccache and clang; there might be others.

You can address this issue in two ways:

  1. Use CCACHE_CPP2, as discussed in part 2. With ccache 3.2, you can now also put this into a configuration file: run_second_cpp = true in ~/.ccache/ccache.conf

  2. Turn off the warnings mentioned above: -Wno-parentheses-equality, -Wno-tautological-compare, -Wno-self-assign (and any others you might find). These might seem like useful warnings worth keeping, but GCC doesn’t warn about them, so if you develop primarily with GCC, your code might contain these issues anyway. In particular, I have found that -Wno-tautological-compare is necessary even for legitimate code.
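Concretely, the two options look like this (a sketch; the warning flags are the ones listed above, and the exact set may vary with your code base):

```shell
# Option 1: make ccache run the preprocessor a second time
export CCACHE_CPP2=true                                # per environment
echo 'run_second_cpp = true' >> ~/.ccache/ccache.conf  # or persistently (ccache >= 3.2)

# Option 2: silence the warnings that fire only on preprocessed code
CFLAGS="-Wall -Wno-parentheses-equality -Wno-tautological-compare -Wno-self-assign"
```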

I think CCACHE_CPP2 is the way to go, for two reasons. Firstly, having to add more and more options to turn off warnings is obviously somewhat stupid. Secondly and more importantly, there is nothing stopping GCC from adding warnings similar to Clang’s that would trigger on preprocessed versions of otherwise harmless C code. Unless they come up with a clever way to annotate the preprocessed code to the effect of “this code might look wrong to you, but it looked OK before preprocessing, so don’t warn about it”, in a way that creates no extra warnings and doesn’t lose any warnings, I don’t think this issue can be solved.


Now the question is, how much would globally setting CCACHE_CPP2 slow things down?

To test this, I have built PostgreSQL 9.4rc1 with clang-3.5 and gcc-4.8 (not gcc-4.9 because it creates some unrelated warnings that I don’t want to deal with here). I have set export CCACHE_RECACHE=true so that the cache is not read but new cache entries are computed. That way, the overhead of ccache on top of the compiler is measured.


  • clang-3.5
    • Using ccache is 10% slower than not using it at all.
    • Using ccache with CCACHE_CPP2 on is another 10% slower.
  • gcc-4.8
    • Using ccache is 19% slower than not using it at all.
    • Using ccache with CCACHE_CPP2 is another 9% slower.

(The different percentages for gcc and clang arise because gcc is faster than clang (yes, really; more on that in a future post), while the absolute overhead of ccache stays about the same.)

10% or so is not to be dismissed, but let’s remember that this applies only if there is a cache miss. If everything is cached, both methods do the same thing. Also, if you use parallel make, the overhead is divided by the number of parallel jobs.

With that in mind, I have decided to put the issue to rest for myself and have made myself a ~/.ccache/ccache.conf containing

run_second_cpp = true

Now Clang, or any other compiler, should run through ccache without problems.


There is one more piece of news in the new ccache release: Another thing I talked about in part 1 was that ccache disables the colored output of clang, and I suggested workarounds. This was actually fixed in ccache 3.2, so the workarounds are no longer necessary, and the configuration change above is the only thing needed to make Clang work smoothly with ccache.