Junior developer

puts "Hello world!"

My Photo


I've been Twittering

The U.S. Navy reading list

Tom Copeland's Recent Posts

  • Things to keep from RubyForge
  • Generating Parsers with JavaCC, Second Edition now available
  • Transferring gem namespaces on RubyForge
  • Upgrading PostgreSQL with pg_migrator
  • The rubyforge gem and the RubyForge REST API
  • Applying the Rails XSS vulnerability patch
  • What gem namespaces does that RubyForge project own?
  • How much disk space is my RubyForge project using?
  • Refactoring RubyForge svn+http directives
  • Monit, Mongrel, and /usr/local/bin

RSS Feeds

All/ Java/ Ruby/ PostgreSQL/ General

Upgrading PostgreSQL with pg_migrator

I've always done PostgreSQL upgrades via the time-honored "dump and load" method. This works, but it takes a while as you can see from the notes on the RubyForge PostgreSQL 8.2 => 8.3 upgrade.

Happily, with PostgreSQL 8.4, the dump/load routine is no longer necessary. There's a new game in town, and it's pg_migrator. This utility copies over the appropriate bits from your old PostgreSQL installation and adds them to your new one, with the result being a much shorter downtime.

I've done two 8.3 => 8.4 upgrades so far; here are some notes:

  • If you don't compile PostgreSQL with --disable-integer-datetimes you might get an error from pg_migrator. This has happened in both migrations for me. To fix, just recompile with that flag and move on. Also, since you probably did an initdb before running pg_migrator, you'll need to run that again. Easiest fix is probably to just rm -rf /path/to/pgsql84/ and recompile, reinstall, and initdb.
  • When installing pg_migrator I found I needed to first run make USE_PGXS=1 PG_CONFIG=/var/pgsql/bin/pg_config and then run a sudo make USE_PGXS=1 PG_CONFIG=/var/pgsql/bin/pg_config install. The two files that need to be copied over to your new pgsql directory are func/pg_migrator.so, which goes in /var/pgsql/lib/, and src/pg_migrator, which goes in /var/pgsql/bin/. Now that I'm thinking about it I'm wondering if that "sudo make install" is really necessary... I'll experiment next time I'm doing an upgrade.
  • In one database I had a test table that was a copy of pg_class. This table had a field of type name, which caused problems for pg_migrator. In my case I could just drop the table and rerun pg_migrator, so it was no big deal.
  • The docs recommend running vacuumdb --all --analyze on the new cluster once the migration is complete. I found this takes almost as long as the actual migration! But it's probably the right time to do it since the system is down anyhow. Update: In the comments of this post Greg Smith notes that this can be run after the system is brought back online, although performance won't be great until all those stats are recomputed. But it would get the system back online more quickly.
  • pg_migrator has a --link parameter that does an in-place upgrade. This scares me - I want to be able to quickly fall back if something goes wrong - so I haven't used it. That would probably save a little time... not sure how much. It would definitely save some disk I/O, too. Update: In the comments of this post Bruce Momjian notes that this can save orders of magnitude of time. Update 2: Reran a migration using --link and migration time went from 13 minutes to 1 minute. Nice! Update 2: I used --link when migrating RubyForge from 8.3 to 8.4 - actual database migration took only 1m17s! This vs dump/load time a year ago of 40 minutes; would have been more this time for sure!
  • For one system I was using the postgres gem to hook my Rails app the the PostgreSQL database. I should upgrade to the pg gem, but anyhow, I didn't need to reinstall the gem after this upgrade.
  • I'm also running Sphinx on PostgreSQL and didn't need to reinstall that either; every thing just kept on ticking.

That's all I can think of for now. I'll add some notes to this after I do the RubyForge upgrade... these first couple upgrades have been "area familiarity".

September 30, 2009 in PostgreSQL | Permalink | Comments (3) | TrackBack (0)

An excellent PostgreSQL book

If that pink O'Reilly book "Practical PostgreSQL" is the most recent PostgreSQL book on your shelf, it's time for an upgrade to Neil Matthew and Richard Stones' second edition of Beginning Databases with PostgreSQL.  I kind of hesitated to pick this one up since it sounded like it might be too much of a beginner's book, but I've been very pleased with it.  After a couple of introductory chapters it gets right down to business with a good discussion of psql, advanced querying (correlated subqueries and outer joins and whatnot), and fine sections on triggers and stored procedures.

Most of the PostgreSQL work I do (on indi) is using Ruby and the ActiveRecord object/relational mapping library, so I don't do much actual SQL or querying PostgreSQL from C.  But it's nice to know that the book includes a chapter on libpq in case I ever need to hack around the Ruby PostgreSQL extension. 

Anyhow, I highly recommend this book; it's been very useful for me.  Now I just keep it on my desk instead of always keeping a browser window open to the online PostgreSQL docs :-)

March 20, 2006 in PostgreSQL | Permalink | Comments (0) | TrackBack (0)

Tweaking an old GForge SQL query

RubyForge is running a rather old version of GForge - 4.0.2.  It's pretty far out of date, and hopefully I'll be able to upgrade it soonish.  But anyhow, Daniel Berger had reported that searching tracker items was extremely slow.  Just searching a single project's support tracker took a few minutes... which pegged RubyForge's CPU, blah.

Here's the old query:

SELECT DISTINCT ON (a.group_artifact_id,a.artifact_id) a.group_artifact_id,
a.artifact_id,a.summary,a.open_date,users.realname
FROM artifact a
LEFT OUTER JOIN artifact_message am USING (artifact_id), users
WHERE a.group_artifact_id='102'
  AND users.user_id=a.submitted_by
    AND ((a.details ILIKE '%wiki%')
      OR (a.summary ILIKE '%wiki%'))
      OR (am.body ILIKE '%wiki%')
ORDER BY group_artifact_id ASC, a.artifact_id ASC

And here's the new, much faster query:

SELECT DISTINCT ON (a.group_artifact_id,a.artifact_id) a.group_artifact_id,
a.artifact_id,a.summary,a.open_date,users.realname
FROM artifact a
LEFT OUTER JOIN artifact_message am USING (artifact_id), users
WHERE a.group_artifact_id='102'
  AND users.user_id=a.submitted_by
    AND ((a.details ILIKE '%wiki%')
      OR (a.summary ILIKE '%wiki%')
      OR (am.body ILIKE '%wiki%'))
ORDER BY group_artifact_id ASC, a.artifact_id ASC

See the difference?  It's quite small - just moved a parenthesis so that the second OR condition wasn't floating out there on its own.  This prevents a full table scan of artifact_messages - which only has 4600 records or so but scanning the text of all those message bodies isn't cheap.

At any rate, you can see the new query in action here.  Snappy!

In retrospect, I should have checked the latest code from GForge - because this is fixed in GForge 4.5.6.  So Tim Perdue and his team are on the ball, good times.  Ah well, it was an interesting experience, spelunking around in SQL; indi runs on Rails and so ActiveRecord shields me from most of that.

March 03, 2006 in PostgreSQL | Permalink | Comments (0) | TrackBack (0)

Jabber, PostgreSQL, and xdb

I've got Jabberd 1.4 set up so's it stores all the data in a PostgreSQL database; more details on how that's set up is here.  But recently I needed to change the connect string so that it connects to a database whose name is different than the user name.  Here's how to do that; this is the postgresql element that's a child of the xdb_sql element:

<postgresql>
<conninfo>user=game password=secret dbname=gamedb</conninfo>
</postgresql>

That's dbname, not db or database.   This is probably documented somewhere, but hopefully this post will make it easier to Google up...

February 15, 2006 in PostgreSQL | Permalink | Comments (0) | TrackBack (0)

Jabber server + PostgreSQL == good

Recently I needed to get Jabber 1.4.4 to store rosters and users and presence and so forth to a PostgreSQL database.

  • I modified the SQL scripts to create tables in PostgreSQL; script is here.  Note that there are some extra sequences and id columns in there so that ActiveRecord can its object-relational mapping magic
  • I'm not too familiar with PostgreSQL date formats, so I just put storetime, delivertime, and lastmodified columns in TEXT format

I modified the sample jabber.xml so now the queries use PostgreSQL CASE and COALESCE statements vs the MySQL IF statement; sample jabber.xml is here.  Just scroll down to the xdb element to see the new queries.

For what it's worth, this setup works fine with the Ruby client library for Jabber.  Not surprising, but, cool to know anyway.  We might end up adding some load testing cases to the Jabber4R project.

Was this helpful?  If so, please buy my completely unrelated book!

October 13, 2005 in PostgreSQL | Permalink | Comments (2) | TrackBack (0)