PostgreSQL as a Data Warehouse in Ruby

PostgreSQL is a popular, free, open-source relational database that can be used for different workloads. Postgres provides querying abilities and windowing functions.

Its versatility makes it possible to be used as a transactional database as well as a data warehouse for analytics. Postgres boasts high reliability, extensibility, and data integrity. Postgres also supports a unique object-relational model, which is generally not found in counterparts like MySQL and MariaDB. This feature allows users to define custom data types if the already rich data types are not enough for the use case.

Key Features of PostgreSQL

Other than the standard features of a relational database that includes typical data types, transactional integrity, and querying layer, Postgres possesses some unique features that make it different from other databases in comparison. Let us look into these features as listed below:

  1. Postgres has a comprehensive querying layer that conforms to the internationally accepted SQL standards as much as possible. It supports 170 of the 179 mandatory features of the SQL standards.
  2. Postgres has an object-relational model feature that allows users to define custom data types and functions to deal with the custom-defined data types.
  3. Other than the traditional indexing mechanism, Postgres also supports expression index, partial index, and full-text indexes based on JSON data that is stored in binary format.
  4. Postgres allows you to define temporary tables that exist only for the lifetime of a query. It helps developers neatly organize their queries. It is beneficial in the case of running analytical use cases where queries can become complex.
  5. External data sources like MySQL can be used as data sources in Postgres. This makes Postgres an ideal choice for architectures that contain many external data sources.
  6. Postgres supports physical and logical replication.
  7. Materialized views are Postgres features that allow users to create a snapshot of tables in an alternate form suitable for specific queries. Materialized views store the alternate form of the table physically.
  8. Postgres has a rich window function set that supports functions like RANK, NTILE, etc. which can be specifically used in analytical queries.

Using Postgres in Ruby

To use Postgres in Ruby gem pg can be used. It is a Ruby interface to Postgres. Here is a small example of using pg gem in ruby

require 'pg'

module Database
    class Postgres
        # Initializes the class
        def initialize
          @connection = connect

        # Execute postgres queries
        # @param [Array] queriy Postgres query to execute
        # @return [Hash] response when query is executed
        # @option response [Boolean] :success query execution status
        # @option response [Hash] :result execution output
        def execute(queries)
          response = {}
          status = []

          queries.each do |query|
            puts "Query: #{query.inspect}"
              resp = @connection.exec("#{query}")
              response = {success: true, result: resp}
            rescue => err
              status = [query, err.class, err.message]
              response = {success: false, result: status}
              puts "Error running query #{query} - #{err.class}, #{err.message}"
          puts "Query execution status: #{response}"

          return response

        # Establishes connection with POSTGRES
        def connect
          PG.connect({:host => POSTGRES_DB_HOST, :dbname => POSTGRES_DATABASE,
                      :port => POSTGRES_DB_PORT, :user => POSTGRES_DB_USER,
                      :password => POSTGRES_DB_PASSWORD})

        # Disconnects with postgres
        def disconnect

Sending queries to the execute method of the Postgres class in the example above will execute them in the Postgres database. For example,

# Relative path to the postgres class file.
require_relative 'postgres'

query = "INSERT INTO `users` `name` VALUES ('abc')"
postgresInstance =
status = postgresInstance.execute([query])
unless status[:success]
puts "Postgres query error status: #{status[0]} #{status[1]} #{status[2]}"
raise "Failed insert query #{query} with status #{status}"
puts "Postgres data insert complete."


As evident from the prose above, using Postgres as the primary analytics database in ruby makes a strong case. Unless you have TBs of data and require a database with horizontal scaling.

Yash Shah

Yash Shah