Capistrano MySQL Backups for Rails

Posted by Dan Sosedoff on August 10, 2011

If you need to backup your application before each deployment, here is the small manual.

Capistrano versions: >= 2.5

Configuration

First, you need to define application environment:

# Define server-side rails environment
set :rails_env, "production"
 
# Primary deployment location
set :deploy_to, "/home/#{user}/#{application}"
 
# Place where all backups will be dumped
set :backup_to, "/home/#{user}/#{application}/backups"

Also, add this function. It allows capistrano to check if remote file exists:

def remote_file_exists?(full_path)
  'true' ==  capture("if [ -e #{full_path} ]; then echo 'true'; fi").strip
end

Now, we need to add a backup task:

namespace :utils do
  desc 'Backup database before deploy'
  task :backup, :roles => :db, :only => {:primary => true} do
    run "mkdir -p #{backup_to}" # Create a backup folder unless exists
 
    # Primary backup filename
    filename = "#{backup_to}/#{application}_predeploy_#{Time.now.strftime("%m%d%Y%H%I%S")}.sql.gz"
 
    # Check if we've got database config
    if remote_file_exists?("#{deploy_to}/current/config/database.yml")    
      text = capture("cat #{deploy_to}/current/config/database.yml")
      config = YAML::load(text)[rails_env]
 
      on_rollback { run "rm #{filename}" }
      run "mysqldump -u #{config['username']} -p #{config['database']} | gzip --best > #{filename}" do |ch, stream, out|
        ch.send_data "#{config['password']}\n" if out =~ /^Enter password:/
      end
    else
      logger.debug("[Backup] No configuration file was found.")
    end
  end
end

And finally, add the capistrano before hook:

before "deploy", "utils:backup"

Testing

To test if the task works, run:

cap utils:backup

On the server side you should see a backup file:

/home/USER/APP/backups/APP_predeploy_MMDDYYHHMMSS.sql.gz.

Custom field aggregations in Sphinx using SphinxQL

Posted by Dan Sosedoff on September 06, 2010

Sphinx is a really powerful tool for a full-text database search. It is the perfect option as a search engine on your website’s data.
In default mode it works as a regular tcp server and has multiple native language bindings for php, ruby, c, etc. But its another outstanding feature is MySQL Protocol Connectoin and SphinxQL, which is similar to native mysql query language.

So, ok. Lets say we have N documents with M attributes. Attributes could be different: string, integer, double, boolean. Out objective is to perform attribute aggregation based on specified search term (user-defined, etc). That will give us full information on data selected only by search term. Its only use-case when you really need to get these aggregate fields. Next part is tricky and not really efficient.

First of all, you have to setup Sphinx search daemon instance using different configuration file (it could not run both). Another problem – you have to setup another data sources and index files, Sphinx puts a lock on all used-right-now files.

Lets assume we have a database of books. We need to build a form with sliders which could be used as user-friendly search filter. All we need is to get a list of min and max attributes values. But there is a problem: sometimes, while working with sphinx you might find yourself trying to use it like you usually do with regular RDMS. Unfortunately, sphinx has a different design. Basically, sphinx has one primary field which presents in each search request – DocumentID. Its an unique id that represents your data ID, which makes it harder to product aggregate data. And there is no way to get rid of that field.
The whole idea of our aggregation – using boolean match mode with no weighting performed at all. In that case all results will have weight field = 1. That will give us ability to group all the results by weight field, rejecting the DocumentID field.

Here is the sample query:

SELECT
  MIN(reviews) AS min_reviews, MAX(reviews) AS max_reviews,
  MIN(pages) AS min_pages, MAX(pages) AS max_pages,
  MIN(pub_year) AS min_date, MAX(pub_year) AS max_date,
  @weight AS w
FROM 
  INDEX_NAME
WHERE
  MATCH('SEARCH_TERM') AND pages > 30
GROUP BY w OPTION ranker = none

The result of this query will be one row with field alias names. Thats’s it.

All statements are fully customizable. Just check full SphinxQL reference for details.

Disable auto-incremental field in Rails Migrations

Posted by Dan Sosedoff on August 12, 2010

Since i`ve been using both DataMapper (Merb/Sinatra) and ActiveRecord (Rails) a lot i noticed that AR acts weight when i manually set PK key, particularly ID field, which you dont have to define by default. In DM you have to define it as ‘Serial’.

So, the task is to create/update records in your database which is supposes to represent data from primary database. In such cases all ID`s should be unique and equal to each other.

To disable autoincremental ID field in your AR models just use this option:

create_table :foo, :id => false do |t|
  t.integer   :id, :options => 'PRIMARY KEY'
  # .... rest of columns
end

Date separated MySQL backups

Posted by Dan Sosedoff on September 18, 2009

Here is the bash shell script that makes archived dumps of your database server. All databases are separated from each other and stored into date based folders.

#!/bin/bash

MyUSER="root"
MyPASS=""
MyHOST="localhost"
NOW="$(date +"%d-%m-%Y")"
STOREDIR="/home/storage/backup/database/by_dates/$NOW"
DBLIST="$(mysql -u $MyUSER -h $MyHOST -Bse 'show databases')"

[ ! -d $STOREDIR ] && mkdir -p $STOREDIR || :

for db in $DBLIST
do
	FILE="$STOREDIR/$db.gz"
	mysqldump -u $MyUSER -h $MyHOST $db | gzip -9 > $FILE
done

Useless filesystem over MySQL

Posted by Dan Sosedoff on April 06, 2009

Here is a completely useless filesystem based on MySQL database storage – mysqlfuse, implemented with Fuse.
I didnt find any way how i can use it, but meanwhile, this fs working. Not perfect of course, in that case its not maintained for a long time. Doesnt support information about free drive space, so any filemanager keeps saying ‘Error: No space left on device’. Such case making it more useless.

It`s really easy to set it up.

First, we need to install developer headers for fuse:

$ apt-get install libfuse-dev

Next, getting sources (32bit only, not working in 64bit):

$ wget http://voxel.dl.sourceforge.net/sourceforge/mysqlfs/mysqlfs-0.4.0-rc1.tar.bz2

Unpack it, and compile:

$ tar -xjvf mysqlfs-0.4.0-rc1.tar.bz2
$ cd mysqlfs-0.4.0-rc1
$ ./configure && make && make install

Next, we need to setup the database

CREATE DATABASE mysqlfs;
GRANT SELECT, INSERT, UPDATE, DELETE ON mysqlfs.* TO mysqlfs@"%" IDENTIFIED BY 'password';
FLUSH PRIVILEGES;

And create database schema. SQL file located in root folder of the sources

$ mysql -uroot -p mysqlfs < schema.sql

And finally, mount filesystem to some folder:

$ mysqlfs -ohost=MYSQLHOST -ouser=MYSQLUSER -opassword=MYSQLPASS -odatabase=mysqlfs MOUNT_DIR

Now, its gonna be working. To use automatic configuration parameters you can create section [mysqlfs] in your mysql configuration file (my.cnf)

Parameters:

-ohost=
    MySQL server host

  -ouser=
    MySQL username

  -opassword=

    MySQL password

  -odatabase=
    MySQL database name

That`s it. Anyway, using FUSE there is a way to create so weird filesystems proxy. For example, there is SQLite over FUSE. And it is too old. Next time i`ll write about Amazon S3 over FUSE projects.

Updating order fields in table with MySQL

Posted by Dan Sosedoff on February 05, 2009

This is a simple piece of code which describing how to update special order field in table. For example, you have table ‘photos’:

+ --------------------------------- +
| ID    | UserID | OrderID | ImgURL |
+ --------------------------------- +
| 1132  | 1      | 0       | ...    |
| 2124  | 1      | 0       | ...    |
| 3456  | 1      | 0       | ...    |
.....................................
| N     | 1      | 0       | ...    |
+ --------------------------------- +

OrderID field value by default is 0. If some images has been deleted order lose its consistency.
To repair order we can use single transaction:

BEGIN;
 
SET @ORDER := 0;
UPDATE photos
SET order_id = (SELECT @ORDER := @ORDER + 1)
WHERE user_id = 1;
 
COMMIT;

After executing this on table ‘images’ we`ll get such result:

+ --------------------------------- +
| ID    | UserID | OrderID | ImgURL |
+ --------------------------------- +
| 1132  | 1      | 1       | ...    |
| 2124  | 1      | 2       | ...    |
| 3456  | 1      | 3       | ...    |
.....................................
| N     | 1      | M       | ...    |
+ --------------------------------- +

Some developers instead of using resources of mysql script language trying to solve this problem by executing numerous call, such ‘UPDATE table SET order_id = [value] WHERE id = [id] LIMIT 1;’. It`s wrong. MySQL language have a lot of ways to make it easy. At least UDF`s.

Simple MySQL backup script

Posted by Dan Sosedoff on January 15, 2009

There is a small useful ruby script to backup your MySQL databases in small projects, where speed of backup not so important.
Source:

#!/usr/bin/ruby
# MySQL Backup Utility
# Usage: ./mysql_backup.rb or ruby mysql_backup.rb
 
$backup_archive = true # gzip files after processing
$backup_dir = "/home/storage/backup/" # output directory
$backup_template = "project-%s-%s.sql" # text-%dbname-%timestamp.sql
$backup_cmd = "mysqldump -u local_backup --add-drop-table --databases %s &gt; %s"
$backup_dblist = [ # list of databases to backup
  'main',
  'users',
  'admin',
  'cards'
] 
 
def backup_database(database)
  time = Time.now()
  time_str = sprintf("%02i-%02i-%04i-%02i%02i%02i",time.day, time.month, time.year, time.hour, time.min, time.sec)
  filename = sprintf($backup_template,database,time_str)
  filename = "#{$backup_dir}#{filename}"
 
  cmd = sprintf($backup_cmd,database,filename)
  if system(cmd) then
    system("gzip --best #{filename}") if $backup_archive
  end
end
 
$backup_dblist.each do |db|
  puts "Processing database... #{db}"
  backup_database(db)
end

Paste: http://pastie.org/341839