Scaling images with RScale

Posted by Dan Sosedoff on January 22, 2011

There are a few different image processing libraries out there right now:
- Paperclip
- Dragonfly
- CarrierWave

But sometimes you just need a tool that does one simple job: scale an image and save it to the file system as easily as possible. Here is the library I made for this specific role: RScale. It is a simple image processing library for Ruby scripts, based on the ImageMagick command-line tools. It lets you define a set of image formats with their exact dimensions and generate thumbnails with a single call. It has no features other than making thumbnails, and it does not keep the original source file. You can also use it with Rails 2/3, Sinatra or any other framework.

Installation

Make sure you have ImageMagick installed on your system.
You can install it using aptitude or compile from source.

sudo apt-get install imagemagick

Install RScale as a Ruby gem:

sudo gem install rscale

Getting Started

First, we need to set up the actual output folder. In Rails it would be File.join(Rails.root, "public").
Make sure this folder is writable.

RScale.configure do |c|
  c.public = "PATH_TO_YOUR_OUTPUT_DIR"
end

Now we need to define formats. A format is a holder of different image styles.
Here is an ‘avatar’ format with 3 styles (64×64, 128×128, 256×256).

RScale.format :avatar do |f|
  f.url = '/static/:format/:style/:uuid_dir/:uuid.jpg' # optional
  f.style :small,  :size => '64x64', :sharp => true, :q => 50
  f.style :medium, :size => '128x128'
  f.style :large,  :size => '256x256'
end

Style options:

  • :size – Exact image size in pixels, written as ‘WIDTHxHEIGHT’ (for example ‘64x64’)
  • :sharp – Sharpen image after processing (true/false)
  • :q – Output image quality (0..100)
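
To make the style options more concrete, here is a minimal sketch of how they could be turned into an ImageMagick command line. This is not RScale's actual code: the convert_args helper and the file names are made up for illustration, and the choice of -resize with the "^" flag plus -extent to get an exact size is an assumption about how such a library might work.

```ruby
# Hypothetical helper, not part of RScale: maps style options
# (:size, :sharp, :q) to arguments for ImageMagick's `convert` tool.
def convert_args(source, target, options)
  size = options.fetch(:size)
  args = ['convert', source,
          '-resize', "#{size}^",   # scale until the image fills the box
          '-gravity', 'center',
          '-extent', size]         # crop to the exact dimensions
  args += ['-sharpen', '0x1'] if options[:sharp]
  args += ['-quality', options[:q].to_s] if options[:q]
  args << target
end

args = convert_args('in.jpg', 'out.jpg', :size => '64x64', :sharp => true, :q => 50)
puts args.join(' ')
```

An argument list like this could be handed to system(*args), which avoids shell quoting problems with unusual file names.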

The URL parameter is just the path where generated thumbnails are stored, relative to the public path defined in the configuration block. Available URL parameters:

  • :uuid – 32-character UUID string
  • :uuid_dir – /xx/xx directory structure generated from the UUID string
  • :md5 – 32-character MD5 checksum of the source image
  • :time – Unix timestamp
  • :extension – Original extension of source image
  • :filename – Original filename of source image
  • :format – Name of user-defined format
  • :style – Name of user-defined format style (ex. :small, :medium, :large)
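
As a rough illustration of how such a URL template expands, here is a small sketch. The expand_url helper is hypothetical and is not RScale's API; building :uuid_dir from the first two pairs of UUID characters is also an assumption.

```ruby
require 'securerandom'

# Hypothetical helper, not part of RScale: replaces :placeholders in a
# URL template with values from a hash. Unknown placeholders are kept.
def expand_url(template, values)
  template.gsub(/:([a-z_]+)/) do |match|
    key = Regexp.last_match(1).to_sym
    values.key?(key) ? values[key].to_s : match
  end
end

uuid = SecureRandom.hex(16) # 32 hexadecimal characters
values = {
  :format   => :avatar,
  :style    => :small,
  :uuid     => uuid,
  :uuid_dir => "#{uuid[0, 2]}/#{uuid[2, 2]}" # assumed xx/xx layout
}

puts expand_url('/static/:format/:style/:uuid_dir/:uuid.jpg', values)
```

Spreading files over xx/xx subdirectories keeps any single directory from collecting too many thumbnails.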

Usage example

    path = '/tmp/.....' # path to the source/uploaded image
    result = RScale.image_for :avatar, path
     
    # If the source file cannot be processed, result will always be nil
    unless result.nil?
      # result contains the processed thumbnails, with paths relative to the public path
      result[:small]  # 64x64
      result[:medium] # 128x128
      result[:large]  # 256x256
    end

    RScale does not support uploads to remote storage systems such as Amazon S3, Cloud Files, etc.
    It may support them later, but I don't think it needs to, given its purpose.

    Source

    Feel free to extend the library: RScale on GitHub

    Custom field aggregations in Sphinx using SphinxQL

    Posted by Dan Sosedoff on September 06, 2010

    Sphinx is a really powerful tool for full-text search over a database. It is a great option as a search engine for your website’s data.
    In its default mode it works as a regular TCP server and has native bindings for multiple languages: PHP, Ruby, C, etc. Another outstanding feature is its MySQL protocol connection and SphinxQL, which is similar to the native MySQL query language.

    So, OK. Let's say we have N documents with M attributes. Attributes can be of different types: string, integer, double, boolean. Our objective is to aggregate attributes over the documents matched by a specified search term (user-defined, etc.). That gives us full information on the data selected by the search term alone. This is the only use case where you really need these aggregate fields. The next part is tricky and not really efficient.

    First of all, you have to set up a separate Sphinx search daemon instance using a different configuration file (one instance could not run both). Another problem: you also have to set up separate data sources and index files, because Sphinx puts a lock on all files that are currently in use.

    Let's assume we have a database of books. We need to build a form with sliders that can be used as a user-friendly search filter. All we need is a list of min and max attribute values. But there is a problem: while working with Sphinx, you might find yourself trying to use it the way you usually use a regular RDBMS. Unfortunately, Sphinx has a different design. Basically, Sphinx has one primary field that is present in every search result: DocumentID. It is a unique ID that represents your data ID, which makes it harder to produce aggregate data. And there is no way to get rid of that field.
    The whole idea of our aggregation is to use boolean match mode with no weighting performed at all. In that case all results will have a weight field equal to 1. That lets us group all the results by the weight field, bypassing the DocumentID field.

    Here is the sample query:

    SELECT
      MIN(reviews) AS min_reviews, MAX(reviews) AS max_reviews,
      MIN(pages) AS min_pages, MAX(pages) AS max_pages,
      MIN(pub_year) AS min_date, MAX(pub_year) AS max_date,
      @weight AS w
    FROM 
      INDEX_NAME
    WHERE
      MATCH('SEARCH_TERM') AND pages > 30
    GROUP BY w OPTION ranker = none

    The result of this query will be a single row with the aliased field names. That's it.
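
If the list of attributes changes often, the query text can be generated. Below is a minimal sketch in Ruby. The aggregation_query helper, the index name 'books' and the very basic escaping are assumptions made for illustration; nothing here comes from Sphinx or its client libraries.

```ruby
# Hypothetical helper: builds the min/max aggregation query shown above
# for an arbitrary list of attributes.
def aggregation_query(index, attributes, term)
  columns = attributes.map { |a| "MIN(#{a}) AS min_#{a}, MAX(#{a}) AS max_#{a}" }
  # Minimal escaping of quotes and backslashes; real input needs more care.
  escaped = term.gsub(/['\\]/) { |c| "\\#{c}" }
  "SELECT #{columns.join(', ')}, @weight AS w " \
  "FROM #{index} WHERE MATCH('#{escaped}') " \
  "GROUP BY w OPTION ranker = none"
end

puts aggregation_query('books', %w[reviews pages pub_year], 'ruby')
```

The resulting string can be sent through any MySQL client library connected to the SphinxQL listener.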

    All statements are fully customizable. Check the full SphinxQL reference for details.

    Setting processor affinity for a certain task or process in Linux

    Posted by Dan Sosedoff on June 06, 2010

    When you are using SMP, you might want to override the kernel’s process scheduling and bind a certain process to a specific CPU or set of CPUs.

    What is this?

    CPU affinity is nothing but a scheduler property that “bonds” a process to a given set of CPUs on the SMP system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. Note that the Linux scheduler also supports natural CPU affinity:

    The scheduler attempts to keep processes on the same CPU as long as practical for performance reasons. Therefore, forcing a specific CPU affinity is useful only in certain applications. For example, applications such as Oracle (ERP apps) are licensed by the number of CPUs per instance. You can bind Oracle to specific CPUs to avoid licensing problems. This is really useful on large servers with 4 or 8 CPUs.

    Setting processor affinity for a certain task or process using taskset command

    taskset is used to set or retrieve the CPU affinity of a running process given its PID, or to launch a new COMMAND with a given CPU affinity. On older systems taskset is not installed by default, and you need to install the schedutils (Linux scheduler utilities) package.

    $ apt-get install schedutils

    On recent versions of Debian / Ubuntu Linux, taskset is installed by default as part of the util-linux package.

    The CPU affinity is represented as a bitmask, with the lowest order bit corresponding to the first logical CPU and the highest order bit corresponding to the last logical CPU. For example:

    • 0x00000001 is processor #0 (1st processor)
    • 0x00000003 is processors #0 and #1
    • 0x00000004 is processor #2 (3rd processor)
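
The mapping between CPU numbers and the bitmask is plain bit arithmetic. Here is a small sketch in Ruby; the helper names cpu_mask and cpus_in are made up for this example.

```ruby
# Build an affinity bitmask from a list of logical CPU numbers (0-based).
def cpu_mask(cpus)
  cpus.reduce(0) { |mask, cpu| mask | (1 << cpu) }
end

# Recover the list of CPU numbers from a bitmask.
def cpus_in(mask)
  (0...mask.bit_length).select { |cpu| mask[cpu] == 1 }
end

puts format('0x%08x', cpu_mask([0]))    # 0x00000001
puts format('0x%08x', cpu_mask([0, 1])) # 0x00000003
puts format('0x%08x', cpu_mask([2]))    # 0x00000004
puts cpus_in(0x0C).inspect              # [2, 3]
```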

    To set the processor affinity of process 13545 to processor #0 (1st processor), type the following command:

    $ taskset -p 0x00000001 13545

    If you find a bitmask hard to use, you can specify a numerical list of processors instead, using the -c flag:

    $ taskset -cp 1 13545
    $ taskset -cp 3,4 13545

    where -p means: operate on an existing PID instead of launching a new task (the default is to launch a new task)

    via http://www.cyberciti.biz/tips/setting-processor-affinity-certain-task-or-process.html

    Writing simple daemons in C

    Posted by Dan Sosedoff on February 13, 2009

    Since I started writing simple manuals on how to make system daemons, I have found a bunch of interesting documents. Today I just want to publish one of them instead of writing source code. This manual, originally written by Devin Watson, can be very useful for those who have no idea how to develop such system daemons. It covers only the basics.

    http://www.netzmafia.de/skripten/unix/linux-daemon-howto.html