Posted by Dan Sosedoff
on June 15, 2010
A snippet (found on the net) for removing files from the repository that no longer exist in your working tree:
$ git rm $(git ls-files -d)
For convenience, add it as an alias to ~/.bashrc or ~/.bash_aliases (on Ubuntu):
alias gitclean='git rm $(git ls-files -d)'
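One caveat with the alias above: the shell splits the output of git ls-files on whitespace, so paths containing spaces are passed to git rm in pieces. Below is a minimal Ruby sketch of the same idea that reads NUL-delimited output (the -z flag) instead. The helper names parse_deleted and remove_deleted_files are made up for this example.

```ruby
# Split the NUL-delimited output of `git ls-files -d -z` into paths.
# (parse_deleted is a hypothetical helper name.)
def parse_deleted(output)
  output.split("\0").reject(&:empty?)
end

# Remove every tracked file that is missing from the working tree.
# Returns the list of paths that were handed to `git rm`.
def remove_deleted_files
  output = IO.popen(%w[git ls-files -d -z], &:read)
  files = parse_deleted(output)
  return files if files.empty?

  # Arguments are passed as an array, so no shell word splitting happens.
  system('git', 'rm', '--', *files)
  files
end
```

Calling remove_deleted_files from inside a repository does roughly what the alias does, but a file named "my notes.txt" stays one argument.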
Posted by Dan Sosedoff
on June 13, 2010
While working on one of my projects, I tried to find a multi-purpose HTTP request class that can use different network interfaces/IP addresses and has a retry option (for when the connection is slow or the server is not responding for some reason).
Here is a small wrapper built on top of Ruby Curb, implemented as a module:
require 'curb'

module ApiRequest
  USER_AGENTS = [
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727)',
    'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.3) Gecko/20100423 Ubuntu/10.04 (lucid) Firefox/3.6.3',
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.70 Safari/533.4',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.2) Gecko/20100323 Namoroka/3.6.2',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9'
  ]
  CONNECTION_TIMEOUT = 10 # seconds
  @@interfaces = []

  # get random user-agent string for usage
  # (rand(n) returns 0..n-1, so every entry can be picked)
  def random_agent
    USER_AGENTS[rand(USER_AGENTS.size)]
  end

  # get random IP/network interface specified in @@interfaces
  def random_interface
    size = @@interfaces.size
    size > 0 ? @@interfaces[rand(size)] : nil
  end

  # perform request, assign_to - specify network interface/ip
  def perform(url, assign_to=nil)
    puts url
    interface = assign_to.nil? ? self.random_interface : assign_to
    req = Curl::Easy.new(url)
    req.timeout = CONNECTION_TIMEOUT
    req.interface = interface unless interface.nil?
    req.headers['User-Agent'] = self.random_agent
    begin
      req.perform
      if req.response_code == 200
        return req.downloaded_bytes > 0 ? req.body_str : nil
      else
        nil
      end
    rescue StandardError
      # connection errors and timeouts count as a failed attempt
      return nil
    end
  end

  # perform request by number of attempts
  def fetch(url, attempts=3)
    result = nil
    1.upto(attempts) do |a|
      result = self.perform(url)
      break unless result.nil?
    end
    return result
  end
end
And sample usage:
class TestRequest
  include ApiRequest

  def foo
    body = self.fetch('http://google.com')
  end
end
If the module variable “@@interfaces” holds an array of IP addresses or network interface names, one of them (selected at random) is used to perform each request. The “fetch” function takes an “attempts” parameter, set to 3 by default: the request is retried up to that many times until a response body is downloaded. If every attempt fails, it returns nil.
The “perform” function takes an “assign_to” parameter (not used by “fetch”) that binds the request to a specific interface. This is useful when you run several workers, each bound to its own interface, alongside one that picks a random IP. The ApiRequest module also holds a list of user agents and picks one at random for each request.
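To make the retry behaviour easier to see without touching the network, here is a small sketch with a stubbed “perform”. RetrySketch and FlakyClient are hypothetical names used only for this illustration. The loop mirrors the logic of “fetch” but is not the module itself.

```ruby
# Hypothetical stand-in for the fetch logic: retry until perform
# returns something other than nil, or give up after `attempts` tries.
module RetrySketch
  def fetch(url, attempts = 3)
    attempts.times do
      result = perform(url)
      return result unless result.nil?
    end
    nil
  end
end

# Stub client (an assumption for the demo): perform fails `failures`
# times in a row, then starts returning a fake body.
class FlakyClient
  include RetrySketch
  attr_reader :calls

  def initialize(failures)
    @failures = failures
    @calls = 0
  end

  def perform(url)
    @calls += 1
    @calls > @failures ? "body of #{url}" : nil
  end
end

lucky = FlakyClient.new(2)
lucky.fetch('http://example.com')   # succeeds on the third attempt

unlucky = FlakyClient.new(5)
unlucky.fetch('http://example.com') # nil: all three attempts failed
```

In both cases “perform” is called exactly three times, once per attempt.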
Pastie: http://pastie.org/private/j19j3hbebte9bjqaydslmg
Posted by Dan Sosedoff
on June 06, 2010
On an SMP system you might want to override the kernel's process scheduling and bind a certain process to a specific CPU or set of CPUs.
What is this?
CPU affinity is nothing but a scheduler property that “bonds” a process to a given set of CPUs on the SMP system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. Note that the Linux scheduler also supports natural CPU affinity:
The scheduler attempts to keep processes on the same CPU as long as practical for performance reasons. Therefore, forcing a specific CPU affinity is useful only in certain applications. For example, applications such as Oracle (ERP apps) are licensed by the number of CPUs per instance. You can bind Oracle to specific CPUs to avoid license problems. This is really useful on large servers with 4 or 8 CPUs.
Setting processor affinity for a certain task or process using taskset command
taskset is used to set or retrieve the CPU affinity of a running process given its PID, or to launch a new COMMAND with a given CPU affinity. On older systems taskset is not installed by default, and you need to install the schedutils (Linux scheduler utilities) package:
$ apt-get install schedutils
On recent versions of Debian / Ubuntu Linux, taskset is installed by default as part of the util-linux package.
The CPU affinity is represented as a bitmask, with the lowest order bit corresponding to the first logical CPU and the highest order bit corresponding to the last logical CPU. For example:
- 0x00000001 is processor #0 (1st processor)
- 0x00000003 is processors #0 and #1
- 0x00000004 is processor #2 (3rd processor)
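The mapping between CPU numbers and mask bits in the list above can be sketched in a few lines of Ruby. The function names cpus_to_mask and mask_to_cpus are made up for this example and are not part of taskset.

```ruby
# Build an affinity bitmask from a list of CPU numbers:
# CPU n corresponds to bit n (lowest-order bit is CPU #0).
def cpus_to_mask(cpus)
  cpus.reduce(0) { |mask, cpu| mask | (1 << cpu) }
end

# Reverse direction: list the CPU numbers whose bits are set.
def mask_to_cpus(mask)
  (0...mask.bit_length).select { |cpu| mask[cpu] == 1 }
end

format('0x%08x', cpus_to_mask([0]))     # "0x00000001"
format('0x%08x', cpus_to_mask([0, 1]))  # "0x00000003"
mask_to_cpus(0x00000004)                # [2]
```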
To set the processor affinity of process 13545 to processor #0 (1st processor), type the following command (options go first, then the mask, then the PID):
$ taskset -p 0x00000001 13545
If you find a bitmask hard to use, you can specify a numerical list of processors instead, using the -c flag:
$ taskset -cp 1 13545
$ taskset -cp 3,4 13545
where -p operates on an existing PID instead of launching a new task (the default is to launch a new task), and -c makes taskset read the affinity as a list of CPU numbers rather than a bitmask.
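To see that a processor list and a bitmask describe the same thing, here is a small Ruby sketch that converts a list string into the equivalent mask. The function name cpu_list_to_mask is hypothetical. Support for ranges such as "8-11" is an assumption based on the list format described in the taskset man page.

```ruby
# Convert a CPU list such as "3,4" or "0,5,8-11" into a bitmask.
# Each comma-separated part is either a single CPU or a first-last range.
def cpu_list_to_mask(list)
  list.split(',').reduce(0) do |mask, part|
    first, last = part.split('-').map { |n| Integer(n) }
    (first..(last || first)).reduce(mask) { |m, cpu| m | (1 << cpu) }
  end
end

format('0x%x', cpu_list_to_mask('1'))         # "0x2"
format('0x%x', cpu_list_to_mask('3,4'))       # "0x18"
format('0x%x', cpu_list_to_mask('0,5,8-11'))  # "0xf21"
```

So the list 3,4 used above corresponds to the mask 0x18.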
via http://www.cyberciti.biz/tips/setting-processor-affinity-certain-task-or-process.html
Posted by Dan Sosedoff
on June 28, 2009
First, we need to install all dependencies:
# yum install gettext-devel expat-devel curl-devel zlib-devel openssl-devel
Next, get the git 1.6.x sources:
# wget http://kernel.org/pub/software/scm/git/git-1.6.3.3.tar.gz
Then unpack the archive, cd into the git sources folder, and build and install it:
# make && make install && make clean
That's it, now you have git ready to go.
Posted by Dan Sosedoff
on April 06, 2009
Here is a completely useless filesystem based on MySQL database storage: mysqlfuse, implemented with FUSE.
I didn't find any practical use for it, but the filesystem does work. It is not perfect, of course, since it has not been maintained for a long time. It doesn't report free disk space, so any file manager keeps saying ‘Error: No space left on device’, which makes it even more useless.
It's really easy to set up.
First, we need to install the development headers for FUSE:
$ apt-get install libfuse-dev
Next, get the sources (32-bit only, it does not work on 64-bit):
$ wget http://voxel.dl.sourceforge.net/sourceforge/mysqlfs/mysqlfs-0.4.0-rc1.tar.bz2
Unpack it, and compile:
$ tar -xjvf mysqlfs-0.4.0-rc1.tar.bz2
$ cd mysqlfs-0.4.0-rc1
$ ./configure && make && make install
Next, we need to set up the database:
CREATE DATABASE mysqlfs;
GRANT SELECT, INSERT, UPDATE, DELETE ON mysqlfs.* TO mysqlfs@"%" IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
Then create the database schema. The SQL file is located in the root folder of the sources:
$ mysql -uroot -p mysqlfs < schema.sql
And finally, mount the filesystem on a directory:
$ mysqlfs -ohost=MYSQLHOST -ouser=MYSQLUSER -opassword=MYSQLPASS -odatabase=mysqlfs MOUNT_DIR
Now it should be working. To have the connection parameters picked up automatically, you can create a [mysqlfs] section in your MySQL configuration file (my.cnf).
Parameters:
-ohost=      MySQL server host
-ouser=      MySQL username
-opassword=  MySQL password
-odatabase=  MySQL database name
That's it. With FUSE you can build all sorts of weird proxy filesystems. For example, there is SQLite over FUSE, though it is quite old too. Next time I'll write about Amazon S3 over FUSE projects.