<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>dan.thoughts &#187; curb</title>
	<atom:link href="http://blog.sosedoff.com/tag/curb/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.sosedoff.com</link>
	<description>Web-development, PHP, Ruby, Sinatra, Merb, Rails, MySQL, SQLite, Web Services.</description>
	<lastBuildDate>Wed, 25 Jan 2012 18:54:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Handy HTTP requests with Curb and Ruby</title>
		<link>http://blog.sosedoff.com/2010/06/13/handy-http-requests-with-curb-and-ruby/</link>
		<comments>http://blog.sosedoff.com/2010/06/13/handy-http-requests-with-curb-and-ruby/#comments</comments>
		<pubDate>Sun, 13 Jun 2010 05:43:49 +0000</pubDate>
		<dc:creator>Dan Sosedoff</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Web Services]]></category>
		<category><![CDATA[API's]]></category>
		<category><![CDATA[crawlers]]></category>
		<category><![CDATA[curb]]></category>
		<category><![CDATA[curl]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[networking]]></category>
		<category><![CDATA[requests]]></category>

		<guid isPermaLink="false">http://blog.sosedoff.com/?p=232</guid>
		<description><![CDATA[While working on one of my projects, I looked for a multi-purpose HTTP request class that can use different network interfaces/IP addresses and retry a request (if the connection is slow or the server is not responding for some reason). 
Here is a small wrapper built on top of Ruby Curb, implemented as a module:

module ApiRequest
  USER_AGENTS = [...]]]></description>
			<content:encoded><![CDATA[<p>While working on one of my projects, I looked for a multi-purpose HTTP request class that can use different network interfaces/IP addresses and retry a request (if the connection is slow or the server is not responding for some reason). </p>
<p>Here is a small wrapper built on top of Ruby <a href="http://curb.rubyforge.org/">Curb</a>, implemented as a module:</p>

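<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rubygems'</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'curb'</span>
&nbsp;
<span style="color:#9966CC; font-weight:bold;">module</span> ApiRequest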
  USER_AGENTS = <span style="color:#006600; font-weight:bold;">&#91;</span>
    <span style="color:#996600;">'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3'</span>,
    <span style="color:#996600;">'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727)'</span>,
    <span style="color:#996600;">'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.3) Gecko/20100423 Ubuntu/10.04 (lucid) Firefox/3.6.3'</span>,
    <span style="color:#996600;">'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.70 Safari/533.4'</span>,
    <span style="color:#996600;">'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.2) Gecko/20100323 Namoroka/3.6.2'</span>,
    <span style="color:#996600;">'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9'</span>
  <span style="color:#006600; font-weight:bold;">&#93;</span>
&nbsp;
  CONNECTION_TIMEOUT = <span style="color:#006666;">10</span>
&nbsp;
  @@interfaces = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
&nbsp;
  <span style="color:#008000; font-style:italic;"># get random user-agent string for usage</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> random_agent
    USER_AGENTS<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC0066; font-weight:bold;">rand</span><span style="color:#006600; font-weight:bold;">&#40;</span>USER_AGENTS.<span style="color:#9900CC;">size</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#008000; font-style:italic;"># get random IP/network interface specified in @@interfaces</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> random_interface
    size = @@interfaces.<span style="color:#9900CC;">size</span>
    size <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">0</span> ? @@interfaces<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC0066; font-weight:bold;">rand</span><span style="color:#006600; font-weight:bold;">&#40;</span>size<span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#93;</span> : <span style="color:#0000FF; font-weight:bold;">nil</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#008000; font-style:italic;"># perform request, assign_to - specify network interface/ip</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> perform<span style="color:#006600; font-weight:bold;">&#40;</span>url, assign_to=<span style="color:#0000FF; font-weight:bold;">nil</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#CC0066; font-weight:bold;">puts</span> url
    interface = assign_to.<span style="color:#0000FF; font-weight:bold;">nil</span>? ? <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">random_interface</span> : assign_to
    req = <span style="color:#6666ff; font-weight:bold;">Curl::Easy</span>.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span>url<span style="color:#006600; font-weight:bold;">&#41;</span>
    req.<span style="color:#9900CC;">timeout</span> = CONNECTION_TIMEOUT
    req.<span style="color:#9900CC;">interface</span> = interface <span style="color:#9966CC; font-weight:bold;">unless</span> interface.<span style="color:#0000FF; font-weight:bold;">nil</span>?
    req.<span style="color:#9900CC;">headers</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#996600;">'User-Agent'</span><span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">random_agent</span>
    <span style="color:#9966CC; font-weight:bold;">begin</span>
      req.<span style="color:#9900CC;">perform</span>
      <span style="color:#9966CC; font-weight:bold;">if</span> req.<span style="color:#9900CC;">response_code</span> == <span style="color:#006666;">200</span>
        <span style="color:#0000FF; font-weight:bold;">return</span> req.<span style="color:#9900CC;">downloaded_bytes</span> <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">0</span> ? req.<span style="color:#9900CC;">body_str</span> : <span style="color:#0000FF; font-weight:bold;">nil</span>
      <span style="color:#9966CC; font-weight:bold;">else</span>
        <span style="color:#0000FF; font-weight:bold;">nil</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">StandardError</span>
      <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">nil</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#008000; font-style:italic;"># retry the request up to the given number of attempts</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> fetch<span style="color:#006600; font-weight:bold;">&#40;</span>url, attempts=<span style="color:#006666;">3</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    result = <span style="color:#0000FF; font-weight:bold;">nil</span>
    1.<span style="color:#9900CC;">upto</span><span style="color:#006600; font-weight:bold;">&#40;</span>attempts<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>a<span style="color:#006600; font-weight:bold;">|</span>
      result = <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">perform</span><span style="color:#006600; font-weight:bold;">&#40;</span>url<span style="color:#006600; font-weight:bold;">&#41;</span>
      <span style="color:#9966CC; font-weight:bold;">break</span> <span style="color:#9966CC; font-weight:bold;">unless</span> result.<span style="color:#0000FF; font-weight:bold;">nil</span>?
    <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#0000FF; font-weight:bold;">return</span> result
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>

<p>And sample usage:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">class</span> TestRequest
  <span style="color:#9966CC; font-weight:bold;">include</span> ApiRequest
&nbsp;
  <span style="color:#9966CC; font-weight:bold;">def</span> foo
    body = <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">fetch</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'http://google.com'</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>

<p>If the module variable &#8220;<em>@@interfaces</em>&#8221; holds an array of IP addresses or network interface names, one of them (selected at random) is used to perform each request. The &#8220;<em>fetch</em>&#8221; function takes an &#8220;<em>attempts</em>&#8221; parameter, which is set to 3 by default: the request is retried up to that many times until a result is downloaded from the URL. If every attempt fails, it returns nil.<br />
The &#8220;<em>perform</em>&#8221; function has an &#8220;<em>assign_to</em>&#8221; parameter (which is not used by &#8220;<em>fetch</em>&#8221;) that binds the request to a specified interface. This is useful when some workers are bound to a fixed interface while others pick a random IP. Also, the <em>ApiRequest</em> module has a list of user agents, one of which is picked at random for each request.</p>
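<p>The retry behaviour can be tried without any network access. This is only a minimal sketch: <em>with_attempts</em> is a hypothetical helper (it is not part of the module or of Curb), and the block stands in for a real request that returns nil on failure.</p>

```ruby
# Hypothetical helper: runs the block up to `attempts` times and returns
# the first non-nil result, or nil when every attempt fails.
def with_attempts(attempts = 3)
  attempts.times do
    result = yield
    return result unless result.nil?
  end
  nil
end

# Simulated request that fails twice before returning a body.
calls = 0
body = with_attempts(3) do
  calls += 1
  calls == 3 ? 'response body' : nil
end

puts body    # response body
puts calls   # 3
```

<p>Passing the request as a block keeps the retry loop separate from the HTTP code, so the same loop could wrap a call bound to a fixed interface as well.</p>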
<p>Pastie: <a href="http://pastie.org/private/j19j3hbebte9bjqaydslmg">http://pastie.org/private/j19j3hbebte9bjqaydslmg</a></p>
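<p>A note on the random selection: <em>rand(n)</em> returns an integer from 0 up to n-1, so an index drawn with <em>rand(list.size)</em> can reach every element. On Ruby 1.9 and later, <em>Array#sample</em> does the same in one call and returns nil for an empty array. The sketch below assumes such a Ruby version; <em>RandomPick</em>, <em>Picker</em> and the agent strings are made up for illustration.</p>

```ruby
# Hypothetical stand-in for the selection helpers; no network access needed.
module RandomPick
  # Returns a random element, or nil when the list is empty.
  def pick(list)
    list.sample
  end
end

class Picker
  include RandomPick
end

picker = Picker.new
agents = ['agent-a', 'agent-b', 'agent-c']

puts picker.pick(agents)        # one of the three strings
puts picker.pick([]).inspect    # nil
```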
]]></content:encoded>
			<wfw:commentRss>http://blog.sosedoff.com/2010/06/13/handy-http-requests-with-curb-and-ruby/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making HTTP requests from different network interfaces with Ruby and Curb</title>
		<link>http://blog.sosedoff.com/2010/06/09/making-http-requests-from-different-network-interfaces-with-ruby-and-curb/</link>
		<comments>http://blog.sosedoff.com/2010/06/09/making-http-requests-from-different-network-interfaces-with-ruby-and-curb/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 04:31:17 +0000</pubDate>
		<dc:creator>Dan Sosedoff</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Web Development]]></category>
		<category><![CDATA[Web Services]]></category>
		<category><![CDATA[curb]]></category>
		<category><![CDATA[curl]]></category>
		<category><![CDATA[get]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[net]]></category>
		<category><![CDATA[requests]]></category>

		<guid isPermaLink="false">http://blog.sosedoff.com/?p=228</guid>
		<description><![CDATA[At some point you will find that you have reached the requests-per-IP limit while using some API or crawling resources. And if you're doing it via the standard Net::HTTP, you'll face the problem that you cannot bind a request to a specified network interface (or IP). Bummer? No. Even if you can't do it with core [...]]]></description>
			<content:encoded><![CDATA[<p>At some point you will find that you have reached the requests-per-IP limit while using some API or crawling resources. And if you're doing it via the standard <a href="http://ruby-doc.org/core/classes/Net/HTTP.html">Net::HTTP</a>, you'll face the problem that you cannot bind a request to a specified network interface (or IP). Bummer? No. Even if you can't do it with the core class, you can take a look at <a href="http://curb.rubyforge.org/">Curb</a> &#8211; the libcurl Ruby binding. It has everything you need to make regular GET/POST/etc. requests. And of course, it's easy.</p>
<p>A simple example (the real IPs have been changed):</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rubygems'</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'curb'</span>
&nbsp;
ip_addresses = <span style="color:#006600; font-weight:bold;">&#91;</span>
  <span style="color:#996600;">'1.1.1.1'</span>,
  <span style="color:#996600;">'2.2.2.2'</span>,
  <span style="color:#996600;">'3.3.3.3'</span>,
  <span style="color:#996600;">'4.4.4.4'</span>,
  <span style="color:#996600;">'5.5.5.5'</span>
<span style="color:#006600; font-weight:bold;">&#93;</span>
&nbsp;
ip_addresses.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>ip<span style="color:#006600; font-weight:bold;">|</span>
  req = <span style="color:#6666ff; font-weight:bold;">Curl::Easy</span>.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'http://www.ip-adress.com/'</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  req.<span style="color:#9900CC;">interface</span> = ip
  req.<span style="color:#9900CC;">perform</span>
  result_ip = req.<span style="color:#9900CC;">body_str</span>.<span style="color:#9900CC;">scan</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">/&lt;</span>h2<span style="color:#006600; font-weight:bold;">&gt;</span>My IP address is: <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#91;</span>\d\.<span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006666;">1</span>,<span style="color:#006600; font-weight:bold;">&#125;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&lt;</span>\<span style="color:#006600; font-weight:bold;">/</span>h2<span style="color:#006600; font-weight:bold;">&gt;/</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">flatten</span>.<span style="color:#9900CC;">first</span>
  <span style="color:#CC0066; font-weight:bold;">puts</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;for #{ip} got response: #{result_ip}&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>
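<p>The extraction step can be tried on its own. <em>scan</em> with a capture group returns an array of capture arrays; when only the first match is needed, <em>String#[]</em> with a capture index returns the captured string directly, or nil when nothing matches. The sample text below is made up for illustration, and the real page markup may differ.</p>

```ruby
# Sample text, made up for illustration; the real page may differ.
body = 'My IP address is: 1.1.1.1'

# String#[] with a regexp and a capture index returns the captured
# text, or nil when the pattern does not match.
ip = body[/My IP address is: ([\d.]+)/, 1]
puts ip.inspect

missing = 'no address here'[/My IP address is: ([\d.]+)/, 1]
puts missing.inspect
```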

<p>Output (IPs have been changed):</p>
<pre>
for 1.1.1.1 got response: 1.1.1.1
for 2.2.2.2 got response: 2.2.2.2
for 3.3.3.3 got response: 3.3.3.3
for 4.4.4.4 got response: 4.4.4.4
for 5.5.5.5 got response: 5.5.5.5
</pre>
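<p>To stay under a per-IP limit, requests can be spread over the addresses in turn. The following is only a sketch of the assignment step, with no network access: the URLs are hypothetical, and each pair would then be handed to <em>Curl::Easy</em> with <em>req.interface = ip</em>, as in the example.</p>

```ruby
# Placeholder addresses, as in the example.
ip_addresses = ['1.1.1.1', '2.2.2.2', '3.3.3.3']

# Hypothetical URLs, for illustration only.
urls = (1..5).map { |n| "http://example.com/page/#{n}" }

# Pair each URL with an address, wrapping around when the list runs out.
plan = urls.each_with_index.map do |url, i|
  [url, ip_addresses[i % ip_addresses.size]]
end

plan.each { |url, ip| puts "#{url} via #{ip}" }
```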
<p>At least it's working. I haven't done any performance tests.<br />
Sample on pastie: <a href="http://pastie.org/private/afxlcuk1npwjov3wer5hw">http://pastie.org/private/afxlcuk1npwjov3wer5hw</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.sosedoff.com/2010/06/09/making-http-requests-from-different-network-interfaces-with-ruby-and-curb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

