<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Gillius&apos;s Programming</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/" />
    <link rel="self" type="application/atom+xml" href="http://gillius.org/atom.xml" />
    <id>tag:gillius.org,2011-01-04://1</id>
    <updated>2012-01-30T00:24:41Z</updated>
    <subtitle>Gillius&apos;s Programming.  C/C++ tutorials, games, java, allegro, and libraries.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 5.12</generator>

<entry>
    <title>Removing php extensions in IIS with Ionic&apos;s Isapi Rewrite Filter - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2012/01/removing-php-extensions-in-iis-with-ionics-isapi-rewrite-filter.html" />
    <id>tag:gillius.org,2012:/blog//2.243</id>

    <published>2012-01-29T23:20:03Z</published>
    <updated>2012-01-30T00:24:41Z</updated>

    <summary> I was placed in a situation of updating and redeploying a PHP application on an older IIS 6 server. I have some small experience with PHP, but none at all with IIS (only Apache). One thing that I feel...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	I was placed in a situation of updating and redeploying a PHP application on an older IIS 6 server. I have some small experience with PHP, but none at all with IIS (only Apache). One thing that I feel is very important, though I know not everyone cares about it, is having clean URLs, or at least removing the extension of dynamic pages. I don&#39;t know why a visitor to a website should have to see and know (when typing) whether my page is implemented in PHP, ASP, ASPX, JSP, or whatever. Typically I like to remove .html as well, but I feel you can go either way on that one (because HTML is the content type the browser sees, not a backend). In this case the site was already moving URLs, and a long-term move away from PHP was also possible. I didn&#39;t want to risk breaking bookmarks again in the future.</p>
<p>
	Normally in Apache I would turn on MultiViews and also rely on default files like index.htm/php. While IIS 6 supports default files, it doesn&#39;t support MultiViews as far as I can tell. My searching didn&#39;t find many workarounds. I did find <a href="http://iirf.codeplex.com/">Ionic&#39;s Isapi Rewrite Filter</a> though, which is basically an open-source mod_rewrite for IIS. Read on for details on my configuration.</p>
]]>
        <![CDATA[<p>
	I couldn&#39;t get this to work as well as I wanted, or with the rules as clean as I wanted. I know there must be a better way, so I encourage comments, or&nbsp;<a href="/contact">contact me</a>&nbsp;if you know of a better way. I did achieve all of my important requirements, though, for my specific application:</p>
<ul>
	<li>
		For a resource or downloadable content (like jpg, css, zip, etc), leave the extension.</li>
	<li>
		If requesting a directory (URL ending with /), try first for&nbsp;<tt>index.php</tt>, then&nbsp;<tt>index.html</tt>, then&nbsp;<tt>index.htm</tt>.</li>
	<li>
		If requesting a non-directory location that doesn&#39;t exist (like&nbsp;<tt>/whatever</tt>), look for&nbsp;<tt>whatever.php</tt>,&nbsp;<tt>whatever.html</tt>, and&nbsp;<tt>whatever.htm</tt>&nbsp;in that order, and use the first one found. I do rely on IIS&#39;s behavior that if the user goes to&nbsp;<tt>/somedir</tt>&nbsp;and&nbsp;<tt>somedir</tt>&nbsp;is a directory, it redirects them to&nbsp;<tt>/somedir/</tt>, then my rules will look for index properly.</li>
</ul>
<p>
	I allowed direct links with .php on the end to work, and I didn&#39;t redirect those to the &quot;php-free&quot; URL, but this is something I think is easily possible. Here is my configuration:</p>
<pre>
RewriteEngine ON
#uncomment this to enable the iirf status page (see documentation)
#StatusInquiry ON /iirfStatus RemoteOk

IterationLimit 5

# don&#39;t rewrite any request that ends with one of these extensions,
# even if they don&#39;t exist.
RewriteRule (.+\.)(jpg|png|jpeg|gif|ttf|sql|txt|zip|css)$   -   [L]

# If there is an exact match, then don&#39;t do any rewriting
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^.*$ - [L]

#If this is a URL ending with a /, try index.php, then index.html
#(I don&#39;t know how to get it to search as if we had done /index):
RewriteCond %{REQUEST_FILENAME}index.php -f
RewriteRule ^(.*)/$  $1/index.php [L]
RewriteCond %{REQUEST_FILENAME}index.html -f
RewriteRule ^(.*)/$  $1/index.html [L]
RewriteCond %{REQUEST_FILENAME}index.htm -f
RewriteRule ^(.*)/$  $1/index.htm [L]

# Try to see if this is a PHP page:
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^/(.*)$  /$1.php [L]

# Try to see if this is an HTML page:
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/(.*)$  /$1.html [L]
RewriteCond %{REQUEST_FILENAME}.htm -f
RewriteRule ^/(.*)$  /$1.htm [L]
</pre>
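<p>
	As a rough illustration of the lookup order these rules aim for, here is a small Java sketch. It is not part of IIRF and does not touch IIS; the class name, the method name, and the list standing in for the file system are all made up for the example.</p>

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the intended lookup order; hypothetical helper, not IIRF code. */
final class CleanUrlResolver {
    // Extensions that are always served as requested (mirrors the first RewriteRule).
    private static final List<String> PASS_THROUGH =
            Arrays.asList("jpg", "png", "jpeg", "gif", "ttf", "sql", "txt", "zip", "css");
    // Candidate page extensions, in priority order.
    private static final String[] PAGE_EXTENSIONS = { ".php", ".html", ".htm" };

    /**
     * Returns the path that should be served for the requested URL path.
     * existingFiles is an assumed stand-in for "does this file exist on disk".
     */
    static String resolve(String urlPath, List<String> existingFiles) {
        int dot = urlPath.lastIndexOf('.');
        if (dot > 0 && PASS_THROUGH.contains(urlPath.substring(dot + 1))) {
            return urlPath;                       // static resource: never rewritten
        }
        if (existingFiles.contains(urlPath)) {
            return urlPath;                       // exact match: serve as-is
        }
        // Directory request: try index.*; otherwise append the extension directly.
        String base = urlPath.endsWith("/") ? urlPath + "index" : urlPath;
        for (String ext : PAGE_EXTENSIONS) {
            if (existingFiles.contains(base + ext)) {
                return base + ext;                // first candidate found wins
            }
        }
        return urlPath;                           // nothing matched: leave unchanged
    }
}
```

<p>
	With a file list containing /contact.php and /contact.html, a request for /contact resolves to /contact.php because .php is tried first, while a request for /missing.css is returned untouched even though no such file exists.</p>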
]]>
    </content>
</entry>

<entry>
    <title>Java and Scientific Python with JEPP? Nope - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2012/01/java-and-python-with-jepp-or-not.html" />
    <id>tag:gillius.org,2012:/blog//2.242</id>

    <published>2012-01-21T17:46:00Z</published>
    <updated>2012-01-21T17:50:20Z</updated>

    <summary> At work, I and the other software developers work primarily with the Java programming language. Part of our organization&#39;s goal involves algorithms and scientific data analysis of data sets, which is researched by another team. Traditionally there has been...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	At work, I and the other software developers work primarily with the Java programming language. Part of our organization&#39;s goal involves algorithms and scientific data analysis of data sets, which is researched by another team. Traditionally there has been a lot of data analysis and scientific work with Matlab, but the team has since switched to Python. Python appears to have a strong scientific community and tools (such as <a href="http://code.google.com/p/pythonxy/">pythonxy</a>) for rapid development for scientific computing, data analysis, and data visualization. Since then, I have looked at ways we can collaborate by running their algorithms in the JVM, without the need for costly and error-prone porting to Java. Ideally I&#39;d like a way for the Python code to leverage a codebase developed over 8 years for our problem domain, and for Java code to leverage new work being done in Python.</p>
<p>
	Read on for my current progress...</p>
]]>
        <![CDATA[<p>
	I have been watching&nbsp;<a href="http://www.jython.org/">Jython</a>&nbsp;(formerly JPython) for quite some time, but most of the popular scientific/visualization libraries (like&nbsp;<a href="http://numpy.scipy.org/">numpy</a>&nbsp;and others) are really wrappers around C code and use the extensions specific to CPython, which Jython doesn&#39;t support. There is a desire for Jython to support the C extensions API but it hasn&#39;t happened yet. If it did support it, then I think Jython would be the ideal solution, even if it does update more slowly than CPython. I saw&nbsp;<a href="http://jpype.sourceforge.net/">JPype</a>&nbsp;but it looks more like calling Java from Python, which is not quite what we want. Then I found&nbsp;<a href="http://sourceforge.net/projects/jepp/">JEPP</a>, which is more like calling Python from Java.</p>
<p>
	However, my experience with JEPP was horrible, and I never got it working at all, at least on Windows x86-32. The initial warning signs were the lack of updates for a while and the files being named with each permutation of Java version and Python version; it sounded like the PHP/Apache madness of matching 32/64 bit, Apache/PHP version, and C runtime version. It&#39;s like the native precompiled binary wheel of fortune where you spin it and just hope you get a set of compatible libraries. Here is the list of problems I ran into before I gave up:</p>
<ul>
	<li>
		JEPP has a very minimal website and virtually no documentation, although it does look simple to use.</li>
	<li>
		JEPP&#39;s latest version supports only up to Python 2.6. I couldn&#39;t find an old pythonxy version for Python 2.6, and even if I had, I&#39;d never see an update again, so I couldn&#39;t use pythonxy.</li>
	<li>
		I tried the Python 2.6.4 installer from the JEPP site. The installer appeared to install all files but at the end the installer said that it failed to complete; however, everything seemed to be installed and I could run Python so I went with it. JEPP would actually run with this version and I successfully ran some trivial Python with the included console.py. However, I couldn&#39;t get numpy 1.6.1 to work even in the native Python; it complained that the DLL was not found, so I didn&#39;t even try with JEPP.</li>
	<li>
		OK, so I uninstalled everything and tried the latest 2.6 ActiveState ActivePython (2.6.7.20). I installed Numpy 1.6.1 and it worked. Then I installed JEPP and tried to use it in the same way as before, but now the JVM segfaulted calling some unicode string handling method while trying to run the script, so I couldn&#39;t run anything at all in JEPP.</li>
	<li>
		At this point I did a lot of searching the web and found no solace. In fact, I even found a&nbsp;<a href="http://stackoverflow.com/questions/7592565/when-embedding-cpython-in-java-why-does-this-hang">stackoverflow post</a>&nbsp;only a few months old saying that numpy and JEPP aren&#39;t compatible due to the sub-interpreters used by JEPP and the&nbsp;<a href="http://wiki.python.org/moin/GlobalInterpreterLock">GIL</a>.</li>
</ul>
<p>
	After that experience, I didn&#39;t try to go any further. Even if I got it to work, I would be stuck forever on an old Python and there would be many more C extensions to worry about.</p>
<p>
	In the end I think it is not currently possible to run typical scientific Python code in the same process as JVM languages. So the fallback would be either porting code, or running the JVM and Python in two separate processes and using whatever IPC mechanism is the easiest (pipes, TCP, XML, etc.) with a very loosely bound interface. However, this makes it much more challenging (and adds much higher overhead) to build a single, integrated product and eliminates any ability to rapidly leverage library code in the JVM or Python for prototyping. I just heard about&nbsp;<a href="http://codespeak.net/execnet/">execnet</a>, which might make the IPC easier, though.</p>
]]>
    </content>
</entry>

<entry>
    <title>Jove crash when compiling automake - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2012/01/jove-crash-when-compiling-automake.html" />
    <id>tag:gillius.org,2012:/blog//2.241</id>

    <published>2012-01-08T20:01:10Z</published>
    <updated>2012-01-16T09:13:18Z</updated>

    <summary><![CDATA[ Short answer for the web searchers: if /usr/bin/emacs points to jove, install emacs. I&#39;ve had the privilege of starting to play around with Angstrom&nbsp;(fork of OpenEmbedded) on the BeagleBone. I needed to build some bleeding-edge software with bitbake on...]]></summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	Short answer for the web searchers: if /usr/bin/emacs points to jove, install emacs.</p>
<p>
	I&#39;ve had the privilege of starting to play around with <a href="http://www.angstrom-distribution.org/">Angstrom</a>&nbsp;(fork of OpenEmbedded) on the <a href="http://beagleboard.org/bone">BeagleBone</a>. I needed to build some bleeding-edge software with bitbake on an Ubuntu 10.10 machine but I ran into an interesting problem. When building the distribution, it builds all of the tools from the ground up, including automake. However, the automake build would fail with a jove crash while building some sort of support for emacs. I couldn&#39;t understand how they were related.</p>
<p>
	In the end, I found that since jove was installed but Emacs was not, /usr/bin/emacs actually pointed to starting jove. Automake was trying to use some kind of compiler for emacs (I guess it was making some emacs macros or syntax thing?). It must have thought emacs was properly installed and called Jove with a set of parameters that caused it to crash. The solution was to install Emacs. The build system itself wasn&#39;t in error, it was my machine, so it can&#39;t be reasonably &quot;fixed&quot; in the code. I wrote this post in the hope that people find it if they search for the same problem.</p>
]]>
        
    </content>
</entry>

<entry>
    <title>Web Security: XSS and CSRF - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/10/web-security-xss-and-csrf.html" />
    <id>tag:gillius.org,2011:/blog//2.240</id>

    <published>2011-10-21T02:22:28Z</published>
    <updated>2012-01-16T09:24:09Z</updated>

    <summary><![CDATA[ At work I&#39;ve been making a foray into web development. I&#39;ve always been one to be very interested in &quot;how to do it right&quot; rather than just &quot;get it done&quot; and stop when it looks like it works. Security...]]></summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	At work I&#39;ve been making a foray into web development. I&#39;ve always been one to be very interested in &quot;how to do it right&quot; rather than just &quot;get it done&quot; and stop when it looks like it works. Security is one of the things where it works perfectly for the user -- but also for the attacker if you don&#39;t do it right. I&#39;ve had a lot of experience with securing things with encryption, but the web is an entirely new (and scary) world.</p>
<p>
	The two attack vectors I looked into are XSS (cross-site scripting) and CSRF (cross-site request forgery, aka XSRF).</p>
]]>
        <![CDATA[<p>
	I know this topic is covered in detail in other sites (more authoritative than mine), but I hope that this article helps provide another perspective or at least increases the awareness of these issues in the field.</p>
<h2>
	Cross-Site Scripting</h2>
<p>
	This exploit happens when someone can inject script code into your site. The primary way an application is vulnerable to this is when you send unescaped user-supplied content to a client, where it is interpreted as HTML. An obvious way would be a forum post that allowed users to post a <tt>&lt;script&gt;</tt>. This script could steal information, the most important being the session cookie, and send it to the attacker&#39;s site; the attacker can then &quot;log in&quot; as you.</p>
<p>
	OK, everyone knows to protect forum posts, of course. A good programmer protects any user-developed content on your site. But keep thinking -- one place you might not realize: The error handling pages. Everyone loves a good &quot;404&quot; page with a message &quot;Oh, the URL ABC you tried to visit doesn&#39;t exist.&quot; Typically some web admin set that up once and no one thought about it again. Well, what if someone gets your user to visit (encoded of course) <tt>http://mysite.example.com/?&lt;script&gt;</tt>....? Now they have just injected a script into your site through your 404 page. What about a login page? &quot;The user ABC doesn&#39;t exist&quot;. What about the user <tt>&lt;script&gt;bad_code();&lt;/script&gt;</tt>? Does he exist?</p>
<p>
	There is plenty more information on the Internet, and I&#39;m not trying to outdo <a href="http://en.wikipedia.org/wiki/Cross-site_scripting">Wikipedia</a> here, but here are some tips:</p>
<ul>
	<li>
		Don&#39;t echo back ANYTHING to the user without escaping potential HTML code on the server side. In the case of the 404 and login examples above, consider not even echoing anything back at all.</li>
	<li>
		If you have client-side JavaScript reading from your server, escape there as well for another layer of precaution when it is possible.</li>
	<li>
		Consider a cookie value that wouldn&#39;t work if requested from a different machine (IP address) or browser. This isn&#39;t bulletproof if the attacker has control of the network to spoof this information, but it makes it a lot harder.</li>
</ul>
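<p>
	To make the first tip concrete, here is a minimal Java sketch of server-side escaping for text placed in an HTML body or attribute value. The class and method names are invented for the example, and a real application would more likely rely on the escaping built into its template engine or a vetted library. This only covers the HTML context; values placed inside JavaScript or URLs need different escaping.</p>

```java
/** Minimal HTML escaper; a sketch, not a replacement for a vetted library. */
final class HtmlEscaper {
    /** Replaces the five characters that are significant in HTML text and attributes. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (char c : input.toCharArray()) {
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Hypothetical 404 message builder that echoes the requested path safely. */
    static String notFoundMessage(String requestedPath) {
        return "The URL " + escape(requestedPath) + " you tried to visit doesn't exist.";
    }
}
```

<p>
	The point of the second method is that the escaping happens at the moment the value is written into the page, so a path that carries a script tag comes back as inert text.</p>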
<p>
	Another resource is this <a href="http://www.openajax.org/whitepapers/Ajax%20and%20Mashup%20Security.php">OpenAjax site page</a>, which also covers CSRF.</p>
<h2>
	Cross-Site Request Forgery</h2>
<p>
	Learning about this really blew my mind, because I understood XSS but I hadn&#39;t been aware of CSRF. CSRF occurs when an attacker causes a user&#39;s browser to perform an operation on your site. Even though the &quot;request&quot; came from the attacker&#39;s site, the browser will send the cookies (or HTTP authentication credentials) for your site to your site. This makes sense. Of course, the attacker can&#39;t see your cookies or see what you are doing on your site, but it might not matter.</p>
<p>
	In order to exploit a CSRF vulnerability, your application needs to have a request that performs an action beneficial to the attacker and that looks the same for every user. The typical example is the bank one, where visiting <tt>http://mybank.example.com/transfer?to=attacker@evil.com&amp;amount=1234</tt> will result in a money transfer. The attacker doesn&#39;t need the user&#39;s credentials nor does he even need to observe the request or response. That URL can be placed anywhere on the Internet easily, because most sites will let you publish content with images linked to any URL. Put that URL into the &quot;img src&quot; and the user will see a broken image but not know what happened. The link also needs to be put on a site that your visitors would visit as well, so of course large targets like Facebook and Google are far more likely targets.</p>
<p>
	OK, so HTTP &quot;GET&quot; requests with side-effects are bad style anyway. You can fix the img problem by not allowing this, but it doesn&#39;t fix the underlying problem. If the evil (or compromised) site contains a hidden HTML form and JavaScript, it can perform a POST as well.</p>
<p>
	Unfortunately, to me it appears the cookie and HTTP authentication models are broken in browsers, because they provide authentication information without prompting you. This makes sense; otherwise every single page you click on would require the browser to ask you to confirm first. There&#39;s no way for a user to stop this except to always &quot;log out&quot; of your site and not view any other sites in that browser while logged in.</p>
<p>
	The solution is to foil the attacker by making it so that each user has a unique request that can only be generated by that user <strong>that does not involve solely the browser&#39;s cookie or HTTP authentication system</strong>. My understanding through reading is that any of the following ways to add user-specific data to the request would work:</p>
<ul>
	<li>
		If using AJAX:
		<ul>
			<li>
				Take the value of the user&#39;s auth/session cookie; put it into the XML/JSON/form data. Since the evil site can&#39;t read your cookies, they can&#39;t form that request.</li>
			<li>
				Put the cookie&#39;s value into an HTTP header. Headers can be forged by attackers but they still can&#39;t see the cookie.</li>
			<li>
				You could use a token from the server other than the auth cookie, but it has to be specific to that user. I&#39;m not sure it provides any more protection, though.</li>
		</ul>
	</li>
	<li>
		If not using scripting, generate an HTML form with a hidden field with a random value, saved to that user or session. Verify that the value is the same on submit.</li>
</ul>
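<p>
	The hidden-field approach from the last bullet could look roughly like the following Java sketch. The class and method names are invented for the example, and how the token is stored in the session and rendered into the form depends on the framework, so that part is left out. The sketch assumes a 128-bit random token is enough and uses only the standard library.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;

/** Sketch of per-session CSRF tokens; hypothetical helper class. */
final class CsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Creates a 128-bit random token, hex encoded, to store in the user's session. */
    static String newToken() {
        byte[] raw = new byte[16];
        RANDOM.nextBytes(raw);
        StringBuilder hex = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            hex.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return hex.toString();
    }

    /** Checks the value submitted in the hidden field against the session's token. */
    static boolean isValid(String sessionToken, String submittedToken) {
        if (sessionToken == null || submittedToken == null) {
            return false;                         // a missing token is always rejected
        }
        // Byte-wise comparison via MessageDigest.isEqual rather than String.equals.
        return MessageDigest.isEqual(
                sessionToken.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

<p>
	The server would call newToken() when the session starts, write the value into a hidden form field, and reject any state-changing request for which isValid() returns false. A forged request from another site carries the cookie but not the token, so it fails the check.</p>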
<p>
	You can also read more at <a href="http://en.wikipedia.org/wiki/Cross-site_request_forgery">Wikipedia</a>.</p>
]]>
    </content>
</entry>

<entry>
    <title>Custom CRT Resolution on NVIDIA GTX 460 - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/09/custom-crt-resolution-on-nvidia-gtx-460.html" />
    <id>tag:gillius.org,2011:/blog//2.235</id>

    <published>2011-09-05T05:15:00Z</published>
    <updated>2012-01-16T09:25:01Z</updated>

    <summary><![CDATA[ I have a dual monitor setup that consists of an old 19&quot; CRT (Philips 109P) and a 20.1&quot; 1600x1200 LCD (Samsung 204B). I&#39;ve always just managed with having different resolutions on the monitors, but one thing that always bugged...]]></summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	I have a dual monitor setup that consists of an old 19&quot; CRT (Philips 109P) and a 20.1&quot; 1600x1200 LCD (Samsung 204B). I&#39;ve always just managed with having different resolutions on the monitors, but one thing that always bugged me is that the pixels were different sizes if I picked standard resolutions. When you drag a window from one monitor to the other you get an odd &quot;zoom&quot; effect. But I always knew that CRTs don&#39;t have fixed resolutions and you can pick whatever you want. I figured out the setup to create a custom resolution in Windows so the two differently sized monitors would match up exactly.</p>
<p>
	Continue reading for the details.</p>
]]>
        <![CDATA[<p>
	Typically I would run the CRT at 1152x864, which is the next resolution down from 1600x1200, as I felt 1600x1200 was too small for the 19&quot; CRT. I wanted to figure out the proper resolution for the CRT such that the pixels are the same size. That means the physical image would be the same size on both monitors; the CRT would just have less space.</p>
<p>
	I measured the width and height of the LCD monitor at 16 x 12 inches. Both monitors are 4:3, which means a 1.33 aspect ratio. The height of the viewable area on the CRT is 10.8 inches. From here the math was simple:</p>
<p>
	CRT height / LCD height = height ratio<br />
	10.8 / 12 = 0.9</p>
<p>
	Ratio * LCD resolution height = equivalent CRT resolution height<br />
	0.9 * 1200 = 1080</p>
<p>
	Since the CRT is 4:3, we can calculate the width:<br />
	1080 * 4 / 3 = 1440</p>
<p>
	This results in the CRT having an ideal resolution of 1440 x 1080.</p>
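<p>
	For anyone with different monitors, the same arithmetic can be written as a small Java sketch. The class and method names are made up for the example, and it assumes both monitors are 4:3 and that the heights are measured in the same unit.</p>

```java
/** Sketch of the pixel-matching arithmetic above; hypothetical helper class. */
final class MatchedResolution {
    /**
     * Vertical resolution for the second monitor so that its pixels are the
     * same physical size as the reference monitor's.
     */
    static int matchedHeight(double refHeightInches, int refHeightPixels, double otherHeightInches) {
        double pixelsPerInch = refHeightPixels / refHeightInches;
        return (int) Math.round(pixelsPerInch * otherHeightInches);
    }

    /** Width that goes with a given height at a 4:3 aspect ratio. */
    static int widthFor4by3(int heightPixels) {
        return heightPixels * 4 / 3;
    }

    public static void main(String[] args) {
        // LCD: 12 inches tall at 1200 pixels; CRT viewable height: 10.8 inches.
        int height = matchedHeight(12.0, 1200, 10.8);
        System.out.println(widthFor4by3(height) + " x " + height);
    }
}
```

<p>
	With the measurements from this post, main prints 1440 x 1080.</p>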
<p>
	Based on the spec sheet, which listed refresh rates for a set of standard supported resolutions, I guessed that I should be able to run 85 Hz at this setting.</p>
<p>
	Now all I had to do was figure out how to set this resolution in Windows. Fortunately, I found a way in the NVIDIA control panel. I don&#39;t know if there is a way to do it in Windows in a video card-independent manner. Other vendors may have similar options in their control panels.</p>
<p>
	In the NVIDIA control panel, go to Display -&gt; Change Resolution. Select the CRT and pick customize.</p>
<p>
	<img alt="CustomRes_1.png" class="mt-image-none" height="685" src="http://gillius.org/blog/2011/09/05/CustomRes_1.png" style="" width="720" /></p>
<p>
	Select the resolution you want and the refresh rate. You might have to play with the timings; I set mine to GTF and that seemed to work and get me the refresh rate I wanted.</p>
<p>
	<img alt="CustomRes_2.png" class="mt-image-none" height="553" src="http://gillius.org/blog/2011/09/05/CustomRes_2.png" style="" width="552" /></p>
<p>
	In my case, the results came out quite nicely. Now when I move windows/images between the monitors I don&#39;t get a disconcerting zooming effect. Hopefully another perfectionist out there can benefit from this.</p>
]]>
    </content>
</entry>

<entry>
    <title>Coder Grub - Quick Chunky Pasta Sauce - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/09/coder-grub---quick-chunky-pasta-sauce.html" />
    <id>tag:gillius.org,2011:/blog//2.237</id>

    <published>2011-09-05T04:29:02Z</published>
    <updated>2012-01-06T08:56:47Z</updated>

    <summary> Even coders have to eat sometime. Check out my guest post on my wife&#39;s blog, Kitchen Trial and Error. Summer is great for food because you can always find fresh ingredients at the local farmers&#39; market or your own...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	Even coders have to eat sometime. Check out my <a href="http://kitchentrialanderror.blogspot.com/2011/09/quick-chunky-pasta-sauce.html">guest post</a> on my wife&#39;s blog, <a href="http://kitchentrialanderror.blogspot.com/">Kitchen Trial and Error</a>.</p>
<p>
	Summer is great for food because you can always find fresh ingredients at the local farmers&#39; market or your own garden. Sometimes you bought (or harvested) a lot of great vegetables and don&#39;t know what to do with them. With no meal planned on a lazy Saturday afternoon, we were getting very hungry and needed something fast. I made a fresh pasta sauce from the tons of fresh vegetables in the house.</p>
]]>
        
    </content>
</entry>

<entry>
    <title>Conversion to Movable Type - Part 3 - Assets - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/07/conversion-to-movable-type---part-3---assets.html" />
    <id>tag:gillius.org,2011:/blog//2.234</id>

    <published>2011-07-23T00:04:00Z</published>
    <updated>2011-12-16T00:57:32Z</updated>

    <summary><![CDATA[ The last challenge in the Gillius.org conversion was assets. At first, I tried to find an external solution like the &quot;Asset Handler&quot; plugin, but it only works for MT4. In the end I wrote my own Java code to...]]></summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	The last challenge in the Gillius.org conversion was assets. At first, I tried to find an external solution like the &quot;Asset Handler&quot; plugin, but it only works for MT4. In the end I wrote my own Java code to scan my assets and upload each of them via the XML-RPC interface.</p>
]]>
        <![CDATA[<p>
	This post is the third part of a 3 part series. See also <a href="conversion-to-movable-type---part-1">part 1</a> and <a href="conversion-to-movable-type---part-2---page-import">part 2</a>.</p>
<p>
	The proper way, given enough time, would be to reverse engineer (or find documentation for) the MT database format and understand it well enough to construct all of the rows to describe the assets as if I had uploaded them. But I already had a published and tested XML-RPC interface, this was a one-off job, and I was running out of steam working on this, so I used XML-RPC to reupload all of the files. However, my custom approach had a few drawbacks:</p>
<ul>
	<li>
		I had to reupload all of the files that were already on my server, overwriting them</li>
	<li>
		Since I was overwriting the files, the modify/create dates on the files were all reset to the time of the import; I couldn&#39;t find a reasonable way within my motivation to fix the file timestamps (thus affecting Apache&#39;s return) or via MT
		<ul>
			<li>
				The table in MT looked like it was easy to fix, but who cares, if the HTTP server still returns the wrong date because the file&#39;s date on disk is wrong?</li>
		</ul>
	</li>
	<li>
		I had to modify the DeniedAssetFileExtensions in mt-config.cgi to allow the upload of certain files (like exe for my installers) (this was new upon upgrade to 5.11 from 5.04)</li>
</ul>
<p>
	For the code, I used the newMediaObject implementation of MovableTypeImpl from part 2 to actually upload the files. The files, their content and metadata were found simply by walking the filesystem. I wrote some custom code in the SiteScanner object (not shown below) to filter out certain paths that I didn&#39;t want to manage as assets in my site. From a high-level view, it&#39;s as simple as that, so the rest of the post will be the code. The sitePath is the root of the website, which is also used to determine the relative path needed for the MT interface (i.e. C:\Projects\MyWebsite\tutorials\file.txt to /tutorials/file.txt).</p>
<pre class="brush: java">
    private static final String FILE_SEP = System.getProperty( &quot;file.separator&quot; );

    private static void importAssets( MovableTypeImpl mt, File sitePath, SiteScanner scanner )
            throws IOException, XmlRpcException {

        long totalLength = 0;
        long totalFiles = 0;
        for ( File file : scanner ) {
            totalLength += file.length();
            ++totalFiles;
        }
        System.out.println( &quot;About to upload &quot; + totalLength + &quot; bytes in &quot; + totalFiles + &quot; files&quot; );

        for ( File file : scanner ) {
            String filePath = convertToSiteRelativePath( sitePath, file );

            //TODO: can I preserve the date?
            System.out.println( &quot;Uploading &quot; + filePath + &quot; (&quot; + file.length() + &quot; bytes)&quot; );
            String dest = mt.newMediaObject( filePath, getFileData( file ) );
            System.out.println( &quot;  Uploaded to: &quot; + dest );
        }
    }

    private static String convertToSiteRelativePath( File sitePath, File file ) {
        String filePath = file.getAbsolutePath();
        if ( file.getAbsolutePath().startsWith( sitePath.getAbsolutePath() ) ) {
            filePath = filePath.substring( sitePath.getAbsolutePath().length(), filePath.length() );
            if ( filePath.startsWith( FILE_SEP ) )
                filePath = filePath.substring( FILE_SEP.length(), filePath.length() );
            filePath = filePath.replace( FILE_SEP, &quot;/&quot; );
        }
        return filePath;
    }

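    private static byte[] getFileData( File file ) throws IOException {
        int total = 0;
        int length = (int) file.length();
        byte[] ret = new byte[length];
        FileInputStream reader = new FileInputStream( file );
        try {
            while ( total &lt; length ) {
                //Read into the unfilled tail of the buffer; read( ret ) alone would restart at offset 0
                int read = reader.read( ret, total, length - total );
                if ( read &lt; 0 ) throw new IOException( &quot;Cannot read all data from &quot; + file );
                total += read;
            }
        } finally {
            //Always release the file handle, even if reading fails
            reader.close();
        }

        return ret;
    }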
</pre>
]]>
    </content>
</entry>

<entry>
    <title>Conversion to Movable Type - Part 2 - Page Import - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/07/conversion-to-movable-type---part-2---page-import.html" />
    <id>tag:gillius.org,2011:/blog//2.233</id>

    <published>2011-07-20T02:02:00Z</published>
    <updated>2011-12-14T09:07:00Z</updated>

    <summary><![CDATA[ Using the XMLRPC interfaces I mentioned in part 1, I needed to parse my HTML content, strip it of the original &quot;template&quot;, and upload it to my development Movable Type instance. I used Java and XMLRPC for this, Java...]]></summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	Using the XMLRPC interfaces I mentioned in part 1, I needed to parse my HTML content, strip it of the original &quot;template&quot;, and upload it to my development Movable Type instance. I used Java and XMLRPC for this: Java because I was most familiar with it, and XMLRPC since that is what MT provides.</p>
]]>
        <![CDATA[<p>
	This post is the second of a three-part series; the first part is <a href="conversion-to-movable-type---part-1">here</a>.</p>
<h2>
	Upload Interface</h2>
<p>
	The key to uploading pages is the wp.newPage method, which MT supports and which is best defined in the <a href="http://codex.wordpress.org/XML-RPC_wp#wp.getPage">WordPress API documentation</a>. The following are the relevant parts for my project, as well as some MT-specific information I gleaned from the source code, because I never could find a good, proper page on MT&#39;s site documenting this.</p>
<p>
	<strong>wp.newPage( blogId, userId, pass, item )</strong><br />
	<br />
	item is a struct:</p>
<ul>
	<li>
		title = HTML title</li>
	<li>
		description = HTML content</li>
	<li>
		permaLink = the actual, absolute URL where you want it published</li>
	<li>
		dateCreated = date of the file imported</li>
</ul>
<p>
	It appears that if dateCreated is in the future, there is code to schedule the post to be published at that time, but I did not test this.</p>
<p>
	Other fields that I could set but didn&#39;t care about (for others who might care):</p>
<ul>
	<li>
		mt_convert_breaks</li>
	<li>
		mt_allow_comments</li>
	<li>
		mt_excerpt</li>
	<li>
		mt_text_more</li>
	<li>
		mt_keywords</li>
	<li>
		mt_tags (based on the source, uses author&#39;s tag delimiter setting to separate)</li>
	<li>
		mt_tb_ping_urls</li>
</ul>
<p>
	The fields that can&#39;t be &quot;empty&quot; (such as mt_allow_comments) are set to their defaults if not explicitly specified in the XML-RPC call. There also appears to be some kind of generic mechanism to support mt_* fields, but I&#39;m not sure what that can set, perhaps custom fields?<br />
	<br />
	The permaLink was the tricky part, because I thought that Movable Type folders mapped to categories (especially since they are reported back as such when you get post information). But the category field is only read for blog posts, not for pages. By looking at the Perl source code I determined that, for pages, it only looks at permaLink, and parses the fragments out of it to determine the folder and &quot;basename&quot; (or &quot;Filename&quot; in the web interface). Before that I had tried setting the property &quot;mt_basename&quot;, and this works great if you don&#39;t care about folders and want everything in the root.</p>
<p>
	I didn&#39;t try this, but looking at the source code it appears the permaLink parsing only takes effect if the entry type is set to &quot;page&quot;, which appears only to happen if you call wp.newPage as opposed to metaWeblog.newPost. I could be wrong, though.</p>
<p>
	I set MT to have &quot;htm&quot; instead of &quot;html&quot; as the file extension to match my existing site. I used htm on the permaLinks too. Out of curiosity I tried &quot;html&quot; and got the error message &quot;Requested permalink ... is not available for this page&quot;.</p>
<p>
	So in the client, I have to set both the URL prefix (&quot;http://gillius.org/&quot;) and the suffix (&quot;.htm&quot;).</p>
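<p>
	A minimal sketch of that client-side step, in Python rather than the Java I actually used: it builds the absolute permaLink from a site-relative file path and assembles the item struct for wp.newPage. The function names and the example paths are made up for illustration; only the prefix, the suffix, and the struct field names come from the text above.</p>
<pre class="brush: python">
import posixpath

# Values taken from the text above; change them for your own site.
URL_PREFIX = "http://gillius.org/"
URL_SUFFIX = ".htm"

def build_permalink(relative_path, prefix=URL_PREFIX, suffix=URL_SUFFIX):
    """Turn a site-relative path such as "gne/index.html" into an absolute permaLink."""
    # Normalize Windows separators and drop any leading slash, so that the
    # prefix and the path are joined by exactly one "/".
    path = relative_path.replace("\\", "/").lstrip("/")
    # Swap whatever extension the source file had for the one MT publishes.
    stem, _ext = posixpath.splitext(path)
    return prefix + stem + suffix

def build_page_item(relative_path, title, content, date_created):
    """Assemble the struct passed as the last argument of wp.newPage."""
    return {
        "title": title,
        "description": content,
        "permaLink": build_permalink(relative_path),
        "dateCreated": date_created,
    }</pre>
<p>
	With these assumptions, build_permalink(&quot;gne/index.html&quot;) gives &quot;http://gillius.org/gne/index.htm&quot;, so MT would file the page under the folder &quot;gne&quot; with the basename &quot;index&quot;.</p>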
<h2>
	XMLRPCServer bug</h2>
<p>
	The XML-RPC calls filter all of the basename settings for adds or edits, whether through mt_basename, wp_slug, or permaLink, through the _apply_basename function. That function checks whether the basename is unique. The problem is that it checks uniqueness across the whole blog rather than within a folder, which the web interface seems to handle correctly. For example <tt>folderA/the_page.htm</tt> and <tt>folderB/the_page.htm</tt> are considered &quot;the same&quot; to the XML-RPC interface but unique in the web interface. When the XML-RPC interface encounters an attempt to edit or add the second &quot;<tt>the_page.htm</tt>&quot;, it will rename it based on the content as if you didn&#39;t give it a basename at all.</p>
<p>
	I tried hard to find a way around this, but in the end I had no alternative but to edit the source code for my import.</p>
<p>
	This even causes a problem where, if you &quot;wp.getPage&quot; and then pass that exact, unmodified page back to &quot;wp.editPage&quot;, it actually changes the permaLink and basename, whereas normally it wouldn&#39;t. In my opinion, it&#39;s a bug if you can&#39;t do a &quot;round trip&quot;.</p>
<p>
	I&#39;m still not sure whether this is a &quot;bug&quot; or a feature, but since the web interface lets me do it, I don&#39;t see any reason the XML-RPC interface shouldn&#39;t. This is the code in question:</p>
<pre class="brush: perl">
    if (defined $basename) {
        # Ensure this basename is unique.
        my $entry_class = ref $entry;
        my $basename_uses = $entry_class-&gt;exist({
            blog_id  =&gt; $entry-&gt;blog_id,
            basename =&gt; $basename,
            ($entry-&gt;id ? ( id =&gt; { op =&gt; &#39;!=&#39;, value =&gt; $entry-&gt;id } ) : ()),
        });
        if ($basename_uses) {
            $basename = MT::Util::make_unique_basename($entry);
        }

        $entry-&gt;basename($basename);
    }</pre>
<p>
	I commented out every line except the if statement itself and the setter for basename. In MT 5.04, these were lines 180 to 188 in lib/MT/XMLRPCServer.pm.</p>
<p>
	I&#39;m not totally sure of the proper solution to this because I know virtually nothing about Perl. I would guess that one could just add the folder to the associative array in the exist call to &quot;narrow&quot; the search, but I&#39;m not sure. Since I was importing my existing, valid site from files on disk, I knew that there wouldn&#39;t be any actual basename collisions, so for the import I commented the check out. I decided to uncomment it after the import, to leave the code in a &quot;pristine&quot; state.</p>
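<p>
	To make the difference concrete, here is a toy model in Python of the two uniqueness rules. It is not MT code and does not touch MT&#39;s data model; the function name, the (folder, basename) tuples and the scope_to_folder flag are all invented for illustration.</p>
<pre class="brush: python">
def is_basename_taken(pages, basename, folder=None, scope_to_folder=False):
    """Return True if basename collides with an existing page.

    pages is a list of (folder, basename) tuples already in the blog.
    With scope_to_folder=False the check is blog-wide, which is how the
    XML-RPC interface behaved for me; with True, only pages in the same
    folder count, which is how the web interface appears to behave.
    """
    for existing_folder, existing_basename in pages:
        if existing_basename != basename:
            continue
        if scope_to_folder and existing_folder != folder:
            continue
        return True
    return False

existing = [("folderA", "the_page")]
print(is_basename_taken(existing, "the_page", folder="folderB"))
print(is_basename_taken(existing, "the_page", folder="folderB", scope_to_folder=True))</pre>
<p>
	The first call reports a collision (True) and the second does not (False), which is exactly the folderA/folderB disagreement described above.</p>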
<p>
	This is one of those moments that I&#39;m really glad for open source.</p>
<h2>
	<span id="cke_bm_417E" style="display: none;">&nbsp;</span>Writing the Code</h2>
<p>
	I may release the importer source code at some point, in some form. I thought of making a &quot;polished&quot; program but I ended up having to be pretty specific to my website and it took enough effort that I wasn&#39;t motivated to make a version that is bullet-proof, so my code would cause a lot more questions than answers. When it failed or if I found a special case or file it had trouble handling I had to coax it along. There&#39;s no GUI for it which would be a necessity to review what it was going to import (based on your scanning rules) as well as review if it parsed the content properly.</p>
<p>
	That said, there are a lot of things I can say to help others interested in writing their own. I wrote it in Java, but Python or Groovy would work well.</p>
<p>
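	In Python you can use xmlrpclib (named xmlrpc.client in Python 3):</p>
<pre class="brush: python">
import xmlrpclib  # on Python 3: import xmlrpc.client as xmlrpclib
proxy = xmlrpclib.ServerProxy(&quot;http://mt.example.com/cgi-bin/mt/mt-xmlrpc.cgi&quot;)
# The item struct is a dict; wp.newPage is the method described above
item = { &quot;title&quot; : title, &quot;description&quot; : content, &quot;dateCreated&quot; : dateCreated, &quot;permaLink&quot; : permaLink }
proxy.wp.newPage( blogId, username, password, item )</pre>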
<p>
	In Groovy you can use groovy-xmlrpc:</p>
<pre class="brush: groovy">
import groovy.net.xmlrpc.*
def server = new XMLRPCServerProxy(&quot;http://mt.example.com/cgi-bin/mt/mt-xmlrpc.cgi&quot;)
println server.mt.supportedMethods().join(&#39;\n&#39;)</pre>
<p>
	I ultimately went with Java because I was more comfortable with it at the time than with Python. Since I wrote that code, I&#39;ve been seriously learning <a href="http://groovy.codehaus.org/">Groovy</a>, and I would have used it had I known it at the time.</p>
<p>
	In Java I used the org.apache.xmlrpc:xmlrpc-client:3.1.3 library, which was easily obtained from Maven. I made interfaces MetaWeblog, MovableType, and WordPress to encapsulate the methods that I cared about, and implemented them in a MovableTypeImpl, which is shown below:</p>
<pre class="brush: java">
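import java.net.URL;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import org.apache.xmlrpc.XmlRpcException;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

public class MovableTypeImpl implements WordPress, MovableType, MetaWeblog {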
    private final XmlRpcClient client;
    private final String username;
    private final String password;
    private int blogId;

    public MovableTypeImpl( URL url, String username, String password, int blogId ) {
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        config.setServerURL( url );
        client = new XmlRpcClient();
        client.setConfig( config );

        this.username = username;
        this.password = password;
        this.blogId   = blogId;
    }

    public String[] supportedMethods() throws XmlRpcException {
        Object[] result = (Object[]) execute( &quot;mt.supportedMethods&quot; );
        return Arrays.copyOf( result, result.length, String[].class );
    }

    public String newPage( Date pageDate, String title,
                           String content, String link ) throws XmlRpcException {
        Map&lt;String, Object&gt; item = new HashMap&lt;String, Object&gt;();
        if ( title != null )
            item.put( &quot;title&quot;, title );
        if ( pageDate != null )
            item.put( &quot;dateCreated&quot;, pageDate );
        if ( link != null )
            item.put( &quot;permaLink&quot;, link );

        item.put( &quot;description&quot;, content );

        return execute( &quot;wp.newPage&quot;, blogId, username, password, item ).toString();
    }

    public String newMediaObject( String fileName, byte[] fileData ) throws XmlRpcException {
        Map&lt;String, Object&gt; file = new HashMap&lt;String, Object&gt;();
        file.put( &quot;name&quot;, fileName );
        file.put( &quot;bits&quot;, fileData );

        return execute( &quot;metaWeblog.newMediaObject&quot;, blogId, username, password, file ).toString();
    }

    private Object execute( String method, Object... params )
            throws XmlRpcException {
        return client.execute( method, params );
    }
}</pre>
<p>
	For the conclusion, see the <a href="conversion-to-movable-type---part-3---assets">third part</a>, which covers asset upload.</p>
]]>
    </content>
</entry>

<entry>
    <title>Conversion to Movable Type - Part 1 - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/07/conversion-to-movable-type---part-1.html" />
    <id>tag:gillius.org,2011:/blog//2.231</id>

    <published>2011-07-09T04:10:00Z</published>
    <updated>2011-12-07T09:16:56Z</updated>

    <summary> The original Gillius.org site was entirely static content. In order to move to Movable Type, I needed to first convert the original site&#39;s content in a way that would be compatible with the old one, both in looks, content,...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	The original Gillius.org site was entirely static content. In order to move to Movable Type, I first needed to convert the original site&#39;s content in a way that would be compatible with the old one in looks, content, and even the URLs themselves. I chose MT partly because it appeared to be easier to import into. I also looked at Drupal and WordPress.</p>
<p>
	In this first part, I will cover conversion of the blog posts, conversion of the style, and parsing the original static HTML pages into a format that can be uploaded into MT. Read on for the details.</p>
]]>
        <![CDATA[<h2>
	Why Movable Type?</h2>
<p>
	Ultimately, I chose MT because it allowed for &quot;static publishing&quot; -- meaning it would generate static HTML pages and put them on the site, but still allow for hooks for comments/trackbacks/etc. so that I can update the site easily. The performance of PHP and other dynamic content on my host is not very fast, and I&#39;ve always enjoyed my site&#39;s very low overhead and fast loading, even on mobile devices. I&#39;m not trying to be the next media mogul here.</p>
<p>
	Another requirement was a remotely accessible interface (web service). MT supports XMLRPC, as do other solutions -- there are even &quot;standard&quot; interfaces like the <a href="http://www.xmlrpc.com/metaWeblogApi">MetaWeblog API</a>, and WordPress has an <a href="http://codex.wordpress.org/XML-RPC_wp">API</a> of its own, some of whose calls MT also supports. The biggest problem was that I had a horrible time trying to find actual documentation from MT. Ultimately, to figure out the details of some items, I ended up diving into the source code. Even though I&#39;m not a Perl wizard, I was able to glean enough to see how some of the calls worked. That came in handy to work around or fix some of the quirks/bugs/limitations I encountered when trying to import the site. This is one situation where it&#39;s really handy to work with an open-source project.</p>
<p>
	I noticed two main versions of MT, MT 4 and MT 5. At first I chose MT 5 simply because it was the latest, but then I noticed that, at least at this time, the vast majority of users are still on MT 4. I&#39;m not completely sure why, but my understanding is that MT 5 supports the concept of a &quot;website&quot; in addition to a &quot;blog&quot;, which was important for my site -- it&#39;s mostly a website and secondly a blog, although I wanted to move towards the latter over time.</p>
<p>
	I ended up writing a custom piece of software in Java to convert my site (except for the blog part, which was easy enough to do in pure SQL).</p>
<h2>
	Converting the Blog</h2>
<p>
	The blog was by far the easiest part to convert. The original blog was a simple PHP script that I wrote against a MySQL database. To move this to Movable Type, I just performed a SQL SELECT to put the data into the Movable Type import/export format as documented <a href="http://www.movabletype.org/documentation/appendices/import-export-format.html">here</a>. The only fields in my news items were title, date and content (as string of HTML), which made the export/import straightforward. I was able to do the export with a single statement:</p>
<pre class="brush: sql">
SELECT CONCAT(&#39;TITLE: &#39;, `title`, &#39;\nDATE: &#39;, DATE_FORMAT( `newsDate`, &#39;%m/%d/%Y %H:%i:%s&#39; ), &#39;\n-----\nBODY:\n&#39;, `content`, &#39;\n-----\n--------&#39; )
FROM `News`
ORDER BY `newsDate` ASC</pre>
<p>
	The output then looks like the following:</p>
<pre>
--------
TITLE: Super IsoBomb 3D Announced
DATE: 09/15/2003 00:00:00
-----
BODY:
A continuation to the Super IsoBomb game...</pre>
<p>
	The News table is specific to my site, but if you had a custom system or even another database-based blog, the SQL may be similar. You just need a view that consists of the above items, but more importantly the content needs to be an HTML fragment and not in some markup language.</p>
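<p>
	If the data is easier to reach from a script than from SQL, the same text can be produced in a few lines. This is only a sketch, assuming each row is a (title, date, HTML content) tuple; the function names are mine, and only the fields used by the SELECT above are emitted.</p>
<pre class="brush: python">
from datetime import datetime

def format_mt_entry(title, date, html_body):
    """Render one entry the same way the CONCAT in the SELECT above does."""
    lines = [
        "TITLE: " + title,
        "DATE: " + date.strftime("%m/%d/%Y %H:%M:%S"),
        "-----",
        "BODY:",
        html_body,
        "-----",
        "--------",
    ]
    return "\n".join(lines)

def format_mt_export(rows):
    """Join all entries, one after another, in the order given."""
    return "\n".join(format_mt_entry(*row) for row in rows)

rows = [("Super IsoBomb 3D Announced", datetime(2003, 9, 15),
         "A continuation to the Super IsoBomb game...")]
print(format_mt_export(rows))</pre>
<p>
	As with the SQL version, the body has to be an HTML fragment already; the sketch does no markup conversion or escaping.</p>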
<h2>
	Converting the Style</h2>
<p>
	My static HTML site was originally based on Dreamweaver-style templates, since I worked with Dreamweaver in the late 90s. I used only the basic template functionality: a template page, which you could view on its own, with markers to indicate where content should be placed:</p>
<pre class="brush: html">
&lt;html&gt;&lt;!-- #BeginTemplate &quot;/Templates/main.dwt&quot; --&gt;&lt;!-- DW6 --&gt;
&lt;head&gt;
&lt;!-- #BeginEditable &quot;doctitle&quot; --&gt;
&lt;title&gt;Contact Gillius&lt;/title&gt;
&lt;!-- #EndEditable --&gt;
...
&lt;/head&gt;
&lt;body&gt;
... template content ...
&lt;div class=&quot;contentArea&quot;&gt;&lt;!-- #BeginEditable &quot;content&quot; --&gt;
&lt;!-- #EndEditable --&gt; &lt;/div&gt;
&lt;/body&gt;
&lt;!-- #EndTemplate --&gt;&lt;/html&gt;
</pre>
<p>
	Since everything generated was just plain HTML, after I lost access to Dreamweaver, I could still edit it by hand (using gVim), and upload by SCP. You could also see how easy it would be to replace the template content with regular expressions -- just select everything between the unique tags and replace. Even still, it was a pain and you can see it by the very limited updates to the site in the past years. This was my motivation to move to a content management system.</p>
<p>
	Since my pages were structured this way, I could copy and paste the shared portions of the template into MT&#39;s template design system, and bring over my CSS styles. I had to tweak some of MT&#39;s built-in widget HTML to add/change styles to make it easier to fit into my existing CSS. I thought about a site redesign, but decided to keep things easy for myself and tackle only one problem at a time and try to convert the site verbatim before I redesign when I have the time and motivation.</p>
<h2>
	Exporting the Pages</h2>
<p>
	Ultimately, for the export process, I would need to get the pages into a similar format in memory to export via a series of XMLRPC calls. Pages are basically like blog posts; you need the following:</p>
<ol>
	<li>
		Location of the page (aka the &quot;permalink&quot;)</li>
	<li>
		Date the page was last updated</li>
	<li>
		Title of the page</li>
	<li>
		Content of the page as an HTML fragment</li>
</ol>
<p>
	Since I have the web site mirrored on my disk and in version control, it was easy to get the first two items just by traversing the filesystem and looking at each file&#39;s name and date. Getting the title is possible since it&#39;s in the HTML head, and getting the content was easy because it&#39;s simply everything between the editable-region markers. Therefore, I was able to use the following Java patterns:</p>
<pre class="brush: java">
    public static final Pattern titlePattern = Pattern.compile(
            &quot;(?s)(?i)&lt;html.*?&gt;.*&lt;head&gt;.*&lt;title&gt;(.*)&lt;/title&gt;&quot; );

    public static final List&lt;Pattern&gt; contentPatterns = asList(
            Pattern.compile( &quot;(?s)&lt;!-- #BeginEditable \&quot;content\&quot; --&gt;(.*)&lt;!-- #EndEditable --&gt;&quot; ),
            Pattern.compile( &quot;(?s)(?i)&lt;body.*?&gt;(.*)&lt;/body&gt;&quot; )
    );</pre>
<p>
	The second pattern in the contentPatterns list is in case the first one failed to select any content. Not all of the pages on my site fit into the template, and for those pages I wanted to simply pick out all of the HTML content in the body. There were a few pages I excluded from the export, such as the GNE tutorials, that don&#39;t use my site template at all since they are meant to be distributed standalone. As with any export process, I also found a few &quot;exceptions&quot; to the rule that I just had to smooth over by hand. Now that I had my pages broken up into the 4 pieces, I was ready to upload them with XMLRPC.</p>
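<p>
	The fallback logic itself is small. Here is a sketch of it in Python (my importer was Java, and the function name is made up): it takes a list of compiled regular expressions, each with one capture group like the patterns above, tries them in order, and returns the first capture.</p>
<pre class="brush: python">
import re

def first_match(patterns, html):
    """Return group 1 of the first pattern that matches html, or None.

    patterns is an ordered list of compiled regular expressions (from
    re.compile), most specific first, each with one capture group.
    """
    for pattern in patterns:
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None</pre>
<p>
	A None result means no pattern matched, which is the signal that a page needs to be handled by hand.</p>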
<p>
	The <a href="conversion-to-movable-type---part-2---page-import">second part</a> covers XMLRPC and uploading the static HTML content into Movable Type.</p>
<p>
	The <a href="conversion-to-movable-type---part-3---assets">third part</a> covers the asset (images, zip files, etc) conversion.</p>
]]>
    </content>
</entry>

<entry>
    <title>HTTP Caching and the Refresh Blues - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/07/http-caching-and-the-refresh-blues.html" />
    <id>tag:gillius.org,2011:/blog//2.232</id>

    <published>2011-07-06T00:34:23Z</published>
    <updated>2011-07-07T11:36:14Z</updated>

    <summary> While developing the updated Gillius.org site, I noticed that sometimes the pages would not update when I went to them, until I hit refresh, even after I exited the browser and restarted it. I was using Firefox, which I...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>
	While developing the updated Gillius.org site, I noticed that sometimes the pages would not update when I went to them until I hit refresh, even after I exited the browser and restarted it. I was using Firefox, which I learned makes the problem more apparent than IE does, but I also learned that it was my site&#39;s cache settings that were the problem. I&#39;ve fixed the problem, so if you see content that doesn&#39;t look right, is missing comments, or is out of date, you might have to refresh manually once, but hopefully not again after this. Read on for the details.</p>
]]>
        <![CDATA[<p>
	I know a bit about HTTP and its headers from general knowledge and from some programming of RESTful web services. What I thought was that when the server returns a Last-Modified header in the HTTP response, clients will always issue later HTTP GETs for that page with the If-Modified-Since header. This means that if the page hasn&#39;t been updated since the date in the browser&#39;s cache, the server will return a &quot;304 Not Modified&quot; message and not the whole page. Sounds great, right?<br />
	<br />
	Except that I learned about the Expires and Cache-Control headers. While I knew about Expires, what I didn&#39;t realize is that it REALLY means that if you go to a page before the Expires date, the HTTP cache (which can be your HTTP proxy, if you have one, or the browser itself) is allowed to return the cached version of the content without even contacting the origin website. That was what I didn&#39;t understand.<br />
	<br />
	However, that didn&#39;t explain it all. My server was sending Last-Modified but not the Expires header, and when I updated the page I checked on the server to make sure the file date was updated and correct. So what was wrong? Well, I pulled out a very handy Firefox extension called <a href="http://getfirebug.com/">Firebug</a>, which allows you to see all of the HTTP traffic (or non-traffic, in my case). What I learned is that even when I restarted Firefox and went to my website, there was absolutely no traffic to it. I looked at the cache and saw that some of the pages I was viewing were set to expire months from now. Based on what I was seeing, Firefox would wait 4 months before trying to hit the page unless I hit refresh???<br />
	<br />
	Well, I learned from an <a href="http://blog.httpwatch.com/2008/10/15/two-important-differences-between-firefox-and-ie-caching/">HttpWatch blog posting</a> that RFC 2616 suggests a heuristic when Expires isn&#39;t set: derive an expiration time from Last-Modified, using 10% of the time since the page was last modified. So if the page was updated 10 weeks ago, the cache will set it to expire 1 week from now. Well, some of the pages on my site haven&#39;t been updated since 2006, or 5 years ago, which means Firefox&#39;s cache expiration for them is 6 months.</p>
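<p>
	The arithmetic of that heuristic, as a small Python sketch. It only models the 10% rule as described above; real browsers may add limits or other rules on top, and the function and constant names are my own.</p>
<pre class="brush: python">
from datetime import datetime, timedelta

# 10% of the page's age at fetch time, expressed as a divisor.
HEURISTIC_DIVISOR = 10

def heuristic_expiry(last_modified, fetched_at):
    """Estimate when a cache may treat a response without Expires as stale."""
    age_at_fetch = fetched_at - last_modified
    return fetched_at + age_at_fetch / HEURISTIC_DIVISOR

fetched = datetime(2011, 7, 1)
# A page last modified 10 weeks before the fetch stays fresh for 1 week.
print(heuristic_expiry(fetched - timedelta(weeks=10), fetched))  # 2011-07-08 00:00:00</pre>
<p>
	Plug in a page last modified 5 years before the fetch and the result lands about 6 months out, which matches what I saw in the Firefox cache.</p>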
<p>
	I also learned that Expires doesn&#39;t mean that the cache goes away -- it just means the browser will &quot;revalidate&quot; the page -- that is, contact the origin webserver with an If-Modified-Since, which only returns content if it&#39;s changed. Also, apparently IE&#39;s default settings work more like how I thought browsers always work, which is to revalidate pages in the cache at least once each time you start the browser. Firefox won&#39;t do this, even across a restart.</p>
<p>
	So what is the solution? Instead of setting Expires, you can set Cache-Control&#39;s max-age field to an offset in seconds, which has the same effect as setting Expires to that far in the future from when the browser/proxy fetched the file. You can also set no-cache, which at first I thought meant don&#39;t store it on disk at all, which is not true. All it means is that (at least for Firefox 5, according to Firebug) the cached page is revalidated with the server each time. I could be wrong for IE; maybe max-age=0 is better. Please comment if you have some better suggestions.</p>
<p>
	<a href="http://blog.httpwatch.com/2007/12/10/two-simple-rules-for-http-caching/">Another post on HttpWatch</a> suggests setting no-cache for HTML/dynamic pages, and &quot;expire forever&quot; for all other resources (images, JS, CSS, etc). What if you need to change one of these resources? The suggestion is to change the name. This would be annoying, except that specifying a query parameter is sufficient. I checked and I saw this in practice. If you have site.js, and you update it, change your HTML to link to site.js?2. Your HTTP server won&#39;t care, but the proxy/browser will load it as a new file. The content of the query string is irrelevant except to make it unique.</p>
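<p>
	One way to avoid bumping the number by hand is to derive the query string from the file&#39;s content, so it changes exactly when the file does. The posts above only say the value must be unique; the hash is my own assumption, and the function name is made up. A sketch in Python:</p>
<pre class="brush: python">
import hashlib

def versioned_url(url, content):
    """Return url plus a query string that changes whenever content (bytes) changes."""
    # Any value that differs between versions would do; a short content
    # hash is just a convenient way to get one automatically.
    digest = hashlib.sha1(content).hexdigest()[:8]
    return url + "?" + digest

print(versioned_url("site.js", b"alert(1);"))
print(versioned_url("site.js", b"alert(2);"))</pre>
<p>
	The two printed URLs differ only in the query string, so the browser or proxy treats the second as a new resource while the server still serves the same site.js.</p>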
<p>
	I needed a fast fix, and I&#39;m not sure I can follow that pattern while I&#39;m still tweaking my site heavily, so for now I just set all resources to expire in 2 hours and HTML pages to no-cache. That doesn&#39;t mean resources are completely redownloaded every 2 hours; it just means an &quot;If-Modified-Since&quot; query is made rather than no query at all, and the server will respond &quot;Not Modified&quot; if there has been no change: just a quick ping. I&#39;m not concerned, since I get thousands of hits a month and not millions right now. The following .htaccess is what I&#39;ve got for now:</p>
<pre>
#Force clients to revalidate cache for &quot;resources&quot; every 2 hours
#This is because I&#39;m still commonly changing resources. If I start to
#add &quot;version query parameters&quot; to js files and such, I can bump this
#up for those types
&lt;FilesMatch &quot;\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$&quot;&gt;
  Header merge Cache-Control max-age=7200
&lt;/FilesMatch&gt;

#Except for HTML files, we want to revalidate each time...
#This handles comments/new posts/page edits as well as forums
&lt;FilesMatch &quot;\.(html|htm|xml|txt|xsl)$&quot;&gt;
  Header merge Cache-Control no-cache
&lt;/FilesMatch&gt;</pre>
]]>
    </content>
</entry>

<entry>
    <title>Gillius.org Now Out of the 90s - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2011/07/gilliusorg-now-out-of-the-90s.html" />
    <id>tag:dev.gillius.org,2011:/blog//2.230</id>

    <published>2011-07-04T16:37:19Z</published>
    <updated>2012-01-09T08:32:34Z</updated>

    <summary>Previously, I used to maintain this site as static content (with the exception of the news page). Now I am using the popular blogging software Movable Type to help manage the site. My intention is to start doing a lot...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>Previously, I maintained this site as static content (with the exception of the news page). Now I am using the popular blogging software Movable Type to help manage the site. My intention is to start writing a lot more blog posts, rather than the articles/static content that have been seen (and hardly touched) over the last 10 years.</p>

<p>Moving to a proper system now means that you can search the site, leave comments and trackbacks, and navigate by categories/tags/dates. It also means you will see more content as it is easier for me to post now.</p>

<p>In the short-term, my next posts will detail some of my experiences converting my existing site to Movable Type.</p>]]>
        
    </content>
</entry>

<entry>
    <title>RealDB Finished - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2010/10/realdb-finished.html" />
    <id>tag:dev.gillius.org,2010:/gp_blog//2.38</id>

    <published>2010-10-21T01:36:46Z</published>
    <updated>2012-01-16T09:23:44Z</updated>

    <summary>As of October 5th, I&apos;m finished with all of the requirements of my master&apos;s degree! I gave a presentation of RealDB, which can be seen on its page. Source code is also there, and the project is licensed under GPL...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[<p>As of October 5th, I'm finished with all of the requirements of my master's degree! I gave a presentation of RealDB, which can be seen on <a href="/realdb">its page</a>. Source code is also there, and the project is licensed under GPL v3.</p>
<p>I know that it's been some time, but I plan on bringing back the site somewhat as a blog. I've been wanting to do that more for a time but felt weird putting it here, as it is the "news" section. I might just make the posts here, or maybe I will start up a separate section just for the blog.</p>]]>
        
    </content>
</entry>

<entry>
    <title>GNE Page Updated - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2008/11/gne-page-updated.html" />
    <id>tag:dev.gillius.org,2008:/gp_blog//2.37</id>

    <published>2008-11-25T03:33:33Z</published>
    <updated>2011-12-06T09:17:22Z</updated>

    <summary>After responding to a bug report in the forums, I updated the link on the GNE subversion, which was obsolete since SourceForge changed their SVN URL scheme. Users working with GCC 4.3.0 should now be able to compile GNE when...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[After responding to a bug report in the forums, I updated the link on the <a href="/gne">GNE</a> subversion, which was obsolete since SourceForge changed their SVN URL scheme. Users working with GCC 4.3.0 should now be able to compile GNE when working with version 0.75.]]>
        
    </content>
</entry>

<entry>
    <title>Master&apos;s Project - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2008/05/masters-project.html" />
    <id>tag:dev.gillius.org,2008:/gp_blog//2.36</id>

    <published>2008-05-21T03:24:41Z</published>
    <updated>2011-01-10T02:04:09Z</updated>

    <summary>I have created a site for the work on my master&apos;s project, RealDB. More information is coming soon on its site....</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        <![CDATA[I have created a site for the work on my master's project, RealDB. More information is coming soon on its <a href="/realdb">site</a>.]]>
        
    </content>
</entry>

<entry>
    <title>CSS Handheld Profile Support - GP Blog</title>
    <link rel="alternate" type="text/html" href="http://gillius.org/blog/2006/09/css-handheld-profile-support.html" />
    <id>tag:dev.gillius.org,2006:/gp_blog//2.35</id>

    <published>2006-09-13T03:13:22Z</published>
    <updated>2011-01-10T02:04:09Z</updated>

    <summary>Sorry for the news spam, but one last item (seriously)... I&apos;ve also added handheld support to the CSS profile. I used ems in the CSS sizing before for the site and it always looked well, but now I have some...</summary>
    <author>
        <name>Jason Winnebeck</name>
        <uri>http://gillius.org/</uri>
    </author>
    
    
    <content type="html" xml:lang="en-us" xml:base="http://gillius.org/blog/">
        Sorry for the news spam, but one last item (seriously)...  I&apos;ve also added handheld support to the CSS profile.  I used ems in the CSS sizing before for the site and it always looked well, but now I have some handheld-specific tweaks.  I&apos;ve tested this functionality against IE in Windows Mobile 2005 only so far.  I plan on doing a little more tweaking in the next few days but please make comments if you have any.
        
    </content>
</entry>

</feed>

