Saturday, September 26, 2009

A False Equation: Results do not equal Intentions

It amazes me how regularly I come across phrases like "We have to try something!" or "We will create a law making it so." I believe these ideas are rooted in the misconception that action based on intention will yield the desired result. An important rule of thumb to keep in mind when dealing with humans is that results do not, in fact, equal intentions. Many times actions even have the exact opposite of the outcome that was hoped for.

One classic formulation of this is the Law of Unintended Consequences. This is basically the observation that reality is complex and that acting on one variable often affects another variable in an unexpected way. In other words, each action has multiple effects. Common examples of this are the Treaty of Versailles in dealing with Germany after World War I, the funding of the Afghan Mujahideen, which led to the rise of Al-Qaeda, and price controls resulting in the scarcity of goods (look into rent control or Nixon's gasoline price controls in the 1970s).

Groups can be especially prone to this fallacy, as they often elect committees or recruit "experts" who are tasked with carrying out their intentions. This predictably results in simplistic top-down thinking being applied to complex areas, leading to failure to achieve the desired results. Working in the software industry, I have observed many a multi-million-dollar project left with only disappointed customers and wasted resources to show for its intentions.

One group in particular falls victim to this fallacy with egregious results: the state. When people decide that they want others to act in certain ways, they assume they can create state institutions to be the one-size-fits-all solution to the problem. They believe the promises of politicians, who are chosen based only on their intentions and seldom held accountable for their results (due to time lag and other well-documented factors: see Sowell, Mises, or Rothbard for exhaustive analysis). This can be very damaging to society, as the state is such a strong hammer and, once selected, it does not permit competition. This is because state solutions appropriate large amounts of resources and are not subject to the feedback of pricing and profit, erect barriers to entry in the form of licensing and regulation (FDA), and sometimes resort to outright monopoly (post office).

When seeking solutions to problems, keep in mind that merely wishing something to be does not make it so. Nor does spending massive resources and taking large, sweeping actions necessarily achieve the desired end, and such actions often create huge side effects that can be worse than the original problem. With these things in mind, be aware that using the state as the default answer for how things should be accomplished is not a reasonable position; the burden of proof is high.

Sunday, August 30, 2009

Intro to Anarchism without Adjectives on YouTube

My views on politics and reality have been enormously helped along by a fantastic community on YouTube putting together terrific videos on all areas of thought. Here are my recommendations on where to start.

Confederal Socialist

This is my favorite video describing why it is exciting and desirable to think as an anarchist.

A comparison of church and state.

Visit his channel and watch all his videos starting from the beginning: you will not regret it.

junior00bacon00chee

A simple explanation of why top down planning is not desirable:

Wednesday, February 04, 2009

Scaling Notes

Don't!

Yes, the tried and true advice: don't scale until you must. Scaling takes time and energy away from creating something that people want. Make sure you do that first. Once you have people banging down your door, read on.

Monitoring/Logging

Related to the previous note: if you don't know what your current systems are doing, you can't effectively plan or design for the next stage. I wasted 2 weeks rewriting a Ruby component as a Java implementation, only to realize the effort was unnecessary: Ruby was performing just fine, thank you very much. Metrics like 99th percentile requests per second, reported once per hour, can easily be gathered from request logs with standard Unix tools.

cron, grep, awk or a small ruby script, Nagios

cat request.log | grep "POST /messages" | ruby times.rb
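
The times.rb script itself is not shown, so here is a minimal sketch of what such a script might do. It assumes, purely for illustration, that the last whitespace-separated field of each log line is the request duration in milliseconds; the sample lines, the durations helper, and the percentile helper are all hypothetical.

```ruby
# Nearest-rank percentile of a non-empty list of numbers.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Assumption: the last whitespace-separated field is the duration in ms.
# Lines whose last field is not a number are skipped.
def durations(lines)
  lines.map { |line| Float(line.split.last.to_s, exception: false) }.compact
end

# Hypothetical sample of already-grepped log lines.
SAMPLE_LINES = [
  "10:00:01 POST /messages 200 12",
  "10:00:02 POST /messages 200 15",
  "10:00:02 POST /messages 200 11",
  "10:00:03 POST /messages 500 250",
  "10:00:04 POST /messages 200 14"
].freeze

times = durations(SAMPLE_LINES)
puts format("count=%d p50=%.0fms p99=%.0fms",
            times.size, percentile(times, 50), percentile(times, 99))
```

To use it at the end of the pipeline above, pass $stdin.each_line to durations instead of the sample array, and let cron run it once per hour.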

Procrastinate

From your logging and planning you should know what kind of loads you have. Some tasks roll in all at once but can be done later. Push these to the background or schedule them for later. Background threads, work queues, and simple file drops/cron jobs are great ways to distribute the load. Many types of processing are more efficient when done in a batch mode. If it's an option, do it.

JMS, ActiveMQ, cron
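
As a rough illustration of the background-thread and work-queue idea, here is an in-process sketch using Ruby's built-in Queue. It is a stand-in, not a replacement for a real broker like the ones listed above; the method name and the :stop marker are made up for this example.

```ruby
# Runs the given tasks on a single background thread and returns their results.
# The caller only pays for an enqueue; the work itself happens off to the side.
def drain_in_background(tasks)
  queue = Queue.new
  results = []

  worker = Thread.new do
    # Queue#pop blocks until something is available.
    # :stop is a hypothetical marker that tells the worker to finish.
    while (task = queue.pop) != :stop
      results << task.call
    end
  end

  tasks.each { |task| queue << task }
  queue << :stop
  worker.join
  results
end

# Hypothetical deferred jobs, e.g. sending notifications after a request returns.
jobs = (1..3).map { |i| -> { "notified user #{i}" } }
puts drain_in_background(jobs)
```

With one worker the jobs finish in the order they were queued. A production setup would want persistence and retries, which is what the dedicated queueing tools provide.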

Share the load

Embrace the stateless! The world without side effects allows you to easily add nodes and load balance. Sometimes you can fake stateless through sharding and replication. The goal is to get as much work as possible done by small, short-running tasks that can be accomplished on one of many nodes.

HAProxy, ec2, rsync
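
A toy sketch of why stateless work is easy to spread around: if a handler's result depends only on its arguments, any node can take any request, and simple round-robin is enough. The node names and the handle and dispatch methods are hypothetical; in practice a load balancer such as HAProxy does the dispatching.

```ruby
# Hypothetical node names; a real pool would come from configuration.
NODES = %w[app1 app2 app3].freeze

# A stateless handler: the result depends only on the arguments,
# so it does not matter which node runs it.
def handle(node, request)
  "#{node} handled #{request}"
end

# Round-robin: walk the pool in order, one node per request, wrapping around.
def dispatch(requests, nodes = NODES)
  pool = nodes.cycle
  requests.map { |request| handle(pool.next, request) }
end

puts dispatch(%w[r1 r2 r3 r4])
```

Adding capacity is then a matter of adding a name to the pool, since no node holds anything the others need.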

Memoize (aka cache)

Just as functional languages have shown us that stateless calls can gain drastic perf benefits from memoizing (recording the previous work done so you don't have to do it again), so can your component system benefit through stateless services and caching.

- memcache
- http caching proxy
- almost cache (cache all but the most recent changes to minimize work and focus optimization)

memcache, squid, disk
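
Here is a minimal in-process sketch of memoizing a side-effect-free call. The Memoizer class is hypothetical and only illustrates the idea; a shared cache like memcache plays the same role across processes and machines.

```ruby
# Wraps a side-effect-free computation and remembers its results per key.
class Memoizer
  attr_reader :misses

  def initialize(&compute)
    @misses = 0
    # The Hash default block runs only on a cache miss: compute, store, return.
    @cache = Hash.new do |hash, key|
      @misses += 1
      hash[key] = compute.call(key)
    end
  end

  def [](key)
    @cache[key]
  end
end

# Stand-in for an expensive lookup or calculation.
squares = Memoizer.new { |n| n * n }
squares[12]
squares[12]
puts "value=#{squares[12]} misses=#{squares.misses}"
```

This only works because the wrapped call is stateless. If the result can change underneath you, the cache also needs an expiry or invalidation policy.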

Partition/shard

Think about how to partition your data and algorithms. This comes into play in many areas, from database query size to full-text search size. Eventually you will outrun the CPU, memory, or storage capacity of an individual node. You need to determine your requirements for each resource and prioritize in a triage-like manner.
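
One common way to partition data is to hash each key to a shard. The sketch below assumes a fixed, hypothetical list of shard names and uses CRC32 from Ruby's standard zlib library so that the mapping is stable across processes (String#hash is not).

```ruby
require "zlib"

# Hypothetical shard names; a real list would come from configuration.
SHARDS = %w[db0 db1 db2 db3].freeze

# Maps a key to one shard. The same key always lands on the same shard,
# so both lookups and stores know where to go.
def shard_for(key, shards = SHARDS)
  shards[Zlib.crc32(key.to_s) % shards.size]
end

%w[user:1 user:2 user:3].each do |key|
  puts "#{key} -> #{shard_for(key)}"
end
```

The trade-off of plain modulo hashing is that changing the number of shards remaps most keys, so it pays to think about growth before picking the scheme.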

Disk I/O

Before you waste time worrying about disk I/O, RAID schemes, or SAN tactics, you need to first ask yourself: should this be on mechanical disk? If you need to serve thousands of requests per second or more, the answer is likely NO! Fancy disk subsystems buy you time; they do not buy you orders of magnitude. A typical seek time on a drive is around 10ms. This limits you to around 100 requests per second per spindle. You are looking at, at best, 1 order of magnitude over this with many 15k SCSI drives, expensive controllers, and lots of ops setup/monitoring.

If you are limited to 4 spindles per instance, then you will need at least 10 times as many instances to achieve the same perf as an in-memory service. So you need to find out: how big is my data? How many instances would it take to fit it into RAM? Is it partitionable (both lookup and store)? A cost-benefit analysis here can be enlightening.

iostat, tmpfs, MySQL, memcache
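
The back-of-the-envelope numbers above can be put into a small calculator. The 10ms seek time and 4 spindles per instance come from the text; the target load, data size, and RAM per instance are made-up inputs, and the model assumes one seek per request.

```ruby
# Requests per second one spindle can serve if each request costs one seek.
def requests_per_spindle(seek_ms)
  1000.0 / seek_ms
end

# Instances needed to serve target_rps from mechanical disk.
def disk_instances_needed(target_rps, spindles_per_instance, seek_ms)
  per_instance = requests_per_spindle(seek_ms) * spindles_per_instance
  (target_rps / per_instance).ceil
end

# Instances needed to hold the whole data set in RAM.
def ram_instances_needed(data_gb, ram_gb_per_instance)
  (data_gb.to_f / ram_gb_per_instance).ceil
end

# Hypothetical inputs: 5000 requests/s, 60 GB of data, 16 GB RAM per instance.
puts "per spindle: #{requests_per_spindle(10)} req/s"
puts "disk-bound instances: #{disk_instances_needed(5000, 4, 10)}"
puts "RAM-bound instances: #{ram_instances_needed(60, 16)}"
```

Comparing the two instance counts for your own numbers is the cost-benefit analysis in its simplest form.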