Wednesday, February 04, 2009

Scaling Notes

Don't!

Yes, the tried and true advice: don't scale until you must. Scaling takes time and energy away from creating something that people want. Make sure you do that first. Once you have people banging down your door, read on.

Monitoring/Logging

Related to the previous note: if you don't know what your current systems are doing, you can't effectively plan or design for the next stage. I spent 2 weeks rewriting a Ruby component in Java only to realize the effort was wasted; Ruby was performing just fine, thank you very much. Metrics like 99th percentile requests per second, reported once per hour, can easily be gathered from request logs with standard Unix tools.

cron, grep, awk or a small ruby script, Nagios

cat request.log | grep "POST /messages" | ruby times.rb
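The times.rb script itself is not shown here. As a rough sketch of what such a script might compute, the following counts requests per second and reports the 99th percentile of those counts. The log format (a Unix timestamp as the first field of each line) and the sample lines are assumptions made for illustration, not the real request.log.

```ruby
require 'stringio'

# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(values, pct)
  return nil if values.empty?
  sorted = values.sort
  rank = (pct / 100.0 * sorted.length).ceil
  sorted[[rank, 1].max - 1]
end

# Assumed (hypothetical) log format: the first whitespace-separated field
# is a Unix timestamp in seconds. Seconds with no requests are not counted.
def requests_per_second(io)
  counts = Hash.new(0)
  io.each_line do |line|
    stamp = line.split.first
    counts[stamp] += 1 if stamp
  end
  counts.values
end

# Made-up sample data standing in for the grep output.
sample_log = StringIO.new(<<~LOG)
  1233705600 POST /messages 200
  1233705600 POST /messages 200
  1233705600 POST /messages 500
  1233705601 POST /messages 200
  1233705603 POST /messages 200
  1233705603 POST /messages 200
LOG

rates = requests_per_second(sample_log)
puts "busy seconds=#{rates.length} p99=#{percentile(rates, 99)} req/s"
```

To use this in a pipeline like the one above, pass $stdin to requests_per_second instead of the sample StringIO.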

Procrastinate

From your logging and planning you should know what kind of loads you have. Some tasks roll in all at once, but can be done later. Push these to the background or schedule them for later. Background threads, work queues, and simple file drops/cron jobs are great ways to distribute the load. Many types of processing are more efficient when done in batch mode. If it's an option, do it.

JMS, ActiveMQ, cron
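A minimal in-process sketch of the work queue idea, using only Ruby's standard Queue and Thread. The job names and the upcase "work" are placeholders; a real setup would more likely put a broker such as ActiveMQ between the request path and the workers.

```ruby
require 'thread'

jobs = Queue.new
results = Queue.new

# Two background workers drain the queue.
workers = 2.times.map do
  Thread.new do
    # pop blocks until a job is available; :done tells the worker to exit
    while (job = jobs.pop) != :done
      results << job.upcase # stand-in for the real slow work
    end
  end
end

# The request path only enqueues and returns immediately.
%w[resize-image send-email].each { |job| jobs << job }

# Shut down: one :done marker per worker, then wait for them to finish.
workers.size.times { jobs << :done }
workers.each(&:join)

processed = []
processed << results.pop until results.empty?
puts processed.sort.inspect
```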

Share the load

Embrace the stateless! The world without side effects allows you to easily add nodes and load balance. Sometimes you can fake stateless through sharding and replication. The goal is to get as much work as possible done by small, short-running tasks that can be accomplished on any one of many nodes.

HAProxy, ec2, rsync

Memoize (aka cache)

Just as functional languages have shown us that stateless calls can gain drastic perf benefits from memoizing (recording the previous work done so you don't have to do it again), so can your component system benefit through stateless services and caching.

- memcache
- http caching proxy
- almost cache (cache all but the most recent changes to minimize work and focus optimization)

memcache, squid, disk
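A small sketch of memoizing a side-effect-free lookup. The in-process Hash is only a stand-in for memcache or a caching proxy, and the MemoizedLookup class name is made up for this example.

```ruby
# Wraps a slow, side-effect-free computation with an in-process cache.
class MemoizedLookup
  attr_reader :misses

  def initialize(&compute)
    @compute = compute
    @cache = {}
    @misses = 0
  end

  # Returns the cached value if present; otherwise computes and stores it.
  def fetch(key)
    return @cache[key] if @cache.key?(key)
    @misses += 1
    @cache[key] = @compute.call(key)
  end
end

# The sleep stands in for an expensive query or remote call.
slow_square = MemoizedLookup.new { |n| sleep 0.01; n * n }
3.times { slow_square.fetch(12) }
puts "value=#{slow_square.fetch(12)} misses=#{slow_square.misses}"
```

This only works because the wrapped call is stateless: the same key must always produce the same answer, or the cache will serve stale results.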

Partition/shard

Think about how to partition your data/algorithms. This comes into play in many areas, from database query size to full text search size. Eventually you will outrun the CPU, memory, or storage capacity of an individual node. You need to determine your requirements for each resource, and prioritize in a triage-like manner.
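One simple way to partition by key is to hash the key and take it modulo the number of nodes. The sketch below assumes four hypothetical node names and uses CRC32 from Ruby's standard zlib library as the hash.

```ruby
require 'zlib'

# Hypothetical node names; substitute your own hosts.
NODES = %w[db0 db1 db2 db3].freeze

# The same key always maps to the same node, so both lookup and store
# can find the right partition without a central directory.
def shard_for(key, nodes = NODES)
  nodes[Zlib.crc32(key.to_s) % nodes.length]
end

%w[alice bob carol].each { |user| puts "#{user} -> #{shard_for(user)}" }
```

The trade-off with plain modulo hashing is that changing the number of nodes remaps most keys; consistent hashing is a common alternative when nodes come and go often.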

Disk I/O

Before you waste time worrying about disk I/O, RAID schemes, or SAN tactics, you need to first ask yourself: should this be on mechanical disk? If you need to serve thousands of requests per second or more, the answer is likely NO! Fancy disk subsystems buy you time; they do not buy you orders of magnitude. A typical seek time on a drive is around 10ms. This limits you to around 100 requests per second per spindle. You are looking at, at best, 1 order of magnitude over this with many 15k SCSI drives, expensive controllers, and lots of ops setup/monitoring.

If you are limited to 4 spindles per instance, then you will need at least 10 times as many instances to achieve the same perf as an in-memory service. So you need to find out: how big is my data? How many instances would it take to fit it into RAM? Is it partitionable (both lookup and store)? A cost-benefit analysis here can be enlightening.
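The back-of-the-envelope math can be written down as a tiny script. The 10ms seek time and the 4 spindles per instance come from the text above; the target request rate, data size, and RAM per instance are hypothetical numbers to replace with your own. It also assumes roughly one seek per request.

```ruby
SEEK_SECONDS = 0.010          # typical seek time quoted above
SPINDLES_PER_INSTANCE = 4     # per-instance limit used above

# About 100 seeks per second per spindle, so about 400 per instance.
seeks_per_spindle = (1 / SEEK_SECONDS).round
seeks_per_instance = seeks_per_spindle * SPINDLES_PER_INSTANCE

# Hypothetical workload figures.
target_rps = 5_000
data_gb = 60
ram_gb_per_instance = 16

# Instances needed if every request costs one disk seek.
disk_instances = (target_rps.to_f / seeks_per_instance).ceil
# Instances needed just to hold the whole data set in RAM.
ram_instances = (data_gb.to_f / ram_gb_per_instance).ceil

puts "disk-bound instances: #{disk_instances}"
puts "instances to fit data in RAM: #{ram_instances}"
```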

iostat, tmpfs, MySQL, memcache

Sunday, December 21, 2008

Memory Leak in ActiveMessaging

ActiveMessaging is great. It allows you to easily hook up to ActiveMQ to offload all your batch processing needs. The only problem is that it eats memory like crazy. Just hook up a simple queue with a publisher and consumer, write a few hundred thousand tickets, and watch the consumer eat all your available memory (it will quickly eat a couple hundred megs and go on to use more than a GB).

In gateway.rb, there is a dispatch method that routes the message to the appropriate processor:


def dispatch(message)
  @@guard.synchronize {
    begin
      prepare_application
      _dispatch(message)
    rescue Object => exc
      ActiveMessaging.logger.error "Dispatch exception: #{exc}"
      ActiveMessaging.logger.error exc.backtrace.join("\n\t")
      raise exc
    ensure
      reset_application
    end
  }
end

If you comment out the prepare_application and reset_application calls, the memory growth stops. You can chew through millions of tickets and stay at a steady usage. The only problem is that ActiveRecord will no longer keep its MySQL connection fresh, so you will eventually get a Mysql::Error: MySQL server has gone away.

These methods seem to wedge deep into Rails' dispatch foo. Somewhere in there, it is likely validating the connection. So the trick will probably be to override the process!(message) method of the base processor class, rescue Mysql::Error, call ActiveRecord::Base.verify_active_connections!, and retry.
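The shape of that rescue-and-retry override might look roughly like the sketch below. It is unvalidated and deliberately self-contained: StaleConnectionError stands in for Mysql::Error, BaseProcessor stands in for the real base processor class, and refresh_connection stands in for the ActiveRecord::Base.verify_active_connections! call. All three are stubs invented for this example so it can run without Rails or a database.

```ruby
# Stand-in for Mysql::Error so the sketch runs without a database.
class StaleConnectionError < StandardError; end

# Stand-in for the base processor class; only process! is modeled.
class BaseProcessor
  def process!(message)
    on_message(message)
  end
end

class TicketProcessor < BaseProcessor
  MAX_RETRIES = 1

  attr_reader :reconnects

  def initialize
    @reconnects = 0
    @connection_fresh = false # simulate a connection that has gone stale
  end

  # Retry once after refreshing the connection; re-raise if it fails again.
  def process!(message)
    attempts = 0
    begin
      super
    rescue StaleConnectionError
      raise if attempts >= MAX_RETRIES
      attempts += 1
      refresh_connection # where the real connection check would go
      retry
    end
  end

  def on_message(message)
    raise StaleConnectionError, "server has gone away" unless @connection_fresh
    "processed #{message}"
  end

  private

  def refresh_connection
    @reconnects += 1
    @connection_fresh = true
  end
end

processor = TicketProcessor.new
puts processor.process!("ticket-1")
puts "reconnects=#{processor.reconnects}"
```

Capping the retries matters: without the limit, a database that is really down would turn every message into an infinite loop.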

I will update this once I can validate whether this fixes the stale connection issue, if I run into any other issues, or if any kind commenter leaves the answer.

Monday, November 17, 2008

Entrepreneurs, Love, and Signals

Intelligences get things done by planning, acting, and then making continual adjustments. In order to be successful they must be able to evaluate their current course against where they were planning on going. If things are off, they adjust, change direction, speed, etc. to get back on the right path. As the old tired quote goes, “How do you get to the top of a mountain? Just make sure each step leads you higher”.

Now we get signals along the way that indicate how we are doing. A simple signal would be a measurement. Are we higher or lower than last step? This can then be compared with the desired goal if it is itself measurable (I want to make it to 5000 ft). In this way we can make adjustments with each step. We can even make sophisticated decisions, like: should I go down 10 steps to the bridge to cross, so I do not have to go down 1000 steps to get across a gorge?

Not all signals are equal. Some goals have very good signals (like in our example of height). Others have much more complex signals. A good example of a goal with complex signals is love. Love is many things to many people, but usually has something to do with prioritization of another over oneself, or at least of the one loved over others. But this notion of prioritization is very complex and has to do with hundreds of social conventions and traditions. Even amongst personality types we see differences in what is perceived as meeting this goal.

In fact almost all interpersonal relations are subject to these same complex signals, from friendship, to family, to gift giving, politics, and leadership. All are enormously complex, with a multitude of signals that vary in their significance from person to person and even from time to time for the same individual.

Now a given intelligence is only capable of so much planning and signal processing. Like any other resource, planning, adjusting, and signal processing are subject to the limits of time, processing power, and material resources that can be used in goal attainment. This is a two-way street. Interpersonal relations are most commonly a very intimate one-on-one type of goal. Even the best individuals are by themselves capable of attaining goals with only a very small set of relations (relative to the total number of people in existence).

Now in any society of individuals, a system of trade (a market) inevitably comes into play. People produce and consume goods and services, and through specialization come to use a common medium of exchange called a currency, or more commonly, money. (This is obviously a very cursory treatment of the subject. For those so inclined I highly recommend the books Human Action and The Theory of Money and Credit by Von Mises, and Man, Economy, and State by Rothbard.) Prices (the costs in money of specific goods) become a universal signal for making decisions relating to the market. Prices adjust relative to supply and demand for any specific good, which reflect the aggregate desires of individuals across a huge section of everyday life (remember, labor is just as much of a good as any other).

Entrepreneurs use these market signals (prices) to attain their goal (profit). In so doing they fulfill the desires of a great number of individuals in addition to their own. By creating a given good, and with an advanced distribution network such as we have developed today, they can impact the lives of millions. A humble farmer can now, with current technology, produce enough food for hundreds of individuals. A clothing designer can create fashion to be enjoyed by millions.

Markets effectively offer a very simple signal which masks the complexity of the underlying system of interpersonal signals. By using the simplified market signal, we are able to achieve a scale far greater than that which is available to us on a direct interpersonal level.

For the few interpersonal relations you have resources for, by all means work towards your goals. They are greatly rewarding. But if you want to maximize your goal attainment, then use the market's signals and produce!

Sunday, September 28, 2008

Meta and Politics

Sometimes we as humans are drawn to simple little systems, where the rules are clearly laid out, and the basic axioms are short. Things like Newtonian physics and Fibonacci sequences. When these simple systems have the ability to combine their rules in an open ended manner, the results can become quite complex, even though the ground rules are simple. These systems are happily adapted to computers which can handle insanely detailed application of these rules.

Reality however, is rarely one of these systems. Real systems tend to be much more complex. Intelligence for example, is certainly not readily describable in terms of a few basic axioms. Categorization, pattern matching, and allegory are all heavily dependent on the Meta. Anything dealing with general intelligence, communication, or flexible systems tends to be heavily imbued with Meta rules, that is, rules for describing rules.

When a system at its base is not a set of rules applied to the direct problem domain, but is instead made up of ground rules for the creation of rules depending on the current situation, the potential for complexity soars. These meta rules allow the system to be very adaptable, and tend to fool our overly aggressive causal tendencies by producing counter-intuitive results. In fact, many times these Meta rules are completely hidden from the outside observer, masked by the generated rules for a given circumstance.

Many Meta systems are inherently recursive, not only allowing for several levels of rules, but also rules that depend on and are defined in terms of themselves. This creates a situation in which all the rules can be explained, but the implications of those same rules are largely unknown (even when they are not hidden). The lines between Meta-rule and rule tend to blur.

Human intelligence appears to be highly meta, insanely complex, and capable of producing very unexpected results. Economics is the study of these systems in action, in parallel with billions of other similar systems all interacting at once. The resultant complexity astounds the mind. When so called "economists" are called upon to make predictions as to the future state of this massive interaction of complex systems (itself now a new monster system called the economy), one can only laugh, and shake one's head, as we watch the circus of those pretending to be the masters of meta fall down again and again.

So create your wrong-headed, doomed "Rescue" bills and regulations. Sell them to the public as a snake oil salesman fleeces the poor and downtrodden. And as disaster strikes yet again, sleep tight with the defense of "it couldn't be helped!", and "no one can blame us, we listened to the experts!".

Sunday, September 14, 2008

Meaningful Phrases

I am trying to recall some of the moments in my past where I had or read a particular thought that produced or summed up a great deal of the way I think. Here is a sample of what came to mind.

People are fundamentally selfish. They act according to their most "perceived beneficial" action at all times. I was first exposed to this concept formally in C.S. Lewis' Mere Christianity. It was later spelled out in exhaustive detail by Von Mises' Human Action and Murray Rothbard's Man, Economy, and State, and forms the basis of the Austrian School of Economics.

The Utopian vision of man vs the corrupt image of man. I first read about this in Thomas Sowell's book Basic Economics. Basically: is man approaching some far-off super society where he corrects his flaws and lives in harmony, or is where we are now basically where we have been all along and will always be? You could split most political, moral, and social systems of thought with this one question. I tend to believe in the corrupt version.

The amount of information in a signal varies directly with how random it is. Put another way, the ability to compress information varies inversely with how random it is. As an example, the particle movements of a wave crashing on the beach have an incredible amount of information in them. It would take incredible amounts of computer horsepower to simulate them exactly. The main interest for me is that complexity is not in and of itself useful or powerful. A more complex system is not fundamentally better, and the reverse is more generally true.

Humans are rationalizing creatures rather than rational ones. The world is far more random than we think. This was an intuitive hunch for most of my life, with pointers from Steven Pinker's books on the mind, but it was laid out beautifully by Taleb in The Black Swan and Fooled by Randomness. People are fundamentally wired to see cause and effect in every situation and believe they know these causes, even in very complex systems. Humans are continually fooled by selection bias, and we share a universal inability to handle probabilistic thinking. This explains hero worship, our inability to learn from history, and the extreme epistemic hubris of most people. The appeal to me, and the defining characteristic of the Austrian school of economics, is its humility in regards to complex systems and its reluctance in determining cause and effect in history.

These are a few that came to mind this afternoon. What are yours?

Monday, August 18, 2008

Priming Surveys

Checking out frogmetrics.com and thinking it is really cool, but reminds me...

How many times have you seen a survey with the answer being from 1-10 or 1-5 stars? Every time I come across one of these I wonder how in the world they aggregate these responses in a meaningful way. A very level person will answer most things close to the center (everything is a 4-6), whereas an excitable personality tends to hit the extremes. They both likely meant the same thing by their feedback. The same goes for Netflix movie rankings, for example, where they then try to give me recommendations based on what other users felt. The problem is that the level heads and the excited are all mixed in together.

So is there a way you could get a sense of what personality they are, to help classify their answers? What if you asked a single question somewhere in the survey that was a "primer" question? Something that has a decent emotional response, like: how would winning 100 dollars make you feel, 1-10?
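One naive way to use such a primer, offered only as a sketch of the idea and not as a validated method: express each answer as a fraction of that respondent's own primer answer, so that a level head and an excitable respondent land on comparable scales. The primed helper and the sample respondents are made up for illustration.

```ruby
# Express each answer relative to the respondent's own primer answer.
def primed(answers, primer)
  answers.map { |a| (a.to_f / primer).round(2) }
end

# Hypothetical respondents: same opinions, different scoring temperament.
level_headed = { :primer => 6,  :answers => [5, 4, 6] }
excitable    = { :primer => 10, :answers => [8, 7, 10] }

[level_headed, excitable].each do |r|
  puts primed(r[:answers], r[:primer]).inspect
end
```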

Friday, August 15, 2008

Named Parameters in Ruby

Ever forget whether the name or email parameter is first on a method like this?

signup(name,email)

Named parameters can help out so that you don't have to remember the order of parameters. As a fringe benefit the code can read a bit nicer and you can add in new optional parameters without breaking existing code.

Ruby as a language doesn't support named parameters to functions. But it does support converting named pairs into a hash if you provide an argument for it.


def signup(params)
  name = params[:name]
  email = params[:email]
  ...
end


This takes a little more work on the function declaration but it's not too bad. Now we can call the function like this:

signup(:name=>'Me', :email=>'me@net.com')

Suppose you wanted your name parameter to be optional and default to the email parameter. You can easily set default values for one or more of your expected parameters:


def signup(params)
  email = params[:email]
  name = params[:name] || email
  ...
end


With named parameters it often behooves you to do a bit more parameter checking.


def signup(params)
  email = params[:email] || raise("email parameter is required!")
  name = params[:name] || email
  ...
end
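Putting the pieces together, here is a runnable version to try. The returned string is just a stand-in for the elided method body, and raising ArgumentError (rather than the default RuntimeError) is a choice made for this sketch so callers can rescue it specifically.

```ruby
def signup(params)
  # Required parameter: fail loudly if it is missing.
  email = params[:email] || raise(ArgumentError, "email parameter is required!")
  # Optional parameter: defaults to the email.
  name = params[:name] || email
  "#{name} <#{email}>" # stand-in for the real signup work
end

puts signup(:name => 'Me', :email => 'me@net.com')
puts signup(:email => 'me@net.com')

begin
  signup(:name => 'Me')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```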


To make all parameters optional, set a default value for your parameter to {}.

def optional(params={})