Carl the Zealot

April 2, 2009

Becoming one with Eclipse…

Filed under: Code, Java, Linux, News, Open Source, Software, Troubleshooting — Carl Myers @ 3:26 am

Hey all… Been playing around with Eclipse a lot of late. Yes, yes, as a die-hard vim zealot, it is blasphemy. But let me tell ya, this Java stuff could really grow on me.

As part of my personal experimentation, I went ahead and installed Eclipse and the SVN plugin. I ran into a problem I feel I should blog about for posterity. When I tried to load a project from my subversion repository using Subclipse, I got this error:
Unable to load default SVN Client

Of course, the message was very confusing since the SVN CLI was in fact already installed on my system, and contrary to similar blog posts, not an old version, rather the correct one (1.5+). The problem was the SVN Java bindings were missing. For Debian, you apt-get install libsvn-java. On other platforms, search for “JavaHL” and it should pop right up in your package management (if not, get yourself a better distro! =)

I’m sure more blog entries on my efforts to “vim-itize” eclipse will be coming soon… ;D

Edit: I also had to add the following to my eclipse.ini:

-Djava.library.path=/usr/share/java/
-Djava.library.path=/usr/lib/jni/

Edit 2: I also had to run update-alternatives --config java and choose a 1.5 JDK, as the svn client library didn’t yet support 1.6.

Edit 3:Also make sure that you are using Subclipse 1.4.X if you are using Subversion 1.5.X client/server, and Subclipse 1.6.X if you are using Subversion 1.6.X client/server. There are two different eclipse update server URLs for the two different minor versions of Subclipse. =)

September 30, 2008

Remembering Weird Dreams

Filed under: Code, Uncategorized — Carl Myers @ 11:57 pm

I almost never remember my dreams. I’m not even sure I have them very often. I slept particularly well last night, and I had a dream, and I woke up remembering it. What follows is every detail I can recall, I went to blog it immediately so I wouldn’t forget anything, since this is such a rare occurrence for me.

I dreamt I was playing in an orchestra dress rehearsal. By that, I don’t mean I was dressed up, rather, we were having a rehearsal in a performing venue, probably very close to the actual concert. The few faces I remember were all from my old high school orchestra, included my orchestra director Winifred Crock. I was sitting last chair cello, which I used to sit sophomore year, but instead of having only 3 or 4 cellos, we actually had 4 or 5 stands of cellos, a much larger section. Also like sophomore year, I had no stand partner. We were in a large, dark amphitheater, the stage had a dark, shiny, reflective look, and there was a row of lights on the front, but I don’t recall *any* overhead lighting. Also, the entire audience was just a huge pool of water, one at least 10 feet deep.

We were playing exclusively show tunes, which my high school orchestra rarely did. Lots of John Williams, hard stuff. I suppose there was one exception, I think we played Holst’s Planets also – maybe just Mars. As usual, Mrs. Crock was way stressed out over how difficult the material she had chosen was and our apparent difficulties in playing even now, days before the concert.

We finished our second-to-last song, and somehow I remember that Star Wars was up next. I carefully put my cello down, and slip away to get Mrs. Crock a soda, hoping it would make her feel better. As I leave the amphitheater, I realize it must be inside a mall, and it must be after 9 or 10pm, because the entire mall is dark and deserted. I walk around looking for a soda machine, but I eventually find a poorly lit arts and crafts store which happens to be open.

I don’t see anyone in there at first, I only noticed it because one of the two employees there calls out to me “hey you, over here!” as I walk by. Strangely, there are also other customers in the store. Also strangely, they have some cans of soda. The only diet they have is Pepsi. I could have sworn Mrs. Crock drank diet, but I couldn’t recall if she preferred Pepsi or Coke. I was pretty sure in fact that it was Coke. I got 2 cans of the Pepsi anyways (hey, gotta get one for myself, so if she doesn’t want it it wasn’t a wasted trip).

I go up to the clerks with the cans. As I approach the middle register, one explains “Sorry, this register is sorta like our calander, we don’t use it to ring people up” and the other one explains to her “silly, of course we could use it to ring him up, we just normally don’t” but at this point I have already moved over to another register. Just before they tell me what I owe, I realize my wallet is gone and I have no money at all. I say “hang on, I’ll be right back!” and I scurry off to find my wallet or some money or something. I distinctly recall briefly considering pan handling, but dismiss the idea since there is nobody around in this dark, deserted mall.

I approach the amphitheater. As I approached, I saw that a large group was joining ours, probably just for the star wars. They were all dressed in costume (dancers often do dress rehearsals in costume because it affects them more). As I approached the theater I saw rows of girls dressed in Star Wars type outfits – not the nerdy sort, the sort the background characters wear at the “end of movie celebrations”. I think I heard the celebration music coming from the theater too, and they were dancing to it (think the music from the end of episode 4).

I get inside the amphitheater. but this time, there is no way to get back to my chair without getting wet. I don’t recall how I got out of the theater but going in is a problem. Mrs. Crock is still conducting Star Wars stuff. I slip into the water and swim (mostly underwater, to be more quiet) up to the stage on the left side (where the cellos are). The stage is too high, I can’t get on it from here, especially not without Mrs. Crock noticing. I don’t recall if I had the cans of soda or not at this point, but I assume not. I might have thought about something like “would she still be as mad at me if I had the soda?” I swim over to the right side, and find a sort of ramp I can walk up, then I sneak back to my seat. I don’t recall how I did it, but I assume I would have gone around the back side of the orchestra, rather than walking right behind Mrs. Crock. I don’t know why I feared her anger, she was always a kind and understanding person to us. I think I was just in “full suck-up mode” or something, so anything bad in her eyes was to be avoided.

Well, that’s about all I remember. I think I woke up after that. I wanted to blog about this before I forgot it, because I almost NEVER remember my dreams, and because this particular dream reminded me of many people I miss a lot from high school, not the least of whom being Mrs. Crock.

I could analyze this dream, try to figure out where it came from. There are lots of references to the day’s activities. I cleaned out my wallet, which I hadn’t done in a long time, so that is the “losing my wallet” thing – my wallet lost some weight as I removed a ton of business cards and things. I recently met someone new at DDR last night, so I had mentioned that I started playing cello in 5th grade, which brought up all these memories of those fun times. We also talked about my musical preferences a bit. DDR could represent the dancers, and the show tunes could represent all the music I “don’t dislike, but don’t really seek out, I just listen to it when I come across it”. Though my high school orchestra rarely played show tunes, we always wanted to. My orchestra here in Seattle, on the other had, just recently had an almost full-concert of just show tunes (but not good ones, very little John Williams, mostly stuff from the 1960s). I walk past pan handlers every day practically, in Seattle, so that is not abnormal, but it would have been a strange thought in high school. I’m not really sure where the swimming came from.

That’s all I can really think of, to analyze the dream. Weird stuff, huh?

May 30, 2008

Genetic Algorithms From Scratch: Harvesting the fruits of not-so-natural selection

Filed under: Code, Free Thought, Open Source, Science — Carl Myers @ 8:44 pm

There are people in this world who “don’t believe in evolution”. There are also people in this world who don’t “believe” the earth is round, or more than 6000 years old. In case any of you were wondering, I consider those two things to be equally well-proven by the evidence I have personally seen (that it to say – very well proven). There are some who would argue “have you personally seen this evidence? have you personally seen those experiments? Did you verify them yourself?”. Of course, that is a ridiculous and slippery slope. I’ve never seen my own heart, but I believe it’s there, dutifully pumping blood through my veins and arteries. Likewise, if some huge fraction of the scientific community accepts some result, I do as well, until evidence to the contrary is brought up.

Why this line of reasoning now? Where is Carl going with this? Well, it all started a few years ago, when I read an amazing article in Scientific American magazine, in the January 2003 issue. The article was called “Evolving Inventions”, and it was about creating software and hardware using genetic algorithms. It made me think “It’s hard enough to doubt evolution, genetics, and natural selection, when we find old bones and genetic markers and other highly-specialized evidence in a wide range of very specialized fields… but it’d be neigh impossible to doubt if you could see it work before your eyes.” The thing is, by definition, natural selection of humans (and most living things besides tiny bacteria) operates on geological scales, beyond the human imagination in most respects. But this article had the solution – Use a genetic algorithm to create complex things – hardware or software algorithms – from the simple, or even nothing. Surely, after seeing that before their eyes, nobody could doubt. Not to mention, icing on the cake, genetic algorithms proved to do a very good job of designing algorithms, according to the article. How potentially useful!

I learned several facts about “genetic algorithms” from this article. Here are the essential bits:

  • The three critical parts of evolution are mutation, sexual recombination, and selection – optimally all three must be present.
  • The most important genetic operation is sexual reproduction, or crossover,
    which mates pairs of the better organisms to sire offspring composed of genetic
    material from the two parents.
  • In addition to sexual reproduction, genetic programming copies about 9% of the “fittest” individuals in a population unaltered into the next generation, which generally ensures that the best organisms in each
    generation are at least as fit as those of the previous generation.
  • About 1 percent of the members of a new population undergo mutation

With that list of requirements, I felt ready to start thinking about my own implementation for running evolutionary experiments in programming. One phrase I really liked which the article used was “Think of it as a creative search through the space of all possible [objects].” In our case, we will be working on perl snippets that “do something”. So, we are trying to perform a very “creative” search through the space of all possible chunks of Perl code – a large space indeed. What we need to do to make this project useful is to abstract many of the specifics away.

What will we call the “object” we are selecting on? For convenience, I am going to call it a “fragment”. Additionally, I am going to say this fragment is a string of code which we can evaluate. The fragment can accept an object (in Perl, a hashref) which can have zero, one or more arguments in it. Additionally, the program can return a hashref with zero, one, or more arguments. Here is an example fragment which ignores its inputs and returns an output of the string “Hello, World!\n”.

my $fragment = 'my $output = "Hello, World!\n"; return { output => $output };';
my $code = sub { eval $fragment };
my $inputs = {};
my $results = $code->($inputs);
printf("Results: " . Data::Dumper($results) . "\n\n\n");

Here is an example of a fragment which takes the argument “input” and returns the result “input^2″:

my $fragment = 'my $input = shift->{'input'}; return { 'output' => $input * $input };'

With the “API” somewhat clearly defined now, we come to our three critical parts. How do we “implement” mutation, sexual recombination, and selection? First, let’s consider mutation. Obviously, mutation could mean doubling a line – having a statement execute twice – or removing a line. Mutation could mean making a copy of a variable, or changing an arithmetic expression to be slightly different. Mutation in the real world could mean the organism dies with 100% probability, which in this case would be represented by a syntax error. Since there is little point in keeping around a program with a syntax error, another mutation is unlikely to fix that error, we should make sure mutations always produce a fragment which “compiles”. Runtime errors are probably ok, they will just be selected against heavily.

The real problem posed by mutation is how do we exercise new constructs not represented in our original programs? For example, if we were selecting for a fragment which took two arguments,

{ 'a' => [int], 'b' => [int]}

, and returned the result

{ output => [a ^b] }

, but we started with the hello world program, we’d have a pretty long way to go. We’d somehow have to get the line

"my $a = $_->{'a'};"

, and a similar one for b. We’d somehow have to get the line

"my $result = $a ** $b;"

, but the ‘**’ (binary exponentiation operator) almost never shows up in practice. For now, my solution will be to implement a module which generates random mutations. mutations will affect a random line in the fragment, will have a chance of containing one or more variables mentioned on a previous line with probability proportional to it’s distance away in lines, will have a chance of defining a new variable, will have a chance of containing an arithmetic expression, will have a chance of calling a built-in keyword (shift, pop, push, first, reverse, splice, etc…) with reasonable arguments, will have a chance of creating a loop around the next N lines where N is randomly chosen such that the probability distribution of N is approximately 1/x (big N is less likely). The probabilities of everything in this module should be easy to tweak, and I’m sure I will tweak them. The following parameters should be configurable, at least, along with my “guess” for what the initial value will be:

  • Probability of any mutation happening – 0.01

If a mutation occurs:

  • Probability of a line being repeated – 0.15
  • Probability of a line being removed – 0.15
  • Probability of inserting a random generated line – 0.70

If a random line is generated:

  • Probability a built-in is called – 0.10
  • Probability a new variable is defined – 0.10
  • Probability of a new loop being added over the next N lines – 0.10
  • Probability one or more existing variables are referenced – 0.70

If previous variables are referenced:

  • Probability a variable is incremented or decremented – 0.10
  • Probability a variable is assigned to – 0.10
  • Probability an arithmetic expression is assigned to – 0.10

Any time a random expression is needed:

  • Probability expression is ‘1′ – 0.20
  • Probability expression is ‘0′ – 0.20
  • Probability expression is ‘undef’ – 0.20
  • Probability expression is a previously mentioned variable – 0.20
  • Probability expression is [expr1] [op] [expr2] where op is a random binary operator and expr1 and expr2 are randomly generated expressions – 0.10
  • Probability expression is [expr1] [op] or [op] [expr1] where op is a random unary operator and expr1 is a randomly generated expression – 0.05
  • Probability expression is [expr1] [op] [expr2] [op] [expr3] where [op] is a random trinary operator and expr1, expr2, expr3 are randomly generated expressions – 0.05

Ok, now that nasty mutation is out of the way, we have our next difficult task to design: Sexual Recombination. In humans, every human has two genes, and each offspring gets one of the two from each parent, making for only 4 combinations on each gene. The problem is, the basic unit of programming in Perl is the statement, but there is no clear way to make statements in two programs “line up” in a way that will make sense. If an offspring gets a line which references a variable from parent A, but not the line which declares such a variable, bam, syntax error – stillborn fragment. How do we combine two fragments in a meaningful way?

I’m afraid the simplest way is to “do it a lot” with small chunks, and hope the correct things randomly get selected over time. For example, instead of combining two programs into one program made up of equal parts of the two parents, the “best” parent will be copied, then a random chunk of the “second best” parent will be added to the copy, and the copy may or may not have a similar sized chunk of it removed to “make room”. For example:

my $fragmentA = 'my $args = shift; my $a = $args->{\'first\'}; my $b = $args->{\'second\'}; my $result = $a + $b; return { \'result\' => $result };';
my $fragmentB = 'my $args = shift; my $a = 5; my $b = $args->{\'first\'} + $a; return { \'first\' => $b, \'second\' => $a };';

Here you can see program A is a simple program which returns the sum of arguments “first” and “second”. Program B is a simple program which takes one argument and returns (”5+first”, 5). Now imagine they were combined, and statement 4 of program B was randomly put into program A before the last statement:

my $fragmentA = 'my $args = shift; my $a = $args->{\'first\'}; my $b = $args->{\'second\'}; my $result = $a + $b; return { \'first\' => $b, \'second\' => $a }; return { \'result\' => $result };';

The last statement is now ignored and the function of this fragment has totally changed. It now reverses the arguments passed to it – it is a simple swap() function! The chances of any random sexual combination like that happening “just right” is pretty small, so we will have to make sure this is tried a LARGE number of times.

For those keeping score, we now have “not-even-psuedo-code” solutions, as I like to call them, for two of our big three problems. The remaining problem is selection – how do I “select” fragments of code – how do I compare one to another? Each algorithm I am trying to develop is going to have to have certain tests to “guide” the selection index towards what we want – but there are some general aspects of a fragment which are universal too.

  • Like all good code – shorter is better. For two fragments which otherwise perform identically, but for which one fragment is shorter than the other in code length, the shorter one should win – but for two fragments where one significantly outperforms the other, the less efficient/correct one should always loose out, no matter how short it is.
  • If you try to evaluate code with a syntax error, it dies the same way a runtime error dies. It’d be nice to make a syntax error a “worse problem”, something we select against more strongly than code which sometimes gets runtime errors, but generally either of these conditions are something to select against.
  • Obviously, another “external test” we can do is track runtime. An algorithm which runs more quickly is preferred to one which is slow. As with length, correctness is far more important than runtime, but correctness and other factors being equal, faster runtime should win.
  • Simpler code is also better code – all other things being equal, code with fewer deeply nested scopes, fewer branch/goto statements, or fewer statements, should be preferred.
  • If an easy way can be found to detect memory footprint, smaller footprint should be selected for as well.

Obviously, besides external generic metrics to select against, we need some problem-specific metrics.

  • If the fragment is supposed to use arguments to accomplish its task, then the initial programs should read in the arguments and any program which doesn’t use/read all of the arguments should be selected against.
  • If the fragment is supposed to return certain results, then the initial programs should return undefined, but vivified arguments for those results, and any program which doesn’t return a hash with those arguments at least vivified should be selected against.
  • Any fragment which returns the correct “type” of data should be selected for. (I.E. a number where you expect a number)
  • Any fragment which returns something “close” to the correct answer should be selected for – for each expected output there should be some way to compare answers to expected and a weighted score should be possible which improves as the fragments return closer to the correct results.

Well, I mostly wanted to get my ideas written down before I started trying to implement this mess. Hopefully this post accomplishes that and organized my thoughts a bit. Now, to the implementation grind! =)

Powered by WordPress