Running Compojure as a Service
30 Nov 2009
My usual setup for running Compojure web apps is to run Jetty from ant and run ant in a screen session so it doesn't get killed when I log off. This got harder to manage as I moved more applications to Compojure; I ended up with six consoles running. If you are not careful, or you update apps at six in the morning, you end up killing the wrong app.
I got tired of this and hacked together the following bash script. With it I can start and stop Compojure apps much like a service, with no need to keep screen running.
You need to set the following variables:
- NOHUP - Location of the nohup binary.
- JAVA - Location of the java binary.
- JARFOLDER - Folder containing Compojure jars.
- ROUTES - Entry point for your application.
That's all the setup that is needed.
#!/bin/bash
DESC="Compojure"
NAME="nakkaya.com"
NOHUP="nohup"
JAVA="java"
JARFOLDER="extLibs/"
PID_FILE="comp.pid"
ROUTES="app/routes.clj"
## build jar list
JARS="-cp "
# Quote the pattern so the shell does not expand *.jar before find sees it.
for i in $(find "$JARFOLDER" -name "*.jar")
do
    JARS=${JARS}:${i}
done
d_start(){
    if [ -e $PID_FILE ]
    then
        PID=$(cat $PID_FILE)
        if ps -p $PID > /dev/null
        then
            echo "$NAME already running.."
            exit 0
        fi
    fi
    $NOHUP $JAVA $JARS clojure.main $ROUTES &
    COMPOJURE_PID=$!
    echo $COMPOJURE_PID > $PID_FILE
}
d_stop(){
    if [ -e $PID_FILE ]
    then
        COMPOJURE_PID=$(cat $PID_FILE)
        kill $COMPOJURE_PID
        rm $PID_FILE
    else
        echo "$NAME is not running.."
    fi
}
case "$1" in
    start)
        echo "Starting $DESC: $NAME"
        d_start
        ;;
    stop)
        echo "Stopping $DESC: $NAME"
        d_stop
        ;;
    *)
        echo "Usage: compctl {start|stop}"
        exit 1
        ;;
esac
exit 0
We run Java under "nohup" so the process won't get killed when we log off. "nohup" redirects the output to nohup.out, and the PID of the Java process is saved to a file called "comp.pid".
Emacs and International Characters
29 Nov 2009
Every now and then I work on an application that needs to read and write files containing Unicode characters, and even though everything I use is set to UTF-8, something gets messed up along the way. One of the first things I do is check the encoding Emacs uses on the buffer, to determine whether Emacs or the application messed it up. Since I do this once every six months or so, I tend to forget the commands. The following is a self reference of commands to mess with buffer encoding in Emacs.
To set which coding system to use during save/open,
C-x RET f (set-buffer-file-coding-system)
To ask Emacs what it is doing with your files,
C-h C coding <RET>
If you want to make UTF-8 your default encoding for new files, you can use,
(setq locale-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
Clojure Concurrency - 404 Checker
27 Nov 2009
Recently a friend of mine wanted me to scrape some URLs from DMOZ. Unfortunately some URLs in DMOZ pages no longer exist. This is my third time scraping URLs from DMOZ, so I already have a Java class to filter URLs based on their HTTP response code.
My Java code is single threaded. Since it was late at night when I wrote it, I didn't mind it taking a couple of hours to finish; I was going to bed anyway. Concurrency being Clojure's biggest strength, I wanted a concurrent version in Clojure.
I began with a function that takes a URL. If the URL returns a response code of 200 it returns the URL, otherwise it returns nil.
(defn check [url]
  (try
    (let [conn (-> (java.net.URL. url) .openConnection)]
      (.connect conn)
      (if (= 200 (.getResponseCode conn)) url))
    (catch Exception e nil)))
user=> (check "http://nakkaya.com")
"http://nakkaya.com"
user=> (check "ttp://Malformed-url")
nil
Since checking URLs does not require any coordination between threads, I settled on using agents. Like refs, agents wrap a piece of state, but they are updated asynchronously by actions sent to them. In my case the initial state is the URL itself, and agents holding invalid URLs will get set to nil.
(defn run [f]
  (let [list (line-seq (java.io.BufferedReader. (java.io.FileReader. f)))
        agents (map #(agent %) list)]
    (doseq [agent agents] (send agent check))
    (apply await agents)
    (doseq [url (filter #(not (nil? @%)) agents)]
      (println @url))))
run takes a file name, reads the file and produces a sequence of URLs to check. Each URL is wrapped in an agent and check is sent to it; await blocks the current thread until all the actions sent so far are complete. When await returns, we filter out the agents whose state was set to nil and print the rest.
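The same send/await/filter pattern can be tried without touching the network. The sketch below is a stand-in, not the checker itself: keep-even and run-checks are hypothetical names, and keep-even plays the role of check by returning its argument or nil.

```clojure
;; Stand-in for `check`: returns the value when it passes, nil otherwise.
(defn keep-even [n]
  (when (even? n) n))

(defn run-checks [items]
  ;; doall forces the lazy seq so every agent exists before we send to it.
  (let [agents (doall (map agent items))]
    (doseq [a agents] (send a keep-even))
    ;; Block until every action sent above has run.
    (apply await agents)
    ;; Drop agents whose state ended up nil, keep the rest.
    (remove nil? (map deref agents))))

(run-checks [1 2 3 4])
;; => (2 4)
```

Swapping keep-even for check and the vector for a sequence of URLs gives the shape of run above.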
send vs send-off
Normally for blocking actions you would want to use send-off, which runs each action on its own thread from an expandable pool; with send, blocking actions tie up the fixed-size thread pool that is meant for non-blocking operations. In this case I was happy to tie up that pool, since I only wanted to dump all the URLs at once and did not want to worry about spawning thousands of threads.
Cryptography with Clojure - One Time Pad
25 Nov 2009
The one-time pad is a "perfect" encryption algorithm; when implemented correctly it can not be broken. It combines the plain text message with a pad (key) using XOR operations. As long as the following conditions hold, it is theoretically unbreakable,
- Pad must be as long as the message.
- Pad must only be used once.
- Pad stream must be truly random.
Even though it is theoretically unbreakable, in the real world it is pretty much useless: since keys can only be used once, your message security problem turns into a key distribution problem. On the plus side, since each pad is unique and used only once, it is highly resistant to all forms of cryptanalysis.
We begin our implementation by getting some random bytes. For this we will use the SecureRandom class, which provides a cryptographically strong pseudo-random number generator (PRNG). Note that a PRNG is still a deterministic algorithm working from a seed, so strictly speaking it does not meet the "truly random" requirement above; it is good enough for a demonstration.
(defn rand-bytes [size]
(let [rand (java.security.SecureRandom/getInstance "SHA1PRNG")
buffer (make-array Byte/TYPE size)]
(.nextBytes rand buffer)
buffer ))
Encoding a message is as easy as XORing it with the pad we just generated.
(defn encrypt [m]
(let [message (.getBytes m)
size (count message)
pad (rand-bytes size)
code (map bit-xor message pad)]
{:pad (vec pad) :msg (vec code)} ))
This will result in a map containing two vectors: a pad (key) and the encoded message. Note that the message is binary data; if you want to turn it into a string you need to convert it to a hex string or encode it using Base64.
user=> (encrypt "Attack At Down")
{:pad [-33 21 65 71 94 97 77 5 80 -111 87 100 83 -29],
:msg [-98 97 53 38 61 10 109 68 36 -79 19 11 36 -115]}
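As a sketch of the hex option mentioned above, the following helper renders a vector like :msg as a string. to-hex is a hypothetical name, not part of the original code. The bit-and masks each signed byte value into the 0-255 range before formatting.

```clojure
(defn to-hex [bytes]
  ;; %02x prints two lowercase hex digits per byte.
  (apply str (map #(format "%02x" (bit-and % 0xff)) bytes)))

(to-hex [-98 97 53 38])
;; => "9e613526"
```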
Decoding the message is even simpler. This time we XOR the pad with the message, turn each byte into a char, then concatenate them all. (Treating each byte as a char only works for single-byte characters such as ASCII.)
(defn decrypt [pad message]
(apply str (map char (map bit-xor pad message))))
user=> (let [message (encrypt "Attack At Down")]
(decrypt (:pad message) (:msg message)))
"Attack At Down"
Notice that the one-time pad is extremely simple to implement yet unbreakable in theory; its security comes from the protocol, not from some complex mathematical function. So you don't want to get it wrong like the Soviets did: they reused pads, which let the U.S. Army's Signal Intelligence Service read their spies' traffic in the Venona program.
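Why using a pad twice is fatal can be shown in a few lines. In this sketch the pad is a fixed vector purely for illustration (a real pad must be random), and xor-seq and pad-cancels? are hypothetical helpers: XORing two ciphertexts made with the same pad cancels the pad and leaves the XOR of the two plain texts.

```clojure
(defn xor-seq [a b]
  (map bit-xor a b))

(defn pad-cancels? [pad s1 s2]
  (let [m1 (map int s1)
        m2 (map int s2)
        c1 (xor-seq m1 pad)   ; first message encrypted with the pad
        c2 (xor-seq m2 pad)]  ; second message encrypted with the SAME pad
    ;; The pad drops out: c1 xor c2 equals m1 xor m2.
    (= (xor-seq c1 c2) (xor-seq m1 m2))))

(pad-cancels? [12 -7 33 101 90 4] "ATTACK" "DEFEND")
;; => true
```

An eavesdropper holding both ciphertexts therefore learns the XOR of the two messages without ever seeing the pad.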
Poor Man's Foxyproxy for Safari
24 Nov 2009
Safari is a lot faster than Firefox on Mac OS X. I have been thinking about switching to Safari, but not having a Foxyproxy equivalent was a big problem.
Fortunately Apple does provide a command to set, enable and disable proxy settings. The following script implements a poor man's version of Foxyproxy: when you run it, it will set up an SSH SOCKS proxy to the server and enable proxy settings for Safari; when killed with Ctrl-C, it will kill the SSH connection and disable the proxy settings.
#!/bin/bash
DEVICE="Airport"
HOST="127.0.0.1"
PORT="9999"
echo "[+] Connecting"
ssh -ND $PORT user@server.com &
FIND_PID=$!
sleep 5
echo "[+] Enabling Proxy"
sudo networksetup -setsocksfirewallproxy $DEVICE $HOST $PORT off
function quit {
echo "[+] Disabling Proxy"
sudo networksetup -setsocksfirewallproxystate $DEVICE off
kill -9 $FIND_PID
exit
}
trap "quit" SIGINT SIGTERM
while :
do
sleep 60
done
Save it somewhere on your machine, and make it executable.
chmod 755 foxy-proxy.sh
Now you are ready to defeat that evil proxy.
Converting HTML to Compojure DSL
23 Nov 2009
The Compojure DSL for creating HTML/XML is great unless you have a lot of HTML code already written. At first my plan was to parse it, write it to a file and manually format it, then I stumbled on this post from the Compojure mailing list. It is a small utility function written by Robin Brandt that converts the given HTML file to Clojure/Compojure DSL.
(ns de.evernet2000.util
(:use clojure.contrib.str-utils)
(:use clojure.contrib.duck-streams)
(:use clojure.contrib.pprint)
(:use [clojure.xml :only (parse)])
(:import (java.io File)))
(defn format-attrs
[m]
(when m
(format "%s" m)))
(defn empty-when-null
[x]
(if (nil? x)
""
x))
(declare format-full-node)
(defn format-node
[node]
(cond
(string? node) (format "\"%s\"" (.trim node))
(nil? node) nil
:else (format-full-node node)))
(defn format-full-node
[node]
(format "[%s %s %s]\n"
(:tag node)
(empty-when-null (format-attrs (:attrs node)))
(str-join " " (map format-node (:content node)))))
(defn transform-file
[filename]
(print (pprint (read-string (format-node (parse filename))))))
It will complain if you have badly written HTML. In my case it only complained about a bunch of br tags, and a simple search and replace fixed it. If you can't get it to accept your HTML, try running it through JTidy; that should fix it.
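An alternative that skips the string formatting and read-string round trip is to walk the parsed tree and build vectors directly. This is only a sketch, not Robin Brandt's code: node->dsl and parse-str are hypothetical names, and it assumes well-formed input just like the original.

```clojure
(require '[clojure.xml :as xml])

;; Parse markup held in a string instead of a file.
(defn parse-str [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn node->dsl [node]
  (if (string? node)
    (.trim node)
    (let [{:keys [tag attrs content]} node]
      ;; [tag attrs? child ...], leaving attrs out when there are none.
      (vec (concat [tag]
                   (when attrs [attrs])
                   (map node->dsl content))))))

(node->dsl (parse-str "<p class=\"x\">Hello <b>world</b></p>"))
;; => [:p {:class "x"} "Hello" [:b "world"]]
```

Because the result is already a Clojure data structure, it can be handed to pprint as is.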
Smashing Java for Fun and Profit
22 Nov 2009
Every Java programming forum or mailing list I am subscribed to has hundreds of people asking the same question: "How can I prevent people from reversing my application?". The short answer is that you can not prevent reverse engineering regardless of the programming language used; unfortunately Java/C# just make the process a lot easier. This post will go over the process of cracking a very simple Java application.
We begin with a very simple "Hello World!" application. Given the correct serial number, which is "1234", it will print "Hello World!" to the console; for any other serial it will exit silently.
public class hello {
public static boolean checkSerial(String serial){
if (serial.equals("1234"))
return true;
else
return false;
}
public static void main (String args[]) {
if (args.length != 1){
System.err.println("Serial Needed..");
return;
}
if (checkSerial(args[0]) == false )
System.exit(0);
System.out.println("Hello World!");
}
}
Corresponding ant file to build a jar file for the project,
<project name="hello" default="def" basedir=".">
<target name="def">
<javac srcdir="." includes="hello.java" fork="yes"/>
<jar destfile="hello.jar" >
<manifest>
<attribute name="Main-Class" value="hello"/>
</manifest>
<fileset dir=".">
<include name="hello.class"/>
</fileset>
</jar>
</target>
</project>
Type,
ant
to build the application, it will create a hello.jar in the same directory.
Everyone says it is very easy to decompile Java applications, but how easy is it actually? My favorite tool for this job is JD-GUI. Try opening the jar file we produced.

As you can see, for this application you do not need to do anything to crack it; the serial is written in plain text. For demonstration purposes, assume that the checkSerial function is a proper algorithm to check if a serial is valid or not.
Now a cracker has two options at this point: learn the algorithm that checkSerial uses and create a serial that will pass the inspection, or patch the checkSerial function to return true no matter what serial is passed.
I'll go with the second route. For this we'll use a library called Javassist, which allows you to manipulate bytecode.
Jar files are glorified zip files, so we can extract the content of the jar file using,
unzip hello.jar
Using Javassist we write a small snippet that will read the bytecode, rename the checkSerial function to something else, then create a new function that always returns true and add it to the class. Finally we write back the modified class file.
import javassist.*;
class smash{
public static void main(String[] argv) throws Exception{
//Load the class that we will be patching...
ClassPool pool = ClassPool.getDefault();
CtClass klass = pool.get("hello");
//Get the method we want to patch, and rename...
CtMethod orig = klass.getDeclaredMethod("checkSerial");
orig.setName( "checkSerial$impl" );
// Create a new function that will always return true...
CtMethod patch = CtNewMethod.copy(orig, "checkSerial", klass, null);
patch.setBody("{ return true; }");
// Add patched method..
klass.addMethod( patch );
klass.writeFile();
System.out.println("Done Patching.");
CtMethod[] methods = klass.getDeclaredMethods();
for( int i=0; i<methods.length ; i++){
System.out.println( "\t" + methods[i].getLongName() );
}
}
}
Compile this file,
javac -cp .:javassist.jar smash.java
Run it in the same directory containing the .class file,
java -cp .:javassist.jar smash
You should see output similar to,
Done Patching.
hello.checkSerial$impl(java.lang.String)
hello.main(java.lang.String[])
hello.checkSerial(java.lang.String)
Put the patched .class file back into the jar file,
$ zip hello.jar hello.class
Now we can run the application passing any serial we want, and it will work.
$ java -jar hello.jar 34345345
Hello World!
Let's move to the other end of the spectrum. What can be done to prevent this attack?
Well, not much. As long as the application runs on hostile territory (the user's machine), it can be reversed and patched. You can, however, make the reversing process harder by obfuscating your class files.
Bytecode obfuscators protect your class files by replacing package, class, method, and field names with inexpressive characters. Some bytecode obfuscators do more than just name mangling, such as scrambling your code flow in a way that makes it really hard to follow.
In my experience the obfuscator that causes the least hassle is yGuard.
<!-- yGuard Ant task. -->
<taskdef name="yguard"
classname="com.yworks.yguard.YGuardTask"
classpath="yguard.jar"/>
<!-- Integrated obfuscation and name adjustment... -->
<yguard>
<inoutpair in="./hello.jar" out="./hello-final.jar"/>
<rename logfile="./test.log" replaceClassNameStrings="true">
<property name="obfuscation-prefix" value="name"/>
<keep>
<class name="hello"/>
<method name="void main(java.lang.String[])"
class="hello" />
</keep>
</rename>
</yguard>
We add the yGuard task to our build process and keep the main class intact so as not to break the jar file. If we open the resulting jar file "hello-final.jar" in JD-GUI,

We see that the method name has been changed to A. For this simple example it is still trivial to figure out what is going on, but in a code base composed of hundreds of class files it becomes pretty hard to follow.
There are other schemes in the tubes, such as encrypting the class files and then loading them with a custom class loader. The problem people don't get is that if you want it to run on a CPU, it has to be decoded at some point, and at that point it can be reversed.
Making Recommendations
21 Nov 2009
Now that the similarity algorithms are in place, it is time to move on to making recommendations. Any one of the previously covered similarity scores would work.
We begin by calculating a similarity score for every critic against the person we are looking for, discarding anyone whose similarity is below 0. We can plug in either one of the similarity scores.
(defn similarities [prefs person algo]
(filter
#(<= 0 (second %))
(reduce
(fn[h p] (assoc h (first p) (algo (prefs person) (second p))))
{} (dissoc prefs person))))
user=> (similarities critics "Toby" pearson)
(["Jack Matthews" 0.66284898035987] ["Mick LaSalle" 0.9244734516419049]
["Claudia Puig" 0.8934051474415647] ["Gene Seymour" 0.38124642583151164]
["Lisa Rose" 0.9912407071619299])
Next we filter preferences: we remove entries that the person has already ranked and multiply the remaining entries by the critic's similarity score. That way each critic's ranks only contribute in proportion to how similar they are to the user we are looking for.
(defn weight-prefs [prefs similarity person]
(reduce
(fn [h v]
(let [other (first v) score (second v)
diff (filter #(not (contains? (prefs person) (key %))) (prefs other))
weighted-pref (apply hash-map
(interleave (keys diff)
(map #(* % score) (vals diff))))]
(assoc h other weighted-pref))) {} similarity))
user=> (weight-prefs critics (similarities critics "Toby" pearson) "Toby")
{"Lisa Rose" {"Lady in the Water" 2.4781017679048247,
"The Night Listener" 2.97372212148579,
"Just My Luck" 2.97372212148579},
"Gene Seymour" {"Lady in the Water" 1.143739277494535,
"The Night Listener" 1.143739277494535,
"Just My Luck" 0.5718696387472675},
"Claudia Puig" {"The Night Listener" 4.020323163487041,
"Just My Luck" 2.680215442324694},
"Mick LaSalle" {"Lady in the Water" 2.7734203549257144,
"The Night Listener" 2.7734203549257144,
"Just My Luck" 1.8489469032838097},
"Jack Matthews" {"Lady in the Water" 1.9885469410796102,
"The Night Listener" 1.9885469410796102}}
Using the weighted preferences we calculated, we can build a list of movies to recommend by adding all the ranks for the movies,
(defn sum-scrs [prefs]
(reduce (fn [h m] (merge-with #(+ %1 %2) h m)) {} (vals prefs)))
user=> (sum-scrs (weight-prefs critics (similarities critics "Toby" pearson) "Toby"))
{"Just My Luck" 8.074754105841562,
"The Night Listener" 12.899751858472692,
"Lady in the Water" 8.383808341404684}
In order not to give an advantage to movies that are ranked by more critics, we need to divide each movie's total by the sum of the similarities of all the critics that ranked that movie,
(defn sum-sims [weighted-pref scores sim-users]
(reduce (fn [h m]
(let [movie (first m)
rated-users (reduce
(fn [h m] (if (contains? (val m) movie)
(conj h (key m)) h))
[] weighted-pref)
similarities (apply + (map #(sim-users %) rated-users))]
(assoc h movie similarities) ) ) {} scores))
{"Lady in the Water" 2.9598095649952163,
"The Night Listener" 3.853214712436781,
"Just My Luck" 3.190365732076911}
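As a check on the numbers above, one entry of this map can be rebuilt by hand. The sketch below copies the similarity scores from the earlier similarities output and adds up the ones belonging to the critics who have a weighted score for "Just My Luck" in the weight-prefs output. sims, rated-just-my-luck and sim-total are hypothetical names used only for this illustration.

```clojure
(def sims
  {"Lisa Rose"     0.9912407071619299
   "Gene Seymour"  0.38124642583151164
   "Claudia Puig"  0.8934051474415647
   "Mick LaSalle"  0.9244734516419049
   "Jack Matthews" 0.66284898035987})

;; Jack Matthews has no "Just My Luck" entry, so he is left out.
(def rated-just-my-luck
  ["Lisa Rose" "Gene Seymour" "Claudia Puig" "Mick LaSalle"])

(defn sim-total [sims critics]
  (apply + (map sims critics)))

(sim-total sims rated-just-my-luck)
;; roughly 3.1904, the "Just My Luck" entry in the output above
```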
Now we have everything to make a recommendation to a user. The final score for a movie is calculated by dividing its total score by the total of the similarities,
(defn recommend [prefs person algo]
(let [similar-users (into {} (similarities prefs person algo))
weighted-prefs (weight-prefs prefs similar-users person)
scores (sum-scrs weighted-prefs)
sims (sum-sims weighted-prefs scores similar-users)]
(interleave (keys scores) (map #(/ (second %) (sims (first %))) scores))))
user=> (recommend critics "Toby" pearson)
("Just My Luck" 2.5309807037655645
"The Night Listener" 3.3477895267131013
"Lady in the Water" 2.832549918264162)
Keeping Secrets with Emacs and GPG
19 Nov 2009
We all know we should use a unique password for each website or application we use, but most of us don't because it is much easier to use the same password everywhere. Using easy-pg and org-mode you can let Emacs take care of managing your passwords and keeping them encrypted; only one master passphrase is needed to unlock your passwords.
This being a post about Emacs, I'm not going to delve into specifics about using GPG. But if you don't already have a private key, use,
gpg --gen-key
to create one. Pick a long, not easily guessable passphrase, but remember, if you forget your passphrase there is no way to get your passwords back.
easy-pg is included with the latest distribution of Emacs. The only configuration needed is to set the path to the gpg executable,
(setq epg-gpg-program "/opt/local/bin/gpg")
Now any time you open a file ending with the extension .gpg, Emacs will take care of encrypting and decrypting it for you; you will be asked for your passphrase.
Now create a file to store your passwords and make the following,
-*- mode: org; epa-file-encrypt-to: ("your@email.com") -*-
the first line in the file. Now every time the file is opened it will be opened using org-mode.
A nice feature of org-mode is that you can group stuff and make tables that grow and shrink automatically as needed,
|Header1 |Header2 |Header3|
as soon as you hit TAB, org-mode will build the table for you.
| App | Login | Pass |
|------+-----------+-----------|
| app1 | username1 | username2 |
| | | |
You can use headings to organize passwords into different categories.
* Bank
* Web
* Application
Categories can be hidden or shown using the TAB key.
Unit Testing in Clojure
18 Nov 2009
One thing I love about Clojure is the built-in unit tests. Unit tests are great for making sure your code does what it needs to do and that introducing new features or bug fixes doesn't break anything. You can refactor your code any time you want and be sure that you did not break anything. Unit tests also serve as living documentation for your code base; newcomers can look at the tests and get a basic understanding of how your API works.
Clojure's core library includes a test framework written by Stuart Sierra. If there is anything that is not covered here the best place to look for it is the source code itself.
Defining Tests
The testing framework lives in the namespace clojure.test,
(ns your-test-namespace
(:use clojure.test))
is all that's needed to load the framework. Assuming we would like to test the following function,
(defn add2 [x]
(+ x 2))
There are two ways to define tests. You can either define your tests with the function itself,
(with-test
(defn add2 [x]
(+ x 2))
(is (= 4 (add2 2)))
(is (= 5 (add2 3))))
but I believe that just bloats the code base, or you can define your tests separately using the deftest macro,
(deftest test-adder
(is (= 24 (add2 22))))
Tests can also be grouped together,
(deftest arithmetic
(addition)
(subtraction))
For testing private functions, you need to use the following macro (courtesy of chouser),
(defmacro with-private-fns
  "Refers private fns from ns and runs tests in context."
  [[ns fns] & tests]
  `(let ~(reduce #(conj %1 %2 `(ns-resolve '~ns '~%2)) [] fns)
     ~@tests))
then wrap your tests with with-private-fns,
(with-private-fns [org.foo.bar [fn1 fn2]]
(deftest test-fn1..)
(deftest test-fn2..))
Running Tests
To run the tests you defined from REPL you can use,
(run-tests)
Without a namespace, run-tests will run the tests defined in the namespace you are in; you can pass it namespaces to run tests defined in other namespaces as well.
(run-tests 'your.namespace 'some.other.namespace)
If you want to run all tests in all namespaces,
(run-all-tests)
can be used.
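Putting the pieces from this post together, a minimal sketch might look like the following. sub2, addition and subtraction are hypothetical names added for illustration. test-ns-hook is part of clojure.test: when it is defined, run-tests calls it instead of running every test in the namespace, which keeps the grouped tests from running twice.

```clojure
(use 'clojure.test)

(defn add2 [x] (+ x 2))
(defn sub2 [x] (- x 2))   ; hypothetical counterpart to add2

(deftest addition
  (is (= 4 (add2 2)))
  (is (= 24 (add2 22))))

(deftest subtraction
  (is (= 0 (sub2 2))))

;; Group the two tests, as in the arithmetic example above.
(deftest arithmetic
  (addition)
  (subtraction))

;; Run only the grouped test, so addition and subtraction run once.
(defn test-ns-hook []
  (arithmetic))

;; Prints a report and returns a summary map with :pass, :fail and :error counts.
(run-tests)
```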
Ant Integration
If you call your tests from an ant target, the build completes successfully even if tests fail, so I cannibalized some functions from the Clojure source code to use in my build process. The ant task that we will use to call our tests looks like this,
<target name="test" depends="">
<java classname="clojure.main"
fork="true" failonerror="true">
<classpath>
<pathelement path="${test-dir}" />
<pathelement path="${src-dir}" />
<pathelement location="${extLibs.dir}/clojure.jar"/>
<pathelement location="${extLibs.dir}/clojure-contrib.jar"/>
</classpath>
<arg value="-e" />
<arg value="
(use 'clojure.test)
(use 'app-test)
(run-ant)" />
</java>
</target>
Now create a file in test-dir called app_test.clj that will be the main entry point in your application for tests. In it put the following definitions from the Clojure source.
(def test-names
[:app-test])
(def test-namespaces
(map #(symbol (str (name %)))
test-names))
(defn run
"Runs all defined tests"
[]
(println "Loading tests...")
(apply require :reload-all test-namespaces)
(apply run-tests test-namespaces))
run-ant runs all defined tests, prints the report to *err*, and throws if there are failures. This works well for running in an ant java task.
(defn run-ant []
(let [rpt report]
(binding [;; binding to *err* because, in ant, when the test target
;; runs after compile-clojure, *out* doesn't print anything
*out* *err*
*test-out* *err*
report (fn report [m]
(if (= :summary (:type m))
(do (rpt m)
(if (or (pos? (:fail m))
(pos? (:error m)))
(throw
(new Exception (str (:fail m)
" failures, "
(:error m)
" errors.")))))
(rpt m)))]
(run))))
Add the namespaces you wish to test to test-names. Now when a test fails, the ant build will fail as well.
Java Native Access from Clojure
16 Nov 2009
I tried to pick up JNI multiple times but in the end I got bored. There is so much boilerplate code that you have to write even for trivial things. A while ago I stumbled upon a project called JNA (Java Native Access), which allows you to access native shared libraries from Java without using the Java Native Interface. I have been meaning to play with it for a while, and last night I had some free time, so I thought I'd give it a shot.
I have created two implementations. The first one is the documented way of calling native libraries. It works but it presents problems for some functions: there is no way to create a method that accepts a variable number of arguments using the gen-interface macro, which is a big problem for functions like printf, since you have to know beforehand how many arguments you will call it with. There is also the problem of structs,
// Original C code
typedef struct _Point {
int x, y;
} Point;
In order to represent this struct, in Java one would use,
// Equivalent JNA mapping
class Point extends Structure { public int x, y; }
which can't be done in Clojure. At first I thought I was stuck, but it turns out there are workarounds.
First, the documented way of calling printf,
(gen-interface
:name jna.CLibrary
:extends [com.sun.jna.Library]
:methods [[printf [String] void]])
We create an interface that extends com.sun.jna.Library (use the full package name even if you import it!) and define which methods we will be calling. You need to compile this beforehand. Now you can call printf,
(def glibc (Native/loadLibrary "c" jna.CLibrary))
(.printf glibc "Hello, World.. \n")
The obvious problem here is that this will only work for simple functions. Pretty much every function that does something interesting will expect some sort of structure as a parameter, which we can not emulate in Clojure.
While digging through the documentation, I found the Function class, which allows you to make calls without creating an interface. With it we can pass the arguments as an array, which allows us to call printf with a variable number of arguments.
(defmacro jna-call [lib func ret & args]
`(let [library# (name ~lib)
function# (com.sun.jna.Function/getFunction library# ~func)]
(.invoke function# ~ret (to-array [~@args]))))
With a simple macro we can now make any native call we want,
(jna-call :c "printf" Integer "kjhkjh")
;Some POSIX Calls
(jna-call :c "mkdir" Integer "/tmp/jnatesttemp" 07777)
(jna-call :c "rename" Integer "/tmp/jnatesttemp" "/tmp/jnatesttempas")
(jna-call :c "rmdir" Integer "/tmp/jnatesttempas")
Armed with this macro, I thought I could solve the age-old Java question: how do you find the free space available on the disk (pre 1.6)? This is where I hit the second wall. The call to get free space on Mac OS X is statvfs, which expects a string pointing to the directory and a struct that it will fill with the information for us, a struct which we can not emulate in Clojure. After a couple more hours of Google fun, it turns out that this can also be worked around. You can request a Pointer object from JNA which you can pass to functions,
(defmacro jna-malloc [size]
`(let [buffer# (java.nio.ByteBuffer/allocateDirect ~size)
pointer# (Native/getDirectBufferPointer buffer#)]
(.order buffer# java.nio.ByteOrder/LITTLE_ENDIAN)
{:pointer pointer# :buffer buffer#}))
You give JNA a ByteBuffer and it will give you a Pointer; you can pass this Pointer around instead of a Structure.
(let [struct (jna-malloc 44)]
(jna-call :c "statvfs" Integer "/git" (:pointer struct))
(let [fbsize (.getInt (:buffer struct))
frsize (.getInt (:buffer struct) 4)
blocks (.getInt (:buffer struct) 8)
bfree (.getInt (:buffer struct) 12)
bavail (.getInt (:buffer struct) 16)]
(println "f_fbsize" fbsize)
(println "f_frsize" frsize)
(println "blocks" blocks)
(println "bfree" bfree)
(println "bavail" bavail)))
Now we can just do the math and get free space. C equivalent would be,
#include <stdio.h>
#include <string.h>
#include <sys/statvfs.h>
int main( int argc, char *argv[] ){
struct statvfs fiData;
char fnPath[128];
strcpy(fnPath, argv[1]);
statvfs(fnPath,&fiData);
printf("Disk %s: \n", fnPath);
printf("\tf_bsize: %u\n", fiData.f_bsize);
printf("\tf_frsize: %i\n", fiData.f_frsize);
printf("\tf_blocks: %i\n", fiData.f_blocks);
printf("\tf_bfree: %i\n", fiData.f_bfree);
printf("\tf_bavail: %i\n", fiData.f_bavail);
}
C output,
$ gcc spc.c && ./a.out /git
Disk /git:
f_bsize: 1048576
f_frsize: 4096
f_blocks: 60965668
f_bfree: 33754724
f_bavail: 33690724
Clojure output,
jna=> f_fbsize 1048576
f_frsize 4096
blocks 60965668
bfree 33754724
bavail 33690724
nil
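The math itself can be sketched without JNA. The example below assumes the same layout the code above reads (five little-endian 32-bit fields) and fills a plain ByteBuffer with the values from the sample output; sample-buffer and free-bytes are hypothetical names. POSIX expresses the block counts in units of f_frsize, so the space available to unprivileged users is f_bavail times f_frsize.

```clojure
(import '(java.nio ByteBuffer ByteOrder))

;; A buffer laid out like the fields read above, filled with the sample values.
(defn sample-buffer []
  (doto (ByteBuffer/allocate 20)
    (.order ByteOrder/LITTLE_ENDIAN)
    (.putInt 0 1048576)      ; f_bsize
    (.putInt 4 4096)         ; f_frsize
    (.putInt 8 60965668)     ; f_blocks
    (.putInt 12 33754724)    ; f_bfree
    (.putInt 16 33690724)))  ; f_bavail

(defn free-bytes [^ByteBuffer buf]
  (let [frsize (.getInt buf 4)
        bavail (.getInt buf 16)]
    ;; Widen to long first; the product does not fit in 32 bits.
    (* (long frsize) (long bavail))))

(free-bytes (sample-buffer))
;; => 137997205504
```

That is roughly 128 GiB for the /git volume in the sample output.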
I have picked up a few tips from this experiment. Get a very simple C/C++ program going first, since you need to know the sizes of the different types and structures. You are still playing with C, so be prepared to play with bytes to get and send the information you need.
Overall this is a very good weapon to add to your arsenal when you need some functionality which Java does not support. JNA is much slower than JNI, so it is not useful for speeding things up. C is C, so you will have crashes, and complex function signatures will drive you nuts.
Resources
- Java Native Access on Wikipedia
- JNA API Documentation
- STATVFS(3)
Extracting Audio Track from Videos
15 Nov 2009
A nice trick that might come in handy later: I needed to extract some audio from a video, and it turns out it is much easier than I thought,
ffmpeg -i video.flv video.mp3
BEncoding Objects in Clojure
14 Nov 2009
I plan on playing with the BitTorrent protocol. I already have a bencode decoder to play with torrent files, but since I need to communicate with trackers, I need encoding as well. This post will walk through the steps required to encode objects using bencoding. I have updated bencode.clj; it can now both decode and encode.
(defn encode [obj]
(let [stream (ByteArrayOutputStream.)]
(encode-object obj stream)
(.toByteArray stream)))
To encode an object, we call encode on it. We get a byte array representing the encoded object, which you can then write to a file or look at by creating a String from it.
(defn- encode-object [obj stream]
(cond (string? obj) (encode-string obj stream)
(number? obj) (encode-number obj stream)
(vector? obj) (encode-list obj stream)
(map? obj) (encode-dictionary obj stream)))
encode-object is where encoding begins, depending on the type of object passed to it, it will call the appropriate function.
(defn- encode-string [obj stream]
(let [bytes (.getBytes obj "UTF-8")
bytes-length (.getBytes (str (count bytes) ":") "UTF-8")]
(.write stream bytes-length 0 (count bytes-length))
(.write stream bytes 0 (count bytes))))
An encoded string has the format,
<string length encoded in base ten ASCII>:<string data>
4:spam -> "spam"
so what we do is turn the string into a byte array, calculate its length and write everything to the stream according to the format.
(defn- encode-number [number stream]
(let [string (str "i" number "e")
bytes (.getBytes string "UTF-8")]
(.write stream bytes 0 (count bytes))))
An encoded number has the format,
i<integer encoded in base ten ASCII>e
i3e -> 3
we build a string by prepending "i" and appending "e" to the number, then write the bytes to the stream.
(defn- encode-list [list stream]
(.write stream (int \l))
(doseq [item list]
(encode-object item stream))
(.write stream (int \e)))
In my implementation, bencoded lists are represented as Clojure vectors. A bencoded list has the following format,
l<bencoded values>e
l4:spam4:eggse -> [ "spam", "eggs" ]
what we do is iterate over the vector and call encode-object on each object found.
(defn- encode-dictionary [dictionary stream]
(.write stream (int \d))
(doseq [item dictionary]
(encode-object (first item) stream)
(encode-object (second item) stream))
(.write stream (int \e)))
An encoded map has the format,
d<bencoded string><bencoded element>e
d3:cow3:moo4:spam4:eggse -> { "cow" => "moo", "spam" => "eggs" }
The technique to encode a map is the same as for a vector: we iterate over the map, but call encode-object twice for each entry, once for the key and once for the value.
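As a rough end-to-end check of the formats described in this post, here is a minimal, self-contained sketch of an encoder that returns its result as a String. The names write-ascii, write-bencoded and bencode-str are made up for this example and are not part of the code in this post. The sketch also sorts dictionary keys before writing them, since the BitTorrent specification asks for keys in sorted order. Sorting with Clojure's default string comparison is an assumption that only matches raw byte order for plain ASCII keys.

```clojure
(import 'java.io.ByteArrayOutputStream)

;; Hypothetical helper: write a short ASCII marker such as "l", "e" or "4:".
(defn- write-ascii [^ByteArrayOutputStream out ^String s]
  (let [bytes (.getBytes s "UTF-8")]
    (.write out bytes 0 (count bytes))))

;; Hypothetical helper: dispatch on the type of obj, like encode-object does.
(defn- write-bencoded [^ByteArrayOutputStream out obj]
  (cond
    (string? obj)  (let [bytes (.getBytes ^String obj "UTF-8")]
                     ;; <length in bytes>:<data>
                     (write-ascii out (str (count bytes) ":"))
                     (.write out bytes 0 (count bytes)))
    (integer? obj) (write-ascii out (str "i" obj "e"))
    (vector? obj)  (do (write-ascii out "l")
                       (doseq [item obj]
                         (write-bencoded out item))
                       (write-ascii out "e"))
    (map? obj)     (do (write-ascii out "d")
                       ;; assumption: keys are ASCII strings, so the default
                       ;; string ordering matches raw byte order
                       (doseq [[k v] (sort-by key obj)]
                         (write-bencoded out k)
                         (write-bencoded out v))
                       (write-ascii out "e"))
    :else (throw (IllegalArgumentException.
                  (str "cannot bencode: " (pr-str obj))))))

(defn bencode-str [obj]
  (let [out (ByteArrayOutputStream.)]
    (write-bencoded out obj)
    (String. (.toByteArray out) "UTF-8")))

(bencode-str {"spam" "eggs" "cow" "moo"})
;; => "d3:cow3:moo4:spam4:eggse"
```

Because the keys are sorted, the same map always encodes to the same bytes, which matters when the encoded form is hashed.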
Download code.
Pearson Correlation Score
13 Nov 2009
This post covers another topic from Programming Collective Intelligence that is used to define similarities between items, called the Pearson correlation score. The formula for this algorithm looks like the following,
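Written out to match the pearson function in this post term by term, and assuming $x_i$ and $y_i$ denote the two users' ratings of the i-th shared item and $n$ the number of shared items, the score is:

```latex
r = \frac{\sum_i x_i y_i - \frac{\sum_i x_i \, \sum_i y_i}{n}}
         {\sqrt{\left(\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}\right)
                \left(\sum_i y_i^2 - \frac{\left(\sum_i y_i\right)^2}{n}\right)}}
```

The numerator corresponds to num in the code and the denominator to den.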

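This calculation returns a value between -1 and 1. A similarity of 1 means the two users' ratings are perfectly correlated, which is the case when they have rated every item identically (as long as the ratings are not all the same value). Unlike Euclidean Distance Score, this formula doesn't need to be normalized. Pearson correlation score also accounts for the average rating of each user: a user who consistently rates every item two points higher than another user will still have a similarity of 1 with that user. This may or may not be the behavior you want depending on your situation. Note that if a user gives every shared item the same rating, the denominator is zero and the function below returns 0.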
(defn pearson [x y]
  ;; shrd: the items rated by both x and y
  (let [shrd (filter x (keys y))]
    (if (= 0 (count shrd))
      0
      (let [sum1 (reduce (fn [s mv] (+ s (x mv))) 0 shrd)
            sum2 (reduce (fn [s mv] (+ s (y mv))) 0 shrd)
            sum1sq (reduce (fn [s mv] (+ s (Math/pow (x mv) 2))) 0 shrd)
            sum2sq (reduce (fn [s mv] (+ s (Math/pow (y mv) 2))) 0 shrd)
            psum (reduce (fn [s mv] (+ s (* (x mv) (y mv)))) 0 shrd)
            num (- psum (/ (* sum1 sum2) (count shrd)))
            den (Math/sqrt (*
                            (- sum1sq (/ (Math/pow sum1 2) (count shrd)))
                            (- sum2sq (/ (Math/pow sum2 2) (count shrd)))))]
        ;; zero? also matches 0.0; (= den 0) is false for a double
        ;; on current Clojure versions
        (if (zero? den)
          0
          (double (/ num den)))))))
Using the same critics map from Euclidean Distance Score,
user=> (pearson (critics "Lisa Rose") (critics "Gene Seymour"))
0.39605901719066977
user=> (pearson (critics "Lisa Rose") {})
0
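To make the behavior around average ratings concrete, here is a small self-contained sketch. pearson-sketch is a condensed variant written for this example (the name and the toy ratings are made up), not the function above, although it computes the same quantities.

```clojure
(defn pearson-sketch [x y]
  (let [shared (filter #(contains? x %) (keys y))
        n      (count shared)]
    (if (zero? n)
      0.0
      (let [xs  (map x shared)
            ys  (map y shared)
            sum (fn [coll] (reduce + coll))
            ;; numerator: sum of products minus product of sums over n
            num (- (sum (map * xs ys))
                   (/ (* (sum xs) (sum ys)) n))
            ;; denominator: square root of the product of the two spreads
            den (Math/sqrt (* (- (sum (map * xs xs)) (/ (* (sum xs) (sum xs)) n))
                              (- (sum (map * ys ys)) (/ (* (sum ys) (sum ys)) n))))]
        (if (zero? den)
          0.0
          (double (/ num den)))))))

;; b rates every item exactly two points higher than a
(def a {"m1" 1.0 "m2" 2.0 "m3" 3.0})
(def b {"m1" 3.0 "m2" 4.0 "m3" 5.0})

(pearson-sketch a b)
;; => 1.0
(pearson-sketch a {"m1" 3.0 "m2" 2.0 "m3" 1.0})
;; => -1.0
(pearson-sketch {"m1" 5.0 "m2" 5.0} {"m1" 1.0 "m2" 1.0})
;; => 0.0, the denominator is zero when a user's ratings never vary
```

The first call shows the offset being ignored, the second shows opposite tastes, and the third shows the zero-denominator guard.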
mp4 Conversion for Mac OS X and Linux
12 Nov 2009
One annoying thing about the iPhone is that it does not play any video format other than mp4. There are a bunch of GUI tools out there, but the problem is that my machine crawls during conversion. As I was looking for alternatives I came across ffmpeg, which is a cross-platform solution to record, convert and stream audio and video. This is a quick and dirty bash script that converts any video you pass to it to mp4 and writes the output to SAVE_LOCATION.
#!/bin/bash
SAVE_LOCATION=~/Desktop/
if [ -z "$1" ]
then
    echo "requires file name..."
    exit 1
fi
# strip the directory, then the extension, from the input path
filename=${1##*/}
out_name=${filename%.*}".mp4"
# replace spaces in the output file name with dashes
out_file=$SAVE_LOCATION${out_name// /-}
echo "Converting: $1"
echo "Saving: $out_file"
ffmpeg -i "$1" -f mp4 \
    -acodec libfaac -ar 44100 -ab 128 \
    -vcodec mpeg4 -maxrate 2000 -b 1500 \
    -qmin 3 -qmax 5 -bufsize 4096 -g 300 \
    -s 320x240 -r 30000/1001 "$out_file"
Euclidean Distance Score
11 Nov 2009
Recently, I started rereading the excellent book Programming Collective Intelligence. I did not implement any of the examples the first time around, so this time I thought I would implement each one.
Euclidean distance is a method of calculating a score of how similar two things are. We get a value between 0 and 1, where 1 means they are identical and 0 means they don't have anything in common.
I am using the movie critics example from the book, converted to a Clojure map,
(def critics
{"Lisa Rose" {"Lady in the Water" 2.5 "Snakes on a Plane" 3.5
"Just My Luck" 3.0 "Superman Returns" 3.5
"You, Me and Dupree" 2.5 "The Night Listener" 3.0}
"Gene Seymour" {"Lady in the Water" 3.0 "Snakes on a Plane" 3.5
"Just My Luck" 1.5 "Superman Returns" 5.0
"The Night Listener" 3.0 "You, Me and Dupree" 3.5}
"Michael Phillips" {"Lady in the Water" 2.5 "Snakes on a Plane" 3.0
"Superman Returns" 3.5 "The Night Listener" 4.0}
"Claudia Puig" {"Snakes on a Plane" 3.5 "Just My Luck" 3.0
"The Night Listener" 4.5 "Superman Returns" 4.0
"You, Me and Dupree" 2.5}
"Mick LaSalle" {"Lady in the Water" 3.0 "Snakes on a Plane" 4.0
"Just My Luck" 2.0 "Superman Returns" 3.0
"The Night Listener" 3.0 "You, Me and Dupree" 2.0},
"Jack Matthews" {"Lady in the Water" 3.0 "Snakes on a Plane" 4.0
"The Night Listener" 3.0 "Superman Returns" 5.0
"You, Me and Dupree" 3.5}
"Toby" {"Snakes on a Plane" 4.5 "You, Me and Dupree" 1.0
"Superman Returns" 4.0}})
To calculate a Euclidean score between two people, we first find the movies they both ranked. For each of those movies we calculate the difference in ranks and square it, then sum all the squares to get a distance. All that remains is to normalize that distance so that the score falls between 0 and 1, with identical rankings scoring 1.

This is basically the Euclidean distance between two points in n dimensions, except that we don't take the square root of the sum. The square root is computationally expensive, and all we are interested in is the order of the distances, which stays the same whether we take the square root or not.
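Written as a formula, and assuming $r_{1,i}$ and $r_{2,i}$ denote the two people's rankings of the i-th shared movie, the score computed by the code in this post is:

```latex
\mathrm{sim}(p_1, p_2) = \frac{1}{1 + \sum_{i} \left(r_{1,i} - r_{2,i}\right)^2}
```

Identical rankings give a sum of 0 and therefore a score of 1, and the score approaches 0 as the squared differences grow.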
(defn euclidean [person1 person2]
  (let [shared-items (filter person1 (keys person2))
        score (reduce (fn [scr mv]
                        (let [score1 (person1 mv)
                              score2 (person2 mv)]
                          (+ scr (Math/pow (- score1 score2) 2))))
                      0 shared-items)]
    (if (= (count shared-items) 0)
      0
      (/ 1 (+ 1 score)))))
Now we can calculate a similarity score between two people,
user=> (euclidean (critics "Lisa Rose") (critics "Gene Seymour"))
0.14814814814814814
user=> (euclidean (critics "Lisa Rose") (critics "Lisa Rose"))
1.0
user=> (euclidean (critics "Lisa Rose") {})
0
This allows us to ask the question: which critics are similar to Lisa?
(defn sort-by-similarity [critics critic]
  (sort-by second
           (reduce (fn [h p]
                     (let [name (first p)
                           prefs (second p)
                           similarity (euclidean critic prefs)]
                       (assoc h name similarity))) {} critics)))
We iterate through the critics map, calculate the similarity score for each person, then sort by that score in ascending order, so the most similar critics come last,
user=> (sort-by-similarity critics (critics "Lisa Rose"))
(["Gene Seymour" 0.14814814814814814]
["Jack Matthews" 0.21052631578947367]
["Toby" 0.2222222222222222]
["Claudia Puig" 0.2857142857142857]
["Mick LaSalle" 0.3333333333333333]
["Michael Phillips" 0.4444444444444444]
["Lisa Rose" 1.0])
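If you would rather get the best matches first and leave the person out of their own results, something along these lines could work. This is only a sketch: euclidean-sketch, top-matches and the tiny ratings map are names and data made up for this example, and the similarity function is a condensed variant of the one in this post.

```clojure
(defn euclidean-sketch [p1 p2]
  (let [shared (filter #(contains? p1 %) (keys p2))]
    (if (empty? shared)
      0.0
      ;; 1 / (1 + sum of squared differences over the shared items)
      (/ 1.0 (+ 1.0 (reduce + (map #(Math/pow (- (p1 %) (p2 %)) 2) shared)))))))

(defn top-matches
  "Returns the n entries most similar to who, best match first."
  [critics who n]
  (let [prefs (critics who)]
    (->> (dissoc critics who)              ; do not compare a person to themselves
         (map (fn [[other other-prefs]]
                [other (euclidean-sketch prefs other-prefs)]))
         (sort-by second >)                ; descending by similarity
         (take n))))

;; toy data, not the critics map from the book
(def ratings
  {"ann" {"m1" 1.0 "m2" 2.0}
   "bob" {"m1" 1.0 "m2" 3.0}
   "cat" {"m1" 3.0 "m2" 4.0}})

(top-matches ratings "ann" 1)
;; => (["bob" 0.5])
```

Passing > as the comparator to sort-by flips the order, and dissoc keeps the 1.0 self-match out of the list.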
Using Java Mail API from Clojure
10 Nov 2009
The core Java API does not provide a way to send mail or interface with POP/IMAP servers, but Sun does provide a framework to build mail and messaging applications.
You can get the JavaMail API here. Following is a simple function that allows you to send mail from Clojure.
- activation.jar
- mailapi.jar
- smtp.jar
If you just need to send email from your Clojure applications, just grab the jars listed above. I use this snippet to send email through GMail. I have not tested it anywhere else, but it should work.
(defn mail [& m]
  (let [mail (apply hash-map m)
        props (java.util.Properties.)]
    ;; Properties.getProperty only returns String values,
    ;; so the port is stored as a string
    (doto props
      (.put "mail.smtp.host" (:host mail))
      (.put "mail.smtp.port" (str (:port mail)))
      (.put "mail.smtp.user" (:user mail))
      (.put "mail.smtp.socketFactory.port" (str (:port mail)))
      (.put "mail.smtp.auth" "true"))
    (if (= (:ssl mail) true)
      (doto props
        (.put "mail.smtp.starttls.enable" "true")
        (.put "mail.smtp.socketFactory.class"
              "javax.net.ssl.SSLSocketFactory")
        (.put "mail.smtp.socketFactory.fallback" "false")))
    (let [authenticator (proxy [javax.mail.Authenticator] []
                          (getPasswordAuthentication
                            []
                            (javax.mail.PasswordAuthentication.
                             (:user mail) (:password mail))))
          session (javax.mail.Session/getDefaultInstance props authenticator)
          msg (javax.mail.internet.MimeMessage. session)]
      (.setFrom msg (javax.mail.internet.InternetAddress. (:user mail)))
      ;; addRecipients appends; setRecipients would replace the
      ;; recipients added on earlier iterations
      (doseq [to (:to mail)]
        (.addRecipients msg
                        (javax.mail.Message$RecipientType/TO)
                        (javax.mail.internet.InternetAddress/parse to)))
      (.setSubject msg (:subject mail))
      (.setText msg (:text mail))
      (javax.mail.Transport/send msg))))
(mail :user "user@gmail.com"
      :password "pass"
      :host "smtp.gmail.com"
      :port 465
      :ssl true
      :to ["nurullah@nakkaya.com"]
      :subject "I Have Rebooted."
      :text "I Have Rebooted.")
Command Line Progress Bar
08 Nov 2009
I frequently need a progress bar for applications, in order to visualize what is going on in the application. Following is a simple progress bar implemented in three different languages, C++, Clojure and Java.
[================>                                 ] 33%
They all look the same, just call the appropriate function with the percentage to show.
C++
#include <iostream>
#include <string>

void printProgBar( int percent ){
  std::string bar;
  // build the 50 character bar: "=" for the done part, ">" as the head
  for(int i = 0; i < 50; i++){
    if( i < (percent/2)){
      bar += "=";
    }else if( i == (percent/2)){
      bar += ">";
    }else{
      bar += " ";
    }
  }
  std::cout<< "\r" "[" << bar << "] ";
  std::cout.width( 3 );
  std::cout<< percent << "% " << std::flush;
}
Clojure
(defn percent-bar [percent]
  (let [slen (int (/ percent 2))
        shaft (apply str (repeat slen "="))
        filler (apply str (repeat (- 50 slen) " "))]
    (str "[" shaft ">" filler "]")))
Tom Hicks in the comments provided a more elegant version than mine, by computing the sequences of characters that we need directly.
(defn print-progress-bar [percent]
  (let [bar (StringBuilder. "[")]
    (doseq [i (range 50)]
      (cond (< i (int (/ percent 2))) (.append bar "=")
            (= i (int (/ percent 2))) (.append bar ">")
            :else (.append bar " ")))
    (.append bar (str "] " percent "% "))
    ;; print puts a space between its arguments, so join with str first
    (print (str "\r" bar))
    (flush)))
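For completeness, here is a small self-contained sketch of how a bar like this might be driven from a loop. progress-line and demo are names made up for this example. Unlike the functions above, progress-line clamps the percentage to 0..100 and always produces exactly 50 characters between the brackets, which is a design choice of this sketch rather than something the versions above do.

```clojure
(defn progress-line
  "Builds the bar as a string: 50 characters between brackets, then the percentage."
  [percent]
  (let [pct  (max 0 (min 100 (int percent)))
        ;; position of the ">" head, capped so it stays inside the bar
        head (min 49 (quot pct 2))
        bar  (apply str (concat (repeat head "=")
                                [">"]
                                (repeat (- 49 head) " ")))]
    (format "[%s] %3d%%" bar pct)))

(defn demo
  "Redraws the bar in place a few times, then moves to the next line."
  []
  (doseq [p (range 0 101 20)]
    (print (str "\r" (progress-line p)))
    (flush)
    (Thread/sleep 50))
  (println))
```

Keeping the string building separate from the printing makes the bar easy to check without a terminal, and calling (demo) shows it redrawing on one line.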
Java
public static void printProgBar(int percent){
    StringBuilder bar = new StringBuilder("[");
    for(int i = 0; i < 50; i++){
        if( i < (percent/2)){
            bar.append("=");
        }else if( i == (percent/2)){
            bar.append(">");
        }else{
            bar.append(" ");
        }
    }
    bar.append("] " + percent + "% ");
    System.out.print("\r" + bar.toString());
    // no newline is printed, so flush explicitly to show the bar right away
    System.out.flush();
}
Remote File Editing Using Emacs
06 Nov 2009
Emacs has a package called TRAMP (Transparent Remote (file) Access, Multiple Protocol) which allows you to edit files on remote machines via SSH. Since Emacs 22, TRAMP is included with the distribution.
All you need to do is add the following lines to your .emacs file,
(require 'tramp)
(setq tramp-default-method "scp")
Then in order to open a file on a remote machine, you can use,
C-x C-f /user@your.host.com:/path/to/file
If you don't want to enter your password every time you open or save a file consider using Public Key Authentication.
TRAMP can also be used to edit files on the same machine as another user. If you want to open a file as root, you can use,
C-x C-f /root@127.0.0.1:/path/to/file
Clojure on Google App Engine
04 Nov 2009
I was moving some Compojure applications to Google App Engine. After creating the same directory structure and the same configuration three times, I wanted to automate this, so I built an ant script that creates the necessary directory structure and source files, and downloads and sets up the SDK and Compojure.
The code is hosted on GitHub and, like any other piece of code here, it is released under the Beerware license.
To get started, you can either download the repo or clone it. The only things you need to configure are your app-id and app-display-name in the build file.
Running,
ant setup
will download the necessary SDK and compojure files, and create source files required for a Hello, World application.
You can test the application by running,
ant devserver
When you are ready to deploy your application to Google, all you need to run is,
ant deploy
Before your first deployment you need to run appcfg.sh manually and set your credentials by running,
./sdk/bin/appcfg.sh
Decoding BEncoded Streams in Clojure
02 Nov 2009
EDIT: I have updated the code to include both decoding and encoding; information on how to encode can be found here.
Bencode is the encoding used by the file sharing system BitTorrent. Torrent files are simply Bencoded dictionaries. This post will walk you through my Bencode decoder. If you want to jump right into the code, it is here.
You can read about the BitTorrent specification here. There is also a lot of information on theory.org, and don't forget to check out the Wikipedia article.
Now, with the specs out of the way, let's dissect the code,
(defn decode [stream & i]
  (let [indicator (if (nil? i) (.read stream) (first i))]
    (cond
      (and (>= indicator 48)
           (<= indicator 57)) (decode-string stream indicator)
      (= (char indicator) \i) (decode-number stream \e)
      (= (char indicator) \l) (decode-list stream)
      (= (char indicator) \d) (decode-map stream))))
decode reads one byte from the stream, determines its type and calls the appropriate function.
(defn- decode-number [stream delimeter & ch]
  (loop [i (if (nil? ch) (.read stream) (first ch)), result ""]
    (let [c (char i)]
      (if (= c delimeter)
        (BigInteger. result)
        (recur (.read stream) (str result c))))))
decode-number takes the stream that we are processing and a delimiter, and stops reading when the delimiter is read. This function is used to decode both numbers formatted as "i23e" and byte string lengths such as "10:".
(defn- decode-string [stream ch]
  (let [length (decode-number stream \: ch)
        buffer (make-array Byte/TYPE length)]
    (.read stream buffer)
    (String. buffer "ISO-8859-1")))
decode-string parses the string length and reads the indicated number of bytes from the stream. I build the string using ISO-8859-1 so that binary data such as SHA-1 hashes will not be corrupted.
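One thing to keep in mind is that InputStream's read method for a byte array is allowed to return after reading fewer bytes than the array holds, which can matter for large strings or network streams. A possible way to guard against that, sketched here with a made-up helper name read-exactly, is to wrap the stream in a DataInputStream and use readFully:

```clojure
(import '(java.io ByteArrayInputStream DataInputStream InputStream))

;; Hypothetical helper, not part of the decoder above: reads exactly n bytes
;; or throws java.io.EOFException if the stream ends first.
(defn read-exactly [^InputStream stream n]
  (let [buffer (byte-array n)]
    ;; DataInputStream does no buffering of its own, so wrapping the stream
    ;; here does not consume bytes beyond the n we ask for
    (.readFully (DataInputStream. stream) buffer)
    (String. buffer "ISO-8859-1")))

(let [in (ByteArrayInputStream. (.getBytes "spam4:eggs" "ISO-8859-1"))]
  (read-exactly in 4))
;; => "spam"
```

After the call the stream is positioned right after the four bytes, so decoding can carry on with the next indicator.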
(defn- decode-list [stream]
  (loop [result []]
    (let [c (char (.read stream))]
      (if (= c \e)
        result
        (recur (conj result (decode stream (int c))))))))
decode-list will call decode on the items until the list delimiter "e" is read. Lists are returned as Clojure vectors.
(defn- decode-map [stream]
  (apply hash-map (decode-list stream)))
decode-map decodes the map as a list, then applies hash-map to it, producing a Clojure map.
Download code here.
Clojure Persistence for Java Programmers
01 Nov 2009
Having used Java for a long time, whenever I needed to save some data structure to disk, my first response was to serialize it to a file. While working on a Clojure application, I did just that. It worked half the time, because not every data structure implements Serializable.
Then I remembered that Clojure is a Lisp: code is data. This allows you to dump everything as a String to a file and read it back as a data structure.
user=> (doc prn)
-------------------------
clojure.core/prn
([& more])
Same as pr followed by (newline). Observes *flush-on-newline*
nil
You can pass prn a vector, a map or any other object you want, and it will print the object to the output stream.
(defstruct db :file :data)
(defn write-db [db]
  ;; with-open closes the FileWriter once the data has been written
  (with-open [w (java.io.FileWriter. (:file db))]
    (binding [*out* w]
      (prn (:data db)))))
By binding *out* to a FileWriter we can easily dump any object to a file,
(write-db (struct db "test" [1 2 3]))
(write-db (struct db "test" {:test "test" :ax "ax"}))
To read it back we use the read-string function,
user=> (doc read-string)
-------------------------
clojure.core/read-string
([s])
Reads one object from the string s
nil
read-string takes a string and returns an object,
(defn read-db [fname]
  (try
    (let [object (read-string (slurp fname))]
      (struct db fname object))
    (catch Exception e nil)))
(read-db "test")
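As a variation on the same idea, here is a minimal round-trip sketch using spit, slurp and pr-str. save-data and load-data are names made up for this example. It reads the file back with clojure.edn/read-string, which assumes Clojure 1.5 or later and only reads data, so it is the more cautious choice if the file could come from somewhere you do not trust.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical helpers, not part of the code above.
(defn save-data [path data]
  ;; pr-str produces the same readable text that prn would print
  (spit path (pr-str data)))

(defn load-data [path]
  (edn/read-string (slurp path)))

;; round trip through a temporary file
(let [f    (java.io.File/createTempFile "db" ".edn")
      path (.getPath f)]
  (save-data path {:test "test" :nums [1 2 3]})
  (println (load-data path))
  (.delete f))
```

spit opens and closes the file for you, so there is no writer left to clean up.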