Pretty Printing XML with Clojure
27 Mar 2010
The other day, I did some XML cleanup. I am posting the snippet here for safekeeping purposes in case I need to refer to it later.
(defn ppxml [xml]
(let [in (javax.xml.transform.stream.StreamSource.
(java.io.StringReader. xml))
writer (java.io.StringWriter.)
out (javax.xml.transform.stream.StreamResult. writer)
transformer (.newTransformer
(javax.xml.transform.TransformerFactory/newInstance))]
(.setOutputProperty transformer
javax.xml.transform.OutputKeys/INDENT "yes")
(.setOutputProperty transformer
"{http://xml.apache.org/xslt}indent-amount" "2")
(.setOutputProperty transformer
javax.xml.transform.OutputKeys/METHOD "xml")
(.transform transformer in out)
(-> out .getWriter .toString)))
Now you can pass your XML string,
(ppxml "<root><child>aaa</child><child/></root>")
and get the pretty printed version,
<?xml version="1.0" encoding="UTF-8"?>
<root>
<child>aaa</child>
<child/>
</root>
You can also use it to pretty print Compojure output either manually,
(ppxml (html
[:html
[:head
[:title "Hello World"]]
[:body "Hello World!"]]))
or using a middleware,
(defn with-ppxml [handler]
(fn [request]
(let [response (handler request)]
(assoc response :body (ppxml (:body response))))))
and have your pretty printed HTML,
<html>
<head>
<title>Hello World</title>
</head>
<body>Hello World!</body>
</html>
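The middleware above assumes every response body is a string of well-formed XML. A minimal sketch of a more defensive variant follows. The name with-pretty-body and the pretty-fn parameter are my own and not part of Compojure; the wrapper simply leaves any body that is not a string (files, streams, nil) untouched,

```clojure
;; Hypothetical helper: applies pretty-fn only to string bodies.
(defn with-pretty-body [handler pretty-fn]
  (fn [request]
    (let [response (handler request)]
      (if (string? (:body response))
        (update-in response [:body] pretty-fn)
        response))))
```

With the function from this post it would be used as (with-pretty-body handler ppxml).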
Steganography with Clojure - Hiding Text in Images
23 Mar 2010
Steganography is the process of hiding data in other data so that no one apart from the sender and the receiver knows of the existence or transmission of the message. It allows us to send a message within a seemingly unimportant message or something that does not attract attention.
Steganography has been used throughout history; some old-school methods include,
- Greeks and wax covered tablets
- Histiaeus and the shaved head
- Invisible inks in WWII
- Microdots
This post will cover hiding textual data in images using the LSB (Least Significant Bit) technique. Each pixel in an image is a 32-bit int, split into four 8-bit values representing alpha, red, green and blue.
0xAARRGGBB
Changing the least significant bit in each of these four values produces only minor variations in color that should be unnoticeable to the naked eye; even when noticed, they can easily be mistaken for flaws in the quality of the picture. So by changing the last bit of all four values we can encode 4 bits of data per pixel. (Not all image formats support alpha; for those you can encode 3 bits per pixel.)
(defn bits [n]
(reverse (map #(bit-and (bit-shift-right n %) 1) (range 8))))
Given a byte, bits will return a sequence of bits that represent that byte,
steganography=> (bits (int \C))
(0 1 0 0 0 0 1 1)
numb reverses the process: given a sequence of bits, you get the original byte,
(defn numb [bits]
(BigInteger. (apply str bits) 2))
steganography=> (char (numb (bits (int \C))))
\C
Using set-lsb we will encode one bit per a, r, g, b value: given the bits of a byte and one bit from the data, we set the LSB to the given bit,
(defn set-lsb [bits bit]
(concat (take 7 bits) [bit]))
steganography=> (set-lsb (bits 255) 0)
(1 1 1 1 1 1 1 1) => (1 1 1 1 1 1 1 0)
We take the string we want to encode and append ";" to it, which marks the end of our message while decoding, then turn it into a sequence of bits,
(defn string-to-bits [msg]
(flatten (map #(bits %) (.getBytes (str msg ";")))))
steganography=> (string-to-bits "cb")
(0 1 1 0 0 0 1 1 0 1 1 0 0 0 1 0 0 0 1 1 1 0 1 1)
Next, using the bit sequence we created, we match every four bits to a coordinate,
(defn match-bits-coords [bits img]
(partition 2
(interleave (partition 4 bits)
(take (/ (count bits) 4)
(for [x (range (.getWidth img))
y (range (.getHeight img))] [x y])))))
steganography=> (match-bits-coords (string-to-bits "c")
(ImageIO/read (File. "drive.png")))
(((0 1 1 0) [0 0]) ((0 0 1 1) [0 1])
((0 0 1 1) [0 2]) ((1 0 1 1) [0 3]))
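We iterate over this sequence of bit and coordinate pairs. For each pixel we retrieve its argb value, match each of the a, r, g, b values with a bit, encode the bit using set-lsb, and set the new color we calculated for the pixel (get-argb and set-argb are helpers that read and write a pixel's [a r g b] values),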
(defn set-pixels [img d]
(doseq [[data cord] d]
(let [color-bit (partition 2 (interleave (get-argb img cord) data))
color (map #(let [[n b] %]
(numb (set-lsb (bits n) b))) color-bit)]
(set-argb img cord color))))
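set-pixels relies on get-argb and set-argb, which are not listed in this post. One possible way to write them is sketched below, assuming img is a java.awt.image.BufferedImage with an alpha channel (for example TYPE_INT_ARGB). This is a guess at the shape of the helpers, not the original code,

```clojure
(import 'java.awt.image.BufferedImage)

;; Assumed helper: unpack the 0xAARRGGBB int of a pixel into [a r g b].
(defn get-argb [^BufferedImage img [x y]]
  (let [p (.getRGB img (int x) (int y))]
    [(bit-and (bit-shift-right p 24) 0xff)
     (bit-and (bit-shift-right p 16) 0xff)
     (bit-and (bit-shift-right p 8) 0xff)
     (bit-and p 0xff)]))

;; Assumed helper: pack [a r g b] back into one int and store it.
;; The values may be BigIntegers (as returned by numb), so they are
;; coerced with long before shifting.
(defn set-argb [^BufferedImage img [x y] [a r g b]]
  (.setRGB img (int x) (int y)
           (unchecked-int
            (bit-or (bit-shift-left (long a) 24)
                    (bit-shift-left (long r) 16)
                    (bit-shift-left (long g) 8)
                    (long b)))))
```

On an image type without alpha, the bit stored in the alpha value would be lost, so such images would need the 3 bits per pixel variant.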
In order to encode data, we read the image, match bits to coordinates, iterate through the pixels calculating and setting new colors, and finally write the image,
(defn encode [fname msg]
(let [img (ImageIO/read (File. fname))
data (match-bits-coords (string-to-bits msg) img)]
(set-pixels img data)
(ImageIO/write img "png" (File. (str "encoded_" fname)))))
Extracting data we encoded is much simpler,
(defn get-pixels [img]
(map #(get-argb img %) (for [x (range (.getWidth img))
y (range (.getHeight img))] [x y])))
steganography=> (take 3 (get-pixels (ImageIO/read (File. "encoded_drive.png"))))
([0 255 254 254] [0 254 254 255] [0 255 255 255])
First, get-pixels builds a sequence of argb values for each pixel,
(defn split-lsb [data]
(map #(last (bits %)) data))
After flattening this sequence, we extract the least significant bit from each byte, giving us a sequence of 0's and 1's: our original string as a bit string,
(defn decode [fname]
(let [img (ImageIO/read (File. fname))
to-char #(char (numb (first %)))]
(loop [bytes (partition 8 (split-lsb (flatten (get-pixels img))))
msg (str)]
(if (= (to-char bytes) \;)
msg
(recur (rest bytes) (str msg (to-char bytes)))))))
Now all we have to do is partition that sequence into groups of 8, each representing a char. We just keep converting bits into chars until we read ";", which denotes that we have reached the end of our message. Okay, enough typing, let's see it in action, assuming we want to encode "Attack At Down!!".
Image before steganography,

steganography=> (encode "drive.png" "Attack At Down!!")
steganography=> (decode "encoded_drive.png")
"Attack At Down!!"
Image after steganography,

You are not limited to encoding text in images; you can also embed images within images. Although I used 4 bits per pixel, if you think you can get away with more degradation in quality you can embed more bits per pixel.
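The bit-level part of the scheme can be checked without touching an image. The sketch below repeats bits and numb from this post and adds a hypothetical bits-to-string that mirrors what decode does once the LSBs have been collected. Note that a message containing ";" itself would be cut short at that character,

```clojure
(defn bits [n]
  (reverse (map #(bit-and (bit-shift-right n %) 1) (range 8))))

(defn numb [bits]
  (BigInteger. (apply str bits) 2))

;; Same result as string-to-bits above, using mapcat instead of flatten.
(defn string-to-bits [msg]
  (mapcat bits (.getBytes (str msg ";"))))

;; Hypothetical helper: groups of 8 bits become chars until ";" is read.
(defn bits-to-string [bit-seq]
  (->> (partition 8 bit-seq)
       (map #(char (numb %)))
       (take-while #(not= % \;))
       (apply str)))

(bits-to-string (string-to-bits "Attack At Down!!"))
;; => "Attack At Down!!"
```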
org-mode in Your Pocket - Setting Up MobileOrg
19 Mar 2010
MobileOrg is an iPhone application that lets you view and modify org files on the go. It's a great application, but documentation is scarce and a bit confusing. This post documents the steps required to configure org-mode so it can sync with MobileOrg.
By default org-mode looks in the "~/org/" folder for your org files. If you keep them somewhere else, set the org-directory variable to point to it,
(setq org-directory "~/Documents/org/")
(setq org-mobile-inbox-for-pull "~/Documents/org/from-mobile.org")
MobileOrg uses WebDAV to synchronize your files. If you mount your WebDAV share as a disk, you need to set org-mobile-directory to point to it; alternatively you can use the org-mobile push/pull hooks and use scp instead.
(setq org-mobile-directory "/Volumes/nakkaya.com/org/")
By default no files are staged to WebDAV; you need to set org-mobile-files to the list of files you want to access on the iPhone,
(setq org-mobile-files (quote ("gtd.org")))
When you sync your org files, org-mobile will add a property drawer to your files. If you want to get rid of it you can use,
(setq org-mobile-force-id-on-agenda-items nil)
but beware that if you have a file structure such as,
* Task
** SubTask
* Task
** SubTask
and you edit one of the subtasks, org-mobile will have no way to determine which one to edit; other than that you will be safe. As for agendas, only your custom agenda views are synchronized. I also suggest setting org-agenda-show-all-dates to nil so that empty days are filtered out, which makes viewing agendas easier.
(setq org-agenda-custom-commands
'(("w" todo "TODO")
("h" agenda "" ((org-agenda-show-all-dates nil)))
("W" agenda "" ((org-agenda-ndays 21)
(org-agenda-show-all-dates nil)))
("A" agenda ""
((org-agenda-ndays 1)
(org-agenda-overriding-header "Today")))))
Adding Custom Libraries Into Local Leiningen Repository
16 Mar 2010
Sometimes your project depends on a library which is not in Clojars, or maybe it is a proprietary library which you can't upload to Clojars. In this case, you can install it into your local repository yourself to satisfy the dependency.
mvn install:install-file \
-Dfile=mysql-connector-java-5.1.10-bin.jar \
-DgroupId=self \
-DartifactId=mysql-connector \
-Dversion=5.1.10 \
-Dpackaging=jar \
-DgeneratePom=true
This will add the MySQL adapter to your local Maven2 repository under groupId self and artifactId mysql-connector. You can then edit your project.clj, adding this dependency as,
[self/mysql-connector "5.1.10"]
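For context, a project.clj using that dependency might look like the sketch below. The project name, its version and the Clojure version are placeholders I made up for illustration; only the self/mysql-connector entry comes from the command above,

```clojure
;; Hypothetical project.clj; adjust name and versions to your project.
(defproject myapp "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.1.0"]
                 [self/mysql-connector "5.1.10"]])
```

Running "lein deps" afterwards should resolve the jar from the local repository.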
Visualizing Maps Using Incanter
09 Mar 2010
When I first saw Mathematica's WorldPlot function I was impressed; it's a nice way to visualize various forms of geographical data. For some time I thought this must be a very labor-intensive task: who would go around labeling each pixel? A couple of days ago I somehow ended up reading the Processing article on Wikipedia, which contains an example showing a map of the results of the 2008 USA presidential election. It turns out that, using Scalable Vector Graphics, implementing WorldPlot functionality is extremely easy.
We will be plotting how population moved between different regions in Turkey (into and out of a region). I am using data provided by the Turkish Statistical Institute; you can grab the map I used from Wikipedia here.
;; Data for 2009
(def pop-taken [{:id 1 :name "Marmara" :population 582771}
{:id 2 :name "Iç Anadolu" :population 297919}
{:id 3 :name "Ege" :population 164896}
{:id 4 :name "Akdeniz" :population 188441}
{:id 5 :name "Karadeniz" :population 256654}
{:id 6 :name "Güneydoğu Anadolu" :population 171910}
{:id 7 :name "Doğu Anadolu" :population 214082}])
(def pop-given [{:id 1 :name "Marmara" :population 677395}
{:id 2 :name "Iç Anadolu" :population 310293}
{:id 3 :name "Ege" :population 181459}
{:id 4 :name "Akdeniz" :population 193231}
{:id 5 :name "Karadeniz" :population 247397}
{:id 6 :name "Güneydoğu Anadolu" :population 118611}
{:id 7 :name "Doğu Anadolu" :population 148287}])
Turkey is divided into seven geographical regions. pop-taken represents how many people moved into a particular region and pop-given represents how many people moved out of that region during 2009.
(defn region-color [val min max]
(lerp-color (color 0xffd120) (color 0x920903) (norm val min max)))
In order to paint the map like a heat map, we need to assign colors based on the number of people who moved in or out of a region. Given a min, a max and a value in between, norm normalizes the value to lie between 0 and 1; lerp-color then calculates a color within the given range using the normalized value. So our map will go from yellow to dark red depending on the number of people who moved.
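Processing supplies norm and lerp-color, so nothing extra is needed for the sketch itself. The plain Clojure below only illustrates the arithmetic as I understand it: the names normalize and lerp-rgb are hypothetical, and colors are written as [r g b] vectors rather than Processing color ints,

```clojure
;; Map v from the range [lo, hi] onto [0, 1]; assumes hi differs from lo.
(defn normalize [v lo hi]
  (/ (- v lo) (double (- hi lo))))

;; Linearly interpolate each channel between two [r g b] colors.
(defn lerp-rgb [from to t]
  (mapv (fn [a b] (Math/round (double (+ a (* (- b a) t)))))
        from to))

;; Smallest and largest values in pop-given land on the two end colors.
(lerp-rgb [0xff 0xd1 0x20] [0x92 0x09 0x03] (normalize 118611 118611 677395))
;; => [255 209 32]
(lerp-rgb [0xff 0xd1 0x20] [0x92 0x09 0x03] (normalize 677395 118611 677395))
;; => [146 9 3]
```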
(defn map-region-color [regions]
(let [min (apply min (map #(:population %) regions))
max (apply max (map #(:population %) regions))]
(map #(vector (:id %) (region-color (:population %) min max)) regions)))
Now all we need to do is calculate min and max values in the data set, iterate over the data set and return a sequence of [id color] pairs.
(defn sktch [regions]
(sketch
(setup [])
(draw
[]
(let [tr-map (load-shape this "MapTurkishProvincesNumbers.svg")]
(.shape this tr-map 0 0)
(doseq [region (map-region-color regions)]
(let [[id color] region
child (.getChild tr-map (str id))]
(.disableStyle child)
(.fill this color)
(.noStroke this)
(.shape this child 0 0)
(.noLoop this)))))))
Using the incanter-processing library, we can load and access parts of the SVG map. Processing sketches are made up of two functions, setup and draw. In setup, as its name suggests, you set up things such as frame rate and stroke properties. draw will be called once or multiple times depending on your frame rate. In it we load the map as a shape and paint it on the canvas, then iterate over the data set. Using the getChild method of the PShape class we can access parts of the image; the map we are using has 7 children named 1 through 7, corresponding to the geographical regions of the country. We get each child, then paint it onto the canvas using the color we calculated. One thing to note: the sketch macro just returns a PApplet, so any function not implemented in incanter can be called just like any other Java method. Now let's see the results,
(view (sktch pop-given) :size [1052 744])

(view (sktch pop-taken) :size [1052 744])

WebDAV + SSL on Debian
05 Mar 2010
I was looking for a way to easily share documents between machines. Since WebDAV shares can be accessed by Windows, Linux or Mac machines out of the box, I chose WebDAV over SSL. I don't use SSL for anything else, so WebDAV is served straight from the DocumentRoot. I've been using it for a few days, and so far it beats carrying USB sticks around.
Enable relevant Apache modules,
a2enmod ssl
a2enmod dav_fs
a2enmod dav
Create SSL certificate,
mkdir /etc/apache2/ssl
openssl req -new -x509 -days 365 -nodes -out /etc/apache2/ssl/apache.pem \
-keyout /etc/apache2/ssl/apache.pem
chmod 600 /etc/apache2/ssl/apache.pem
Create your WebDAV directory and create a password file,
mkdir /path/to/webdav/
chown www-data /path/to/webdav/
htpasswd -c /path/to/passwd.dav user
Edit the configuration of the host on which you want to enable WebDAV and add the following snippet,
<VirtualHost *:443>
ServerAdmin user@host.com
DocumentRoot /path/to/webdav
SSLEngine on
SSLCertificateFile /etc/apache2/ssl/apache.pem
<Directory /path/to/webdav/>
DAV On
AuthType Basic
AuthName "webdav"
AuthUserFile /path/to/passwd.dav
Require valid-user
</Directory>
ErrorLog /path/to/webdav/error.log
CustomLog /path/to/webdav/access.log combined
</VirtualHost>
Reload Apache configuration,
/etc/init.d/apache2 reload
Analytics with Incanter
02 Mar 2010
These past few days I've been playing with Incanter, which is a Clojure-based, R-like platform for statistical computing and graphics. This post covers the basic steps of using Clojure to access your Google Analytics data with the Google Analytics Data Export API, and to visualize and filter the returned data using Incanter.
Google provides a Java library to simplify the use of any Google Data API with Java. To access Analytics you need to grab the following Jars from gdata-java-client and google-collections,
- gdata-client-1.0.jar
- gdata-client-meta-1.0.jar
- gdata-core-1.0.jar
- gdata-analytics-2.1.jar
- gdata-analytics-meta-2.1.jar
- google-collect-1.0.jar
After running "lein deps" add them to the lib/ subdirectory.
(defn service [username pass]
(doto (AnalyticsService. "Clojure_Incanter_Sample")
(.setUserCredentials username pass)))
In order to retrieve data we need a service object, which handles all interaction between our application and the Analytics Data Export API.
(defn account-feed [service & args]
(let [url (URL. (str "https://www.google.com/analytics/"
"feeds/accounts/default?max-results=50"))
feed (.getFeed service url (get-class "AccountFeed"))
accs (reduce #(assoc %1
(-> %2 .getTitle .getPlainText)
(-> %2 .getTableId .getValue))
{} (.getEntries feed))]
(if (nil? args) accs (accs (first args)))))
To retrieve data for a profile, we need its table id. Asking the service for an account feed returns a list of entries containing title, table id and profile id, but we are only interested in title and table id.
(defn data-feed [service & args]
(let [args (apply hash-map args)
feed (.getFeed service (.getUrl (query args)) (get-class "DataFeed"))
cols (map #(str "ga" %) (concat (:dimensions args) (:metrics args)))]
(map (fn [e]
(map #(.stringValueOf e %) cols))
(.getEntries feed))))
As with the account feed, the first thing we need to do is build a feed request URL. The query function handles that; it is nothing fancy and just calls a bunch of setters for dimensions, metrics etc. Querying the Analytics service with a data feed URL returns a list of entries; data-feed maps over them and returns a sequence of rows containing the dimensions and metrics we requested.
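One detail in data-feed that is easy to miss: the column names are built by calling str on keywords, so the leading colon of the keyword becomes the separator in the "ga:" prefix. A small stand-alone sketch of that step (ga-cols is a name I made up for illustration),

```clojure
;; (str "ga" :pageviews) yields "ga:pageviews" because the printed
;; form of a keyword starts with a colon.
(defn ga-cols [dimensions metrics]
  (map #(str "ga" %) (concat dimensions metrics)))

(ga-cols [:pageTitle :pagePath] [:pageviews])
;; => ("ga:pageTitle" "ga:pagePath" "ga:pageviews")
```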
Now that we have some data to play with, we can start off by doing fairly standard things, like which pages got the most visits for the past month,
(def analytics (service "username" "password"))
(def acc-nakkaya (account-feed analytics "nakkaya.com"))
(def pageview (data-feed analytics
:date ["2010-01-26" "2010-02-25"]
:dimensions [:pageTitle :pagePath]
:metrics [:pageviews]
:sort [:pageviews]
:num-result 10
:id acc-nakkaya))
This is where Incanter makes things fun: as long as you have a sequence of rows, in this case what data-feed returns, you can call view to visualize the data,
(view pageview)

or we can filter the data, leaving only the portions we are interested in, such as pages with more than 200 and fewer than 800 views,
(with-data (col-names (map (fn [[x y z]] [x y (BigInteger. z)]) pageview)
[:title :path :views])
(view ($where {:views {:$gt 200 :$lt 800}})))
Alternatively you can filter the data in Clojure, requesting the top 10 keywords people used to find your website and keeping the ones that contain "clojure" or "java",
(def keywords (data-feed analytics
:date ["2010-01-26" "2010-02-25"]
:dimensions [:keyword]
:metrics [:visits]
:sort [:visits]
:num-result 10
:id acc-nakkaya))
(let [words ["clojure" "java"]]
(reduce (fn[h v]
(if (some true? (map #(.contains (first v) %) words))
(conj h v) h)) [] keywords))
analytics.core=> [("clojure xml" "62") ("clojure turtle graphics" "31")
("clojure opencv" "26") ("detect faces from webcam+java" "26")]
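The same selection can also be written with filter instead of reduce. The sketch below is only an alternative phrasing: matching-rows is a hypothetical name, the first four sample rows are taken from the output above, and the last one is invented so that there is something to drop. It assumes Clojure 1.8 or later for clojure.string/includes?,

```clojure
(require '[clojure.string :as string])

;; Keep the rows whose first column contains any of the given words.
(defn matching-rows [words rows]
  (filter (fn [[kw]] (some #(string/includes? kw %) words)) rows))

(def sample-rows
  [["clojure xml" "62"]
   ["clojure turtle graphics" "31"]
   ["clojure opencv" "26"]
   ["detect faces from webcam+java" "26"]
   ["emacs org-mode" "10"]]) ; invented row, not real data

(matching-rows ["clojure" "java"] sample-rows)
```

Unlike the reduce version, this returns a lazy sequence rather than a vector.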
Besides visualizing data in tables, we can plot graphs of the information we are interested in,
(def browsers (data-feed analytics
:date ["2010-01-26"]
:dimensions [:browser]
:metrics [:visits]
:sort [:visits]
:num-result 10
:id acc-nakkaya))
(view (bar-chart (take 4 (map first browsers))
(take 4 (map #(BigInteger. (last %)) browsers))
:title "Browser/Visits"
:x-label "Browsers"
:y-label "Visits"))

(def view-date (data-feed analytics
:date ["2009-11-26" "2010-02-25"]
:dimensions [:date]
:metrics [:visitors]
:sort [:visitors]
:num-result 10
:id acc-nakkaya))
(view (line-chart (map first view-date)
(map #(BigInteger. (last %)) view-date)
:title "Visits"
:x-label "Date"
:y-label "Visits"))
