2010-08-30

Of Cows and Power Lines (using Google Earth)

IEEE Spectrum has an interesting article, "Of Cows and Power Lines" (related articles: 1, 2), which reports that cows (among some other animals) naturally align in a north/south direction.

Studies also found that this alignment is disrupted when cows are close to power lines.

What's interesting to Google is that the studies used Google Earth to find the herds of cattle and the power lines. Google is surely achieving its mission to "organize the world's information and make it universally accessible and useful." =)

2010-08-16

Textbooks 101

IEEE Computing in Science & Engineering published a short primer on writing your first book (excerpted from a conference paper of the same title) by Steven Barrett and Daniel Pack, who have now written 6+ books. The article outlines the overall process of writing a textbook, including:
1. considering coauthors;
2. selecting a topic;
3. preparing a prospectus (including the book's philosophy, overview, and outline);
4. time commitment;
5. finding a publisher;
6. book contract; and
7. writing the book.
(Some of the questions from the acquisition editor are also good.)

It's a good read if you intend to write a book.

2010-06-21

Concatenating process outputs

It's very common and simple to pipe the output of one process to another process as its input. But how do you feed the concatenated output of several processes to another process? It's actually extremely simple: just put parentheses around the sequence of commands to run, that is:

(command 1; command 2; ...; command n) | receiving process


As an example, to combine messages from two Google Gadget messagebundle files, we can define a bash function that takes three parameters as follows:
function mergemsg {
  # Require two existing input files and a non-empty output file name.
  [[ -f $1 && -f $2 && -n $3 ]] && \
  (head -n 2 "$1"; \
   (sed -e '1,2d;$d' "$1"; sed -e '1,2d;$d' "$2") | sort; \
   tail -n 1 "$1") > "$3"
}

  • [[ -f $1 && -f $2 && -n $3 ]] checks that the first two parameters are existing files and that the third is a non-empty string
  • && (short-circuit AND) runs the command that follows only if the preceding check succeeds
  • head -n 2 "$1" prints the first two lines of file $1, which are the XML declaration and the opening root tag
  • sed -e '1,2d;$d' "$1" strips the first two lines and the last line of the message bundle file $1; similarly for file $2
  • (sed ... "$1"; sed ... "$2") | sort combines the messages from files $1 and $2 and sorts them in ascending order
  • tail -n 1 "$1" prints the last line of file $1, which is the closing root tag
  • ( ... ) > "$3" finally writes the combined output to file $3
We can then simply run:
mergemsg zh-TW_ALL.xml ALL_hk.xml zh-TW_NEWS.xml
mergemsg ar_ALL.xml ar_mx.xml ar_NEWS.xml
...
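
For a smaller illustration of the same grouping idea, here is a minimal sketch that does not depend on any message bundle files. The function names combine_sorted and combine_sorted_braces are made up for this example; the printf commands simply stand in for "command 1" and "command 2".

```shell
#!/usr/bin/env bash

# Hypothetical demo: the two printf commands stand in for any two commands.
combine_sorted() {
  # The parentheses run both commands in one subshell, so the pipe
  # receives their outputs back to back as a single stream.
  (printf 'pear\napple\n'; printf 'fig\nbanana\n') | sort
}

# Brace grouping is an alternative that behaves the same here; it needs
# spaces inside the braces and a semicolon before the closing brace.
combine_sorted_braces() {
  { printf 'pear\napple\n'; printf 'fig\nbanana\n'; } | sort
}

combine_sorted
```

Both functions print the four words as one sorted list, which shows that sort sees a single combined input rather than two separate ones.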

2010-05-30

Unintuitive probabilities

There are a few well-known examples where probabilities run contrary to intuition. One of them is the birthday paradox, with an example question such as "what is the probability that at least two students in a class of 50 share a birthday?" Since there are 365 possible birthdays (or 366 in leap years, though we ignore this for the moment), many people would think the chance of a collision is not high. However, the probability that all 50 birthdays are different is 365*364*...*316/(365^50), so the probability of a collision is 1 - 365*364*...*316/(365^50) ≈ 97%, which is actually very likely.
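
The birthday figure is easy to check numerically. The following is a rough sketch, assuming 365 equally likely birthdays as in the text; the function name birthday_collision is made up for this example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: probability that at least two of n people share a birthday.
birthday_collision() {
  # $1: class size. Multiply out the probability that all birthdays
  # differ, then take the complement.
  LC_ALL=C awk -v n="$1" 'BEGIN {
    p = 1
    for (i = 0; i < n; i++) p *= (365 - i) / 365
    printf "%.4f\n", 1 - p
  }'
}

birthday_collision 50   # prints 0.9704
birthday_collision 23   # prints 0.5073
```

The second call shows that a class of only 23 already has a better than even chance of a shared birthday.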

Another example is the Monty Hall paradox. There are three doors, A, B, and C, to choose from in this puzzle, and there's a prize behind one of them. The probability of choosing the door with the prize is easily seen to be 1/3. Suppose, however, that after you choose, say, door A, and before you check whether it contains the prize, the host (who knows where the prize is) opens door B to show that it does not contain the prize. At this point, you are free to stay with door A or switch to door C. Most people would think that it doesn't matter, with door A and door C each having a 1/2 chance of containing the prize. This is incorrect: the real probabilities of door A and door C containing the prize are 1/3 and 2/3, respectively.
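
One way to convince yourself is to enumerate the three equally likely prize positions. This is only a sketch under the usual assumptions (the player always picks door A, and the host always opens an empty door other than A); the function name monty_hall is made up for this example.

```shell
#!/usr/bin/env bash

# Hypothetical enumeration of the three equally likely prize positions.
monty_hall() {
  local stay_wins=0 switch_wins=0 prize
  for prize in A B C; do
    # The player picked A. The host opens an empty door other than A,
    # so the remaining closed door hides the prize whenever A does not.
    if [[ $prize == A ]]; then
      stay_wins=$((stay_wins + 1))
    else
      switch_wins=$((switch_wins + 1))
    fi
  done
  echo "stay wins $stay_wins/3, switch wins $switch_wins/3"
}

monty_hall   # prints: stay wins 1/3, switch wins 2/3
```

Staying wins only in the one case where the first pick was right; in the other two cases the host's reveal leaves the prize behind the door you can switch to.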

One other example is determining the probability of having contracted a virus given the result of an imperfect virus test. Assume the probability of contracting the virus is one in ten thousand (i.e., 0.01%) and the accuracy of the virus test is 99%. That is, of 100 people who really contracted the virus, 99 would test positive and one would test negative; likewise, of 100 people who did not contract the virus, 99 would test negative and one would test positive. What is the probability that a person has really contracted the virus if his/her test is positive? Intuition would suggest ~99%, but in reality it is about 1%.
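
The 1% figure follows from Bayes' rule: divide the true positives by all positives. Here is a small sketch of that arithmetic; the function name positive_test_posterior is made up for this example, and it assumes, as in the text, that the same 99% accuracy applies to both infected and healthy people.

```shell
#!/usr/bin/env bash

# Hypothetical helper: probability of infection given a positive test.
positive_test_posterior() {
  # $1: prior probability of infection
  # $2: test accuracy, used for both infected and healthy people
  LC_ALL=C awk -v prior="$1" -v acc="$2" 'BEGIN {
    true_pos  = prior * acc              # infected and tested positive
    false_pos = (1 - prior) * (1 - acc)  # healthy but tested positive
    printf "%.4f\n", true_pos / (true_pos + false_pos)
  }'
}

positive_test_posterior 0.0001 0.99   # prints 0.0098
```

In round numbers: out of a million people, 100 are infected and 99 of them test positive, while 9,999 of the 999,900 healthy people also test positive, so only 99 of the 10,098 positive results are real.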

2010-05-24

Common acronyms seen in internal mails

We see many acronyms in our internal mails, and some may not be obvious, especially to non-native speakers. Here are some of the more commonly seen ones:

AFAICT: As Far As I Can Tell
AFAIK: As Far As I Know
ASAP: As Soon As Possible
ETA: Estimated Time of Arrival (an estimate of when something is expected to be done, or of when someone or something will arrive, etc.)
FYI: For Your Information
FWIW: For What It's Worth
IIRC: If I Recall Correctly, or If I Remember Correctly
IMO: In My Opinion
IMHO: In My Humble Opinion (can also mean In My Honest Opinion)

Many more acronyms and jargon terms can be found here: http://www.computerhope.com/jargon/f/fwiw.htm

2010-04-12

Microsoft Word: add overline to a character

Underlining in Microsoft Word is trivial (CTRL+U), but adding an overline is hard. Most people resort to equations, which are slow to render and may differ from the surrounding text in font, spacing, and line spacing. Here's a text-based way of doing this:
  1. Press CTRL+F9 to insert field braces.
  2. Enter EQ \o(W,¯) and delete any extra spaces in the field. The overscore character can be entered by holding down ALT and typing 0175.
  3. Press SHIFT+F9 to show the result of the field code.
  4. If the overline is too wide or too narrow, press SHIFT+F9 to show the field code again and select the overscore character.
  5. Press CTRL+D to edit the font properties of the overscore character.
  6. Switch to the character spacing tab, adjust the font scale (e.g., 150% if we want to overline the character W), and click OK.
  7. Press SHIFT+F9 to show the result of the field code. Iterate if necessary.

2010-03-24

Vim: toggling between .h / -inl.h / .cc / .py / .js / _test.* / _unittest.*

You often need to toggle between x.h, x.cc, x_test.cc, x_unittest.cc, or x-inl.h.
Here's a script to do this:
"switch between .h / -inl.h / .cc / .py / .js / _test.* / _unittest.*
"with ,h ,i ,c ,p ,j ,t ,u
"(portion from old mail from David Reiss)
"<C-R>= inserts the result of the expression into the command line;
"the first <CR> ends the expression, the second runs the :e command
let pattern = '\(\(_\(unit\)\?test\)\?\.\(cc\|js\|py\)\|\(-inl\)\?\.h\)$'
nmap ,c :e <C-R>=substitute(expand("%"), pattern, ".cc", "")<CR><CR>
nmap ,h :e <C-R>=substitute(expand("%"), pattern, ".h", "")<CR><CR>
nmap ,i :e <C-R>=substitute(expand("%"), pattern, "-inl.h", "")<CR><CR>
nmap ,t :e <C-R>=substitute(expand("%"), pattern, "_test.", "") . substitute(expand("%:e"), "h", "cc", "")<CR><CR>
nmap ,u :e <C-R>=substitute(expand("%"), pattern, "_unittest.", "") . substitute(expand("%:e"), "h", "cc", "")<CR><CR>
nmap ,p :e <C-R>=substitute(expand("%"), pattern, ".py", "")<CR><CR>
nmap ,j :e <C-R>=substitute(expand("%"), pattern, ".js", "")<CR><CR>


,h: X.cc or X-inl.h or X_test.cc or X_unittest.cc goes to X.h
,c: X.h or X-inl.h or X_test.cc or X_unittest.cc goes to X.cc
,i: X.h or X.cc or X_test.cc or X_unittest.cc goes to X-inl.h
,p: X_test.py or X_unittest.py goes to X.py
,j: X_test.js or X_unittest.js goes to X.js
,t: X.h or X-inl.h or X.cc or X_unittest.cc goes to X_test.cc
,t: X.py/js or X_unittest.py/js goes to X_test.py/js
,u: X.h or X-inl.h or X.cc or X_test.cc goes to X_unittest.cc
,u: X.py/js or X_test.py/js goes to X_unittest.py/js
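
If you want to check what the pattern does without opening Vim, the same suffix rule can be tried from the shell. This is only a sketch: swap_suffix is a made-up helper, and the pattern has been translated by hand from Vim's regex syntax to the extended regex syntax of sed -E.

```shell
#!/usr/bin/env bash

# Hypothetical helper for trying out the suffix pattern outside Vim.
swap_suffix() {
  # $1: file name, $2: new suffix such as ".h" or "_test.cc"
  printf '%s\n' "$1" |
    sed -E "s/((_(unit)?test)?\.(cc|js|py)|(-inl)?\.h)\$/$2/"
}

swap_suffix foo_unittest.cc .h       # prints foo.h
swap_suffix foo-inl.h .cc            # prints foo.cc
swap_suffix foo.py _test.py          # prints foo_test.py
```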


Obviously, you can choose different key mappings for the toggle. =)