predefined character classes in grep 06. Jul 2008

Many regex implementations have “macros” for various character classes. In Perl, for example, \d matches any digit ([0-9]) and \w matches any “word character” ([a-zA-Z0-9_]). Grep uses a slightly different notation for the same thing: [:digit:] for digits and [:alnum:] for alphanumeric characters. (BSD)

Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means [0-9A-Za-z]. (grep man page)

What in Ruby reads

line = "length 1450"
puts line if line =~ /\d{4}/

therefore becomes, in bash with grep,

export line="length 1450"
echo "$line" | grep -E '[[:digit:]]{4}'

which bloats the regular expressions unnecessarily, but at least provides the same functionality.
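
As a side note, Ruby's regex engine also accepts the POSIX bracket notation, so both spellings can be compared directly. The following is a minimal sketch; the helper name first_four_digits is made up for this illustration.

```ruby
# Hypothetical helper: return the first run of four digits found by the
# given pattern, or nil if there is none.
def first_four_digits(line, pattern)
  line[pattern]
end

line = "length 1450"

# Perl-style shorthand and POSIX bracket expression side by side.
puts first_four_digits(line, /\d{4}/)            # prints 1450
puts first_four_digits(line, /[[:digit:]]{4}/)   # prints 1450
```

Note that the named class only works inside a bracket expression, hence the doubled brackets.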

 

beautiful network monitoring 27. Jun 2008

Monitoring networks with tcpdump works fine, but even in quiet mode tcpdump prints too much information if you’re interested in application-layer protocols like HTTP or IMAP. A nice command-line alternative is ngrep, whose output is much more readable. ngrep filters out all TCP packets with an empty data part and strips the headers of non-empty TCP packets. The only cosmetic flaw, in my eyes, is the dot ngrep inserts for every tab and every carriage return. In my opinion it should offer an option to specify the tab size and simply ignore carriage returns.

But see for yourself. Here is the tcpdump output of an IMAP session,

$ sudo tcpdump -i en1 -A -s 0 -qtn port imap
tcpdump: verbose output suppressed,
use -v or -vv for full protocol decode
listening on en1, link-type EN10MB (Ethernet),
capture size 65535 bytes
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..@..@.@.......P..N.|...7.n........Go.............
/!..........
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 0
E..<..@.9...P..N.......|.....7.o...............
w.A./!......
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4..@.@.......P..N.|...7.o...............
/!..w.A.
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 21
E..I..@.9...P..N.......|.....7.o...........
w.A./!..* OK Dovecot ready.

IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4.P@.@..j....P..N.|...7.o...............
/!..w.A.
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 33
E..U.8@.@..`....P..N.|...7.o...............
/!.2w.A.1 LOGIN foo@bar.net password

IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 0
E..4..@.9...P..N.......|.....7.......<.....
w.F'/!.2
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 17
E..E..@.9...P..N.......|.....7......c......
w.F./!.21 OK Logged in.

IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4.0@.@.......P..N.|...7.....?.....D.....
/!.>w.F.
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 14
E..B.#@.@.......P..N.|...7.....?...........
/!.cw.F.1 LIST """%"

IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 0
E..4..@.9...P..N.......|...?.7.............
w.H./!.c
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 513
E..5..@.9...P..N.......|...?.7.............
w.H./!.c* LIST (\HasChildren) "." "Trash"
* LIST (\HasNoChildren) "." "Sent"
* LIST (\HasNoChildren) "." "Spam"
* LIST (\HasNoChildren) "." "Sent Messages"
* LIST (\HasNoChildren) "." "Drafts"
* LIST (\HasNoChildren) "." "Spamtraining"
* LIST (\HasNoChildren) "." "Hamtraining"
* LIST (\HasNoChildren) "." "Spamtesting"
* LIST (\HasNoChildren) "." "Hamtesting"
* LIST (\HasNoChildren) "." "Deleted Messages"
* LIST (\HasNoChildren) "." "Spamverdacht"
* LIST (\HasNoChildren) "." "INBOX"
1 OK List completed.

IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4]'@.@.8.....P..N.|...7.....@...........
/!.cw.H.
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 10
E..>.L@.@..c....P..N.|...7.....@.....N.....
/!..w.H.1 LOGOUT

IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 19
E..G..@.9...P..N.......|...@.7.............
w.IP/!..* BYE Logging out

IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4.I@.@..p....P..N.|...7.....S.....%.....
/!..w.IP
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 24
E..L..@.9...P..N.......|...S.7.............
w.IP/!..1 OK Logout completed.

IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4C#@.@.R.....P..N.|...7.....l...........
/!..w.IP
IP 192.168.2.22.50556 > 80.237.145.78.143: tcp 0
E..4.;@.@.......P..N.|...7.....l...........
/!..w.IP
IP 80.237.145.78.143 > 192.168.2.22.50556: tcp 0
E..4..@.9...P..N.......|...l.7.......j.....
w.IQ/!..

20 packets captured
28 packets received by filter
0 packets dropped by kernel

compared with the corresponding ngrep output.

$ sudo ngrep -d en1 -W byline port imap
interface: en1 (192.168.2.0/255.255.255.0)
filter: (ip) and ( port imap )
####
T 80.237.145.78:143 -> 192.168.2.22:50556 [AP]
* OK Dovecot ready..

##
T 192.168.2.22:50556 -> 80.237.145.78:143 [AP]
1 LOGIN foo@bar.net password.

##
T 80.237.145.78:143 -> 192.168.2.22:50556 [AP]
1 OK Logged in..

##
T 192.168.2.22:50556 -> 80.237.145.78:143 [AP]
1 LIST """%".

##
T 80.237.145.78:143 -> 192.168.2.22:50556 [AP]
* LIST (\HasChildren) "." "Trash".
* LIST (\HasNoChildren) "." "Sent".
* LIST (\HasNoChildren) "." "Spam".
* LIST (\HasNoChildren) "." "Sent Messages".
* LIST (\HasNoChildren) "." "Drafts".
* LIST (\HasNoChildren) "." "Spamtraining".
* LIST (\HasNoChildren) "." "Hamtraining".
* LIST (\HasNoChildren) "." "Spamtesting".
* LIST (\HasNoChildren) "." "Hamtesting".
* LIST (\HasNoChildren) "." "Deleted Messages".
* LIST (\HasNoChildren) "." "Spamverdacht".
* LIST (\HasNoChildren) "." "INBOX".
1 OK List completed..

##
T 192.168.2.22:50556 -> 80.237.145.78:143 [AP]
1 LOGOUT.

#
T 80.237.145.78:143 -> 192.168.2.22:50556 [AP]
* BYE Logging out.

##
T 80.237.145.78:143 -> 192.168.2.22:50556 [AFP]
1 OK Logout completed..

28 received, 0 dropped
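
The rendering wished for above (a configurable tab size, carriage returns ignored, dots only for the remaining unprintable bytes) could look roughly like this in Ruby. This is only a sketch of the idea, not something ngrep offers; the method name render_payload and the default tab size of 4 are assumptions, and tabs are expanded to a fixed number of spaces rather than to real tab stops.

```ruby
# Hypothetical helper, not part of ngrep: prepare a payload for display.
def render_payload(payload, tab_size = 4)
  payload
    .delete("\r")                  # ignore carriage returns
    .gsub("\t", " " * tab_size)    # expand each tab to tab_size spaces
    .gsub(/[^[:print:]\n]/, ".")   # other unprintable characters become dots
end

puts render_payload("* OK Dovecot ready.\r\n")   # prints * OK Dovecot ready.
```

With this rendering the trailing dot that ngrep prints for the carriage return at the end of each IMAP line would disappear.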
 

force unmount on mac os x 27. Jun 2008

When using Mac OS X you might have encountered this message.

The disk "foobar" is in use and could not be ejected.
Try quitting applications and try again.

This message may appear even if you have closed all programs. The culprit may be Leopard’s FSEvents, which sometimes holds a lock on a volume. To force the unmount of a volume, open a Terminal and use umount.

sudo umount -f /Volumes/FooBar
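
If the command is to be issued from a script, passing the arguments as a list avoids shell quoting problems with volume names that contain spaces. A minimal Ruby sketch, assuming the volume is mounted under /Volumes; the helper name force_unmount_command is made up for this example.

```ruby
# Hypothetical helper: build the argument list for a forced unmount.
def force_unmount_command(volume_name)
  ["sudo", "umount", "-f", File.join("/Volumes", volume_name)]
end

p force_unmount_command("FooBar")
# To actually run it (this will ask for the sudo password):
# system(*force_unmount_command("FooBar"))
```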
 

tcpdump: packets dropped by kernel 25. Jun 2008

tcpdump is a really nice tool, but its default settings can render it useless.

$ sudo tcpdump -i en1

In my case this led to

80 packets captured
7705 packets received by filter
6794 packets dropped by kernel

To solve the issue I had to turn off the conversion of addresses to names with the -n option.

$ sudo tcpdump -i en1 -n

Now the results look as expected.

6941 packets captured
7011 packets received by filter
0 packets dropped by kernel
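
To compare runs, the three summary lines can be turned into a drop percentage. The following Ruby sketch assumes the summary has the format shown above and measures drops relative to the packets received by the filter; both method names are made up for this example, and the exact meaning of the counters depends on the platform.

```ruby
# Hypothetical helper: extract the counters from tcpdump's closing summary.
def tcpdump_summary(text)
  {
    captured: text[/(\d+) packets captured/, 1].to_i,
    received: text[/(\d+) packets received by filter/, 1].to_i,
    dropped:  text[/(\d+) packets dropped by kernel/, 1].to_i
  }
end

# Share of received packets that the kernel dropped, in percent.
def drop_percentage(summary)
  return 0.0 if summary[:received].zero?
  100.0 * summary[:dropped] / summary[:received]
end

first_run = <<~SUMMARY
  80 packets captured
  7705 packets received by filter
  6794 packets dropped by kernel
SUMMARY

printf("%.1f%% dropped\n", drop_percentage(tcpdump_summary(first_run)))
```

For the first run this comes out at roughly 88 percent, for the run with -n at zero.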
 

slow iterators in ruby 1.9 on mac 17. Jun 2008

If you ever wondered why your Ruby programs are not faster but much slower with Ruby 1.9, you might be using each or times iterators on large collections.

10000000.times {}

In fact if you do a benchmark,

require 'benchmark'

n = 10000000

Benchmark.bm do |x|
  x.report { n.times {} }
end

you get stunning results.

$ ruby bench.rb
   user     system      total        real
   0.720000   0.000000   0.720000 (  0.721872)
$ ruby1.9 bench.rb 
   user     system      total        real
   16.670000   9.110000  25.780000 ( 25.923977)

When increasing the number of iterations by powers of ten, it looks like the computational complexity of the times iterator is O(n) in Ruby 1.9. According to Antonio Cangiano, this behaviour is specific to Mac OS X and does not appear on Linux.
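
The scaling experiment can be repeated with a few lines of Ruby. This is a sketch rather than the original measurement; the method name time_empty_loops and the chosen sizes are assumptions, and the timings will differ from machine to machine.

```ruby
require 'benchmark'

# Hypothetical helper: time an empty `times` loop for each size and
# return [n, seconds] pairs.
def time_empty_loops(sizes)
  sizes.map { |n| [n, Benchmark.realtime { n.times {} }] }
end

time_empty_loops([10_000, 100_000, 1_000_000]).each do |n, seconds|
  # With linear cost, the time per iteration stays roughly constant.
  printf("%10d iterations: %.6f s (%.3e s per iteration)\n", n, seconds, seconds / n)
end
```

Running the script under both interpreters and comparing the per-iteration column shows how each one scales.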

 
