You are probably compiling without optimization. Try compiling with -d:release (for example, nim c -d:release yourfile.nim); with that flag it is faster for me. This is separate from any other performance improvements you can make to the code itself.
Note that even the Python version spends most of its time executing C routines (the implementations of in and readlines), and there's a fair I/O component, so the speedup you can obtain is somewhat limited.
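To make that concrete, here is a minimal sketch of the kind of loop I am assuming the Python version runs. The original script is not reproduced here, so count_matches, the sample file and the search string are all made up for illustration. The point is only that the two expensive steps are calls into CPython's C code, with very little interpreted work in between.

```python
import os
import tempfile


def count_matches(path, needle):
    """Count the lines of the file at `path` that contain `needle`."""
    with open(path) as f:
        # Reading and line splitting happen inside CPython's C-implemented io module.
        lines = f.readlines()
    # The substring search behind `in` is also C code; only the loop
    # bookkeeping around it is interpreted bytecode.
    return sum(1 for line in lines if needle in line)


# Hypothetical throwaway input so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha beta\ngamma delta\nbeta epsilon\n")
    sample_path = tmp.name

matches = count_matches(sample_path, "beta")
os.remove(sample_path)
print(matches)
```

A compiled program still has to read the same bytes and run a comparable substring search, so the gain comes mainly from the thin interpreted layer around those calls.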