Mandatory knowledge for programming interviews

(This is a “kids of today” complaint, so skip it unless you’re willing to hear complaints about job interviews.)

Public service announcement to job applicants: you must know big-O notation, and you must be able to state it for the code you write, for both running time and memory usage. I don’t care if you have a computer science degree or not, but you cannot write good implementations of algorithms if you don’t have a clue about it. Your “8 years of experience” on your resume is garbage if you can’t tell me why an O(N) array search is slower than an O(log N) search in almost every case. Also, you need to know how much memory you allocated, even if you don’t have to explicitly free it because the garbage collector cleans up your mess.
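Since I’m complaining, here is roughly the answer I’m hoping to hear, as a minimal Python sketch. The function names and the one-million-element list are my own illustrative choices, and the step counter only counts loop iterations; it is not a benchmark.

```python
def linear_search(items, target):
    """Scan left to right. O(N) time, O(1) extra memory.

    Returns (index, steps), or (-1, steps) if the target is absent.
    """
    steps = 0
    for index, value in enumerate(items):
        steps += 1
        if value == target:
            return index, steps
    return -1, steps


def binary_search(sorted_items, target):
    """Halve the search range on every step. O(log N) time, O(1) extra memory.

    Only valid on sorted input. Returns (index, steps), or (-1, steps).
    """
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        steps += 1
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps


if __name__ == "__main__":
    data = list(range(1_000_000))   # already sorted
    target = 999_999                # last element: worst case for the linear scan
    print("linear:", linear_search(data, target))
    print("binary:", binary_search(data, target))
```

On a million sorted elements the linear scan can take a million steps, while the binary search never needs more than 20. The “almost every case” caveat covers tiny or unsorted arrays, where the plain scan can still win.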

No, the programming language doesn’t matter. No, your favorite library does not matter. No, there is no magic speedup to be found in Linq, inlined functions, lambda functions, or whatever other garbage you googled. Kids of today! Geez, get offa my lawn.

And while I’m at it, you must know the fundamental rule of multi-threaded programming in Windows: there is only one UI thread, it owns all UI objects, and you cannot access any UI objects from any other thread. Not ever! Your program will crash and burn with some cryptic errors the next time your callback in the thread pool tries to change the text displayed on screen. If you don’t know how to transition from background threads to the UI thread, that is your fault. No, you can’t get away with it because there was no compiler warning message. No, you can’t catch every multi-threading exception and silently discard them. No, the operating system will not help you to fix this error. No, your favorite programming language does not absolve you of this responsibility, even if it does have a billion ways to manage threads (C#, I’m looking at you).
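The usual fix is to hand the update over to the UI thread instead of touching the control yourself. On Windows the real mechanisms are things like Control.Invoke in WinForms, Dispatcher.Invoke in WPF, or PostMessage in plain Win32. The sketch below is not any of those; it is a language-neutral illustration in Python, where FakeLabel and run_demo are hypothetical names I made up and a queue stands in for the message loop.

```python
import queue
import threading


class FakeLabel:
    """Hypothetical stand-in for a UI control that remembers its owning thread."""

    def __init__(self):
        self.text = ""
        self.owner_thread = threading.get_ident()

    def set_text(self, text):
        # Mimic the rule: only the thread that owns the control may touch it.
        if threading.get_ident() != self.owner_thread:
            raise RuntimeError("UI object accessed from a non-UI thread")
        self.text = text


def run_demo():
    ui_queue = queue.Queue()   # updates waiting to run on the UI thread
    label = FakeLabel()        # created on, and owned by, the calling thread

    def background_work():
        result = sum(range(1000))   # pretend this is slow
        # Do NOT call label.set_text() here. Queue the update instead.
        ui_queue.put(lambda: label.set_text(f"result = {result}"))
        ui_queue.put(None)          # sentinel: no more updates

    worker = threading.Thread(target=background_work)
    worker.start()

    # Simplified "message loop": the UI thread runs the queued updates itself.
    while True:
        update = ui_queue.get()
        if update is None:
            break
        update()

    worker.join()
    return label.text


if __name__ == "__main__":
    print(run_demo())
```

The background thread never touches the label; it only describes the change, and the owning thread applies it. That is the whole trick, whatever your framework calls it.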

Does nobody learn these concepts anymore in college? There are so many excellent textbooks available, and it is your responsibility as a professional programmer to read at least some of them. Reading “Sams Teach Yourself C# in 24 Hours” and writing garbage code for a few years does not make you a professional.

Please read some or all of these before applying for a job and sending me an 8-page resume with glowing but vacuous recommendations:

Foundations of Computer Science, which is the Stanford textbook used in programming and CS classes. Specifically, Ch 3. The Running Time of Programs.

Introduction to Algorithms, the MIT standard textbook. Please dear lord tell me you know what a hash table is and what a hash function does for it. (Or map, dictionary, associative array, or whatever other name your favorite language is now using; the name doesn’t matter as long as you know when to use it.)

Cracking the Coding Interview: 150 Programming Questions. If you can’t write code without an IDE (i.e., by hand on a whiteboard or on paper), you’re not going to pass an interview.

The Art of Computer Programming by Don Knuth. Any volume. Really, anything written by Knuth. If you don’t know what a regex or Turing Machine is, you’ve got a lot of catching up to do. No, seeing “The Imitation Game” does not count.

Operating Systems: Design and Implementation by Tanenbaum et al. You must know what a mutex is and why you’d need to use it. No, you cannot google it during the interview. No, you cannot avoid learning about how the OS schedules and manages threads inside your process, even if you never explicitly control it.
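And since you can’t google it during the interview, here is the short version of the mutex answer as a minimal Python sketch. The function name and the thread counts are my own illustrative choices. The read-modify-write on the shared counter is the critical section, and the lock makes sure only one thread is inside it at a time.

```python
import threading


def count_with_lock(num_threads=4, increments_per_thread=10_000):
    """Increment one shared counter from several threads, guarded by a mutex."""
    counter = 0
    lock = threading.Lock()   # the mutex

    def worker():
        nonlocal counter
        for _ in range(increments_per_thread):
            # "counter += 1" is a read, an add, and a write. Without the lock,
            # two threads can read the same old value and one update is lost.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place the total is always num_threads times increments_per_thread. Take the lock out and the result depends on how the scheduler interleaves the threads, which is exactly why you need to know how scheduling works.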

Kids of today! Geez. Now please excuse me, I have to mow my lawn.

Wonderware Training

I took the Wonderware System Platform 1 (Wonderware Application Server) training course, which I thought was excellent. I had learned some Wonderware scripting at my current job, but that turned out to be a skewed perspective on what Wonderware is actually intended to do. We use just the Manufacturing Execution Systems (MES) aspect of Wonderware, which is a shame. It’s a deep and powerful product with many great SCADA features (data acquisition, alarming, HMI).

Unfortunately, the price of additional licensing is probably going to prevent me from using it for its intended role of creating HMI displays on a production floor. I can already imagine my boss comparing the price of those licenses against the price of continuing to use MFC/C++ (effectively free, if you ignore time and opportunity costs). There are also other risks and expenses to changing the architecture of an existing, productive plant — downtime, retraining, computer upgrades, etc. Still, it’s fun to imagine redoing everything the “right” way from the start.

One project I could still do: we really should be using OPC to communicate with devices on our plant floor. I’ll see if we can move away from our existing ad hoc communication methods. OPC is way better than what we’re doing right now. Maybe I can make a business case for improved productivity or reduced integration costs. OPC Classic uses the Component Object Model (COM), an old and somewhat frustrating core component of the Win32 API that has given me lots of headaches over the years, but the newer OPC UA drops the COM dependency in favor of web-service-style communication, which is much easier to understand (albeit more verbose). Too bad the OPC Foundation wants thousands of dollars per year for membership; maybe I’ll use OmniServer instead.

Living in the past with Vim, for fun and profit

A recent article discussed on HN (“Emacs and Vim”) caused me to pause and reflect on my choice of editors. I use the GUI version of Vim for most of my serious editing, but I will also use Eclipse for editing Java, HTML, and JSPs and MS Visual Studio for C++/C#. Why so many different editors?

Each one is good at different things, which is why I use them all. But I keep coming back to Vim when I have to do something harder. For example, regular expression support in Vim is way better than in the other editors. Let’s say I need to extract the servlet names in a Tomcat access_log file like this:

127.0.0.1 - - [18/Oct/2014:15:29:53 -0400] "GET /examples/servlets HTTP/1.1" 302 -
127.0.0.1 - - [18/Oct/2014:15:29:53 -0400] "GET /examples/servlets/ HTTP/1.1" 200 5343
127.0.0.1 - - [18/Oct/2014:15:29:53 -0400] "GET /examples/servlets/images/code.gif HTTP/1.1" 200 292
127.0.0.1 - - [18/Oct/2014:15:29:53 -0400] "GET /examples/servlets/images/return.gif HTTP/1.1" 200 1231
127.0.0.1 - - [18/Oct/2014:15:29:53 -0400] "GET /examples/servlets/images/execute.gif HTTP/1.1" 200 1242
127.0.0.1 - - [18/Oct/2014:15:30:08 -0400] "GET /examples/servlets/servlet/HelloWorldExample HTTP/1.1" 200 359
127.0.0.1 - - [18/Oct/2014:15:30:11 -0400] "GET /examples/servlets/helloworld.html HTTP/1.1" 200 2612
127.0.0.1 - - [18/Oct/2014:15:30:25 -0400] "GET /examples/servlets/servlet/RequestInfoExample HTTP/1.1" 200 693
127.0.0.1 - - [18/Oct/2014:15:30:27 -0400] "GET /examples/servlets/reqinfo.html HTTP/1.1" 200 3674
127.0.0.1 - - [18/Oct/2014:15:30:51 -0400] "GET /examples/servlets/reqheaders.html HTTP/1.1" 200 2304
127.0.0.1 - - [18/Oct/2014:15:30:56 -0400] "GET /examples/servlets/servlet/RequestHeaderExample HTTP/1.1" 200 1067
127.0.0.1 - - [18/Oct/2014:15:31:04 -0400] "GET /examples/servlets/reqparams.html HTTP/1.1" 200 4650
127.0.0.1 - - [18/Oct/2014:15:31:35 -0400] "GET /examples/servlets/servlet/RequestParamExample HTTP/1.1" 200 657
127.0.0.1 - - [18/Oct/2014:15:31:56 -0400] "GET /examples/servlets/cookies.html HTTP/1.1" 200 2741
127.0.0.1 - - [18/Oct/2014:15:31:59 -0400] "GET /examples/servlets/servlet/CookieExample HTTP/1.1" 200 637
127.0.0.1 - - [18/Oct/2014:15:32:02 -0400] "GET /examples/servlets/sessions.html HTTP/1.1" 200 3267

I don’t care about most of the content, only the timestamp and the path after /servlets. The following substitution, applied to every line with the % range, solves the problem: :%s/^.*\[\(.*\)\].*servlets\(.*\) H.*$/\1 \2/

I know that looks like gibberish to someone who hasn’t used regexes much, but here is what it produces:

18/Oct/2014:15:29:53 -0400 
18/Oct/2014:15:29:53 -0400 /
18/Oct/2014:15:29:53 -0400 /images/code.gif
18/Oct/2014:15:29:53 -0400 /images/return.gif
18/Oct/2014:15:29:53 -0400 /images/execute.gif
18/Oct/2014:15:30:08 -0400 /servlet/HelloWorldExample
18/Oct/2014:15:30:11 -0400 /helloworld.html
18/Oct/2014:15:30:25 -0400 /servlet/RequestInfoExample
18/Oct/2014:15:30:27 -0400 /reqinfo.html
18/Oct/2014:15:30:51 -0400 /reqheaders.html
18/Oct/2014:15:30:56 -0400 /servlet/RequestHeaderExample
18/Oct/2014:15:31:04 -0400 /reqparams.html
18/Oct/2014:15:31:35 -0400 /servlet/RequestParamExample
18/Oct/2014:15:31:56 -0400 /cookies.html
18/Oct/2014:15:31:59 -0400 /servlet/CookieExample
18/Oct/2014:15:32:02 -0400 /sessions.html

That’s exactly what I want! It only took one command in Vim. I’m sure you can get your favorite editor to do it too, but Vim has always worked well for me.
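For anyone who would rather script it than open an editor, here is a rough Python equivalent of that substitution. It is a sketch of the same pattern using Python’s re syntax, where groups are written ( ) and a literal bracket is written \[. The names LOG_LINE and extract are my own.

```python
import re

# Same idea as the Vim pattern: capture the text inside [...] and whatever
# follows "servlets" up to the " H" of " HTTP/1.1".
LOG_LINE = re.compile(r'^.*\[(.*)\].*servlets(.*) H.*$')


def extract(line):
    """Return 'timestamp path' for one access_log line."""
    return LOG_LINE.sub(r'\1 \2', line)


if __name__ == "__main__":
    sample = ('127.0.0.1 - - [18/Oct/2014:15:30:08 -0400] '
              '"GET /examples/servlets/servlet/HelloWorldExample HTTP/1.1" 200 359')
    print(extract(sample))
```

A line that doesn’t match the pattern comes back unchanged, which is also what the Vim substitution does to non-matching lines.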

Another example: Let’s say I need to generate some repetitious code because SQL Server’s T-SQL language doesn’t really support arrays, and someone else generated column names with numbers at the end.

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAME01, ADDR01, ZIP01
        FROM dbo.OtherBadTable
        WHERE ID='01'

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAME02, ADDR02, ZIP02
        FROM dbo.OtherBadTable
        WHERE ID='02'

The task at hand is to copy data from OtherBadTable to BadTable (for IDs 01 to 09). Yes, I know this table has a terrible design, but it wasn’t my choice, and I cannot change it. I could use dynamic SQL, but the column names are fixed, and there are only 9 of them.

Here’s a Vim solution. Start with the template below:

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAMEXX, ADDRXX, ZIPXX
        FROM dbo.OtherBadTable
        WHERE ID='YY'

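Next, make 8 more copies, for 9 in total (one per ID). Yank everything with ggVGy (go to the top, select all lines, yank) and paste 8 times with 8P. Keep a blank line at the end of the template so the copies stay separated: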

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAMEXX, ADDRXX, ZIPXX
        FROM dbo.OtherBadTable
        WHERE ID='YY'

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAMEXX, ADDRXX, ZIPXX
        FROM dbo.OtherBadTable
        WHERE ID='YY'

(etc.)

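Finally, number the copies with two commands: :let i=1 | g/XX/s//\='0'.i/g | let i=i+1 and then :let i=1 | g/YY/s//\='0'.i/g | let i=i+1. Each one visits every line that contains the placeholder, replaces it with '0' followed by the counter, and then increments the counter. The final output is: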

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAME01, ADDR01, ZIP01
        FROM dbo.OtherBadTable
        WHERE ID='01'

INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAME02, ADDR02, ZIP02
        FROM dbo.OtherBadTable
        WHERE ID='02'

(etc.)

There! One statement per ID, generated with only 4 Vim commands. Obviously, it’s easy to extend this method to as many column names as you need. My point is just that this is a thankless, boring typing task, which can be automated easily in Vim.
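If you don’t have Vim handy, the same boring task can be scripted. Here is a minimal Python sketch that fills in the template for IDs 01 to 09. The names TEMPLATE and generate_inserts are my own, and a single {n} placeholder stands in for both XX and YY.

```python
# One INSERT statement; {n} is filled with the zero-padded ID.
TEMPLATE = """INSERT INTO dbo.BadTable (NAME, ADDR, ZIP)
        SELECT NAME{n}, ADDR{n}, ZIP{n}
        FROM dbo.OtherBadTable
        WHERE ID='{n}'
"""


def generate_inserts(first=1, last=9):
    """Return the INSERT statements for IDs first..last, separated by blank lines."""
    return "\n".join(
        TEMPLATE.format(n=f"{i:02d}") for i in range(first, last + 1)
    )


if __name__ == "__main__":
    print(generate_inserts())
```

The zero-padding with :02d does the job of the '0'.i expression in the Vim commands, and unlike that expression it keeps working past ID 09.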

I saved the best for my next post, but here’s an introduction. Macros in Vim are very powerful. You can record a macro from almost any set of Vim commands and then replay the macro to make a huge number of changes to a file painlessly.

Download Vim today and give it a try!