All posts by djohn89

Extending Wonderware with RabbitMQ

Wonderware makes excellent HMIs (computer displays in an industrial environment), but what if you have a large legacy C# application that you want to use with Wonderware without paying for more licenses? I ran into this situation recently, and I came up with the following solution: create an “adapter” program that provides RabbitMQ functionality inside the Wonderware galaxy.

Specifically, load the MX Access Toolkit DLL and the RabbitMQ.NET client, and translate messages in between them. So if you get a RabbitMQ message that says “change Wonderware attribute X to a new value Y”, then the adapter does the equivalent operation in the Galaxy using the MX Access Toolkit. If a Wonderware attribute changes value (using MX Access subscribe()), then post a RabbitMQ message indicating the updated value to non-Wonderware clients who read messages from RabbitMQ. I used JSON for the message format because it’s simple, universally supported, and not too big.
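The translation step can be sketched in a few lines. The real adapter is a C# program built on the MX Access Toolkit and the RabbitMQ.NET client; the Python below only illustrates the JSON message shape and the dispatch idea. The field names `attribute` and `value` and the `write_attribute` callback are hypothetical stand-ins, not the actual schema or the MX Access API.

```python
import json


def encode_update(attribute, value):
    """Serialize an attribute change as a JSON message body (hypothetical schema)."""
    return json.dumps({"attribute": attribute, "value": value})


def handle_message(body, write_attribute):
    """Decode a 'set attribute X to value Y' message and apply it.

    write_attribute stands in for whatever call performs the write in the
    Galaxy; here it is any callable taking (attribute, value).
    """
    message = json.loads(body)
    write_attribute(message["attribute"], message["value"])
    return message


# A plain dict plays the role of the Galaxy for demonstration purposes.
fake_galaxy = {}
handle_message(encode_update("Tank1.Level", 42.5), fake_galaxy.__setitem__)
```

The same `encode_update` helper would serve the other direction: when a subscribed attribute changes, the adapter publishes the encoded body for non-Wonderware clients to consume.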

This approach provides bi-directional communication between everything in Wonderware and everything outside Wonderware, using RabbitMQ as the universal platform for exchanging messages. RabbitMQ is free (but commercial support is available from Pivotal), fast, and very reliable. RabbitMQ can also reach web applications using WebSTOMP (i.e., JavaScript) and non-C# applications using the other client libraries (Java, C++, Python, etc.).

Limitations of this approach: Wonderware is expensive, so if you just want a PLC communication library, you should use Kepware. Obviously, you’re not going to get tech support for extending Wonderware in this way, and since you’re avoiding a lot of license fees per computer (Platform, ArchestrA license), you won’t get any help from the sales department either. Also, Wonderware now includes its own webserver, so you could just use that for your web applications.

For more details, see the source code at


Presentation – Using Big Data from Automotive Assembly Errors

Here is a link to the slides (Google slides)

Presented at FoxCon Toledo Software Developers Conference, Toledo, OH, on 1/27/2018.


Electronic torque tools used in automobile production produce a large amount of data as workers and robots physically assemble a vehicle. This data includes alarms and errors for various potential defects such as rehit bolts, cross-threaded bolts, incorrectly removed bolts, failure to achieve torque within engineering specifications, etc. In his presentation David will reveal how an analysis of this data can yield new insights into the causes of operator errors and solutions in new quality auditing systems, both on and off the production line.


FoxCon Toledo 2018 Abstracts

FoxCon Toledo 2018 Conference Presentation Abstracts

Park Inn Hotel, Toledo, Ohio. January 27th & 28th, 2018

Bob Pierce               Allentown

Living in a Legacy World

Last year Bob showed a case study of automating Smooth-On Corp., a large chemical manufacturing plant in Allentown, using Visual FoxPro. This year he will delve into the issues and opportunities of the process of migrating a large legacy application while still having to maintain and actively develop it to meet current demand.  He will be discussing how the business model changes over time to transition from entrepreneurial to corporate and the role of senior (Legacy) management in transitioning the business to the next generation to ensure future growth and viability.  This session is designed to be interactive, so your input is welcome.

Dave Bernard           Atlanta

Lessons Learned from 40 Years of Software Development

Well, 39 years, to be exact. Since the late 70’s, Dave has worked on many software development projects (over 75 since 2000 alone), including mainframe, minicomputer, microcomputer, and mobile platforms, for many different industries, and in multiple countries. Dave will share a number of technical and non-technical lessons he’s learned over that time, some that are obvious, some that are not-so-obvious, and some that are counterintuitive and even controversial. Dave expects this presentation to have a high interaction-to-material ratio.

David Johnson          Toledo

Big Data from Automotive Assembly Errors

Electronic torque tools used in automobile production produce a large amount of data as workers and robots physically assemble a vehicle.  This data includes alarms and errors for various potential defects such as rehit bolts, cross-threaded bolts, incorrectly removed bolts, failure to achieve torque within engineering specifications, etc.  In his presentation David will reveal how an analysis of this data can yield new insights into the causes of operator errors and solutions in new quality auditing systems, both on and off the production line.

David Johnson is a Supervisory Control and Data Acquisition (SCADA) programmer at the Toledo South Assembly Plant, which has built the Jeep Wrangler from 2006 until April. After then, the plant will retool to produce the new Jeep Scrambler pickup truck, available to consumers in mid to late 2019. The new 2018 Jeep Wrangler is now available from the Toledo North Assembly Plant.

Doug Hennig          Winnipeg

Practical Uses of wwDotNetBridge to Extend Your VFP Applications

wwDotNetBridge lets you call just about any .NET code directly from Visual FoxPro and helps overcome most of the limitations of regular .NET COM interop. This library by Rick Strahl allows you to provide .NET functionality to your Visual FoxPro applications that wouldn’t otherwise be available. In this session, you’ll see many practical examples that show how you can add new capabilities to your applications that would be difficult or impossible to achieve natively in Visual FoxPro.

Creating Beautiful Web Sites Easily Using Bootstrap

Laying out a web page using HTML and CSS can be challenging. Do you use the older table mechanism or CSS floats to place objects side-by-side? How do you deal with differences in browsers? And what about handling different devices: phones, tablets, laptops, and desktops?

Bootstrap is a free, open source framework for developing responsive, mobile-first web sites. It solves many problems web developers typically face and makes it easy to create beautiful web sites in record time, even for inexperienced developers.

This session shows how to get started with Bootstrap, examines using its grid system to easily lay out your page elements, and discusses how Bootstrap components add attractive and functional elements to your web site. We’ll do a “makeover” of a real web site to show how easy it is to make it more attractive, functional, and mobile-friendly.

Kevin Cully               Atlanta

Accessing and manipulating application data using Xojo

Xojo is a great tool for programmers who are used to developing in Visual FoxPro and Microsoft .NET. It allows application development not only for Windows, but also for Mac, Linux, Web, and iOS. This presentation will focus on the various techniques Xojo uses to access and update data: SQL Pass Through, SQL Prepared Statements, and Object Relational Mapping.

For the last two years Kevin has been a Senior Business Analyst and a DBA for the Cherokee County Government, deploying Xojo-based solutions in a variety of county operational units. Kevin has two boys in college and a small farm where he lives with his wife and two dogs and where they grow blueberries.

Mike Levy              Cincinnati

Given I have to test, when I do, then it should be cool

The goal of this quick talk is to build awareness around a set of testing tools that are commonly used in the Behavior-Driven Development (BDD) space for codifying acceptance criteria, but that are also useful for the entire testing triangle.

Ondrej Balas          Detroit

Open Source Game Development in the .NET Ecosystem

With so many frameworks to choose from, aspiring game developers are often overwhelmed with options. In this session we’ll explore the decisions that go into choosing the right framework for your project. Next we’ll look at one in particular: Duality. Duality is a flexible and open source framework for developing 2D games with .NET. I’ll show you the fundamental patterns and principles behind game development and walk you through creating a simple game in Duality.

Ondrej Balas is the owner of UseTech Design, a Michigan-based development company that focuses primarily on .NET and other Microsoft technologies. Ondrej is also a Microsoft MVP in Visual Studio and Development Technologies, a writer for Visual Studio Magazine, and is very active in the Michigan software development community. Ondrej works across many industries including finance, healthcare, manufacturing, and logistics. Areas of expertise include similarity and matching across large data sets, algorithm design, distributed architecture, and software development practices.

Sam Nasr             Cleveland

Data Time Travel with Temporal Tables

SQL Server 2016 introduced Temporal Tables, allowing a developer to retrieve data from a specific point in time, without backups.  With a few TSQL commands a historical table can be created, automatically updated, and readily accessed.

Sam Nasr has been a software developer since 1995, focusing mostly on Microsoft technologies. Having achieved multiple certifications from Microsoft (MCSA, MCAD, MCTS, and MCT), Sam develops, teaches, and tours the country to present various topics in .Net. He’s involved with the Cleveland C#/VB.Net User Group, where he has been the group leader since 2003. In addition, he’s the leader of the .Net Study Group, an author for Visual Studio Magazine, and a Microsoft MVP since 2013.  When not coding, Sam loves spending time with his family and friends or volunteering at his local church.

FoxCon Toledo 2018 Schedule

FoxCon Toledo 2018

Conference Presentation Schedule

Park Inn Hotel, Toledo, Ohio

January 27th & 28th, 2018

Saturday January 27

Time    Speaker         Topic

08:00-09:00 Bob Ruple       Opening Comments

09:00-10:15 Bob Pierce     Living in a Legacy World

10:15-10:30                Break

10:30-11:45 Doug Hennig     Practical Uses of wwDotNetBridge to Extend Your VFP Applications

11:45-12:45 pm              Lunch at Park Inn Hotel

12:45-02:00 Sam Nasr        Data Time Travel with Temporal Tables

02:00-02:15                 Break

02:15-03:15 David Johnson   Using Big Data from Automotive Assembly Errors

03:15-03:30                Break

03:30-04:45 Dave Bernard    Lessons Learned from 40 Years of Software Development

06:30 Cocktails

07:30 Dinner

Sunday January 28

Time      Speaker        Topic

08:30-09:00 Bob Ruple       Opening Comments

09:00-10:15 Ondrej Balas   Open Source Game Development in the .NET Ecosystem

10:15-10:30                Break

10:30-11:45 Kevin Cully        Accessing and manipulating application data using Xojo

11:45-12:45 pm              Lunch at Park Inn Hotel

12:45-01:15 Mike Levy       Given I have to test, when I do, then it should be cool

01:15-01:30                Break

01:30-02:45 Doug Hennig       Creating Beautiful Web Sites Easily Using Bootstrap

02:45-03:00                Closing Comments

3:00 pm                                       Conference Dismisses

CodeMash 2018 reviews

I’m writing my thoughts here about the sessions I attended at CodeMash 2018. Overall, I had a great time and learned some new and important things about the latest technologies. This year I tried to attend more data science / machine learning talks, though several of them were cancelled at the last minute. (Perhaps the bad weather was to blame.)

I wrote these notes during or immediately after each talk, and I submitted some of the text to the session survey part of the Attendee Hub app on my phone. I think it’s a good idea to send the feedback to the speakers, but I don’t think it is used by reviewers for next year, so I’m not sure it matters very much. Regardless, I write to consolidate my own learning, if for no other reason.

Fast Neural Networks – a no brainer

Speakers: Riccardo Terrell

The speaker used the agent model because ANNs are embarrassingly parallel. Map agents to nodes – 1:1, send updates between them (values forward, backpropagation corrections backward). Make agents reactive to messages (updates from dependent nodes). The speaker has an e-book on MEAP about it – Parallel NN as map reduce problem – simpler way to implement NN.

The speaker tried to cover too much content. Also, I didn’t hear any good outcomes or reasons for why he reimplemented backpropagation (a very old algorithm). What was learned by redoing it with actors?

Getting Started with Deep Learning

Speakers: Seth Juarez

confusion matrix- be careful of TN vs FP box location. Works for MS; uses VS to edit Python; tensorflow implementation in python for MNIST. Check channel 9 msdn for newer presentation

cool twilio use – text the presenter to ask questions during presentation

This speaker was great because he told lots of jokes to keep the talk interesting despite the underlying math. I really liked the twilio based questions via text messages!

Imposter Syndrome: Overcoming Self-Doubt in Success

Speakers: Heather Downing

The speaker did a very good job motivating the subject, but I would’ve liked to hear more about practical things that can be done to deal with imposter syndrome. I agree with “co-bragging”, but I think other good options include public speaking to build confidence or doing volunteer work to appreciate how it’s a 1st world problem.

cycle of failure: overly confident – procrastinate and rely on luck. under confident – put in excessive time and effort. Discount success and undermine ego.

How can you tell if someone will eventually succeed after a failure? You are not the failure; own your mistakes and learn from them. Don’t be scared of failure; be scared of not finding the truth.

“Co-bragging” – praise your co-workers’ achievements, and they praise you. Create a positive culture that avoids hurtful comparisons.

Fake it until you become it, then pay it forward.

Machine Learning at Scale, How to Keep Learning as Your Data Keeps Increasing

Speakers: Matt Winkler

This is exactly why I come to CodeMash – I want to hear about the latest tech so that I can keep up with the industry standards. I thought the speaker did an excellent job of reviewing the latest ML implementations and describing how to deploy them at scale. I loved all the detailed examples!

data prep: spark, pandas, dplyr
scale up: spark cluster, HDInsight
aggregation: AML workbench

Azure Machine Learning workbench can automatically learn how to create columns by example (formatting and aggregating a date time column). (automatically generates python code!)

Nvidia’s latest GPU was announced at NIPS conference (academic AI researchers).

VM recommendations: use version control so that you can migrate easily. make scripts for any required setup. track your outcomes from experimenting with different models. benchmark the price effectiveness of different configurations.

home camera – Amazon deep lens – recognizes faces

MSSQL 2017 has integrated ML algorithms INSIDE it! see tutorials?!

Walking the High Wire: Patching Erlang Live

Speakers: John Daily

The speaker did a great job of motivating the use of Erlang for live patches, which is exactly what I wanted to hear about. But I would’ve liked to have seen a non-trivial example, or some more information about how it’s used in the real world.

Power isn’t pretty – Erlang is designed for fault tolerance, not usability. Its network IO is fundamentally async but reliable, unlike RPC or CORBA.

App architecture without RDBMS vs NoSQL drama

Speakers: Jeff Putz

The speaker has a lot of good experience to share. He’s obviously worked in diverse applications, and I appreciate hearing about tradeoffs between technologies instead of just advertising the latest tech. I think it’s really important to repeat the message about not being obsessed with the latest tech just because it’s new. I wish the presentation didn’t get sidetracked by arguments about issues of personal preference in DB design.

NoSQL advantages – less CPU, high write thruput, maybe higher dev productivity

Fight the urge to normalize everything. Don’t make complicated schema for queries that will never be used. Focus on the problem domain, not the persistence and code style. Running multiple queries can be OK (as compared to a join).

SQL can do key value pairs OK. The death of SQL in 2010 was greatly exaggerated.

Aggregate queries with joins cause lots of database work. Avoid redoing them in real time. Use the client layer to maintain a cached state of frequently queried aggregates; don’t be afraid to store redundant data because it’s so cheap now. SQL was originally designed to minimize storage at the expense of CPU (i.e. normalization).

Scala for the Java Developer: It’s Easier Than You Think

Speakers: Justin Pihony

I liked this talk because I want to learn at least one new programming paradigm every time I come to CodeMash, and the speaker did a great job of being an ambassador for Scala. I appreciated hearing about the limitations and realistic expectations for the language. Things to improve: show applications and companies using it.

Scala runs on JVM; compatible with java, but it’s functional and immutable first. Less verbose; fixes many annoyances with Java’s legacy conventions. Includes REPL.

Ride the rails: Handling errors the functional way

Speakers: Sam Hanes

The speaker did a very good talk on functional programming basics in F#. I enjoy sessions like these for reminding me that there are alternatives to traditional imperative paradigms. Suggestion: use font colors with better contrast (dark red text on black backgrounds is difficult to read). Overall very good talk.

Functional programming – avoid mutable state.

Use bind to connect a switch function to two track handling. Exceptions can be converted into failures (if they are predictable enough to catch).
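The talk used F#; as a rough illustration of the same idea, here is a minimal Python sketch. The names `Success`, `Failure`, `bind`, and `catch` are my own choices for illustration, not the speaker's code. A "switch" function takes a plain value and returns either a success or a failure; `bind` adapts it so it can sit on a two-track pipeline and pass failures straight through.

```python
class Success:
    def __init__(self, value):
        self.value = value


class Failure:
    def __init__(self, error):
        self.error = error


def bind(switch_fn):
    """Adapt a one-track-in, two-track-out function to two-track input."""
    def two_track(result):
        if isinstance(result, Failure):
            return result  # stay on the failure track, skip the step
        return switch_fn(result.value)
    return two_track


def catch(fn, exc_type):
    """Turn a function that raises a predictable exception into a switch function."""
    def switch_fn(value):
        try:
            return Success(fn(value))
        except exc_type as exc:
            return Failure(str(exc))
    return switch_fn


def check_positive(number):
    return Success(number) if number > 0 else Failure("must be positive")


def pipeline(text):
    """Parse text as an integer, then validate it, short-circuiting on failure."""
    result = Success(text)
    for step in (bind(catch(int, ValueError)), bind(check_positive)):
        result = step(result)
    return result
```

Calling `pipeline("42")` ends on the success track, while `pipeline("abc")` and `pipeline("-1")` both end on the failure track without any exception reaching the caller.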

A Game of Theories: why languages do what they do

Speakers: Rae Krantz

The speaker did a fun talk about comparing programming languages and GoT. This type of talk is nice to have as a break in between “serious business” and learning new tech. Suggestions: use the full time slot. Show the same algorithm implemented in different languages. Talk more about the supporting libraries (not just the language itself).

Ruby, Python, Go, Erlang, Clojure, JavaScript – not sure why these 6 as opposed to any other survey of languages. Popularity?

(I went to this talk because the session “The Polyglot Data Scientist – Adventures with R, Python, and SQL” by Sarah Dutkiewicz was cancelled. It sounded like the snow storm scared a lot of people into leaving CodeMash early, and some other speakers cancelled their talks for being sick.)

Image Recognition with Convolutional Neural Networks using Keras and CoreML

Speakers: Tim LeMaster

This talk was too introductory, barely covering any applications and only talking about history. I walked out after 10 minutes to go eat lunch and do professional networking.

How to Count Your Chickens After They’ve Hatched

Speakers: Gary Short

The speaker is very entertaining and amusing, and it’s great to see a fun talk about a relevant topic (ML). Images are easy to relate to, and it’d be cool to see more talks with them. I don’t think C# was the right choice for this algorithm – python sklearn has built-in implementations for this problem.

Counting chickens in brightfield images – threshold grayscale image, then try k-NN. k-NN doesn’t work because k is unknown? The algorithm is pretty ad hoc, but more power to him if it works. 85% accurate but told to deploy it by customer farmer anyway.

[Sponsor Session] Attracting and Retaining Top Technical Talent (a.k.a. “Insomnia Cure #1 for Software Development Leaders”)

Speakers: Stout Systems

This talk was pretty good, and it made me feel better about my chances of getting a different job someday. On a daily basis, I really have no idea how good the market is for programmers, but I still remember the 2004 IT recession.

top 5 for recruiting/retaining talent –

5. salary (and PTO, retirement, health care, bonus, stock, options – cash equivalents)

4. workplace culture (office space features, remote work, flex schedule, work/life balance)

3. career growth (holding same job for years is boring and a career killer; upskilling and training is good. )

2. [lack of] process (shifting requirements, changing priorities, inconsistent deadlines, no deployments) – also leadership issues (excessive meetings, no clear decisions).

1. technology stack (fear of extinction; huge, messy codebase) – transform or evolve, within appropriate constraints (relevant tech, reasonable schedules). automate mundane tasks (build, deploy). allow some freedom of tools (svn vs git, OS, editors).

R Performance (It’s not R, it’s You)

Speakers: Tim Hoolihan

The speaker gave a pretty good overview of performance issues in R. I’m not really an R user, so I attended this talk just to see if it was much better or different than python based ML. My conclusion: no because R uses the python ML libraries!

How Not to Destroy Data

Speakers: Michael Perry

I liked how this talk summarized an academic topic in a relatable way. Despite being in the last time slot, I learned some fascinating ideas about Historical Modelling. I like having some more challenging topics to attend.

Audit log problems – not reliable or type safe. Large. Simple

Event sourcing. Derive object state by reading whole table of changes. Order is significant.
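A minimal sketch of that replay idea, assuming a made-up event shape (dicts with `type`, `id`, and `name` keys) that is not from the talk:

```python
def replay(events):
    """Derive the current state of one entity by folding over its change log."""
    state = {}
    for event in events:
        kind = event["type"]
        if kind == "created":
            state = {"id": event["id"], "name": event["name"]}
        elif kind == "renamed":
            state["name"] = event["name"]
        elif kind == "deleted":
            state = {}
    return state


EVENTS = [
    {"type": "created", "id": 1, "name": "Widget"},
    {"type": "renamed", "id": 1, "name": "Gadget"},
    {"type": "renamed", "id": 1, "name": "Gizmo"},
]

current = replay(EVENTS)
# Swapping the two rename events yields a different final name,
# which is why the order of the log is significant.
swapped = replay([EVENTS[0], EVENTS[2], EVENTS[1]])
```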

Historical modelling. Partially ordered. Not simple, but better??

* Every field is immutable
* Surrogate key is only used internally (not in API)
* everything else is the identity in API

Use timestamps as “uniquifiers”

* A fact cannot be deleted
* query uses WHERE NOT EXISTS subclause
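One way to read the last two bullets is that a deletion is itself recorded as a new fact, and queries filter out anything that has a deletion fact pointing at it. The sketch below uses Python's built-in `sqlite3` module with a hypothetical two-table schema (`person`, `person_deleted`); it is my interpretation of the notes, not the speaker's schema.

```python
import sqlite3


def build_db():
    """Create an in-memory database with an insert-only fact table per concept."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE person_deleted (
            person_id INTEGER NOT NULL REFERENCES person(person_id)
        );
        """
    )
    return conn


def active_people(conn):
    """Return names of people that have no deletion fact recorded against them."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM person AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM person_deleted AS d WHERE d.person_id = p.person_id
        )
        ORDER BY p.person_id
        """
    )
    return [name for (name,) in rows]


conn = build_db()
conn.executemany(
    "INSERT INTO person (person_id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)
# "Deleting" Bob adds a fact; no row is ever updated or removed.
conn.execute("INSERT INTO person_deleted (person_id) VALUES (?)", (2,))
```

Bob's original row is still in the `person` table, so the full history remains available even though `active_people(conn)` no longer reports him.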

speaker name predecessor record – a Name is identified by its nameId and all predecessor records. This is git-style version control inside a db!

Mutable properties can’t be part of entity. detect and resolve conflicts via knowing predecessors.

Events move entities forward thru workflow, pointing backward to previous event.

Advantages: no locks. Offline data is OK. Cluster synch is easier (eventual consistency via Active-Active clusters). Microservice as historical db.


Presentations List

Here is a list of work I’ve presented since 2015 at various local users groups and conferences. These projects show a commitment to continued professional development and networking. I’ll keep updating this list as time goes by.


FoxCon Toledo 2018 Presentation – Big Data from Automotive Assembly Errors (TBD: Jan 27, 2018 ?).

CodeMash 2018 Lightning talk (TBD) – Fast comparison of Tesseract and a Convolutional Recurrent Neural Network for automatically reading dot peened VIN stamps.


Google Developers Group Toledo and Toledo Web Professionals meetings on Nov 9, 2017 – Building Your Own Collaborative Editing Web Forms For Free.

FoxCon Toledo 2017 Presentation – Using Machine Learning to Automatically Predict and Identify Defects in Automotive Assembly Processes

Northwest Ohio SQL PASS Chapter presentation – Using Columnstore Indexes to Store and Analyze Billions of Torques. Cancelled because users group has been abandoned!


Toledo Web Professionals 2016 Presentation: Real-time Messaging to Webapps from a Production Database

FoxCon Toledo 2016 Presentation – Building Webapps to Help You Build a Jeep (video references here).


FoxCon Toledo 2015 Presentation – SQL Server Notifications in a manufacturing environment. Also presented at the NWNUG meeting in Feb 2015 and again at the FANUG meeting shortly after.

Northwest Ohio SQL PASS Chapter Presentation – an edited version of SQL Server Notifications in a manufacturing environment with less C# and more SQL Server details about database administration for Service Broker.

Presentation – Building Your Own Collaborative Editing Web Forms For Free

Links to slides: Google Slides, PDF format, PPT format

Presented at Google Developers Group Toledo and Toledo Web Professionals meetings on Nov 9, 2017.


Have your business users ever said, “we have this Excel form that we need to turn into a web form”? So you, as a full-stack web developer, turn it into an HTML form with an AJAX POST to save the data to a database table in the standard CRUD model. It works well enough, but then the users say, “we really need simultaneous editing for everyone in the office.” Now you’re in trouble because you’re trying to reinvent Google Docs / Sheets, which are very complex. How do those collaborative editing tools really work?

We will look at the Google Realtime API, which uses Operational Transformation to automatically synchronize a document between multiple simultaneous editors. Everything you know about distributed version control systems applies to this situation, but now you need to make it work seamlessly for users who have never heard of “git rebase”. The client and server implementation of Operational Transformation is very challenging.

Alternatives to Operational Transformation (OT) include Differential Synchronization (DS, aka. 3-way merge), Conflict-free Replicated Data Type (CRDT), and plain old Last Write Wins (LWW). For any of these methods, the critical elements include websockets, message queues, and persistent data storage. This presentation will include a demonstration of a simple implementation of collaborative editing with free, locally hosted software only.
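To make the simplest of these alternatives concrete, here is a minimal sketch of a Last Write Wins register for a single form field. The class and method names are illustrative assumptions, and ties between equal timestamps are broken by replica ID, which is one common choice rather than the only one.

```python
class LWWRegister:
    """A single value where the write with the highest (timestamp, replica) wins."""

    def __init__(self):
        self.value = None
        self.stamp = (0, "")  # (timestamp, replica_id) of the winning write

    def write(self, value, timestamp, replica_id):
        stamp = (timestamp, replica_id)
        if stamp > self.stamp:
            self.value = value
            self.stamp = stamp

    def merge(self, other):
        """Fold another replica's state into this one; safe to repeat or reorder."""
        self.write(other.value, *other.stamp)


# Two editors change the same field while disconnected from each other.
alice = LWWRegister()
bob = LWWRegister()
alice.write("torque spec rev A", timestamp=1, replica_id="alice")
bob.write("torque spec rev B", timestamp=2, replica_id="bob")

# After exchanging state in either order, both replicas agree.
alice.merge(bob)
bob.merge(alice)
```

The price of this simplicity is that Alice's edit is silently discarded, which is exactly the trade-off that OT, DS, and richer CRDTs try to avoid.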


Presentation idea – Using Columnstore Indexes to Store and Analyze Billions of Torques

This is an abstract I wrote for a local area users group, only to find out that they had stopped meeting. Toledo doesn’t seem to be able to keep many computer-based user interest groups alive. I’m posting it here for my own records.

Torque tools (impact wrenches, electronic motors, rotary torque sensors) produce lots of data as a vehicle is being built. Torque data is used to control the movement of the production lines and to do quality control analysis. During production, traditional row-based storage makes the most sense, using the VIN and torque tool ID as the primary key in a table optimized for insertion speed and VIN-based lookups. However, post-production it makes more sense to switch to column-based storage for quality control analysis because aggregation functions (mean, std. dev.) need to be computed over large ranges of data (regardless of VIN), and many torque values are similar. Columnstore indexes are a good solution to the latter storage and analysis problem, and they have gotten much better in SQL Server 2016. I’ll describe how they’re useful for this problem for analyzing how well the torque tools are working using simple statistical techniques.

Thoughts on Project Euler in Python

I’ve been doing Project Euler problems to learn more about Python 3.5+ (as opposed to 2.7). The Project Euler website says, “Real learning is an active process and seeing how it is done is a long way from experiencing that epiphany of discovery. Please do not deny others what you have so richly valued yourself.” So I’m not supposed to post full solutions or even hints. However, I can make general comments that are not specific to any Project Euler problems.

I find it very helpful to have the following tools ready to go before doing any Project Euler problems: a list of primes up to 10**9 or so, a list of primitive Pythagorean triplets up to 10**6, and the latest version of a big integer library. Much of Project Euler is based on prime numbers, so I used primegen to create a text file with lots of primes in it. Reimplementing the Sieve of Eratosthenes gets really boring after a while, and the Sieve of Atkin is much better anyway.
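For reference, a compact Sieve of Eratosthenes looks like the sketch below. It is a generic textbook version, not tied to any particular problem, and it is fine for limits in the millions; for a list up to 10**9 a dedicated tool such as primegen is the better choice.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross off n*n, n*n + n, ...; smaller multiples were already handled.
            multiples = range(n * n, limit + 1, n)
            is_prime[n * n::n] = bytearray(len(multiples))
    return [n for n, flag in enumerate(is_prime) if flag]
```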

Pythagorean triplets show up occasionally, so I have a static text file with primitive triplets up to 10**6. I used an existing pythagorean triplets generator, but the matrix formula isn’t too terrible to implement.
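I am assuming "the matrix formula" means the three Berggren matrices, which generate every primitive triplet exactly once starting from (3, 4, 5). A minimal sketch under that assumption:

```python
# Each matrix maps a primitive triple (a, b, c) to another primitive triple.
BERGGREN = (
    ((1, -2, 2), (2, -1, 2), (2, -2, 3)),
    ((1, 2, 2), (2, 1, 2), (2, 2, 3)),
    ((-1, 2, 2), (-2, 1, 2), (-2, 2, 3)),
)


def primitive_triples(max_hypotenuse):
    """Return primitive Pythagorean triples with c <= max_hypotenuse, sorted by c."""
    stack = [(3, 4, 5)]
    found = []
    while stack:
        triple = stack.pop()
        if triple[2] > max_hypotenuse:
            continue  # children always have a larger hypotenuse, so prune here
        found.append(triple)
        for matrix in BERGGREN:
            child = tuple(sum(m * t for m, t in zip(row, triple)) for row in matrix)
            stack.append(child)
    return sorted(found, key=lambda t: t[2])
```

Note that the legs are not sorted within a triple: the tree produces (15, 8, 17) rather than (8, 15, 17), for example.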

Python 3 has much improved support for arbitrary-length integers over Python 2. The separate long datatype is gone: int and long were unified into a single arbitrary-precision int. The only real problem is remembering the change in the division operator (/ is now true division, and // is integer division), but I’ve used “from __future__ import division” practically everywhere in my Python 2 code.
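A short illustration of why the division operators matter once the integers get large (the specific numbers are arbitrary examples):

```python
n = 10 ** 30 + 1  # an odd integer far beyond 64 bits; Python 3 ints never overflow

true_quotient = 7 / 2     # true division always returns a float
floor_quotient = 7 // 2   # floor division keeps integer operands as int
negative_floor = -7 // 2  # floors toward negative infinity, not toward zero

exact = n // 1            # stays an exact arbitrary-precision int
lossy = int(n / 1)        # goes through a float and loses the low digits
```

The `lossy` value differs from `n` because a float carries only 53 bits of mantissa, so `//` is the operator to reach for whenever an exact integer result is needed.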

Generators and comprehensions have big improvements in Python 3+. I really like them for using a functional programming style, and they can improve program speed and reduce memory usage a bit.

Finally, regardless of language, I always try to review some common strategies from dynamic programming and graph search algorithms. It’s surprising how often these ideas are part of the best solutions to Project Euler problems.

Linear algebra review

It’s been a while since I tried to solve a system of equations without
using a numerical library, so I figured it was time to do a linear
algebra review.

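\[
A=\left[\begin{array}{cc}
1 & 2\\
3 & 4
\end{array}\right]=\left[\begin{array}{c}
r_{1}^{T}\\
r_{2}^{T}
\end{array}\right]=\left[\begin{array}{cc}
c_{1} & c_{2}\end{array}\right]
\]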

This should be the easiest matrix to work with: small size, nice integer values, non-collinear columns and rows, non-zero determinant, full rank, etc. Let’s go through the basic definitions, just because it’s been a while. $A$ is 2×2 (MxN), which is small and square. Let’s check the columns of $A$ for collinearity by reducing to row echelon form by adding $-3r_{1}^{T}$ to row 2:

\[
\left[\begin{array}{cc}
1 & 2\\
0 & -2
\end{array}\right]
\]

Then adding $r_{2}^{T}$ to row 1:

\[
\left[\begin{array}{cc}
1 & 0\\
0 & -2
\end{array}\right]
\]

So the good news is that the columns of $A$, $\left[\begin{array}{c}
1\\
3
\end{array}\right]$ and $\left[\begin{array}{c}
2\\
4
\end{array}\right]$, are linearly independent and form a basis for $\mathbb{R}^{2}$, so rank($A$) is 2. The null space of $A$ therefore contains only the zero vector. That’s the best possible outcome for a square matrix because it means that an inverse exists.

But before exploring that, let’s think about the row space of $A$. Using the same row reductions as above, we can conclude that $A$ has linearly independent rows, $\left[\begin{array}{cc} 1 & 2\end{array}\right]$ and $\left[\begin{array}{cc} 3 & 4\end{array}\right]$, which form a basis for $\mathbb{R}^{2}$, and rank($A^{T}$) is 2. So the left null space of $A$ also contains only the zero vector. The row space of $A$ is the column space of $A^{T}$ by definition, and $A$ has full rank, so an inverse exists. Let’s use Gauss-Jordan elimination to find it:

\[
\left[\begin{array}{ccccc}
1 & 2 & | & 1 & 0\\
3 & 4 & | & 0 & 1
\end{array}\right]
\]

Starting with an augmented matrix $\left[\begin{array}{ccc}
A & | & I\end{array}\right]$, we can use row operations to find $\left[\begin{array}{ccc}
I & | & A^{-1}\end{array}\right]$. Add $-3r_{1}^{T}$ to row 2:

\[
\left[\begin{array}{ccccc}
1 & 2 & | & 1 & 0\\
0 & -2 & | & -3 & 1
\end{array}\right]
\]

Then add $r_{2}^{T}$ to row 1:

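\[
\left[\begin{array}{ccccc}
1 & 0 & | & -2 & 1\\
0 & -2 & | & -3 & 1
\end{array}\right]
\]

Finally rescale row 2 by $-\frac{1}{2}$:

\[
\left[\begin{array}{ccccc}
1 & 0 & | & -2 & 1\\
0 & 1 & | & \frac{3}{2} & -\frac{1}{2}
\end{array}\right]
\]

\[
A^{-1}=\left[\begin{array}{cc}
-2 & 1\\
\frac{3}{2} & -\frac{1}{2}
\end{array}\right]
\]

Now we need to check that $A^{-1}A=I$:

\begin{eqnarray*}
\left[\begin{array}{cc}
-2 & 1\\
\frac{3}{2} & -\frac{1}{2}
\end{array}\right]\left[\begin{array}{cc}
1 & 2\\
3 & 4
\end{array}\right] & = & \left[\begin{array}{cc}
1 & 0\\
0 & 1
\end{array}\right]
\end{eqnarray*}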

Of course, there is an explicit formula for the inverse of a 2×2 matrix:

\[
\left[\begin{array}{cc}
a & b\\
c & d
\end{array}\right]^{-1}=\frac{1}{\det A}\left[\begin{array}{cc}
d & -b\\
-c & a
\end{array}\right]
\]

For our $A$, $\det A=ad-bc=-2$. Unfortunately, closed-form inverses for larger matrices are so long and complex as to be of limited utility. But there is at least one important idea to take away from the inverse: it can only exist if the determinant is non-zero. This becomes a very important fact for the eigenvalue problem.
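As a quick sanity check, the explicit 2×2 formula can be evaluated with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the helper names are my own.

```python
from fractions import Fraction


def inverse_2x2(matrix):
    """Invert [[a, b], [c, d]] with the explicit formula, using exact fractions."""
    (a, b), (c, d) = matrix
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular: determinant is zero")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(left, right):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3, 4]]
A_inv = inverse_2x2(A)
identity_check = matmul_2x2(A_inv, A)
```

For this $A$ the result is rows $\left[-2,\,1\right]$ and $\left[\frac{3}{2},\,-\frac{1}{2}\right]$, and `identity_check` comes out as the identity matrix.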

One last thing I wanted to write about for now: the L2 norm (or Euclidean norm) of a vector $x=\left[\begin{array}{cccc}
x_{1} & x_{2} & \cdots & x_{n}\end{array}\right]^{T}$ is defined as:

\[
\left\Vert x\right\Vert _{2}^{2}=\sum_{i=1}^{n}x_{i}^{2}
\]

Suppose that we’re fitting data $\left(a_{ij},y_{i}\right)$ to a known linear model $A$ and we want to determine the unknown coefficients $x$ that best fit the data using ordinary least squares:

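\begin{eqnarray*}
\left[\begin{array}{c}
y_{1}\\
y_{2}\\
\vdots\\
y_{m}
\end{array}\right] & = & \left[\begin{array}{cccc}
a_{1,1} & a_{1,2} & \cdots & a_{1,n}\\
a_{2,1} & a_{2,2} & \cdots & a_{2,n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{array}\right]\left[\begin{array}{c}
x_{1}\\
x_{2}\\
\vdots\\
x_{n}
\end{array}\right]\\
y & = & Ax
\end{eqnarray*}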

And we want to choose the $x$ which minimizes the L2 norm of the residuals because we assume them to be Gaussian:

\[
J\left(x\right)=\left\Vert Ax-y\right\Vert _{2}^{2}
\]

Then $J\left(x\right)$ can be expanded:

\begin{eqnarray*}
\left\Vert Ax-y\right\Vert _{2}^{2} & = & \left(Ax-y\right)^{T}\left(Ax-y\right)\\
 & = & \left(x^{T}A^{T}-y^{T}\right)\left(Ax-y\right)\\
 & = & x^{T}A^{T}Ax-x^{T}A^{T}y-y^{T}Ax+y^{T}y
\end{eqnarray*}

However, $x^{T}A^{T}y$ is a scalar, so it equals its own transpose $y^{T}Ax$, and the last expression can be further simplified:

\begin{eqnarray*}
x^{T}A^{T}Ax-x^{T}A^{T}y-y^{T}Ax+y^{T}y & = & x^{T}A^{T}Ax-2y^{T}Ax+y^{T}y
\end{eqnarray*}

This might look ugly, but we can now minimize $J\left(x\right)$ by
taking the derivative with respect to $x$:

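\begin{eqnarray*}
\frac{dJ\left(x\right)}{dx} & = & \left(2x^{T}A^{T}A-2y^{T}A\right)^{T}\\
 & = & 2A^{T}Ax-2A^{T}y
\end{eqnarray*}

Where we wrote the derivative as a column vector by transposing the row vector $2x^{T}A^{T}A-2y^{T}A$, using the facts that $\left(x^{T}A^{T}A\right)^{T}=A^{T}Ax$ and $\left(y^{T}A\right)^{T}=A^{T}y$.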

Now we can derive the celebrated pseudo-inverse of $A$ by setting the derivative to zero:

\begin{eqnarray*}
2A^{T}Ax-2A^{T}y & = & 0\\
A^{T}Ax & = & A^{T}y\\
x & = & \left(A^{T}A\right)^{-1}A^{T}y
\end{eqnarray*}
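To make the normal equations concrete, here is a minimal sketch that fits a straight line $y\approx x_{1}+x_{2}t$ to four made-up data points. The data, the helper names (`transpose`, `solve`, `least_squares`), and the use of exact fractions are all assumptions for illustration. It solves $A^{T}Ax=A^{T}y$ by elimination rather than forming the inverse explicitly.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def solve(m, rhs):
    """Solve m x = rhs by Gauss-Jordan elimination (m square and non-singular)."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(m, rhs)]
    for col in range(n):
        # Swap in a row with a non-zero pivot, then normalise it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Clear this column in all other rows.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def least_squares(A, y):
    """Return the x that satisfies the normal equations A^T A x = A^T y."""
    At = transpose(A)
    AtA = [[sum(a * b for a, b in zip(r, c)) for c in At] for r in At]
    Aty = [sum(a * b for a, b in zip(row, y)) for row in At]
    return solve(AtA, Aty)


# Hypothetical data: a column of ones (intercept) and t = 0, 1, 2, 3.
A = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 7]
x_hat = least_squares(A, y)   # [9/10, 19/10]
```

For this data $A^{T}A=\left[\begin{array}{cc} 4 & 6\\ 6 & 14\end{array}\right]$ and $A^{T}y=\left[\begin{array}{cc} 15 & 32\end{array}\right]^{T}$, so the fitted line is $y\approx0.9+1.9t$.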

However, the inverse of $A^{T}A$ may not exist, or it may be very hard to compute (due to numerical instability). An alternative solution is to use gradient descent. Since we already have $\frac{dJ\left(x\right)}{dx}$ in a nice form:

\[
\frac{dJ\left(x\right)}{dx}=2A^{T}\left(Ax-y\right)
\]

We can start at any point $x=x_{0}$ and take a step against the direction of the derivative evaluated at $x_{0}$, $x_{1}=x_{0}-\gamma\left.\frac{dJ\left(x\right)}{dx}\right|_{x=x_{0}}$, where $\gamma>0$ is the step size. The problem is how big of a step to take. Despite the existence of a global minimum and an analytical form for the derivative, the steps could either be too small (taking forever to converge) or too large (diverging even when starting near the global minimum). This is where I wrote a paper about bracketing the minimum using a priori constraints (i.e., Golden section search and Brent’s method), but variations on line search minimization are also possible.
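The step-size problem is easy to reproduce. The sketch below runs fixed-step gradient descent on a small made-up line-fitting problem (the data and the function names are assumptions for illustration, not from my paper). For this particular $A$, the iteration only converges when $\gamma$ is below one over the largest eigenvalue of $A^{T}A$, which is roughly 0.06.

```python
def gradient(A, x, y):
    """Return dJ/dx = 2 A^T (A x - y) for J(x) = ||A x - y||^2."""
    residual = [sum(a * xi for a, xi in zip(row, x)) - yi for row, yi in zip(A, y)]
    return [2 * sum(row[j] * r for row, r in zip(A, residual)) for j in range(len(x))]


def gradient_descent(A, y, x0, gamma, steps):
    """Take `steps` fixed-size steps against the gradient, starting from x0."""
    x = list(x0)
    for _ in range(steps):
        g = gradient(A, x, y)
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical data: fit y ~ x1 + x2 * t at t = 0, 1, 2, 3.
A = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 7]

# Small enough step: slowly approaches the least-squares solution [0.9, 1.9].
x_good = gradient_descent(A, y, [0.0, 0.0], gamma=0.05, steps=500)

# Slightly larger step: every iteration overshoots further and the iterates blow up.
x_bad = gradient_descent(A, y, [0.0, 0.0], gamma=0.07, steps=100)
```

The two step sizes differ by less than a factor of two, yet one converges and the other diverges, which is why some kind of line search or bracketing is usually needed.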

I’ve read that the conjugate gradient method is the more popular solution to this problem now. Gradient descent searches strictly along the negative gradient, whereas the conjugate gradient method chooses a different search direction every time. A Gram-Schmidt-style procedure is used to make each new search direction conjugate to the previous ones, and then the conjugate gradient method moves along that new basis. This can be much faster than gradient descent, but it can become slow if the condition number of $A$ is too large. But it’s still a good choice because it doesn’t require the Hessian matrix to be calculated or inverted (as per Newton’s method).
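Here is a rough sketch of the linear conjugate gradient method applied to the normal equations $A^{T}Ax=A^{T}y$, using the same kind of made-up line-fitting data as a test case. The function names are my own, and this is the textbook recurrence rather than any particular library’s implementation. Note that it never forms or inverts $A^{T}A$; it only needs matrix-vector products.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_least_squares(A, y, x0, tol=1e-12, max_iter=100):
    """Minimise ||A x - y||^2 by conjugate gradients on A^T A x = A^T y."""
    At = [list(col) for col in zip(*A)]

    def normal(v):
        # Apply A^T A to v without building the matrix product.
        return matvec(At, matvec(A, v))

    b = matvec(At, y)
    x = list(x0)
    r = [bi - ni for bi, ni in zip(b, normal(x))]   # residual of the normal equations
    p = list(r)                                     # first direction: steepest descent
    rs = dot(r, r)
    iterations = 0
    while iterations < max_iter and rs ** 0.5 > tol:
        Np = normal(p)
        alpha = rs / dot(p, Np)                     # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ni for ri, ni in zip(r, Np)]
        rs_new = dot(r, r)
        # New direction: residual made conjugate to the previous direction.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
        iterations += 1
    return x, iterations


# Hypothetical data: fit y ~ x1 + x2 * t at t = 0, 1, 2, 3.
A = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 7]
x_cg, n_iter = cg_least_squares(A, y, [0.0, 0.0])
```

In exact arithmetic the method needs at most $n$ iterations for $n$ unknowns (two here), and there is no step size to tune because each step length comes from an exact line search.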

If you’ve got lots of memory and $m,n$ are small-ish, the Levenberg-Marquardt algorithm can converge even faster because it approximates the Hessian with $J^{T}J$ (where $J$ here denotes the Jacobian matrix of the residuals, not the cost $J\left(x\right)$ above) and blends between the gradient descent direction and the approximate-Hessian direction, whichever is better. Unfortunately, it doesn’t work with regularization, and it has a few more internal parameters, and it usually runs out of memory when computing $\left(J^{T}J+\lambda I\right)^{-1}$. So I usually end up using the conjugate gradient method anyway because it can be regularized and doesn’t require crazy amounts of memory.
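For the linear model used in this post, the Jacobian of the residual $Ax-y$ is just $A$, so a Levenberg-Marquardt step solves $\left(A^{T}A+\lambda I\right)\delta=-A^{T}\left(Ax-y\right)$. The sketch below is a toy version under that assumption. The function names, the starting value of $\lambda$, and the accept/reject rule that divides or multiplies $\lambda$ by 10 are illustrative choices, not the settings of any specific library.

```python
def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def cost(A, x, y):
    """J(x) = ||A x - y||^2."""
    return sum((sum(a * xi for a, xi in zip(row, x)) - yi) ** 2 for row, yi in zip(A, y))


def levenberg_marquardt(A, y, x0, lam=1.0, steps=50):
    n = len(x0)
    At = [list(col) for col in zip(*A)]
    AtA = [[sum(a * b for a, b in zip(r, c)) for c in At] for r in At]
    x = list(x0)
    for _ in range(steps):
        residual = [sum(a * xi for a, xi in zip(row, x)) - yi for row, yi in zip(A, y)]
        g = [sum(a * r for a, r in zip(col, residual)) for col in At]   # A^T (A x - y)
        damped = [[AtA[i][j] + (lam if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
        delta = solve(damped, [-gi for gi in g])
        trial = [xi + di for xi, di in zip(x, delta)]
        if cost(A, trial, y) < cost(A, x, y):
            x, lam = trial, lam / 10    # step helped: rely more on the J^T J direction
        else:
            lam *= 10                   # step did not help: lean towards gradient descent
    return x


# Hypothetical data: fit y ~ x1 + x2 * t at t = 0, 1, 2, 3.
A = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 4, 7]
x_lm = levenberg_marquardt(A, y, [0.0, 0.0])
```

Solving the damped system at every step is where the memory goes: the $n\times n$ matrix $A^{T}A+\lambda I$ has to be stored and factored, which is fine for two unknowns and painful for millions.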

At CodeMash 2017 I heard a presentation about artificial neural networks where the presenter complained bitterly about how his L2 minimization (“backtracking”) in the neural network was converging very slowly. I thought to suggest an improved algorithm (conjugate gradient), but his talk was focused on a high level introduction with no math, so it didn’t seem appropriate at the time. That inspired me to write this post. Then I saw that someone else already did conjugate gradients with artificial neural networks in 1992. 😛