Speed up your algorithms the Roman way

We recently built a string-matching application that gathers data, analyzes it and produces results based on the similarities between multiple documents. It was built with NodeJS and used MongoDB as its database. The app sometimes ran into performance problems, so I want to share the approach our team used to track them down.

Several conditions limited the options available to us:

  • Very long-running processes – regular debugging methods don’t work for us
  • We don’t have the original data – for security and GDPR reasons we shouldn’t
  • Big data – a lot of records to analyze

Summarising the problem 

After a lot of googling, research and development, I started to think about how I could summarize all of the “speed up” techniques I had found. I remembered a well-known phrase I learned at school. Julius Caesar said: “Veni, vidi, vici” (“I came; I saw; I conquered”). This phrase will help us answer our question: “How can you speed up your algorithms?”

Veni

In my analogy, “Veni” (“I came”) stands for all of the preparations we need to make before committing to action. If we don’t do these things, we won’t be able to “conquer”. In Caesar’s case, they included gathering an army, moving it into position, finding food and transport for the warriors, and so on: a lot of things that aren’t very visible but matter from the very beginning. It is like the setup: a strong foundation is essential. If I need to discover and fix performance issues, I first need to get my code into a state where it gives me enough information to detect issues and weak points. So, I want to highlight a few aspects (there are plenty of others, but let’s talk only about these):

A few notes on this list:

  1. Logs and timers – they need to give us genuinely useful information.
  2. Error handlers – we need to understand where a problem is, because that understanding is what lets us deal with it. Do not ignore `err`; handle it. This is best practice.
  3. Memory – make sure your operations won’t need more memory than is available, whether on disk or in RAM. If you get “Out of memory” errors, you’re doing something wrong. In NodeJS, streams can help with this.
  4. Show meaningful error messages to the user – especially during development, QA and users need a way to report bugs to you when an error is raised on their local PC. In our case, we showed a complete error message with a few suggestions and a link to a log file. This helped us considerably: bugs happen and we need to know about them as soon as possible.
  5. Do not crash – I know this overlaps with the previous points, but as software engineers we need to cover the whole picture, and our app shouldn’t crash under any circumstances.
  6. Benchmarks – there are a number of online services you can use to benchmark two or more algorithms and find the fastest solution.
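
To make the first two points more concrete, here is a minimal sketch of a timer registry in NodeJS. It is not the code from our app: the helper names (`timers`, `record`, `timed`) and the wrapped function are assumptions made for illustration. It counts how often each labelled operation runs, adds up the time spent in it, and logs errors with context before re-throwing them rather than swallowing them.

```javascript
// Minimal timer registry: records how often each labelled operation runs
// and how long it takes in total. All names here are illustrative.
const { performance } = require('perf_hooks');

const timers = new Map(); // label -> { count, totalMs }

function record(label, elapsedMs) {
  const entry = timers.get(label) || { count: 0, totalMs: 0 };
  entry.count += 1;
  entry.totalMs += elapsedMs;
  timers.set(label, entry);
}

// Wraps a synchronous function so every call is timed under `label`.
// Errors are logged with context and re-thrown instead of being ignored.
function timed(label, fn) {
  return function (...args) {
    const start = performance.now();
    try {
      return fn.apply(this, args);
    } catch (err) {
      console.error(`[${label}] failed:`, err.message);
      throw err;
    } finally {
      // Runs on success and on failure, so failed calls are counted too.
      record(label, performance.now() - start);
    }
  };
}

// Hypothetical operation standing in for a real step of the app.
const joinStrings = timed('string-joined', (parts) => parts.join(' '));

joinStrings(['veni', 'vidi', 'vici']);
joinStrings(['a', 'b']);

// count is 2 here; totalMs depends on the machine.
console.log(timers.get('string-joined'));
```

Keeping both the call count and the total time per label is a deliberate choice: either number on its own can be misleading when you later decide what to optimize.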

The result looks like this:

Vidi

I use “Vidi” (“I saw”) to mean that, from the tons of information in metrics, logs and assumptions (a practically infinite amount), we need to find the thing that should be fixed or changed. We need to analyze what we have gathered. We are like doctors with a patient who needs a cure. Let’s take a look at one example of timers:

At first glance, each “string-slow-readable-distance” operation takes much more time than “string-joined”. But if we look at `Count`, we see that “string-joined” is the one to prioritize: the number of calls matters as much as the time per call. This is a small example of how we can detect performance issues.
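
The same reasoning can be sketched in a few lines of NodeJS. The numbers below are made up for illustration and are not measurements from our app; only the two labels come from the example above. The idea is to rank operations by total time (call count multiplied by average time per call) rather than by time per call alone.

```javascript
// Hypothetical timer summaries; the numbers are invented for illustration
// and are not measurements from the original application.
const summaries = [
  { label: 'string-slow-readable-distance', count: 10, avgMs: 50 },
  { label: 'string-joined', count: 100000, avgMs: 0.5 },
];

// Total cost of an operation = number of calls * average time per call.
// Returns a new array sorted from most to least expensive in total.
function rankByTotalTime(rows) {
  return rows
    .map((row) => ({ ...row, totalMs: row.count * row.avgMs }))
    .sort((a, b) => b.totalMs - a.totalMs);
}

const ranked = rankByTotalTime(summaries);
for (const row of ranked) {
  console.log(`${row.label}: ${row.totalMs} ms total over ${row.count} calls`);
}
```

With these assumed numbers, the operation that looks cheap per call ends up first in the ranking, which is the pattern the `Count` column reveals.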

Vici

Vici (“I conquered”). The final step is solving the problem you’ve found. Whether you choose another module, another algorithm or even another framework, the point is that once you’ve laid the groundwork (Veni) and recognised what the problem is (Vidi), you’re all set to resolve it and get your app working smoothly.

Improving performance is like running a startup: you need to find a pain and then cure it. It is impossible to solve a problem that you can’t see. In code, you need to see the whole picture if you want to make it better. Only when you see the problem can you fix it. Don’t neglect Julius Caesar’s winning strategy.
