26/03/2017

[Java] 0/1 Knapsack problem O(n log n) heuristic

The 0/1 knapsack problem is a fairly common problem in daily life and it's unfortunately one of those problems where you must accept that the optimal solution cannot be computed as quickly as you would like, no matter how hard you think about the problem. As a friend put it, "if the solution looks like black magic, there is probably something wrong with it".

The problem description states: given a recipient with a fixed capacity and a set of items, each with a specific weight and value, the goal is to find the subset that maximises the total value of the chosen items while not exceeding the maximum recipient capacity.
In the 0/1 version, you cannot break an object (not possible to only take a fraction of it) so you can only either take it or not.

Sometimes calculating the optimal solution is too expensive or simply unfeasible (eg online knapsack) so a different, possibly suboptimal approach has to be taken. The one described here uses a greedy algorithm and aims at maximising the total value by selecting items after ordering them by desirability in terms of value per space left after they are picked.

20/03/2017

[SQL] Excel COUNTIFS - count columns matching criteria in a row

Excel in his arsenal of useful functions has COUNTIFS, basically a count of how many elements in a one dimensional range match a specific criteria. It says multidimensional, but it's not, it's either the same criteria twice for more dimensions or a different criteria. Key point: one list at a time.

However this is a very basic need which is not immediately achievable in SQL as well since we cannot loop over columns in a row. That is, unless we remember that PIVOTing is actually a thing. In this specific case we use the inverse operation, UNPIVOT.

[Java] Mergesort sorting algorithm for int arrays

Mergesort is well known as one of the best sorting algorithms out there with a O(n log n) worst case runtime; it doesn't get much better than this without strange big-O tricks.

But while understanding how it works is easy, implementing it is a different story given that it combines two nested function calls in the recursion and requires careful array index management. Also Java pass by value adds a small gotcha for good measure.

You can find my implementation for int[] arrays (as in: not Integer, not ArrayList, not List, not whatever else. Pure and simple int primitive data type with old fashioned array, the kind that does not change size once created if you can still remember those) on my Gist along with some test cases.

12/03/2017

[Python] BloggerBackup script to backup Blogger posts

This script allows users to quickly download posts from their Blogger blog in order to keep a backup copy. On Windows there is the amazing BloggerBackupUtility but sadly that's not available for Linux as well, hence this small project comes to life.

You can get the utility here or check out the source code here. To run it, you must have Python 3 installed.

Usage is:

 bloggerbackup --api-key <API key> --blog <blog URL> --backup-dir <backup directory> [--start-date <date>] [--end-date <date>] [--verbose]  

options:

   --help - prints this help  
   
   --api-key <api key> - mandatory API key used to issue queries  
   
   --blog <blog URL> - mandatory URL of the blog to operate on  
   
   --backup-dir <backup directory> - mandatory directory where to put the posts backup  
   
   --start-date <date> - optional, date from which to begin fetching posts in RFC 3339 format: yyyy-MM-ddTHH:mm:ss+HH:mm  
   
   --end-date <date> - optional, date where to stop fetching posts in RFC 3339 format: yyyy-MM-ddTHH:mm:ss+HH:mm  
   
   --verbose - optional, prints debug information while processing  

The script is extremely barebone and definitely improvable. Also, it requires you to setup an own API key in order to issue queries in the free tier. To do so, visit the Google credentials page.

06/03/2017

[Java] BST element distance - parent pointer

Well, that's a bit anticlimactic. After playing with the other approaches, the one that initially gave me the most pain was also quite fast to implement as soon as the AHA! moment came.

However spoiler alert: the results are quite disappointing.

[Java] BST element distance - lowest common ancestor

So I was thinking about the parent pointer approach to find the distance between to elements in a binary search tree and avoiding using the path lists approach, which means finding the lowest common ancestor; that's the lowest node in the tree that is common to both paths to the two elements.
If the elements are one the child of the other, then it will be equal to the parent, otherwise it will either be the root if the elements are on completely different branches or the divergence node if the elements are in the same subtree.

Then I got down to coding and when I got to the LCA logic, finally, ... I did not use the parent pointer yet again :(

[Java] BST element distance - path lists

The exercise that stole too much time from me last time was about binary search trees. The description is quite simple: find the distance between two elements in a given tree; that means how many hops do you need to do to navigate to element B from element A; if the elements are the same, distance is 0, if one or both elements are missing, distance is -1.