2012-05-16

2012 Google Code Jam, Round 1A

In the next round of the Google Code Jam this year, 3686 people scored at least some points. Only the top 1000 would advance, though, which took at least 53/100 points and a fast enough time.

I first attacked Problem B. Kingdom Rush. It has a long explanation you can see at the link, but here's a short summary:
  • There are N levels to finish
  • Each level can be completed with a 1 star or 2 star rating; if it is previously completed with 1 star, completing it with 2 stars only earns one additional star, and completing it with 1 star is meaningless. Thus, the maximum possible stars to earn is 2N
  • There is a prerequisite number of stars to get a 1 star rating on a level, and a greater or equal prerequisite to finish the level with 2 stars.
The object is to find the fewest number of runs possible to complete all levels with 2 stars, if it's possible at all. Levels don't have to be completed in any order.

One sample input is:
0 2
0 1
meaning that there are 2 levels, and the first level has a pre-req of 0 for 1 star, and 2 for 2 stars. One example run could be finishing level 2 with 1 star (now you have 1 star), then finishing level 2 with with 2 stars (now you have 2 stars), then finishing level 1 with 2 stars in one single run because the pre-req is only 2 stars, for a total of 3 runs.

An impossible input would be
1 1
1 1
1 1
because, starting with 0 stars, there's no way to finish any level first.
Also impossible would be
0 0
0 0
0 6
because you couldn't get 6 stars before attempting 2 stars on the last level.

My first approach was to try to finish the first possible 2-star level, and if one wasn't reachable with the current level of stars, try the first possible 1 star level. This solved the test data, but my submissions for the small input kept getting rejected, so I knew there must be an additional trick. I tried to come up with solutions that broke my algorithm, and discovered one.
0 1
0 3

With this input, first I'd try all the 2-star values, and fail. Then I'd take the first 1-star value, leaving:
0 1
0 3

The next run would take the first possible 2-star value, at 1.
0 1
0 3

Next, the 2 star search would fail again, and it would take the next 0.
0 1
0 3

Then the 3, for a total of 4 runs.
0 1
0 3


But, I can see a faster way.  First, take the last 0
0 1
0 3

Now that I have 1 star, I can complete the first level all at once
0 1
0 3

And with 3 stars, I can complete the second level, for a total of 3 runs
0 1
0 3

Adding the run for the final level lets this one get finished in 3 runs instead of the previous 4. So the flaw in the algorithm was in taking the first available 1-star, when instead I needed to take the 1-star associated with the highest possible 2-star value, in order to leave the others available to possibly finish both stars in one single run. I fixed my algorithm and submitted the small input, which was then correct. I submitted the large input and had to wait until the end to find out it if it was right, which it was.

Next, I attempted Problem A. Password Problem. Again, it has a long explanation, I'll try to boil down to some bullet points.

  • I have already typed A characters of my password, which is B characters wrong.
  • The chances I typed each of the A previous characters correctly are given
  • All future characters will be typed correctly
  • My choices are
    • keep typing
    • backspace 1 to A characters and continue typing
    • hit enter and start over.

The object is to choose the strategy to minimize the expected number of keystrokes.
Side note: I don't usually like expected value that much.

Here was the contest example:


Suppose my password is "guest" and I have already typed the first two characters, but I had a 40% chance of making a mistake when typing each of them. Then there are four cases: 

  1. I typed "gu" without error. This occurs with probability 0.6 * 0.6 = 0.36. 
  2. I typed the 'g' correctly but I made a mistake typing the 'u'. Then I have two letters typed still, but the second one is wrong: "gX". (Here, the 'X' character represents a mistyped letter.) This occurs with probability 0.6 * 0.4 = 0.24. 
  3. I typed the 'u' correctly but I made a mistake typing the 'g': "Xu". This occurs with probability 0.4 * 0.6 = 0.24. 
  4. I made a mistake typing both letters, so I have two incorrect letters: "XX". This occurs with probability 0.4 * 0.4 = 0.16. 

I don't know how many mistakes I actually made, but for any strategy, I can calculate the expected number of keys required to use it. This is shown in the table below:
        "gu" "gX" "Xu" "XX" Expected
Probability                            0.36 0.24 0.24 0.16 -
Keystrokes if I keep typing            4      10   10   10 7.84
Keystrokes if I press backspace once   6       6   12   12 8.4
Keystrokes if I press backspace twice  8       8    8    8 8
Keystrokes if I press enter right away 7       7    7    7 7


My first thought here was that I would need to calculate the 2^A possibilities, so if I got character 1 right or wrong, combined with if I got character 2 right or wrong, etc. Since A could be up to 99999 in the problem statement, I knew that wouldn't be feasible. Then I realized it only mattered what the odds were of typing the first x characters correct, and anything after the first wrong character didn't effect the answer to the problem. that is why the columns under "Xu" and "XX" look the same

I treated the "keep typing" case as pressing backspace 0 times, so that it fit more nicely in my code. Now there were only 2 cases to check. Pressing enter right away was easy, the answer was always the same, so the expected value was always B+2, for the first enter, the B characters of the password, and the final enter.

Since the probability could be different for each letter, I stored them in a vector with the intent to multiply the ones I needed. Then I realized that what I really needed was the cumulative probability of getting all letters right up to any given letter, so I stored the values as
p1, p1*p2, p1*p2*p3,...
resulting in a new vector
P(0), P(1), P(2), P(3)...

Now I could figure out the answer in linear time. Letting the number of backspaces be C, and looping over C from 0 to A, the expected value with C backspaces was the product of all of the probabilities of typing characters 0 to A-C correct, times 2C+B-A+1 (C backspaces, C replacement characters, the rest of the password, plus the enter key), plus 1 minus that product times 2C+2B-A+2 ( The 2C+B-A+1 from before, plus another B characters when it is rejected, and the final enter).

So the answer to each case, given A, B, and the vector of p's is:
min(B+2,P(A-C)(2C+B-A+1),(1-P(A-C))(2C+2B-A+2))
for all C 0 to A

(I really need to get LaTeX working for Blogger to write fancier equations.)

I thought I had cleverly solved this one, and submitted the small input. Unfortunately, the large was wrong because I had not planned on reading in 900,000 characters from the input file on a single line, and some code I wrote several years ago broke. This is because each case was input as:
A B
p1 p2 p3 ... pA

Where px is the probability of getting character x correct. The probabilities were in the form x.xxxxxx, so a total of 9 characters each including numbers, decimal point, and space. With A up to 99999, there were nearly 900k characters on one of the lines.

So the takeaway from this round was to fix my input reader. I finished with 43 points, placing at #1463.

On to Round 1B!




No comments:

Post a Comment