Sudoku

Sudoku is a logic-based number placement puzzle. The objective is to fill a 9×9 grid so that each column, each row, and each of the nine 3×3 boxes (also called blocks or regions) contains the digits from 1 to 9, only one time each (that is, exclusively). The puzzle setter provides a partially completed grid.

Completed Sudoku puzzles are a type of Latin square, with an additional constraint on the contents of individual regions. Leonhard Euler is sometimes incorrectly cited as the source of the puzzle, based on his work with Latin squares.

The modern puzzle was invented by an American architect, Howard Garns, in 1979 and published by Dell Magazines under the name "Number Place". It became popular in Japan in 1986, after it was published by Nikoli and given the name Sudoku, meaning single number. It became an international hit in 2005

Strategies

The strategy for solving a puzzle may be regarded as comprising a combination of three processes: scanning, marking up, and analyzing. The approach to analysis may vary according to the concepts and the representations on which it is based.

Scanning

Scanning is performed at the outset and throughout the solution. Scans need be performed only once between analyses. Scanning consists of two techniques:

Cross-hatching: The scanning of rows to identify which line in a region may contain a certain numeral by a process of elimination. The process is repeated with the columns. It is important to perform this process systematically, checking all of the digits 1–9.
Counting 1–9 in regions, rows, and columns to identify missing numerals. Counting based upon the last numeral discovered may speed up the search. It also can be the case, particularly in tougher puzzles, that the best way to ascertain the value of a cell is to count in reverse—that is, by scanning the cell\'s region, row, and column for values it cannot be, in order to see what remains.

Advanced solvers look for "contingencies" while scanning, narrowing a numeral\'s location within a row, column, or region to two or three cells. When those cells lie within the same row and region, they can be used for elimination during cross-hatching and counting. Puzzles solved by scanning alone without requiring the detection of contingencies are classified as "easy"; more difficult puzzles are not readily solved by basic scanning alone.

Logically, every sudoku puzzle, regardless of difficulty, is solved via scanning heuristics. In a true sudoku puzzle, every number has a necessary position in each part of the grid which can be deduced from the description or if you prefer definition of what a "true" sudoku is. The only difference between solving advanced puzzles and simpler puzzles is not the techniques used to solve the puzzle but recognizing the logical implications of the scanning heuristic. One such implication would be recognizing logical "contingencies" which just basically means narrowing down the possibilities of a given square via the relations between every other square.

Marking up

Scanning stops when no further numerals can be discovered, making it necessary to engage in logical analysis. One method to guide the analysis is to mark candidate numerals in the blank cells.

Subscript notation

In subscript notation, the candidate numerals are written in subscript in the cells. Because puzzles printed in a newspaper are too small to accommodate more than a few subscript digits of normal handwriting, solvers may create a larger copy of the puzzle. Using two colours, or mixing pencil and pen marks can be helpful.

Dot notation

The dot notation uses a pattern of dots in each square, where the dot position indicates a number from 1 to 9. The dot notation can be used on the original puzzle. Dexterity is required in placing the dots, since misplaced dots or inadvertent marks inevitably lead to confusion and may not be easily erased.

An alternative technique is to mark the numerals that a cell cannot be. The cell starts empty and as more constraints become known, it slowly fills until only one mark is missing. Assuming no mistakes are made and the marks can be overwritten with the value of a cell, there is no longer a need for any erasures.

Analysis

The two main approaches to analysis are "candidate elimination" and "what-if".

Candidate elimination

In "candidate elimination", progress is made by successively eliminating candidate numerals from cells to leave one choice. After each answer has been achieved, another scan may be performed—usually checking to see the effect of the contingencies. In general, if entering a particular numeral prevents completion of the other necessary placements, then the numeral in question can be eliminated as a candidate.

Doubles and triples

One method works by identifying "matched cell groups". For instance, if precisely two cells within a scope (a particular row, column, or region) contain the same two candidate numerals (p,q), or if precisely three cells within a scope contain the same three candidate numerals (p,q,r), these cells are said to be matched. The placement of those candidate numerals anywhere else within that same scope would make a solution impossible; therefore, those candidate numerals can be deleted from all other cells in the scope.

What-if

In the "what-if" approach (also called "guess-and-check", "bifurcation", "backtracking" and "Ariadne\'s thread"), a cell with two candidate numerals is selected, and a guess is made. The steps are repeated until a duplication is found or a cell is left without a possible candidate, in which case the alternative candidate must be the solution. For each cell\'s candidate, the question is posed: \'will entering a particular numeral prevent completion of the other placements of that numeral?\' If the answer is \'yes\', then that candidate can be eliminated. If the "what-if" exercises for both candidates show that either one is possible, another pair should be tried. Alternatively, if the "what-if" exercises for both candidates imply an identical result, then that result must be true. The what-if approach requires a pencil and eraser or a good layout memory.

There are three kind of conflicts, which can appear during puzzle solving:

basic conflicts - there are only N-1 different candidates in N cell in the area
fish conflicts - when eliminating number from N rows/columns, it will disappear also from N+1 columns/rows.
unique conflicts - this pattern means multiple solutions, all numbers in the pattern exist exactly two times in every area, row and column. If there is only one candidate in the cell, any virtual candidate can be added.

Encountering any of those would indicate that the puzzle is not uniquely solvable. Encountering any of them as a consequence of "what-if" indicates that an untried alternative is correct.